Python Iterate over Symbols

I am trying to implement a median string search for a sequence over the ACGT genome alphabet. My problem is generating the candidates: AAAAAAAA, AAAAAAAC, and so on, until I have tried every possible combination.

In essence, I am brute-forcing it by creating two lists, one containing A, C, G, T and the other holding an 8-character sequence, and after each iteration I search for and replace characters. The problem is that I do not test all combinations, because sometimes two positions advance at the same time and a letter gets skipped.

Is there an easy way to go AAAAAAAA, AAAAAAAC, AAAAAAAG, AAAAAAAT, AAAAAACA, and so on?

+5
4 answers

Use itertools:

itertools.product("ACGT", repeat=8)
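
Note that product() yields tuples of single characters, not strings. As a minimal sketch (the helper name kmers is hypothetical, not part of itertools), the tuples can be joined to get the strings in exactly the order asked for:

```python
from itertools import product

def kmers(alphabet="ACGT", k=8):
    """Lazily yield every length-k string over `alphabet`.

    product() varies the rightmost position fastest, so the strings
    come out in the order AAAAAAAA, AAAAAAAC, AAAAAAAG, ...
    """
    for symbols in product(alphabet, repeat=k):
        # Each `symbols` is a tuple such as ('A', 'A', ..., 'C').
        yield "".join(symbols)

gen = kmers()
first_five = [next(gen) for _ in range(5)]
print(first_five)
# ['AAAAAAAA', 'AAAAAAAC', 'AAAAAAAG', 'AAAAAAAT', 'AAAAAACA']
```

Because the generator is lazy, the 4**8 = 65536 candidates are produced one at a time rather than stored in a list.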
+10

itertools.product() is the function to use here; see @jamylak's answer.

+2

As stated above, use itertools:

itertools.product("ACGT", repeat=8) # will work in your case.
+2

Using the regex inverter from the examples page of the Picard wiki, invert the regular expression [ACGT]{8}. You can also try the online inverter at UtilityMill, but that server times out when generating 8-character strings; I did successfully get results for up to 6 characters in a reasonable amount of time.
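
To illustrate the idea without the external tool, here is a rough sketch that is not the inverter mentioned above. It handles only patterns of the exact form [letters]{n}, and the function name expand_simple_class is made up for this example:

```python
import re
from itertools import product

def expand_simple_class(pattern):
    """Yield every string matched by a pattern of the form [letters]{n}.

    Assumption: only one plain character class followed by one fixed
    repeat count is supported; anything else raises ValueError.
    """
    m = re.fullmatch(r"\[([A-Za-z]+)\]\{(\d+)\}", pattern)
    if m is None:
        raise ValueError("unsupported pattern: " + pattern)
    chars, count = m.group(1), int(m.group(2))
    for combo in product(chars, repeat=count):
        yield "".join(combo)

six_mers = list(expand_simple_class("[ACGT]{6}"))
print(len(six_mers))   # 4096
print(six_mers[:3])    # ['AAAAAA', 'AAAAAC', 'AAAAAG']
```

For a fixed alphabet and length this reduces to the itertools.product() approach from the other answers; a real regex inverter is only worth it for more general patterns.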

+1
