I am looking for an efficient Python solution to the following problem. I have a DNA sequence and would like to pick a random position in the string that is the middle position of a given triplet pattern. For example, if the DNA string were "ACTGTGACTACTGGGGG" and the triplet were "ACT", I would like to return, at random, one of the positions 1, 7, 10.
>Solution :
You can find all the indices for the substring matches using re.finditer, then use random.choice to select one of those.
>>> import re
>>> import random
>>> s = 'ACTGTGACTACTGGGGG'
>>> f = 'ACT'
>>> random.choice([m.start() for m in re.finditer(f, s)])
0
>>> random.choice([m.start() for m in re.finditer(f, s)])
9
Note that indices in Python are 0-based and m.start() gives the start of each match, so in your example the matches start at 0, 6, and 9. The middle positions you listed (1, 7, 10) are m.start() + 1.
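One caveat: re.finditer only returns non-overlapping matches, so a triplet that can overlap itself (for example "AAA" in "AAAA") would be undercounted. Here is a minimal sketch that avoids this with str.find and returns the middle index directly. The function names middle_positions and random_middle_position are made up for illustration, and it assumes the pattern is a plain 3-letter string rather than a regex.

```python
import random


def middle_positions(sequence, triplet):
    """Return the 0-based middle index of every occurrence of a 3-letter triplet."""
    middles = []
    start = sequence.find(triplet)
    while start != -1:
        middles.append(start + 1)  # middle base of a 3-letter match
        # Step forward by one so overlapping occurrences are also found.
        start = sequence.find(triplet, start + 1)
    return middles


def random_middle_position(sequence, triplet):
    """Pick one middle index at random, or None if the triplet never occurs."""
    middles = middle_positions(sequence, triplet)
    return random.choice(middles) if middles else None


print(middle_positions('ACTGTGACTACTGGGGG', 'ACT'))
print(random_middle_position('ACTGTGACTACTGGGGG', 'ACT'))
```

Returning None for "no match" is a design choice here; random.choice raises IndexError on an empty list, so the one-liner above would fail if the triplet is absent.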