I am trying to write a function that takes a user-supplied file name (the file contains a DNA sequence), counts how many of each base the file contains, and prints the counts in the order #A, #G, #C, #T. I then want to save that output to a new file with a user-supplied name and the extension .count. In bash, I am then trying to concatenate those files (there will be 100 in total) into a single .csv document with the following format:
| File | #A,#G,#C,#T |
|---|---|
| file.count 1 | 23,43,32,41 |
| file.count 2 | etc… |
To open the file, I have:
def openseq(filename):
    filename=input("enter file to open: ")
    openfile=open(filename,"r")
    dnatext=print(openfile.read())
    return dnatext
Originally, I was then trying to nest a for loop inside it (under dnatext) with the following:
for i in dnatext:
    comma = ","
    numberofbases=str(dnatext.count('A')) + comma + str(dnatext.count('G')) + comma + str(dnatext.count('C')) + comma + str(dnatext.count('T'))
    return numberofbases
And then to save the file under a new name input by the user:
directory="<desired directory>" #removed directory for privacy
newname= input("Enter output file name: ")
filepath = directory + newname + ".count"
filepath.close()
But no matter how I move things around, I either get the error message TypeError: ‘NoneType’ object is not iterable or a message that some variable is not defined. I’ve tried a few ways to resolve this but am not having any luck. As I am relatively new to coding (especially the combination of Python and bash), I would very much appreciate some help, or even an explanation of why I am unable to count the number of bases in the input sequence.
Ideally, I would like to get this all into one or two functions so I can call them easily from bash, but I am not sure whether that is even possible.
>Solution :
It’s hard to tell without a MWE, but in general, "NoneType is not iterable" means (perhaps unsurprisingly) that you’re trying to iterate over the value None.
In the code you posted, there’s only one place where you iterate:
for i in dnatext:
Here dnatext is expected to be iterable and your error suggests it’s in fact None.
The cause is probably the bug in this function:
def openseq(filename):
    filename=input("enter file to open: ")
    openfile=open(filename,"r")
    dnatext=print(openfile.read()) # <-- this line
    return dnatext
The print function doesn’t return anything (or returns None, depending on how you look at it). So this function will always return None.
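To see this in isolation, here is a minimal sketch that reproduces the error without any file handling. The string "ACGT" and the variable names are made up for illustration only.

```python
# print() writes its argument to stdout and returns None,
# so the assignment below stores None, not the text.
result = print("ACGT")

# Iterating over None raises the TypeError from the question.
try:
    for base in result:
        pass
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```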
Instead, you probably (but again, it’s hard to say) want:
def openseq(filename):
    filename = input("enter file to open: ")
    with open(filename,"r") as openfile:
        dnatext = openfile.read()
    print(dnatext)
    return dnatext
which
- uses a context manager to also close the file handle for you, and
- prints and returns the data that was read (instead of just printing it)
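Once dnatext is a real string, the counting and saving steps could look roughly like the sketch below. This is only one possible arrangement: the names count_bases and save_counts are hypothetical, the sequence is assumed to be uppercase (str.count is case-sensitive), and the example writes to a temporary directory rather than your real one.

```python
import os
import tempfile

def count_bases(dnatext):
    """Return the A, G, C, T counts as a comma-separated string."""
    # str.count scans the whole string, so no surrounding for loop is needed.
    return ",".join(str(dnatext.count(base)) for base in "AGCT")

def save_counts(counts, directory, newname):
    """Write counts to <directory>/<newname>.count and return the path."""
    filepath = os.path.join(directory, newname + ".count")
    # filepath is a plain string; the file object from open() is what
    # gets closed, and the with block does that automatically.
    with open(filepath, "w") as outfile:
        outfile.write(counts + "\n")
    return filepath

# Example with a throwaway directory and a made-up sequence.
with tempfile.TemporaryDirectory() as tmpdir:
    counts = count_bases("AAGCTTAG")
    saved_path = save_counts(counts, tmpdir, "sample")
    with open(saved_path) as infile:
        saved_text = infile.read()
```

Note that in the question's snippet, filepath.close() would fail for the same kind of reason as the original error: filepath is a string, and strings have no close method. Nothing is written until a file is actually opened in write mode.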