I am trying to use pandas to convert text files with the following code…
import glob
import os

import pandas as pd

AFNI_data_location = "C:\\Users\\jes6785\\Box\\Almeida_Lab\\BITREC\\BEHAVIORAL_FILES_CLEANED_JULY_2023\\"
BIDS_output_location = "C:\\Users\\jes6785\\Box\\Almeida_Lab\\BITREC\\BIDS\\"
condition = ["circle", "emot", "neut", "square"]  # should be listed in filename
subj = os.listdir(AFNI_data_location)  # this lists out all the files in the directory - for later
# Identify the folders in the directory, directories are already split by sessions
for subj in subj:
    print(subj)
    subj_only = subj.split("-")[0]
    session = subj.split("-")[1]
    filedir = os.path.join(AFNI_data_location, subj)
    participant_time = pd.DataFrame(columns=['onset', 'duration', 'trial-type'])
    run = "2"
    for condition in condition:  # loops through each condition
        file = glob.glob(os.path.join(filedir, '*run' + run + '*' + condition + '*.txt'))
        for file in file:
            print(file)
            data = pd.read_csv(file, sep=" ", header=None)
            data = data.T
            data.columns = ["onset"]
            data.insert(1, "duration", run)
            data.insert(2, "trial-type", condition)
            participant_time = pd.concat([participant_time, data], ignore_index=True)
    participant_time = participant_time.sort_values('onset')
    participant_time.to_csv((os.path.join(BIDS_output_location, "sub-" + subj_only + "_ses-0" + session + "_task-cpt-" + run + "_events.tsv")), index=False, sep="\t")
The first time this loops through the list I get the expected output of (note I put this into table format so it’s clearer to read)…
| onset | duration | trial-type |
|---|---|---|
| 9.308 | 2 | square |
| 10.758 | 2 | neut |
| 12.957 | 2 | square |
| 16.485 | 2 | emot |
| 18.947 | 2 | circle |
| 21.947 | 2 | neut |
On the second and all later iterations, the third column, which should hold the condition, contains only a single character:
| onset | duration | trial-type |
|---|---|---|
| 9.308 | 2 | s |
| 9.308 | 2 | q |
| 9.308 | 2 | u |
| 9.308 | 2 | a |
| 9.306 | 2 | r |
| 9.308 | 2 | e |
How do I fix this? I don’t understand what I am doing wrong, since it works correctly on the first iteration. The first time through, it treats "condition" as a string, but from the second iteration on it splits the string into single characters, which is not what I need.
>Solution :
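The problem is that the name condition is used for both the list of conditions and the inner loop variable. The statement for condition in condition rebinds condition to each element in turn, so once the first pass of the outer loop finishes, condition is no longer the list but the string "square". On every later pass the inner loop iterates over that string one character at a time, which is why you get s, q, u, a, r, e. The same pattern in for subj in subj and for file in file happens not to break here, but distinct names are safer there too. Rename the list to conditions where it is defined (conditions = ["circle", "emot", "neut", "square"]) and change the inner for loop to: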
for cond in conditions:
    files = glob.glob(os.path.join(filedir, '*run' + run + '*' + cond + '*.txt'))
    for file in files:
        print(file)
        data = pd.read_csv(file, sep=" ", header=None)
        data = data.T
        data.columns = ["onset"]
        data.insert(1, "duration", run)
        data.insert(2, "trial-type", cond)
        participant_time = pd.concat([participant_time, data], ignore_index=True)
participant_time = participant_time.sort_values('onset')
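
To see the effect in isolation, here is a minimal sketch that reproduces the behaviour without pandas or any files. The function names buggy and fixed and the subject folder names are made up for illustration; only the list of conditions comes from the question.

```python
def buggy(subjects, condition):
    """Reuse the list's name as the loop variable, as in the question."""
    seen = []
    for subject in subjects:
        labels = []
        # The iterable is evaluated first, then `condition` is rebound to
        # each element. After the loop it holds the last element, a string.
        for condition in condition:
            labels.append(condition)
        seen.append(labels)
    return seen


def fixed(subjects, conditions):
    """Use a separate loop variable so the list is never overwritten."""
    seen = []
    for subject in subjects:
        labels = []
        for cond in conditions:
            labels.append(cond)
        seen.append(labels)
    return seen


conditions = ["circle", "emot", "neut", "square"]
subjects = ["101-1", "101-2"]  # hypothetical folder names

buggy_result = buggy(subjects, list(conditions))
fixed_result = fixed(subjects, conditions)

print(buggy_result[0])  # ['circle', 'emot', 'neut', 'square']
print(buggy_result[1])  # ['s', 'q', 'u', 'a', 'r', 'e']
print(fixed_result[1])  # ['circle', 'emot', 'neut', 'square']
```

The first subject sees the full list in both versions. Only the second subject differs, which matches the symptom of a correct first file followed by single characters.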