Currently my database looks like this:
import pandas as pd
output = [['Owner', 'Database', 'Schema', 'Table', 'Column', 'Comment', 'Status'], ['', 'VICE_DEV', 'AIRFLOW', 'TASK_INSTANCE', '_LOAD_DATETIME', 'Load datetime'], ['', 'VICE_DEV', 'AIRFLOW', 'TEST', '_LOAD_FILENAME', 'load file name', 'ADDED']]
df = pd.DataFrame(output[1:], columns=output[0])
It looks like this:
Owner Database Schema ... Column Comment Status
0 VICE_DEV AIRFLOW ... _LOAD_DATETIME Load datetime None
1 VICE_DEV AIRFLOW ... _LOAD_FILENAME load file name ADDED
I’m now trying to loop through the Status col, and if Status is None or empty I want to create a list of SQL string queries, and go to the next row and check, etc.
I’ve tried this:
query_list = []
for column in df['Status']:
if column is None:
sql = f"ALTER TABLE {df['Table']} ALTER {df['Column']} COMMENT {df['Comment']}; "
# print(sql)
query_list.append(sql)
But its not correct, its getting both rows, instead I want to only get the rows with the column df[‘Status’] None (or empty). Also, the formatting is off. I’ve tried by creating an empty list and an empty df.
>Solution :
If I understand it correctly you want to create one query if Status is None. This query uses the values from the particular row that is None. The problem then is that you insert whole columns into your query and not the values from the particular row with Status==None. I think what you want is to iterate over all columns in your dataframe and then check if the status is None. If so, create the query based on that row. This can be done with the following code for example:
import pandas as pd
output = [['Owner', 'Database', 'Schema', 'Table', 'Column', 'Comment', 'Status'], ['', 'VICE_DEV', 'AIRFLOW', 'TASK_INSTANCE', '_LOAD_DATETIME', 'Load datetime'], ['', 'VICE_DEV', 'AIRFLOW', 'TEST', '_LOAD_FILENAME', 'load file name', 'ADDED']]
df = pd.DataFrame(output[1:], columns=output[0])
query_list = []
for index, row in df.iterrows():
if row['Status'] is None:
sql = f"ALTER TABLE {row['Table']} ALTER {row['Column']} COMMENT {row['Comment']}; "
# print(sql)
query_list.append(sql)
query_list
Output:
['ALTER TABLE TASK_INSTANCE ALTER _LOAD_DATETIME COMMENT Load datetime; ']