Python variable inheritance/links

I noticed something that is new to me. It comes in handy at the moment, but I’d like to understand what is happening in the background so that I can avoid modifying variables unintentionally.

Take the code below (the add_actual_wat_columns function): I create a new variable (wat_days) from a dictionary value, modify it, and the original dictionary (df_dict) is updated as well, even though I never put the modified value back into it.

Is this specific to pandas, or is it general Python behavior? Either way, how can I avoid it when I need to?

Bonus question: is there a better way to add type hints to variables so that syntax highlighting and autocomplete work properly in VS Code?

import pandas as pd


def main():
    file = "wapd_le.xlsx"
    raw = pd.read_excel(file, header=[0, 1])
    df_cols = list(raw.columns.unique(level=0))
    df_cols.pop(0)
    df_list = []

    for i in df_cols:
        df_list.append(raw["Vendor data"].join(raw[i]))
    df_dict = dict(zip(df_cols, df_list))

    print(df_dict.keys())
    sum_rows(df_dict)
    add_actual_wat_columns(df_dict)
    write_to_excel(df_dict)


def add_actual_wat_columns(df_dict: dict):
    """Creates proper WAT columns for each period.

    Base report from Cashcube sums FI doc payment terms, this function divides that value
    by the monthly FI doc count, adding new columns to the WAT Days dataframe.

    Args:
        df_dict (dict): contains dataframe descriptions as keys and dataframes as values.
    """
    # TODO refactor to have toggle for WAPD/WAT and add proper columns to either (both?) sheet.
    wat_days: pd.DataFrame
    wat_days = df_dict["WAT Days"]
    periods = list(wat_days.columns)[2:]
    actual_wat_periods = [str(x) + " actual WAT" for x in periods]
    wat_days[actual_wat_periods] = wat_days[periods].div(
        df_dict["Count (FI Document Number)"][periods]
    )
    wat_days["Sum actual WAT"] = wat_days[actual_wat_periods[0:-1]].mean(
        axis=1, numeric_only=True
    )
    wat_days.rename(columns={"Sum actual WAT": "Avg actual WAT"}, inplace=True)

>Solution :

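This is general Python behavior, not something specific to pandas. wat_days is not a copy: it is a second name for the same DataFrame that df_dict holds. Because a DataFrame is mutable, changing it through either name changes the one shared object.*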

wat_days = df_dict["WAT Days"] # wat_days is the object in df_dict
...
wat_days[actual_wat_periods] = ... # modify that object.
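
To make this concrete for the pandas case, here is a minimal sketch. The dictionary contents and the column names ("Vendor", "2023-01", "via alias", "via copy") are made up for illustration and are not taken from the original spreadsheet. It compares a plain assignment with DataFrame.copy():

```python
import pandas as pd

# Hypothetical stand-in for the dictionary of DataFrames in the question.
df_dict = {
    "WAT Days": pd.DataFrame({"Vendor": ["A", "B"], "2023-01": [30.0, 60.0]})
}

alias = df_dict["WAT Days"]                # same object, just a second name
independent = df_dict["WAT Days"].copy()   # a new DataFrame with its own data

alias["via alias"] = 1         # adds a column to the DataFrame inside df_dict
independent["via copy"] = 2    # only touches the copy

same_object = alias is df_dict["WAT Days"]
columns_in_dict = list(df_dict["WAT Days"].columns)

print(same_object)       # True
print(columns_in_dict)   # ['Vendor', '2023-01', 'via alias']
```

So if add_actual_wat_columns should leave df_dict untouched, one option is to start it with wat_days = df_dict["WAT Days"].copy() and return the result.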

Another example:

things = {1: 2, 2: [3]}
x = things[1]
x += 1
y = things[2]
y[0] += 1
y += [1]
z = things[2][0]
z += 1
print(things)
# {1: 2, 2: [4, 1]}

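Do you see what is going on?
If you need a new, independent copy of a mutable object, have a look at the copy module (copy.copy and copy.deepcopy). pandas DataFrames also have their own .copy() method.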
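
As a rough illustration of what the copy module does, the sketch below reuses the things dictionary from the example above. The names shallow and deep are only for illustration. A shallow copy creates a new outer dictionary but still shares the inner list, while a deep copy duplicates the inner list as well:

```python
import copy

things = {1: 2, 2: [3]}

shallow = copy.copy(things)     # new dict, but the inner list is still shared
deep = copy.deepcopy(things)    # new dict and a new inner list

things[2].append(99)            # mutate the list held by the original dict

print(shallow[2])   # [3, 99]  (shared list, so the change is visible)
print(deep[2])      # [3]      (independent list, unaffected)
```

Which one you need depends on whether the nested objects will be modified too. For a dictionary of DataFrames, calling .copy() on the individual DataFrame you intend to change is usually enough.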

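*Note that this is perhaps a backwards explanation. What actually differs is that x and z are ints, which are immutable, so += replaces them with a new object and rebinds the name, whereas += on the list y modifies the list in place. The name-to-object binding works in the same way for x, y, and z.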
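
The difference between rebinding and in-place mutation can be checked with the is operator. This is a small sketch and the helper names (x_before, y_before and so on) are only for illustration:

```python
x = 2
x_before = x
x += 1                            # ints are immutable: a new int is created and x is rebound
x_was_rebound = x is not x_before

y = [3]
y_before = y
y += [1]                          # lists are mutable: the existing list is extended in place
y_is_same_object = y is y_before

print(x_was_rebound)      # True
print(y_is_same_object)   # True
print(y_before)           # [3, 1]  (the other name sees the change)
```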
