Convert a print statement to a dictionary Pandas

Here I am comparing a data frame to a list of standard values (seen below). Instead of printing the results, I would like to convert them to a dictionary. Here is the code I have so far:

valid= {'Industry': ['Automotive', 'Banking / Finance','Biotech / Pharma','Commercial Buildings','Construction / Distribution',
                  'Consumer Products','Education','Education - K-12','Education - University / Higher','Entertainment / Media','Financial',
                  'Food & Beverage','Gas','Government','Government - Federal','Government - State / Local','Healthcare','High Security',
                  'Hospitality / Entertainment','Manufacturing / Communications','Other','Petrochem / Energy',
                  'Property Management / Real Estate','Public Facility / Non-Profit','Residential','Restaurant','Retail','Services - B2B',
                  'Technology','Telecom / Utilities','Transportation','Utilities','Food Retail','Specialized Retail','IT','Corrections',
                  'Core Commercial (SME)'],
        'SME Vertical': ['Agriculture, Food and Manufacturing','Architectural services','Arts, entertainment and recreation','Automobile',
                'Chemistry / Pharmacy','Construction','Education','Hotels','Offices','Other Industries','Other Services',
                'Project management and design','Real Estate and promotion','Restaurants, Café and Bars',
                'Energy, Infrastructure, Environment and Mining','Financial and Insurance Services',
                'Human health and social work activities','Professional, scientific, technical and communication activities',
                'Public administration and defence, compulsory social security','Retail/Wholesale','Transport, Logistics and Storage'],
        'System Type': ['Access','Access Control','Alarm Systems','Asset Tracking','Banking','Commander','EAS','Financial products','Fire',
                    'Fire Alarm','Integrated Solution','Intercom','Intercom systems','Intrusion - Traditional','Locking devices & Systems',
                    'Locks & Safes','Paging','Personal Safety','Retail & EAS Products','SaaS','SATS','Services',
                    'Sonitrol Integrated Solution','Sonitrol - Integrated Solution','Sonitrol - Managed Access',
                    'Sonitrol - Verified Audio Intrusion','Time & Attendance','TV-Distribution','Unknown','Video','Video Systems'],
        'Account Type': ['Commercial','International','National','Regional','Reseller','Residential','Small']}
 
mask = df1.apply(lambda c: c.isin(valid[c.name]))
df1.mask(mask|df1.eq(' ')).stack()
 
# Series.items() replaces iteritems(), which was removed in pandas 2.0
for r, v in df1.mask(mask|df1.eq(' ')).stack().items():
    print(f'error found in row "{r[0]}", column "{r[1]}": "{v}" is invalid')

Here is the current output of the print statements:

error found in row "1", column "Industry": "gas" is invalid
error found in row "1", column "SME Vertical": "hotels" is invalid
error found in row "2", column "Industry": "healthcare" is invalid
error found in row "3", column "Industry": "other" is invalid
error found in row "3", column "SME Vertical": "project management and design" is invalid
error found in row "4", column "Account Type": "small" is invalid

This output is good in terms of the format but I can’t get it to write to a dictionary.

Example output from the dictionary:

{error found in row "1": column "Industry": "gas" is invalid, error found in row "1": column "SME Vertical": "hotels" is invalid … etc.}

Solution:

This is straightforward, but YOU need to decide what the format will be. What you have shown above is not a valid dictionary.

Maybe like this, as a list of dictionaries, one for each error?

errors = []
# Series.items() replaces iteritems(), which was removed in pandas 2.0
for r, v in df1.mask(mask|df1.eq(' ')).stack().items():
    errors.append({
        "row": r[0],
        "column": r[1],
        "message": f"{v} is invalid"
    })
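
For something that can be run end to end, here is a minimal sketch of the same idea. The small `valid` and `df1` below are made-up stand-ins for the question's data, and the names `invalid` and `errors_by_cell` are chosen here only for illustration. The `.dropna()` call is a defensive assumption: it guarantees that only the invalid cells remain, whether or not your pandas version's `stack()` drops missing values itself. If a real dictionary is wanted rather than a list, one possible shape is a dictionary keyed by `(row, column)`, which `Series.to_dict()` produces directly.

```python
import pandas as pd

# Hypothetical, trimmed-down stand-ins for the question's `valid` and `df1`.
valid = {
    "Industry": ["Gas", "Healthcare", "Other"],
    "Account Type": ["Commercial", "Small"],
}
df1 = pd.DataFrame(
    {
        "Industry": ["Gas", "gas", "healthcare"],
        "Account Type": ["Small", "Commercial", "small"],
    }
)

# True where a cell holds one of the allowed values for its column.
mask = df1.apply(lambda c: c.isin(valid[c.name]))

# Blank out valid cells and single-space cells, then keep what is left.
# dropna() is a safeguard in case stack() keeps the blanked-out cells.
invalid = df1.mask(mask | df1.eq(" ")).stack().dropna()

# Option 1: a list with one dictionary per error (as in the answer above).
errors = [
    {"row": row, "column": col, "message": f"{value} is invalid"}
    for (row, col), value in invalid.items()
]

# Option 2: a single dictionary keyed by (row, column).
errors_by_cell = invalid.to_dict()
```

The list form is convenient for writing to JSON or building a report data frame with `pd.DataFrame(errors)`; the keyed form is convenient for looking up a specific cell, for example `errors_by_cell[(1, "Industry")]`.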