Home use Pandas to drop values from csv

Questions

use Pandas to drop values from csv

April 7, 2022

I have a csv that I want to use to search an api for data, but the row which stores the data used for the api search can contain a second value separated by ;
like this:

2 Jan Rohls Kunst und Religion zwischen Mittelalter und Barock : von Dante bis Bach -
3 Karl-Markus Ritter    Der Dom zu Speyer : weltliche Macht und christlicher Glaube 9783806241280; 3806241287
4 Hape Kerkeling    Ich bin dann mal weg : meine Reise auf dem Jakobsweg    9783890296005; 3890296009

I want to remove the values after and including ; so that only the 9783XXX remains in the row ISBN. It would be perfect, if the code could be included in the existing process, but I haven’t found a way how to manage that.

Here’s the full extracted csv: https://pastebin.com/frAK9NBG

These are some scripts I use to

a) CLean a pre-existing csv:

import pandas as pd
import glob
import os
path= "./**/extract/"

filelist=glob.glob('./**/Reihe A/Reihe*.csv',recursive=True)
print(filelist)
for file in filelist:
    data=pd.read_csv(file, sep="\t",  encoding='utf8')
    data.columns
    title=[] #this gets all the titles
    for row in data['Titel']:
        title.append(row)
    author=[] #this gets all the authors
    for row in data['Verfasser']:
        author.append(row)
    isbn=[] #this gets all the isbns
    for row in data['ISBN']:
        isbn.append(row)
    df=pd.DataFrame({'Verfasser': author, 'Titel': title, 'ISBN': isbn}) #create new csvs based on extracted data
#save csv in set path
df.to_csv(path + file +"_" +"extract", sep='\t', encoding='utf8')
#df.to_csv(path + file +'_' + 'extract.csv', sep="\t") #add endpath ,  encoding='utf-8' back if needed
print(file +'_' + 'extract.csv' + ' saved to ' + path)

and to

b) search the api

import glob
import pandas as pd
from urllib.request import urlopen
#import generated csvs from other script
filelist=glob.glob('./**/Reihe*_extract.csv',recursive=True)
print(filelist)
for file in filelist:
    #read csv, make a list of all isbns
    data=pd.read_csv(file, sep="\t",  encoding='utf8')
    isbnlist=[]
    for row in data['ISBN']:
        isbnlist.append(row)
    #for each isbn in list, get data from api
    for isbn in isbnlist:
        url = 'http://sru.k10plus.de/gvk!rec=1?version=1.1&operation=searchRetrieve&query=pica.isb%3D[isbn]&maximumRecords=10&recordSchema=marcxml'  
    url = url.replace('[isbn]', isbn)
    print(url)

as of right now, I’m splitting my codes based on their function (clean csv, search api etc.) but I plan to make a launcher script that does all for the end user.

Any help is appreciated.

>Solution :

Try this, split by seperator and keep wanted split:

data['ISBN'] = [x.split(' ')[0] for x in data['ISBN']] #keeps first portion of split.

byMR

Published April 07, 2022

Add a comment

Making an upward pointing arrow from downward pointing one in css

byMR

April 7, 2022

Questions

Executing execvp() in Child process after fork() still taking over parent process?

byMR

April 7, 2022

Questions

How to get multi decimals digit using regex?

byMR

April 7, 2022

Questions

YAML library giving two different outputs

byMR

April 7, 2022

Questions

How to get new Set from array of arrays by using Array.map function?

byMR

April 7, 2022

Questions

How to find the day name?

byMR

April 7, 2022

use Pandas to drop values from csv

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

Making an upward pointing arrow from downward pointing one in css

Executing execvp() in Child process after fork() still taking over parent process?

How to get multi decimals digit using regex?

YAML library giving two different outputs

How to get new Set from array of arrays by using Array.map function?

How to find the day name?

Keep Up to Date with the Most Important News

use Pandas to drop values from csv

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

Making an upward pointing arrow from downward pointing one in css

Executing execvp() in Child process after fork() still taking over parent process?

How to get multi decimals digit using regex?

YAML library giving two different outputs

How to get new Set from array of arrays by using Array.map function?

How to find the day name?

Discover more from Dev solutions