Pandas for-loop with a list of columns

I’m trying to open links built from my dataframe using the Selenium WebDriver. The dataframe ‘df1’ looks like this:

              user      repo1          repo2                repo3
0            breed  cs149-f22  kattis2canvas  grpc-maven-skeleton
1  GrahamDumpleton   mod_wsgi          wrapt                  NaN

Each link I want to open combines the value in the ‘user’ column with the value in one of the three ‘repo’ columns. I get an error when I iterate over the ‘repo’ columns.

Could anyone help me out? Thank you!

Here is my best try:

repo_cols = [col for col in df1.columns if 'repo' in col]

for index, row in df1.iterrows():
    user = row['user']
    for repo_name in repo_cols:
        try:
            repo = row['repo_name']
            current_url = f'https://github.com/{user}/{repo}/graphs/contributors'
            driver.get(current_url)
            time.sleep(0.5)
        except:
            pass

Here is the bug I encounter:

KeyError: 'repo_name' 

---------------------------------------------------------------------------

KeyError                                  Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3079             try:
-> 3080                 return self._engine.get_loc(casted_key)
   3081             except KeyError as err:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'repo_name'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input-50-eb068230c3fd> in <module>
      4     user = row['user']
      5     for repo_name in repo_cols:
----> 6         repo = row['repo_name']
      7         current_url = f'https://github.com/{user}/{repo}/graphs/contributors'
      8         driver.get(current_url)

~\anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
    851 
    852         elif key_is_scalar:
--> 853             return self._get_value(key)
    854 
    855         if is_hashable(key):

~\anaconda3\lib\site-packages\pandas\core\series.py in _get_value(self, label, takeable)
    959 
    960         # Similar to Index.get_value, but we do not fall back to positional
--> 961         loc = self.index.get_loc(label)
    962         return self.index._get_values_for_loc(self, loc, label)
    963 

~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3080                 return self._engine.get_loc(casted_key)
   3081             except KeyError as err:
-> 3082                 raise KeyError(key) from err
   3083 
   3084         if tolerance is not None:

KeyError: 'repo_name'


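Solution:

I think you should remove the quotation marks around repo_name in this line:

repo = row['repo_name']

With the quotes, pandas looks for a column literally named 'repo_name', which does not exist in df1, hence the KeyError. Without the quotes, the loop variable is used, so 'repo1', 'repo2' and 'repo3' are looked up in turn. It should be:

repo = row[repo_name]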
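For reference, here is a minimal sketch of the corrected loop run against a frame shaped like df1. The helper name build_urls and the pd.isna check are my own additions, not part of the original post. The check is there because the sample data has NaN in 'repo3', which would otherwise produce a URL ending in /nan/graphs/contributors. The Selenium calls are left as a comment so the sketch runs without a browser.

```python
import pandas as pd

# Sample frame mirroring df1 from the question.
df1 = pd.DataFrame({
    "user": ["breed", "GrahamDumpleton"],
    "repo1": ["cs149-f22", "mod_wsgi"],
    "repo2": ["kattis2canvas", "wrapt"],
    "repo3": ["grpc-maven-skeleton", float("nan")],
})

repo_cols = [col for col in df1.columns if "repo" in col]


def build_urls(df, cols):
    """Return one contributors URL per non-missing (user, repo) pair.

    Hypothetical helper for illustration; not from the original post.
    """
    urls = []
    for _, row in df.iterrows():
        user = row["user"]
        for repo_name in cols:
            # Index with the loop variable, not the string 'repo_name'.
            repo = row[repo_name]
            if pd.isna(repo):
                # Skip missing repos instead of building a '/nan/' URL.
                continue
            urls.append(f"https://github.com/{user}/{repo}/graphs/contributors")
    return urls


urls = build_urls(df1, repo_cols)
for url in urls:
    # In the question's setup this would be:
    #     driver.get(url); time.sleep(0.5)
    print(url)
```

Separately, the bare "except: pass" in the question's code silently swallows every error, including this KeyError, so it may be worth removing it, or at least narrowing it, while debugging.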
