Note: This article is reproduced from serverfault.com.

How to Access a Private GitHub Repo File (.csv) in Python Using Pandas or Requests

Posted on 2020-06-03 02:47:16

I had to switch my public GitHub repository to private, and now I cannot access its files from Python, even with an access token, the way I could when the repo was public.

I can access my private repo's CSV with curl:

    curl -s https://{token}@raw.githubusercontent.com/username/repo/master/file.csv

However, I want to access this information in my Python file. When the repo was public I could simply use:

    url = 'https://raw.githubusercontent.com/username/repo/master/file.csv'
    df = pd.read_csv(url, error_bad_lines=False)

This no longer works now that the repo is private, and I cannot find a workaround to download this CSV in Python instead of pulling it from the terminal.

If I try:

    requests.get('https://{token}@raw.githubusercontent.com/username/repo/master/file.csv')

I get a 404 response, which is basically the same thing that happens with pd.read_csv(). If I click on the raw file, I see that a temporary token is created and the URL is:

    https://raw.githubusercontent.com/username/repo/master/file.csv?token=TEMPTOKEN

Is there a way to attach my permanent private access token so that I can always pull this data from GitHub?

Questioner
everwitt7
everwitt7 2020-12-14 03:51:30

This is what ended up working for me - leaving it here if anyone runs into the same issue. Thanks for the help!

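    import io

    import pandas
    import requests

    # GitHub username and personal access token (placeholders)
    user = 'my_github_username'
    pao = 'my_pao'

    # Session that sends the credentials as HTTP basic auth on every request
    github_session = requests.Session()
    github_session.auth = (user, pao)

    # providing raw url to download csv from github
    csv_url = 'https://raw.githubusercontent.com/user/repo/master/csv_name.csv'

    download = github_session.get(csv_url).content
    # on_bad_lines='skip' replaces error_bad_lines=False in pandas 1.3 and later
    downloaded_csv = pandas.read_csv(io.StringIO(download.decode('utf-8')), on_bad_lines='skip')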
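
For anyone who would rather not rely on the raw.githubusercontent.com host, here is a rough sketch of another route that may work: asking the GitHub REST API "contents" endpoint for the raw file body and sending the token in a header. This is not what the answer above does, just an alternative under some assumptions. The function names `build_contents_request` and `read_private_csv` are made up for illustration, `owner`, `repo`, `path`, and `ref` are placeholders, and the `Authorization: token ...` scheme and `application/vnd.github.v3.raw` media type are assumed to suit your token type. It also calls `raise_for_status()`, so a 404 like the one in the question raises an error instead of being parsed as if it were CSV.

```python
import io

import pandas as pd
import requests


def build_contents_request(owner, repo, path, token, ref="master"):
    """Return (url, headers, params) for a raw-file request (hypothetical helper)."""
    url = f"https://api.github.com/repos/{owner}/{repo}/contents/{path}"
    headers = {
        # Assumption: a personal access token sent in a header, not in the URL
        "Authorization": f"token {token}",
        # Ask for the raw file body instead of the default JSON wrapper
        "Accept": "application/vnd.github.v3.raw",
    }
    # Branch, tag, or commit to read the file from
    params = {"ref": ref}
    return url, headers, params


def read_private_csv(owner, repo, path, token, ref="master", session=None):
    """Download a CSV from a private repo and parse it into a DataFrame."""
    if session is None:
        session = requests.Session()
    url, headers, params = build_contents_request(owner, repo, path, token, ref)
    response = session.get(url, headers=headers, params=params, timeout=30)
    # A missing file or a rejected token shows up as an HTTP error status;
    # raise here rather than handing the error body to pandas
    response.raise_for_status()
    return pd.read_csv(io.BytesIO(response.content))


# Example call (placeholders, needs a real token and network access):
# df = read_private_csv("username", "repo", "file.csv", token="my_token")
```

Passing `session` in is only there so the function can be exercised with a stand-in object; in normal use it can be left out.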