
python 3.x - Upload multiple pandas dataframes as a single Excel file with multiple sheets to Google Cloud Storage

Posted on 2020-05-08 14:15:18

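I am new to Google Cloud Storage. In my Python code I have several dataframes that I want to store in a GCS bucket as a single Excel file with multiple sheets. In a local directory, I can do this with ExcelWriter. Here is the code: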

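import pandas as pd

# Write each dataframe to its own sheet of one local workbook.
# The with-block closes the writer; writer.save() was removed in pandas 2.0.
with pd.ExcelWriter(filename) as writer:
    dataframe1.to_excel(writer, sheet_name='sheet1', index=False)
    dataframe2.to_excel(writer, sheet_name='sheet2', index=False)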

I do not want to save a temporary file in a local directory and then upload it to GCS.

Questioner
Nishant Igave
Sarath Gadde 2020-11-30 17:19:23

You can instantiate ExcelWriter() with engine='xlsxwriter' and use fs-gcsfs to write the resulting bytes to an Excel file in your GCS bucket.

In your case, you can do the following:

import io
import pandas as pd
from fs_gcsfs import GCSFS

# Set a different root_path if you wish to upload multiple files to different locations.
gcsfs = GCSFS(bucket_name='name_of_your_bucket',
              root_path='path/to/excel',
              strict=False)
gcsfs.fix_storage()

# Build the workbook in an in-memory buffer instead of a local file.
output = io.BytesIO()
writer = pd.ExcelWriter(output, engine='xlsxwriter')

dataframe1.to_excel(writer, sheet_name='sheet1', index=False)
dataframe2.to_excel(writer, sheet_name='sheet2', index=False)

# close() finalizes the workbook; writer.save() was removed in pandas 2.0.
writer.close()
xlsx_data = output.getvalue()

# Write the bytes to the bucket, relative to root_path.
with gcsfs.open('./excel_file.xlsx', 'wb') as f:
    f.write(xlsx_data)

PS: I had to use strict=False because fs-gcsfs could not find the root path (see the Limitations section of the fs-gcsfs documentation).
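If you have more than two dataframes, the in-memory step can be wrapped in a small helper. The following is only a sketch: the names dataframes_to_xlsx_bytes and upload_bytes are hypothetical, not part of pandas or fs-gcsfs. It also leaves the engine unset so that pandas picks whichever Excel writer is installed, whereas the answer above pins engine='xlsxwriter'.

```python
import io

import pandas as pd


def dataframes_to_xlsx_bytes(sheets):
    """Return the bytes of an .xlsx workbook with one sheet per dataframe.

    `sheets` maps sheet names to dataframes. Hypothetical helper,
    not part of pandas or fs-gcsfs.
    """
    output = io.BytesIO()
    # No engine is passed, so pandas selects an installed writer
    # (xlsxwriter or openpyxl). Pass engine='xlsxwriter' to pin it.
    with pd.ExcelWriter(output) as writer:
        for sheet_name, frame in sheets.items():
            frame.to_excel(writer, sheet_name=sheet_name, index=False)
    # Leaving the with-block closes the writer, which writes the
    # finished workbook into the buffer.
    return output.getvalue()


def upload_bytes(filesystem, path, data):
    """Write `data` to `path` on any object exposing open(path, 'wb').

    Assumption: `filesystem` is something like the GCSFS instance
    created in the answer above.
    """
    with filesystem.open(path, 'wb') as f:
        f.write(data)


# Example with two small placeholder dataframes.
example = dataframes_to_xlsx_bytes({
    'sheet1': pd.DataFrame({'a': [1, 2]}),
    'sheet2': pd.DataFrame({'b': ['x', 'y']}),
})
```

With the gcsfs object from the answer, the upload would then be upload_bytes(gcsfs, './excel_file.xlsx', example), which needs valid GCS credentials and an existing bucket.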

Source: https://xlsxwriter.readthedocs.io/working_with_pandas.html#saving-the-dataframe-output-to-a-string