I am new to Google Cloud Storage. In my Python code I have a couple of DataFrames that I want to store in a GCS bucket as a single Excel file with multiple sheets. In a local directory I am able to do that using ExcelWriter. Here is the code for that:
writer = pd.ExcelWriter(filename)
dataframe1.to_excel(writer, 'sheet1', index=False)
dataframe2.to_excel(writer, 'sheet2', index=False)
writer.save()
I don't want to save a temp file locally and then upload it to GCS.
You can instantiate your ExcelWriter() with engine='xlsxwriter' and use fs-gcsfs to write the resulting bytes to an Excel file in your GCS bucket.
In your case you can do the following:
import io
import pandas as pd
from fs_gcsfs import GCSFS

gcsfs = GCSFS(bucket_name='name_of_your_bucket',
              # set a different root path if you wish to upload
              # multiple files in different locations
              root_path='path/to/excel',
              strict=False)
gcsfs.fix_storage()

# Write each DataFrame to a separate sheet of an in-memory workbook
output = io.BytesIO()
writer = pd.ExcelWriter(output, engine='xlsxwriter')
dataframe1.to_excel(writer, sheet_name='sheet1', index=False)
dataframe2.to_excel(writer, sheet_name='sheet2', index=False)
writer.close()  # ExcelWriter.save() was removed in pandas 2.0; close() finalizes the workbook
xlsx_data = output.getvalue()

with gcsfs.open('./excel_file.xlsx', 'wb') as f:
    f.write(xlsx_data)
PS: I had to use strict=False because fs-gcsfs wasn't able to locate the root path (do check the Limitations section in the fs-gcsfs documentation).
Source: https://xlsxwriter.readthedocs.io/working_with_pandas.html#saving-the-dataframe-output-to-a-string
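If you'd rather avoid the fs-gcsfs dependency, the official google-cloud-storage client can upload the in-memory bytes directly. Below is a minimal sketch of the same technique, factored into two helpers: the function names are mine, the bucket and blob path are placeholders from the question, and it assumes the xlsxwriter and google-cloud-storage packages are installed and that credentials are configured (e.g. via Application Default Credentials):

```python
import io

import pandas as pd


def frames_to_xlsx_bytes(frames):
    """Serialize {sheet_name: DataFrame} into one in-memory .xlsx workbook."""
    output = io.BytesIO()
    # The context manager closes the writer and finalizes the workbook,
    # replacing the writer.save() call that was removed in pandas 2.0
    with pd.ExcelWriter(output, engine='xlsxwriter') as writer:
        for sheet_name, df in frames.items():
            df.to_excel(writer, sheet_name=sheet_name, index=False)
    return output.getvalue()


def upload_xlsx(bucket_name, blob_path, data):
    """Upload the workbook bytes straight to the bucket, no temp file."""
    # Imported inside the function so the rest of the sketch runs even
    # where google-cloud-storage isn't installed
    from google.cloud import storage

    blob = storage.Client().bucket(bucket_name).blob(blob_path)
    blob.upload_from_string(
        data,
        content_type='application/vnd.openxmlformats-officedocument'
                     '.spreadsheetml.sheet')
```

Usage would then look like:

upload_xlsx('name_of_your_bucket', 'path/to/excel/excel_file.xlsx', frames_to_xlsx_bytes({'sheet1': dataframe1, 'sheet2': dataframe2}))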