Note: This article is reproduced from serverfault.com.

Automating folder creation in S3

Posted on 2020-11-28 20:07:31

I have an S3 bucket into which clients drop data files (CSV files) each month. I was wondering whether there is a way to automatically create a new "folder" (object) each month when the files are dropped and put the newest files into that "folder". I need the CSV files separated by month so that AWS Glue can create new partitions when I run incremental crawlers on this bucket.

For example, let's say I have an S3 bucket called "client". On December 1st, a new CSV file ("DecClientData") will be dropped into that "client" bucket. I want to know if there is a way to automate the following two processes:

  1. Create a "folder" (let's call it "dec") within "client".
  2. Place the "DecClientData" file in the "dec" "folder".

Thanks in advance for any assistance you can provide!

Questioner: learner
Andre.IDK 2020-11-29 06:17:27

S3 doesn't have the notion of folders commonly found in file systems; it has a flat structure. More details can be found in the Amazon S3 documentation.

Instead, the full path of an object is stored in its Key (filename). For example, an object can be stored in Amazon S3 with a Key of files/2020-12/data.txt whether or not anything named files or 2020-12 exists. These are key prefixes rather than directories; the "folders" the S3 console creates are just zero-length placeholder objects whose keys end in /.
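To make the flat-structure point concrete, here is a small plain-Python illustration (it does not call S3) of how "folders" can be derived purely from keys, roughly the way an S3 listing with a prefix and a "/" delimiter reports common prefixes. The function name common_prefixes and the sample keys are made up for this sketch.

```python
def common_prefixes(keys, prefix="", delimiter="/"):
    """Derive the 'folders' directly under `prefix` from a flat list of keys.

    Illustration only: nothing here is created or stored, the 'folders'
    exist solely because some key happens to contain the delimiter.
    """
    found = set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Keep everything up to and including the first delimiter.
            found.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
    return sorted(found)


# Hypothetical keys in a bucket; no directory objects exist anywhere.
keys = ["files/2020-11/data.txt", "files/2020-12/data.txt", "readme.txt"]

print(common_prefixes(keys))            # ['files/']
print(common_prefixes(keys, "files/"))  # ['files/2020-11/', 'files/2020-12/']
```

So "creating a folder" for a new month needs no separate step: writing an object whose Key starts with the month prefix is enough.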

In your case, to solve both points you are mentioning, you should leverage S3 event notifications and use them as a Lambda trigger. When the Lambda function is triggered, the event it receives contains the bucket name and the Key of the uploaded object. S3 has no rename operation, so "changing" the Key means copying the object to the new Key and then deleting the original.

For example, an object is uploaded to s3://my_bucket/uploads/file.txt, which creates an event notification that triggers a Lambda function. The function copies the object to s3://my_bucket/files/dec/file.txt and deletes the original. Restrict the trigger to the uploads/ prefix so that the copy written under files/ does not trigger the function again.
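A minimal sketch of such a Lambda function in Python with boto3 might look like the following. It is an illustration under assumptions, not a tested deployment: the uploads/ and files/ prefixes come from the example above, the names SOURCE_PREFIX, DEST_PREFIX and destination_key are made up, and the optional s3 argument exists only so the logic can be exercised without AWS. The month folder is taken from the event's timestamp and uses the 2020-12 form from the earlier example rather than dec, so that files from different years do not end up under the same prefix.

```python
import urllib.parse

SOURCE_PREFIX = "uploads/"  # assumption: where clients drop their files
DEST_PREFIX = "files/"      # assumption: where the per-month copies go


def destination_key(source_key, event_time):
    """Build the new Key, e.g. uploads/file.txt -> files/2020-12/file.txt.

    `event_time` is the ISO-8601 timestamp from the S3 event record,
    so its first seven characters are the year and month.
    """
    filename = source_key.rsplit("/", 1)[-1]
    month = event_time[:7]
    return f"{DEST_PREFIX}{month}/{filename}"


def lambda_handler(event, context, s3=None):
    if s3 is None:
        import boto3  # imported lazily so the helper above works without AWS
        s3 = boto3.client("s3")

    moved = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        if not key.startswith(SOURCE_PREFIX):
            continue  # extra guard against re-processing our own copies

        new_key = destination_key(key, record["eventTime"])
        # "Rename" = copy to the new Key, then delete the original.
        s3.copy_object(
            Bucket=bucket,
            Key=new_key,
            CopySource={"Bucket": bucket, "Key": key},
        )
        s3.delete_object(Bucket=bucket, Key=key)
        moved.append(new_key)
    return moved
```

The function's execution role would need permission to read and delete objects under uploads/ and to write under files/. A single copy_object call handles objects up to 5 GB, which should be plenty for monthly CSV drops; larger objects would need a multipart copy instead.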