Note: This article is reproduced from stackoverflow.com.
export-to-csv python reddit

Write a CSV file after scraping data from Reddit

Published on 2020-05-10 19:15:14

I am new to coding and have not been able to write a CSV file with the data I scraped from Reddit.

First, I scraped the data using the Pushshift API, which returned the results as a list like the one in the following image: [image]

I want to write that data to a CSV file to run a content analysis in R, with each entry (0000, 00001, etc.) as a row. However, I have not been able to write code that puts each parameter in its own column. For instance, I want the columns to be submissions.author, submissions.num_comments, and submissions.title, to name a few.

I ran this piece of code, but the results are not exactly what I'm looking for:

import csv
with open('my_file.csv', 'w') as f:
    writer = csv.writer(f)
    with open('my_file.csv', 'w') as f:
        for row in lastest_submissions:
            row_text = ','.join(row) + '\n'
            f.write(row_text)

The outcome looks like this: [image]

What I would like is for the name of each parameter to be the header and its value to be the content of the cell. For example, for 'author': 'bl00d', the header would be author and the cell would contain bl00d (for line 0000).

I appreciate any help and hints. Also, let me know if I should provide the complete code.

Questioner: PaComSc
Viewed: 9

Answer by TechSavvy, 2020-02-25 20:17

Since you already have the data as a list of dictionaries, I think you may want to try csv.DictWriter.

Sample code:

import csv

lstdc = [{'name': 'Jack', 'age': 26},
         {'name': 'John', 'age': 27},
         {'name': 'Lisa', 'age': 36},
         {'name': 'Adam', 'age': 16}]

# Use the keys of the first dictionary as the column headers
fieldNames = list(lstdc[0].keys())

# newline='' is what the csv module documentation recommends for output files
with open('list_of_dict_to_csv.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldNames)
    writer.writeheader()
    for val in lstdc:
        writer.writerow(val)

You can replace lstdc with latest_submissions and list_of_dict_to_csv.csv with my_file.csv.
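One caveat when applying this to API results: if the dictionaries do not all have the same keys, taking the field names from the first dictionary alone can make DictWriter raise a ValueError for keys it does not know about. Here is a minimal sketch of one way around that, assuming the results look roughly like the hypothetical records below (the 'score' field and the second record are made up for illustration):

```python
import csv

# Hypothetical sample standing in for the Pushshift results;
# the second record has an extra key and lacks 'num_comments'.
latest_submissions = [
    {'author': 'bl00d', 'num_comments': 3, 'title': 'First post'},
    {'author': 'someone', 'title': 'Second post', 'score': 10},
]

# Collect every key that appears in any record, in first-seen order.
field_names = []
for submission in latest_submissions:
    for key in submission:
        if key not in field_names:
            field_names.append(key)

with open('my_file.csv', 'w', newline='', encoding='utf-8') as f:
    # restval is written into cells for keys a record does not have.
    writer = csv.DictWriter(f, fieldnames=field_names, restval='')
    writer.writeheader()
    writer.writerows(latest_submissions)
```

With this, every key becomes a column and records missing a key simply get an empty cell.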

You can also replace the loop over the list of dictionaries with the built-in writerows():

with open('list_of_dict_to_csv.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldNames)
    writer.writeheader()
    writer.writerows(lstdc)
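If you only want a few of the parameters as columns (the question mentions author, num_comments, and title), you can pass just those names as the field names and tell DictWriter to skip everything else. This is only a sketch: the records below are hypothetical stand-ins for the real results, and the 'score' field is made up to show an ignored key.

```python
import csv

# Hypothetical records; real results will likely carry many more fields.
latest_submissions = [
    {'author': 'bl00d', 'num_comments': 3, 'title': 'First post', 'score': 1},
    {'author': 'someone', 'num_comments': 0, 'title': 'Second post', 'score': 10},
]

# Only these keys become columns, in this order.
columns = ['author', 'num_comments', 'title']

with open('my_file.csv', 'w', newline='', encoding='utf-8') as f:
    # extrasaction='ignore' drops keys that are not listed in columns
    # instead of raising ValueError (the default behaviour).
    writer = csv.DictWriter(f, fieldnames=columns, extrasaction='ignore')
    writer.writeheader()
    writer.writerows(latest_submissions)
```

The resulting file has one header row followed by one row per submission, which should be straightforward to load in R.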

Hope this helps!