Warm tip: This article is reproduced from stackoverflow.com, please click
python

sum values of two csv files and results to a new file

发布于 2020-03-27 10:21:38

I need some help and don't know how to start with it.

Basically, i have two csv input files (coming from 2 different nodes) and would like to sum the values with the result going to elasticsearch..

Any help is greatly appreciated...

CSV1

node    link    rate-in rate-out
node1   link1   10  20
node1   link2   30  50
node1   link3   40  60

CSV2

node    link    rate-in rate-out
node2   link1   20  10
node2   link2   50  70
node2   link3   80  40

Result --> create elasticsearch index

node    link    rate-in rate-out
allnode link1   30  30
allnode link2   80  120
allnode link3   120 100

Thanks!

Questioner
subok
Viewed
59
Joel Baumert 2019-07-03 20:37

Pandas has good support for reading and manipulating CSV files.


    import pandas as pd

    df = pd.concat([
        pd.read_csv('csv1.csv'),
        pd.read_csv('csv2.csv')
    ])

    result=df.groupby('link', as_index=False).sum()
    result['node'] = 'allnode'

    result.to_csv('result.csv')

Where csv1.csv is:

    node,link,rate-in,rate-out
    node1,link1,10,20
    node1,link2,30,50
    node1,link3,40,60

And csv2.csv is:

    node,link,rate-in,rate-out
    node2,link1,20,10
    node2,link2,50,70
    node2,link3,80,40

The result.csv file will contain:


    link,rate-in,rate-out,node
    link1,30,30,allnode
    link2,80,120,allnode
    link3,120,100,allnode

If you want to reorder the columns you can do that by providing an order list


    result[[
        'node','link','rate-in','rate-out'
    ]].to_csv('result.csv', index=False)