dataframe math pandas python

# python - pandas column transformation to get a cumulative dollar amount

``````
district      item       salesAmount
Arba          pen        10
Arba          pen        20
Arba          pencil     30
Arba          laptop     10000
Arba          coil       100
Arba          coil       200
Cebu          pen        100
Cebu          pen        20
Cebu          laptop     20000
Cebu          laptop     20000
Cebu          fruit      800
Cebu          oil        300
``````

`df.groupby(['district', 'item']).agg({'salesAmount': 'sum'})` gives the following result:

``````
district      item       salesAmount
Arba          laptop     10000
Arba          coil       300
Arba          pencil     30
Arba          pen        30
Cebu          laptop     40000
Cebu          fruit      800
Cebu          oil        300
Cebu          pen        120
``````

``````
district    item    salesAmount cumsalesAmount  totaldistrictAmount
Arba        laptop  10000       10000           10360
Arba        coil    300         10300           10360
Arba        pencil  30          10330           10360
Arba        pen     30          10360           10360
Cebu        laptop  40000       40000           41220
Cebu        fruit   800         40800           41220
Cebu        oil     300         41100           41220
Cebu        pen     120         41220           41220
``````

Lilly

jezrael 2020-01-31 17:49

First aggregate with `sum` over both columns:

``````
print(df.dtypes)
# district       object
# item           object
# salesAmount     int64
# dtype: object

# total sales per (district, item) pair, with the keys kept as columns
df1 = df.groupby(['district', 'item'], as_index=False)['salesAmount'].sum()
``````

Or, equivalently, with `agg`:

``````
df1 = df.groupby(['district', 'item'], as_index=False).agg({'salesAmount': 'sum'})
print(df1)
#   district    item  salesAmount
# 0     Arba    coil          300
# 1     Arba  laptop        10000
# 2     Arba     pen           30
# 3     Arba  pencil           30
# 4     Cebu   fruit          800
# 5     Cebu  laptop        40000
# 6     Cebu     oil          300
# 7     Cebu     pen          120
``````

Then sort with `DataFrame.sort_values`, and use `GroupBy.cumsum` and `GroupBy.transform` with `sum`:

``````
# largest sales first within each district
df1 = df1.sort_values(['district', 'salesAmount'], ascending=[True, False])
# running total within each district
df1['cumsalesAmount'] = df1.groupby('district')['salesAmount'].cumsum()
# district total repeated on every row of that district
df1['totaldistrictAmount'] = df1.groupby('district')['salesAmount'].transform('sum')
# alternative: take the last cumulative value in each district
# df1['totaldistrictAmount'] = df1.groupby('district')['cumsalesAmount'].transform('last')
print(df1)
#   district    item  salesAmount  cumsalesAmount  totaldistrictAmount
# 1     Arba  laptop        10000           10000                10360
# 0     Arba    coil          300           10300                10360
# 2     Arba     pen           30           10330                10360
# 3     Arba  pencil           30           10360                10360
# 5     Cebu  laptop        40000           40000                41220
# 4     Cebu   fruit          800           40800                41220
# 6     Cebu     oil          300           41100                41220
# 7     Cebu     pen          120           41220                41220
``````
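
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and wraps the same three steps (aggregate, sort, cumulate) in one function. The helper name `cumulative_sales` and the final `reset_index` are my own additions for illustration and are not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'district': ['Arba'] * 6 + ['Cebu'] * 6,
    'item': ['pen', 'pen', 'pencil', 'laptop', 'coil', 'coil',
             'pen', 'pen', 'laptop', 'laptop', 'fruit', 'oil'],
    'salesAmount': [10, 20, 30, 10000, 100, 200,
                    100, 20, 20000, 20000, 800, 300],
})


def cumulative_sales(frame):
    """Hypothetical helper: per-item totals plus running and district totals."""
    # Step 1: total sales per (district, item) pair.
    out = frame.groupby(['district', 'item'], as_index=False)['salesAmount'].sum()
    # Step 2: largest sales first within each district.
    out = out.sort_values(['district', 'salesAmount'], ascending=[True, False])
    # Step 3: running total and overall total, both computed per district.
    out['cumsalesAmount'] = out.groupby('district')['salesAmount'].cumsum()
    out['totaldistrictAmount'] = out.groupby('district')['salesAmount'].transform('sum')
    # Renumber the rows so the index follows the sorted order (assumption: the
    # original index is not needed afterwards).
    return out.reset_index(drop=True)


result = cumulative_sales(df)
print(result)
```

The sort has to happen before `cumsum`, because the running total follows the current row order within each district.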