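Problem: during recent feature work I ran into a grouped weighted-sum calculation. Rows are grouped by the field 'CPN'. For each PO line, the Actual CT2R is multiplied by that line's share of the group's total quantity, and the result is named 'Weighted(QTY)CT2R'. The per-line 'Weighted(QTY)CT2R' values are then summed within each 'CPN' to give the total 'Weighted(QTY)CT2R'. The cells highlighted in yellow in the figure below are the target values.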

The calculation logic is as follows:
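In symbols, the logic described above can be sketched as below. The notation is an assumption of this sketch and does not come from the original figure: $G$ is the set of PO lines sharing one CPN, $q_i$ is the quantity of line $i$, and $c_i$ is its Actual CT2R.

```latex
% Per-line value: the line's quantity share within its CPN group, times its Actual CT2R
w_i = \frac{q_i}{\sum_{j \in G} q_j} \, c_i

% Group total: the sum of the per-line values, i.e. a quantity-weighted average of CT2R
W_G = \sum_{i \in G} w_i = \frac{\sum_{i \in G} q_i \, c_i}{\sum_{j \in G} q_j}
```

Written this way, the total 'Weighted(QTY)CT2R' for a CPN is the quantity-weighted mean of the Actual CT2R of its lines.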


The Pandas code that implements this requirement is shown below:
- import pandas as pd
-
- df = pd.DataFrame([['01-0989',10,90],
- ['01-0989',10,90],
- ['01-0989',10,90],
- ['01-0989',10,90],
- ['01-0989',10,90],
- ['01-0989',10,90],
- ['01-0989',10,90],
- ['01-0989',10,90],
- ['01-0989',10,90],
- ['01-0989',200,50],
- ['02-0437',20,80],
- ['02-0437',20,80],
- ['02-0437',20,80]
- ],columns = ['cpn','po_line_qty','actual_ct2r'])
-
- # Group by 'cpn' and sum 'po_line_qty'; store the result as total
- total = df.groupby('cpn').agg({'po_line_qty': 'sum'}).reset_index()
- # Rename 'po_line_qty' to 'total_po_line_qty'
- total = total.rename(columns={'po_line_qty': 'total_po_line_qty'})
- # Left-join df with total on 'cpn'; store the result as new_res
- new_res = pd.merge(df, total, how='left', on='cpn')
-
- def weighted_qty_ct2r(row):
-     # Share of this line's quantity in its CPN group's total quantity
-     scale = row['po_line_qty'] / row['total_po_line_qty']
-     return scale * row['actual_ct2r']
-
- # Create the 'weighted_qty_ct2r' column
- new_res['weighted_qty_ct2r'] = new_res.apply(weighted_qty_ct2r, axis=1)
- # Group by 'cpn' and sum 'weighted_qty_ct2r'; store the result as df_result
- df_result = new_res.groupby('cpn').agg({'weighted_qty_ct2r': 'sum'})
df

total

new_res

df_result
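As a possible alternative to the merge-and-apply approach, the same numbers can be obtained with `groupby(...).transform('sum')`, which broadcasts each group's total back onto its rows and avoids both the join and the row-wise `apply`. This is a minimal sketch, not the method used in the post. It rebuilds the same sample data in a shorter form, and the name `group_qty` is introduced here for illustration.

```python
import pandas as pd

# Same sample data as in the post: nine identical lines plus one large line
# for '01-0989', and three identical lines for '02-0437'.
rows = [['01-0989', 10, 90]] * 9 + [['01-0989', 200, 50]] + [['02-0437', 20, 80]] * 3
df = pd.DataFrame(rows, columns=['cpn', 'po_line_qty', 'actual_ct2r'])

# Total quantity of each CPN, repeated on every row of that group
# (group_qty is a hypothetical helper name, not from the original code).
group_qty = df.groupby('cpn')['po_line_qty'].transform('sum')

# Per-line weighted CT2R: the line's quantity share times its actual CT2R.
df['weighted_qty_ct2r'] = df['po_line_qty'] / group_qty * df['actual_ct2r']

# Sum the per-line values within each CPN.
df_result = df.groupby('cpn')['weighted_qty_ct2r'].sum()
print(df_result.round(4))
```

For this data, '01-0989' comes out at 18100 / 290 (about 62.4138) and '02-0437' at 80.0, since all of its lines have the same CT2R.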
