Note: This article is reproduced from serverfault.com.

Python: Generate values for a new column using a wildcard list search on another column

Posted on 2020-12-10 16:34:03

I am currently assigning different inflow rates (float values) to each product based on its product code, so there should be two columns: 'PRODUCT_CODE' and 'INFLOW_RATE'. The product code has 4 characters and the rules are as follows:

  • If the code starts with 'L','H' or 'M': assign float value = 1.0 to 'INFLOW_RATE' column.
  • If the code is 'SVND' or 'SAVL': assign float value = 0.1 to 'INFLOW_RATE' column.
  • Other cases: assign float value = 0.5 to 'INFLOW_RATE' column.

The sample data is as follows:

[Image: sample data]

There are more than 50 product codes, so I believe it is best to check the conditions and assign values using wildcards. So far I have come up with this code:

import re
CFIn_01 = ['SVND','SAVL']
CFIn_10 = ["M.+","L.+","H.+"]
file_consol['INFLOW_RATE'] = 0.5
file_consol.loc[file_consol['PRODUCT_CODE'].isin(CFIn_01), 'INFLOW_RATE'] = 0.1
file_consol.loc[file_consol['PRODUCT_CODE'].isin(CFIn_10), 'INFLOW_RATE'] = 1.0

However, when I check the result, every row of 'INFLOW_RATE' is still filled with 0.5 instead of following the rules above. I'm not sure what the appropriate code for this problem would be. Any help or advice is appreciated!

Questioner
Zack Nguyen
Anna Semjén 2020-12-11 00:42:56

Create your custom function like you would do with a simple string:

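def my_func(word: str) -> float:
    # Codes starting with 'H', 'L' or 'M' get an inflow rate of 1.0.
    if word.startswith(('H', 'L', 'M')):
        out = 1.0
    # The two specific codes 'SVND' and 'SAVL' get 0.1.
    elif word in ('SVND', 'SAVL'):
        out = 0.1
    # Every other code falls back to 0.5.
    else:
        out = 0.5
    return out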

Then apply the function:

df['INFLOW_RATE'] = df['PRODUCT_CODE'].apply(my_func)
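
As for why the original attempt left everything at 0.5: isin() compares whole values literally, so a pattern such as "M.+" is treated as an ordinary string and never matches a real product code. If you would rather keep the wildcard idea and avoid a row-by-row apply, a vectorized version might look like the sketch below. The sample codes and the names starts_lhm and is_special are made up for illustration, since the real data is not shown in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real product codes are not shown in the post.
df = pd.DataFrame(
    {"PRODUCT_CODE": ["LOAN", "HOME", "MORT", "SVND", "SAVL", "CARD"]}
)
codes = df["PRODUCT_CODE"]

# isin() does exact matching, so regex-like strings match nothing here.
pattern_hits = codes.isin(["M.+", "L.+", "H.+"])  # all False

# str.match() applies a regular expression anchored at the start of each value.
starts_lhm = codes.str.match(r"[LHM]", na=False)

# Exact codes can still be checked with isin().
is_special = codes.isin(["SVND", "SAVL"])

# np.select picks the value of the first condition that is True for each row
# and falls back to the default otherwise.
df["INFLOW_RATE"] = np.select(
    [starts_lhm, is_special],
    [1.0, 0.1],
    default=0.5,
)
```

The order of the conditions only matters if a code could satisfy both rules; with the rules in the question it cannot, because neither 'SVND' nor 'SAVL' starts with 'L', 'H' or 'M'.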