I'm trying to get the number of days between two dates using the function below:
df['date'] = pd.to_datetime(df.date)
# Creating a function that returns the number of days
def calculate_days(date):
    today = pd.Timestamp('today')
    return today - date
# Apply the function to the column date
df['days'] = df['date'].apply(lambda x: calculate_days(x))
The result looks like this:
153 days 10:16:46.294037
but I want it to say 153. How do I handle this?
For performance, you can subtract the values directly without apply.
To avoid loops, use Series.rsub, which subtracts from the right side:
df['date'] = pd.to_datetime(df.date)
df['days'] = df['date'].rsub(pd.Timestamp('today')).dt.days
This works the same as:
df['days'] = (pd.Timestamp('today') - df['date']).dt.days
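As a small self-contained sketch of the two forms above: the sample dates and the fixed reference timestamp `ref` are made up for illustration (they are not from the question), and `ref` stands in for pd.Timestamp('today') only so the numbers stay reproducible.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question
df = pd.DataFrame({"date": ["2023-01-01", "2023-03-15", "2023-06-30"]})
df["date"] = pd.to_datetime(df["date"])

# Assumed fixed reference instead of pd.Timestamp('today'),
# so the result does not change from day to day
ref = pd.Timestamp("2023-07-01 10:16:46")

# rsub computes ref - df['date'] element-wise, without a Python-level loop
days_rsub = df["date"].rsub(ref).dt.days

# The plain operator form gives the same values
days_op = (ref - df["date"]).dt.days

print(days_rsub.tolist())  # [181, 108, 1]
```

Both forms return whole days only, because .dt.days drops the hours, minutes and seconds of each difference.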
If you want to keep your solution:
df['date'] = pd.to_datetime(df.date)
def calculate_days(date):
    today = pd.Timestamp('today')
    return (today - date).days
df['days'] = df['date'].apply(lambda x: calculate_days(x))
Or:
df['date'] = pd.to_datetime(df.date)
def calculate_days(date):
    today = pd.Timestamp('today')
    return today - date
df['days'] = df['date'].apply(lambda x: calculate_days(x)).dt.days
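The two apply variants differ only in where the day count is extracted: .days is an attribute of a single Timedelta, while .dt.days is the accessor for a whole timedelta Series. A minimal sketch, assuming a made-up fixed timestamp `ref` in place of pd.Timestamp('today') and two hypothetical dates:

```python
import pandas as pd

# Assumed fixed "today" so the example is reproducible
ref = pd.Timestamp("2023-07-01 10:16:46")
# Hypothetical dates, not from the question
dates = pd.Series(pd.to_datetime(["2023-01-29", "2023-06-30"]))

# Scalar case: apply passes one Timestamp at a time,
# so the difference is a single Timedelta with a .days attribute
delta = ref - dates.iloc[0]
scalar_days = delta.days

# Series case: apply returns a timedelta Series,
# and .dt.days extracts the whole days from every element
series_days = dates.apply(lambda d: ref - d).dt.days

print(delta)                 # 153 days 10:16:46
print(scalar_days)           # 153
print(series_days.tolist())  # [153, 1]
```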
I tried the shorter code
df['days'] = df['date'].sub(pd.Timestamp('today')).dt.days
but got the error TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'Timestamp'
@ShadowWalker - is this line used?
df['date'] = pd.to_datetime(df.date)
Btw, what is your pandas version?
I corrected a typo but the result has negative values. I tried changing to
pd.Timestamp('today').sub(df['date']).dt.days
but got an error: Timestamp has no attribute sub.
@ShadowWalker - Sorry, it was my typo; you need rsub. The answer was edited.
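The negative values mentioned in the comments come from the direction of the subtraction: Series.sub computes series minus other, while Series.rsub computes other minus series. A rough sketch with made-up dates and an assumed fixed timestamp `ref` in place of pd.Timestamp('today'):

```python
import pandas as pd

# Hypothetical data and a fixed reference timestamp (assumptions for illustration)
dates = pd.Series(pd.to_datetime(["2023-01-01", "2023-06-30"]))
ref = pd.Timestamp("2023-07-01 10:16:46")

# sub: dates - ref, so past dates give negative differences
days_sub = dates.sub(ref).dt.days

# rsub: ref - dates, which is the "days since" value the question asks for
days_rsub = dates.rsub(ref).dt.days

print(days_sub.tolist())   # [-182, -2]
print(days_rsub.tolist())  # [181, 1]
```

Note that the sub result is not simply the rsub result with the sign flipped: a negative Timedelta is stored with its days rounded down, so -(1 day 10:16:46) is represented as -2 days +13:43:14 and .dt.days returns -2.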