Note: This article is reproduced from serverfault.com.

pandas - Getting Out of bounds nanosecond timestamp error while using fillna in python?

Posted on 2020-12-03 06:02:21

I get an "Out of bounds nanosecond timestamp" error when trying to pass a default value to fill the nulls in a column:

df3['check_date']=df3['eventDate0142'].fillna(df3['statusDateTi'].fillna(pd.to_datetime('9999-12-31')))

How can I resolve this?

Questioner
LDF_VARUM_ELLAM_SHERIAAVUM
jezrael 2020-12-03 14:57:06

The problem is that pandas timestamps have a maximum value:

print (pd.Timestamp.max)
2262-04-11 23:47:16.854775807

As a result, pandas raises an error:

print (pd.to_datetime('9999-12-31'))
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 9999-12-31 00:00:00
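
To make the limit concrete, here is a minimal sketch that checks whether a naive Python datetime fits inside the nanosecond Timestamp range before it is handed to pandas. The helper name fits_in_timestamp is hypothetical and not part of pandas; the sketch assumes naive (timezone-free) datetimes.

```python
from datetime import datetime

import pandas as pd


def fits_in_timestamp(value: datetime) -> bool:
    """Hypothetical helper: True if value lies inside the ns Timestamp range."""
    # warn=False silences the notice that nanoseconds are dropped when the
    # bounds are converted to microsecond-resolution datetime objects.
    lower = pd.Timestamp.min.to_pydatetime(warn=False)
    upper = pd.Timestamp.max.to_pydatetime(warn=False)
    # Both bounds were truncated toward the past, so the lower bound itself
    # is just outside the range (strict <) while the upper one is inside.
    return lower < value <= upper


print(fits_in_timestamp(datetime(2019, 1, 1)))    # True
print(fits_in_timestamp(datetime(9999, 12, 31)))  # False
```

A check like this only tells you that 9999-12-31 cannot be stored as a nanosecond timestamp; it does not make the value representable.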

Sample:

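import numpy as np
import pandas as pd

df1 = pd.DataFrame({'eventDate0142': [np.nan, np.nan, '2016-04-01'],
                    'statusDateTi': [np.nan, '2019-01-01', '2017-04-01']})

# Convert every column to datetimes; missing values become NaT
df3 = df1.apply(pd.to_datetime)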

print (df3)
  eventDate0142 statusDateTi  
0           NaT          NaT 
1           NaT   2019-01-01  
2    2016-04-01   2017-04-01  

One possible solution is to use pure Python dates, but then all pandas datetime-like methods fail, because every value is converted to a datetime.date object:

from datetime import  date

print (date.fromisoformat('9999-12-31'))
9999-12-31


df3['check_date'] = (df3['eventDate0142'].dt.date
                        .fillna(df3['statusDateTi'].dt.date
                        .fillna(date.fromisoformat('9999-12-31'))))
print (df3)
  eventDate0142 statusDateTi  check_date
0           NaT          NaT  9999-12-31
1           NaT   2019-01-01  2019-01-01
2    2016-04-01   2017-04-01  2016-04-01

print (df3.dtypes)
eventDate0142    datetime64[ns]
statusDateTi     datetime64[ns]
check_date               object
dtype: object
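
The cost of the object dtype can be seen in a small sketch. It rebuilds the check_date values by hand (an assumption made to keep the example self-contained) and shows that the .dt accessor is unavailable, so date attributes have to be read element by element.

```python
from datetime import date

import pandas as pd

# Same values as the check_date column above, rebuilt by hand for this sketch.
check_date = pd.Series([date(9999, 12, 31), date(2019, 1, 1), date(2016, 4, 1)])
print(check_date.dtype)  # object

try:
    check_date.dt.year
except AttributeError as exc:
    # pandas only offers .dt on datetime-like dtypes, not on object columns.
    print("no .dt accessor:", exc)

# Plain Python attribute access still works, one element at a time.
years = check_date.map(lambda d: d.year)
print(years.tolist())  # [9999, 2019, 2016]
```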

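Alternatively, convert the timestamps to daily periods with Series.dt.to_period; Periods can then represent spans that fall outside the Timestamp bounds: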

print (pd.Period('9999-12-31'))
9999-12-31

df3['check_date'] = (df3['eventDate0142'].dt.to_period('d')
                        .fillna(df3['statusDateTi'].dt.to_period('d')
                        .fillna(pd.Period('9999-12-31'))))
print (df3)
  eventDate0142 statusDateTi  check_date
0           NaT          NaT  9999-12-31
1           NaT   2019-01-01  2019-01-01
2    2016-04-01   2017-04-01  2016-04-01

print (df3.dtypes)
eventDate0142    datetime64[ns]
statusDateTi     datetime64[ns]
check_date            period[D]
dtype: object

If you assign the periods back to all columns:

df3['eventDate0142'] = df3['eventDate0142'].dt.to_period('d')
df3['statusDateTi'] = df3['statusDateTi'].dt.to_period('d')
df3['check_date'] = (df3['eventDate0142']
                        .fillna(df3['statusDateTi']
                        .fillna(pd.Period('9999-12-31'))))
print (df3)
  eventDate0142 statusDateTi  check_date
0           NaT          NaT  9999-12-31
1           NaT   2019-01-01  2019-01-01
2    2016-04-01   2017-04-01  2016-04-01

print (df3.dtypes)
eventDate0142    period[D]
statusDateTi     period[D]
check_date       period[D]
dtype: object
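
Unlike the pure Python dates, a period[D] column keeps the vectorized .dt accessor and comparisons. The sketch below rebuilds the check_date values by hand (an assumption for self-containment) and uses the names sentinel and is_default, which are illustrative only, to flag the rows that received the default value.

```python
import pandas as pd

# Default value used for rows where both source columns were missing.
sentinel = pd.Period('9999-12-31', freq='D')

check_date = pd.Series(
    [sentinel, pd.Period('2019-01-01', freq='D'), pd.Period('2016-04-01', freq='D')],
    dtype='period[D]',
)
print(check_date.dtype)  # period[D]

# The .dt accessor works on period columns.
print(check_date.dt.year.tolist())  # [9999, 2019, 2016]

# Vectorized comparison against the sentinel marks the filled-in rows.
is_default = check_date == sentinel
print(is_default.tolist())  # [True, False, False]
```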