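For my current project, I plan to filter a JSON file by time range in a series of loop iterations, each using a slightly shifted range. However, the code below raises
TypeError: Invalid comparison between dtype=datetime64[ns] and date
on the line
after_start_date = df["Date"] >= start_date
I have already tried changing the date format in both the Python code and the JSON file. Is there a neat way to align the date types/formats?
The JSON file has the following format: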
[
{"No":"121","Stock Symbol":"A","Date":"05/11/2017","Text Main":"Sample text"}
]
The corresponding code looks like this:
import string
import json
import pandas as pd
import datetime
from dateutil.relativedelta import *

# Loading and reading dataset
file = open("Glassdoor_A.json", "r")
data = json.load(file)
df = pd.json_normalize(data)
df['Date'] = pd.to_datetime(df['Date'])

# Create an empty dictionary
d = dict()

# Filtering by date
start_date = datetime.date.fromisoformat('2017-01-01')
end_date = datetime.date.fromisoformat('2017-01-31')
for i in df.iterrows():
    start_date += relativedelta(months=+3)
    end_date += relativedelta(months=+3)
    print(start_date)
    print(end_date)
    after_start_date = df["Date"] >= start_date
    before_end_date = df["Date"] <= end_date
    between_two_dates = after_start_date & before_end_date
    filtered_dates = df.loc[between_two_dates]
    print(filtered_dates)
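You can use pd.to_datetime('2017-01-31') instead of datetime.date.fromisoformat('2017-01-31'), and likewise for start_date. pd.to_datetime returns a pandas Timestamp, which can be compared directly with the datetime64[ns] column, whereas a plain datetime.date cannot.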
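As a minimal sketch of that change applied to the whole loop: the sample records, the helper name filter_windows, the fixed number of windows, the starting window, and the month/day/year date format are all assumptions made for illustration, not taken from the question. pd.DateOffset is used for the three-month shift here in place of relativedelta, simply to stay within pandas types.

```python
import pandas as pd

# Made-up records in the same shape as the JSON file from the question.
records = [
    {"No": "121", "Stock Symbol": "A", "Date": "05/11/2017", "Text Main": "Sample text"},
    {"No": "122", "Stock Symbol": "A", "Date": "08/02/2017", "Text Main": "Another sample"},
]
df = pd.json_normalize(records)
# Assumption: dates are month/day/year; change the format if the file is day-first.
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")


def filter_windows(frame, start, end, step_months=3, n_windows=4):
    """Hypothetical helper: collect (start, end, rows) for successive shifted windows."""
    # pd.to_datetime yields Timestamps, which compare cleanly with datetime64[ns].
    start = pd.to_datetime(start)
    end = pd.to_datetime(end)
    step = pd.DateOffset(months=step_months)
    results = []
    for _ in range(n_windows):
        # between() is inclusive on both ends by default.
        mask = frame["Date"].between(start, end)
        results.append((start, end, frame.loc[mask]))
        # Shift both bounds before the next iteration.
        start += step
        end += step
    return results


windows = filter_windows(df, "2017-04-01", "2017-06-30")
for start, end, rows in windows:
    print(start.date(), end.date(), len(rows))
```

Note that adding whole months to an end-of-month date clamps to the last valid day and then keeps that day (30 June becomes 30 September, then 30 December), so if the windows must always end on the last day of a month, the end bound needs to be recomputed rather than shifted.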
I hope this helps!