I have a pandas dataframe with Time and values columns. I am trying to create two new columns 'START_TIME" and 'END_TIME'. It is medication related data and it is stored poorly in the database so I am trying to transform the table. In this case, the medication for a patient started at 2018-11-07 23:59:32 with a dose value of 80.o so, I want to capture that as the start time of the medication and end time is the first zero after the last value. That would be one round of medication. Whenever a new value starts it is considered as the second round of medication and I'd like to capture the start time and end time in the following way as explained earlier.
Time Values
2018-11-07 23:59:32 80.0
2018-11-08 04:35:09 80.0
2018-11-08 05:31:24 40.0
2018-11-24 18:29:30 0.0
2018-11-24 18:33:14 0.0
2018-11-26 17:39:31 20.0
2018-11-26 18:51:07 20.0
2018-11-26 21:04:35 0.0
2018-11-26 21:05:20 0.0
2018-11-26 21:13:44 0.0
2018-11-26 21:25:57 0.0
2018-11-29 02:19:57 7.0
2018-12-09 16:02:06 5.0
2018-12-09 16:33:03 2.5
2018-12-09 21:02:10 0.0
I believe it cannot be done with a simple for and if loop as I started with a simple step and it failed
for i in df['Values']:
if i+1 != 0:
df['START_TIME'] = df['TIME'].copy()
Expected DataFrame:
Time Values START_TIME END_TIME
2018-11-07 23:59:32 80.0 2018-11-07 23:59:32
2018-11-08 04:35:09 80.0
2018-11-08 05:31:24 40.0
2018-11-24 18:29:30 0.0 2018-11-24 18:29:30
2018-11-24 18:33:14 0.0
2018-11-26 17:39:31 20.0 2018-11-26 17:39:31
2018-11-26 18:51:07 20.0
2018-11-26 21:04:35 0.0 2018-11-26 21:04:35
2018-11-26 21:05:20 0.0
2018-11-26 21:13:44 0.0
2018-11-26 21:25:57 0.0
2018-11-29 02:19:57 7.0 2018-11-29 02:19:57
2018-12-09 16:02:06 5.0
2018-12-09 16:33:03 2.5
2018-12-09 21:02:10 0.0 2018-12-09 21:02:10
I'd really appreciate if I can get some help.
IIUC create the condition by using diff
, then the value equal to -1 and 1 will be the end and start point
s=df.Values.eq(0).astype(int).diff().fillna(-1)
df.loc[s==-1,'START_TIME']=df.Time
df.loc[s==1,'END_TIME']=df.Time
df
Out[334]:
Time Values START_TIME END_TIME
0 2018-11-07 23:59:32 80.0 2018-11-07 23:59:32 NaT
1 2018-11-08 04:35:09 80.0 NaT NaT
2 2018-11-08 05:31:24 40.0 NaT NaT
3 2018-11-24 18:29:30 0.0 NaT 2018-11-24 18:29:30
4 2018-11-24 18:33:14 0.0 NaT NaT
5 2018-11-26 17:39:31 20.0 2018-11-26 17:39:31 NaT
6 2018-11-26 18:51:07 20.0 NaT NaT
7 2018-11-26 21:04:35 0.0 NaT 2018-11-26 21:04:35
8 2018-11-26 21:05:20 0.0 NaT NaT
9 2018-11-26 21:13:44 0.0 NaT NaT
10 2018-11-26 21:25:57 0.0 NaT NaT
11 2018-11-29 02:19:57 7.0 2018-11-29 02:19:57 NaT
12 2018-12-09 16:02:06 5.0 NaT NaT
13 2018-12-09 16:33:03 2.5 NaT NaT
14 2018-12-09 21:02:10 0.0 NaT 2018-12-09 21:02:10