I have a DataFrame that looks like df = pd.DataFrame({'col1': [.8,.9,1,1,1,.9,1,.9,.8]}).
My goal: once the value 1 is found in 'col1', remove the next five rows.
Example
col1
0 0.8
1 0.9
2 1.0
3 1.0
4 1.0
5 0.9
6 1.0
7 0.9
8 0.8
Expected Output
col1
0 0.8
1 0.9
2 1.0
3 0.8
Any ideas?
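You could use numpy.r_ to generate the integer positions of the rows to keep:
import numpy as np
# Position (not label) of the first 1
position_of_1 = np.argmax(df.col1.eq(1)) # df.col1.eq(1).idxmax() not fool-proof
# Keep rows up to and including the first 1, skip the next five, keep the rest
integers = np.r_[: position_of_1 + 1, range(position_of_1 + 6, len(df))]
df.iloc[integers]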
col1
0 0.8
1 0.9
2 1.0
8 0.8
Thanks to @Ben for the suggestion: np.argmax is safer than idxmax here, because it returns a position rather than a label, which matters when the index is not numeric or is not the default 0, 1, 2, ...
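To illustrate the point about idxmax, here is a minimal sketch that assumes the same data but with a string index (the letters 'a' to 'i' are made up for the example). idxmax returns a label, which cannot be used for positional arithmetic with iloc, while np.argmax returns an integer position.

```python
import numpy as np
import pandas as pd

# Same values as in the question, but with a hypothetical non-numeric index
df = pd.DataFrame({'col1': [.8, .9, 1, 1, 1, .9, 1, .9, .8]},
                  index=list('abcdefghi'))

mask = df.col1.eq(1)

# idxmax returns the index *label* of the first True
label = mask.idxmax()                        # 'c'

# np.argmax returns the integer *position* of the first True.
# .to_numpy() is an assumption made here so np.argmax sees a plain array;
# the answer passes the Series directly.
position = int(np.argmax(mask.to_numpy()))   # 2

# Positions to keep: everything up to the first 1, then skip five rows
keep = np.r_[: position + 1, position + 6 : len(df)]
result = df.iloc[keep]
```

With a label such as 'c', an expression like label + 1 would not be usable for iloc, which is why the positional version is the safer choice.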
I see it is not working with df = pd.DataFrame({'col1': [.8,.9,1,1,1,.9,1,.9,.8,.7,.6,.5,1,.4,.3,.5,.7,.9,.5,.4]})
@JohnsonFrancis, what should the output be?
Nice, but I would change df.col1.eq(1).idxmax() to np.argmax(df.col1.eq(1)) to make this a little more bullet-proof. (Consider the case where df's index is not 0, 1, 2, ...)
Thanks @Ben, I'll edit now. That's a great suggestion.
I believe the requirement is to skip the next 5 rows whenever a 1 is found. So if there are multiple 1s, this code won't work; it only handles the first 1 found.
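For the multiple-1s case, one possible sketch is a simple scan. The expected output for the longer DataFrame was never given in the thread, so this assumes that a 1 falling inside an already-dropped window does not trigger another drop, and that scanning resumes right after the dropped rows. The function name drop_after_ones and its parameters are made up for illustration.

```python
import numpy as np
import pandas as pd

def drop_after_ones(df, col='col1', n=5):
    """Drop the n rows following each 1 in `col` (hypothetical helper).

    Assumption: 1s inside a dropped window are ignored.
    """
    values = df[col].to_numpy()
    keep = np.ones(len(df), dtype=bool)
    i = 0
    while i < len(values):
        if values[i] == 1:
            # Mark the next n rows for removal and jump past them
            keep[i + 1 : i + 1 + n] = False
            i += n + 1
        else:
            i += 1
    return df[keep]

# The longer example from the comment above
df_long = pd.DataFrame({'col1': [.8, .9, 1, 1, 1, .9, 1, .9, .8, .7,
                                 .6, .5, 1, .4, .3, .5, .7, .9, .5, .4]})
result_long = drop_after_ones(df_long)
```

Under these assumptions the rows at positions 3 to 7 and 13 to 17 are removed. If a fresh 0, 1, 2, ... index is wanted, as in the question's expected output, chaining .reset_index(drop=True) would do that.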