Note: this post is translated from stackoverflow.com. Original question: python - How to drop_duplicates

python duplicates

python - How to drop_duplicates

Posted on 2020-03-27 11:59:03

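I have raw data like the example below. At time t1 a variable has the value x1, and it should be recorded at time t2 only if its value is no longer equal to x1. Is there a way in Python to compare each value in a DataFrame with the previous value and drop the row if they are the same? I tried the following code, but it does not work. Please help.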

df
time                 Variable   Value
2014-07-11 19:50:20  Var1       10
2014-07-11 19:50:30  Var1       20
2014-07-11 19:50:40  Var1       20
2014-07-11 19:50:50  Var1       30
2014-07-11 19:50:60  Var1       20 
2014-07-11 19:50:70  Var2       50
2014-07-11 19:50:80  Var2       60
2014-07-11 19:50:90  Var2       70

Code:

for y in df.time:
    for x in df.Value:
        if y == y:
            if x == x:
                df1 = df.drop_duplicates(subset = ['time', 'Variable', 'Value'], keep=False) 
            else:
                df1 = df.drop_duplicates(['time', 'Variable', 'Value'])

Expected output:

df
time                 Variable   Value
2014-07-11 19:50:20  Var1       10
2014-07-11 19:50:30  Var1       20
2014-07-11 19:50:50  Var1       30
2014-07-11 19:50:60  Var1       20 
2014-07-11 19:50:70  Var2       50
2014-07-11 19:50:80  Var2       60
2014-07-11 19:50:90  Var2       70

Asked by

NguyenTram

Views

24

DYZ 2017-06-08 05:39

df.drop_duplicates(subset=['Variable','Value'],keep='first')
#                time Variable  Value
#2014-07-11  19:50:20     Var1     10
#2014-07-11  19:50:30     Var1     20
#2014-07-11  19:50:50     Var2     30
#2014-07-11  19:50:60     Var2     40
#2014-07-11  19:50:70     Var2     50
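
As a rough check, here is a minimal sketch of the same call applied to the data as it currently appears in the question. The column values are copied from the question; the `time` column is kept as plain strings, which is an assumption made here because seconds such as `60` and `70` would not parse as timestamps.

```python
import pandas as pd

# Data copied from the question; "time" is stored as plain strings (assumption).
df = pd.DataFrame({
    "time": [
        "2014-07-11 19:50:20", "2014-07-11 19:50:30", "2014-07-11 19:50:40",
        "2014-07-11 19:50:50", "2014-07-11 19:50:60", "2014-07-11 19:50:70",
        "2014-07-11 19:50:80", "2014-07-11 19:50:90",
    ],
    "Variable": ["Var1"] * 5 + ["Var2"] * 3,
    "Value": [10, 20, 20, 30, 20, 50, 60, 70],
})

# Keep only the first row for each (Variable, Value) pair, wherever it occurs.
deduped = df.drop_duplicates(subset=["Variable", "Value"], keep="first")
print(deduped)
```

Note that this drops every later repeat of a (Variable, Value) pair, so the 19:50:60 row (Var1, 20) is removed along with the 19:50:40 row, even though it does not directly follow another 20.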

NguyenTram 2017-06-08 05:52:42

Thank you. So why do we pass only 2 columns in subset rather than 3?

DYZ 2017-06-08 05:54:54

You do not want the same variable with the same value.

NguyenTram 2017-06-08 07:23:55

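Thank you very much. Your answer works, but my data has one problem: at time t3, for the same Var1, value 3 is the same as value 1, and I want to keep value 3 for Var1 at t3 because it is not consecutive with t1. I have updated my data.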

DYZ 2017-06-08 07:25:19

That is a different question and needs a different answer.

NguyenTram 2017-06-08 07:45:02

Thank you. I will ask another question.
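
For the follow-up case raised in the comments (drop a row only when its value equals the immediately preceding value of the same variable), one possible approach, not taken from the answer above, is to compare each value with the previous one inside its group using `groupby(...).shift()`. This is only a sketch: it assumes the rows are already sorted by time, and it keeps `time` as plain strings because seconds such as `60` would not parse as timestamps.

```python
import pandas as pd

# Data copied from the question; "time" is stored as plain strings (assumption).
df = pd.DataFrame({
    "time": [
        "2014-07-11 19:50:20", "2014-07-11 19:50:30", "2014-07-11 19:50:40",
        "2014-07-11 19:50:50", "2014-07-11 19:50:60", "2014-07-11 19:50:70",
        "2014-07-11 19:50:80", "2014-07-11 19:50:90",
    ],
    "Variable": ["Var1"] * 5 + ["Var2"] * 3,
    "Value": [10, 20, 20, 30, 20, 50, 60, 70],
})

# Previous value within each Variable; NaN for the first row of each group.
previous = df.groupby("Variable")["Value"].shift()

# Keep a row when its value differs from the previous one in the same group.
# A comparison against NaN is always "not equal", so first rows are kept.
result = df[df["Value"] != previous]
print(result)
```

On the question's data this keeps every row except 19:50:40, which matches the expected output posted in the question, including the later Var1 value of 20 at 19:50:60.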
