Source: stackoverflow.com

grouping id merge r reduce

r - Merge rows with the same ID but with overlapping variables

Posted on 2020-04-15 11:13:52

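I have data in R with over 6000 observations and 96 variables.

The data relate to groups of individuals and their activities, etc. If a group returned, the group ID number was recorded again and a new observation was made. I need to merge the rows by ID so that the # of individuals takes the highest number recorded, while the activities etc. are the combination of both observations.

The data contain # of individuals, activities, impacts, time of arrival, etc. The problem is that some observations were split across two rows, so activity for the same group may be recorded in a separate row. Both observations share the same group ID, but one may record the # of individuals plus some activities or impacts, while the second may be incomplete, containing only the group ID and an impact (in addition to the impacts in the first record). The # of individuals in a group never changes, so I need some way to combine the rows so that activities are additive, # of visitors takes the highest value, the arrival time is the earliest recorded, and the departure time is the later of the two observations.

Does anyone know how to merge observations by group ID while applying a different merge rule depending on the variable?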
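For illustration, the situation described above might look like the following toy data frame. The column names (`group_id`, `individuals`, `activity_count`, `impact`, `arrival`, `departure`) and all values are hypothetical and are not taken from the actual data set.

```r
# Hypothetical example: group 1 was recorded twice, and its second row is incomplete.
visits <- data.frame(
  group_id       = c(1, 1, 2),
  individuals    = c(4, NA, 2),
  activity_count = c(3, 1, 5),
  impact         = c("litter", "noise", "none"),
  arrival        = c(9, 11, 10),
  departure      = c(12, 14, 13),
  stringsAsFactors = FALSE
)

# Desired merged row for group 1, following the rules in the question:
#   individuals    -> highest value recorded
#   activity_count -> sum of both rows
#   impact         -> both impacts combined
#   arrival        -> earliest value
#   departure      -> latest value
print(visits)
```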


Asked by

Andrew Torsney


Em Laskey 2020-02-04 21:58

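I'm not sure if this is really what you want, but to combine rows of a data frame based on multiple conditions, you can use the dplyr package and its summarise() function. I generated some data to use directly in R; you will have to modify the code to fit your needs.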

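# load dplyr for %>%, group_by() and summarise()
library(dplyr)

# generate example data: 20 group IDs, each recorded twice
ID <- rep(1:20, 2)
visitors <- sample(1:50, 40, replace = TRUE)
impact <- sample(rep(c("a", "b", "c", "d", "e"), 8))
arrival <- sample(rep(8:15, 5))
departure <- sample(rep(16:23, 5))

df <- data.frame(ID, visitors, impact, arrival, departure)
df$impact <- as.character(df$impact)  # make sure impact is character, not factor

# summarise rows with identical ID:
# highest visitor count, earliest arrival, latest departure, all impacts combined
df_summary <- df %>%
  group_by(ID) %>%
  summarise(visitors = max(visitors), arrival = min(arrival),
            departure = max(departure), impact = paste0(impact, collapse = ", "))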
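The question also mentions incomplete second rows and additive activities, which the generated data above does not contain. A minimal sketch of how the same summarise() approach might handle them is shown below. The column names `hiking` and `fishing` and all values are hypothetical, activities are assumed to be stored as numeric counts, and `across()` assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical data: group 1 appears twice and its second row is incomplete (NA).
visits <- data.frame(
  ID        = c(1, 1, 2),
  visitors  = c(4, NA, 2),
  hiking    = c(1, 0, 1),   # assumed numeric activity counts
  fishing   = c(0, 2, 0),
  impact    = c("litter", "noise", NA),
  arrival   = c(9, 11, 10),
  departure = c(12, 14, 13),
  stringsAsFactors = FALSE
)

merged <- visits %>%
  group_by(ID) %>%
  summarise(
    visitors  = max(visitors, na.rm = TRUE),               # highest recorded group size
    across(c(hiking, fishing), ~ sum(.x, na.rm = TRUE)),   # activities are additive
    impact    = paste(impact[!is.na(impact)], collapse = ", "),  # non-missing impacts
    arrival   = min(arrival, na.rm = TRUE),                # earliest arrival
    departure = max(departure, na.rm = TRUE)               # latest departure
  )

print(merged)
```

With many activity columns, the column selection inside `across()` could be replaced by a tidyselect helper such as `starts_with()`, provided the columns share a naming pattern. Note that `max()` and `min()` with `na.rm = TRUE` return `-Inf` or `Inf` with a warning when every value in a group is missing, so such groups may need separate handling.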

Hope this helps!

Andrew Torsney 2020-02-05 22:15:10

This is exactly what I was looking for, and it works perfectly for my data. I really appreciate your help.

Em Laskey 2020-02-05 23:04:41

Glad I could help! If you are happy with the answer, could you accept it? Thanks!

Andrew Torsney 2020-02-11 17:39:02

Sorry, this is the first question I have asked, so I did not realise I had to accept the answer. It is accepted now.
