Warm tip: This article is reproduced from stackoverflow.com, please click
excel r excel-2019

How to rearrange cells within a row based on partial string matches in Excel, R

发布于 2020-03-27 10:24:36

I have a large excel spreadsheet with 8 columns. Simplified it looks like this:

Identity column1  column2  column3  column4  column5  column6  column7
     1     tk=tr   cont=fb  sourc=a  camp=b    med=c    nm=he   ad=fst
     2     camp=a  cont=fb  sourc=a   med=b     NA        NA     NA
     3     vid=tr  sourc=b  camp=b    med=c   cont=fb     NA     NA

I need to organize it so that all of the cells containing cont are in the same column, all of the cells containing camp are in the same column, and so on. They are only partial matches though, as you can see some say camp=b, and some say camp=a, for example. I need each column to only have one type of string in it, one for cont, one for sourc, one for camp, one for med.

The "identity" column is formatted correctly, and the trk and vid values are unnecessary, but they only appear in some of the rows.

I looked into conditional formatting, advanced filters and sorts, ordering by custom lists, and vba all for excel. I'm happy to convert to an R file and work there.

I would like the output to look like this:

Identity  column1  column2  column3  column4  column5  column6  Column7
   1       cont=fb  sourc=a  camp=b    med=c    NA        NA       NA
   2       cont=fb  sourc=a  camp=a    med=b    NA        NA       NA
   3       cont=fb  sourc=b  camp=b    med=c    NA        NA       NA

The cont=, sourc=, camp=, and med= values are the only ones I need in the final product, so it would be great if I could ignore/delete the other ones in producing this, however not entirely necessary.

Thank you so much for any help you could offer!

Questioner
jawk23
Viewed
129
cybernetic.nomad 2019-07-03 23:36

What you want can be done with INDEX/MATCH

Use the following formula in each of the four columns:

=INDEX(B2:H2,MATCH("cont=*",B2:H2,))

=INDEX(B2:H2,MATCH("sourc=*",B2:H2,))

=INDEX(B2:H2,MATCH("camp=*",B2:H2,))

=INDEX(B2:H2,MATCH("med=*",B2:H2,))

And populate down

enter image description here