Note: This article is reproduced from stackoverflow.com.
json r

How to modify a txt file in R that is not in the correct R format?

Published on 2020-03-27 10:27:41

I have several .txt files with multiple data points that do not have the correct header format. I'm trying to take out the unnecessary data so R can read the rest. Some parts need to be removed, and the X and Y columns need to be identified. Here's an example of what the text file contains, where six refers to the X component and siy refers to the Y component:

{
    "description": "",
    "name": "1ml",
    "references": [
        {
            "siclassids": [
            ],
            "siname": "1ml",
            "sipoints": [
                {
                    "six": 397.32000732421875,
                    "siy": 0.8571428656578064
                },
                {
                    "six": 400.20001220703125,
                    "siy": 0.75
                },
                {
                    "six": 403.08999633789062,
                    "siy": 0.60000002384185791

There are hundreds of these data points in several different files. Is there any way I could get R to organize them and plot the data in graphs?

Thanks!

Questioner: Emma Madigan
Viewed: 64
jay.sf 2019-07-03 23:45

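You may use regular expressions. grep identifies the lines of interest, i.e. those that hold the values. gsub captures the axis letter ("x" or "y") together with the corresponding value and joins the two with a comma. strsplit then splits each string at the comma, which yields a list; do.call(rbind.data.frame, ...) binds that list into a data frame and setNames labels its columns.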
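As a small illustration of the gsub and strsplit steps, the sketch below applies them to two lines copied from the question's file. The vector name `sample_lines` is made up for this illustration, and the pattern is written in the same spirit as the answer's rather than copied from it; it assumes non-negative values in plain decimal notation.

```r
# Two sample lines copied from the question's file (hypothetical vector name).
sample_lines <- c('                    "six": 397.32000732421875,',
                  '                    "siy": 0.75')

# Step 1: collapse each line to "axis,value".
# ([xy]) captures the axis letter, (\\d+\\.?\\d*) captures the number.
pairs <- gsub(".*si([xy])\\D*(\\d+\\.?\\d*).*", "\\1,\\2", sample_lines)
pairs
# [1] "x,397.32000732421875" "y,0.75"

# Step 2: split at the comma; the result is a list of character vectors.
parts <- strsplit(pairs, ",")
parts[[2]]
# [1] "y"    "0.75"
```

Each list element holds the axis letter first and the value, still as a string, second, which is why a conversion with as.numeric is needed afterwards.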

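l <- readLines("dp.txt")
# keep only the lines that hold a "six" or "siy" value
l <- l[grep('"si[xy]"', l)]
# reduce each line to "axis,value", e.g. "x,397.32000732421875"
l <- gsub(".*si([xy])\\D*(\\d+\\.?\\d*).*", "\\1,\\2", l)
# split at the comma and bind the pieces into a two-column data frame
l <- setNames(do.call(rbind.data.frame, strsplit(l, ",")),
              c("axis", "coord"))
# as.character() first: in R < 4.0 the columns are factors, and
# as.numeric() on a factor returns level codes instead of the values
l$coord <- as.numeric(as.character(l$coord))
l
#   axis       coord
# 1    x 397.3200073
# 2    y   0.8571429
# 3    x 400.2000122
# 4    y   0.7500000
# 5    x 403.0899963
# 6    y   0.6000000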
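
Since the question asks for graphs from several files, the long axis/coord layout can be turned into one row per point. The following is only a sketch under stated assumptions: `read_points` is a hypothetical helper name, every "six" line is assumed to be paired with a "siy" line in file order, and values are assumed to be non-negative and in plain decimal notation. The temporary file stands in for one of the real .txt files.

```r
# Hypothetical helper: read one file, return a data frame with x and y columns.
read_points <- function(path) {
  l <- readLines(path)
  l <- l[grep('"si[xy]"', l)]                       # lines holding values
  l <- gsub(".*si([xy])\\D*(\\d+\\.?\\d*).*", "\\1,\\2", l)
  m <- do.call(rbind, strsplit(l, ","))             # character matrix: axis, value
  val <- as.numeric(m[, 2])
  # Assumption: x and y values alternate, so the n-th x belongs to the n-th y.
  data.frame(x = val[m[, 1] == "x"], y = val[m[, 1] == "y"])
}

# Demo on a temporary file built from the question's sample points.
tmp <- tempfile(fileext = ".txt")
writeLines(c('    "name": "1ml",',
             '            "sipoints": [',
             '                    "six": 397.32000732421875,',
             '                    "siy": 0.8571428656578064',
             '                    "six": 400.20001220703125,',
             '                    "siy": 0.75',
             '                    "six": 403.08999633789062,',
             '                    "siy": 0.60000002384185791'), tmp)
pts <- read_points(tmp)
```

With this layout, `plot(pts, type = "b")` draws y against x, and `lapply(list.files(pattern = "\\.txt$"), read_points)` would apply the same helper to every .txt file in the working directory.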