How to split a column in a dataframe into multiple columns in R? -


i've dataframe 1 varaible supposedly containing data of multiple columns, such id, year, months etc in 1 single column.

this weather data internet. it's badly organized. has no column name, , data pushed 1 column.

each row says characters 1-11 represent id, , next 4 characters represent year, , next few characters represent climatic measure jan month, , flags (blanks or chatacters), , february month , again few flags (blanks or chatacters), , march , flags (blanks or chatacters).. , on. detailed structure of data explained here

ush00011084 1974  1628     1606     1363     1039     1343      903     2536      839     2048      358     1118      754    ush00011084 1975  1714     1837     1544f    2828     1758     1898     4848     2110     2217     1197     1512     1445    ush00011084 1976   825      541      989      600     2502     1448      971     1157      704      899     1340      856a   ush00011084 1977  1319      528     2665      473      285     1590     2337     3733      961      434     1259      981    ush00011084 1978  2722     1023     1574     1214     2919     2136     1548      988      875       46      917     1379    ush00011084 1979  1927     2671     1285     1063      966     1160     2282     1120      979      292     1470      812    ush00011084 1980  1639      368     3799     2005     1423     1826. 1   917      423     1449     1353     1039      287    ush00011084 1981    38b    2846     1170      127     1334      995     2022     1343      467      413      513     1909    ush00011084 1982  1631     3097      910     1127      879     1416     2103     1482     1060      551      863     1702    ush00011084 1983  1207     2210     2604     1925      820     1714      662     1235     1204      394     1145     2219    

now, i'm trying organize data dataframe separating data respective columns. tried multiple things failed badly.

require(reshape2)

 colsplit(only_req_col, " ", c("id", "year","jan", "j1", "feb", "f1", "mar", "m1", "apr", "a1", "may", "my1","jun","jn1", "jul", "jl1", "aug","a1","sep", "s1","oct", "o1","nov", "n1", "dec","d1" )) 

didn't work!

require(tidyr)

separate(data = data_2, col = v1, = c("id", "year","jan", "j1", "feb", "f1", "mar", "m1", "apr", "a1", "may", "my1","jun","jn1", "jul", "jl1", "aug","a1","sep", "s1","oct", "o1","nov", "n1", "dec","d1" ), sep = "") 

didn't work!

require(reshape2)

data_3 <- colsplit(gsub(pattern = "[0-9]"," ",data_2),            names= c("id", "year","jan", "j1", "feb", "f1", "mar", "m1", "apr", "a1", "may", "my1","jun","jn1", "jul", "jl1", "aug","a1","sep", "s1","oct", "o1","nov", "n1", "dec","d1" )) 

didn't work!

i keep monthly data each year , id, , ignore flags.

i achieved using substr each month's position, involved lot of manual counting correct positions of months:

id <- substr(data_3[,1], 1,11) year= substr(data_3[,1], 13,16) jan = substr(data_3[,1], 18,22) feb = substr(data_3[,1], 27,31) mar = substr(data_3[,1], 36,40) apr = substr(data_3[,1], 45,49) may = substr(data_3[,1], 53,57) jun = substr(data_3[,1], 63,67) jul = substr(data_3[,1], 72,76) aug = substr(data_3[,1], 81,85) sep = substr(data_3[,1], 90,95) oct = substr(data_3[,1], 99,103) nov = substr(data_3[,1], 108,112) dec = substr(data_3[,1], 117,121)   prcp_data <- data.frame(id, year, jan, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec) 

now output looks fine:

> prcp_data               id year   jan   feb   mar   apr   may   jun   jul   aug    sep   oct   nov   dec 1    ush00011084 1890 -9999 -9999 -9999 -9999  -999 -9999 -9999 -9999 -9999  -9999 -9999   432 2    ush00011084 1891  1397  1425  1461   419    69  1702  1080   437   508      0  2362  1333 3    ush00011084 1892  3162  1118   650   406    96  1981  3114  3442   762    254   419 -9999 4    ush00011084 1893  1359  1544  1181   965   185   876  1080  1638   876   1613  1237  1237 5    ush00011084 1894   610  4188  2002   572   118   673 -9999 -9999 -9999     76 -9999   191 6    ush00011084 1895 -9999   381  1016  1016   101  1016   762 -9999 -9999    508 -9999   762 7    ush00011084 1896  1118  3404  1499 -9999    81  2794  2375   470   356   1270   864   572 8    ush00011084 1897   622  3124  1207  1105    12    64  1867  2489     0      0 -9999 -9999 9    ush00011084 1900 -9999 -9999  1857  1788    57  3292  1989   993  1552   1646   488  1542 10   ush00011084 1926 -9999  1404  2619   905   127  1723  2149  2950  3477    884   823   900 

any better solutions achieve way did?


Comments

Popular posts from this blog

Magento/PHP - Get phones on all members in a customer group -

php - Bypass Geo Redirect for specific directories -

php - .htaccess mod_rewrite for dynamic url which has domain names -