R - Reading specific data from a large dataset based on criteria to avoid reading the entire file into memory


Software: RStudio
Version: 0.98.1102
Operating system: Windows 7 Professional

Issue #1: I have a .txt file of 100 MB+. It has 4 variables, with 500,000 observations for each variable.
Issue #2: Assuming column1 is a column of dates read in as factors, is it possible to change the class of column1 to Date using the colClasses argument of read.csv()?
If I read the file via:

mydata <- read.csv("myfile", sep = ";", na.strings = "?", stringsAsFactors = FALSE)

Issue #1
The file loads indefinitely on my computer due to the size of the file.

The file has the following format:

column1    column2    column3
dog        bird       apple
cat        dove       orange
rat        sparrow    kiwi
may        bird       apple
cat        dove       orange
rat        sparrow    kiwi

I'm trying to figure out how to do the following:
1. Read only the rows of the data set where column1 has "dog"
2. Read only the rows of the data set where column1 has "dog" and column2 has "bird"

Things I have tried so far: 1. I read that I can load the entire data set and then subset it, but I want to avoid that. The reason is that the file is too large to load initially. Instead, I want to load only specific data based on criteria.

Issue #2
Assume column1 is in the form 05/01/2015 and has the class "factor". Is it possible to change the class of column1 to "Date" using the colClasses argument of read.csv? Perhaps like this?

mydata <- read.csv("myfile", sep = ";", na.strings = "?", stringsAsFactors = FALSE, colClasses = c(column1 = as.Date(column1)))

Or perhaps this?

mydata <- read.csv("myfile", sep = ";", na.strings = "?", stringsAsFactors = FALSE, colClasses = c(column1 = strptime(column1 %mm%dd%yy)))

You can read the data in chunks, say 1000 lines at a time, and subset each chunk.

temp <- read.csv('file.csv', nrows = 1000, stringsAsFactors = FALSE)
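A minimal sketch of what the chunked approach could look like in base R. The helper name `read_filtered` and its `keep` argument are hypothetical, and the sketch assumes an unquoted header line and the question's ";" separator. Every column is read as character so that chunks always combine cleanly.

```r
# Hypothetical helper: read a delimited file chunk by chunk and keep only
# the rows for which keep(chunk) is TRUE, so the full file is never held
# in memory at once.
read_filtered <- function(path, keep, sep = ";", chunk_size = 1000) {
  con <- file(path, open = "r")
  on.exit(close(con))

  # Read the header once so every chunk gets the same column names
  # (assumes the header fields are not quoted).
  header <- strsplit(readLines(con, n = 1), sep, fixed = TRUE)[[1]]

  pieces <- list()
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break  # end of file

    chunk <- read.table(text = lines, sep = sep, col.names = header,
                        na.strings = "?", colClasses = "character",
                        stringsAsFactors = FALSE)

    # which() drops rows where the condition evaluates to NA.
    pieces[[length(pieces) + 1]] <- chunk[which(keep(chunk)), , drop = FALSE]
  }
  do.call(rbind, pieces)
}
```

With the question's columns, the two filters would be `read_filtered("myfile", function(d) d$column1 == "dog")` and `read_filtered("myfile", function(d) d$column1 == "dog" & d$column2 == "bird")`.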

But using a loop is not a good idea in R. So, I'd prefer using sqldf, with `condition` replaced by the actual filter:

library(sqldf)
power <- read.csv.sql("file.csv", sql = "select * from file where condition", header = TRUE)
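As an illustration, the two filters from the question could be written as follows. This is a sketch that assumes the sqldf package is installed and that the column names match the sample. The temporary file is only a small stand-in for the real data.

```r
library(sqldf)

# Small stand-in file using the question's column names and ";" separator.
path <- tempfile(fileext = ".txt")
writeLines(c("column1;column2;column3",
             "dog;bird;apple",
             "cat;dove;orange",
             "dog;dove;kiwi"), path)

# 1. Rows where column1 is "dog". Inside the SQL statement the input is
#    always referred to as `file`.
dogs <- read.csv.sql(path,
                     sql = "select * from file where column1 = 'dog'",
                     header = TRUE, sep = ";")

# 2. Rows where column1 is "dog" and column2 is "bird".
dog_birds <- read.csv.sql(path,
                          sql = "select * from file where column1 = 'dog' and column2 = 'bird'",
                          header = TRUE, sep = ";")
```

The filtering happens in a temporary SQLite database, so only the matching rows are returned to R as a data frame.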

See more options on how to do this in the question "How do I read only lines that fulfil a condition from a csv into R?"
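For issue #2, `colClasses` expects class names given as strings, not the result of calling a function on a column that does not exist yet. One possible approach, sketched here, is to read the column as character and convert it afterwards. The format string `"%m/%d/%Y"` assumes that 05/01/2015 means month/day/year, and the temporary file is only a stand-in for the real data.

```r
# Small stand-in file with a date column in the question's format.
path <- tempfile(fileext = ".txt")
writeLines(c("column1;column2",
             "05/01/2015;dog",
             "12/31/2014;cat"), path)

# Read column1 as plain character; columns not named in colClasses are
# still type-guessed by read.csv.
mydata <- read.csv(path, sep = ";", na.strings = "?",
                   stringsAsFactors = FALSE,
                   colClasses = c(column1 = "character"))

# Convert after reading, stating the format explicitly
# (assumption: month/day/year).
mydata$column1 <- as.Date(mydata$column1, format = "%m/%d/%Y")
```

Reading as character first avoids the intermediate factor entirely, so no level-to-string conversion is needed before calling `as.Date`.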

