R - Reading specific data from a large dataset based on criteria, to avoid reading the entire file into memory
Software: RStudio
Version: 0.98.1102
Operating system: Windows 7 Professional
Issue #1: I have a .txt file that is over 100 MB. It has 4 variables, with around 500,000 observations for each variable.
Issue #2: Assuming column1 is a column of dates stored as factors, is it possible to change the class of column1 to Date using the colClasses argument of read.csv()?
If I read the file via:
mydata <- read.csv("myfile", sep = ";", na.strings = "?", stringsasfactors = false)
Issue #1
The file loads indefinitely on my computer due to the size of the file.
The file has the following format:
column1 column2 column3
dog bird apple
cat dove orange
rat sparrow kiwi
may bird apple
cat dove orange
rat sparrow kiwi
I'm trying to figure out how to do the following:
1. Read only the rows of the data set where column1 has "dog".
2. Read only the rows of the data set where column1 has "dog" and column2 has "bird".
Things I have been trying so far: I know I could load the entire data set and then subset it, but I want to avoid that, because the file is too large to load in the first place. Instead, I want to load only the specific data that meets the criteria.
Issue #2
Assuming column1 is in the form of 05/01/2015 and has the class "factor", is it possible to change the class of column1 to "Date" using the colClasses argument of read.csv()? Perhaps like this?
mydata <- read.csv("myfile", sep = ";", na.strings = "?", stringsasfactors = false, colclasses = c(column1 =as.date(column1))
Or perhaps this:
mydata <- read.csv("myfile", sep = ";", na.strings = "?", stringsasfactors = false, colclasses = c(column1 =strptime(column1 %mm%dd%yy))
You can read the data in chunks, 1000 lines at a time, and subset them.
temp <- read.csv('file.csv', nrows = 1000, stringsAsFactors = FALSE)
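For example, a rough sketch of that chunked loop over a connection (the file name, separator, column names, and the "dog" filter are taken from the question and may need adjusting):

con <- file("file.csv", open = "r")
invisible(readLines(con, n = 1))        # skip the header line
cols <- c("column1", "column2", "column3")
keep <- list()
repeat {
  chunk <- tryCatch(
    read.csv(con, nrows = 1000, header = FALSE, col.names = cols,
             stringsAsFactors = FALSE),
    error = function(e) NULL            # read.csv errors once the file is exhausted
  )
  if (is.null(chunk)) break
  keep[[length(keep) + 1]] <- chunk[chunk$column1 == "dog", ]  # keep only matching rows
  if (nrow(chunk) < 1000) break         # last, partial chunk reached
}
close(con)
dogs <- do.call(rbind, keep)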
But using a loop is not a good idea in R, so I'd prefer using sqldf:
library(sqldf)
power <- read.csv.sql("file.csv", sql = "select * from file where <condition>", header = TRUE)
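Applied to the question's criteria, the condition could look something like this (assuming a comma-separated file.csv with the column names from the example above):

library(sqldf)
# keep only the rows where column1 is "dog" and column2 is "bird"
dogs <- read.csv.sql("file.csv",
                     sql = "select * from file where column1 = 'dog' and column2 = 'bird'",
                     header = TRUE)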
See more options on how to do this in the question "How can I read only lines that fulfil a condition from a csv into R?".