Golang file reading only reads the last line
So I took some publicly available data, which looks like this file:
http://expirebox.com/download/b149b744768fb11aee9c5e26ad409bcc.html
,,,% of total expenditure,,,
function code,type of activity,expenditure,dollars/student (ada),"this district (ada 49,497)",all unified school districts,statewide average
1000-1999ÊÊ,instructionÊÊ,"$249,397,226","$5,039",42%,62%,62%
1000,instruction,"$247,472,790ÊÊ","$5,000",42%,48%,49%
1110,special education: separate classes,"$1,004,074",$20,n/a,n/a,n/a
1120,special education: resource specialist instruction,"$781,629",$16,n/a,n/a,n/a
1130,special education: supplemental aids & services in regular classrooms,"$46,747",$1,n/a,n/a,n/a
1180,special education: nonpublic agencies/schools (npa/s),n/a,n/a,n/a,n/a,n/a
1190,special education: other specialized instructional services,"$91,985",$2,n/a,n/a,n/a
1100-1199,instruction - special education,"$1,924,436ÊÊ",$39,0%,14%,13%
"subtotal, instruction",,"$249,397,226","$5,039",42%,62%,62%
2000-2999ÊÊ,instruction-related servicesÊÊ,"$132,783,414","$2,683",22%,12%,12%
2100,instructional supervision , administration,"$89,551,041","$1,809",n/a,n/a,n/a
2110,instructional supervision,n/a,n/a,n/a,n/a,n/a
2120,instructional research,n/a,n/a,n/a,n/a,n/a
2130,curriculum development,"$348,369",$7,n/a,n/a,n/a
2140,in-house instructional staff development,"$19,855",$0,n/a,n/a,n/a
2150,instructional administration of special projects,n/a,n/a,n/a,n/a,n/a
2100-2199,instructional supervision , administration,"$89,919,265ÊÊ","$1,817",15%,4%,4%
2200,administrative unit (au) of multidistrict selpa,$0,$0,0%,0%,0%
2420,"instructional library, media, , technology","$8,295,033ÊÊ",$168,1%,1%,1%
2490,other instructional resources,"$538,734",$11,n/a,n/a,n/a
2495,parent participation,"$97,830",$2,n/a,n/a,n/a
2490-2495,other instructional resources,"$636,565ÊÊ",$13,0%,1%,0%
2700,school administration,"$33,932,551ÊÊ",$686,6%,7%,7%
"subtotal, instruction-related services",,"$132,783,414","$2,683",22%,12%,12%
3000-3999ÊÊ,pupil servicesÊÊ,"$45,325,938",$916,8%,8%,8%
4000-4999ÊÊ,ancillary servicesÊÊ,"$2,207,263",$45,0%,1%,1%
5000-5999ÊÊ,community servicesÊÊ,$0,$0,0%,0%,0%
6000-6999ÊÊ,enterpriseÊÊ,"$4,264",$0,0%,0%,0%
7000-7999ÊÊ,general administrationÊÊ,"$27,916,858",$564,5%,5%,6%
8000-8999ÊÊ,plant servicesÊÊ,"$55,172,247","$1,115",9%,11%,10%
9000-9999ÊÊ,other outgoÊÊ,"$81,981,716",n/a,14%,2%,2%
"total expenditures, activities",,"$594,788,926","$12,017",100%,100%,100%
It's in CSV.
I have tried this code:
    file, err := os.Open("expenses.csv")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()

    scanner := bufio.NewScanner(file)
    for scanner.Scan() {
        fmt.Println(scanner.Text())
    }
    if err := scanner.Err(); err != nil {
        log.Fatal(err)
    }
and this:
    content, err := ioutil.ReadFile("expenses.csv")
    lines := strings.Split(string(content), "\n")
    fmt.Println(lines)
    check(err)

    dat, err := os.Open("expenses.csv")
    check(err)
    defer dat.Close()

    reader := csv.NewReader(dat)
    reader.LazyQuotes = true
    reader.FieldsPerRecord = -1
    rawCSVdata, err := reader.ReadAll()
    check(err)
    fmt.Println(rawCSVdata)
    for _, each := range rawCSVdata {
        fmt.Println(each)
    }
where check is:
    func check(e error) {
        if e != nil {
            panic(e)
        }
    }
In both cases the result is:
"total expenditures, activities",,"$594,788,926","$12,017",100%,100%,100%,1%15%,4%,4%aa,n/a,n/anified school districts,statewide average
rather than all of the lines.
Why is it reading only the last line?
The basic problem is that your file has \r line endings. It also isn't valid UTF-8. Together, these are going to cause the scanner a lot of trouble.
First, you can see what's in the file using xxd:
    00000000: 2c2c 2c25 206f 6620 546f 7461 6c20 4578  ,,,% of Total Ex
    00000010: 7065 6e64 6974 7572 652c 2c2c 0d46 756e  penditure,,,.Fun
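If you look, you'll see the line ending is 0d, which is \r. The scanner's default line splitting needs either \r\n or \n, so it returns the whole file as a single line. When that single line is printed to a terminal, each \r moves the cursor back to the start of the line, so every record overwrites the previous one and only the last record is left visible.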
Next, you may run into trouble because the file isn't UTF-8. The Ê in there is 0xca, which is not a valid UTF-8 encoding. You can see it in xxd again:
    000000b0: 3939 39ca ca2c 494e 5354 5255 4354 494f  999..,INSTRUCTIO
    000000c0: 4eca ca2c 2224 3234 392c 3339 372c 3232  N..,"$249,397,22
Go will just ship along the bytes as they are (displayed as Ê), which is what a lot of editors try to do as well, but it's likely to cause trouble.
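To confirm the encoding problem from Go rather than xxd, something like the following could be used. It is a minimal sketch: firstInvalidUTF8 is a hypothetical helper, and the sample bytes are copied from the start of the dump at offset 0xb0.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstInvalidUTF8 returns the offset of the first byte that does not
// begin a valid UTF-8 sequence, or -1 if b is valid UTF-8.
func firstInvalidUTF8(b []byte) int {
	for i := 0; i < len(b); {
		r, size := utf8.DecodeRune(b[i:])
		// RuneError with size 1 is how DecodeRune reports bad encoding.
		if r == utf8.RuneError && size == 1 {
			return i
		}
		i += size
	}
	return -1
}

func main() {
	// "999", 0xca, 0xca, ",IN" as seen in the xxd output.
	sample := []byte{0x39, 0x39, 0x39, 0xca, 0xca, 0x2c, 0x49, 0x4e}
	fmt.Println(utf8.Valid(sample))       // false
	fmt.Println(firstInvalidUTF8(sample)) // 3
}
```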
If possible, reformat the file to use either Unix or Windows line endings, in UTF-8.
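If reformatting the file by hand isn't an option, the same cleanup can be sketched in Go before handing the data to encoding/csv. The helper names normalize and parse are hypothetical, and dropping the invalid bytes is an assumption that they carry no data; if the file's real encoding is known, decoding it properly would be the better choice. The sample input is an abbreviated, made-up stand-in for the file.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"log"
)

// normalize converts "\r\n" and bare "\r" line endings to "\n" and
// drops bytes that are not valid UTF-8 (ASSUMPTION: they are padding,
// like the 0xca bytes in the dump, and can be discarded).
func normalize(raw []byte) []byte {
	out := bytes.ReplaceAll(raw, []byte("\r\n"), []byte("\n"))
	out = bytes.ReplaceAll(out, []byte("\r"), []byte("\n"))
	return bytes.ToValidUTF8(out, nil)
}

// parse reads CSV records from raw after normalizing it.
func parse(raw []byte) ([][]string, error) {
	r := csv.NewReader(bytes.NewReader(normalize(raw)))
	r.FieldsPerRecord = -1 // rows in this file have varying field counts
	return r.ReadAll()
}

func main() {
	// Abbreviated sample with \r line endings and stray 0xca bytes.
	sample := []byte("code,activity,expenditure\r" +
		"1000-1999\xca\xca,instruction\xca\xca,\"$249,397,226\"\r")
	records, err := parse(sample)
	if err != nil {
		log.Fatal(err)
	}
	for _, rec := range records {
		fmt.Printf("%q\n", rec)
	}
}
```

For the real file, the bytes returned by os.ReadFile("expenses.csv") could be passed to parse in place of the sample.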