Reading a file in text format and writing it into a CSV file in data format using Python
I am trying to convert a folder full of text files into CSV format in order to perform further data analysis. The first line of each text file has the header strings separated by ; and the second line onwards contains the corresponding data. I am not able to read the file in text format and write the CSV file in data format. The piece of code below gives me errors about not being able to convert strings to the buffer interface.
    import os
    import sys
    import csv

    # open a file
    full_path = "c:\\documents and settings\\30695\\my documents\\database"
    dirs = os.listdir(full_path)

    # print all the files and directories
    for file in dirs:
        path = full_path + '\\' + file
        print(file)
        filename = os.path.splitext(file)[0]
        print(filename)
        txt_file = filename
        csv_file = filename
        in_txt = csv.reader(open(full_path + '\\' + txt_file + '.txt', "rt"), delimiter=';')
        out_csv = csv.writer(open(full_path + '\\' + csv_file + '.csv', 'wb'))
        out_csv.writerows(in_txt)
I am not sure whether this is right or wrong, because I want the CSV file to be separated by the delimiter ; and the numbers should be available in data format for calculations.
The input file looks a bit like this:
    "createtime";"grid cosphi";"grid current";"grid frequency";"grid kw";"grid var";"grid voltage";"pitch angle 1";"pitch angle 2";"pitch angle 3";"rotor rpm";"temp. 5 214";"temp. 6 217";"temp. 9 227";"winddirection";"windspeed"
    9/21/14 11:30:01 pm;n/a;n/a;49.963;211688.734;n/a;n/a;-1.06;-1.039;-1.119;19.379;47.167;36;64;n/a;6.319
    9/21/14 11:40:01 pm;n/a;n/a;50.002;170096.297;n/a;n/a;-1.003;-0.96;-1.058;19.446;47.182;36.084;63.772;n/a;5.628
    9/21/14 11:50:01 pm;n/a;n/a;50.021;175038.734;n/a;n/a;-0.976;-0.961;-1.082;18.805;47;36.223;63.153;n/a;5.577
    9/22/14 12:00:01 am;n/a;n/a;49.964;229942.016;n/a;n/a;-1.047;-1.018;-1.066;18.957;47.125;36.293;63.766;n/a;6.431
    9/22/14 12:10:01 am;n/a;n/a;49.908;200873.844;n/a;n/a;-0.997;-0.985;-1.06;19.229;47.028;36.334;63.962;n/a;6.076
    9/22/14 12:20:01 am;n/a;n/a;49.934;234467.609;n/a;n/a;-1.028;-0.986;-1.001;18.995;47.056;36.401;63.732;n/a;6.067
If you run this code under Python 2.6 or 2.7 it is fine. Python 3.x is more picky about how you open files and write to them.
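The 2.7 documentation works with files opened in binary mode for reading and writing. In 3.x the csv module reads and writes str objects, so a file opened with 'wb' rejects them, which is where the buffer interface error comes from. You should open the files for reading or writing with 'r' and 'w' respectively (leaving out 't' or 'b', since text mode is the default). The csv documentation also recommends passing newline='' so that the module handles line endings itself, which avoids extra blank rows on Windows:

    in_txt = csv.reader(open(os.path.join(full_path, txt_file + '.txt'), "r", newline=''), delimiter=';')
    out_csv = csv.writer(open(os.path.join(full_path, csv_file + '.csv'), 'w', newline=''))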
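To see the difference between the two modes without touching any files, here is a minimal sketch using in-memory streams: io.BytesIO stands in for a file opened with 'wb' and io.StringIO for one opened with 'w'. The helper name try_write is made up for this illustration.

```python
import csv
import io


def try_write(stream):
    """Write one row to stream; return None on success, else the error."""
    try:
        csv.writer(stream, delimiter=';').writerow(["createtime", "windspeed"])
    except TypeError as exc:
        return exc
    return None


# A binary stream behaves like a file opened with 'wb' in Python 3:
# csv.writer hands it str objects, which it rejects.
binary_error = try_write(io.BytesIO())

# A text stream behaves like a file opened with 'w' and accepts them.
text_stream = io.StringIO()
text_error = try_write(text_stream)

print(repr(binary_error))
print(repr(text_stream.getvalue()))
```

The first print shows a TypeError, the second shows the row that was written to the text stream.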
I would update the whole code somewhat:
    import os
    import sys
    import csv

    # open a file
    full_path = r"c:\documents and settings\30695\my documents\database"
    dirs = os.listdir(full_path)

    # print all the files and directories
    for file in dirs:
        path = os.path.join(full_path, file)
        print(file)
        filename, ext = os.path.splitext(file)
        if ext != '.txt':
            continue
        print(filename)
        txt_file = filename
        csv_file = filename
        # newline='' lets the csv module handle line endings itself
        in_txt = csv.reader(open(os.path.join(full_path, txt_file + '.txt'), "r", newline=''), delimiter=';')
        out_csv = csv.writer(open(os.path.join(full_path, csv_file + '.csv'), 'w', newline=''))
        out_csv.writerows(in_txt)
The changes are: using a raw string for the path so you don't have to escape the backslashes; replacing the concatenation of strings to create the full filename with os.path.join() (I had to because I tested on Linux); and skipping non-.txt files, because once you create .csv files in the full_path directory, they are going to be spewed out by listdir() as well.
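The filename handling can also be pulled out into a small generator, sketched here under a couple of assumptions of mine: the name txt_to_csv_paths is hypothetical, and unlike the code in this answer it compares the extension case-insensitively. The demo runs in a temporary directory so it does not depend on any real folder.

```python
import os
import tempfile


def txt_to_csv_paths(folder):
    """Yield (txt_path, csv_path) pairs for every .txt file in folder."""
    for name in sorted(os.listdir(folder)):
        base, ext = os.path.splitext(name)
        if ext.lower() != '.txt':
            continue  # skips .csv files left by an earlier run, among others
        yield os.path.join(folder, name), os.path.join(folder, base + '.csv')


# Demo: create a few empty files and see which ones are picked up.
with tempfile.TemporaryDirectory() as folder:
    for name in ("a.txt", "a.csv", "b.TXT", "notes.md"):
        open(os.path.join(folder, name), "w").close()
    pairs = [(os.path.basename(txt), os.path.basename(out))
             for txt, out in txt_to_csv_paths(folder)]

print(pairs)
```

Only the two text files are paired with an output name; the existing a.csv and notes.md are ignored.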
What I do to generate YAML files out of CSV files, in the yaml utility of ruamel.yaml, is iterate over the lines in the input and convert them with process_line:
    import dateutil.parser  # https://pypi.python.org/pypi/python-dateutil

    def process_line(line):
        """Convert the elements of a line, trying int, float and date in turn."""
        ret_val = []
        for elem in line:
            try:
                res = int(elem)
                ret_val.append(res)
                continue
            except ValueError:
                pass
            try:
                res = float(elem)
                ret_val.append(res)
                continue
            except ValueError:
                pass
            try:
                res = dateutil.parser.parse(elem)
                ret_val.append(res)
                continue
            except (TypeError, ValueError):
                # strings such as n/a are not dates; keep them unchanged
                pass
            ret_val.append(elem)
        return ret_val
To use that you need to replace out_csv.writerows(in_txt) with something like:

    for line in in_txt:
        out_csv.writerow(process_line(line))
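If you would rather not depend on dateutil, the same idea can be sketched with the standard library only. This is an illustration under assumptions, not the ruamel.yaml code: the names TIME_FORMAT, convert_cell and convert_stream are made up, the timestamp layout is guessed from the sample input (month/day/two-digit year, 12-hour clock with am/pm), and the output is written with ; as the delimiter, as asked in the question. In-memory streams stand in for the opened files.

```python
import csv
import io
from datetime import datetime

TIME_FORMAT = "%m/%d/%y %I:%M:%S %p"  # assumed from the sample input


def convert_cell(text):
    """Return text as int, float or datetime if it parses, else unchanged."""
    for convert in (int, float, lambda s: datetime.strptime(s, TIME_FORMAT)):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


def convert_stream(src, dst):
    """Copy ';'-separated rows from src to dst and return the typed rows."""
    reader = csv.reader(src, delimiter=';')
    writer = csv.writer(dst, delimiter=';')
    header = next(reader)  # first line holds the column names
    writer.writerow(header)
    rows = []
    for line in reader:
        row = [convert_cell(cell) for cell in line]
        writer.writerow(row)
        rows.append(row)
    return header, rows


sample = io.StringIO(
    '"createtime";"grid frequency";"temp. 6 217";"winddirection"\n'
    '9/21/14 11:30:01 pm;49.963;36;n/a\n'
)
out = io.StringIO()
header, rows = convert_stream(sample, out)
print(rows[0])
```

Keep in mind that a CSV file only stores text, so the conversion matters for the calculations you do in Python on the returned rows, not for what ends up in the file.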