Python - pysqlite: inserting Unicode data raises an 8-bit bytestring error


I know similar permutations of this question have been asked before, but the answers don't seem to shed light on what I am doing wrong here.

I am trying to insert this row:

    (pdb) print row
    ['886', '39', '83474', '0', '0', '0', '0', '0', '1.00', 'd', '20070813', 'r', 'c', 'b', "sock 4pk", '\xe9\x9e\x8b\xe5\xad\x90\xe5\xb0\xba\xe5\xaf\xb86-9.5/24-27.5cm', 'pr']

into this table:

    create table item ("whs" int,"dept" int,"item" int,"dsun" int,"oh" int,"ohrtv" int,"adjp" int," adjn" int,"sell" text,"stat" text,"lsldt" int,"cat1" text,"cat2" text,"cat3" text,"des1" text,"sgn3" text,"unit" text);

The sgn3 column seems to be causing the problems. It is defined as text, and the data being inserted is UTF-8. Why am I receiving this sqlite3 error?

ProgrammingError: 'You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestr...= str). It is highly recommended that you instead just switch your application to Unicode strings.'

Here is the code doing the insert:

    query = 'insert into %s values(%s)' % (
        self.tablename,
        ','.join(['?' for field in row])
    )
    self.con.execute(query, row)

And here is the procedure that creates the generator of records to be inserted:

    def encode_utf_8(self, csv_data, csv_encoding):
        """Decodes from 'csv_encoding' and encodes to UTF-8.

        Accepts an open CSV file encoded using a scheme recognized
        by Python. Returns a generator.

        """
        for line in csv_data:
            try:
                yield line.decode(csv_encoding).encode('utf-8')
            except UnicodeDecodeError:
                next

That is one of the most helpful error messages I've ever seen. Do what it says: feed it unicode objects, not UTF-8-encoded str objects. In other words, lose the .encode('utf-8'), or maybe follow it later with a decode('utf-8'). What exactly is csvdata?
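The advice above can be sketched as follows. This is a minimal illustration, not the asker's actual code: the two-column table and the names raw_row and stored are assumptions made to keep the example short, and an in-memory database stands in for the real one.

```python
import sqlite3

# Hypothetical two-column table standing in for the question's `item` table.
con = sqlite3.connect(':memory:')
con.execute('create table item ("des1" text, "sgn3" text)')

# UTF-8 encoded byte strings, as the question's generator produces them.
raw_row = [
    b'sock 4pk',
    b'\xe9\x9e\x8b\xe5\xad\x90\xe5\xb0\xba\xe5\xaf\xb86-9.5/24-27.5cm',
]

# Decode every field to a Unicode string before handing the row to sqlite3.
row = [field.decode('utf-8') for field in raw_row]

query = 'insert into item values(%s)' % ','.join(['?' for field in row])
con.execute(query, row)

# Read the value back; sqlite3 returns text columns as Unicode strings.
stored = con.execute('select sgn3 from item').fetchone()[0]
```

Decoding at the last moment before the insert is only one option; dropping the .encode('utf-8') in the generator has the same effect.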

If you ever get a UnicodeDecodeError in your existing code:

(1) you should do something more useful with it than what appears to be intended (sweeping it under the carpet)

(2) you may wish to change next to pass
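One possible way to do something more useful with the error is to log the offending line instead of silently dropping it. The sketch below is an assumption about what "more useful" could mean here; the function name decode_lines and the choice of the logging module are illustrative and not taken from the question. It also yields Unicode directly rather than re-encoding to UTF-8.

```python
import logging


def decode_lines(csv_data, csv_encoding):
    """Yield each line of csv_data decoded to Unicode.

    Lines that cannot be decoded are reported through logging and skipped.
    """
    for line_number, line in enumerate(csv_data, 1):
        try:
            yield line.decode(csv_encoding)
        except UnicodeDecodeError as exc:
            # Record which line failed and why, then move on to the next one.
            logging.warning('line %d skipped: %s', line_number, exc)


# Usage: the second item is not valid Big5 and is reported, not yielded.
sample = [u'\u978b\u5b50'.encode('big5'), b'\xff\xff', b'abc']
decoded = list(decode_lines(sample, 'big5'))
```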

Response to comment

Haha, it is a useful error message

Haha??? I wasn't joking; it tells you what to do.

csvdata is a CSV file, in this case encoded using Big5, in Python 2.x

What are you calling "a CSV file":

(1) csvdata = open('my_big5_file', 'rb')

(2) csvdata = csv.reader(open('my_big5_file', 'rb'))

(3) other; please specify

If I chose not to encode to UTF-8, the rows would be ASCII, right?

Utterly wrong. bytes_read_from_file.decode('big5') produces a unicode object. You may wish to read the Python Unicode HOWTO.
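A short check makes the point concrete. The sample text here (two characters taken from the question's row) is an assumption standing in for real file contents, and is_bytes is a name introduced only for this illustration.

```python
# Two CJK characters from the question's data, encoded as Big5 to stand in
# for bytes read from the file.
bytes_read_from_file = u'\u978b\u5b50'.encode('big5')

# Decoding yields a Unicode string, not ASCII and not a byte string.
text = bytes_read_from_file.decode('big5')
is_bytes = isinstance(text, bytes)
```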

So do I need to explicitly change them to unicode before saving to the database?

No, they are unicode already. Depending on what csvdata is, you may want to encode them as UTF-8 to get them through the csv mechanism, and decode them later.
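The encode-then-decode round trip might look roughly like this. To keep the sketch independent of the Python version, a plain split on commas stands in for the csv module; that substitution is an assumption, it does not handle quoted fields, and the sample line is made up from values in the question's row.

```python
# A Unicode line, as produced by decoding the Big5 input.
line = u'sock 4pk,\u978b\u5b50,pr'

# Encode to UTF-8 so that a byte-oriented parser can process the line.
encoded_line = line.encode('utf-8')

# Stand-in for the csv step: split the bytes on commas (no quoting support).
byte_fields = encoded_line.split(b',')

# Decode each field back to Unicode before passing the row to sqlite3.
fields = [field.decode('utf-8') for field in byte_fields]
```

Splitting UTF-8 bytes on an ASCII comma is safe because every byte of a multi-byte UTF-8 sequence is 0x80 or above.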

