Python line.replace returns UnicodeEncodeError

- May 15, 2012

I have a TeX file generated from reST source using Sphinx. It is encoded in UTF-8 without BOM (according to Notepad++), is named final_report.tex, and has the following content:

    % generated sphinx.
    \documentclass[letterpaper,11pt,english]{sphinxmanual}
    \usepackage[utf8]{inputenc}
    \begin{document}

    \chapter{preface}
    krimson4 nice programming language. umlauts äöüßÅö. “double quotation mark” problem. johnny’s apostrophe allows connecting multiple ports. components include data describe how ellipsis … software interoperability – dash – not ok.
    \end{document}

Now, before I compile the TeX source to PDF, I want to replace some lines in the TeX file to get nicer results. My script was inspired by another question.

    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    import os

    newfil = os.path.join("build", "latex", "final_report.tex-new")
    oldfil = os.path.join("build", "latex", "final_report.tex")

    def freplace(old, new):
        with open(newfil, "wt", encoding="utf-8") as fout:
            with open(oldfil, "rt", encoding="utf-8") as fin:
                for line in fin:
                    print(line)
                    fout.write(line.replace(old, new))
        os.remove(oldfil)
        os.rename(newfil, oldfil)

    # Raw strings keep the LaTeX backslashes literal.
    freplace(r'\documentclass[letterpaper,11pt,english]{sphinxmanual}', r'\documentclass[letterpaper, 11pt, english]{book}')

This works on Ubuntu 16.04 with Python 2.7 and Python 3.5, but fails on Windows with Python 3.4. The error message is:

      File "c:\python34\lib\encodings\cp850.py", line 19, in encode
        return codecs.charmap_encode(input,self.errors,encoding_map)[0]
    UnicodeEncodeError: 'charmap' codec can't encode character '\u201c' in position 11: character maps to <undefined>

where 201c stands for the left double quotation mark. If I remove the problematic character, the script proceeds until it finds the next problematic character.
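The failure can be reproduced in isolation, without any file handling. The following is a minimal sketch, assuming only the standard codecs that ship with Python; the variable names are illustrative and not part of the original script. It shows that the character from the traceback encodes fine as UTF-8 but has no mapping in cp850.

```python
# -*- coding: utf-8 -*-
# Left double quotation mark, the character named in the traceback.
ch = u"\u201c"

# UTF-8 can represent every Unicode code point, so this succeeds.
utf8_bytes = ch.encode("utf-8")

# cp850 has no slot for U+201C, so encoding raises UnicodeEncodeError.
try:
    ch.encode("cp850")
    cp850_ok = True
except UnicodeEncodeError:
    cp850_ok = False

print(repr(utf8_bytes), cp850_ok)
```

Any step that pushes the text through cp850 will therefore stop at the first such character, which matches the behaviour described above.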

In the end, I need a solution that works on Linux and Windows with Python 2.7 and 3.x. I have tried quite a lot of the solutions suggested here on SO, but have not yet found one that works for me...

You need to specify the correct encoding with encoding="the_encoding":

    with open(oldfil, "rt", encoding="utf-8") as fin, open(newfil, "wt", encoding="utf-8") as fout:

If you don't, the platform's preferred encoding is used.

From the documentation for open:

In text mode, if encoding is not specified the encoding used is platform dependent: locale.getpreferredencoding(False) is called to get the current locale encoding.
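For a version that runs on both Python 2.7 and 3.x, one option is io.open, which accepts the encoding argument on both. The sketch below is an assumption-laden variant of freplace, not the asker's final code: it takes the file path as a parameter, and the temporary file and its content are hypothetical. It also leaves out print(line), because print encodes with the console's encoding rather than the file's, and a Windows console code page such as cp850 is the likely source of the traceback in the question.

```python
# -*- coding: utf-8 -*-
import io
import os
import shutil
import tempfile


def freplace(oldfil, old, new, encoding="utf-8"):
    """Replace `old` with `new` in every line of `oldfil`, in place."""
    newfil = oldfil + "-new"
    # io.open accepts encoding= on Python 2.7 and 3.x alike.
    with io.open(oldfil, "rt", encoding=encoding) as fin, \
            io.open(newfil, "wt", encoding=encoding) as fout:
        for line in fin:
            fout.write(line.replace(old, new))
    os.remove(oldfil)
    os.rename(newfil, oldfil)


# Small demonstration on a throwaway file (hypothetical content).
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "final_report.tex")
with io.open(path, "wt", encoding="utf-8") as f:
    f.write(u"\\documentclass{sphinxmanual}\n\u201cquoted\u201d \u2026\n")

freplace(path, u"{sphinxmanual}", u"{book}")

with io.open(path, "rt", encoding="utf-8") as f:
    result = f.read()
shutil.rmtree(tmpdir)
```

If progress output is still wanted, printing repr(line) or a line counter instead of the raw line keeps the console encoding out of the picture.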
