» tagged pages
» logout

sorted by: recent | see : popular
Content Tagged with unicode + xml

利用python处理xml -- 中文编码问题 - 郑新星 - CSDNBlog

xml协会规定,所有xml解析器均需要支持utf-8和utf-16两种编码而不要求别的编码,估计python提供的xml处理模块就是不支持gb2312的。而windows下的中文文本文件,默认情况下一般为gb2312编码的,因此处理的时候,就会带来不方便的地方。解决办法是将整个xml文件弄成utf-8格式的,

XML: del.icio.us/tag/xml

ishida >> utilities: Unicode code converter

Helps you convert between Unicode character numbers, characters, UTF-8 and UTF-16 code units in hex, percent escapes,and Numeric Character References (hex and decimal).

XML: del.icio.us/tag/xml

Invalid XML Characters: when valid UTF8 does not mean valid XML

strip the invalid xml character(s) although it's valid a UTF8 encoding.

XML: del.icio.us/tag/xml

Survival guide to i18n

If you can successfully process text with Diacritic marks, you are likely using either iso-8859-1, or its close cousin, windows-1252. There is no polite way to put this, so if you are running on a Microsoft platform, or cut and paste from documents produced by Microsoft software, or even allow comments to be posted by people who might be doing one of the above, you need to be aware of the 27 differences, summarized by the table below:

XML: del.icio.us/tag/xml

Unicode HOWTO in Python

This HOWTO discusses Python's support for Unicode, and explains various problems that people commonly encounter when trying to work with Unicode.

XML: del.icio.us/tag/xml

Page 1 | Next >>