jawa.util.utf module

Utility methods for handling oddities in character encoding encountered when parsing and writing JVM ClassFiles or object serialization archives.

Note

Python issue 2857 (http://bugs.python.org/issue2857), opened in 2008, was an attempt to add MUTF-8/CESU-8 support to the Python core.

jawa.util.utf.decode_modified_utf8(s: bytes) → str

Decodes a bytestring containing modified UTF-8 as defined in section 4.4.7 of the JVM specification.

Parameters: s – bytestring to be decoded.
Returns: The decoded text as a str.
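
To illustrate what decoding modified UTF-8 involves, here is a minimal standalone sketch. It is not jawa's implementation: the name mutf8_decode_sketch is hypothetical, the input is assumed to be well-formed, and only the standard library is used. It shows the two differences from standard UTF-8: NUL arrives as the two bytes C0 80, and supplementary characters arrive as a surrogate pair of two 3-byte sequences.

```python
def mutf8_decode_sketch(data: bytes) -> str:
    """Illustrative modified UTF-8 decoder (hypothetical helper, not jawa's code)."""
    units = bytearray()  # collected UTF-16 code units, big-endian
    i = 0
    while i < len(data):
        b0 = data[i]
        if b0 < 0x80:
            # Single byte: U+0001..U+007F
            unit = b0
            i += 1
        elif b0 & 0xE0 == 0xC0:
            # Two bytes: U+0080..U+07FF, plus the special C0 80 form for U+0000
            unit = ((b0 & 0x1F) << 6) | (data[i + 1] & 0x3F)
            i += 2
        elif b0 & 0xF0 == 0xE0:
            # Three bytes: U+0800..U+FFFF, including individual surrogates
            unit = ((b0 & 0x0F) << 12) | ((data[i + 1] & 0x3F) << 6) | (data[i + 2] & 0x3F)
            i += 3
        else:
            # Four-byte sequences never appear in modified UTF-8
            raise ValueError(f"invalid lead byte 0x{b0:02x} at offset {i}")
        units += unit.to_bytes(2, "big")
    # Decoding as UTF-16 recombines surrogate pairs into single characters
    return units.decode("utf-16-be", "surrogatepass")
```

Going through UTF-16 code units keeps the sketch short because the codec does the surrogate-pair recombination. A production decoder would also validate continuation bytes and truncated input.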
jawa.util.utf.encode_modified_utf8(u: str) → bytearray

Encodes a unicode string as modified UTF-8 as defined in section 4.4.7 of the JVM specification.

Parameters: u – unicode string to be encoded.
Returns: A bytearray containing the encoded data.
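
The encoding direction can be sketched the same way. This is a minimal standalone illustration, not jawa's implementation: the name mutf8_encode_sketch is hypothetical and only the standard library is used. It writes U+0000 as C0 80 and encodes each UTF-16 code unit separately, so a supplementary character becomes six bytes rather than the four that standard UTF-8 would produce.

```python
def mutf8_encode_sketch(text: str) -> bytearray:
    """Illustrative modified UTF-8 encoder (hypothetical helper, not jawa's code)."""
    out = bytearray()
    # Split the text into UTF-16 code units so that supplementary characters
    # are seen as a high/low surrogate pair.
    utf16 = text.encode("utf-16-be", "surrogatepass")
    for i in range(0, len(utf16), 2):
        unit = (utf16[i] << 8) | utf16[i + 1]
        if unit == 0:
            # NUL is never written as a raw zero byte
            out += b"\xc0\x80"
        elif unit < 0x80:
            out.append(unit)
        elif unit < 0x800:
            out.append(0xC0 | (unit >> 6))
            out.append(0x80 | (unit & 0x3F))
        else:
            # Covers the rest of the BMP, including each surrogate half
            out.append(0xE0 | (unit >> 12))
            out.append(0x80 | ((unit >> 6) & 0x3F))
            out.append(0x80 | (unit & 0x3F))
    return out
```

Because no zero byte is ever emitted, the output can be embedded in formats that treat 0x00 as a terminator.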