compiler construction - Do critical characters of a python encoding ever come outside of quotes? -
i working on python scanner library , came across encodings. guess, lexical scanner never have problems 'critical' characters because come inside quoted strings. i.e. in unicode, characters of form
0x00000000 - 0x0000007f: 0xxxxxxx 0x00000080 - 0x000007ff: 110xxxxx 10xxxxxx 0x00000800 - 0x0000ffff: 1110xxxx 10xxxxxx 10xxxxxx 0x00010000 - 0x001fffff: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx 0x00200000 - 0x03ffffff: 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 0x04000000 - 0x7fffffff: 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
, code above x80: not mix ' , ". not have care encoding. right or not?
references: https://www.python.org/dev/peps/pep-0263/ https://docs.python.org/2/tutorial/interpreter.html#source-code-encoding
there other encodings, simple assumption isn't true. 1 byte encodings, seems correct, multi byte encodings other utf-8 wrong.
Comments
Post a Comment