python - django_countries encoding is not giving the correct name


I am using the django_countries module for a countries list. The problem is that there are a couple of countries with special characters, like 'Åland Islands' and 'Saint Barthélemy'.

I am calling this method to get the country name:

country_label = fields.Country(form.cleaned_data.get('country')[0:2]).name

I know that country_label is a lazy translated proxy object from django.utils, but it is not giving the right name; instead it gives 'Ã…land Islands'. Any suggestions, please?

Django stores a unicode string as code points and marks the string as unicode for further processing. UTF-8 encodes each code point as one to four 8-bit bytes, so the unicode string that Django is using needs to be encoded from code point notation to UTF-8 byte notation at some point. In the case of Åland Islands, what seems to be happening is that something takes the UTF-8 bytes and interprets each byte as a code point when building the string.

The string django_countries returns is probably u'\xc5land Islands', where \xc5 is the Unicode code point notation of Å. In UTF-8 byte notation \xc5 becomes \xc3\x85, where \xc3 and \x85 are each one 8-bit byte. See: http://www.ltg.ed.ac.uk/~richard/utf-8.cgi?input=xc5&mode=hex
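To make the difference between the two notations concrete, here is a minimal sketch in Python 3 (where every str is already unicode, so no u prefix is needed). The variable names are only for illustration and are not part of django_countries.

```python
# Python 3 sketch: the same text seen as code points and as UTF-8 bytes.
name = "\xc5land Islands"          # "\xc5" is the code point U+00C5, i.e. "Å"

code_point = ord(name[0])          # integer value of the first character
utf8_bytes = name.encode("utf-8")  # the byte sequence used for storage or output

print(hex(code_point))             # 0xc5
print(utf8_bytes)                  # b'\xc3\x85land Islands'
print(len(name), len(utf8_bytes))  # 13 14 (Å takes two bytes in UTF-8)
```

The one extra byte is the whole story: a single code point \xc5 turns into the two bytes \xc3 and \x85.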

Or you can use country_label = fields.Country(form.cleaned_data.get('country')[0:2]).name.encode('utf-8') to go from u'\xc5land Islands' to '\xc3\x85land Islands'.

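If you take each of those bytes and use it as a code point on its own, you get the characters you are seeing: Ã… (strictly, code point \x85 is a control character; it shows up as … when the bytes are read as Windows-1252, which is what browsers use for ISO-8859-1). See: http://www.ltg.ed.ac.uk/~richard/utf-8.cgi?input=xc3&mode=hex and: http://www.ltg.ed.ac.uk/~richard/utf-8.cgi?input=x85&mode=hex

See the snippet below for the HTML notation of these characters.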

<div id="test">&#xc3;&#x85;&#xc5;</div>
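The same experiment can be done in Python instead of HTML. This is a small Python 3 sketch; the use of cp1252 (Windows-1252) as the "wrong" encoding is my assumption, based on the … that shows up in your output.

```python
import unicodedata

utf8_bytes = "\xc5".encode("utf-8")          # the two UTF-8 bytes of "Å"
print([hex(b) for b in utf8_bytes])          # ['0xc3', '0x85']

# Treat every byte value as if it were a code point of its own.
as_code_points = [chr(b) for b in utf8_bytes]
print(unicodedata.name(as_code_points[0]))   # LATIN CAPITAL LETTER A WITH TILDE

# Windows-1252 maps byte 0x85 to the ellipsis, which gives the familiar output.
misread = utf8_bytes.decode("cp1252")
print(misread)                               # Ã…
```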

So I'm guessing you have two different encodings in your application. One way to get from u'\xc5land Islands' to u'\xc3\x85land Islands' would be, in a UTF-8 environment, to encode to UTF-8, which converts u'\xc5' to '\xc3\x85', and then decode those bytes back to unicode as ISO-8859-1, which gives u'\xc3\x85land Islands'. Since it's not in the code you're providing, I'm guessing it happens somewhere between the moment you set country_label and the moment your output is displayed incorrectly, either automatically because of encoding settings or through an explicit conversion somewhere.
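Here is a rough Python 3 sketch of that round trip and of how to undo it. The helper names make_mojibake and repair_mojibake are made up for this example, and cp1252 as the wrong encoding is an assumption; swap in "latin-1" if that matches your setup better.

```python
def make_mojibake(text, wrong_encoding="cp1252"):
    """Encode correctly as UTF-8, then decode with the wrong encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


def repair_mojibake(text, wrong_encoding="cp1252"):
    """Reverse the mistake: back to the raw bytes, then decode as UTF-8."""
    return text.encode(wrong_encoding).decode("utf-8")


correct = "\xc5land Islands"
broken = make_mojibake(correct)

print(broken)                    # Ã…land Islands
print(repair_mojibake(broken))   # Åland Islands
```

The repair only works when the text was mis-decoded exactly once with the encoding you pass in, so treat it as a diagnostic rather than a fix. The real fix is to find the place where the wrong decode happens.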

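First edit:

To set the encoding for your app, add # -*- coding: utf-8 -*- at the top of your .py file (this tells Python how to read the string literals in that file) and <meta charset="utf-8"> in the <head> of your template. To get a unicode string from a django.utils.functional.__proxy__ object you can call unicode() on it (str() on Python 3). Like this:

country_label = unicode(fields.Country(form.cleaned_data.get('country')[0:2]).name)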

Second edit:

One other way to figure out where the problem is would be to use force_bytes (https://docs.djangoproject.com/en/1.8/ref/utils/#module-django.utils.encoding). Like this:

from django.utils.encoding import force_bytes
country_label = fields.Country(form.cleaned_data.get('country')[0:2]).name
forced_country_label = force_bytes(country_label, encoding='utf-8', strings_only=False, errors='strict')

But since you have already tried many conversions without success, maybe the problem is more complex. Can you share your versions of django_countries and Python, and your Django app's language settings? You can also look directly in your django_countries package (it should be in your Python directory), find the file data.py and open it to see what it looks like. Maybe the data itself is corrupted.
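If you want a quick way to check whether names are already corrupted at the source, something like the following heuristic might help. It is only a sketch in Python 3: looks_double_encoded is a made-up helper, the sample list is hypothetical rather than read from data.py, and cp1252 is again assumed to be the wrong encoding.

```python
def looks_double_encoded(text, wrong_encoding="cp1252"):
    """Heuristic: True if the text can be 'un-mangled' into different, valid UTF-8."""
    try:
        repaired = text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Correctly stored non-ASCII names normally end up here.
        return False
    return repaired != text


# Hypothetical sample; in practice you would loop over the names you load.
samples = ["France", "\xc5land Islands", "\xc3\u2026land Islands"]
suspicious = [s for s in samples if looks_double_encoded(s)]
print(suspicious)   # ['Ã…land Islands']
```

If nothing in the package data is flagged, the data is probably fine and the mangling happens later, on the way to the output.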

