Unix locales vs Unicode ('ascii' codec can't encode character…)
Published: 2019-05-11

You might get unusual errors about Unicode and inability to convert to ASCII. Programs might just crash at random. Those are often simple to fix — all you need is correct locale configuration.

Has this ever happened to you?

Traceback (most recent call last):
  File "aogonek.py", line 1, in <module>
    print(u'\u0105')
UnicodeEncodeError: 'ascii' codec can't encode character '\u0105' in position 0: ordinal not in range(128)

Input: ą
Desired ascii(): '\u0105'
Real ascii(): '\udcc4\udc85'

All those errors have the same root cause: incorrect locale configuration. To fix them all, you need to generate the missing locales and set them.
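
The two symptoms above can be reproduced without touching the system locale. The following is a minimal Python 3 sketch, not the code that produced the output in this post: it encodes ą with the ASCII codec directly, then decodes its UTF-8 bytes as ASCII with the surrogateescape error handler, which is one plausible way the '\udcc4\udc85' result arises. The helper name try_encode is made up for this illustration.

```python
text = "\u0105"  # ą, U+0105 LATIN SMALL LETTER A WITH OGONEK


def try_encode(s, encoding):
    """Return the encoded bytes, or the error message if encoding fails."""
    try:
        return s.encode(encoding)
    except UnicodeEncodeError as exc:
        return str(exc)


# With a UTF-8 codec the character becomes two bytes.
utf8_result = try_encode(text, "utf-8")
# With the ASCII codec (what Python picks under the C/POSIX locale) it fails.
ascii_result = try_encode(text, "ascii")

# Reading the UTF-8 bytes as ASCII with surrogateescape smuggles each
# undecodable byte through as a lone surrogate instead of raising.
mangled = utf8_result.decode("ascii", errors="surrogateescape")

print(utf8_result)
print(ascii_result)
print(ascii(mangled))
```

The last line prints the mangled form from the "Real ascii()" example: each byte of the UTF-8 sequence turns into its own surrogate code point.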

Check currently used locale

The locale command (without arguments) should tell you which locales you’re currently using. (The list might be shorter on your end.)

$ locale
LANG="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

If any of those is set to C or POSIX, uses an encoding other than UTF-8 (sometimes spelled utf8), or is empty (with the exception of LC_ALL), or if you see any errors, you need to reconfigure your locale.
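
Those rules can be applied mechanically. Here is a rough Python sketch that parses text in the format printed by locale and flags the entries the rules above call out. The function names and the sample text are invented for this illustration. To check a real system, you could pass in the output of subprocess.run(["locale"], capture_output=True, text=True).stdout instead of the sample.

```python
def parse_locale_output(text):
    """Turn lines like LANG="en_US.UTF-8" into a {name: value} dict."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            settings[name.strip()] = value.strip().strip('"')
    return settings


def find_problems(settings):
    """Return {name: reason} for every setting that needs attention."""
    problems = {}
    for name, value in settings.items():
        if value == "":
            # An empty LC_ALL is expected; any other empty value is not.
            if name != "LC_ALL":
                problems[name] = "empty"
        elif value in ("C", "POSIX"):
            problems[name] = "C/POSIX locale"
        else:
            # Encoding is the part after "." and before any "@modifier".
            encoding = value.partition(".")[2].partition("@")[0]
            if encoding.lower().replace("-", "") != "utf8":
                problems[name] = "not UTF-8"
    return problems


# Made-up sample with one healthy entry and three broken ones.
sample = (
    'LANG="en_US.UTF-8"\n'
    'LC_CTYPE="C"\n'
    'LC_TIME="pl_PL.ISO-8859-2"\n'
    "LC_NUMERIC=\n"
    "LC_ALL=\n"
)
problems = find_problems(parse_locale_output(sample))
print(problems)
```

An empty result means the configuration passes all the checks listed above. Any errors that locale itself prints still need to be read by hand.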

Check locale availability and install missing locales

The first thing you need to do is check locale availability. To do this, run locale -a. This will produce a list of all installed locales. You can use grep to get a more reasonable list.
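
If you would rather do the filtering in Python than with grep, something like the sketch below works on the list of names that locale -a prints. The function name and the sample list are hypothetical. On a real system the list could come from subprocess.run(["locale", "-a"], capture_output=True, text=True).stdout.split().

```python
def utf8_locales(locale_names, language=None):
    """Keep only UTF-8 locales, optionally limited to a language prefix."""
    result = []
    for name in locale_names:
        base, _, rest = name.partition(".")
        # Drop any "@modifier" and normalise UTF-8 / utf8 spellings.
        encoding = rest.partition("@")[0].lower().replace("-", "")
        if encoding != "utf8":
            continue
        if language is None or base.startswith(language):
            result.append(name)
    return result


# Made-up sample resembling typical `locale -a` output.
sample = ["C", "C.UTF-8", "POSIX", "en_US", "en_US.iso88591", "en_US.utf8", "pl_PL.utf8"]

all_utf8 = utf8_locales(sample)
english_utf8 = utf8_locales(sample, language="en")
print(all_utf8)
print(english_utf8)
```

If the second list comes back empty for your language, the locale has to be generated first.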

The best locale to use is the one for your language, with the UTF-8 encoding. The locale will be used by some console apps for output. I’m going to use en_US.UTF-8 in this guide.

If you can’t see any UTF-8 locales, or no appropriate locale setting for your language of choice, you might need to generate those. The required actions depend on your distro/OS.

  • Debian, Ubuntu, and derivatives: install language-pack-en-base, run sudo dpkg-reconfigure locales
  • RHEL, CentOS, Fedora: install glibc-langpack-en
  • Arch Linux: uncomment relevant entries in /etc/locale.gen and run sudo locale-gen
  • For other OSes, refer to the documentation.

You need a UTF-8 locale to ensure compatibility with software. Avoid the C and POSIX locales (they are ASCII-only) and locales with other encodings (hardly anyone uses those these days).

Configure system-wide

On some systems, you may be able to configure the locale system-wide. Check your system documentation for details. If your system has systemd, run:

sudo localectl set-locale LANG=en_US.UTF-8

Configure for a single user

If your environment does not allow system-wide locale configuration (macOS, a shared server with generated but unconfigured locales), or if you want to ensure the locale is always configured independently of system settings, you can set it for your own user account.

To do this, you need to edit the configuration file for your shell. If you’re using bash, it’s .bashrc (or .bash_profile on macOS). For zsh users, it’s .zshrc. Add a line such as this one, which assumes the en_US.UTF-8 locale used throughout this guide (or the equivalent in your shell):

export LANG=en_US.UTF-8

That should be enough. Note that those settings don’t apply to programs not launched through a shell.
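
For programs that are started by another program rather than by a shell, the launcher has to pass the locale along itself. As a rough illustration in Python (a sketch, assuming en_US.UTF-8 is generated on your system), a parent process can set LANG explicitly in the environment it hands to a child. The child here is a throwaway Python one-liner that only reports the variable it received.

```python
import os
import subprocess
import sys

# Copy the current environment and set the locale explicitly for the child.
child_env = dict(os.environ, LANG="en_US.UTF-8")

# The child only echoes back the LANG value it was started with.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('LANG'))"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)
child_lang = result.stdout.strip()
print(child_lang)
```

Service managers and desktop launchers usually have their own way of setting environment variables. Check their documentation for the equivalent.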

Python/Windows corner: Python 3.7 and later address this on Unix by assuming UTF-8 when they encounter the C locale. On Windows, Python 3.6 uses UTF-8 for interactive console I/O, but not when output is redirected to files or pipes.
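
Python 3.7 also lets you request UTF-8 explicitly with the -X utf8 command-line option, which is handy when you cannot change the locale at all. The sketch below is an illustration, not part of the original post. It starts a child interpreter with LC_ALL=C and -X utf8, and checks that the child still writes ą as UTF-8 bytes. It requires Python 3.7 or newer.

```python
import os
import subprocess
import sys

# Start from the current environment, but drop variables that would
# override the child's I/O encoding, then force the C locale.
env = {
    k: v
    for k, v in os.environ.items()
    if k not in ("PYTHONIOENCODING", "PYTHONUTF8")
}
env["LC_ALL"] = "C"

# The child reports whether UTF-8 mode is on, then prints U+0105.
child_code = "import sys; print(sys.flags.utf8_mode); print('\\u0105')"
result = subprocess.run(
    [sys.executable, "-X", "utf8", "-c", child_code],
    env=env,
    capture_output=True,
    check=True,
)

# Decode the raw bytes ourselves so the parent's locale does not matter.
lines = result.stdout.decode("utf-8").split()
print(lines)
```

Without -X utf8, the result depends on the Python version and on how the C locale was set, so a correct locale configuration is still the more dependable fix.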

This post was brought to you by ą — U+0105 LATIN SMALL LETTER A WITH OGONEK.

Translated from:

Reposted from: http://ggqwd.baihongyu.com/
