Python is a simple, readable programming language that is widely used today. It is easy to learn and abstracts away many low-level computation details, so you can concentrate on the problem rather than on the machine.
This article will show you what causes the "LookupError: unknown encoding: cp65001" error message and a few ways to fix it.
Why does “LookupError: unknown encoding: cp65001” occur?
If you’re getting this error, you are probably running a legacy version of Python on Windows (usually Python 2.x; the cp65001 codec was only added in Python 3.3).
But first, what is cp65001?
cp65001 is the Windows name for code page 65001, which is its identifier for UTF-8. The "LookupError: unknown encoding: cp65001" error message means that the console is set to code page 65001, but your version of Python has no codec registered under that name, so it cannot encode or decode the text going to and from the console.
Windows uses a "code page" system to support multiple languages and characters in the Windows console (Command Prompt and PowerShell). It’s not only about displaying the characters, but also about encoding and decoding the input and output when the console interacts with external programs.
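To see whether your interpreter recognizes a given code page name, you can ask Python's codec registry directly. The snippet below is a minimal sketch: the helper name codec_available is made up for illustration, and the list of names it checks is only an example.

```python
from __future__ import print_function  # keeps print() working on Python 2 as well

import codecs
import sys


def codec_available(name):
    """Return True if this interpreter has a codec registered under `name`."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        # This is the same exception type that appears in the error message.
        return False


if __name__ == "__main__":
    # On Windows, sys.stdout.encoding usually mirrors the console code page.
    print("stdout encoding:", sys.stdout.encoding)
    for name in ("cp65001", "utf-8", "cp1252"):
        print(name, "->", "known" if codec_available(name) else "unknown")
```

If cp65001 is reported as unknown, the interpreter cannot work with a console that is set to code page 65001, which is the situation this article describes.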
In this case, you have several options:
- Enable beta support for Unicode UTF-8 on modern versions of Windows (Windows 10+).
- Set encoding for current session using chcp command.
- Set default character encoding in Windows Registry.
- Set PYTHONIOENCODING environment variable to UTF-8.
- Try installing win-unicode-console package from PyPI.
- Try another terminal such as Tabby or CMDer.
Enable Windows support for UTF-8
Unicode on Windows has always been a complex subject. Starting with Windows 10 Insider build 17035 (nominal build 17134), Microsoft added an option to set the locale code page to UTF-8. This means most up-to-date Windows 10 and Windows 11 machines can use this method.
To enable Windows support for UTF-8 globally, follow the steps below.
- Press the Win+R keyboard shortcut and open the Region dialog (typing intl.cpl in the Run box is one way to reach it)
- Switch to the Administrative tab
- Click the Change system locale button.
- Enable Beta: Use Unicode UTF-8 for worldwide language support
- Reboot for the changes to take effect.
Alternatively, you can create a *.reg file with the following contents and run it with administrator rights to achieve the same result.
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage]
"ACP"="65001"
"OEMCP"="65001"
"MACCP"="65001"
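To check which code pages are actually in effect before and after changing the system locale, you can query them from Python. This is a rough sketch rather than an official recipe: the function name active_code_pages is hypothetical, and it relies on the Win32 functions GetACP, GetOEMCP, and GetConsoleOutputCP being reachable through ctypes, which is only the case on Windows.

```python
import ctypes
import sys


def active_code_pages():
    """Return the ANSI, OEM, and console output code pages, or None off Windows."""
    if sys.platform != "win32":
        return None  # ctypes.windll only exists on Windows
    kernel32 = ctypes.windll.kernel32
    return {
        "ansi": kernel32.GetACP(),
        "oem": kernel32.GetOEMCP(),
        # GetConsoleOutputCP returns 0 when no console is attached.
        "console_output": kernel32.GetConsoleOutputCP(),
    }


if __name__ == "__main__":
    print(active_code_pages())
```

A value of 65001 in any of these fields means that UTF-8 is in use for that code page.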
Set encoding for current session using chcp command
If you just want to quickly run your Python script, you can simply use the chcp command to change the code page for the current session. The setting is lost as soon as you close the console window. To start a Command Prompt window with a different code page, run the following command:
cmd /K chcp 1252
Alternatively, you can put the command chcp 1252 in your PowerShell profile, which will run it automatically whenever you open PowerShell. However, this won’t do anything for cmd.exe.
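If you launch the script from another Python program, the same idea can be wrapped in a small helper. The following is a sketch under assumptions: command_with_code_page and my_script.py are hypothetical names, python is assumed to be on PATH, and the command is only executed when the platform is Windows.

```python
import subprocess
import sys


def command_with_code_page(code_page, script):
    """Build a cmd.exe command line that switches code page, then runs `script`."""
    code_page = int(code_page)  # raises ValueError for anything that is not a number
    # ">NUL" hides the "Active code page: ..." message that chcp prints.
    return 'chcp {0} >NUL && python "{1}"'.format(code_page, script)


if __name__ == "__main__" and sys.platform == "win32":
    # With shell=True, subprocess passes the string to cmd.exe on Windows.
    # my_script.py is a placeholder for your own script.
    subprocess.call(command_with_code_page(1252, "my_script.py"), shell=True)
```

The code page change applies only to the console session in which the command runs, just like typing chcp by hand.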
Set default character encoding in Windows Registry
The Windows Registry is a great option if you don’t want to manually run chcp every time you start a new Command Prompt window. To set the default character encoding in the Windows Registry, follow the steps below:
- Browse to HKEY_CURRENT_USER\Console\%SystemRoot%_system32_cmd.exe
- Change the value of CodePage to 1252
- Reboot the machine for the changes to take effect
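To confirm what is currently stored under that key without opening Registry Editor, you could read the value from Python. This is a read-only sketch with a hypothetical function name (read_console_code_page); it uses the standard winreg module, which exists only on Windows and carries this name only on Python 3.

```python
import sys

# Key path and value name as given in the steps above.
CONSOLE_KEY = r"Console\%SystemRoot%_system32_cmd.exe"


def read_console_code_page():
    """Return the CodePage value stored for cmd.exe, or None if unavailable."""
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module

    try:
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, CONSOLE_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, "CodePage")
            return value
    except OSError:
        # The key or the CodePage value has not been created yet.
        return None


if __name__ == "__main__":
    print("Stored console code page:", read_console_code_page())
```

A result of None simply means that no per-user override exists, so the console falls back to the system default.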
Set PYTHONIOENCODING environment variable
Python relies on a set of environment variables to set up its environment at startup, one of which is PYTHONIOENCODING. If this is set before running the interpreter, it overrides the encoding used for stdin/stdout/stderr.
If you’ve tried all the methods above without succeeding, try running set PYTHONIOENCODING=UTF-8 in Command Prompt, then run your script again in the same window to see whether the error goes away. For PowerShell, run:
$env:PYTHONIOENCODING = "UTF-8"
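The effect of the variable can also be observed from Python itself by starting a child interpreter with PYTHONIOENCODING set and asking it which encoding its standard output uses. This is a small illustrative sketch; child_stdout_encoding is a made-up helper name.

```python
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding="UTF-8"):
    """Start a child interpreter with PYTHONIOENCODING set; return its stdout encoding."""
    # Copy the current environment and add or override the one variable.
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    output = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    return output.decode("ascii").strip()


if __name__ == "__main__":
    print("Child interpreter writes stdout as:", child_stdout_encoding())
```

Passing the variable through env like this affects only the child process, which is handy when you cannot change the settings of the surrounding shell.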
Install win-unicode-console package
win-unicode-console is a Python package that enables Unicode input and display when running Python from the Windows console. It is a good choice if you want to fix encoding error messages programmatically instead of changing console settings by hand.
The package is not needed on Python 3.6 and newer since the underlying issue has been resolved (see https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-pep528).
There are several ways to use win-unicode-console:
- Python patch (recommended)
- Opt-in runner
- Opt-out runner
- Customized runner
As this requires some programming knowledge, you can head over to the win-unicode-console page to find out more about how to use the package.
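As a rough illustration of calling the package from your own code, the sketch below enables it only where it is relevant. The wrapper enable_unicode_console is a hypothetical name; it assumes the package has been installed with pip install win-unicode-console and uses the package's win_unicode_console.enable() function.

```python
import sys


def enable_unicode_console():
    """Enable win-unicode-console where it is relevant; return True if enabled."""
    if sys.platform != "win32" or sys.version_info >= (3, 6):
        # Not on Windows, or Python already handles the console itself (PEP 528).
        return False
    try:
        import win_unicode_console  # third-party package from PyPI
    except ImportError:
        return False  # package not installed
    win_unicode_console.enable()
    return True


if __name__ == "__main__":
    print("win-unicode-console enabled:", enable_unicode_console())
```

Calling this once at the top of a script keeps the rest of the code unchanged on platforms where the package is not needed.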
Docker/WSL on Windows
For those of you who are using Docker inside WSL on Windows, you’re essentially working inside two nested environments, which makes things harder to debug and fix. Users have reported that the following method works for them; however, we haven’t had the chance to confirm it.
If you encounter "LookupError: unknown encoding: cp65001" while working inside Docker/WSL, try creating a file at /usr/bin/docker-compose with the following contents.
trap "/mnt/c/Windows/System32/cmd.exe /C chcp 65001" INT
/mnt/c/Windows/System32/cmd.exe /C chcp 850
/mnt/c/Program\ Files/Docker/Docker/resources/bin/docker-compose.exe "$@"
/mnt/c/Windows/System32/cmd.exe /C chcp 65001
After that, you should be able to run docker-compose without any other issues.
We hope this article helped you debug the “LookupError: unknown encoding: cp65001” error in Python and avoid it in the future. We’ve also written a few other guides for fixing common Python errors, such as Fix “Max retries exceeded with URL”, Python Unresolved Import in VSCode, and “IndexError: List Index Out of Range” in Python. If you have any suggestions, please feel free to leave a comment below.