Fix “LookupError: unknown encoding: cp65001” error in Python

Python is a simple, minimalistic, and easy-to-comprehend programming language that is widely used today. It abstracts away many low-level computational details, so users can write code almost as naturally as they would write in their mother tongue.

This article explains what causes the "LookupError: unknown encoding: cp65001" error message and shows a few ways to fix it.

Why does “LookupError: unknown encoding: cp65001” occur?

If you’re getting this error, you are probably running a legacy version of Python (usually Python 2.x) on Windows. Python only gained a cp65001 codec in version 3.3.

But first, what is cp65001? cp65001 is the Windows name for its UTF-8 code page, and UTF-8 is one of the encodings of Unicode. The "LookupError: unknown encoding: cp65001" error message means that the console is set to code page 65001, but your Python interpreter has no codec registered under the name cp65001, so it cannot set up the encoding it needs to read from and write to the console.

Windows uses a "code page" system to support multiple languages and characters in the Windows console (Command Prompt and PowerShell). Code pages affect not only how characters are displayed, but also how input and output are encoded and decoded when the console interacts with external programs.
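A quick way to confirm that this is the cause is to ask the interpreter which codecs it knows. The sketch below uses only the standard library; the helper name codec_available is made up for illustration and is not part of Python.

```python
from __future__ import print_function  # keeps print() consistent on Python 2

import codecs
import sys


def codec_available(name):
    """Return True if this interpreter has a codec registered under `name`."""
    try:
        codecs.lookup(name)
    except LookupError:
        # Same exception type as in the error message this article covers.
        return False
    return True


if __name__ == "__main__":
    # sys.stdout.encoding may be None when output is redirected.
    print("stdout encoding:", sys.stdout.encoding)
    print("cp65001 known:", codec_available("cp65001"))
```

If the second line reports False while your console is set to code page 65001, the fixes below are worth trying.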

In this case, you have several options:

Enable Windows support for UTF-8

Unicode on Windows has always been a complex subject. Starting with Windows 10 Insider build 17035 (nominal build 17134), Microsoft added the option to set the locale code page to UTF-8. That means most up-to-date Windows 10 and Windows 11 machines can use this method.

To enable Windows support for UTF-8 globally, follow the steps below.

  1. Press the Win+R keyboard shortcut and run intl.cpl.
  2. Switch to the Administrative tab.
  3. Click the Change system locale button.
  4. Enable Beta: Use Unicode UTF-8 for worldwide language support.
  5. Reboot for the changes to take effect.

Alternatively, you can create a *.reg file with the following contents and run it with administrator rights to achieve the same results.

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage]
"ACP"="65001"
"OEMCP"="65001"
"MACCP"="65001"
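After rebooting, you may want to check whether the system-wide setting took effect. The following is a minimal sketch that asks Windows for the active ANSI code page through the GetACP function in kernel32, on the assumption that a result of 65001 indicates UTF-8. The function name ansi_code_page is our own, and the code returns None on non-Windows systems.

```python
import sys


def ansi_code_page():
    """Return the system ANSI code page as an int, or None when not on Windows."""
    if sys.platform != "win32":
        return None
    import ctypes  # imported lazily: ctypes.windll exists only on Windows
    return ctypes.windll.kernel32.GetACP()


if __name__ == "__main__":
    print(ansi_code_page())
```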

Set encoding for current session using chcp command

If you just want to quickly run your Python script, you can use the chcp command to change the code page for the current session. The setting is lost as soon as you close the console window. To start a Command Prompt window with a different code page, follow the steps below:

  1. Open the Run dialog (Win+R).
  2. Type cmd /K chcp 1252 and press Enter.

Alternatively, you can put the command chcp 1252 in your PowerShell profile, which runs automatically whenever you open PowerShell. However, this has no effect on cmd.exe.
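To see which code pages the current console is actually using from inside Python, something like the sketch below may help. It calls the Windows API functions GetConsoleCP and GetConsoleOutputCP through ctypes; the wrapper name console_code_pages is hypothetical, and the function returns None when it is not running on Windows.

```python
import sys


def console_code_pages():
    """Return (input_cp, output_cp) for the attached console, or None off Windows.

    Both values are 0 when the process has no console attached.
    """
    if sys.platform != "win32":
        return None
    import ctypes  # imported lazily: ctypes.windll exists only on Windows
    kernel32 = ctypes.windll.kernel32
    return kernel32.GetConsoleCP(), kernel32.GetConsoleOutputCP()


if __name__ == "__main__":
    print(console_code_pages())
```

Running it before and after chcp 1252 in the same window should show whether the change was picked up.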

Set default character encoding in Windows Registry

The Windows Registry is a good option if you don’t want to run chcp manually every time you open a new Command Prompt window. To set the default code page in the Windows Registry, follow the steps below:

  1. Open the Run dialog (Win+R).
  2. Type regedit.exe and press Enter.
  3. Browse to HKEY_CURRENT_USER\Console\%SystemRoot%_system32_cmd.exe.
  4. Change the value of CodePage to 1252.
  5. Reboot the machine for the changes to take effect.
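If you would rather inspect the stored value than click through regedit, a read-only sketch along these lines could work. It assumes Python 3 (the module is named _winreg on Python 2) and the key path from the steps above; cmd_default_code_page is an illustrative name, and the function returns None when the key or value is missing or when it is not running on Windows.

```python
import sys

# Key name taken from the steps above; %SystemRoot% is literal text here.
CONSOLE_KEY = r"Console\%SystemRoot%_system32_cmd.exe"


def cmd_default_code_page():
    """Return the CodePage value stored for cmd.exe, or None if unavailable."""
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module
    try:
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, CONSOLE_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, "CodePage")
    except OSError:
        # The key or the value does not exist for this user.
        return None
    return value


if __name__ == "__main__":
    print(cmd_default_code_page())
```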

Set PYTHONIOENCODING environment variable

Python reads a set of environment variables at startup to configure itself; one of them is PYTHONIOENCODING. If it is set before the interpreter starts, it overrides the encoding used for stdin, stdout, and stderr.

If you’ve tried all the methods above without succeeding, run set PYTHONIOENCODING=UTF-8 in Command Prompt and then run your script in the same window to see whether the error goes away. For PowerShell, run $env:PYTHONIOENCODING = "UTF-8" instead.
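The effect of the variable can also be checked from a script. The sketch below starts a child interpreter with PYTHONIOENCODING set and reads back the encoding the child chose for its piped stdout. The helper child_stdout_encoding is our own name, and the example assumes sys.executable points at the interpreter you want to test.

```python
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Run a child interpreter with PYTHONIOENCODING set; return its stdout encoding."""
    # Copy the current environment and add or override the one variable.
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    out = subprocess.check_output(
        [sys.executable, "-c",
         "import sys; sys.stdout.write(str(sys.stdout.encoding))"],
        env=env,
    )
    return out.decode("ascii").strip()


if __name__ == "__main__":
    print(child_stdout_encoding("utf-8"))
```

If the child reports the encoding you passed in, the variable is being honoured by that interpreter.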

Install win-unicode-console package

win-unicode-console is a Python package that enables Unicode input and display when running Python from the Windows console. It is useful if you want to fix encoding errors programmatically rather than by changing console settings.

The package is not needed on Python 3.6 and newer since the underlying issue has been resolved (see https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-pep528).

There are several ways to use win-unicode-console:

  • Python patch (recommended)
  • Opt-in runner
  • Opt-out runner
  • Customized runner

As this requires some programming knowledge, head over to the win-unicode-console page to find out more about how to use the package.
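For orientation, the simplest programmatic use we are aware of is calling win_unicode_console.enable() early in your script. The sketch below wraps that call so it is attempted only on Windows with a Python older than 3.6 and only if the package is installed; the wrapper name enable_unicode_console is hypothetical, and the package's own documentation remains the authority on the recommended setup.

```python
import sys


def enable_unicode_console():
    """Try to enable win-unicode-console; return True only if it was enabled."""
    if sys.platform != "win32" or sys.version_info >= (3, 6):
        # Not applicable: wrong platform, or Python already handles this.
        return False
    try:
        import win_unicode_console  # third-party: pip install win-unicode-console
    except ImportError:
        return False
    win_unicode_console.enable()
    return True


if __name__ == "__main__":
    print(enable_unicode_console())
```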

Docker/WSL on Windows

For those of you who are using Docker inside WSL on Windows, you’re essentially inside a double-virtualized environment. This means that things are harder to debug and fix. Users have reported that the following method works for them; however, we haven’t had the chance to confirm it.

If you encounter "LookupError: unknown encoding: cp65001" while working inside Docker/WSL, try creating a file at /usr/bin/docker-compose with the following contents.

#!/usr/bin/env bash
trap "/mnt/c/Windows/System32/cmd.exe /C chcp 65001" INT
/mnt/c/Windows/System32/cmd.exe /C chcp 850
/mnt/c/Program\ Files/Docker/Docker/resources/bin/docker-compose.exe "$@"
/mnt/c/Windows/System32/cmd.exe /C chcp 65001

After that, you should be able to run docker-compose without any other issues.

We hope that this article helped you debug the “LookupError: unknown encoding: cp65001” error in Python and avoid running into it in the future. We’ve also written a few other guides for fixing common Python errors, such as Fix “Max retries exceeded with URL”, Python Unresolved Import in VSCode, and “IndexError: List Index Out of Range” in Python. If you have any suggestions, please feel free to leave a comment below.
