UnicodeDecodeError: 'charmap' codec can't decode byte | bobbyhadz (2025)

# UnicodeDecodeError: 'charmap' codec can't decode byte

The Python "UnicodeDecodeError: 'charmap' codec can't decode byte in position" occurs when we specify an incorrect encoding or don't explicitly set the encoding keyword argument when opening a file.

To solve the error, specify the correct encoding, e.g. utf-8.
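When the encoding argument is omitted, open() falls back to a platform-dependent default, which on Windows is often a legacy code page rather than utf-8. A minimal sketch for checking which default your interpreter would use, assuming Python's UTF-8 mode is not enabled (the variable name is only illustrative):

```python
import locale

# The encoding open() falls back to when no `encoding` argument is given
# (assumption: Python's UTF-8 mode is not enabled).
default_encoding = locale.getpreferredencoding(False)

print(default_encoding)
```

If this prints something other than utf-8, passing encoding explicitly is the safer choice.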


Here is an example of how the error occurs.

I have a file called example.txt with the following contents.

example.txt

    𝘈Ḇ𝖢𝕯٤ḞԍНǏ
    hello world

And here is the code that tries to decode the contents of example.txt.

main.py

    # ⛔️ UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 1: character maps to <undefined>
    with open('example.txt', 'r', encoding='cp856') as f:
        lines = f.readlines()

        print(lines)


The error is caused because the example.txt file doesn't use the specified encoding (cp856).
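To reproduce the mismatch without creating the file by hand, here is a minimal sketch that writes the sample text as utf-8 to a temporary file and then reads it back as cp856. The temporary path and the error_message variable are illustrative assumptions, not part of the original example.

```python
import os
import tempfile

# Write the sample text as UTF-8 to a temporary file (hypothetical location).
path = os.path.join(tempfile.mkdtemp(), 'example.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('𝘈Ḇ𝖢𝕯٤ḞԍНǏ\nhello world')

error_message = None
try:
    # Read the same file with an encoding it was not written in.
    with open(path, 'r', encoding='cp856') as f:
        f.read()
except UnicodeDecodeError as e:
    error_message = str(e)

print(error_message)
```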

example.txt

    𝘈Ḇ𝖢𝕯٤ḞԍНǏ
    hello world

# Specifying the correct encoding when opening the file

If you know the encoding the file uses, make sure to specify it using the encoding keyword argument.

Otherwise, the first thing you can try is setting the encoding to utf-8.

main.py

    with open('example.txt', 'r', encoding='utf-8') as f:
        lines = f.readlines()

        # ✅ ['𝘈Ḇ𝖢𝕯٤ḞԍНǏ\n', 'hello world']
        print(lines)


The utf-8 encoding can represent every valid Unicode code point (over a million of them).
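As a rough illustration of how utf-8 covers such a large range, the sketch below encodes a few characters from the sample file and records how many bytes each one takes. The samples and byte_lengths names are only for this example.

```python
# Characters taken from the sample text, from plain ASCII up to a
# mathematical letter outside the Basic Multilingual Plane.
samples = ['h', 'Н', 'Ḇ', '𝘈']

byte_lengths = []
for ch in samples:
    encoded = ch.encode('utf-8')
    byte_lengths.append(len(encoded))
    print(f'U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded!r}')
```

utf-8 uses between one and four bytes per code point, which is why a single-byte code page cannot read these bytes back correctly.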

The same approach can be used if you use the open() function directly instead of using the with statement.

main.py

    my_file = open('example.txt', 'r', encoding='utf-8')

    lines = my_file.readlines()

    print(lines)  # ['𝘈Ḇ𝖢𝕯٤ḞԍНǏ\n', 'hello world']

    # Close the file manually, since there is no with statement to do it
    my_file.close()


You can view all of the standard encodings in this table of the official docs.

Some of the common encodings are ascii, latin-1 and utf-32.

# Specifying an encoding when using the pathlib module

If you use the pathlib module, specify an encoding when calling the specific method.

main.py

    from pathlib import Path

    text = Path('example.txt').read_text(encoding='utf-8')

    # 𝘈Ḇ𝖢𝕯٤ḞԍНǏ
    # hello world
    print(text)

You can pass the encoding when calling methods such as Path.read_text or Path.write_text.
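A minimal round-trip sketch using both methods with the same encoding; the temporary directory is an assumption made so the example doesn't touch your real files.

```python
import tempfile
from pathlib import Path

# Hypothetical location for the example file.
path = Path(tempfile.mkdtemp()) / 'example.txt'

# Write and read with the same explicit encoding.
path.write_text('𝘈Ḇ𝖢𝕯٤ḞԍНǏ\nhello world', encoding='utf-8')
text = path.read_text(encoding='utf-8')

print(text.splitlines())
```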

# Ignoring characters that cannot be decoded

If the error persists, you could set the errors keyword argument to ignore to skip the characters that cannot be decoded.

Note that ignoring characters that cannot be decoded can lead to data loss.

main.py

    # 👇️ Set errors to ignore
    with open('example.txt', 'r', encoding='utf-8', errors='ignore') as f:
        lines = f.readlines()

        # ✅ ['𝘈Ḇ𝖢𝕯٤ḞԍНǏ\n', 'hello world']
        print(lines)

Opening the file with an incorrect encoding with errors set to ignore won't raise a UnicodeDecodeError.

main.py

    with open('example.txt', 'r', encoding='cp856', errors='ignore') as f:
        lines = f.readlines()

        # ✅ ['\xadרט©ז\xadצ\xadץ»┘©×םן\n', 'hello world']
        print(lines)

The characters that cannot be decoded are simply ignored.
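To see the data loss directly, the sketch below decodes a few utf-8 bytes as ascii. It also shows errors='replace', another standard error handler that this article doesn't otherwise cover, which leaves a visible U+FFFD marker for each undecodable byte instead of dropping it silently.

```python
# 'Ḇ' takes three bytes in UTF-8, none of which are valid ASCII.
data = 'Ḇ hello'.encode('utf-8')

# errors='ignore' drops the undecodable bytes without a trace.
ignored = data.decode('ascii', errors='ignore')

# errors='replace' substitutes U+FFFD for each undecodable byte.
replaced = data.decode('ascii', errors='replace')

print(repr(ignored))
print(repr(replaced))
```

With replace, at least it stays obvious that something was lost and where.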

# Opening the file in binary mode

If you don't need to interact with the contents of the file, you can open it in binary mode without decoding it.

main.py

    with open('example.txt', 'rb') as f:
        lines = f.readlines()

        # ✅ [b'\xf0\x9d\x98\x88\xe1\xb8\x86\xf0\x9d\x96\xa2\xf0\x9d\x95\xaf\xd9\xa4\xe1\xb8\x9e\xd4\x8d\xd0\x9d\xc7\x8f\n', b'hello world']
        print(lines)

We opened the file in binary mode (using the rb - read binary mode), so the lines list contains bytes objects.

You can use this approach if you need to upload the file to a remote server and don't need to decode it.
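As a sketch of working with the raw bytes only, the example below reads a file in binary mode and computes a checksum, which is the kind of step that might precede an upload. The temporary path and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib
import os
import tempfile

text = '𝘈Ḇ𝖢𝕯٤ḞԍНǏ\nhello world'

# Create the sample file in a temporary directory (hypothetical location).
path = os.path.join(tempfile.mkdtemp(), 'example.txt')
with open(path, 'wb') as f:
    f.write(text.encode('utf-8'))

# Read the bytes back without decoding them, so no codec is involved.
with open(path, 'rb') as f:
    raw = f.read()

digest = hashlib.sha256(raw).hexdigest()
print(len(raw), digest)
```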

Encoding is the process of converting a string to a bytes object and decoding is the process of converting a bytes object to a string.

When decoding a bytes object, we have to use the same encoding that was used to encode the string to a bytes object.

# Try using the cp437 encoding

If the error persists, try to use the cp437 encoding when opening the file.

main.py

    with open('example.txt', 'r', encoding='cp437') as f:
        lines = f.readlines()

        # ✅ ['≡¥ÿêß╕å≡¥ûó≡¥ò»┘ñß╕₧╘ì╨¥╟Å\n', 'hello world']
        print(lines)

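The Code page 437 encoding is the character set of the original IBM personal computer and includes all printable ASCII characters as well as some accented letters. Check that the decoded text is legible: the first line of the sample output above is garbled because example.txt was actually written as utf-8.

If you still get an error, set the errors keyword argument to ignore in the call to the open() function.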

main.py

    with open('example.txt', 'r', encoding='cp437', errors='ignore') as f:
        lines = f.readlines()

        # ✅ ['≡¥ÿêß╕å≡¥ûó≡¥ò»┘ñß╕₧╘ì╨¥╟Å\n', 'hello world']
        print(lines)

The characters that cannot be decoded are simply ignored, which may cause data loss.

If the error persists, try other encodings such as utf-16, utf-32, latin-1, etc.
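Trying encodings one by one can be automated. The helper below is a hypothetical sketch (decode_with_candidates is not a standard library function) that returns the first candidate that decodes without raising. The candidate list is an assumption; latin-1 goes last because it accepts any byte sequence, and a clean decode is not proof that the guess is right, so still check that the text is legible.

```python
def decode_with_candidates(data, encodings=('utf-8', 'cp1252', 'latin-1')):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    Hypothetical helper, not part of the standard library.
    """
    for encoding in encodings:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # Try the next candidate.
    raise ValueError('none of the candidate encodings could decode the data')


# Bytes that really are UTF-8 are accepted by the first candidate.
text, used = decode_with_candidates('hello Ḇ'.encode('utf-8'))
print(used, text)

# Bytes written as cp1252 fail the UTF-8 attempt and fall through.
text2, used2 = decode_with_candidates('café'.encode('cp1252'))
print(used2, text2)
```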

# Trying to find the encoding of the file

You can try to figure out what the encoding of the file is by using the file command.

The command is available on macOS and Linux, but can also be used on Windows if you have Git and Git Bash installed.

Make sure to run the command in Git Bash if on Windows.

Open your shell in the directory that contains the file and run the following command.

shell

    file *


In the example run, the output of the command shows that the file uses the ASCII encoding.

This is the encoding you should specify when opening the file.

main.py

    with open('example.txt', 'r', encoding='ascii') as f:
        lines = f.readlines()

        print(lines)

If you are on Windows, you can also:

  1. Open the file in the basic version of Notepad.
  2. Click on "Save as".
  3. Look at the selected encoding right next to the "Save" button.


In this example, the selected encoding is UTF-8, so that's what we have to specify when calling the open() function.

main.py

    with open('example.txt', 'r', encoding='utf-8') as f:
        lines = f.readlines()

        print(lines)
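If neither tool is at hand, a rough check can be done from Python itself by looking for a byte order mark (BOM) at the start of the file. This is only a sketch: sniff_bom is a hypothetical helper, many files (including most utf-8 files) have no BOM at all, and in that case it returns None and tells you nothing.

```python
import codecs


def sniff_bom(data):
    """Guess an encoding name from a leading byte order mark, if any.

    Hypothetical helper; returns None when no BOM is present.
    """
    # Check UTF-32 first: its little-endian BOM starts with the UTF-16 one.
    boms = [
        (codecs.BOM_UTF32_LE, 'utf-32'),
        (codecs.BOM_UTF32_BE, 'utf-32'),
        (codecs.BOM_UTF8, 'utf-8-sig'),
        (codecs.BOM_UTF16_LE, 'utf-16'),
        (codecs.BOM_UTF16_BE, 'utf-16'),
    ]
    for bom, name in boms:
        if data.startswith(bom):
            return name
    return None


print(sniff_bom('hi'.encode('utf-8-sig')))
print(sniff_bom('hi'.encode('utf-16')))
print(sniff_bom(b'hello world'))
```

To use it on a file, read the first few bytes in binary mode and pass them to the helper, then pass the returned name as the encoding argument.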

# Try using the latin-1 encoding

If the error persists, try to use the latin-1 encoding when opening the file.

main.py

    with open('example.txt', 'r', encoding='latin-1') as f:
        lines = f.readlines()

        # ['ð\x9d\x98\x88á¸\x86ð\x9d\x96¢ð\x9d\x95¯Ù¤á¸\x9eÔ\x8dÐ\x9dÇ\x8f\n', 'hello world']
        print(lines)

Make sure to check if you get legible results when using the latin-1 encoding.
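The legibility check matters because latin-1 maps every possible byte to a character, so decoding with it never raises, even when the guess is wrong. The sketch below shows the garbled result for utf-8 bytes and how the original text can still be recovered by reversing the wrong step (variable names are illustrative).

```python
original = 'Ḇ hello'

# Decoding UTF-8 bytes as latin-1 succeeds but produces garbled text.
mojibake = original.encode('utf-8').decode('latin-1')

# Because latin-1 is a one-to-one byte mapping, the mistake is reversible.
recovered = mojibake.encode('latin-1').decode('utf-8')

# Every byte value from 0 to 255 decodes without an error.
all_bytes_text = bytes(range(256)).decode('latin-1')

print(repr(mojibake))
print(recovered)
```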

# Using a different encoding causes the error

Here is an example that shows how using a different encoding to encode a string to bytes than the one used to decode the bytes object causes the error.

main.py

    my_text = '𝘈Ḇ𝖢𝕯٤ḞԍНǏ'

    my_binary_data = my_text.encode('utf-8')

    # ⛔️ UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 1: character maps to <undefined>
    my_text_again = my_binary_data.decode('cp856')

We can solve the error by using the utf-8 encoding to decode the bytes object.

main.py

    my_text = '𝘈Ḇ𝖢𝕯٤ḞԍНǏ'

    my_binary_data = my_text.encode('utf-8')

    # 👉️ b'\xf0\x9d\x98\x88\xe1\xb8\x86\xf0\x9d\x96\xa2\xf0\x9d\x95\xaf\xd9\xa4\xe1\xb8\x9e\xd4\x8d\xd0\x9d\xc7\x8f'
    print(my_binary_data)

    # ✅ Specify the correct encoding
    my_text_again = my_binary_data.decode('utf-8')

    print(my_text_again)  # 👉️ '𝘈Ḇ𝖢𝕯٤ḞԍНǏ'
Author: Rubie Ullrich
