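chardet is a Python library that guesses the character encoding of text files. It works on raw bytes, so files must be opened in binary mode, and it returns a dict that includes the guessed 'encoding' and a 'confidence' score.
Here are some basic usage examples: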

# Detect the encoding of a file (read as bytes, not text)
import chardet

with open('example.txt', 'rb') as f:
    result = chardet.detect(f.read())
print(result['encoding'])

# Detect the encoding of a byte string (these are the UTF-8 bytes of '你好')
byte_str = b'\xe4\xbd\xa0\xe5\xa5\xbd'
result = chardet.detect(byte_str)
print(result['encoding'])
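
The guess can be weak or missing, especially for short inputs, so the result is usually checked before decoding. Below is a minimal sketch of one way to do that. The helper name `decode_with_result`, the `min_confidence` threshold, and the UTF-8 fallback are assumptions made for illustration, not part of chardet.

```python
def decode_with_result(data: bytes, result: dict,
                       min_confidence: float = 0.5,
                       fallback: str = "utf-8") -> str:
    """Decode bytes using a chardet-style result dict.

    Hypothetical helper: falls back to `fallback` when the guess is
    missing, below `min_confidence`, or turns out to be wrong.
    """
    encoding = result.get("encoding")
    confidence = result.get("confidence") or 0.0
    if not encoding or confidence < min_confidence:
        encoding = fallback
    try:
        return data.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or a wrong guess: decode leniently instead
        return data.decode(fallback, errors="replace")
```

With chardet installed, this would be called as `decode_with_result(byte_str, chardet.detect(byte_str))`. The threshold of 0.5 is arbitrary and worth tuning for the data at hand.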

# Detect the encoding of several files and collect the results
file_list = ['example1.txt', 'example2.txt']
results = []
for path in file_list:
    with open(path, 'rb') as f:
        result = chardet.detect(f.read())
    results.append(result)
print(results)
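
# Detect incrementally with UniversalDetector, so a large file need not be read in full
from chardet.universaldetector import UniversalDetector
detector = UniversalDetector()
with open('example.txt', 'rb') as f:
    for line in f:
        detector.feed(line)
        if detector.done:  # the detector is confident enough; stop early
            break
detector.close()
print(detector.result['encoding'])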
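
Binary data may contain few or no newlines, in which case iterating line by line can pull very large pieces into memory. A possible alternative is to feed fixed-size chunks. The sketch below assumes only that the detector object exposes `feed`, `done`, `close`, and `result`, as chardet's `UniversalDetector` does. The function name `detect_in_chunks` and the 4096-byte default are hypothetical choices.

```python
import io


def detect_in_chunks(stream, detector, chunk_size=4096):
    """Feed a binary stream to a UniversalDetector-style object chunk by chunk.

    Hypothetical helper: stops as soon as the detector reports it is done
    or the stream is exhausted, then returns the detector's result.
    """
    while not detector.done:
        chunk = stream.read(chunk_size)
        if not chunk:  # end of stream
            break
        detector.feed(chunk)
    detector.close()  # finalizes the guess
    return detector.result
```

With chardet installed, it could be used as `detect_in_chunks(open('example.txt', 'rb'), UniversalDetector())`, or with an in-memory stream such as `io.BytesIO(byte_str)`.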