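Note: the script below imports fitz, which is the import name of the PyMuPDF package. If running it fails with No module named 'frontend', run python -m pip install frontend (requires Python >= 3.8) or python -m pip install PyMuPDF.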
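Before installing anything, it can help to check which of the two modules are already importable. The following is a minimal sketch that uses only the standard library. The helper name module_available is hypothetical and is not part of PyMuPDF.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    # find_spec returns None when no importable module with that name exists.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # 'fitz' is the import name used by the script; 'frontend' is the
    # module named in the error message.
    for name in ("fitz", "frontend"):
        status = "found" if module_available(name) else "missing"
        print(f"{name}: {status}")
```

If fitz is reported as missing, installing PyMuPDF is the likely fix. If only frontend is missing, either of the two pip commands in the note may apply.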
Python script
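# extract_pdf_text.py
import fitz  # import name of the PyMuPDF package


def parsePDF(filePath):
    """Return the text of every page in the PDF, concatenated."""
    with fitz.open(filePath) as doc:
        text = ""
        for page in doc.pages():
            text += page.get_text()
    # Always return a string: a PDF without a text layer (e.g. scanned
    # images) yields "", which f.write() below accepts; None would not.
    return text


text = parsePDF(r'D:\downloads\intput.pdf')
with open('output.txt', mode='w', encoding='utf8') as f:
    f.write(text)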
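The script above concatenates all pages into one string, so page boundaries are lost in output.txt. If they matter, one possible approach is to label each page before joining. This is only a sketch: label_pages and its header format are hypothetical choices, and the function works on plain strings so that it does not depend on PyMuPDF itself.

```python
from typing import Iterable


def label_pages(page_texts: Iterable[str]) -> str:
    """Join per-page text, prefixing each page with a 1-based header."""
    parts = []
    for number, page_text in enumerate(page_texts, start=1):
        # Strip surrounding whitespace so every page block has the same shape.
        parts.append(f"--- page {number} ---\n{page_text.strip()}\n")
    # A blank line separates consecutive page blocks.
    return "\n".join(parts)
```

Inside parsePDF, the per-page strings could be supplied as (page.get_text() for page in doc.pages()) while the document is still open, and the result written to output.txt in the same way as before.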