One can use the textract library. It takes care of both "doc" and "docx" files.

    import textract
    text = textract.process("path/to/file.extension")

You can also use antiword directly (sudo apt-get install antiword). Note that antiword writes plain text rather than a docx file, so redirect its output to a .txt file and read that; docx2txt is not needed for this step.

    antiword filename.doc > filename.txt

Ultimately, textract uses antiword in the backend for .doc files.
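
If you would rather call antiword from Python than from the shell, something like the following may work. It is only a minimal sketch, assuming antiword is installed and on PATH; the function name extract_doc_text and its antiword_cmd parameter are made up for illustration and are not part of textract or any other library.

```python
import shutil
import subprocess


def extract_doc_text(path, antiword_cmd="antiword"):
    """Return the plain text of a legacy .doc file by running antiword.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Fail early with a clear message if the binary is not installed.
    if shutil.which(antiword_cmd) is None:
        raise FileNotFoundError(
            f"{antiword_cmd!r} not found on PATH; install it first "
            "(e.g. sudo apt-get install antiword)"
        )
    # antiword prints the document text to stdout; check=True raises
    # CalledProcessError if it exits with a non-zero status.
    result = subprocess.run(
        [antiword_cmd, path],
        capture_output=True,
        check=True,
    )
    # Assumption: the output is UTF-8 or close to it. The real encoding
    # depends on antiword's locale/mapping, so bad bytes are replaced
    # instead of raising.
    return result.stdout.decode("utf-8", errors="replace")
```

Usage would then be text = extract_doc_text("filename.doc"), which gives a str instead of a file on disk.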