Parsing .docx in Python 3

I am writing a Python 3 program that parses some .docx files and extracts text and images from them. I am trying to use docx, but it will not import into my program. I installed lxml, Pillow and python-docx, but the import still fails. When I try to use python-docx from the terminal, I cannot run example-extracttext.py or example-makedocument.py, which makes me think the installation is not working properly. Is there a way to check whether it is installed correctly, or a way to get it working so that I can import it into my project? I'm on Ubuntu 13.10.

+3
2 answers

I recommend that you try the latest version of python-docx, which is installed as follows:

$ pip install --pre python-docx

Documentation is available here: http://python-docx.readthedocs.org/

The installation should end with a message reporting success. You may need to run the command with sudo to temporarily obtain root privileges:

$ sudo pip install --pre python-docx

After installation, you should be able to do the following in the Python interpreter:

>>> from docx import Document
>>>

If instead you get something like this, the installation did not work as expected:

>>> from docx import Document
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named docx
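
To see where a given interpreter would load docx from (or confirm that it cannot find it at all), a small check like the following may help. This is only a sketch: the helper name find_module_location is made up for illustration, and it assumes Python 3.4 or later, where importlib.util.find_spec is available. One common cause of this kind of failure is that pip installed the package for a different interpreter than the one running your program, so the script also prints sys.executable.

```python
import importlib.util
import sys


def find_module_location(name):
    """Return the file a top-level module would be imported from, or None.

    Hypothetical helper for illustration; it only looks the module up and
    does not actually import it.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Shows which interpreter is running, e.g. a Python 2 vs Python 3 binary.
    print("interpreter:", sys.executable)
    location = find_module_location("docx")
    if location is None:
        print("docx is not importable from this interpreter")
    else:
        print("docx would be imported from:", location)
```

Run it with the same python3 command you use for your project. If it reports that docx is not importable, the package was probably installed for another interpreter or not installed at all.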

If you can provide more feedback on your attempts, I can elaborate on the answer.

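Note that the example-*.py scripts you mention belong to the legacy v0.2.x python-docx. The API in v0.3.x+ is different, so those scripts will not work with it.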

Also, Python 3 support was added in v0.3.0. Earlier versions do not support Python 3.
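
If python-docx still cannot be imported, a fallback that uses only the standard library may be enough for plain text and image extraction. The sketch below does not use python-docx at all. It relies on the fact that a .docx file is a ZIP archive whose main body is normally stored in word/document.xml, with embedded images typically under word/media/. The function names extract_text and extract_images are made up for illustration, and the code ignores headers, footers, footnotes and all formatting.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by elements in word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_text(docx_path):
    """Return a list with the text of each paragraph in the main document part.

    Assumes the body lives in word/document.xml, which is the usual layout.
    Paragraphs inside tables are included because iter() walks the whole tree.
    """
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Join every text run (w:t element) found inside this paragraph.
        paragraphs.append("".join(t.text or "" for t in para.iter(W_NS + "t")))
    return paragraphs


def extract_images(docx_path):
    """Return {archive member name: raw bytes} for files under word/media/."""
    with zipfile.ZipFile(docx_path) as archive:
        return {
            name: archive.read(name)
            for name in archive.namelist()
            if name.startswith("word/media/") and not name.endswith("/")
        }
```

Both functions accept a file path or an open binary file object. The bytes returned by extract_images can be written to disk or opened with Pillow, which you already have installed.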

+5

Running sudo pip install --pre python-docx should install python-docx.

0
