Word documents contain formatted text organized in three levels of objects: Run objects at the lowest level, Paragraph objects in the middle, and the Document object at the highest level.
Because of this structure, we cannot work with these documents in ordinary text editors. We can, however, manipulate Word documents in Python using the python-docx module.
1. Install the module with pip: pip install python-docx
2. After installation, import "docx", NOT "python-docx".
3. Use the "docx.Document" class to start working with the Word document.
Notice the page break in the second page.
Code #2: To open an existing Word document, create a Document instance and pass it the path to the file.
List of paragraph objects:->>>
[<docx.text.paragraph.Paragraph object at 0x7f45b22dc128>, <docx.text.paragraph.Paragraph object at 0x7f45b22dc5c0>, <docx.text.paragraph.Paragraph object at 0x7f45b22dc0b8>, <docx.text.paragraph.Paragraph object at 0x7f45b22dc198>, <docx.text.paragraph.Paragraph object at 0x7f45b22dc0f0>]
List of runs objects in 1st paragraph:->>>
[<docx.text.run.Run object at 0x7f45b22dc198>]
Text in the 1st paragraph:->>>
Heading for the document
The whole content of the document:->>>
Heading for the document
Your paragraph goes here, hey there, bold here, and these words are italic
Heading level 2