A Python tool for comparing content differences between multiple Word documents.
- Compare multiple Word documents (.docx format)
- Calculate similarity matrix between documents
- Identify content differences
- Provide detailed comparison reports
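As a rough illustration of the "identify content differences" feature, the sketch below diffs two documents paragraph by paragraph with the standard library's `difflib`. It is not taken from `main.py`: the function name `paragraph_diff` and the use of a unified diff are assumptions, and the inputs are plain lists of paragraph strings (for example, the `.text` of each paragraph read with python-docx).

```python
from difflib import unified_diff


def paragraph_diff(old_paragraphs, new_paragraphs,
                   old_name="document1.docx", new_name="document2.docx"):
    """Return unified-diff lines between two lists of paragraph strings.

    Hypothetical helper for illustration; the project's own difference
    detection may work differently.
    """
    return list(
        unified_diff(
            old_paragraphs,
            new_paragraphs,
            fromfile=old_name,
            tofile=new_name,
            lineterm="",  # paragraphs carry no trailing newline
        )
    )


if __name__ == "__main__":
    before = ["Introduction", "Budget: 100"]
    after = ["Introduction", "Budget: 120"]
    for line in paragraph_diff(before, after):
        print(line)
```

Lines prefixed with `-` appear only in the first document and lines prefixed with `+` only in the second; identical inputs yield an empty list.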
- Python 3.12+
- Dependencies: python-docx
# Clone the repository
git clone https://github.com/CrueChan/word-comparison.git
cd word-comparison
# Install dependencies
uv sync
# Or, with pip
pip install python-docx
- Place Word documents to compare in the project directory
- Modify file paths in main.py:
files = [
"document1.docx",
"document2.docx"
]
- Run the program:
python main.py
Comparison Results:
Documents have differences:
Similarity between document 1 and document 2: 95.67%
Similarity Matrix:
Document 1: ['100.00%', '95.67%']
Document 2: ['95.67%', '100.00%']
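A similarity matrix like the one above could be produced along the following lines. This is a minimal sketch, not the code in `main.py`: it assumes the score is `difflib.SequenceMatcher.ratio()` over each document's paragraph text, and the helper names `read_docx_text`, `similarity_matrix`, and `format_matrix` are hypothetical.

```python
from difflib import SequenceMatcher


def read_docx_text(path):
    """Return the paragraph text of a .docx file, joined by newlines."""
    from docx import Document  # provided by the python-docx package

    return "\n".join(p.text for p in Document(path).paragraphs)


def similarity_matrix(texts):
    """Build a symmetric matrix of pairwise similarity ratios (0.0 to 1.0)."""
    n = len(texts)
    matrix = [[1.0] * n for _ in range(n)]  # each document matches itself
    for i in range(n):
        for j in range(i + 1, n):
            # ratio() can depend on argument order, so compute it once
            # per pair and mirror it to keep the matrix symmetric.
            ratio = SequenceMatcher(None, texts[i], texts[j]).ratio()
            matrix[i][j] = matrix[j][i] = ratio
    return matrix


def format_matrix(matrix):
    """Format each row in the style of the sample output."""
    lines = []
    for i, row in enumerate(matrix, start=1):
        cells = [f"{value:.2%}" for value in row]
        lines.append(f"Document {i}: {cells}")
    return lines


if __name__ == "__main__":
    # Plain strings stand in for read_docx_text("document1.docx"), etc.
    sample = ["abcd", "abcf"]
    for line in format_matrix(similarity_matrix(sample)):
        print(line)
```

Character-level matching on long documents can be slow; comparing lists of paragraphs instead of raw strings is one possible trade-off between speed and granularity.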
word-comparison/
├── main.py # Main program file
├── pyproject.toml # Project configuration
├── uv.lock # Dependency lock file
├── .gitignore # Git ignore file
├── .python-version # Python version
└── README.md # Project documentation
Issues and Pull Requests are welcome to improve this project.