first
52
openspec/changes/add-office-document-support/proposal.md
Normal file
@@ -0,0 +1,52 @@
# Add Office Document Support

**Status**: ✅ IMPLEMENTED & TESTED

## Summary
Add support for Microsoft Office document formats (DOC, DOCX, PPT, PPTX) to the OCR processing pipeline, and extend the JWT token validity period to 1 day (24 hours).

## Motivation
The system currently supports only image formats (PNG, JPG, JPEG) and PDF files. Many users have documents in Microsoft Office formats that require OCR processing. This change will:
1. Enable processing of Word and PowerPoint documents
2. Improve user experience by extending token validity
3. Reuse the existing PDF-to-image conversion infrastructure

## Proposed Solution

### 1. Office Document Support
- Add Python libraries for Office document conversion:
  - `python-docx2pdf` or `python-docx` + `pypandoc` for Word documents
  - `python-pptx` for PowerPoint documents
- Implement a conversion pipeline:
  - Option A: Office → PDF → Images → OCR
  - Option B: Office → Images → OCR (direct conversion)
- Extend file validation to accept the `.doc`, `.docx`, `.ppt`, and `.pptx` formats
- Add conversion methods to the `OCRService` class
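
The validation step and the first hop of Option A (Office → PDF) could be sketched roughly as follows. This is a minimal sketch, not the code from this change: all function names are hypothetical, and it assumes LibreOffice's `soffice` command-line tool is installed, which the proposal does not name. The resulting PDF would then go through the existing PDF-to-image path.

```python
import subprocess
from pathlib import Path

# Extensions this proposal adds, next to the formats already supported.
OFFICE_EXTENSIONS = {".doc", ".docx", ".ppt", ".pptx"}
EXISTING_EXTENSIONS = {".png", ".jpg", ".jpeg", ".pdf"}
ALLOWED_EXTENSIONS = OFFICE_EXTENSIONS | EXISTING_EXTENSIONS


def is_allowed_file(filename: str) -> bool:
    """Return True if the extension is one the pipeline accepts (case-insensitive)."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


def needs_office_conversion(filename: str) -> bool:
    """Return True if the file must be converted to PDF before OCR."""
    return Path(filename).suffix.lower() in OFFICE_EXTENSIONS


def build_pdf_conversion_command(source: Path, output_dir: Path) -> list[str]:
    """Build the LibreOffice command line for a headless PDF export.

    Assumption: the `soffice` binary is on PATH. The proposal itself lists
    Python packages instead; LibreOffice is swapped in here for illustration.
    """
    return [
        "soffice",
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(output_dir),
        str(source),
    ]


def convert_office_to_pdf(source: Path, output_dir: Path) -> Path:
    """Run the conversion and return the path where the PDF is expected."""
    subprocess.run(build_pdf_conversion_command(source, output_dir), check=True)
    # LibreOffice writes <original stem>.pdf into the output directory.
    return output_dir / f"{source.stem}.pdf"
```

Keeping the command construction separate from the `subprocess` call makes the validation and argument logic testable on machines without LibreOffice installed.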

### 2. Token Validity Extension
- Update `ACCESS_TOKEN_EXPIRE_MINUTES` from 30 minutes to 1440 minutes (24 hours)
- Ensure security measures are in place for longer-lived tokens
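
As a rough illustration of what the new setting implies, the sketch below derives the standard JWT `iat` and `exp` claims from `ACCESS_TOKEN_EXPIRE_MINUTES`. The helper names are hypothetical and the project's actual token-signing code is not shown here; only the arithmetic is illustrated.

```python
from datetime import datetime, timedelta, timezone

ACCESS_TOKEN_EXPIRE_MINUTES = 1440  # 24 hours; the previous value was 30


def token_expiry(issued_at: datetime) -> datetime:
    """Return the moment a token issued at `issued_at` stops being valid."""
    return issued_at + timedelta(minutes=ACCESS_TOKEN_EXPIRE_MINUTES)


def build_claims(subject: str, issued_at: datetime) -> dict:
    """Build the registered JWT claims (seconds since the epoch) for a subject."""
    expires_at = token_expiry(issued_at)
    return {
        "sub": subject,
        "iat": int(issued_at.timestamp()),
        "exp": int(expires_at.timestamp()),
    }


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(build_claims("example-user", now))
```

With this value the gap between `exp` and `iat` is 86,400 seconds, so a token issued at noon stays valid until noon the next day.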

## Impact Analysis
- **Backend Services**: Minimal changes to the existing OCR processing flow
- **Dependencies**: New Python packages for Office document handling
- **Performance**: Slight increase in processing time due to the document conversion step
- **Security**: A 24-hour token lengthens the window in which a leaked token can be misused, so the longer validity requires careful consideration
- **Storage**: Temporary files are created during the conversion process
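
The storage impact can be contained by scoping temporary files to a single conversion. The sketch below is one possible approach, using only the standard library; `run_in_temp_workspace` is a hypothetical helper, not part of the existing `OCRService`.

```python
import tempfile
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def run_in_temp_workspace(data: bytes, filename: str, process: Callable[[Path], T]) -> T:
    """Write `data` into a temporary directory, run `process` on it, then clean up."""
    # TemporaryDirectory deletes the directory and its contents when the
    # block exits, including when `process` raises an exception.
    with tempfile.TemporaryDirectory(prefix="office-ocr-") as workdir:
        # Keep only the final path component of the supplied filename.
        source = Path(workdir) / Path(filename).name
        source.write_bytes(data)
        return process(source)
```

A conversion routine would be passed in as `process`, so any intermediate PDF or image files it writes next to the source file are removed along with the directory.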

## Success Criteria
1. Successfully process Word documents (.doc, .docx) with OCR
2. Successfully process PowerPoint documents (.ppt, .pptx) with OCR
3. JWT tokens remain valid for 24 hours
4. All existing functionality continues to work
5. Conversion quality maintains text readability for OCR

## Timeline
- Implementation: 2-3 hours ✅
- Testing: 1 hour ✅
- Documentation: 30 mins ✅
- Total: ~4 hours (estimated) ✅ COMPLETED

## Actual Time
- Total development time: ~6 hours (including debugging and testing)
- Primary issues resolved: configuration loading, MIME type mapping, validation logic, API endpoint fixes