test: add unit tests for DocumentTypeDetector

- Create test directory structure for backend
- Add pytest fixtures for test files (PDF, images, Office docs)
- Add 20 unit tests covering:
  - PDF type detection (editable, scanned, mixed)
  - Image file detection (PNG, JPG)
  - Office document detection (DOCX)
  - Text file detection
  - Edge cases (file not found, unknown types)
  - Batch processing and statistics
- Mark tasks 1.1.4 and 1.3.5 as completed in tasks.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
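
For readers without the test files at hand, here is a minimal sketch of the kind of fixture-driven detection test this commit describes. It is not the project's code: `detect_type`, `write_sample`, and the signature table are hypothetical stand-ins, since the actual `DocumentTypeDetector` API is not visible in this view. The tests rely only on pytest's built-in `tmp_path` fixture.

```python
from pathlib import Path

# Hypothetical stand-in for illustration only; the real DocumentTypeDetector
# API is not shown in this commit view.
_SIGNATURES = {
    b"%PDF-": "pdf",
    b"\x89PNG\r\n\x1a\n": "image",  # PNG signature
    b"\xff\xd8\xff": "image",       # JPEG signature
    b"PK\x03\x04": "office",        # DOCX is a ZIP container (simplified: any ZIP matches)
}


def detect_type(path: Path) -> str:
    """Classify a file by its leading magic bytes."""
    if not path.exists():
        raise FileNotFoundError(path)
    head = path.read_bytes()[:8]
    for magic, kind in _SIGNATURES.items():
        if head.startswith(magic):
            return kind
    return "unknown"


def write_sample(directory: Path, name: str, payload: bytes) -> Path:
    """Write a tiny sample file and return its path."""
    path = directory / name
    path.write_bytes(payload)
    return path


# pytest injects `tmp_path` (a per-test temporary directory) automatically.
def test_detects_pdf(tmp_path):
    assert detect_type(write_sample(tmp_path, "a.pdf", b"%PDF-1.7\n")) == "pdf"


def test_detects_png(tmp_path):
    sample = write_sample(tmp_path, "a.png", b"\x89PNG\r\n\x1a\n")
    assert detect_type(sample) == "image"


def test_unknown_type(tmp_path):
    assert detect_type(write_sample(tmp_path, "a.bin", b"\x00\x01")) == "unknown"


def test_missing_file_raises(tmp_path):
    try:
        detect_type(tmp_path / "missing.pdf")
    except FileNotFoundError:
        return
    raise AssertionError("expected FileNotFoundError")
```

Saved as a `test_*.py` file, this runs under `pytest` with no extra configuration. Distinguishing editable, scanned, and mixed PDFs would need real PDF content and is deliberately left out of the sketch.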
2025-11-19 12:14:59 +08:00
parent 1d0b63854a
commit 0fcb2492c9
6 changed files with 486 additions and 2 deletions
--- a/openspec/changes/dual-track-document-processing/tasks.md
+++ b/openspec/changes/dual-track-document-processing/tasks.md
@@ -5,7 +5,7 @@
  - [x] 1.1.1 Add PyMuPDF>=1.23.0
  - [x] 1.1.2 Add pdfplumber>=0.10.0
  - [x] 1.1.3 Add python-magic-bin>=0.4.14
-  - [ ] 1.1.4 Test dependency installation
+  - [x] 1.1.4 Test dependency installation
 - [x] 1.2 Create UnifiedDocument model in backend/app/models/
  - [x] 1.2.1 Define UnifiedDocument dataclass
  - [x] 1.2.2 Add DocumentElement model
@@ -17,7 +17,7 @@
  - [x] 1.3.2 Add PDF editability checking logic
  - [x] 1.3.3 Add Office document detection
  - [x] 1.3.4 Create routing logic to determine processing track
-  - [ ] 1.3.5 Add unit tests for detector
+  - [x] 1.3.5 Add unit tests for detector

 ## 2. Direct Extraction Track
 - [x] 2.1 Create DirectExtractionEngine service