OCR/openspec/changes/archive/2025-11-26-enhance-memory-management/specs/memory-management/spec.md
egg a227311b2d chore: archive enhance-memory-management proposal (75/80 tasks)
Archive incomplete proposal for later continuation.
OCR processing has known quality issues to be addressed in future work.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-26 16:10:45 +08:00

Memory Management Specification

ADDED Requirements

Requirement: Model Manager

The system SHALL provide a ModelManager class that manages model lifecycle with reference counting and idle timeout mechanisms.

Scenario: Loading a model

GIVEN a request to load a model WHEN the model is not already loaded THEN the ModelManager creates a new instance and sets its reference count to 1

Scenario: Reusing loaded model

GIVEN a model is already loaded WHEN another request for the same model arrives THEN the ModelManager returns the existing instance and increments its reference count

Scenario: Unloading idle model

GIVEN a model with a reference count of zero WHEN the idle timeout period expires THEN the ModelManager unloads the model and frees its memory
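
The three scenarios above can be sketched as a small reference-counted cache. This is a minimal illustration, not the project's implementation: the method names (`acquire`, `release`, `unload_idle`), the `loader` callback, and the injectable `clock` are all assumptions made for the example.

```python
import threading
import time
from typing import Any, Callable, Dict, List


class ModelManager:
    """Reference-counted model cache with idle-timeout unloading (illustrative)."""

    def __init__(self, idle_timeout: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._idle_timeout = idle_timeout
        self._clock = clock  # injectable so the timeout can be tested without sleeping
        self._lock = threading.Lock()
        # name -> [instance, ref_count, time at which the count dropped to zero]
        self._entries: Dict[str, list] = {}

    def acquire(self, name: str, loader: Callable[[], Any]) -> Any:
        with self._lock:
            entry = self._entries.get(name)
            if entry is None:
                # Not loaded yet: create the instance with a reference count of 1.
                entry = [loader(), 1, None]
                self._entries[name] = entry
            else:
                # Already loaded: reuse the instance and increment the count.
                entry[1] += 1
                entry[2] = None
            return entry[0]

    def release(self, name: str) -> None:
        with self._lock:
            entry = self._entries[name]
            if entry[1] <= 0:
                raise RuntimeError(f"model {name!r} released more often than acquired")
            entry[1] -= 1
            if entry[1] == 0:
                entry[2] = self._clock()  # start the idle timer

    def ref_count(self, name: str) -> int:
        with self._lock:
            entry = self._entries.get(name)
            return entry[1] if entry else 0

    def unload_idle(self) -> List[str]:
        """Unload models whose count is zero and whose idle timeout has expired."""
        now = self._clock()
        unloaded = []
        with self._lock:
            for name, (_, refs, idle_since) in list(self._entries.items()):
                if refs == 0 and idle_since is not None \
                        and now - idle_since >= self._idle_timeout:
                    # Dropping the cache's reference lets the memory be reclaimed.
                    del self._entries[name]
                    unloaded.append(name)
        return unloaded
```

In this sketch a background thread or scheduler would be expected to call `unload_idle()` periodically; how the real system triggers unloading is not stated in the spec.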

Requirement: Service Pool

The system SHALL implement an OCRServicePool that manages a pool of OCRService instances with one instance per GPU/CPU device.

Scenario: Acquiring service from pool

GIVEN a task needs processing WHEN a service is requested from the pool THEN the pool returns an available service or queues the request if all services are busy

Scenario: Releasing service to pool

GIVEN a task has completed processing WHEN the service is released THEN the service becomes available for other tasks in the pool
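
A blocking queue is one simple way to get the acquire/release behaviour described above. The sketch below is an assumption-laden illustration: the `OCRService` class here is a stand-in that only records its device, and the `acquire`, `release`, and `service` method names are hypothetical.

```python
import queue
from contextlib import contextmanager
from typing import Iterator, List, Optional


class OCRService:
    """Stand-in for the real OCR service; it only records its device."""

    def __init__(self, device: str):
        self.device = device


class OCRServicePool:
    """Holds one OCRService per device and hands them out one at a time."""

    def __init__(self, devices: List[str]):
        self._available: "queue.Queue[OCRService]" = queue.Queue()
        for device in devices:
            self._available.put(OCRService(device))

    def acquire(self, timeout: Optional[float] = None) -> OCRService:
        # Blocks while every service is busy, so callers effectively queue up.
        # Raises queue.Empty if a timeout is given and it expires.
        return self._available.get(timeout=timeout)

    def release(self, service: OCRService) -> None:
        # Returning the service makes it available to the next waiting task.
        self._available.put(service)

    @contextmanager
    def service(self, timeout: Optional[float] = None) -> Iterator[OCRService]:
        svc = self.acquire(timeout)
        try:
            yield svc
        finally:
            self.release(svc)
```

A task would then wrap its work in `with pool.service() as svc:` so that the service goes back to the pool even if processing raises.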

Requirement: Memory Monitoring

The system SHALL continuously monitor GPU and CPU memory usage and trigger preventive actions based on configurable thresholds.

Scenario: Memory warning threshold

GIVEN memory usage reaches 80% (warning threshold) WHEN a new task is requested THEN the system logs a warning and may defer non-critical operations

Scenario: Memory critical threshold

GIVEN memory usage reaches 95% (critical threshold) WHEN a new task is requested THEN the system attempts CPU fallback or rejects the task
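
The two thresholds can be read as a three-way classification made when a new task is requested. The following sketch assumes the caller already knows the used and total memory in bytes; the `MemoryAction` enum and `check_memory` function are hypothetical names, and the choice between CPU fallback and rejection is left to the caller because the spec does not say how that decision is made.

```python
import logging
from enum import Enum

logger = logging.getLogger(__name__)


class MemoryAction(Enum):
    PROCEED = "proceed"
    WARN = "warn"                              # log; non-critical work may be deferred
    FALLBACK_OR_REJECT = "fallback_or_reject"  # try CPU fallback, else reject


def check_memory(used_bytes: int, total_bytes: int,
                 warning: float = 0.80, critical: float = 0.95) -> MemoryAction:
    """Classify current memory usage against the configurable thresholds."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    usage = used_bytes / total_bytes
    if usage >= critical:
        return MemoryAction.FALLBACK_OR_REJECT
    if usage >= warning:
        logger.warning("memory usage at %.0f%% (warning threshold %.0f%%)",
                       usage * 100, warning * 100)
        return MemoryAction.WARN
    return MemoryAction.PROCEED
```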

Requirement: Concurrency Control

The system SHALL limit concurrent PP-StructureV3 predictions using semaphores to prevent memory exhaustion.

Scenario: Concurrent prediction limit

GIVEN the maximum concurrent predictions is set to 2 WHEN 2 predictions are already running THEN additional prediction requests wait in a queue until a slot becomes available
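
As a rough illustration of the semaphore approach, the sketch below wraps an arbitrary prediction callable. `run_prediction` and `MAX_CONCURRENT_PREDICTIONS` are hypothetical names, and `predict` stands in for whatever callable actually invokes PP-StructureV3.

```python
import threading
from typing import Any, Callable

MAX_CONCURRENT_PREDICTIONS = 2  # assumed to come from configuration

# BoundedSemaphore raises if released more often than acquired,
# which helps catch bookkeeping bugs early.
_prediction_slots = threading.BoundedSemaphore(MAX_CONCURRENT_PREDICTIONS)


def run_prediction(predict: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Run predict(), blocking while the maximum number of predictions is active."""
    with _prediction_slots:  # the slot is released even if predict() raises
        return predict(*args, **kwargs)
```

Callers beyond the limit simply block inside `run_prediction` until a running prediction finishes, which matches the waiting behaviour in the scenario.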

Requirement: Resource Cleanup

The system SHALL ensure all resources are properly cleaned up after task completion or failure.

Scenario: Successful task cleanup

GIVEN a task completes successfully WHEN the task finishes THEN all allocated memory, temporary files, and model references are released

Scenario: Failed task cleanup

GIVEN a task fails with an error WHEN the error handler runs THEN cleanup is performed in the finally block regardless of failure reason
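
Both cleanup scenarios reduce to the same `try`/`finally` shape. The sketch below is illustrative only: `run_task`, `process`, and `release_model` are hypothetical names, and a single temporary file stands in for whatever resources a real task allocates.

```python
import os
import tempfile
from typing import Any, Callable


def run_task(process: Callable[[str], Any], release_model: Callable[[], None]) -> Any:
    """Run process(tmp_path) and clean up whether it succeeds or raises."""
    fd, tmp_path = tempfile.mkstemp(suffix=".tmp")
    os.close(fd)
    try:
        return process(tmp_path)
    finally:
        # This block runs on success and on any exception.
        try:
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
        finally:
            # Release the model reference even if file removal fails.
            release_model()
```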

MODIFIED Requirements

Requirement: OCR Service Instantiation

The OCR service instantiation SHALL use pooled instances instead of creating new instances for each task.

Scenario: Task using pooled service

GIVEN a new OCR task arrives WHEN the task starts processing THEN it acquires a service from the pool instead of creating a new instance

Requirement: PP-StructureV3 Model Management

The PP-StructureV3 model SHALL be subject to the same lifecycle management as other models, removing its permanent exemption from unloading.

Scenario: PP-StructureV3 unloading

GIVEN PP-StructureV3 has been idle for the configured timeout WHEN memory pressure is detected THEN the model can be unloaded to free memory

Requirement: Task Resource Tracking

Tasks SHALL track their resource usage including estimated and actual memory consumption.

Scenario: Task memory tracking

GIVEN a task is processing WHEN memory metrics are collected THEN the task records both estimated and actual memory usage for analysis
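
One possible shape for the per-task record is sketched below. The `TaskMemoryRecord` class and its field names are assumptions for illustration; how the estimate is produced and how the actual peak is measured are not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskMemoryRecord:
    """Estimated and actual memory for one task (illustrative field names)."""
    task_id: str
    estimated_bytes: int
    actual_bytes: Optional[int] = None  # filled in once metrics are collected

    def record_actual(self, peak_bytes: int) -> None:
        self.actual_bytes = peak_bytes

    def estimation_error(self) -> Optional[float]:
        """Relative error of the estimate, or None if it cannot be computed yet."""
        if self.actual_bytes is None or self.estimated_bytes == 0:
            return None
        return (self.actual_bytes - self.estimated_bytes) / self.estimated_bytes
```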

REMOVED Requirements

Requirement: Permanent Model Loading

The requirement for PP-StructureV3 to remain permanently loaded SHALL be removed.

Scenario: Dynamic model loading

GIVEN the system starts WHEN no tasks are using PP-StructureV3 THEN the model is not loaded until first use
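
Loading on first use can be sketched with a small lazy wrapper. `LazyModel` and its `loader` argument are hypothetical; in practice this behaviour would presumably live inside the ModelManager rather than in a separate class.

```python
import threading
from typing import Any, Callable, Optional


class LazyModel:
    """Defers calling loader() until the model is first requested."""

    def __init__(self, loader: Callable[[], Any]):
        self._loader = loader
        self._model: Optional[Any] = None
        self._lock = threading.Lock()

    @property
    def loaded(self) -> bool:
        return self._model is not None

    def get(self) -> Any:
        with self._lock:  # ensures concurrent first calls load only once
            if self._model is None:
                self._model = self._loader()
            return self._model
```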