#78: NO_USABLE_TEXT (photo PDF) now finalizes immediately to FAILED_FINAL

Previously, NO_USABLE_TEXT (no OCR text in the PDF) was handled under the
1-retry rule like every other deterministic content error, so its first
occurrence ended in FAILED_RETRYABLE. An image scan without OCR text does
not change between runs, so a retry is pointless: the status must be
FAILED_FINAL right away.

Changed: ProcessingOutcomeTransition treats NO_USABLE_TEXT as a special
case and returns FAILED_FINAL without the retry check. PAGE_LIMIT_EXCEEDED
and CONTENT_NOT_EXTRACTABLE keep the 1-retry rule.
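
The rule described above can be sketched in isolation roughly as follows.
This is a simplified illustration, not the project's code: the class
TransitionSketch, the record Outcome, and the method onPreCheckFailure are
hypothetical stand-ins, and the real ProcessingOutcomeTransition works with
FailureCounters and ProcessingOutcome instead of a plain int.

```java
public class TransitionSketch {

    // Reason and status names are taken from the commit message.
    enum PreCheckFailureReason { NO_USABLE_TEXT, PAGE_LIMIT_EXCEEDED, CONTENT_NOT_EXTRACTABLE }

    enum ProcessingStatus { FAILED_RETRYABLE, FAILED_FINAL }

    // Hypothetical, simplified result type for this sketch only.
    record Outcome(ProcessingStatus status, int contentErrorCount, boolean retryable) {}

    static Outcome onPreCheckFailure(PreCheckFailureReason reason, int previousContentErrors) {
        // The content error counter is incremented in every case.
        int updatedCount = previousContentErrors + 1;

        // Special case: an image-only PDF will not gain text on a retry.
        if (reason == PreCheckFailureReason.NO_USABLE_TEXT) {
            return new Outcome(ProcessingStatus.FAILED_FINAL, updatedCount, false);
        }

        // All other deterministic content errors: 1-retry rule.
        boolean isFirstOccurrence = previousContentErrors == 0;
        return isFirstOccurrence
                ? new Outcome(ProcessingStatus.FAILED_RETRYABLE, updatedCount, true)
                : new Outcome(ProcessingStatus.FAILED_FINAL, updatedCount, false);
    }

    public static void main(String[] args) {
        System.out.println(onPreCheckFailure(PreCheckFailureReason.NO_USABLE_TEXT, 0));
        System.out.println(onPreCheckFailure(PreCheckFailureReason.PAGE_LIMIT_EXCEEDED, 0));
        System.out.println(onPreCheckFailure(PreCheckFailureReason.PAGE_LIMIT_EXCEEDED, 1));
    }
}
```

The NO_USABLE_TEXT check comes before the first-occurrence check, so the
outcome for that reason does not depend on the counter value.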

Tests adjusted: existing tests that expected FAILED_RETRYABLE for
NO_USABLE_TEXT now assert the correct behavior or were rewritten to use
PAGE_LIMIT_EXCEEDED. Added new lifecycle tests for NO_USABLE_TEXT
(immediate FAILED_FINAL → SKIPPED_FINAL_FAILURE).
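
A lifecycle of this shape could be exercised along the following lines.
The sketch assumes that SKIPPED_FINAL_FAILURE is what a later run records
for a document already in FAILED_FINAL, without running the pre-check
again; the commit message only names the two states, so that reading is an
assumption. LifecycleSketch, processRun, and the in-memory set are
hypothetical and stand in for the real use case and persistence.

```java
import java.util.HashSet;
import java.util.Set;

public class LifecycleSketch {

    enum Status { FAILED_FINAL, SKIPPED_FINAL_FAILURE }

    // Hypothetical in-memory stand-in for the persisted document status.
    private final Set<String> finallyFailed = new HashSet<>();

    // Counts how often the (simulated) pre-check actually ran.
    int preCheckRuns = 0;

    // Simulates one batch run for an image-only PDF identified by its fingerprint.
    Status processRun(String fingerprint) {
        if (finallyFailed.contains(fingerprint)) {
            // Assumption: a finally failed document is skipped, not reprocessed.
            return Status.SKIPPED_FINAL_FAILURE;
        }
        preCheckRuns++;
        // The pre-check reports NO_USABLE_TEXT, which is final immediately.
        finallyFailed.add(fingerprint);
        return Status.FAILED_FINAL;
    }

    public static void main(String[] args) {
        LifecycleSketch sketch = new LifecycleSketch();
        System.out.println(sketch.processRun("scan-without-ocr"));
        System.out.println(sketch.processRun("scan-without-ocr"));
        System.out.println("pre-check runs: " + sketch.preCheckRuns);
    }
}
```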

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 15:08:01 +02:00
parent 349ee69a7f
commit 18f9c33bbb
4 changed files with 100 additions and 18 deletions
@@ -6,6 +6,7 @@ import de.gecheckt.pdf.umbenenner.domain.model.AiTechnicalFailure;
 import de.gecheckt.pdf.umbenenner.domain.model.DocumentProcessingOutcome;
 import de.gecheckt.pdf.umbenenner.domain.model.NamingProposalReady;
 import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailed;
+import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailureReason;
 import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
 import de.gecheckt.pdf.umbenenner.domain.model.TechnicalDocumentError;
@@ -26,10 +27,14 @@ import de.gecheckt.pdf.umbenenner.domain.model.TechnicalDocumentError;
  * <li><strong>Naming proposal ready:</strong> Status becomes
  * {@link ProcessingStatus#PROPOSAL_READY}, counters unchanged,
  * {@code retryable=false}.</li>
- * <li><strong>Pre-check content error (first occurrence):</strong>
+ * <li><strong>Pre-check content error {@link PreCheckFailureReason#NO_USABLE_TEXT}:</strong>
+ * Status becomes {@link ProcessingStatus#FAILED_FINAL} immediately,
+ * content error counter incremented by 1, {@code retryable=false}.
+ * Image-only PDFs without OCR text will not yield usable text on retry.</li>
+ * <li><strong>Pre-check content error (other reason, first occurrence):</strong>
  * Status becomes {@link ProcessingStatus#FAILED_RETRYABLE},
  * content error counter incremented by 1, {@code retryable=true}.</li>
- * <li><strong>Pre-check content error (second or later occurrence):</strong>
+ * <li><strong>Pre-check content error (other reason, second or later occurrence):</strong>
  * Status becomes {@link ProcessingStatus#FAILED_FINAL},
  * content error counter incremented by 1, {@code retryable=false}.</li>
  * <li><strong>AI functional failure (first occurrence):</strong>
@@ -112,11 +117,16 @@ final class ProcessingOutcomeTransition {
 );
 }
-case PreCheckFailed ignored2 -> {
-// Deterministic content error from pre-check: apply the 1-retry rule
+case PreCheckFailed preCheckFailed -> {
 FailureCounters updatedCounters = existingCounters.withIncrementedContentErrorCount();
-boolean isFirstOccurrence = existingCounters.contentErrorCount() == 0;
+if (preCheckFailed.failureReason() == PreCheckFailureReason.NO_USABLE_TEXT) {
+// Image-only PDFs without OCR text will not change on retry.
+yield new ProcessingOutcome(ProcessingStatus.FAILED_FINAL, updatedCounters, false);
+}
+// Other deterministic content errors: apply the 1-retry rule
+boolean isFirstOccurrence = existingCounters.contentErrorCount() == 0;
 if (isFirstOccurrence) {
 yield new ProcessingOutcome(ProcessingStatus.FAILED_RETRYABLE, updatedCounters, true);
 } else {