Compare commits
2dc07d16d5 ... main (34 commits)
.gitignore (vendored, 2 lines changed)
@@ -72,3 +72,5 @@ Desktop.ini
 hs_err_pid*
 replay_pid*
 /review-input.zip
+/run-milestone.ps1
+/run-v11.ps1
CLAUDE.md (220 changed lines)
@@ -6,7 +6,6 @@ This repository implements a locally launched **PDF renamer with AI**

## Authoritative documents
@docs/specs/technik-und-architektur.md
@docs/specs/fachliche-anforderungen.md
@docs/specs/meilensteine.md

For implementation, the currently active work package under `docs/workpackages/` is always authoritative as well.
Do not guess when documents are missing, unclear, or contradictory.

@@ -16,7 +15,6 @@ The documents have the following fixed meaning:

- `docs/specs/technik-und-architektur.md` = binding technical target architecture
- `docs/specs/fachliche-anforderungen.md` = binding business rules
- `docs/specs/meilensteine.md` = permitted feature scope per milestone
- `docs/workpackages/...` = binding scope, order, and content of the currently processed work package

In case of conflicts, the following priority applies:

@@ -27,10 +25,7 @@ In case of conflicts, the following priority applies:

 2. **Business requirements**
    Binding business rules and the target business behavior.

-3. **Milestones**
-   Limit the permitted feature scope to the current development state.
-
-4. **Work packages**
+3. **Work packages**
    Define the concretely permitted implementation scope of the current step.

If documents are missing, unclear, or contradictory, do not guess and make no silent assumptions.
@@ -46,8 +41,11 @@ If documents are missing, unclear, or contradictory, do not guess and make no

 - no internal scheduler
 - Log4j2 for logging
 - SQLite as the local persistence store
-- OpenAI-compatible HTTP interface for AI access
-- API provider, base URL, and model name are **configuration**, not an architecture decision
+- AI connection via exactly **one** of the two supported provider families:
+  - **OpenAI-compatible HTTP interface** (chat-completions style)
+  - **native Anthropic Messages API** (Claude models)
+- Exactly one provider is active per run. No fallback, no parallel use.
+- The concrete provider family, base URL, and model name are **configuration**, not an architecture decision.

## Binding module structure
- `pdf-umbenenner-domain`
@@ -67,6 +65,9 @@ If documents are missing, unclear, or contradictory, do not guess and make no

 - No mixing of file system, PDF extraction, SQLite, AI HTTP, configuration, logging, naming logic, and retry decisions
 - Logging is technical infrastructure, not a business port
 - Port contracts contain neither `Path`/`File` nor NIO or JDBC types
+- The `AiNamingPort` stays provider-neutral; provider-specific types, headers, URLs, and response structures live exclusively in the respective outbound adapter implementation
+- There is no shared "abstract AI adapter" intermediate layer between the port and the concrete adapters
+- The bootstrap layer selects the **one** active `AiNamingPort` implementation based on the configuration
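The bootstrap selection rule above can be sketched in a few lines. This is an illustrative sketch only: the port signature and the stub adapters are assumptions, not the repository's actual interfaces.

```java
import java.util.Properties;

// Illustrative sketch: the bootstrap layer picks exactly one AiNamingPort
// implementation from configuration. Port shape and stub adapters are
// assumptions for illustration, not the project's real code.
public class ProviderSelection {

    interface AiNamingPort {
        String proposeName(String documentText);
    }

    static AiNamingPort select(Properties config) {
        String active = config.getProperty("ai.provider.active", "");
        switch (active) {
            case "openai-compatible":
                return text -> "proposal via OpenAI-compatible adapter";
            case "claude":
                return text -> "proposal via native Anthropic Messages API adapter";
            default:
                // A missing or invalid selection is invalid start configuration.
                throw new IllegalStateException("invalid ai.provider.active: " + active);
        }
    }
}
```

Note that the default branch fails fast: per the operating rules, an invalid provider selection must prevent the run rather than fall back silently.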

## Global business guardrails
- Target format: `YYYY-MM-DD - Titel.pdf`

@@ -85,25 +86,18 @@ If documents are missing, unclear, or contradictory, do not guess and make no

## Active implementation state

-M1 through M5 are fully completed. The active state adds the complete
-success path: produce a correctly named target copy and consistently persist the final success.
+The business and technical foundation is fully implemented, documented, tested (incl. PIT mutation tests, smoke tests, end-to-end tests), and released.

+### Baseline from M5
+- External prompt sourcing via a configurable prompt file
+- OpenAI-compatible HTTP adapter fully wired
+- Validated AI response with `date`, `title`, `reasoning`
+- Persisted naming proposal with status `PROPOSAL_READY`
+- Attempt history with AI traceability (model, prompt ID, character count, raw response, reasoning, date, date source)
+- Idempotent migration M4 → M5
+
+The active state is the extension **"Additional AI provider Anthropic Claude via the native Messages API"**. It is a deliberately minimal extension of the released baseline.

-### Goal of the active state
-- Technical file-name construction in the format `YYYY-MM-DD - Titel.pdf`
-- Duplicate handling in the target folder: `(1)`, `(2)`, …
-- Physical target copy via a temporary file and a final move/rename
-- Schema evolution to the active state (target path, target file name)
-- Status transition `PROPOSAL_READY` → `SUCCESS`
-- Additional historization for final success and technical errors (the proposal attempt is preserved)
-- Start validation for the target-folder configuration (`target.folder`)
+### Goal of the active extension
+- The existing OpenAI-compatible AI path remains usable unchanged.
+- In addition, the **native Anthropic Messages API** is integrated as a second, equally supported provider family.
+- Exactly **one** provider is active per run, selected exclusively via configuration.
+- **No** automatic fallback, **no** parallel use, **no** profile management.
+- The business AI contract (`NamingProposal` from the application/domain perspective) remains unchanged.
+- Existing properties files from the predecessor state are migrated to the new schema in a controlled way on first start; a `.bak` backup is created automatically beforehand.
+- Architecture boundaries, persistence model, status semantics, retry semantics, exit-code behavior, and minimum logging scope remain unchanged; they are only extended by the provider identifier and the provider selection.

## Status semantics

@@ -131,71 +125,137 @@ success path: produce a correctly named target copy and consistently persist

- A proposal attempt with a business-wise unusable title or date = inconsistent persistence state = document-level technical error.
- Inconsistent proposal states are **not silently healed**; they are treated as technical document errors.

-## Processing order per document (active state)
## Retry semantics

### Deterministic content errors
Deterministic content errors are in particular:
- no usable text
- page limit exceeded
- a business-wise unusable or generic title
- an AI date that is present but unusable

Rule:
- the **first** historized deterministic content error → `FAILED_RETRYABLE`
- the **second** historized deterministic content error → `FAILED_FINAL`

### Transient technical errors
- Transient errors run through the transient-error counter in the document master record.
- They remain retryable until the configured limit `max.retries.transient` is reached.
- The failed attempt that **reaches** the limit finalizes the document status to `FAILED_FINAL`.
- `max.retries.transient` = **integer >= 1**; the value `0` is invalid start configuration.
+- The classification is provider-independent: technical errors from the active AI provider fall into the same transient category as before. The inactive provider is never used as a backup in any error situation.

### Immediate technical retry
- **Exactly one** additional technical write attempt within the same document run.
- **Exclusively** for errors on the physical target-copy path.
- No repeated AI call, no repeated business derivation.
- Does **not** count toward the cross-run transient-error counter.
- Yields exactly one document-level result for persistence and status progression.
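The two retry rules above can be condensed into a provider-neutral decision function. This is a sketch for illustration; class and method names are assumptions, not the project's actual code.

```java
// Illustrative sketch of the documented retry rules (names are assumed).
public class RetryDecision {

    public enum Status { FAILED_RETRYABLE, FAILED_FINAL }

    /**
     * Deterministic content errors: the first historized error is retryable,
     * the second finalizes the document.
     */
    public static Status forContentError(int priorContentErrors) {
        return priorContentErrors == 0 ? Status.FAILED_RETRYABLE : Status.FAILED_FINAL;
    }

    /**
     * Transient technical errors: retryable until the configured limit
     * max.retries.transient (>= 1) is reached; the failed attempt that
     * reaches the limit finalizes the document.
     */
    public static Status forTransientError(int transientErrorsIncludingThis, int maxRetriesTransient) {
        if (maxRetriesTransient < 1) {
            throw new IllegalArgumentException("max.retries.transient must be >= 1");
        }
        return transientErrorsIncludingThis >= maxRetriesTransient
                ? Status.FAILED_FINAL
                : Status.FAILED_RETRYABLE;
    }
}
```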

### Skip semantics
- `SUCCESS` → historize `SKIPPED_ALREADY_PROCESSED` in later runs, no counter change.
- `FAILED_FINAL` → historize `SKIPPED_FINAL_FAILURE` in later runs, no counter change.
- `FAILED_RETRYABLE`, `READY_FOR_AI`, `PROPOSAL_READY` → processable.
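The skip decision maps cleanly to a small function over the terminal states. A sketch, with names assumed for illustration:

```java
// Illustrative sketch of the documented skip semantics (names are assumed).
public class SkipDecision {

    public enum Status { SUCCESS, FAILED_FINAL, FAILED_RETRYABLE, READY_FOR_AI, PROPOSAL_READY }

    /** Returns the skip event to historize, or null if the document is processable. */
    public static String skipEvent(Status status) {
        return switch (status) {
            case SUCCESS -> "SKIPPED_ALREADY_PROCESSED";
            case FAILED_FINAL -> "SKIPPED_FINAL_FAILURE";
            // FAILED_RETRYABLE, READY_FOR_AI, PROPOSAL_READY remain processable.
            default -> null;
        };
    }
}
```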

## Minimum logging scope

The following information must be logged traceably:
- run start with run ID
+- active AI provider for the run
- run end
- detected source file
- skipping of already successful files
- skipping of finally failed files
- generated target name
- retry decision
- errors with classification

### Correlation rule
- Before successful fingerprint determination: correlation via run ID and candidate reference.
- After successful fingerprint determination: document-level logs contain the fingerprint or an unambiguously derivable reference.
- No new persistence truth and no additional tracking layer.

### Sensitivity rule for AI content
- Full raw AI response: **not** logged by default, stays in SQLite.
- Full AI `reasoning`: **not** logged by default, stays in SQLite.
- Enabled only via an explicit boolean configuration value.
- Default: safe / do not log.
+- The sensitivity rule applies provider-independently.

## Processing order per document

1. Compute the fingerprint
2. Load the document master record
3. Decide terminal skip cases (`SUCCESS` → `SKIPPED_ALREADY_PROCESSED`, `FAILED_FINAL` → `SKIPPED_FINAL_FAILURE`)
-4. If needed: run the M5 path up to `PROPOSAL_READY`
+4. If needed: run the path up to `PROPOSAL_READY` (incl. the AI call via the active provider)
5. Load the leading `PROPOSAL_READY` attempt
6. Build the final base file name
7. Determine the duplicate suffix in the target folder
-8. Write the target copy (temporary file + final move/rename)
-9. Historize a new attempt for final success or technical error
-10. Consistently update the document master record
+8. Write the target copy (temporary file + final move/rename; on error: exactly one immediate retry)
+9. Derive the retry decision
+10. Historize a new attempt, consistently update the master record

## Target-copy semantics
- Copy first to a temporary target file in the target context
- Final move/rename to the planned target file name
- The source file **always remains unchanged**
-- No immediate retry within the same run
+- On a technical write error: exactly one immediate retry (target-copy path only)
- On a persistence error after a successful target copy: do not set `SUCCESS`, perform a best-effort rollback of the target copy; the result remains a document-level technical error
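The temp-file-plus-move semantics above can be sketched with `java.nio.file`. This is an illustrative sketch, not the repository's code; the helper name and the temp-file prefix are assumptions, and `ATOMIC_MOVE` assumes both paths sit on the same file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch of the target-copy semantics: copy to a temporary file in the
// target context, then a final move/rename. The source file is never touched.
public class TargetCopy {

    public static Path copyViaTemp(Path source, Path targetDir, String finalName) throws IOException {
        Path temp = Files.createTempFile(targetDir, "rename-", ".tmp");
        try {
            Files.copy(source, temp, StandardCopyOption.REPLACE_EXISTING);
            // Final move/rename onto the planned target file name.
            return Files.move(temp, targetDir.resolve(finalName), StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            Files.deleteIfExists(temp); // best-effort cleanup; the source stays unchanged
            throw e;
        }
    }
}
```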

-## Error semantics (active state)
-Technical errors during proposal source reading, target-path construction, duplicate resolution,
-target copy, or active persistence after fingerprint determination:
-- → document-level technical error
-- → `FAILED_RETRYABLE`, transient-error counter +1
-- → no abort of the batch run for other documents
-- → no new final error category
+## Error semantics
+- Technical errors → `FAILED_RETRYABLE`, transient-error counter +1
+- When `max.retries.transient` is reached → `FAILED_FINAL`
+- No abort of the batch run for other documents
+- No new final error category
+- Pre-fingerprint errors are **not** historized as a SQLite attempt
+- Provider-specific error variants (HTTP errors, auth errors, response-schema errors) are classified in the respective adapter and mapped to the existing error categories. No new error classes are introduced.

-## Persistence extension (active state)
+## Persistence

-**Document master record** additionally receives:
-- last target path
-- last target file name
+**Document master record** contains, among other things:
+- last target path, last target file name
+- content-error and transient-error counters
+- overall status

The two-level model remains unchanged; no third source of truth.

-**Attempt history** additionally receives:
+**Attempt history** contains, among other things:
- final target file name
- error class, error message, retryable flag
+- **provider identifier of the active AI provider for the attempt**

**Invariant:** The leading `PROPOSAL_READY` attempt is not overwritten.
Final success and technical errors of the active state are historized as **additional new attempts**.
Each run creates an **additional** new attempt entry.

**Backward compatibility:** Existing data remains readable, updatable, and correctly interpretable. Schema extensions are additive, with defined default values for historical attempts without a provider identifier.

## Naming rule (binding for all work packages)
No milestone or work-package identifiers may appear in implementations, comments, or JavaDoc:

-- Forbidden: `M1`, `M2`, `M3`, `M4`, `M5`, `M6`, `M7`, `M8`
+- Forbidden: `M1`, `M2`, …, `M8`
- Forbidden: `AP-001`, `AP-002`, … `AP-00x`
- Forbidden: version identifiers such as `V1.0`, `V1.1` in code/JavaDoc

Instead, **timeless technical names** are used.
Existing comments with such identifiers that are touched by one's own changes must be replaced.

## Way of working
-- Work only in the **explicitly active milestone** and the **explicitly active work package**
-- **No anticipation** of later milestones or work packages
+- Work only in the **explicitly active work package**
+- **No anticipation** of later work packages
- Keep changes small, focused, and true to the architecture
- No unnecessary renames, no large-scale refactorings without need
- Understand the affected files and dependencies before making changes
- **No assumptions about file paths.** Types and classes are found by searching for the type name, not via assumed paths.
- No guessing: on genuine ambiguity or document conflicts, ask briefly or name the conflict
+- No silent changes to the existing OpenAI-compatible AI path

## Definition of Done per work package
A work package is only finished when:
- the target scope of the current work package is fully implemented
- the state is consistent, error-free, and buildable
- implementation, configuration, JavaDoc, and tests are added, **where sensible for the state**
-- no content of later milestones has been anticipated
+- no content of later work packages has been anticipated
- the intermediate state is self-contained and ready for handover

## Mandatory output format after each work package
@@ -222,10 +282,11 @@ A work package is only finished when:

## Important operating rules
- Invalid start configuration prevents the processing run and leads to exit code `1`
+- An invalid or missing provider selection is invalid start configuration
- A run lock prevents parallel instances; if an instance is already running, the new instance terminates immediately
- Exit code `0`: the run executed properly in technical terms, even if individual files failed business-wise or transiently
- Exit code `1`: hard start/bootstrap error
-- The environment variable takes precedence over properties for the API key
+- API keys: one dedicated environment variable per provider, taking precedence over properties of the same provider family. Keys of different providers are never mixed.
- Document-level errors do **not** lead to exit code `1`
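The per-provider API-key precedence can be sketched as a small resolution chain. A sketch under assumptions: the environment-variable names follow the example configuration, and the class name is illustrative.

```java
import java.util.Optional;
import java.util.function.Function;

// Sketch of the documented key precedence for the OpenAI-compatible provider:
// provider env var > legacy env var (still accepted) > properties value.
public class ApiKeyResolution {

    public static Optional<String> resolve(Function<String, String> env, String propertiesValue) {
        String key = env.apply("OPENAI_COMPATIBLE_API_KEY");
        if (key == null) {
            key = env.apply("PDF_UMBENENNER_API_KEY"); // legacy variable, still accepted
        }
        if (key == null) {
            key = propertiesValue;
        }
        return Optional.ofNullable(key);
    }
}
```

Passing the environment as a `Function` keeps the sketch testable without touching real environment variables.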

## Configuration parameters

@@ -233,16 +294,51 @@ Binding, fit-for-purpose parameters:

- `source.folder` – source folder
- `target.folder` – target folder (must exist or be creatable, write access required)
- `sqlite.file` – SQLite database file
-- `api.baseUrl` – AI base URL
-- `api.model` – model name
-- `api.timeoutSeconds` – timeout
-- `max.retries.transient` – maximum transient retries
+- `ai.provider.active` – active AI provider (mandatory; permitted values are the identifiers of the supported provider families)
+- `max.retries.transient` – max. historized transient failed attempts per fingerprint (**integer >= 1**, `0` is invalid)
- `max.pages` – page limit
- `max.text.characters` – maximum character count for AI input
- `prompt.template.file` – external prompt file
- `log.ai.sensitive` – enable sensitive AI log output (boolean, default: `false`)
- `runtime.lock.file` – lock file (optional)
- `log.directory` – log directory (optional)
-- `api.key` – API key (environment variable takes precedence)

Each provider family has its own parameter namespace, appropriately covering:
- model name
- API key (environment variable takes precedence)
- timeout
- base URL (optional, where operationally sensible)

Concrete schema (fit for purpose):

```properties
ai.provider.active=openai-compatible

ai.provider.openai-compatible.baseUrl=...
ai.provider.openai-compatible.model=...
ai.provider.openai-compatible.timeoutSeconds=...
ai.provider.openai-compatible.apiKey=...

ai.provider.claude.baseUrl=...
ai.provider.claude.model=...
ai.provider.claude.timeoutSeconds=...
ai.provider.claude.apiKey=...
```
### Migration of historical configuration
Existing properties files from the predecessor state (with flat keys such as `api.baseUrl`, `api.model`, `api.timeoutSeconds`, `api.key`) are detected on first start and migrated to the new schema in a controlled way.

Binding procedure:
1. Detect the legacy form
2. Create a **`.bak` backup** of the original file
3. Transfer the content to the new schema
   - Legacy values end up in the **`openai-compatible`** namespace
   - `ai.provider.active` is set to `openai-compatible`
4. Write the file in place
5. Reload and validate the file
6. Only then continue with the normal run

The old and new structures are **not** a permanently equal-ranking final format.
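The key mapping in step 3 can be sketched as a pure transformation over the loaded properties. A sketch only; the class name is assumed, and the `.bak` backup and file I/O from the surrounding procedure are deliberately omitted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the legacy-key migration: flat keys move into the
// openai-compatible namespace and ai.provider.active is set.
public class ConfigMigration {

    public static Map<String, String> migrate(Map<String, String> legacy) {
        Map<String, String> migrated = new LinkedHashMap<>(legacy);
        move(migrated, "api.baseUrl", "ai.provider.openai-compatible.baseUrl");
        move(migrated, "api.model", "ai.provider.openai-compatible.model");
        move(migrated, "api.timeoutSeconds", "ai.provider.openai-compatible.timeoutSeconds");
        move(migrated, "api.key", "ai.provider.openai-compatible.apiKey");
        migrated.put("ai.provider.active", "openai-compatible");
        return migrated;
    }

    private static void move(Map<String, String> m, String from, String to) {
        String value = m.remove(from);
        if (value != null) {
            m.put(to, value);
        }
    }
}
```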

## Non-goals / prohibitions
- no web UI
@@ -253,7 +349,15 @@ Binding, fit-for-purpose parameters:
 - no internal scheduler logic
 - no architecture breaks
 - no new libraries or frameworks without clear necessity and justification
 - no silent changes to provider binding or architecture principles
-- no immediate retry of the target copy within the same run
-- no logging fine-tuning of the final state
+- **no** automatic fallback switching between AI providers
+- **no** parallel use of multiple AI providers in one run
+- **no** profile management with multiple configurations per provider family
+- **no** provider families beyond those explicitly supported (OpenAI-compatible, Anthropic Messages API)
+- no silent changes to the existing OpenAI-compatible AI path
+- no immediate retry outside the target-copy path
 - no reporting or statistics functions
 - no new third persistence source of truth for retry decisions
 - no new business functionality beyond the defined target picture
 - no large-scale refactoring without a demonstrable defect link
 - no speculative rebuilds without a concrete quality or consistency link
+- no mixing of API keys of different provider families
README.md (new file, 195 lines)
@@ -0,0 +1,195 @@

# PDF-Umbenenner

A locally launched Java program for AI-assisted renaming of already OCR-processed, searchable PDF files.

The application reads PDF files from a configurable source folder, extracts the text, derives a normalized file name from it via AI, and places **a copy** in the target folder. The source files remain unchanged.

## Target picture

The PDF renamer is deliberately designed as a lean batch application:

- **Java 21**
- **Maven multi-module**
- **executable standalone JAR**
- **local start**, e.g. via the **Windows Task Scheduler**
- **no web server**
- **no application server**
- **no long-running application**
- **no internal scheduler**
- **SQLite** as the local persistence store
- **Log4j2** for logging
- strict **hexagonal architecture / ports and adapters**

## Functional overview

The application processes documents in a robust, traceable flow:

1. read the source folder
2. detect PDF candidates
3. determine the fingerprint of the source file
4. skip documents that were already processed successfully or failed finally
5. extract the PDF text
6. generate an AI-based naming proposal
7. build the normalized target file name
8. resolve collisions in the target folder via duplicate suffixes
9. place the copy in the target folder
10. persist the result and attempt history in SQLite
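The idempotency in steps 3 and 4 rests on a content-based fingerprint. A minimal sketch follows; the concrete hash algorithm is not specified in this document, so SHA-256 is an assumption here, as is the class name.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.HexFormat;

// Sketch of a content-based fingerprint used for idempotency:
// identical file content yields an identical fingerprint across runs.
public class Fingerprint {

    public static String of(Path file) throws Exception {
        byte[] hash = MessageDigest.getInstance("SHA-256").digest(Files.readAllBytes(file));
        return HexFormat.of().formatHex(hash);
    }
}
```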

## File-naming rules

The target format is:

```text
YYYY-MM-DD - Titel.pdf
```

On name collisions, suffixes are inserted directly before `.pdf`:

```text
YYYY-MM-DD - Titel(1).pdf
YYYY-MM-DD - Titel(2).pdf
```

Important rules:

- the **20-character** title limit refers only to the **base title**
- the duplicate suffix does **not** count toward these 20 characters
- titles are generated in **German**
- proper names remain unchanged
- source files are **never** overwritten, moved, or changed
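The duplicate-suffix rule above can be sketched as a lookup loop over the target folder. This is an illustrative sketch, not the application's actual code; the class and method names are assumptions.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of duplicate resolution in the target folder: suffixes (1), (2), ...
// are inserted directly before ".pdf" until a free name is found.
public class DuplicateSuffix {

    public static String resolve(Path targetDir, String baseName) {
        String candidate = baseName + ".pdf";
        int suffix = 0;
        while (Files.exists(targetDir.resolve(candidate))) {
            suffix++;
            candidate = baseName + "(" + suffix + ").pdf";
        }
        return candidate;
    }
}
```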

## AI integration

The AI integration is configuration-driven. The business contract stays the same regardless of the vendor: a structured naming proposal is derived from the document content, and from it the application builds the final file name.

The current state supports multiple providers via configuration, including:

- **OpenAI-compatible endpoints**
- **Claude API**

The provider selection is **configuration**, not an architecture decision.

## Important assumptions and limits

- The application expects **already OCR-processed, searchable PDFs**.
- PDFs that are not searchable or whose content is unusable are treated as errors.
- Ambiguous documents do **not** produce an uncertain result.
- Successfully processed files are not processed again in later runs.
- Finally failed files are skipped in later runs.

## Architecture

The project is built strictly according to **ports and adapters / hexagonal architecture**.

### Modules

- `pdf-umbenenner-domain`
- `pdf-umbenenner-application`
- `pdf-umbenenner-adapter-in-cli`
- `pdf-umbenenner-adapter-out`
- `pdf-umbenenner-bootstrap`

### Core principles

- dependencies always point **inward**
- the domain knows **no infrastructure**
- external access happens exclusively via **ports**
- technical implementations live in **adapters**
- no direct adapter-to-adapter coupling

## Configuration

The application is configured via a `.properties` file.

Typical areas are:

- source folder
- target folder
- SQLite file
- AI provider and model
- timeout
- page limit
- text limit for AI calls
- prompt file
- logging

For a local start, use the example configuration at:

```text
config/application-local.example.properties
```

## Build

Project-wide:

```bash
./mvnw clean verify
```

On Windows:

```powershell
.\mvnw.cmd clean verify
```

## Start

The executable artifact is produced in the bootstrap module. It is started as a normal Java program:

```bash
java -jar <bootstrap-jar>.jar
```

The concrete JAR file depends on the current build state.

## Logging, status, and traceability

The PDF renamer is designed for traceability and repeatability:

- persistent document history in **SQLite**
- status and retry semantics for robust batch runs
- idempotency via a content-based fingerprint
- logging via **Log4j2**
- protection of sensitive AI content in the log

## Documentation in the repository

The authoritative documents are:

- `CLAUDE.md`
- `docs/specs/technik-und-architektur.md`
- `docs/specs/fachliche-anforderungen.md`
- `docs/specs/meilensteine.md`
- `docs/workpackages/...`

Recommended reading order:

1. `CLAUDE.md`
2. technical target architecture
3. business requirements
4. milestones
5. active work package

## Development guardrails

- small, focused changes
- no silent assumptions on document conflicts
- no unnecessary refactorings
- architectural fidelity takes precedence
- no milestone or work-package identifiers in production code, comments, or JavaDoc

## Project status

The repository follows an incremental, milestone-based build-out. The current product state builds on a fully implemented core for:

- configuration and start validation
- source-folder scan and PDF text extraction
- fingerprint, SQLite persistence, and idempotency
- AI integration for naming proposals
- file-name construction and target copy
- retry logic, logging, and operational robustness

## License / usage

If a concrete license is intended for this repository, it should be added here.
@@ -1,21 +1,89 @@

-# PDF Umbenenner Local Configuration Example
-# AP-005: Copy this file to config/application.properties and adjust values for local development
+# PDF Umbenenner - configuration example for local development
+# Copy this file to config/application.properties and adjust the values.

-# Mandatory M1 properties
+# ---------------------------------------------------------------------------
+# Mandatory parameters (general)
+# ---------------------------------------------------------------------------

+# Source folder: folder from which OCR-processed PDF files are read.
+# The folder must exist and be readable.
 source.folder=./work/local/source
-target.folder=./work/local/target
-sqlite.file=./work/local/pdf-umbenenner.db
-api.baseUrl=http://localhost:8080/api
-api.model=gpt-4o-mini
-api.timeoutSeconds=30
-max.retries.transient=3
-max.pages=10
-max.text.characters=5000
-prompt.template.file=./config/prompts/local-template.txt

-# Optional properties
-runtime.lock.file=./work/local/lock.pid
+# Target folder: folder where the renamed copies are placed.
+# Created automatically if it does not exist yet.
+target.folder=./work/local/target

+# SQLite database file for processing status and attempt history.
+# The parent directory must exist.
+sqlite.file=./work/local/pdf-umbenenner.db

+# Maximum number of historized transient failed attempts per document.
+# Must be an integer >= 1.
+max.retries.transient=3

+# Maximum number of pages per document. Documents with more pages are treated
+# as a deterministic content error (no AI call).
+max.pages=10

+# Maximum number of characters of the document text sent to the AI.
+max.text.characters=5000

+# Path to the external prompt file. The file name serves as the prompt
+# identifier in the attempt history.
+prompt.template.file=./config/prompts/template.txt

+# ---------------------------------------------------------------------------
+# Optional parameters
+# ---------------------------------------------------------------------------

+# Path to the lock file for start protection (prevents parallel instances).
+runtime.lock.file=./work/local/pdf-umbenenner.lock

+# Log directory. If omitted, Log4j2 writes to ./logs/.
+log.directory=./work/local/logs

+# Log level (DEBUG, INFO, WARN, ERROR). Default is INFO.
+log.level=INFO

-# api.key can also be set via environment variable PDF_UMBENENNER_API_KEY
-api.key=your-local-api-key-here

+# Write sensitive AI content (full raw response and reasoning) to the log.
+# Permitted values: true or false. Default is false (protected).
+log.ai.sensitive=false

+# ---------------------------------------------------------------------------
+# Active AI provider
+# ---------------------------------------------------------------------------
+# Permitted values: openai-compatible, claude
+ai.provider.active=openai-compatible

+# ---------------------------------------------------------------------------
+# OpenAI-compatible provider
+# ---------------------------------------------------------------------------
+# Base URL of the AI service (without a path suffix such as /chat/completions).
+ai.provider.openai-compatible.baseUrl=https://api.openai.com/v1

+# Model name of the AI service.
+ai.provider.openai-compatible.model=gpt-4o-mini

+# HTTP timeout for AI requests in seconds (must be > 0).
+ai.provider.openai-compatible.timeoutSeconds=30

+# API key.
+# Precedence order: OPENAI_COMPATIBLE_API_KEY (environment variable) >
+# PDF_UMBENENNER_API_KEY (deprecated environment variable, still accepted) >
+# ai.provider.openai-compatible.apiKey (this value)
+ai.provider.openai-compatible.apiKey=your-openai-api-key-here

+# ---------------------------------------------------------------------------
+# Anthropic Claude provider (only needed when ai.provider.active=claude)
+# ---------------------------------------------------------------------------
+# Base URL (optional; default: https://api.anthropic.com)
+# ai.provider.claude.baseUrl=https://api.anthropic.com

+# Model name (e.g. claude-3-5-sonnet-20241022)
+# ai.provider.claude.model=claude-3-5-sonnet-20241022

+# HTTP timeout for AI requests in seconds (must be > 0).
+# ai.provider.claude.timeoutSeconds=60

+# API key. The environment variable ANTHROPIC_API_KEY takes precedence.
+# ai.provider.claude.apiKey=
@@ -1,21 +1,46 @@
-# PDF Umbenenner Test Configuration Example
-# AP-005: Copy this file to config/application.properties and adjust values for testing
+# PDF Umbenenner – configuration example for test runs
+# Copy this file to config/application.properties and adjust the values.
+# This template uses shorter timeouts and lower limits for test runs.
+
+# ---------------------------------------------------------------------------
+# Mandatory parameters (general)
+# ---------------------------------------------------------------------------
 
-# Mandatory M1 properties
 source.folder=./work/test/source
 target.folder=./work/test/target
 sqlite.file=./work/test/pdf-umbenenner-test.db
-api.baseUrl=http://localhost:8081/api
-api.model=gpt-4o-mini-test
-api.timeoutSeconds=10
 
 max.retries.transient=1
 max.pages=5
 max.text.characters=2000
-prompt.template.file=./config/prompts/test-template.txt
+prompt.template.file=./config/prompts/template.txt
 
-# Optional properties
-runtime.lock.file=./work/test/lock.pid
+# ---------------------------------------------------------------------------
+# Optional parameters
+# ---------------------------------------------------------------------------
+
+runtime.lock.file=./work/test/pdf-umbenenner.lock
 log.directory=./work/test/logs
 log.level=DEBUG
-# api.key can also be set via environment variable PDF_UMBENENNER_API_KEY
-api.key=test-api-key-placeholder
+log.ai.sensitive=false
+
+# ---------------------------------------------------------------------------
+# Active AI provider
+# ---------------------------------------------------------------------------
+ai.provider.active=openai-compatible
+
+# ---------------------------------------------------------------------------
+# OpenAI-compatible provider
+# ---------------------------------------------------------------------------
+ai.provider.openai-compatible.baseUrl=https://api.openai.com/v1
+ai.provider.openai-compatible.model=gpt-4o-mini
+ai.provider.openai-compatible.timeoutSeconds=10
+ai.provider.openai-compatible.apiKey=test-api-key-placeholder
+
+# ---------------------------------------------------------------------------
+# Anthropic Claude provider (only needed when ai.provider.active=claude)
+# ---------------------------------------------------------------------------
+# ai.provider.claude.baseUrl=https://api.anthropic.com
+# ai.provider.claude.model=claude-3-5-sonnet-20241022
+# ai.provider.claude.timeoutSeconds=60
+# ai.provider.claude.apiKey=your-anthropic-api-key-here
@@ -1 +1,22 @@
-This is a test prompt template for AP-006 validation.
+You are an assistant for automatically naming scanned PDF documents.
+
+Analyze the following document text and determine:
+
+1. A German title that fits the content (at most 20 characters, letters and spaces only, no abbreviations, no generic labels such as "Dokument", "Datei", "Scan" or "PDF")
+2. The most relevant date of the document
+
+Date selection by priority:
+- invoice date
+- document date
+- issue date or notice date
+- writing date or end of a service period
+- give no date if no reliable date can be derived unambiguously
+
+Title rules:
+- formulate the title in German
+- keep proper names (people, companies, places) unchanged
+- at most 20 characters (the base title only, without the date prefix)
+- no special characters except spaces
+- unambiguous and understandable, not generic
+
+If the document cannot be interpreted unambiguously, describe this in the reasoning.
docs/befundliste.md (new file, 175 lines)
@@ -0,0 +1,175 @@
# Findings List – Integrated Overall Review and Release of the Final State

**Created:** 2026-04-08
**Updated:** 2026-04-08 (naming-convention cleanup B1 completed, final release)
**Basis:** Full Maven reactor build, unit tests, E2E tests, integration tests (smoke),
PIT mutation analysis, code review against the binding specifications (technik-und-architektur.md,
fachliche-anforderungen.md, CLAUDE.md)

---

## Executed Checks

| Check area | Executed | Result |
|---|---|---|
| Maven reactor build (clean verify, all modules) | yes | GREEN |
| Unit tests (domain, application, adapter-out, bootstrap) | yes | GREEN |
| E2E tests (BatchRunEndToEndTest, 11 scenarios) | yes | GREEN |
| Integration tests / smoke IT (ExecutableJarSmokeTestIT, 2 tests) | yes | GREEN |
| PIT mutation analysis (all modules) | yes | see individual findings |
| Hexagonal architecture – domain isolation | yes | GREEN |
| Hexagonal architecture – port contracts (no Path/NIO/JDBC) | yes | GREEN |
| Hexagonal architecture – no adapter-to-adapter dependencies | yes | GREEN |
| Status model (8 values, semantics per CLAUDE.md) | yes | GREEN |
| Naming-convention rule (no M1–M8, no AP-xxx in code) | yes | GREEN |
| Logging sensitivity rule (log.ai.sensitive) | yes | GREEN |
| Exit-code semantics (0 / 1) | yes | GREEN |
| Configuration examples (mandatory and optional parameters) | yes | GREEN |
| Operations documentation (docs/betrieb.md) | yes | GREEN |
| Prompt template in the repository | yes | GREEN |
| Backward compatibility M4–M7 (status model, schema) | yes (static) | GREEN |

---
## Green Areas (no findings)

### Build and Tests

- Full Maven reactor build successful (`BUILD SUCCESS`, total runtime ~4 minutes)
- **827+ tests** passed, 0 failures, 0 skipped:
  - Domain: 227 tests
  - Application: 295 tests
  - Adapter-out: 227 tests
  - Bootstrap (unit): 76 tests
  - Smoke IT: 2 tests

### E2E Scenarios (BatchRunEndToEndTest)

All required core scenarios from the E2E test basis are covered and green:

- Happy path: two runs → `SUCCESS`
- Deterministic content failure: two runs → `FAILED_FINAL`
- Transient AI failure → `FAILED_RETRYABLE`
- Skip after `SUCCESS` → `SKIPPED_ALREADY_PROCESSED`
- Skip after `FAILED_FINAL` → `SKIPPED_FINAL_FAILURE`
- `PROPOSAL_READY` finalization without a new AI call in the second run
- Target-copy failure with immediate retry → `SUCCESS`
- Transient failures across multiple runs → exhaustion → `FAILED_FINAL`
- Target-copy failure, both attempts failed → `FAILED_RETRYABLE`
- Two different documents, same proposed name → duplicate suffix `(1)`
- Mixed batch: one success, one content failure → batch outcome `SUCCESS` (exit code 0)

### Hexagonal Architecture

- **Domain** fully infrastructure-free: no imports from `java.nio`, `java.io.File`,
  JDBC, Log4j or HTTP libraries
- **Port contracts** (all interfaces in `application.port.out`) contain no `Path`,
  `File`, NIO or JDBC types; only domain types appear in signatures
- **No adapter-to-adapter dependencies** in `adapter-out`: no module directly references
  another adapter implementation package
- **Dependency direction** correct: adapter-out → application → domain

### Business Rules

- Status model complete (8 values: `READY_FOR_AI`, `PROPOSAL_READY`, `SUCCESS`,
  `FAILED_RETRYABLE`, `FAILED_FINAL`, `SKIPPED_ALREADY_PROCESSED`,
  `SKIPPED_FINAL_FAILURE`, `PROCESSING`)
- Retry semantics implemented correctly (deterministic: 1 retry → final;
  transient: up to `max.retries.transient`)
- Skip semantics correct (SUCCESS → skip, FAILED_FINAL → skip, no counter changes)
- Leading proposal source: the `PROPOSAL_READY` attempt is correctly used as the source
- SUCCESS condition: only after target copy and consistent persistence

### Logging and Sensitivity

- `log.ai.sensitive` mechanism fully implemented and tested
- Default `false` (safe): AI raw response and reasoning are not logged
- Persistence in SQLite is independent of this setting
- Configuration documented in both example files

### Configuration and Documentation

- `config/application-local.example.properties`: complete, all mandatory and
  optional parameters present
- `config/application-test.example.properties`: complete
- `config/prompts/template.txt`: prompt template present in the repository
- `docs/betrieb.md`: operations documentation covering startup, configuration, exit codes,
  basic retry behavior, logging sensitivity
- Configuration parameter names consistent between documentation and code

### Exit-Code Semantics

- Exit code `0`: technically proper run (even with partial failures of individual documents)
- Exit code `1`: hard startup/bootstrap errors, invalid configuration, lock errors
- Implementation in `PdfUmbenennerApplication` and `BootstrapRunner` correct

### PIT Mutation Analysis (overall)

- Domain: 83% mutation kill rate
- Adapter-out: 83% mutation kill rate
- Application: 87% test strength
- Bootstrap: 76% kill rate (34 mutations, 26 killed)

---
## Completed Items

### B1 – Naming-Convention Violations in Code, Tests and Configuration (CLAUDE.md § naming rule)

**Topic area:** documentation / code quality
**Norm:** CLAUDE.md explicitly forbids milestone (M1–M8) and work-package identifiers (AP-xxx)
in implementations, comments and JavaDoc.
**Status:** **FIXED** – all 43 occurrences in `.java` files as well as the comment header in
`config/application.properties` were replaced with timeless technical wording.

---

## Documented Edge Points (no action required, release-compatible)

#### B2 – StartConfiguration in the Application Layer Contains java.nio.file.Path (architecture edge case)

**Topic area:** architecture
**Norm:** "Application orchestrates use cases and contains no technical
implementation details" (technik-und-architektur.md §3.1); port contracts must not contain
NIO types (CLAUDE.md).
**Finding:** `StartConfiguration` (in `application/config/startup/`) is a Java record
with `java.nio.file.Path` fields for `sourceFolder`, `targetFolder`, `sqliteFile`,
`promptTemplateFile`, `runtimeLockFile`, `logDirectory`.
**Context:** `StartConfiguration` is not a port contract but an immutable
configuration DTO, created exclusively by bootstrap and handed to adapters.
The port contracts themselves are clean (no Path types in port interfaces).
**Assessment:** edge case. `Path` is not a business object, but in this context it is also
not a serious architecture violation. The alternative (string representation resolved
in the adapter) would add no value for the operating model.
**Decision:** no action required. Moving `StartConfiguration` into the bootstrap module
would be an option, but it is not mandatory since there is no functional defect.

---

#### B3 – Surviving PIT Mutants in Bootstrap (bootstrap: 76% kill rate)

**Topic area:** test quality
**Finding:** 8 surviving mutants in the bootstrap module (34 generated, 26 killed).
Main category: `VoidMethodCallMutator` (2 survivors, 2 without coverage).
**Assessment:** mainly affects logging calls and non-critical helper methods;
no functionally significant decision paths are affected.
**Decision:** no action required; the result was consolidated at an acceptable level.

---

## Summary and Release

| Classification | Count | Description |
|---|---|---|
| Release blockers | **0** | – |
| Completed (was not blocking) | **1** | B1 naming-convention cleanup |
| Documented edge points (release-compatible) | **2** | B2 Path edge case, B3 PIT bootstrap |

**Release decision: the final state is production-ready and released.**

All core business, technical and architectural requirements from the binding
specifications (technik-und-architektur.md, fachliche-anforderungen.md, CLAUDE.md) are
fully implemented and secured by automated tests. The Maven build passes without errors.
The CLAUDE.md naming-convention rule (no M1–M8, no AP-xxx in production or test code)
is fully observed. No known specification-relevant blockers remain open.
docs/betrieb.md (new file, 289 lines)
@@ -0,0 +1,289 @@
# Operations Documentation – PDF Umbenenner

## Purpose

The PDF Umbenenner reads already OCR-processed, searchable PDF files from a
configured source folder, determines a normalized German file name via an AI call,
and places a copy in the configured target folder. The source file remains unchanged.

---

## Prerequisites

- Java 21 (JRE or JDK)
- Access to an OpenAI-compatible AI service (API key required)
- Source folder with OCR-processed PDF files
- Write access to the target folder and the database directory

---

## Starting the Executable JAR

The Maven build produces the executable JAR in the directory
`pdf-umbenenner-bootstrap/target/`:

```
java -jar pdf-umbenenner-bootstrap/target/pdf-umbenenner-bootstrap-0.0.1-SNAPSHOT.jar
```

The application reads its configuration from `config/application.properties`, relative to
the working directory in which the command is executed.

### Starting via Windows Task Scheduler

Recommended startup sequence for the Windows Task Scheduler:

1. Action: start a program/script
2. Program: `java`
3. Arguments: `-jar C:\Pfad\zur\Installation\pdf-umbenenner-bootstrap\target\pdf-umbenenner-bootstrap-0.0.1-SNAPSHOT.jar`
4. Start in: `C:\Pfad\zur\Installation` (must be the directory containing `config\application.properties` and `config\prompts\`)

> **Note:** The "Start in" directory is the application's working directory.
> The configuration file `config/application.properties` and the prompt directory
> `config/prompts/` must be reachable relative to this directory. The JAR path
> in the arguments must be absolute, or correct relative to the "Start in" directory.

---

## Configuration

The configuration is loaded from `config/application.properties`.
Templates for local and test configurations are located in:

- `config/application-local.example.properties`
- `config/application-test.example.properties`

### Mandatory Parameters (general)

| Parameter | Description |
|-------------------------|--------------|
| `source.folder` | Source folder with OCR PDFs (must exist and be readable) |
| `target.folder` | Target folder for renamed copies (created if missing) |
| `sqlite.file` | SQLite database file (parent directory must exist) |
| `ai.provider.active` | Active AI provider: `openai-compatible` or `claude` |
| `max.retries.transient` | Maximum transient failed attempts per document (integer, >= 1) |
| `max.pages` | Maximum page count per document (integer, > 0) |
| `max.text.characters` | Maximum number of characters of document text for AI requests (integer, > 0) |
| `prompt.template.file` | Path to the external prompt file (must exist) |

### Provider Parameters

Only the **active** provider must be fully configured. The inactive provider is not validated.

**OpenAI-compatible provider** (`ai.provider.active=openai-compatible`):

| Parameter | Description |
|-----------|--------------|
| `ai.provider.openai-compatible.baseUrl` | Base URL of the AI service (e.g. `https://api.openai.com/v1`) |
| `ai.provider.openai-compatible.model` | Model name (e.g. `gpt-4o-mini`) |
| `ai.provider.openai-compatible.timeoutSeconds` | HTTP timeout in seconds (integer, > 0) |
| `ai.provider.openai-compatible.apiKey` | API key (environment variable `OPENAI_COMPATIBLE_API_KEY` takes precedence) |

**Anthropic Claude provider** (`ai.provider.active=claude`):

| Parameter | Description |
|-----------|--------------|
| `ai.provider.claude.baseUrl` | Base URL (optional; default: `https://api.anthropic.com`) |
| `ai.provider.claude.model` | Model name (e.g. `claude-3-5-sonnet-20241022`) |
| `ai.provider.claude.timeoutSeconds` | HTTP timeout in seconds (integer, > 0) |
| `ai.provider.claude.apiKey` | API key (environment variable `ANTHROPIC_API_KEY` takes precedence) |

### Optional Parameters

| Parameter | Description | Default |
|---------------------|--------------|---------|
| `runtime.lock.file` | Lock file for startup protection | `pdf-umbenenner.lock` in the working directory |
| `log.directory` | Log directory | `./logs/` |
| `log.level` | Log level (`DEBUG`, `INFO`, `WARN`, `ERROR`) | `INFO` |
| `log.ai.sensitive` | Write AI raw response and reasoning to the log (`true`/`false`) | `false` |

### API Keys

Each provider family has its own environment variable, which takes precedence over the properties value:

| Provider | Environment variable |
|---|---|
| `openai-compatible` | `OPENAI_COMPATIBLE_API_KEY` |
| `claude` | `ANTHROPIC_API_KEY` |

Keys of different provider families are never mixed.

---
## Migrating Older Configuration Files

Older configuration files that still use the flat keys `api.baseUrl`, `api.model`,
`api.timeoutSeconds` and `api.key` are **automatically** converted to the current
schema on first startup.

### What Happens

1. The application detects the deprecated form by the flat `api.*` keys.
2. **Before any change**, a backup copy of the original file is created:
   - default case: `config/application.properties.bak`
   - if `.bak` already exists: `config/application.properties.bak.1`, `.bak.2`, …
   - existing backups are **never overwritten**.
3. The file is converted in place to the new schema:
   - `api.baseUrl` → `ai.provider.openai-compatible.baseUrl`
   - `api.model` → `ai.provider.openai-compatible.model`
   - `api.timeoutSeconds` → `ai.provider.openai-compatible.timeoutSeconds`
   - `api.key` → `ai.provider.openai-compatible.apiKey`
   - `ai.provider.active=openai-compatible` is added.
   - All other keys remain unchanged.
4. The migrated file is written via a temporary file (`*.tmp`) and an atomic
   move/rename. The original is never left partially written.
5. The migrated file is immediately re-read and validated.
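As an illustration of the key mapping described above (placeholder values only, not taken from a real installation), a migrated file changes as follows:

```properties
# Before migration (deprecated flat schema)
api.baseUrl=https://api.openai.com/v1
api.model=gpt-4o-mini
api.timeoutSeconds=30
api.key=your-openai-api-key-here

# After migration (current schema; all other keys remain unchanged)
ai.provider.active=openai-compatible
ai.provider.openai-compatible.baseUrl=https://api.openai.com/v1
ai.provider.openai-compatible.model=gpt-4o-mini
ai.provider.openai-compatible.timeoutSeconds=30
ai.provider.openai-compatible.apiKey=your-openai-api-key-here
```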
### On Migration Failure

If validation of the migrated file fails, the application aborts with exit code `1`.
In that case the backup copy (`.bak`) is preserved and contains the unchanged
original file. The configuration must then be corrected manually.

### Operator Note

The environment variable `PDF_UMBENENNER_API_KEY` from the previous release is **not**
renamed automatically. If this value has been used so far, it must be switched to
`OPENAI_COMPATIBLE_API_KEY`.

---
## Prompt Configuration

The prompt is loaded from the external text file configured in `prompt.template.file`.
The file name of the prompt file serves as the prompt identifier in the attempt history
(SQLite), making it traceable which prompt version was used for which
processing attempt.

A template is provided in `config/prompts/template.txt` and can be used directly or
adapted to the respective AI service.

The application automatically extends the prompt with:
- a document-text section
- an explicit JSON response specification with the fields `title`, `reasoning` and `date`

The prompt in `template.txt` therefore does **not** need to contain a JSON format
instruction, only the content-level task for the AI.
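An AI response matching the three-field specification above might look like this (illustrative values only; the exact wire format is defined by the application, not shown in this document):

```json
{
  "title": "Stromabrechnung",
  "reasoning": "Utility invoice; the invoice date 2026-03-31 is the most relevant date.",
  "date": "2026-03-31"
}
```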
---

## Target Format

Each successfully processed PDF file is placed in the target folder under the following name:

```
YYYY-MM-DD - Titel.pdf
```

On name collisions, a running suffix is appended:

```
YYYY-MM-DD - Titel(1).pdf
YYYY-MM-DD - Titel(2).pdf
```

The suffix does not count toward the 20 characters of the base title.
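The naming scheme above can be sketched in a few lines (a minimal illustration; the class and method names are hypothetical and not taken from the code base):

```java
import java.time.LocalDate;

public class TargetNameSketch {
    // Builds "YYYY-MM-DD - Titel.pdf"; for the n-th collision "(n)" is
    // appended. The suffix is not counted against the 20-character limit
    // of the base title.
    static String targetName(LocalDate date, String baseTitle, int collision) {
        String suffix = collision == 0 ? "" : "(" + collision + ")";
        // LocalDate.toString() yields the ISO form YYYY-MM-DD
        return date + " - " + baseTitle + suffix + ".pdf";
    }

    public static void main(String[] args) {
        LocalDate d = LocalDate.of(2026, 3, 31);
        System.out.println(targetName(d, "Stromabrechnung", 0));
        System.out.println(targetName(d, "Stromabrechnung", 1));
    }
}
```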
---

## Retry and Skip Behavior

### Document Status

The following table describes the persistent status values of the document master records
in the SQLite database. These values are stored durably after a processing attempt
has completed.

| Status | Meaning |
|-----------------------------|-----------|
| `READY_FOR_AI` | Processable, AI path not yet taken |
| `PROPOSAL_READY` | AI naming proposal available, target copy not yet written |
| `SUCCESS` | Successfully processed and copied (terminal end state) |
| `FAILED_RETRYABLE` | Failed, retry possible in a later run |
| `FAILED_FINAL` | Terminally failed, will not be processed again |
| `SKIPPED_ALREADY_PROCESSED` | Skipped – document already processed successfully |
| `SKIPPED_FINAL_FAILURE` | Skipped – document terminally failed |

In addition, the system knows the transient state `PROCESSING`, which may be set on the
master record while a document is actively being processed. After the processing attempt
completes, it is always replaced by one of the states above and is not a valid
end status in the database.

### Retry Rules

**Deterministic content failures** (e.g. no extractable text, page limit exceeded,
unusable AI title):

- first failure → `FAILED_RETRYABLE` (one retry allowed in a later run)
- second failure → `FAILED_FINAL` (no further attempt)

**Transient technical failures** (e.g. AI unreachable, HTTP timeout):

- retryable up to the limit `max.retries.transient`
- when the limit is reached → `FAILED_FINAL`

**Immediate technical retry:**

On a write failure of the target copy, exactly one immediate retry is made within
the same run. This retry does not count toward the cross-run failure counter.
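The retry rules above can be sketched as a pure decision function (illustrative only; the enum and method names are hypothetical, not the project's actual API):

```java
public class RetrySketch {
    enum Status { FAILED_RETRYABLE, FAILED_FINAL }

    // Deterministic content failure: one retry in a later run, then final.
    // previousFailures counts deterministic failures before this one.
    static Status afterDeterministicFailure(int previousFailures) {
        return previousFailures == 0 ? Status.FAILED_RETRYABLE : Status.FAILED_FINAL;
    }

    // Transient technical failure: retryable until max.retries.transient
    // is reached. failureCount includes the failure that just happened.
    static Status afterTransientFailure(int failureCount, int maxRetriesTransient) {
        return failureCount < maxRetriesTransient ? Status.FAILED_RETRYABLE : Status.FAILED_FINAL;
    }
}
```

With `max.retries.transient=3`, the first two transient failures leave the document retryable; the third marks it final.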
---

## Logging

Logs are written to the configured `log.directory` (default: `./logs/`).
Log rotation happens daily and whenever a file reaches 10 MB.

### Sensitive AI Content

By default, the full AI raw response and the AI reasoning are **not** written to the log;
they are stored exclusively in the SQLite database.

For diagnostic purposes this output can be enabled with `log.ai.sensitive=true`.
Allowed values: `true` or `false`. Any other value is invalid and prevents startup.

---

## Exit Codes

| Code | Meaning |
|------|-----------|
| `0` | Run executed technically correctly (even with document-level partial failures) |
| `1` | Hard startup or bootstrap error (invalid configuration, lock not acquirable, schema initialization error) |

Document-level failures of individual PDF files do **not** lead to exit code `1`.

---

## Startup Protection (parallel-instance protection)

The application uses an exclusive lock file to prevent parallel instances.
If an instance is already running, the new instance terminates immediately with exit code `1`.

The path of the lock file is configurable via `runtime.lock.file`.
Without configuration, `pdf-umbenenner.lock` in the working directory is used.

---

## SQLite Database

The SQLite file contains:

- **Document master records**: overall status, failure counters, last target file name, timestamps
- **Attempt history**: every processing attempt with model, prompt identifier,
  AI raw response, reasoning, date, title and failure status

The database is the leading source of truth for processing status and traceability.
It does not need to be managed manually – the schema is initialized automatically at startup.

---

## System Boundaries

- Only OCR-processed, searchable PDF files are processed
- No built-in OCR functionality
- No web UI, no REST API, no interactive operation
- No internal scheduler – startup happens externally (e.g. Windows Task Scheduler)
- Source files are never overwritten, moved or deleted
- Identification uses the SHA-256 fingerprint of the file content, not the file name
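A content fingerprint of this kind takes only a few lines with the JDK's `MessageDigest` (an illustrative sketch, not the project's actual implementation):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

public class FingerprintSketch {
    // Streams the file content through SHA-256 and returns the
    // lowercase hex digest; the file name plays no role.
    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        return HexFormat.of().formatHex(digest.digest());
    }
}
```

Streaming in chunks keeps memory usage constant even for large PDFs; `HexFormat` is available from Java 17 onward, so it fits the Java 21 requirement above.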
@@ -1,5 +1,11 @@
 # Technology and Architecture – PDF-Umbenenner with AI
 
+> **Version note v2**
+> This revision extends the AI integration by a second, equally supported provider.
+> Only the sections required for multi-provider capability were changed:
+> technology stack (section 5), AI integration (section 11), configuration (section 14), and the closing assessment (section 19).
+> All other sections remain unchanged in content.
+
 ## 1. Goal and Scope
 
 This document describes the binding technical target architecture for the **PDF-Umbenenner**.
@@ -130,7 +136,7 @@ Contains technical implementations of the outbound ports, in particular:
 - file system
 - PDFBox
 - SQLite
-- OpenAI-compatible HTTP client
+- AI HTTP clients (one implementation per supported provider, see section 11)
 - properties/environment configuration
 - run lock
 - clock
@@ -139,7 +145,8 @@ Contains technical implementations of the outbound ports, in particular:
 Responsible for:
 - loading and validating the configuration
 - creating the object graph
-- wiring of all adapters and ports
+- selecting and wiring the **one** active AI provider implementation
+- wiring of all remaining adapters and ports
 - starting the CLI adapter
 - setting the exit code
 
@@ -162,13 +169,18 @@ The following are used bindingly:
 - **SQLite** as local persistence store
 - **SQLite JDBC driver**
 - **Log4j2** for logging
-- **OpenAI-compatible HTTP API** for AI access
 - **Java HTTP Client** or a technically equivalent standard HTTP component
 - **JSON library** for robust JSON serialization and validation
 
+For the AI integration, **two equally supported provider families** are technically permitted:
+- the **OpenAI-compatible HTTP interface** (chat-completions style)
+- the **native Anthropic Messages API** for Claude models
+
+Per run, exactly **one** of these provider implementations is active. The selection happens exclusively via configuration (see section 14).
+
 The following are deliberately not fixed:
-- the concrete AI provider
-- the concrete AI base URL
+- the concrete AI provider within a provider family
+- the concrete base URL
 - the concrete model name
 
 These three points are **pure configuration** and explicitly **not an architecture decision**.
@@ -196,6 +208,8 @@ Binding, purpose-built outbound ports:
 - `RunLockPort`
 - `ClockPort`
 
+The `AiNamingPort` remains **provider-neutral**. It knows neither OpenAI- nor Anthropic-specific types, headers, URLs or response formats. Provider-specific details (endpoint, authentication, request/response format) live exclusively in the respective adapter-out implementations.
+
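A provider-neutral port of this kind could look roughly as follows (an illustrative sketch; the actual interface in the code base may differ in names and types):

```java
import java.time.LocalDate;
import java.util.Optional;

// Outbound port: the application asks "name this document" and receives a
// proposal. No HTTP, provider or JSON types leak into this contract.
interface AiNamingPort {
    NamingProposal proposeName(String documentText, String promptTemplate);
}

// Domain-level result carrying the proposed title, the model's reasoning
// and the most relevant document date (absent if none could be derived).
record NamingProposal(String title, String reasoning, Optional<LocalDate> date) {}
```

Because the port only speaks in domain types, the OpenAI-compatible adapter and the Claude adapter can both implement it without the application noticing which one is wired in.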
### 6.3 Logging
Logging ist **kein fachlicher Port**. Logging ist technische Infrastruktur.

@@ -234,6 +248,8 @@ Die Verarbeitung einer einzelnen Datei erfolgt in dieser Reihenfolge:
16. temporäre Zieldatei final verschieben/umbenennen
17. Erfolg und Versuchshistorie persistent speichern

Die Verarbeitungsschritte sind **provider-unabhängig**. Welcher konkrete KI-Adapter Schritt 9 ausführt, ist außerhalb der Application nicht sichtbar.
### 7.3 Erfolgskriterium
Ein Dokument gilt genau dann als erfolgreich verarbeitet, wenn:
1. brauchbarer PDF-Text vorliegt,
@@ -288,63 +304,15 @@ Beispiele:
- `2026-03-31 - Stromabrechnung(1).pdf`
- `2026-03-31 - Stromabrechnung(2).pdf`
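Die Dublettenregel mit `(1)`, `(2)`, ... lässt sich z. B. so skizzieren (illustrative Annahme, keine verbindliche Implementierung):

```java
import java.nio.file.Files;
import java.nio.file.Path;

public final class TargetNameResolver {

    /**
     * Liefert den ersten freien Zielnamen nach dem Muster
     * "YYYY-MM-DD - Titel.pdf"; bei Kollision wird das Suffix
     * "(1)", "(2)", ... vor der Dateiendung hochgezählt.
     */
    static Path resolve(Path targetFolder, String baseName) {
        Path candidate = targetFolder.resolve(baseName + ".pdf");
        int suffix = 1;
        while (Files.exists(candidate)) {
            candidate = targetFolder.resolve(baseName + "(" + suffix + ").pdf");
            suffix++;
        }
        return candidate;
    }
}
```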
### 8.6 Windows-Kompatibilität
Die Anwendung stellt zusätzlich sicher, dass der Zielname für Windows zulässig ist. Unzulässige Zeichen sind technisch zu entfernen oder kontrolliert zu ersetzen.

---
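Eine mögliche Skizze der Zeichenbereinigung; die konkrete Ersetzungsstrategie ist bewusst nicht verbindlich festgelegt:

```java
public final class WindowsFileNames {

    /**
     * Ersetzt unter Windows unzulässige Zeichen (\ / : * ? " < > |)
     * sowie Steuerzeichen kontrolliert durch "_" und entfernt
     * abschließende Punkte und Leerzeichen, die Windows nicht erlaubt.
     */
    static String sanitize(String name) {
        String cleaned = name
                .replaceAll("[\\\\/:*?\"<>|]", "_")
                .replaceAll("\\p{Cntrl}", "_");
        return cleaned.replaceAll("[. ]+$", "");
    }
}
```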
## 9. Retry- und Fehlersemantik

> Inhaltlich unverändert gegenüber der Vorgängerfassung. Nur die Erkenntnis, dass technische KI-Fehler unabhängig vom konkreten Provider als transient klassifiziert werden, gilt jetzt für **beide** Provider-Familien gleichermaßen.

---

## 9. Fehlerklassifikation und Retry-Regeln

### 9.1 Grundsatz
Nur **retryable** Fehler dürfen in späteren Läufen erneut verarbeitet werden.

**Finale** Fehler werden in späteren Läufen übersprungen.

### 9.2 Deterministische Inhaltsfehler
Deterministische Inhaltsfehler sind insbesondere:
- kein brauchbarer PDF-Text
- Seitenlimit überschritten
- Dokument inhaltlich mehrdeutig
- kein brauchbarer Titel
- generischer oder unzulässiger Titel
- von der KI gelieferter Datumswert ist vorhanden, aber unbrauchbar oder nicht interpretierbar

Regel:
- genau **1 Retry** in einem späteren Scheduler-Lauf
- danach **finaler Fehler**

### 9.3 Transiente technische Fehler
Transiente technische Fehler sind insbesondere:
- KI nicht erreichbar
- HTTP-Timeout
- temporäre IO-Fehler
- temporäre SQLite-Sperre
- ungültiges oder nicht parsebares KI-JSON
- sonstige vorübergehende technische Infrastrukturfehler

Regel:
- Retry in späteren Läufen bis zum konfigurierten Maximalwert
### 9.4 Technischer Sofort-Wiederholversuch
Zusätzlich zulässig ist genau **ein technischer Sofort-Wiederholversuch** innerhalb desselben Laufs für den Zielkopiervorgang, wenn das Schreiben der Zieldatei fehlschlägt.

Dieser Mechanismus ist **kein fachlicher Retry** und wird getrennt vom laufübergreifenden Retry-Modell behandelt.
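Skizze des technischen Sofort-Wiederholversuchs (genau ein zweiter Schreibversuch innerhalb desselben Laufs; Namen illustrativ):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public final class TargetCopy {

    /**
     * Verschiebt die temporäre Zieldatei und wiederholt den Vorgang
     * bei einem IO-Fehler genau einmal. Schlägt auch der zweite
     * Versuch fehl, greift die normale laufübergreifende Fehlersemantik.
     */
    static void moveWithOneImmediateRetry(Path tmp, Path target) throws IOException {
        try {
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException firstFailure) {
            // genau ein technischer Sofort-Wiederholversuch, kein fachlicher Retry
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```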
### 9.5 Statusmodell
Verbindlich zweckmäßige Statuswerte:
- `SUCCESS`
- `FAILED_RETRYABLE`
- `FAILED_FINAL`
- `SKIPPED_ALREADY_PROCESSED`
- `SKIPPED_FINAL_FAILURE`

Ein technischer Zwischenstatus `PROCESSING` ist zusätzlich zulässig und sinnvoll.
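Als Java-Enum skizziert (Typname `DocumentStatus` ist eine illustrative Annahme; der Zwischenstatus `PROCESSING` ist mit aufgenommen):

```java
/** Verbindliche Statuswerte des Dokuments plus technischer Zwischenstatus. */
public enum DocumentStatus {
    PROCESSING,                  // technischer Zwischenstatus während eines Laufs
    SUCCESS,                     // terminaler Enderfolg
    FAILED_RETRYABLE,            // erneute Verarbeitung in späteren Läufen zulässig
    FAILED_FINAL,                // terminaler Endfehler
    SKIPPED_ALREADY_PROCESSED,   // historisierter Skip nach SUCCESS
    SKIPPED_FINAL_FAILURE;       // historisierter Skip nach FAILED_FINAL

    /** Terminal sind nur echter Enderfolg und finaler Fehler. */
    boolean terminal() {
        return this == SUCCESS || this == FAILED_FINAL;
    }
}
```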
---

## 10. Idempotenz und Identifikation
## 10. Identifikation und Reproduzierbarkeit

### 10.1 Identifikation
Die Identifikation eines Dokuments erfolgt **nicht** über den Dateinamen.
@@ -362,35 +330,63 @@ Daraus folgt:
Reproduzierbarkeit bedeutet technisch:
- nach einem erfolgreichen Lauf bleibt das gespeicherte Ergebnis stabil
- erfolgreiche Dateien werden nicht erneut KI-basiert bewertet
- KI-Aufrufe werden, soweit die API es zulässt, mit möglichst geringer Varianz konfiguriert
- Prompt-Version und Modellname werden persistiert
- KI-Aufrufe werden, soweit die jeweilige API es zulässt, mit möglichst geringer Varianz konfiguriert
- Prompt-Version, Modellname **und der Name des aktiven Providers** werden persistiert

---

## 11. KI-Integration

### 11.1 Schnittstelle
Die KI wird ausschließlich über eine **OpenAI-kompatible HTTP-Schnittstelle** angebunden.

Basis-URL, Modellname und API-Key sind reine Konfiguration.

### 11.1 Unterstützte Provider-Familien
Die KI wird über genau **eine** der folgenden Provider-Familien angebunden:

1. **OpenAI-kompatible HTTP-Schnittstelle**
   Chat-Completions-Stil. Geeignet für OpenAI selbst und für jeden API-kompatiblen Drittanbieter.
2. **Native Anthropic Messages API**
   Die offizielle Anthropic-Schnittstelle zur Nutzung von Claude-Modellen.

### 11.2 Prompt
Pro Lauf ist genau **ein** Provider aktiv. Es gibt:
- **keine** automatische Fallback-Umschaltung
- **keine** parallele Nutzung mehrerer Provider in einem Lauf
- **keine** Profilverwaltung mit mehreren Konfigurationen je Provider-Familie

Die Auswahl erfolgt ausschließlich über Konfiguration. Ein Fehler des aktiven Providers ist und bleibt ein Fehler dieses einen Pfads und folgt der bestehenden Retry- und Fehlersemantik.
### 11.2 Architekturelle Einbettung
- Pro Provider-Familie existiert **genau eine** Implementierung des `AiNamingPort` im Modul `pdf-umbenenner-adapter-out`.
- Provider-spezifische Endpunkte, Header, Authentifizierungsverfahren, Request- und Response-Strukturen leben ausschließlich in der jeweiligen Adapter-Implementierung.
- Application und Domain bleiben provider-neutral. Sie kennen weder den Begriff „OpenAI“ noch „Claude“.
- Das **Bootstrap-Modul** wählt anhand der Konfiguration die eine aktive Implementierung aus und verdrahtet sie als `AiNamingPort`.
- Adapter dürfen nicht voneinander abhängen. Es gibt keinen gemeinsamen „abstrakten KI-Adapter“ als Infrastrukturschicht zwischen Port und konkreten Adaptern.
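Die Bootstrap-Auswahl lässt sich z. B. so skizzieren. Klassen- und Rückgabenamen sind illustrative Annahmen; ein realer Bootstrap würde statt eines Namens direkt die Adapter-Instanz als `AiNamingPort` verdrahten:

```java
import java.util.Properties;

public final class AiProviderWiring {

    /**
     * Wählt anhand von ai.provider.active genau eine Adapter-Implementierung.
     * Unbekannte oder fehlende Werte sind ein harter Startfehler (Exit-Code 1).
     */
    static String selectAdapter(Properties config) {
        String active = config.getProperty("ai.provider.active");
        return switch (active == null ? "" : active) {
            case "openai-compatible" -> "OpenAiCompatibleNamingAdapter"; // illustrativer Name
            case "claude"            -> "AnthropicMessagesNamingAdapter"; // illustrativer Name
            default -> throw new IllegalStateException(
                    "Ungültige Startkonfiguration: ai.provider.active=" + active);
        };
    }
}
```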
### 11.3 Einheitlicher fachlicher Vertrag
Unabhängig vom aktiven Provider gilt derselbe fachliche Vertrag:
- gleicher fachlicher Input (Prompt, Textausschnitt, Modellbezug)
- gleicher fachlicher Output (Domain-Typ `NamingProposal`)
- gleiche Validierungs- und Folgeprozesse in der Application
- keine provider-spezifische Verzweigung im fachlichen Kern

Jede provider-spezifische Antwort wird im Adapter auf denselben Domain-Typ abgebildet. Eine Sonderbehandlung im Use-Case oder in der Domain ist unzulässig.

### 11.4 Prompt
Der Prompt wird **nicht** im Code fest verdrahtet.

Verbindlich:
- externe Prompt-Datei
- Prompt-Version oder Prompt-Dateiname wird mitpersistiert
- der Prompt darf die KI zur Ausgabe eines deutschen Titels anweisen
- derselbe Prompt wird providerübergreifend verwendet; provider-spezifische Anpassungen finden ausschließlich in der Adapter-Implementierung statt

### 11.3 Textmenge
### 11.5 Textmenge
Es wird nicht zwingend der komplette extrahierte PDF-Text an die KI gesendet.

Verbindlich:
- die maximale Zeichenzahl ist konfigurierbar
- die Begrenzung muss vor dem KI-Aufruf technisch angewendet werden
- die Begrenzung gilt providerunabhängig
### 11.4 Antwortformat
Die KI muss genau ein parsebares JSON-Objekt liefern.
### 11.6 Antwortformat
Die KI muss – unabhängig vom aktiven Provider – fachlich genau ein parsebares JSON-Objekt liefern.

Zweckmäßiges Schema:

@@ -408,7 +404,9 @@ Regeln:
- `date` ist optional, wenn kein belastbares Datum ableitbar ist
- liefert die KI kein `date`, setzt die Anwendung das aktuelle Datum als Fallback

### 11.5 Antwortvalidierung
Wie der Adapter dieses Schema aus der jeweiligen Provider-Antwort extrahiert (z. B. aus `choices[].message.content` bei OpenAI-kompatiblen Schnittstellen oder aus dem Content-Block-Array der Anthropic Messages API), ist eine reine Adapter-Implementierungsfrage.
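Zur Illustration der Adapter-Aufgabe: beide Provider-Antwortformen werden auf denselben fachlichen Inhalt reduziert. Der Sketch nutzt zur Vereinfachung reguläre Ausdrücke; reale Adapter verwenden die vorgesehene JSON-Bibliothek. Die Feldpfade entsprechen den öffentlich dokumentierten Formaten (`choices[].message.content` bzw. `content[].text`):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class ProviderContentExtraction {

    /** OpenAI-kompatibel: Inhalt steht in choices[0].message.content. */
    static String fromOpenAiStyle(String body) {
        return firstGroup(body, "\"content\"\\s*:\\s*\"([^\"]*)\"");
    }

    /** Anthropic Messages API: Inhalt steht im Content-Block-Array unter text. */
    static String fromAnthropicStyle(String body) {
        return firstGroup(body, "\"text\"\\s*:\\s*\"([^\"]*)\"");
    }

    private static String firstGroup(String body, String regex) {
        Matcher m = Pattern.compile(regex).matcher(body);
        if (!m.find()) {
            // unbrauchbare Antwortstruktur -> transienter technischer Fehler
            throw new IllegalStateException("KI-Antwort nicht interpretierbar");
        }
        return m.group(1);
    }
}
```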
### 11.7 Antwortvalidierung
Die Antwort gilt nur dann als technisch brauchbar, wenn:
- JSON parsebar ist
- `title` vorhanden ist
@@ -418,6 +416,11 @@ Zusätzlich gilt fachlich:
- `title` muss validierbar und brauchbar sein
- ein vorhandenes `date` muss im Format `YYYY-MM-DD` interpretierbar sein

Diese Validierung ist provider-unabhängig und liegt in Application/Domain.
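Die Datumsprüfung im Format `YYYY-MM-DD` entspricht `DateTimeFormatter.ISO_LOCAL_DATE`; eine minimale Skizze (Typname illustrativ):

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Optional;

public final class DateValidation {

    /**
     * Liefert das Datum, wenn der Wert strikt als YYYY-MM-DD interpretierbar
     * ist, sonst Optional.empty(). Ein vorhandener, aber nicht
     * interpretierbarer Wert ist fachlich ein deterministischer Inhaltsfehler;
     * ein fehlender Wert wird per Fallback mit dem aktuellen Datum gefüllt.
     */
    static Optional<LocalDate> parseStrict(String raw) {
        if (raw == null || raw.isBlank()) {
            return Optional.empty();
        }
        try {
            return Optional.of(LocalDate.parse(raw)); // ISO_LOCAL_DATE
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }
}
```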
### 11.8 Fehlerklassifikation
Technische Fehler des aktiven Providers (HTTP-Fehler, Timeouts, ungültige Antwortstrukturen, Authentifizierungsfehler) werden im Adapter erkannt und auf die bestehende technische Fehlersemantik des Projekts abgebildet (transient vs. deterministisch). Es entsteht keine neue Fehlerkategorie. Der inaktive Provider wird in keiner Fehlersituation als Backup verwendet.

---

## 12. PDF-Verarbeitung
@@ -457,6 +460,8 @@ Die Persistenz wird zweckmäßig in **zwei Ebenen** geführt:
1. **Dokument-Stammsatz** pro Fingerprint
2. **Versuchshistorie** mit einem Datensatz pro Verarbeitungsversuch

Das bestehende Schema bleibt erhalten. Es wird ausschließlich um die Information erweitert, **welcher Provider** den jeweiligen Versuch erzeugt hat (siehe 13.4). Eine neue Wahrheitsquelle entsteht nicht.

### 13.3 Dokument-Stammsatz
Mindestens zweckmäßig zu speichern:
- interne ID
@@ -485,6 +490,7 @@ Für **jeden Versuch separat** zu speichern:
- Fehlerklasse
- Fehlermeldung
- Retryable-Flag
- **Provider-Identifikator des aktiven KI-Providers für diesen Versuch**
- Modellname
- Prompt-Identifikator
- verarbeitete Seitenzahl
@@ -496,11 +502,16 @@ Für **jeden Versuch separat** zu speichern:
- finaler Titel
- finaler Zieldateiname

Der Provider-Identifikator macht jeden Versuch eindeutig und nachvollziehbar einer Provider-Familie zuordenbar, ohne den fachlichen Vertrag zu verändern.

### 13.5 Sensible Inhalte
Die vollständige KI-Rohantwort wird in SQLite gespeichert.

Sie soll **standardmäßig nicht vollständig in Logdateien** geschrieben werden.

### 13.6 Rückwärtsverträglichkeit
Bestehende Datenbestände aus dem Stand vor v2 müssen weiterhin lesbar, fortschreibbar und korrekt interpretierbar bleiben. Schema-Erweiterungen erfolgen additiv und mit definierten Defaultwerten für historische Versuche ohne Provider-Identifikator.

---
## 14. Konfiguration
@@ -508,33 +519,77 @@ Sie soll **standardmäßig nicht vollständig in Logdateien** geschrieben werden
### 14.1 Format
Die technische Konfiguration erfolgt über `.properties`.

### 14.2 Mindestparameter
### 14.2 Provider-Auswahl
Genau ein Provider ist aktiv. Die Auswahl erfolgt über einen einzigen Pflichtparameter, der den aktiven Provider benennt. Zulässige Werte sind die Bezeichner der unterstützten Provider-Familien aus Abschnitt 11.1.

### 14.3 Mindestparameter
Verbindlich zweckmäßige Parameter:

- `source.folder`
- `target.folder`
- `sqlite.file`
- `api.baseUrl`
- `api.model`
- `api.timeoutSeconds`
- **`ai.provider.active`** – Auswahl des aktiven Providers (Pflicht)
- `max.retries.transient`
- `max.pages`
- `max.text.characters`
- `prompt.template.file`

Pro unterstützter Provider-Familie existiert ein eigener Parameter-Namensraum mit zweckmäßig mindestens:
- Modellname
- API-Schlüssel
- Timeout
- Basis-URL (optional, wo betrieblich sinnvoll)

Konkretes Schema (zweckmäßig, frei wählbare Bezeichner):

```properties
ai.provider.active=openai-compatible

ai.provider.openai-compatible.baseUrl=...
ai.provider.openai-compatible.model=...
ai.provider.openai-compatible.timeoutSeconds=...
ai.provider.openai-compatible.apiKey=...

ai.provider.claude.baseUrl=...
ai.provider.claude.model=...
ai.provider.claude.timeoutSeconds=...
ai.provider.claude.apiKey=...
```

Zusätzlich zweckmäßig:
- `runtime.lock.file`
- `log.directory`
- `log.level`
- `api.key`
- `log.ai.sensitive`
### 14.3 API-Key
Der API-Key darf über Umgebungsvariable oder Properties geliefert werden.
### 14.4 API-Schlüssel
API-Schlüssel dürfen über Umgebungsvariable oder Properties geliefert werden.

Verbindlich:
- Umgebungsvariable hat Vorrang
- pro Provider-Familie existiert eine **eigene definierte Umgebungsvariable**
- die Umgebungsvariable hat **Vorrang** vor dem Properties-Wert derselben Provider-Familie
- Schlüssel verschiedener Provider-Familien werden niemals vermischt
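Die Vorrangregel lässt sich z. B. so skizzieren. Der Name der Umgebungsvariablen ist eine illustrative Annahme und im Dokument nicht verbindlich festgelegt:

```java
import java.util.Map;
import java.util.Properties;

public final class ApiKeyResolution {

    /**
     * Löst den API-Schlüssel der aktiven Provider-Familie auf:
     * Umgebungsvariable hat Vorrang vor dem Properties-Wert.
     * Schlüssel anderer Provider-Familien werden nie herangezogen.
     */
    static String resolveApiKey(String providerId, Map<String, String> env, Properties props) {
        // Variablenname pro Provider-Familie, hier nur illustrativ gebildet
        String envVar = "PDF_UMBENENNER_" + providerId.toUpperCase().replace('-', '_') + "_API_KEY";
        String fromEnv = env.get(envVar);
        if (fromEnv != null && !fromEnv.isBlank()) {
            return fromEnv;
        }
        return props.getProperty("ai.provider." + providerId + ".apiKey");
    }
}
```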
### 14.4 Konfigurationsvalidierung
Beim Start müssen alle Pflichtparameter validiert werden.
### 14.5 Migration historischer Konfigurationen
Bestehende Properties-Dateien aus dem Stand vor v2 (mit flachen Schlüsseln wie `api.baseUrl`, `api.model`, `api.timeoutSeconds`, `api.key`) sind eine eindeutig erkennbare Legacy-Form.

Beim ersten Start mit erkannter Legacy-Form gilt verbindlich:
1. Legacy-Form erkennen
2. **`.bak`-Sicherung** der Originaldatei anlegen
3. Inhalt in das neue Schema überführen
   - die Legacy-Werte werden in den Namensraum der Provider-Familie **`openai-compatible`** überführt
   - `ai.provider.active` wird auf `openai-compatible` gesetzt
4. neue Datei schreiben (In-Place-Update)
5. Datei erneut laden und validieren
6. erst danach den normalen Lauf fortsetzen

Es ist **kein** Ziel, alte und neue Struktur dauerhaft gleichrangig als Endformat zu pflegen.
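Die Erkennungs- und Überführungsschritte können z. B. so skizziert werden. Vereinfachende Annahme: die Erkennung erfolgt hier allein über den flachen Schlüssel `api.baseUrl`; die `.bak`-Sicherung und das In-Place-Update der Datei sind ausgespart:

```java
import java.util.Properties;

public final class LegacyConfigMigration {

    /** Legacy-Form liegt vor, wenn flache api.*-Schlüssel ohne ai.provider.active existieren. */
    static boolean isLegacy(Properties p) {
        return p.getProperty("ai.provider.active") == null
                && p.getProperty("api.baseUrl") != null;
    }

    /** Überführt Legacy-Schlüssel in den Namensraum openai-compatible. */
    static Properties migrate(Properties legacy) {
        Properties migrated = new Properties();
        migrated.putAll(legacy);
        migrated.setProperty("ai.provider.active", "openai-compatible");
        move(migrated, "api.baseUrl", "ai.provider.openai-compatible.baseUrl");
        move(migrated, "api.model", "ai.provider.openai-compatible.model");
        move(migrated, "api.timeoutSeconds", "ai.provider.openai-compatible.timeoutSeconds");
        move(migrated, "api.key", "ai.provider.openai-compatible.apiKey");
        return migrated;
    }

    private static void move(Properties p, String oldKey, String newKey) {
        String value = (String) p.remove(oldKey);
        if (value != null) {
            p.setProperty(newKey, value);
        }
    }
}
```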
### 14.6 Konfigurationsvalidierung
Beim Start müssen alle Pflichtparameter validiert werden, insbesondere:
- `ai.provider.active` ist gesetzt und benennt einen unterstützten Provider
- für den aktiven Provider sind alle Pflichtwerte vorhanden und technisch konsistent
- für den **inaktiven** Provider werden keine Pflichtwerte erzwungen

Bei ungültiger Startkonfiguration:
- beginnt kein Verarbeitungslauf
@@ -553,6 +608,7 @@ Das Logging muss mindestens enthalten:
- Laufstart
- Laufende
- Lauf-ID
- **aktiver KI-Provider für den Lauf**
- erkannte Quelldatei
- Überspringen bereits erfolgreicher Dateien
- Überspringen final fehlgeschlagener Dateien
@@ -566,6 +622,7 @@ Standardmäßig gilt:
- vollständige KI-Rohantwort **in SQLite**
- `reasoning` darf geloggt werden, sofern dies betrieblich gewünscht ist
- die Ausgabe sensibler Inhalte muss konfigurierbar sein
- die Sensibilitätsregel gilt provider-unabhängig

### 15.4 Speicherort
Das Log-Verzeichnis ist konfigurierbar. Ohne explizite Konfiguration ist ein lokales `logs/`-Verzeichnis im Programmkontext zweckmäßig.
@@ -598,7 +655,7 @@ Verbindliche Interpretation:
- `1`: Lauf konnte wegen hartem Start-/Bootstrap-Fehler nicht ordnungsgemäß beginnen oder fortgesetzt werden

Typische `1`-Fälle:
- ungültige Konfiguration
- ungültige Konfiguration (einschließlich fehlender oder unbekannter `ai.provider.active`)
- Run-Lock nicht erwerbbar
- essentielle Ressourcen beim Start nicht verfügbar

@@ -616,23 +673,30 @@ Nicht Bestandteil dieser Architektur sind:
- menschlicher Review-Workflow
- interne Scheduler-Logik
- fachliche Identifikation über Dateinamen
- automatische Fallback-Umschaltung zwischen KI-Providern
- parallele Nutzung mehrerer KI-Provider in einem Lauf
- mehrere konkurrierende Konfigurationen je Provider-Familie (Profilverwaltung)
- Provider-Familien jenseits der in Abschnitt 11.1 explizit genannten

---

## 19. Abschlussbewertung

Der technische Zielstand ist mit den hier festgelegten Regeln:
Der technische Zielstand ist mit den in dieser Fassung festgelegten Regeln:
- konsistent
- widerspruchsfrei
- hexagonal sauber geschnitten
- für einen minimalen produktiven PDF-Umbenenner zweckmäßig
- offen für genau zwei gleichwertig unterstützte KI-Provider-Familien, ohne den fachlichen Kern zu verändern

Besonders verbindlich geklärt sind jetzt:
Besonders verbindlich geklärt sind:
- Dateinamensformat mit `YYYY-MM-DD - Titel.pdf`
- Dublettenregel mit `(1)`, `(2)`, ...
- Trennung zwischen finalen und retrybaren Fehlern
- Fallback-Datum durch die Anwendung
- Zwei-Ebenen-Persistenz mit Versuchshistorie
- Zwei-Ebenen-Persistenz mit Versuchshistorie inkl. Provider-Identifikator
- Exit-Code-Regel für harte Startfehler
- OpenAI-kompatible Schnittstelle ohne fest verdrahteten Provider
- Unterstützung von OpenAI-kompatibler Schnittstelle **und** nativer Anthropic Messages API
- genau **ein** aktiver Provider pro Lauf, ohne Fallback
- Verlagerung technischer Persistenzobjekte aus der Domain heraus
- Migration historischer flacher Properties-Konfiguration mit `.bak`-Sicherung
540
docs/workpackages/M7 - Arbeitspakete.md
Normal file

@@ -0,0 +1,540 @@
# M7 - Arbeitspakete

## Geltungsbereich

Dieses Dokument beschreibt ausschließlich die Arbeitspakete für den definierten Meilenstein **M7 – Fehlerbehandlung, Retry-Logik, Logging und betriebliche Robustheit**.

Die Meilensteine **M1**, **M2**, **M3**, **M4**, **M5** und **M6** werden als vollständig umgesetzt vorausgesetzt.

Die Arbeitspakete sind bewusst so geschnitten, dass:

- **KI 1** daraus je Arbeitspaket einen klaren Einzel-Prompt ableiten kann,
- **KI 2** genau dieses eine Arbeitspaket in **einem Durchgang** vollständig umsetzen kann,
- nach **jedem** Arbeitspaket wieder ein **fehlerfreier, buildbarer Stand** vorliegt.

Die Reihenfolge der Arbeitspakete ist verbindlich.
## Zusätzliche Schnittregeln für die KI-Bearbeitung

- Pro Arbeitspaket nur die **minimal notwendigen Querschnitte** durch Domain, Application, Adapter und Bootstrap ändern.
- Keine Annahmen treffen, die nicht durch dieses Dokument oder die verbindlichen Spezifikationen gedeckt sind.
- Kein Vorgriff auf **M8+**.
- Kein Umbau bestehender M1–M6-Strukturen ohne direkten M7-Bezug.
- Neue Typen, Entscheidungsregeln, Konfigurationswerte, Repository-Erweiterungen und Adapter so schneiden, dass sie aus einem einzelnen Arbeitspaket heraus **klar benennbar, testbar und reviewbar** sind.
- M7 schärft und vervollständigt die bereits vorhandene Fehler- und Statussemantik aus M3–M6, erfindet sie aber nicht stillschweigend neu.
- M7 muss vorhandene M4–M6-Datenbestände **weiterhin lesen und korrekt fortschreiben** können.
- Jeder positive M7-Zwischenstand muss bereits einen **robusten, wiederholt ausführbaren Task-Scheduler-Lauf** liefern, auch wenn der Retry-, Logging- und Exit-Code-Endstand erst mit späteren Arbeitspaketen vollständig erreicht wird.
- Ein Arbeitspaket darf nur dann auf Repository- oder Persistenzfähigkeiten aufbauen, wenn diese entweder bereits aus M1–M6 vorhanden sind oder im unmittelbar vorhergehenden Arbeitspaket explizit hergestellt wurden.

## Explizit nicht Bestandteil von M7

- neue KI-Funktionalität oder Prompt-Evolution jenseits der robusten Weiterverwendung des M5-Stands
- neue fachliche Benennungsregeln über M5/M6 hinaus
- neue Dateisystem-Funktionalität jenseits des M6-Zielkopierpfads und des in M7 konkret geforderten technischen Sofort-Wiederholversuchs
- Reporting-, Statistik- oder Monitoring-Funktionen
- Web-UI, REST-API oder Benutzerinteraktion
- OCR, Inhaltsänderung von PDFs oder manuelle Nachbearbeitung
- abschließender Gesamt-Feinschliff, großflächige Refactorings oder generelle Qualitätskampagnen aus **M8**

## Verbindliche M7-Regeln für **alle** Arbeitspakete

### 1. M7 schließt die Betriebslücke zwischen M6 und dem finalen Zielbild

M6 liefert den vollständigen Erfolgspfad, aber noch nicht die vollständige betriebliche Robustheit des Endstands. Ab M7 gilt daher verbindlich:

- `SUCCESS` bleibt der echte terminale Enderfolg.
- `FAILED_FINAL` bleibt der terminale Endfehler.
- `FAILED_RETRYABLE` darf nur solange bestehen bleiben, wie **mindestens ein weiterer Scheduler-Lauf fachlich zulässig** ist.
- `SKIPPED_ALREADY_PROCESSED` und `SKIPPED_FINAL_FAILURE` bleiben reine historisierte Skip-Ergebnisse und verändern selbst keine Fehlerzähler.
- Dokumentbezogene Fehler dürfen den Gesamtbatch nicht unnötig abbrechen.

### 2. Vollständige Retry-Regel für deterministische Inhaltsfehler

Ab M7 gilt die vollständige fachliche Regel über spätere Läufe hinweg:

- deterministische Inhaltsfehler erhalten **genau einen** späteren Wiederholungsversuch,
- der **erste** historisierte deterministische Inhaltsfehler eines Fingerprints führt zu `FAILED_RETRYABLE`,
- der **zweite** historisierte deterministische Inhaltsfehler desselben Fingerprints führt zu `FAILED_FINAL`.

Für M7 sind mindestens alle bereits aus M3–M6 konkret erzeugbaren deterministischen Inhaltsfehler in diesen Regelrahmen einzuordnen, insbesondere:

- kein brauchbarer Text,
- Seitenlimit überschritten,
- fachlich unbrauchbarer oder generischer Titel,
- vorhandenes, aber unbrauchbares KI-Datum.

Bereits vorhandene oder künftig im bestehenden Fachmodell erzeugte Mehrdeutigkeitsfälle laufen in denselben deterministischen Inhaltsfehler-Rahmen und erzeugen **kein** unsicheres Ergebnis.

### 3. Vollständige Retry-Regel für transiente technische Fehler

Ab M7 gilt für dokumentbezogene technische Fehler nach erfolgreicher Fingerprint-Ermittlung:

- sie laufen über den **Transientfehlerzähler**,
- sie bleiben nur bis zum konfigurierten Grenzwert retryable,
- nach Ausschöpfen der zulässigen transienten Fehlversuche wird der Dokumentstatus `FAILED_FINAL`.

Für die M7-Implementierung ist `max.retries.transient` verbindlich als **maximal zulässige Anzahl historisierter transienter Fehlversuche pro Fingerprint** zu interpretieren. Der Fehlversuch, der diesen Grenzwert erreicht, finalisiert den Dokumentstatus.

Zusätzlich gilt:

- `max.retries.transient` ist ein **ganzzahliger Wert >= 1**.
- Der Wert `0` ist **ungültige Startkonfiguration**.
- Beispiel: `1` bedeutet, dass bereits der **erste** historisierte transiente Fehlversuch finalisiert.
- Beispiel: `2` bedeutet, dass der **erste** historisierte transiente Fehlversuch retryable bleibt und der **zweite** finalisiert.
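Die Regeln aus Abschnitt 2 und 3 lassen sich als zentraler Entscheidungsbaustein skizzieren (Typ- und Methodennamen sind illustrative Annahmen):

```java
public final class RetryDecision {

    enum Outcome { FAILED_RETRYABLE, FAILED_FINAL }

    /**
     * Deterministische Inhaltsfehler: der erste historisierte Fehlversuch
     * bleibt retryable, der zweite finalisiert.
     * failureCount = Anzahl historisierter deterministischer Fehlversuche
     * inklusive des aktuellen Versuchs.
     */
    static Outcome forDeterministicContentFailure(int failureCount) {
        return failureCount >= 2 ? Outcome.FAILED_FINAL : Outcome.FAILED_RETRYABLE;
    }

    /**
     * Transiente technische Fehler: retryable bis zum konfigurierten
     * Grenzwert; der Fehlversuch, der maxRetriesTransient erreicht,
     * finalisiert. Der Wert 0 ist ungültige Startkonfiguration.
     */
    static Outcome forTransientTechnicalFailure(int failureCount, int maxRetriesTransient) {
        if (maxRetriesTransient < 1) {
            throw new IllegalStateException("max.retries.transient muss >= 1 sein");
        }
        return failureCount >= maxRetriesTransient
                ? Outcome.FAILED_FINAL
                : Outcome.FAILED_RETRYABLE;
    }
}
```

Die beiden Beispiele aus dem Text (`1` finalisiert sofort, `2` finalisiert beim zweiten Fehlversuch) ergeben sich direkt aus dem Grenzwertvergleich.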
### 4. Technischer Sofort-Wiederholversuch ist strikt auf den Zielkopierpfad begrenzt

Der in der Zielarchitektur vorgesehene technische Sofort-Wiederholversuch wird in M7 exakt wie folgt umgesetzt:

- **genau ein** zusätzlicher technischer Schreibversuch innerhalb desselben Dokumentlaufs,
- ausschließlich für Fehler beim physischen Zielkopierpfad aus M6,
- **kein** erneuter KI-Aufruf,
- **keine** erneute fachliche Titel-/Datumsableitung,
- **keine** Ausweitung auf Prompt-Laden, KI-HTTP, SQLite oder sonstige Adapter.

Der Sofort-Wiederholversuch ist ein technischer Mechanismus innerhalb desselben Laufs und **kein** zusätzlicher fachlicher Retry-Lauf im Sinne der laufübergreifenden Retry-Regeln.

### 5. Skip-Semantik des Endstands

Ab M7 gilt vollständig:

- `SUCCESS` wird in späteren Läufen **nicht erneut verarbeitet**, sondern mit `SKIPPED_ALREADY_PROCESSED` historisiert.
- `FAILED_FINAL` wird in späteren Läufen **nicht erneut verarbeitet**, sondern mit `SKIPPED_FINAL_FAILURE` historisiert.
- `FAILED_RETRYABLE`, `READY_FOR_AI` und `PROPOSAL_READY` bleiben verarbeitbar, soweit der jeweilige Dokumentzustand dies fachlich zulässt.
- Ein nach M6 noch offenes `PROPOSAL_READY` darf in M7 weiterhin sauber bis zum echten Enderfolg finalisiert werden.

### 6. Logging-Mindestumfang des Endstands

Das Logging muss ab M7 mindestens folgende Informationen nachvollziehbar liefern:

- Laufstart,
- Laufende,
- Lauf-ID,
- erkannte Quelldatei,
- Überspringen bereits erfolgreicher Dateien,
- Überspringen final fehlgeschlagener Dateien,
- erzeugter Zielname,
- Retry-Entscheidung,
- Fehler mit Klassifikation.

Die Logs müssen so geschnitten werden, dass dokumentbezogene Entscheidungen pro Fingerprint bzw. Kandidat nachvollziehbar bleiben, ohne zusätzliche Infrastrukturtypen in Domain oder Application zu ziehen.

Zusätzlich gilt für die Korrelation:

- sobald ein Fingerprint erfolgreich bestimmt wurde, müssen dokumentbezogene Logeinträge diesen Fingerprint oder eine daraus eindeutig ableitbare Referenz enthalten,
- solange noch kein Fingerprint vorliegt, erfolgt die Korrelation mindestens über Lauf-ID und erkannte Quelldatei bzw. Kandidatenbezug,
- M7 führt hierfür **keine** neue Persistenz-Wahrheit und **keine** zusätzliche Tracking-Ebene ein.

### 7. Sensibilitätsregel für KI-Inhalte im Logging

Ab M7 gilt verbindlich:

- die vollständige KI-Rohantwort bleibt in **SQLite** speicherbar,
- die vollständige KI-Rohantwort wird **standardmäßig nicht** ins Log geschrieben,
- `reasoning` wird ebenfalls **standardmäßig nicht** vollständig ins Log geschrieben,
- die Ausgabe sensibler KI-Inhalte ist nur über eine **explizite Konfiguration** zulässig,
- M7 führt hierfür einen klar dokumentierten, booleschen Konfigurationswert ein,
- der Default muss auf **sicher/nicht loggen** stehen.

Als sensible KI-Inhalte gelten in M7 mindestens:

- vollständige KI-Rohantwort,
- vollständiges KI-`reasoning`.

### 8. Exit-Code-Endsemantik

Ab M7 ist das Exit-Code-Verhalten final:

- `0`, wenn der Lauf technisch ordnungsgemäß durchgeführt wurde, auch wenn einzelne Dokumente fachlich oder transient fehlgeschlagen sind,
- `1` nur bei harten Start-, Bootstrap-, Verdrahtungs-, Konfigurations- oder Initialisierungsfehlern.

Dokumentbezogene Fehler dürfen **nicht** als harte Startfehler fehlmodelliert werden.

### 9. Konfigurationsvalidierung des Endstands

M7 vervollständigt die Startvalidierung insbesondere für:

- `max.retries.transient`,
- M7-relevante Logging-Konfiguration,
- bestehende M1–M6-Startparameter, soweit sie für einen robusten Batch-Lauf weiterhin zwingend sind.

Ungültige M7-Startkonfiguration verhindert den Laufbeginn und führt zu **Exit-Code 1**.

### 10. Keine zweite Wahrheitsquelle für Fehler- und Retry-Entscheidungen

M7 nutzt weiterhin die bestehende Kombination aus:

- Dokument-Stammsatz für Gesamtstatus und Zähler,
- Versuchshistorie für einzelne Versuchsdaten und Nachvollziehbarkeit.

M7 führt **keine** parallele, dritte Wahrheitsquelle für Retry-Zustände, Logging-Entscheidungen oder Fehlerhistorien ein.

---
## AP-001 M7-Kernobjekte, vollständige Fehlersemantik und Retry-/Logging-Verträge präzisieren

### Voraussetzung
Keine. Dieses Arbeitspaket ist der M7-Startpunkt.

### Ziel
Die M7-relevanten Typen, vollständigen Fehler- und Retry-Bedeutungen, Logging-bezogenen Entscheidungsobjekte und technischen Grenzen werden eindeutig eingeführt, damit spätere Arbeitspakete ohne Interpretationsspielraum implementiert werden können.

### Muss umgesetzt werden
- Neue M7-relevante Kernobjekte bzw. Application-nahe Typen anlegen, insbesondere für:
  - vollständige Retry-Entscheidung,
  - Ausschöpfungszustand eines Retry-Rahmens,
  - technische Sofort-Wiederholungsentscheidung für den Zielkopierpfad,
  - dokumentbezogene Fehlerklassifikation des Endstands,
  - Logging-Ereignis bzw. Logging-relevante Dokumententscheidung,
  - Sensitivitätsentscheidung für KI-Inhalte im Logging.
- Die bestehende Status- und Fehlersemantik in JavaDoc und ggf. `package-info` so schärfen, dass klar ist:
  - wann `FAILED_RETRYABLE` noch zulässig ist,
  - wann ein Dokumentstatus wegen ausgeschöpfter Retry-Regeln in `FAILED_FINAL` übergeht,
  - dass der technische Sofort-Wiederholversuch **nicht** zum laufübergreifenden Retry-Zähler gehört,
  - dass dokumentbezogene Fehler den Gesamtbatch nicht zu Exit-Code 1 eskalieren.
- Application-seitige Verträge definieren oder gezielt erweitern für:
  - Ableitung der Retry-Entscheidung aus Status, Fehlerart, Zählern und Konfiguration,
  - Ableitung einer protokollierbaren Dokumententscheidung,
  - Ableitung der Zielkopier-Sofort-Wiederholung,
  - Auflösung der Sensitivitätsregel für KI-Logausgaben,
  - Korrelation dokumentbezogener Logging-Ereignisse ohne Infrastrukturtypen im Kern.
- Port-Verträge so schneiden, dass weder Log4j2-, NIO-, JDBC- noch HTTP-Typen in Domain oder Application durchsickern.
- Rückgabemodelle so anlegen, dass spätere Arbeitspakete ohne Zusatzannahmen unterscheiden können zwischen:
  - retryablem Inhaltsfehler,
  - finalem Inhaltsfehler,
  - retryablem technischem Fehler,
  - finalisiertem technischem Fehler nach ausgeschöpftem Transient-Rahmen,
  - technischem Zielschreibfehler mit zulässigem Sofort-Wiederholversuch,
  - dokumentbezogener Entscheidung mit M7-logbarem Ergebnis.
- Explizit dokumentieren, dass M7 keine neue Persistenz-Wahrheit für Retry-Entscheidungen einführt.
- Explizit dokumentieren, dass `max.retries.transient` als historisierter Fehlversuchs-Grenzwert interpretiert wird und als gültiger Konfigurationswert nur **Integer >= 1** zulässig ist.
- Explizit dokumentieren, dass sensible KI-Logausgaben in M7 mindestens vollständige KI-Rohantwort und vollständiges KI-`reasoning` umfassen.

### Explizit nicht Teil
- konkrete Retry-Implementierung im Batch-Lauf
- konkrete Log4j2-Konfiguration
- konkrete Zielkopier-Wiederholung
- Bootstrap-Anpassungen
- Tests des Endstands

### Fertig wenn
- die M7-relevanten Typen und Verträge vorhanden sind,
- Retry-, Finalisierungs-, Sensitivitäts- und Logging-Korrelationssemantik eindeutig dokumentiert ist,
- Domain und Application frei von Infrastrukturtypen bleiben,
- der Build weiterhin fehlerfrei ist.

---
## AP-002 Vollständige Retry-Entscheidungslogik für deterministische Inhaltsfehler und transiente technische Fehler implementieren
|
||||
|
||||
### Voraussetzung
|
||||
AP-001 ist abgeschlossen.
|
||||
|
||||
### Ziel
|
||||
Die fachlich vollständige laufübergreifende Retry-Entscheidung des Endstands ist als klarer, testbarer Baustein im Kern implementiert und kann von Batch-Lauf, Logging und Persistenz konsistent verwendet werden.
|
||||
|
||||
### Must be implemented

- Implement a central M7 component that derives the binding retry decision from the error kind at hand, the existing document status, the error counters and the configuration.
- Implement the complete deterministic content-error rule explicitly:
  - first historized deterministic content error → `FAILED_RETRYABLE`,
  - second historized deterministic content error → `FAILED_FINAL`.
- Implement the complete transient-error rule explicitly:
  - document-level technical errors stay retryable only up to `max.retries.transient`,
  - the failed attempt that reaches the limit finalizes the status to `FAILED_FINAL`.
- Cover the edge cases of the limit interpretation explicitly, in particular:
  - `max.retries.transient = 1`,
  - skip cases without counter changes,
  - pre-existing M4–M6 data with historical error counters.
- Cut the decision logic so that it stays consistently usable for pre-existing M4–M6 data and forces no special handling outside the central rule set.
- Explicitly ensure that skip cases do not change error counters.
- Explicitly ensure that the immediate technical retry does **not** feed into this cross-run retry decision.
- Add JavaDoc for rule provenance, counter semantics, limit interpretation and the non-goals of M7.

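The decision rule described above can be sketched as a small, testable policy. This is a minimal illustration only, not the project's actual code: the class, enum and method names (`RetryDecisionPolicy`, `DocumentStatus`, `decideContentError`, `decideTransientError`) are assumptions.

```java
// Minimal sketch of the cross-run retry decision described above.
// All names are illustrative assumptions, not the project's actual types.
class RetryDecisionPolicy {

    enum DocumentStatus { FAILED_RETRYABLE, FAILED_FINAL }

    private final int maxRetriesTransient; // configured max.retries.transient, must be >= 1

    RetryDecisionPolicy(int maxRetriesTransient) {
        if (maxRetriesTransient < 1) {
            throw new IllegalArgumentException("max.retries.transient must be an integer >= 1");
        }
        this.maxRetriesTransient = maxRetriesTransient;
    }

    /** Deterministic content errors: the first historized occurrence stays retryable, the second is final. */
    DocumentStatus decideContentError(int historizedContentErrors) {
        return historizedContentErrors >= 2 ? DocumentStatus.FAILED_FINAL
                                            : DocumentStatus.FAILED_RETRYABLE;
    }

    /** Transient technical errors: the failed attempt that reaches the limit finalizes the status. */
    DocumentStatus decideTransientError(int historizedTransientFailures) {
        return historizedTransientFailures >= maxRetriesTransient ? DocumentStatus.FAILED_FINAL
                                                                  : DocumentStatus.FAILED_RETRYABLE;
    }
}
```

Note how the `max.retries.transient = 1` edge case falls out of the same rule: the very first historized transient failure already reaches the limit and finalizes.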
### Explicitly out of scope

- batch use-case integration
- persistence updates in the concrete document run
- target-copy retry
- logging configuration
- exit-code logic

### Done when

- the retry decision is implemented centrally and testably,
- deterministic and transient errors are covered completely and without contradiction,
- existing M4–M6 counter and status data remain connectable without special logic,
- the state still builds without errors.

---

## AP-003 Implement the immediate technical retry for the M6 target-copy path

### Prerequisite

AP-001 and AP-002 are complete.

### Goal

The single immediate technical retry for target-copy errors foreseen by the target architecture is implemented cleanly, without mixing in AI, persistence logic or cross-run retry semantics.

### Must be implemented

- Extend the existing M6 target-copy path so that, on a technical write error, **exactly one** additional immediate technical retry is possible within the same document run.
- Ensure that the immediate retry applies exclusively to the physical target-copy path, in particular to:
  - the temporary target file cannot be created,
  - the copy fails,
  - the final move/rename fails,
  - technical cleanup after the first write error succeeds only partially.
- Ensure that this triggers **no** new AI call, **no** new business proposal derivation and **no** new status re-evaluation outside the M7 rule framework.
- Cut the mechanism so that the second technical attempt runs with the same business document context and the batch run can afterwards derive exactly **one** document-level result for persistence and status updates.
- Encapsulate technical cleanup between the first and the second attempt in a controlled way.
- Add JavaDoc for the scope, the limits and the distinction from cross-run retries.

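The copy-with-one-retry shape described above can be sketched as follows. This is a hedged illustration under assumed names (`TargetCopier`, `copyWithSingleRetry`); the real path layout and cleanup rules are defined by the M6 implementation, not by this sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch of the single immediate technical retry for the physical target-copy
// path: on the first write failure, clean up and retry exactly once with the
// same document context. No AI call or proposal derivation is repeated here.
class TargetCopier {

    /** Copies source to target; on a technical write error, retries exactly once. */
    static void copyWithSingleRetry(Path source, Path target) throws IOException {
        try {
            copyOnce(source, target);                 // first attempt
        } catch (IOException first) {
            Files.deleteIfExists(tempFor(target));    // controlled cleanup between attempts
            try {
                copyOnce(source, target);             // exactly one immediate retry
            } catch (IOException second) {
                second.addSuppressed(first);
                throw second;                         // both attempts failed -> one final result
            }
        }
    }

    private static void copyOnce(Path source, Path target) throws IOException {
        Path tmp = tempFor(target);
        Files.copy(source, tmp, StandardCopyOption.REPLACE_EXISTING); // write to a temp file first
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);      // final move/rename
    }

    private static Path tempFor(Path target) {
        return target.resolveSibling(target.getFileName() + ".tmp");
    }
}
```

Either outcome (success on the second attempt, or a single exception carrying both failures) gives the batch run exactly one document-level result to persist.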
### Explicitly out of scope

- final status and counter updates in the batch run
- the final logging state
- bootstrap changes
- extension to error kinds other than target-copy write errors

### Done when

- exactly one immediate technical retry is possible for target-copy errors,
- no AI or business path is re-triggered impermissibly,
- the result can be handed over to the later batch/persistence path in a controlled way,
- the state still builds without errors.

---

## AP-004 Prepare the logging infrastructure, correlation and sensitivity rule for M7

### Prerequisite

AP-001 is complete.

### Goal

The logging infrastructure is prepared for the final M7 state, the sensitivity rule for AI content is wired correctly, and document-level events can later be logged in the batch run consistently and with unambiguous correlation.

### Must be implemented

- Extend the existing logging infrastructure in a targeted way so that the minimum scope required in M7 can later be attached without further architecture breaks.
- Introduce and wire a clearly documented boolean configuration value for sensitive AI log output.
- Ensure that the full raw AI response is **not** logged by default.
- Ensure that the full AI `reasoning` is **not** logged in full by default.
- Ensure that the full raw AI response and the full AI `reasoning` can still remain in SQLite and that M7 introduces no reduction or deletion of traceability here.
- Prepare an M7-ready mechanism for document-level log correlation, in particular:
  - run-ID-based correlation before a fingerprint has been determined successfully,
  - the fingerprint or an unambiguously derivable document reference after the fingerprint has been determined successfully.
- Connect the loggable event and decision models from AP-001 to the logging infrastructure without letting business decision logic decay into technical logger calls.
- Already at this stage, cleanly wire the mandatory log points that are not bound to a document, in particular:
  - run start,
  - run end,
  - hard startup errors, as far as reachable at the current state.
- Add JavaDoc and, where applicable, `package-info` for logging sensitivity, correlation, minimum-scope preparation and architecture boundaries.

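The sensitivity gate above can be sketched as a small helper that decides what may reach the log. The class and method names are assumptions for illustration; only the rule itself (default: do not log sensitive AI content, SQLite persistence unaffected) comes from the specification.

```java
// Sketch of the boolean sensitivity gate for AI log output (names assumed).
// By default the full raw AI response and the full AI reasoning are NOT
// logged; what is stored in SQLite is not affected by this gate.
class SensitiveAiLogSupport {

    private final boolean logSensitiveAiContent; // boolean config value, default: false

    SensitiveAiLogSupport(boolean logSensitiveAiContent) {
        this.logSensitiveAiContent = logSensitiveAiContent;
    }

    /** Returns what may be written to the log for a sensitive AI payload. */
    String loggableAiPayload(String rawAiContent) {
        return logSensitiveAiContent
                ? rawAiContent
                : "<redacted: sensitive AI log output is disabled by default>";
    }
}
```

Keeping the gate in one place means every logger call site stays free of its own sensitivity decision.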
### Explicitly out of scope

- full batch integration of all document-level M7 log points
- finalizing the retry and skip hooks in the document run
- start validation of the final state
- final exit-code wiring
- tests of the complete final state

### Done when

- the logging infrastructure can carry the final M7 state without additional assumptions,
- the sensitivity rule defaults to "do not log",
- sensitive AI content can only be logged via explicit configuration,
- document-level log correlation is technically prepared,
- the state still builds without errors.

---

## AP-005 Add the repository, persistence and traceability changes for the final M7 state

### Prerequisite

AP-001, AP-002 and AP-004 are complete.

### Goal

The existing M4–M6 persistence supports the complete M7 error, retry, skip and logging traceability without a new source of truth and without unnecessary schema reinvention.

### Must be implemented

- Check which repository capabilities are actually missing for the final M7 state and add them in a targeted way, without redesigning the existing two-level model.
- Extend or sharpen existing repository operations so that they reproducibly support for M7:
  - finalization of exhausted retry budgets,
  - consistent updates of content-error and transient-error counters,
  - historized skip events,
  - document-level error classification and the retryable flag in the final state,
  - read access to the existing attempt history, as far as strictly required for retry and skip decisions,
  - consistent traceability between the logged decision and the SQLite history.
- If additional read access to the existing attempt history is needed for the final M7 state, add it in a targeted way without anticipating reporting or statistics functionality.
- Perform a schema evolution only if it is **strictly** required for the M7 target state; otherwise explicitly stay with the existing M6 schema.
- Ensure that existing M4–M6 data remains readable and correctly updatable.
- Ensure that the later batch run from AP-006 already finds all persistence operations required for M7 and does **not** have to add further implicit repository extensions.
- Add JavaDoc for traceability, the existing persistence source of truth and the limits of M7.

### Explicitly out of scope

- full batch use-case integration of the M7 rules
- a new third persistence level
- reporting/analytics
- bootstrap changes
- logging-framework configuration
- the M8 overall review

### Done when

- persistence consistently supports the complete final M7 state,
- no unnecessary schema reinvention or parallel truth was introduced,
- existing M4–M6 data remains connectable,
- the state still builds without errors.

---

## AP-006 Implement the M7 batch integration for skip logic, finalization of exhausted retries, logging hooks and consistent error updates

### Prerequisite

AP-001 through AP-005 are complete.

### Goal

The existing M6 run is extended into the complete M7 run, which brings together retry decisions, finalization, skip behavior, the immediate retry, document-level logging hooks and consistent status and persistence updates.

### Must be implemented

- Extend the existing batch use case so that the complete M7 rules take effect per document.
- Implement the following rules explicitly:
  - `SUCCESS` → no new business pass; historize `SKIPPED_ALREADY_PROCESSED` instead,
  - `FAILED_FINAL` → no new business pass; historize `SKIPPED_FINAL_FAILURE` instead,
  - `FAILED_RETRYABLE`, `READY_FOR_AI` and `PROPOSAL_READY` remain processable,
  - deterministic content errors are finalized after their second historized occurrence,
  - transient technical errors are finalized when the `max.retries.transient` limit is reached.
- Ensure that the immediate technical retry from AP-003 acts exclusively in the target-copy path and afterwards results in exactly **one** document-level status and persistence update.
- Ensure that document-level errors and finalizations do not needlessly abort the batch run for other documents.
- Ensure that the history and the master record per identified document continue to be updated consistently and that no partially persisted M7 state is left behind.
- Continue to explicitly **not** historize pre-fingerprint errors as SQLite attempts.
- Connect the logging infrastructure prepared in AP-004 at the relevant batch decision points so that the final M7 minimum scope is fully reached, in particular:
  - detected source file,
  - skipping already successful files,
  - skipping finally failed files,
  - generated target name,
  - retry decision,
  - errors with classification.
- Ensure that document-level logs after successful fingerprint determination contain the fingerprint or an unambiguously derivable reference, and that before successful fingerprint determination they can be correlated at least via run ID and candidate reference.
- Add JavaDoc for the M7 run order, finalization of exhausted retries, skip rules, logging hooks and error updates.

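The per-document skip rule listed above can be sketched as a single lookup. The enum values mirror the statuses named in this section; the class and method names are assumptions for illustration.

```java
// Illustrative sketch of the per-document M7 skip rule: SUCCESS and
// FAILED_FINAL are historized as skips, all other statuses stay processable.
class SkipRule {

    enum Status { SUCCESS, FAILED_FINAL, FAILED_RETRYABLE, READY_FOR_AI, PROPOSAL_READY }

    /** Returns the skip event to historize, or null if the document stays processable. */
    static String skipEventFor(Status current) {
        switch (current) {
            case SUCCESS:      return "SKIPPED_ALREADY_PROCESSED";
            case FAILED_FINAL: return "SKIPPED_FINAL_FAILURE";
            default:           return null; // processable; skip paths never touch error counters
        }
    }
}
```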
### Explicitly out of scope

- bootstrap and start-validation changes
- final exit-code wiring
- end-to-end tests
- M8 polish

### Done when

- the batch run implements the complete M7 retry and skip semantics,
- exhausted retry budgets lead to `FAILED_FINAL`,
- the immediate retry is correctly integrated into the document run,
- the final document-level logging minimum scope of the M7 state is fully connected,
- document-level errors let the overall batch continue in a controlled way,
- the state still builds without errors.

---

## AP-007 Finalize bootstrap, start validation and exit codes for the final M7 state

### Prerequisite

AP-001 through AP-006 are complete.

### Goal

The program entry point is cleanly wired to the final M7 state; the final start validation takes effect, document-level errors are correctly separated from startup errors, and the definitive exit-code behavior is fully implemented.

### Must be implemented

- Extend the bootstrap wiring to the new M7 components.
- Add or validate the M7-relevant configuration, in particular for:
  - `max.retries.transient` as an **integer >= 1**,
  - the boolean configuration value for sensitive AI log output,
  - existing M1–M6 parameters, as far as strictly needed for the robust final state.
- Complete the start validation so that invalid M7 configuration stops the run **before** the batch begins.
- Ensure that hard startup, wiring, configuration or initialization errors still lead to **exit code 1**.
- Ensure that document-level errors from M3–M7 do **not** escalate to exit code 1 as long as the batch run could be carried out in a technically orderly way.
- Integrate the M7 logging wiring into the startup path so that run start, run end and hard startup errors are logged traceably.
- Add JavaDoc and `package-info` for the updated wiring, configuration validation, final exit-code semantics and module boundaries.

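The exit-code separation above boils down to one rule, sketched here under assumed names: only hard startup failures drive the exit code, document-level error counts do not.

```java
// Sketch of the final exit-code semantics (names assumed): hard startup,
// wiring, configuration or initialization failures exit with 1; document-level
// errors in an otherwise orderly batch run do not escalate, the process exits 0.
class ExitCodes {

    static final int OK = 0;
    static final int STARTUP_FAILURE = 1;

    static int exitCodeFor(boolean hardStartupFailure, int documentLevelErrors) {
        if (hardStartupFailure) {
            return STARTUP_FAILURE; // stop before the batch even begins
        }
        return OK; // document errors are visible in logs and SQLite, not in the exit code
    }
}
```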
### Explicitly out of scope

- the complete test suite
- M8 quality measures
- new business processing beyond the M7 target picture

### Done when

- the program is fully startable at the M7 state,
- the M7 start validation takes effect,
- the definitive exit-code behavior is fully implemented,
- document-level errors are not mis-modeled as startup errors,
- the build remains error-free.

---

## AP-008 Complete the tests for multi-run retry flows, the immediate retry, logging sensitivity and the final exit-code behavior

### Prerequisite

AP-001 through AP-007 are complete.

### Goal

The complete M7 target state is safeguarded by automated tests and demonstrated as a consistent hand-over state.

### Must be implemented

- Implement tests for retry flows across multiple runs, in particular:
  - first deterministic content error → `FAILED_RETRYABLE`,
  - second deterministic content error → `FAILED_FINAL`,
  - transient technical errors stay retryable up to the configured limit,
  - the transient failed attempt at the limit finalizes to `FAILED_FINAL`,
  - `max.retries.transient = 1` finalizes on the first historized transient failed attempt,
  - `max.retries.transient = 0` is rejected as invalid start configuration.
- Add tests for final error states, in particular:
  - `FAILED_FINAL` is skipped with historization in a repeat run,
  - `SUCCESS` is skipped with historization in a repeat run,
  - skip events do not change error counters.
- Add tests for the immediate technical retry in the target-copy path, in particular:
  - the first write attempt fails, the second succeeds,
  - both write attempts fail,
  - no new AI call,
  - no additional cross-run retry counting caused by the immediate retry.
- Add tests for the logging sensitivity rule, as far as automatable, in particular:
  - the full raw AI response is not logged by default,
  - the full AI `reasoning` is not logged in full by default,
  - the full raw AI response remains available in SQLite,
  - the full AI `reasoning` remains available in SQLite,
  - explicitly enabling sensitive AI log output takes effect only in a controlled way.
- Add tests for logging correlation, as far as automatable, in particular:
  - before successful fingerprint determination, the candidate reference is traceable via run ID and source file,
  - after successful fingerprint determination, document-level logs carry the fingerprint or an unambiguously derivable reference.
- Add tests for the final exit-code behavior, in particular:
  - `0` for a technically orderly run despite document-level errors,
  - `1` for hard invalid start configuration,
  - `1` for hard bootstrap/initialization errors,
  - document-level errors from M3–M7 do not lead to exit code 1.
- Add tests for configuration validation, in particular:
  - invalid `max.retries.transient`,
  - invalid logging sensitivity configuration,
  - an invalid M7 start configuration prevents the run from starting.
- Add integration tests for the complete M7 flow, in particular:
  - the robust happy path ending in `SUCCESS`,
  - document-level partial failures do not block the batch,
  - exhausted retry budgets lead stably to terminal skips in follow-up runs,
  - an existing `PROPOSAL_READY` can still be finalized through to success,
  - M4–M6 legacy data remains connectable.
- Finally check the M7 state for consistency, architecture fidelity and absence of anticipation of M8+.

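The start-validation cases exercised by several test bullets above (valid integer >= 1, rejection of `0` and of non-numeric values) can be sketched as one parse helper; the class and method names are assumed for illustration.

```java
// Sketch of the start validation for max.retries.transient (names assumed):
// only integers >= 1 are valid, anything else must stop the run before the batch.
class TransientRetryConfig {

    static int parseMaxRetriesTransient(String raw) {
        final int value;
        try {
            value = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                    "max.retries.transient must be an integer >= 1, got: " + raw);
        }
        if (value < 1) {
            throw new IllegalArgumentException(
                    "max.retries.transient must be >= 1, got: " + value);
        }
        return value;
    }
}
```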
### Explicitly out of scope

- the M8 overall release
- additional quality campaigns outside the M7 target scope

### Done when

- the test suite for the M7 scope is green,
- the most important retry, finalization, logging, correlation and exit-code edge cases are covered by automated tests,
- the defined M7 target state is fully reached,
- an error-free, hand-over-ready state exists.

---

## Final assessment

The work packages cover the complete M7 target scope from the binding specifications and cleanly close the operational gap between the M6 success path and the finally robust end state:

- complete retry logic across later runs
- clean finalization after exhausted retry budgets
- an immediate technical retry exclusively for target-copy errors
- complete skip semantics for `SUCCESS` and `FAILED_FINAL`
- the final logging minimum scope
- the sensitivity rule for AI content in logging
- precise correlation between logs and document-level decisions
- the final exit-code semantics
- completed start validation
- consistent traceability in logs and SQLite
- tests for multi-run retries, the immediate retry, logging sensitivity and the final exit-code behavior

At the same time, the boundaries to M1–M6 and to M8+ are preserved. In particular, **no** new business features, **no** M8 overall polish and **no** unnecessary parallel truths for persistence or retry states are introduced.

docs/workpackages/M8 - Arbeitspakete.md (new file, 583 lines)

# M8 - Work Packages

## Scope

This document describes exclusively the work packages for the defined milestone **M8 – final milestone: quality assurance, polish and full development release**.

Milestones **M1**, **M2**, **M3**, **M4**, **M5**, **M6** and **M7** are assumed to be fully implemented.

The work packages are deliberately cut so that:

- **AI 1** can derive a clear single prompt from each work package,
- **AI 2** can implement exactly that one work package completely in **one pass**,
- after **every** work package there is again an **error-free, buildable state**.

The order of the work packages is binding.

## Additional cutting rules for AI processing

- Per work package, change only the **minimally necessary cross-sections** through Domain, Application, adapters, bootstrap, configuration, documentation and tests.
- Make no assumptions that are not covered by the binding specifications or the actually present code and test state.
- No anticipation of a hypothetical **M9** or other new product features.
- No large-scale rebuild of existing M1–M7 structures without demonstrable M8 relevance.
- M8 is **review- and consolidation-driven**: only actually present residual gaps, inconsistencies, documentation deficits, test gaps or quality problems are closed.
- M8 may sharpen, unify or clean up existing implementations in a targeted way, but must not silently invent new business rules.
- Every positive M8 intermediate state must already deliver a **robust, fully buildable and testable end state**, even if the full development release is only demonstrated with later work packages.
- A work package may only build on new verification, test or repository capabilities if these already exist from M1–M7 or were explicitly created in the immediately preceding work package.
- Within its named topic, an M8 work package may first **check in a targeted way** and then fix **only the findings demonstrable in exactly that topic**.
- Unspecific catch-all assignments such as "check everything and fix everything" are **not** a permissible cut for a single work package.
- Where a work package requires an audit report or release evidence, it must **remain in the repository** and refer to the actually executed build and test state.

## Explicitly not part of M8

- new business functionality beyond the already defined target picture
- new milestones, follow-up products or optional convenience features
- web UI, REST API, OCR, user interaction or manual post-processing
- reporting, monitoring or statistics features without a compelling M8 link
- large-scale architecture reinvention instead of targeted end-state consolidation
- cosmetic changes without demonstrable benefit for operations, consistency, comprehensibility or quality
- metric tuning without a sound business or technical justification
- blanket "cleanup work" not tied to a concrete, demonstrable M8 finding

## Binding M8 rules for **all** work packages

### 1. M8 closes only real residual gaps of the final state

M8 adds no new product vision; it brings the overall state that emerged from M1–M7 to a fully consistent, documented and release-ready final state.

It follows that:

- Only **demonstrable** residual gaps are closed.
- Speculative rebuilds without a concrete defect, quality or consistency link are not permitted.
- Changes must be traceable to the binding specifications and the real project state.

### 2. Architecture fidelity remains non-negotiable

In M8, too, the following holds unchanged:

- strict hexagonal architecture,
- dependencies point inward,
- no infrastructure types in Domain or Application,
- no direct adapter-to-adapter coupling,
- no new parallel structure beside the established module and port model.

M8 may remove existing violations but must not introduce new ones.

### 3. No second source of truth for business or technical core states

The already established basis of truth remains binding in M8 as well:

- the document master record for overall status and counters,
- the attempt history for individual attempts and traceability,
- the leading `PROPOSAL_READY` attempt as the source of the M5 naming proposal,
- the target-artifact state per M6/M7.

M8 introduces **no** additional parallel truth for status, retry, proposal, target name, logging decisions or result evaluation.

### 4. Documentation and implementation must be free of contradictions

From M8 on, the final state only counts as correct if:

- JavaDoc,
- `package-info`,
- configuration examples,
- startup and operations documentation,
- logging and error-message semantics,
- audit and release evidence,
- and the tests

agree in their statements with the actual behavior of the code.

### 5. Test focus on core invariants instead of metric cosmetics

M8 completes quality assurance specifically for the business and technical load-bearing rules of the system, in particular for:

- status and retry semantics,
- persistence consistency,
- filename construction,
- the target copy,
- start validation,
- logging sensitivity,
- multi-run behavior,
- end-to-end flows.

Pure number optimization without a sound risk link is not an M8 goal.

### 6. Backward compatibility of existing data is preserved

M8 must still be able to:

- read,
- correctly update,
- and consistently interpret

existing M4–M7 data, as far as this is required within the already defined target picture.

### 7. Operator-relevant feedback must be clear, consistent and reliable

M8 sharpens operator-facing feedback so that startup, configuration, document and error states are traceable without unnecessary interpretation.

It follows that:

- Error messages must be neither misleading nor contradictory.
- Logging and documentation must use the same core terms.
- Sensitive AI content remains protected by default.

### 8. Full development release requires a demonstrable overall run

The final M8 state only counts as complete once it is demonstrated that at least the following levels fit together:

- the Maven reactor build,
- the relevant test suites,
- smoke and startup behavior,
- the end-to-end overall flow,
- the configuration examples,
- the documentation,
- artifact production.

### 9. M8 may clean up in a targeted way, but must not refactor uncontrolled

Only those cleanups are permissible that directly serve one of these goals:

- architecture fidelity,
- consistency,
- comprehensibility,
- testability,
- stability,
- documentation clarity,
- releasability.

Large-scale structural rebuilds without an immediate M8 benefit are excluded.

### 10. Overall audit, blocker fixing and final release are separate work steps

For the two-stage AI processing, the following additionally holds in M8:

- the **integrated overall audit**, the **targeted fixing of release blockers** and the **final release confirmation** are separate work packages,
- a single work package must not simultaneously perform an unrestricted overall review **and** fix, without limit, all topics found in the process,
- release blockers may only be fixed in a later work package if they were **concretely demonstrated and delimited** in the immediately preceding audit work package.

---

## AP-001 Finalize architecture boundaries and code-level end-state documentation

### Prerequisite

None. This work package is the M8 starting point.

### Goal

The architecture boundaries of the overall state are conclusively sharpened and anchored in code-level documentation so that later M8 work packages can build on a consolidated understanding of the final state without room for interpretation.

### Must be implemented

- Check existing module boundaries, responsibilities and dependency directions against the real codebase and clean up **only demonstrable** M8-relevant blurs or violations in a targeted way.
- Complete or sharpen JavaDoc and `package-info` where gaps or contradictions remain for the final state, in particular for:
  - Domain responsibility,
  - Application orchestration,
  - port purposes,
  - adapter responsibility,
  - bootstrap tasks,
  - final-state terms such as status, retry, proposal source, target success and traceability.
- Ensure that architecture boundaries in the documentation use the same terms and the same semantics as the logic implemented in M1–M7.
- Correct demonstrable, code-visible boundary violations only where they are relevant for the M8 release, maintainability or specification fidelity.
- Limit production-code changes to **architecture-related** corrections; do not anticipate operator-facing message texts, persistence cleanup or the later test campaign in this work package.
- Document the architecture and terminology invariants binding for the final state so that AI 1 can derive a precise prompt for subsequent work packages from them.

### Explicitly out of scope

- new business functionality
- new persistence models or new port landscapes without a defect link
- large-scale restructuring without a demonstrable architecture violation
- operator-facing logging and error-message rework
- full test additions or documentation consolidation beyond the code-level architecture foundation

### Done when

- the architecture boundaries of the final state are described clearly, consistently and reliably in the code and in the code-level documentation,
- demonstrable M8-relevant architecture violations are cleaned up in a targeted way,
- later M8 work packages can build on this without fundamental ambiguity,
- the build is still error-free.

---

## AP-002 Clean up the status, persistence, proposal and target-state consistency of the final state

### Prerequisite

AP-001 is complete.

### Goal

The last consistency gap between document master record, attempt history, proposal source, target-artifact state and adapter behavior is closed without introducing new sources of truth or new business rules.

### Must be implemented

- Check the real overall state from M4–M7 for **demonstrable** inconsistencies in a targeted way, in particular between:
  - the overall status in the document master record,
  - error counters,
  - historized attempt data,
  - the leading `PROPOSAL_READY` source,
  - persisted target-artifact data,
  - adapter results in edge and error cases.
- Clean up actually present inconsistencies in the final state in a targeted way, in particular where they can lead to contradictory multi-run behavior, inconsistent persistence updates or error-prone M6/M7 finalization.
- Ensure that M4–M7 data remains readable and correctly updatable.
- Ensure that no redundant second persistence truth emerges for proposal, retry, target or error states.
- Close demonstrable semantic gaps between repository behavior and use-case decisions only as far as they are critical for the final M8 state.
- Update the directly affected JavaDoc, mapping and test locations along the way, but do not anticipate operator-facing text sharpening or a general test campaign in this work package.

### Explizit nicht Teil
|
||||
- neue Fachregeln über M1–M7 hinaus
|
||||
- Reporting-, Statistik- oder Analysefunktionen
|
||||
- großflächige Schema-Neuerfindung ohne zwingenden M8-Bedarf
|
||||
- Logging-Feinschliff oder Dokumentationskonsolidierung außerhalb des Konsistenzbezugs
|
||||
- integrierte Gesamtprüfung des vollständigen Release-Kandidaten
|
||||
|
||||
### Fertig wenn
|
||||
- nachweisbare Inkonsistenzen zwischen Statusmodell, Persistenz, Proposal-Quelle, Zielartefaktzustand und Adapterverhalten beseitigt sind,
|
||||
- Mehrlaufverhalten, Proposal-Quelle und Zielartefaktzustand konsistent zusammenwirken,
|
||||
- keine neue Parallelwahrheit eingeführt wurde,
|
||||
- der Stand weiterhin fehlerfrei buildbar ist.
|
||||
|
||||
---
## AP-003 Sharpen the operator-relevant logging, error-text, and startup-validation feedback of the final state

### Prerequisite

AP-001 and AP-002 are complete.

### Goal

The system's externally visible feedback is sharpened in wording and content so that operating, troubleshooting, and approving the final state is possible without unnecessary ambiguity.

### Must be implemented

- Check existing logging and error messages for startup, configuration, document, and run states for **demonstrable** vagueness, contradictions, misleading wording, or inconsistent terminology.
- Sharpen operator-relevant messages in a targeted way, in particular for:
  - hard startup and configuration errors,
  - document-related error classification,
  - retry decisions,
  - skip cases,
  - proposal and target success states,
  - run start and run end.
- Ensure that the M7 sensitivity rule for AI content stays linguistically and technically consistent and is not undermined by misleading logs or error messages.
- Structure startup validation errors so that they give clear operator guidance without exposing technical internals or suggesting false causal chains.
- Unify terminology across logging, exception texts, configuration validation, and the documented semantics.
- If targeted technical wiring or formatting changes are needed for this, implement them minimally and in line with the architecture.
- Add or sharpen only the tests that are directly required for this feedback.

### Explicitly out of scope

- a new logging framework or new telemetry layers
- new operational features without an M8 reference
- comprehensive documentation consolidation beyond operator-facing feedback
- complete end-to-end test additions
- a coverage/PIT campaign

### Done when

- the logging and error feedback of the final state is clear, consistent, and reliable,
- operator-relevant states remain traceable without unnecessary interpretation,
- the sensitivity rule for AI content still takes effect cleanly,
- the state builds without errors.

---
## AP-004 Consolidate configuration examples, the prompt reference, and the startup and operations documentation

### Prerequisite

AP-001 through AP-003 are complete.

### Goal

The repository contains a consolidated documentation and example base, aligned with the real final behavior, that makes local starts, batch runs, and operational understanding possible without implicit assumptions.

### Must be implemented

- Check the existing configuration examples against the real final state and complete or clean them up in a targeted way, in particular for:
  - mandatory parameters,
  - optional parameters,
  - sensible example values,
  - M7-relevant logging and retry configuration,
  - the precedence of the environment variable over the properties file for the API key.
- Document the existing prompt reference in the repository consistently.
- If a prompt example or a traceable prompt skeleton needed for a reproducible local start is missing from the repository, add it **minimally and in line with the final state**.
- Consolidate the startup, configuration, and operations documentation so that at least the following are described traceably:
  - required inputs,
  - starting the executable artifact,
  - the source and target folder references,
  - SQLite usage,
  - basic retry and skip behavior,
  - basic logging behavior,
  - handling of sensitive AI content in logging,
  - the limits of the system.
- Clean up documentation that is outdated, contradictory, or no longer matches the final state.
- Ensure that configuration names, file names, example paths, and documentation statements match the actual code.
- Touch production code only where documentation and code diverge on a **clearly demonstrable** naming or configuration conflict.
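As an illustration of the API-key precedence named above, a minimal properties sketch; all key names here are assumptions for illustration, not the project's actual configuration schema:

```properties
# Illustrative sketch only; key names are assumptions, not the real schema.
# The API key may be set here, but if the corresponding environment variable
# is set, the environment variable wins and this value is ignored.
ai.api-key=sk-example-from-properties
source.folder=C:/pdf/eingang
target.folder=C:/pdf/ausgang
```

A configuration example in the repository should state this precedence explicitly next to the key, so operators do not debug a stale properties value while the environment variable is in effect.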
### Explicitly out of scope

- external web documentation or manuals outside the repository
- new business functionality
- broad code refactorings without a documentation reference
- final closing of test gaps
- an integrated overall verification of the release candidate

### Done when

- the final state is described via the examples and documents in the repository in a way that makes it traceably startable and operable,
- the configuration and prompt examples match the real code,
- outdated or contradictory documentation has been cleaned up,
- the state still builds without errors.

---

## AP-005 Provide a deterministic end-to-end test base and reusable test data for the overall process

### Prerequisite

AP-001 through AP-004 are complete.

### Goal

For the final quality verification, a robust, deterministic, and reusable end-to-end test base is available that can reproduce the complete batch process of the final state without external uncertainties.

### Must be implemented

- Provide a reusable end-to-end test base for the overall process that encapsulates at least the following in a controlled way:
  - the source folder,
  - the target folder,
  - temporary artifacts,
  - the SQLite file,
  - the configuration,
  - the required test doubles for external dependencies.
- Provide deterministic test data and fixtures for the central final-state scenarios, in particular for:
  - the happy path up to `SUCCESS`,
  - a deterministic content error,
  - a transient technical error,
  - a skip after `SUCCESS`,
  - a skip after `FAILED_FINAL`,
  - an existing `PROPOSAL_READY` with later finalization,
  - a target-copy error with the M7 immediate retry.
- Ensure that the end-to-end test base has no uncontrolled dependency on external AI services, unstable file-system states, or global runtime environments.
- Cut the test helpers and fixture structures so that later M8 test work packages can build on them without reinventing test infrastructure.
- Document which final-state invariants the end-to-end test base makes demonstrable.
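Such an encapsulated test base can be pictured with a plain-JDK sketch; the class and field names are assumptions for illustration, not the project's actual test infrastructure:

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch: every path the batch process touches lives under one
// throwaway root, so no global file-system or database state can leak in or out.
final class E2eFixtureSketch {
    final Path root;
    final Path sourceDir;
    final Path targetDir;
    final Path sqliteFile;
    final Path configFile;

    E2eFixtureSketch() throws Exception {
        root = Files.createTempDirectory("pdf-umbenenner-e2e");
        sourceDir = Files.createDirectories(root.resolve("source"));
        targetDir = Files.createDirectories(root.resolve("target"));
        sqliteFile = root.resolve("state.db"); // created by the application on first run
        configFile = Files.writeString(root.resolve("app.properties"),
                "source.folder=" + sourceDir + "\n"
                        + "target.folder=" + targetDir + "\n");
    }
}
```

The AI port would be replaced by a deterministic test double inside the same fixture, which is exactly what removes the external uncertainty named above.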
### Explicitly out of scope

- complete closing of all test and coverage gaps
- arbitrary test proliferation without a final-state reference
- new business functionality
- quality-metric tuning without a concrete test-case reference
- the global release approval decision

### Done when

- a stable and deterministic end-to-end test base exists,
- the relevant final-state scenarios can be prepared reproducibly,
- later M8 test work packages can follow on without a new basic test structure,
- the state builds without errors.

---

## AP-006 Complete the regression tests for the core rules, edge cases, and consistency invariants of the final state

### Prerequisite

AP-001 through AP-005 are complete.

### Goal

The business and technical rules that carry the overall state are secured by automated tests so that real regression risks of the productive final state are reliably detected.

### Must be implemented

- Add or complete regression tests for the load-bearing rules from M1–M7 in a targeted way, in particular for:
  - status and retry semantics,
  - multi-run behavior,
  - skip rules,
  - the proposal source,
  - file-name construction,
  - Windows compatibility,
  - duplicate resolution,
  - the target copy,
  - persistence consistency,
  - startup validation,
  - logging sensitivity,
  - the final exit-code behavior.
- Secure edge cases that are regression-critical for the final state, in particular:
  - inconsistent historical data states within the permissible M4–M7 range,
  - boundary cases for error counters,
  - failed persistence after a technically successful target copy,
  - repeated runs after terminal states,
  - proposal and finalization transitions.
- Ensure that the tests verify real final-state invariants and do not merely freeze implementation details.
- Close existing test gaps wherever a reliable development approval would not be possible without doing so.
- Reuse the end-to-end test base from AP-005 in a targeted way instead of reinventing it in parallel.

### Explicitly out of scope

- purely cosmetic test additions without a risk reference
- new product features
- broad quality-metric campaigns without a concrete critical gap
- the complete approval verification of the overall project

### Done when

- the regression-critical core rules of the final state are secured by automated tests,
- edge cases highly relevant to stability, consistency, and multi-run behavior are reliably tested,
- the test base remains coherent and reusable,
- the state builds and tests without errors.

---
## AP-007 Close the critical coverage and mutation gaps of the final state in a targeted way

### Prerequisite

AP-001 through AP-006 are complete.

### Goal

The quality assurance of the final state is hardened specifically where JaCoCo or PIT results still reveal real risks in the load-bearing decision and error paths.

### Must be implemented

- Check the project's existing quality evaluations for **functionally and technically critical** gaps in a targeted way, in particular in areas such as:
  - the retry decision,
  - status updates,
  - persistence consistency,
  - file-name construction,
  - the target-copy path,
  - startup validation,
  - the logging sensitivity decision.
- Close meaningful gaps or surviving mutations specifically through:
  - additional tests,
  - small, demonstrably sensible implementation sharpenings,
  - or narrowly justified test-case refinements.
- Get existing quality gates or quality reports stably green for the relevant project state, insofar as they are already part of the build setup.
- Ensure that no metric cosmetics creep in, for example through arbitrary exclusions or unreliable test workarounds.
- Touch the build or quality configuration only when this is strictly required for a correct, reliable M8 final state and can be justified factually.
- Limit changes to the **demonstrated** high-risk gaps; no blind hardening of areas that are already uncritical.

### Explicitly out of scope

- blindly driving up metrics without a risk reference
- large-scale reinvention of quality gates without an existing project reference
- new business functionality
- final approval of the overall project without a prior overall verification

### Done when

- the critical coverage and mutation gaps of the final state have been closed in a targeted way,
- the remaining quality evaluations show no obvious high-risk blind spots,
- the build and test setup stays reliably green,
- no metric cosmetics have been introduced.

---
## AP-008 Perform an integrated overall verification of the final state and produce a reliable findings list

### Prerequisite

AP-001 through AP-007 are complete.

### Goal

The final state reached at this point is verified as a whole, and a reliable findings list remains in the repository from which KI 1 can derive, for a possible follow-up work package, exclusively the release blockers that actually remain.

### Must be implemented

- Verify the complete project state as a whole against the binding specifications and the results from M1–M7.
- Actually execute and evaluate the complete Maven reactor build, the relevant test suites, the smoke tests of the executable artifact, and the decisive end-to-end checks of the final state.
- Check, and record in writing, where the following in particular fit together and where deviations remain:
  - architecture and module boundaries,
  - business rules,
  - persistence and retry semantics,
  - file-naming and target-copy behavior,
  - startup validation and the exit code,
  - logging and the sensitivity rule,
  - configuration examples,
  - operations and startup documentation,
  - build and test artifacts.
- Add or update a concise findings file, kept in the repository, that:
  - names the checks actually performed,
  - separates green areas from open items,
  - classifies open items as **release blocker** vs. **non-blocking**,
  - clearly delimits the affected topic area for each release blocker.
- Make only the minimally necessary changes to build/test helpers or verification scripts that are required to run this integrated overall verification reproducibly.

### Explicitly out of scope

- wholesale fixing of every topic discovered in this overall review within the same work package
- new product features or new milestones
- late large-scale refactorings without a clear verification reference
- the final approval declaration for the project

### Done when

- the complete final state has been verified as a whole,
- the checks actually performed are reliably documented,
- a clearly delimited findings list exists in the repository,
- any release blockers are described precisely enough for a follow-up work package,
- the state still builds without errors.

---
## AP-009 Fix the specific release blockers from the integrated overall verification

### Prerequisite

AP-008 is complete.

### Goal

Only the release blockers concretely demonstrated and delimited in AP-008 are eliminated, without reopening the scope of the final milestone.

### Must be implemented

- Work exclusively on the items classified as **release blockers** in the findings list from AP-008.
- Limit the fix for each blocker to the topic area clearly named there.
- Ensure that no unsubstantiated side projects or new quality campaigns are pulled into this work package.
- Update directly affected tests, documentation, and configuration examples insofar as this is necessary for a consistent fix of the concrete blocker.
- If AP-008 demonstrated **no** release blockers, make no unnecessary production changes in this work package; instead, only record the absence of blockers consistently and traceably in the repository.
- After fixing the blockers, rerun at least the build and test scope necessary for the affected blockers.

### Explicitly out of scope

- fixing merely non-blocking cosmetic flaws
- new product features or new milestones
- another global overall verification of the complete final state
- broad sharpening of areas that AP-008 did not delimit as blockers

### Done when

- every release blocker demonstrated in AP-008 has been eliminated or traceably confirmed as no longer present,
- no unnecessary scope expansion has taken place,
- the affected areas build and test without errors again,
- the state remains ready for handover.

---
## AP-010 Perform the final overall verification, the approval documentation, and the completion of the M8 final state

### Prerequisite

AP-001 through AP-009 are complete.

### Goal

The overall state is conclusively demonstrated to be a fully approvable productive final state within the defined project scope, and the development approval is documented traceably.

### Must be implemented

- Actually re-execute and evaluate the complete Maven reactor build, the relevant test suites, the smoke tests of the executable artifact, and the decisive end-to-end checks of the final state.
- Check and confirm that the following in particular fit together:
  - architecture and module boundaries,
  - business rules,
  - persistence and retry semantics,
  - file-naming and target-copy behavior,
  - startup validation and the exit code,
  - logging and the sensitivity rule,
  - configuration examples,
  - operations and startup documentation,
  - build and test artifacts.
- Add or update a concise completion and approval document, kept in the repository, that records at least:
  - which checks were actually performed,
  - that no known, specification-relevant release blockers remain open for the defined project scope,
  - which artifacts, tests, or documents this statement is based on.
- Ensure that after this work package no known, specification-relevant blocker remains open for the defined project scope.
- Change production code, tests, or documentation only if a **concretely demonstrable** approval blocker shows up in the immediate completion run and can be fixed minimally without expanding the scope.

### Explicitly out of scope

- new product features or new milestones
- late large-scale refactorings without a direct approval-blocker reference
- arbitrary cosmetic corrections without approval relevance

### Done when

- the complete final state has been verified as a whole and is ready for approval,
- the build, tests, smoke behavior, and end-to-end flows are reliably green,
- no known, specification-relevant release blockers remain open,
- documentation, configuration, and artifact generation match the real final state,
- an error-free completion state ready for handover exists.

---
## Final assessment

The work packages cover the complete M8 target scope from the binding specifications and cut the final milestone more precisely than before for the two-stage AI workflow:

- a final architecture and documentation alignment
- targeted cleanup of real residual inconsistencies in the final state
- sharpening of logging, error, and operator feedback
- consolidation of the configuration, prompt, and operations documentation
- a deterministic end-to-end test base
- targeted regression tests for core rules and edge cases
- reliable closing of critical coverage and mutation gaps
- an integrated overall verification with a documented findings list
- targeted, clearly delimited fixing of demonstrated release blockers
- a final overall verification with a traceable development approval

At the same time, the boundaries with M1–M7 remain intact:

- M8 invents no new product functionality,
- M8 introduces no second source of truth,
- M8 does not reopen M1/M2 topics wholesale, but only closes real residual gaps of the final state,
- M8 separates the overall verification, blocker fixing, and approval into self-contained work steps that KI 1 and KI 2 can use precisely.
149 docs/workpackages/V1.1 - Abschlussnachweis.md (new file)
@@ -0,0 +1,149 @@
# V1.1 – Completion Report

## Date and affected modules

**Date:** 2026-04-09

**Affected modules:**

| Module | Type of change |
|---|---|
| `pdf-umbenenner-application` | New configuration types (`MultiProviderConfiguration`, `ProviderConfiguration`, `AiProviderFamily`) |
| `pdf-umbenenner-adapter-out` | New Anthropic adapter (`AnthropicClaudeHttpAdapter`), new parser (`MultiProviderConfigurationParser`), new validator (`MultiProviderConfigurationValidator`), migrator (`LegacyConfigurationMigrator`), schema migration (`ai_provider` column), updated OpenAI adapter (`OpenAiHttpAdapter`), updated properties adapter (`PropertiesConfigurationPortAdapter`) |
| `pdf-umbenenner-bootstrap` | Provider selector (`AiProviderSelector`), updated `BootstrapRunner` (migration, provider selection, logging) |
| `pdf-umbenenner-adapter-in-cli` | No functional change |
| `pdf-umbenenner-domain` | No change |
| `config/` | Example properties files updated to the new schema |
| `docs/betrieb.md` | Sections on AI provider selection and migration added |

---
## Mandatory test cases per work package

### AP-001 – Introduce the configuration schema

| Test case | Class | Status |
|---|---|---|
| `parsesNewSchemaWithOpenAiCompatibleActive` | `MultiProviderConfigurationTest` | green |
| `parsesNewSchemaWithClaudeActive` | `MultiProviderConfigurationTest` | green |
| `claudeBaseUrlDefaultsWhenMissing` | `MultiProviderConfigurationTest` | green |
| `rejectsMissingActiveProvider` | `MultiProviderConfigurationTest` | green |
| `rejectsUnknownActiveProvider` | `MultiProviderConfigurationTest` | green |
| `rejectsMissingMandatoryFieldForActiveProvider` | `MultiProviderConfigurationTest` | green |
| `acceptsMissingMandatoryFieldForInactiveProvider` | `MultiProviderConfigurationTest` | green |
| `envVarOverridesPropertiesApiKeyForActiveProvider` | `MultiProviderConfigurationTest` | green |
| `envVarOnlyResolvesForActiveProvider` | `MultiProviderConfigurationTest` | green |
| Existing tests remain green | `PropertiesConfigurationPortAdapterTest`, `StartConfigurationValidatorTest` | green |
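The schema these tests pin down can be pictured with a hypothetical properties sketch; every key name below is an assumption for illustration, since this report does not spell out the concrete schema:

```properties
# Illustrative sketch only; the concrete key names are assumptions.
# Exactly one provider family is active per run.
ai.active-provider=claude

# Mandatory fields are validated only for the active provider; the
# inactive provider's fields may stay unset.
ai.provider.openai-compatible.base-url=https://api.example.invalid/v1
ai.provider.openai-compatible.model=example-model

# The Claude base URL falls back to a default when missing.
ai.provider.claude.model=example-claude-model
```

For the active provider, an API key set via the environment takes precedence over any key in the properties file, and the environment variable is resolved only for the active provider (`envVarOverridesPropertiesApiKeyForActiveProvider`, `envVarOnlyResolvesForActiveProvider`).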
### AP-002 – Legacy migration with `.bak`

| Test case | Class | Status |
|---|---|---|
| `migratesLegacyFileWithAllFlatKeys` | `LegacyConfigurationMigratorTest` | green |
| `createsBakBeforeOverwriting` | `LegacyConfigurationMigratorTest` | green |
| `bakSuffixIsIncrementedIfBakExists` | `LegacyConfigurationMigratorTest` | green |
| `noOpForAlreadyMigratedFile` | `LegacyConfigurationMigratorTest` | green |
| `reloadAfterMigrationSucceeds` | `LegacyConfigurationMigratorTest` | green |
| `migrationFailureKeepsBak` | `LegacyConfigurationMigratorTest` | green |
| `legacyDetectionRequiresAtLeastOneFlatKey` | `LegacyConfigurationMigratorTest` | green |
| `legacyValuesEndUpInOpenAiCompatibleNamespace` | `LegacyConfigurationMigratorTest` | green |
| `unrelatedKeysSurviveUnchanged` | `LegacyConfigurationMigratorTest` | green |
| `inPlaceWriteIsAtomic` | `LegacyConfigurationMigratorTest` | green |
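The migration behavior in the table can be summarized as a before/after pair; the flat legacy keys and the namespaced keys shown here are assumptions for illustration:

```properties
# Before (legacy, flat keys):
#   api.key=sk-legacy
#   model=legacy-model
#
# After migration the same values live under the OpenAI-compatible
# namespace, and the original file is kept as a .bak copy (with an
# incremented suffix if a .bak already exists). Unrelated keys
# survive unchanged.
ai.active-provider=openai-compatible
ai.provider.openai-compatible.api-key=sk-legacy
ai.provider.openai-compatible.model=legacy-model
```

The in-place write is atomic, so a failed migration never leaves a half-rewritten file without its `.bak` safety copy.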
### AP-003 – Bootstrap provider selection and conversion of the existing OpenAI adapter

| Test case | Class | Status |
|---|---|---|
| `bootstrapWiresOpenAiCompatibleAdapterWhenActive` | `AiProviderSelectorTest` | green |
| `bootstrapFailsHardWhenActiveProviderUnknown` | `AiProviderSelectorTest` | green |
| `bootstrapFailsHardWhenSelectedProviderHasNoImplementation` | `AiProviderSelectorTest` | green |
| `openAiAdapterReadsValuesFromNewNamespace` | `OpenAiHttpAdapterTest` | green |
| `openAiAdapterBehaviorIsUnchanged` | `OpenAiHttpAdapterTest` | green |
| `activeProviderIsLoggedAtRunStart` | `BootstrapRunnerTest` | green |
| `existingDocumentProcessingTestsRemainGreen` | `BatchRunEndToEndTest` | green |
| `legacyFileEndToEndStillRuns` | `BootstrapRunnerTest` | green |
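The fail-hard selection behavior covered by these tests can be sketched with plain Java; the method shape and string identifiers are assumptions, not the real `AiProviderSelector` API:

```java
// Illustrative sketch: exactly one provider is selected per run, unknown or
// unwired providers fail the bootstrap hard, and there is no silent fallback.
final class ProviderSelectionSketch {
    static String adapterFor(String activeProvider, boolean claudeAdapterWired) {
        switch (activeProvider) {
            case "openai-compatible":
                return "OpenAiHttpAdapter";
            case "claude":
                if (!claudeAdapterWired) {
                    // selected family exists but has no implementation wired
                    throw new IllegalStateException("no adapter wired for: claude");
                }
                return "AnthropicClaudeHttpAdapter";
            default:
                // unknown active provider: hard startup failure, no fallback
                throw new IllegalStateException("unknown active provider: " + activeProvider);
        }
    }
}
```

Failing hard at bootstrap keeps a misconfigured provider from being silently replaced by a working one, which would hide the configuration error from the operator.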
### AP-004 – Persistence: additive provider identifier

| Test case | Class | Status |
|---|---|---|
| `addsProviderColumnOnFreshDb` | `SqliteAttemptProviderPersistenceTest` | green |
| `addsProviderColumnOnExistingDbWithoutColumn` | `SqliteAttemptProviderPersistenceTest` | green |
| `migrationIsIdempotent` | `SqliteAttemptProviderPersistenceTest` | green |
| `existingRowsKeepNullProvider` | `SqliteAttemptProviderPersistenceTest` | green |
| `newAttemptsWriteOpenAiCompatibleProvider` | `SqliteAttemptProviderPersistenceTest` | green |
| `newAttemptsWriteClaudeProvider` | `SqliteAttemptProviderPersistenceTest` | green |
| `repositoryReadsProviderColumn` | `SqliteAttemptProviderPersistenceTest` | green |
| `legacyDataReadingDoesNotFail` | `SqliteAttemptProviderPersistenceTest` | green |
| `existingHistoryTestsRemainGreen` | `SqliteAttemptProviderPersistenceTest` | green |
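The additive, backward-compatible column these tests describe would amount to something like the following SQLite DDL; the table name `attempt_history` is an assumption for illustration, and only the `ai_provider` column name comes from this report:

```sql
-- Illustrative sketch; the table name is assumed, the column name is
-- taken from this report. Existing rows keep NULL in the new column;
-- new attempts write the identifier of the adapter that produced them.
ALTER TABLE attempt_history ADD COLUMN ai_provider TEXT;
```

SQLite rejects `ADD COLUMN` if the column already exists, so an idempotent migration must first inspect the table definition (for example via `PRAGMA table_info`) before issuing the statement.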
### AP-005 – Implement and wire the native Anthropic adapter

| Test case | Class | Status |
|---|---|---|
| `claudeAdapterBuildsCorrectRequest` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterUsesEnvVarApiKey` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterFallsBackToPropertiesApiKey` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterFailsValidationWhenBothKeysMissing` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterParsesSingleTextBlock` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterConcatenatesMultipleTextBlocks` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterIgnoresNonTextBlocks` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterFailsOnEmptyTextContent` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterMapsHttp401AsTechnical` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterMapsHttp429AsTechnical` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterMapsHttp500AsTechnical` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterMapsTimeoutAsTechnical` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterMapsUnparseableJsonAsTechnical` | `AnthropicClaudeHttpAdapterTest` | green |
| `bootstrapSelectsClaudeWhenActive` | `AiProviderSelectorTest` | green |
| `claudeProviderIdentifierLandsInAttemptHistory` | `AnthropicClaudeAdapterIntegrationTest` | green |
| `existingOpenAiPathRemainsGreen` | all `OpenAiHttpAdapterTest` tests | green |
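The content-block rules these tests pin down (concatenate text blocks, ignore non-text blocks, fail on empty text) can be sketched with plain JDK types; the real adapter parses the Anthropic Messages API response with `org.json`, which this sketch deliberately avoids so it stays self-contained:

```java
import java.util.List;
import java.util.Map;

// Illustrative sketch of the content-block rules; not the real AnthropicClaudeHttpAdapter.
final class ClaudeContentSketch {
    /** Each block is a map with a "type" entry and, for text blocks, a "text" entry. */
    static String extractText(List<Map<String, String>> contentBlocks) {
        StringBuilder text = new StringBuilder();
        for (Map<String, String> block : contentBlocks) {
            if ("text".equals(block.get("type"))) { // non-text blocks are ignored
                text.append(block.get("text"));
            }
        }
        if (text.length() == 0) {
            // empty text content counts as a technical failure, like the HTTP error cases
            throw new IllegalStateException("empty text content in Claude response");
        }
        return text.toString();
    }
}
```

Mapping empty content to the same technical-failure category as HTTP 401/429/500 keeps the retry decision in the application layer uniform across both provider families.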
### AP-006 – Regression, smoke, documentation, completion report

| Test case | Class | Status |
|---|---|---|
| `smokeBootstrapWithOpenAiCompatibleActive` | `BootstrapSmokeTest` | green |
| `smokeBootstrapWithClaudeActive` | `BootstrapSmokeTest` | green |
| `e2eMigrationFromLegacyDemoConfig` | `ProviderIdentifierE2ETest` | green |
| `regressionExistingOpenAiSuiteGreen` | `ProviderIdentifierE2ETest` | green |
| `e2eClaudeRunWritesProviderIdentifierToHistory` | `ProviderIdentifierE2ETest` | green |
| `e2eOpenAiRunWritesProviderIdentifierToHistory` | `ProviderIdentifierE2ETest` | green |
| `legacyDataFromBeforeV11RemainsReadable` | `ProviderIdentifierE2ETest` | green |

---

## Verified properties

| Property | Evidence |
|---|---|
| Two provider families supported | `AiProviderSelectorTest`, `BootstrapSmokeTest` |
| Exactly one active per run | `MultiProviderConfigurationTest`, `BootstrapSmokeTest` |
| No automatic fallback | no fallback logic in `AiProviderSelector` or the application layer |
| Business contract (`NamingProposal`) unchanged | `AiResponseParser` and `AiNamingService` unchanged; both adapters return the same domain type |
| Persistence backward compatible | `SqliteAttemptProviderPersistenceTest`, `legacyDataFromBeforeV11RemainsReadable` |
| Migration verified | `LegacyConfigurationMigratorTest`, `e2eMigrationFromLegacyDemoConfig` |
| `.bak` backup verified | `LegacyConfigurationMigratorTest.createsBakBeforeOverwriting`, `e2eMigrationFromLegacyDemoConfig` |
| Active provider is logged | `BootstrapRunnerTest.activeProviderIsLoggedAtRunStart` |
| No architecture breaches | no `Application` or `Domain` code knows OpenAI- or Claude-specific types |
| No new libraries | the Anthropic adapter uses the Java HTTP Client and `org.json` (both already established in the repo) |

---

## Operator task

Anyone who previously used the environment variable `PDF_UMBENENNER_API_KEY` or another custom variable for the
OpenAI-compatible API key must switch to **`OPENAI_COMPATIBLE_API_KEY`**.
The application accepts only this canonical environment variable; older proprietary names are
not evaluated automatically.
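A minimal shell sketch of that switch; the jar name is an assumption for illustration:

```shell
# Drop the old variable and set the canonical one before starting a run.
unset PDF_UMBENENNER_API_KEY
export OPENAI_COMPATIBLE_API_KEY="sk-your-key"

# java -jar pdf-umbenenner-bootstrap.jar   # jar name is illustrative
echo "OPENAI_COMPATIBLE_API_KEY is set: ${OPENAI_COMPATIBLE_API_KEY:+yes}"
```

On Windows the PowerShell equivalent is `$env:OPENAI_COMPATIBLE_API_KEY = "sk-your-key"`.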
---

## Build result

Build command:

```
.\mvnw.cmd clean verify -pl pdf-umbenenner-domain,pdf-umbenenner-application,pdf-umbenenner-adapter-out,pdf-umbenenner-adapter-in-cli,pdf-umbenenner-bootstrap --also-make
```

Build status: **SUCCESSFUL**; all tests green, mutation tests executed in all modules.
596
docs/workpackages/V1.1 - Arbeitspakete.md
Normal file
596
docs/workpackages/V1.1 - Arbeitspakete.md
Normal file
@@ -0,0 +1,596 @@

# V1.1 – Work packages

> **Active extension:** An additional AI provider family, **Anthropic Claude**, via the native Messages API, alongside the existing OpenAI-compatible integration. A deliberately minimal extension of the approved baseline.

> **Location in the repository:** `docs/workpackages/V1.1 - Arbeitspakete.md`

---

## 0. Reading order for every work package

Before each AP, read **in full**:
1. `CLAUDE.md`
2. `docs/specs/technik-und-architektur.md`
3. `docs/specs/fachliche-anforderungen.md`
4. this document: sections 1 through 6
5. **only** the active work package from section 7

Do not get ahead of the plan. Do not guess. In case of genuine ambiguity, state it briefly instead of inventing.

---

## 1. Working method (binding)

These rules replace the non-existent `WORKFLOW.md` and apply to all APs in this document.

### 1.1 Scope discipline
- **Only** the active work package is implemented.
- Do not anticipate content of later work packages.
- No cosmetic refactorings without a direct relation to the AP.
- No renames outside the AP scope.
- Before making changes, locate the affected classes via type search in the repo, **not** via assumed paths.

### 1.2 Build and test obligation
Build command from the project root, identical for all APs:

```
.\mvnw.cmd clean verify -pl pdf-umbenenner-domain,pdf-umbenenner-application,pdf-umbenenner-adapter-out,pdf-umbenenner-adapter-in-cli,pdf-umbenenner-bootstrap --also-make
```

- After every substantial change: run the build.
- Before completing an AP: the build must be **error-free**, all tests green.
- If the build fails: fix the cause properly, do not paper over it.
- Existing tests must not be silently deleted or disabled. Where necessary they are **adapted**, and the reason is documented in the AP output.

### 1.3 Mandatory tests per AP
- Every new class with business-relevant or technically relevant logic gets at least one unit test.
- Every class changed in an AP that previously had tests keeps its tests; affected tests are adapted.
- Each AP has a list of **critical mandatory test cases** (see the respective AP). These must be implemented by name.
- Beyond that, the usual repo practice applies (coverage, PIT mutation tests in the directly affected modules, where already established).

### 1.4 Documentation
Maintained per AP, where relevant:
- JavaDoc and `package-info` of the touched classes
- configuration examples
- directly affected repository documents

### 1.5 Naming rule
**No** version or AP identifiers may appear in code, comments, or JavaDoc:
- Forbidden: `V1.0`, `V1.1`, `M1`–`M8`, `AP-001` … `AP-006`
- Instead: timeless technical names.

### 1.6 Mandatory output format at the end of every AP
At the end of processing an AP, Sonnet emits **exactly** this block:

```
- Scope fulfilled: yes/no
- Changed files:
  - <path>
  - ...
- New files:
  - <path>
  - ...
- Build command: <command used>
- Build status: SUCCESSFUL / FAILED
- Mandatory tests implemented: <list of test cases required by name>
- Open items: none / <description>
- Risks: none / <description>
```

---

## 2. Extension goal and non-goals

### 2.1 Goal
- The existing OpenAI-compatible AI path remains usable unchanged.
- In addition, the **native Anthropic Messages API** is integrated as a second, equally supported provider family.
- Exactly **one** provider is active per run – selected exclusively via configuration.
- No automatic fallback, no parallel use, no profile management.
- The business AI contract (`NamingProposal`) remains unchanged.
- Existing properties files are migrated to the new schema in a controlled way on first start; a `.bak` backup is created automatically beforehand.

### 2.2 Explicitly out of scope
- provider families beyond the two explicitly supported ones
- profile management with multiple configurations per provider family
- automatic fallback switching
- parallel use of multiple providers in one run
- changes to the business result contract
- changes to the file-naming rules, retry rules, or batch operating model
- persistence or schema changes beyond the single additive provider-identifier column

### 2.3 Architectural fidelity (non-negotiable)
- strict hexagonal architecture, dependencies point inward
- `AiNamingPort` remains provider-neutral
- provider-specific endpoints, headers, auth, and request/response formats live **exclusively** in the respective outbound adapter
- no direct adapter-to-adapter coupling, no shared "abstract AI adapter" intermediate layer
- the provider selection is a **bootstrap wiring decision**

---

## 3. Target configuration state (binding)

### 3.1 Properties schema
```properties
# existing, unchanged parameters
source.folder=...
target.folder=...
sqlite.file=...
max.retries.transient=...
max.pages=...
max.text.characters=...
prompt.template.file=...
runtime.lock.file=...
log.directory=...
log.level=...
log.ai.sensitive=...

# new provider selection (mandatory)
ai.provider.active=openai-compatible

# OpenAI-compatible provider family
ai.provider.openai-compatible.baseUrl=...
ai.provider.openai-compatible.model=...
ai.provider.openai-compatible.timeoutSeconds=...
ai.provider.openai-compatible.apiKey=...

# Anthropic provider family (Claude)
ai.provider.claude.baseUrl=https://api.anthropic.com
ai.provider.claude.model=...
ai.provider.claude.timeoutSeconds=...
ai.provider.claude.apiKey=...
```

### 3.2 Permitted values for `ai.provider.active`
- `openai-compatible`
- `claude`

Any other value is an invalid start configuration and leads to exit code `1`.

### 3.3 Mandatory values per active provider
| Provider | Mandatory | Optional / with default |
|---|---|---|
| `openai-compatible` | `baseUrl`, `model`, `timeoutSeconds`, `apiKey` (env takes precedence) | – |
| `claude` | `model`, `timeoutSeconds`, `apiKey` (env takes precedence) | `baseUrl` (default `https://api.anthropic.com`) |

No mandatory values are enforced for the **inactive** provider.

### 3.4 Environment variables for API keys
| Provider | Environment variable |
|---|---|
| `openai-compatible` | `OPENAI_COMPATIBLE_API_KEY` |
| `claude` | `ANTHROPIC_API_KEY` |

- Per provider: the environment variable takes **precedence** over the properties value of the same provider family.
- Keys of different providers are **never** mixed.
- If operations previously used a different environment variable for the OpenAI-compatible key, the operator must switch it to `OPENAI_COMPATIBLE_API_KEY`. This must be documented in the final evidence (AP-006).
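The precedence rule above boils down to one resolution step. A minimal sketch, assuming an injected environment lookup for testability; the class and method names are illustrative, not the repo's actual types:

```java
import java.util.function.Function;

/** Resolves the effective API key of the active provider: env var first, then properties value. */
public final class ApiKeyResolver {

    /**
     * @param envVarName    canonical env var of the active provider, e.g. OPENAI_COMPATIBLE_API_KEY
     * @param propertiesKey key from the provider's properties namespace, may be null or blank
     * @param env           environment lookup, e.g. System::getenv (injected for testability)
     * @return the effective key
     * @throws IllegalStateException if neither source provides a key (invalid configuration)
     */
    public static String resolve(String envVarName, String propertiesKey, Function<String, String> env) {
        String fromEnv = env.apply(envVarName);
        if (fromEnv != null && !fromEnv.isBlank()) {
            return fromEnv; // environment variable takes precedence
        }
        if (propertiesKey != null && !propertiesKey.isBlank()) {
            return propertiesKey;
        }
        throw new IllegalStateException("No API key configured; expected env var " + envVarName
                + " or a properties value");
    }
}
```

Note that the inactive provider's env var never enters this call: resolution is performed only with the active provider's variable name, which is what keeps the keys of different families from being mixed.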

### 3.5 Legacy form (before V1.1)
Unambiguously recognizable by at least one of the flat keys:
```
api.baseUrl
api.model
api.timeoutSeconds
api.key
```
without the presence of `ai.provider.active`.

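The detection rule can be stated as a tiny predicate over the loaded `java.util.Properties`; the class name is an illustrative assumption:

```java
import java.util.Properties;

/** Detects the legacy flat configuration form: at least one flat api.* key, no ai.provider.active. */
public final class LegacyFormDetector {

    private static final String[] FLAT_KEYS = {"api.baseUrl", "api.model", "api.timeoutSeconds", "api.key"};

    public static boolean isLegacy(Properties props) {
        if (props.getProperty("ai.provider.active") != null) {
            return false; // already the new schema, never treated as legacy
        }
        for (String key : FLAT_KEYS) {
            if (props.getProperty(key) != null) {
                return true;
            }
        }
        return false; // neither form present: not a migration case
    }
}
```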
---

## 4. Anthropic Messages API – binding technical fact block

> Source: official Claude API documentation. These values are binding and are not to be invented, derived, or "improved".

### 4.1 Endpoint and method
- Method: `POST`
- URL: `{baseUrl}/v1/messages`
- Default `baseUrl`: `https://api.anthropic.com`

### 4.2 Mandatory headers
| Header | Value |
|---|---|
| `x-api-key` | API key from `ANTHROPIC_API_KEY` (env) or `ai.provider.claude.apiKey` (properties) |
| `anthropic-version` | `2023-06-01` |
| `content-type` | `application/json` |

Do not use `Authorization: Bearer …`. Anthropic uses `x-api-key`.

### 4.3 Request body (relevant fields)
```json
{
  "model": "<model name from ai.provider.claude.model>",
  "max_tokens": <integer, > 0, mandatory>,
  "system": "<optional, top-level field - NOT a message with role=system>",
  "messages": [
    { "role": "user", "content": "<prompt text>" }
  ]
}
```
- `max_tokens` is **mandatory** (a difference from OpenAI). Concrete value: hard-coded in the adapter as appropriate, large enough for the application's JSON answer. No new properties key.
- `system` is **not** modeled as a message with `role=system`. Anthropic accepts only `user` and `assistant` in the `messages` array; a system prompt goes exclusively into the top-level `system` field.
- The application's existing prompt is passed **unchanged** as the content of the single `user` message. If the existing prompt mechanism has a system component, it moves into the `system` field; otherwise `system` is omitted.
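Assembling this request with the JDK's `java.net.http` client can be sketched as follows. This is a hedged sketch: the class name, the `max_tokens` value of 1024, and the manual JSON string building are illustrative assumptions; the real adapter would use the repo's established JSON library (`org.json`) instead of hand-escaping:

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Illustrative builder for the Anthropic Messages API request described above. */
public final class ClaudeRequestSketch {

    /** Escapes the few JSON-critical characters; a real adapter would use a JSON library. */
    static String esc(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    static String body(String model, int maxTokens, String prompt) {
        return "{\"model\":\"" + esc(model) + "\","
             + "\"max_tokens\":" + maxTokens + ","
             + "\"messages\":[{\"role\":\"user\",\"content\":\"" + esc(prompt) + "\"}]}";
    }

    static HttpRequest request(String baseUrl, String apiKey, String model, String prompt, int timeoutSeconds) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/v1/messages"))
                .header("x-api-key", apiKey)               // not Authorization: Bearer
                .header("anthropic-version", "2023-06-01")
                .header("content-type", "application/json")
                .timeout(Duration.ofSeconds(timeoutSeconds))
                .POST(HttpRequest.BodyPublishers.ofString(body(model, 1024, prompt)))
                .build();
    }
}
```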

### 4.4 Response body (relevant fields)
```json
{
  "id": "...",
  "type": "message",
  "role": "assistant",
  "content": [
    { "type": "text", "text": "<the actual answer>" }
  ],
  "stop_reason": "...",
  "usage": { "input_tokens": 0, "output_tokens": 0 }
}
```
- The text relevant to the application is obtained by **concatenating all blocks in `content` with `type == "text"`**, in order.
- Other block types are ignored.
- If the API delivers not a single `text` block, that is a technical adapter error (classified like an empty/unusable response body).
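The extraction rule, shown on already-parsed content blocks. The real adapter would obtain the blocks from the JSON response (e.g. via `org.json`); the class name and the `Optional` signature are illustrative assumptions:

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Illustrates the text extraction rule above on already-parsed content blocks. */
public final class ContentTextExtractor {

    /** Concatenates all blocks with type == "text" in order; empty Optional means a technical adapter error. */
    public static Optional<String> extract(List<Map<String, String>> contentBlocks) {
        StringBuilder sb = new StringBuilder();
        boolean sawText = false;
        for (Map<String, String> block : contentBlocks) {
            if ("text".equals(block.get("type"))) {
                sawText = true;
                sb.append(block.get("text"));
            }
            // any other block type is ignored
        }
        return sawText ? Optional.of(sb.toString()) : Optional.empty();
    }
}
```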

### 4.5 Error classification in the Claude adapter
| Symptom | Classification | Note |
|---|---|---|
| HTTP 4xx (except 429) | technical error | auth errors (401/403) fall in here |
| HTTP 429 | technical error | rate limit |
| HTTP 5xx | technical error | |
| Timeout | technical error | |
| Connection failed | technical error | |
| JSON not parseable | technical error | |
| No `content[*].text` block | technical error | |
| Response text not parseable into `NamingProposal` | handled by the application's existing response validation | not handled in the adapter |

All technical adapter errors are mapped onto the application's **existing** transient error semantics. **No** new error category is introduced.

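Because every non-success status in the table collapses into the same single classification, the status-code part reduces to one predicate. A minimal sketch (class name illustrative; timeouts and connection failures surface as exceptions, not status codes, and are classified the same way at the call site):

```java
/** Maps HTTP status codes to the single technical-error classification from the table above. */
public final class ClaudeHttpClassifier {

    /** True for every non-2xx status: 4xx (including 429) and 5xx are all technical errors. */
    public static boolean isTechnicalError(int statusCode) {
        return statusCode < 200 || statusCode >= 300;
    }
}
```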
---

## 5. Binding rules for every AP

1. **Minimal extension.** Do not change anything that is not strictly required for the extension.
2. **Uniform business AI contract.** `NamingProposal` remains unchanged. No provider-specific branching in application/domain.
3. **Exactly one active provider.** No fallback, no profile management.
4. **The properties file remains authoritative.** No alternative configuration source.
5. **The existing OpenAI path remains functionally unchanged.**
6. **Architecture boundaries** (see 2.3) are never breached.
7. **Backward compatibility of the SQLite data** is preserved.
8. **The build must be error-free at the end of every AP.**
9. **All mandatory test cases of the AP** are implemented.

---

## 6. Granularity and order

Six work packages in this mandatory order:

| AP | Topic | Risk | Character |
|---|---|---|---|
| AP-001 | Introduce configuration schema (additive) | low | pure extension |
| AP-002 | Legacy migration with `.bak` | medium | file rewrite, protected by backup |
| AP-003 | Bootstrap provider selection + switch existing adapter | high | behavioral change in the wiring |
| AP-004 | Persistence: provider identifier, additive | medium | additive DB schema migration |
| AP-005 | Implement and wire the native Anthropic adapter | medium | new adapter class |
| AP-006 | Regression, smoke, docs, final evidence | low | safeguarding |

---

# 7. Work packages

---

## AP-001 – Introduce configuration schema (additive)

### Prerequisite
None.

### Goal
The new, nested properties schema (section 3.1) is introduced in code as a parseable and validatable structure. The existing read and validation path remains **untouched** – the new schema is introduced additively in parallel. **No** switch happens in the bootstrap and **no** migration happens in this AP.

### Concrete steps
1. In the `pdf-umbenenner-application` module (or the module where today's configuration classes live – locate via type search), introduce **new** configuration types, at minimum:
   - a representation of a single provider configuration (fields: `model`, `timeoutSeconds`, `baseUrl`, `apiKey`)
   - a representation of the provider selection (`activeProviderId`) plus a map or two fields for the two provider families
   - a clearly named enum type or constant string set for the permitted values `openai-compatible` and `claude`
2. In the outbound adapter module, extend the **properties parser** so that it recognizes the new keys from section 3.1 and reads them into the new types from step 1. The existing parser for the old flat keys remains runnable **unchanged** (parallel detection).
3. Introduce a **validation** for the new types. It checks:
   - `ai.provider.active` is set and is a permitted value
   - all mandatory values of the active provider are present (table 3.3)
   - `timeoutSeconds` is a positive integer
   - for Claude: the default `baseUrl` is set when the value is missing
   - no mandatory values are enforced for the **inactive** provider
   - **API key resolution:** the environment variable of the active provider (table 3.4) takes precedence over the properties value; if both are empty, the configuration is invalid
4. **Bootstrap and existing adapters are not switched over in this AP.** The new types are reachable exclusively through new tests. The default run of the application still uses the old classes.
5. Add JavaDoc for all new classes and methods.
6. Do **not** change the configuration example (`*.example.properties` or similar) in this AP. That follows in AP-002 together with the migration.
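The validation rules from step 3 can be sketched as follows. The record, class, and method names are illustrative assumptions (the repo's actual types may differ), and the API-key check is shown separately in section 3.4:

```java
import java.util.Set;

/** Illustrative validation of the provider selection and the active provider's mandatory values. */
public final class ProviderConfigValidator {

    private static final Set<String> PERMITTED = Set.of("openai-compatible", "claude");
    private static final String CLAUDE_DEFAULT_BASE_URL = "https://api.anthropic.com";

    /** Hypothetical value object for one provider family. */
    public record ProviderConfig(String baseUrl, String model, Integer timeoutSeconds, String apiKey) {}

    /** Validates the active provider only and returns its config with the Claude baseUrl default applied. */
    public static ProviderConfig validateActive(String activeProviderId, ProviderConfig active) {
        if (activeProviderId == null || !PERMITTED.contains(activeProviderId)) {
            throw new IllegalArgumentException("ai.provider.active must be one of " + PERMITTED);
        }
        String baseUrl = active.baseUrl();
        if (baseUrl == null && "claude".equals(activeProviderId)) {
            baseUrl = CLAUDE_DEFAULT_BASE_URL; // default exists only for the Claude family
        }
        if (baseUrl == null || active.model() == null) {
            throw new IllegalArgumentException("missing mandatory value for provider " + activeProviderId);
        }
        if (active.timeoutSeconds() == null || active.timeoutSeconds() <= 0) {
            throw new IllegalArgumentException("timeoutSeconds must be a positive integer");
        }
        return new ProviderConfig(baseUrl, active.model(), active.timeoutSeconds(), active.apiKey());
    }
}
```

The inactive provider's config is simply never passed through this method, which is how "no mandatory values for the inactive provider" falls out of the design.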

### Mandatory test cases (critical, to be implemented by name)
1. `parsesNewSchemaWithOpenAiCompatibleActive` – complete new schema, OpenAI active, all mandatory values set → parsed successfully, validation green.
2. `parsesNewSchemaWithClaudeActive` – complete new schema, Claude active, all mandatory values set → parsed successfully, validation green.
3. `claudeBaseUrlDefaultsWhenMissing` – Claude active, `ai.provider.claude.baseUrl` missing → default `https://api.anthropic.com` is set, validation green.
4. `rejectsMissingActiveProvider` – `ai.provider.active` missing → validation fails with a clear message.
5. `rejectsUnknownActiveProvider` – `ai.provider.active=foo` → validation fails.
6. `rejectsMissingMandatoryFieldForActiveProvider` – the active provider has an empty mandatory field → validation fails.
7. `acceptsMissingMandatoryFieldForInactiveProvider` – inactive provider incomplete → validation green.
8. `envVarOverridesPropertiesApiKeyForActiveProvider` – `OPENAI_COMPATIBLE_API_KEY` set, properties key also set → the effective key is the one from the env var. Analogously for `ANTHROPIC_API_KEY`.
9. `envVarOnlyResolvesForActiveProvider` – env var set only for the inactive provider, active provider has a properties key → the effective key is the active provider's properties key; the inactive provider's env var is ignored.
10. Existing tests stay green – all previous configuration tests keep running.

Additional test categories: unit tests for the new types (equality, defaults), parser tests, validator tests.

### Explicitly NOT part of this AP
- migration of the legacy file
- `.bak` backup
- bootstrap switch
- changes to the existing OpenAI adapter
- native Claude adapter
- persistence changes
- logging changes

### Definition of Done
- build error-free
- all mandatory test cases implemented and green
- existing tests green
- JavaDoc complete for new classes
- mandatory output block emitted

---

## AP-002 – Legacy migration with `.bak`

### Prerequisite
AP-001 completed.

### Goal
On the first start with a detected legacy form, the properties file is transferred into the new schema in a controlled way. Before every migration, a `.bak` backup is created. After a successful migration the application **still** runs on the old bootstrap path (the switch follows in AP-003); but the file on disk is already in the new format and is immediately readable through the new schema on the next start.

### Concrete steps
1. Create a new component in the outbound adapter module that works purely at the properties-file level (no HTTP, no DB access). Responsibilities:
   - detect the legacy form (section 3.5)
   - create the `.bak` backup: `<filename>.bak`. If `.bak` already exists, back up with an ascending numeric suffix (`<filename>.bak`, `<filename>.bak.1`, …) – **never** overwrite.
   - rewrite values according to this table:

     | Legacy | Target |
     |---|---|
     | `api.baseUrl` | `ai.provider.openai-compatible.baseUrl` |
     | `api.model` | `ai.provider.openai-compatible.model` |
     | `api.timeoutSeconds` | `ai.provider.openai-compatible.timeoutSeconds` |
     | `api.key` | `ai.provider.openai-compatible.apiKey` |

   - add `ai.provider.active=openai-compatible`.
   - insert empty/commented-out placeholders for the Claude section with a short explanatory comment (one block, max. 6 lines).
   - carry over all remaining keys (`source.folder`, `target.folder`, `sqlite.file`, `max.*`, `prompt.template.file`, `runtime.lock.file`, `log.*`) **unchanged** and in **stable order**.
   - write the migrated file in place (`.tmp` + atomic move/rename, no truncate-and-write on the original).
   - afterwards load the file again through the **new** parser from AP-001 and validate it with the new validator. If that fails, it is a hard start error (exit code 1, clear message, the `.bak` is preserved).
2. The migration is invoked at program start **before** the existing configuration loading, as soon as the file is known. This call happens at exactly one place in the bootstrap and is clearly nameable as its own method.
3. If **no** legacy form is detected (i.e. the file is already in the new schema), nothing happens: no `.bak`, no writes.
4. Do **not** switch the existing ConfigurationPort implementation – that happens in AP-003. After AP-002 the application keeps running as before in business terms; its input file is simply readable in both forms now.
5. Switch the configuration example in the repo (e.g. `*.example.properties`) to the **new** schema. The file shows both provider sections with descriptive placeholder values.
6. Add JavaDoc and a short section in the repo docs on the migration (what happens, when, how backups are made, what happens on error).
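The never-overwrite backup naming from step 1 can be expressed as a pure function; the class name and the injected existence check are illustrative assumptions that keep the logic testable without a file system:

```java
import java.util.function.Predicate;

/** Computes the next free backup file name: <name>.bak, then <name>.bak.1, <name>.bak.2, ... */
public final class BackupNames {

    /**
     * @param fileName properties file name, e.g. "app.properties"
     * @param exists   existence check (e.g. Files::exists in production, a set lookup in tests)
     */
    public static String nextFree(String fileName, Predicate<String> exists) {
        String candidate = fileName + ".bak";
        int suffix = 0;
        while (exists.test(candidate)) {
            suffix++;
            candidate = fileName + ".bak." + suffix; // never overwrite an existing backup
        }
        return candidate;
    }
}
```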

### Mandatory test cases
1. `migratesLegacyFileWithAllFlatKeys` – a legacy file with all four `api.*` keys is correctly transferred into the new schema; the values stay identical in content; the remaining keys stay unchanged.
2. `createsBakBeforeOverwriting` – before the migration no `.bak` exists, afterwards it exists with the **original content**.
3. `bakSuffixIsIncrementedIfBakExists` – `.bak` already exists → new backup as `.bak.1`. No backup is overwritten.
4. `noOpForAlreadyMigratedFile` – file already in the new schema → no write, no `.bak`.
5. `reloadAfterMigrationSucceeds` – after the migration, the new parser/validator from AP-001 loads the file without errors.
6. `migrationFailureKeepsBak` – the migration writes a faulty file (a test mock forces a validation error after writing) → the bootstrap reports a hard start error, the `.bak` is untouched.
7. `legacyDetectionRequiresAtLeastOneFlatKey` – a file with `ai.provider.active=...` and without `api.*` → no legacy form, no migration.
8. `legacyValuesEndUpInOpenAiCompatibleNamespace` – the values `api.baseUrl`, `api.model`, `api.timeoutSeconds`, `api.key` end up in exactly the four target keys; `ai.provider.active=openai-compatible` is set.
9. `unrelatedKeysSurviveUnchanged` – keys such as `source.folder`, `max.pages`, `log.level` are preserved with identical values.
10. `inPlaceWriteIsAtomic` – a test double for the file system proves: first write `.tmp`, then atomic move; there is no point at which the original is partially written.

Additional test categories: temporary files in `@TempDir`, repository/integration tests for the migration component.

### Explicitly NOT part
- bootstrap switch of the active configuration path
- changes to the existing OpenAI adapter
- Claude adapter
- persistence
- logging changes beyond the migration messages

### Definition of Done
- build error-free, all mandatory test cases green
- example properties file in the new schema
- short migration docs in the repo
- mandatory output block emitted

---

## AP-003 – Bootstrap provider selection and switch of the existing OpenAI adapter

### Prerequisite
AP-001 and AP-002 completed.

### Goal
Based on `ai.provider.active`, the bootstrap module selects exactly one `AiNamingPort` implementation as the active implementation and wires it. From now on, the existing OpenAI-compatible adapter consumes its values from the `ai.provider.openai-compatible.*` namespace. Its business behavior remains **unchanged**. The active provider is logged at run start.

### Concrete steps
1. Introduce a **provider selector component** in the bootstrap module that receives the value of `ai.provider.active` and all known `AiNamingPort` implementations as input and returns exactly one. Initially it knows only the OpenAI implementation; the extension for Claude happens in AP-005 at exactly this place.
2. Adapt the existing `AiNamingPort` implementation for the OpenAI-compatible interface so that it consumes the values from `ai.provider.openai-compatible.*`. The previous business contract, the request/response mapping, and the error behavior remain **identical**.
3. Switch the existing ConfigurationPort/`Configuration` read path so that internally **only** the new schema is used. The old flat classes/methods that served only to read `api.*` are removed – but **only** if they are not needed anywhere else (check via search). If references still exist, the respective consumer is switched to the new schema within the same AP.
4. Switch the repo's existing configuration tests to the new schema. Tests that explicitly checked the old flat schema are moved to migration tests (already part of AP-002) **or** rewritten against the new schema. No test is silently deleted.
5. Extend the logging integration: at run start, the **active provider identifier** is logged (standard log level `INFO`). All other required log contents (see `CLAUDE.md`, minimum logging scope) remain unchanged.
6. Ensure that the sensitivity rule for AI content continues to apply unchanged and provider-independently.
7. Actively avoid adapter-to-adapter coupling: the provider selector lives in the bootstrap, **not** in the outbound adapter module.
8. Add JavaDoc for the selector and the affected classes.
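Step 1's selector can be sketched as a simple registry keyed by provider identifier. The interface and class names are illustrative stand-ins for the repo's actual `AiNamingPort` types:

```java
import java.util.Map;

/** Illustrative bootstrap-level selector: exactly one implementation per run, hard failure otherwise. */
public final class AiProviderSelector {

    /** Stand-in for the repo's AiNamingPort. */
    public interface NamingPort {}

    private final Map<String, NamingPort> implementations;

    public AiProviderSelector(Map<String, NamingPort> implementations) {
        this.implementations = Map.copyOf(implementations);
    }

    /** Returns the single active implementation; unknown or unregistered ids are a hard start error. */
    public NamingPort select(String activeProviderId) {
        NamingPort port = implementations.get(activeProviderId);
        if (port == null) {
            throw new IllegalStateException("No AiNamingPort registered for provider: " + activeProviderId);
        }
        return port; // no fallback, no second choice
    }
}
```

Registering the Claude implementation in AP-005 then means adding one more map entry at this single wiring point, with no change to application or domain code.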

### Mandatory test cases
1. `bootstrapWiresOpenAiCompatibleAdapterWhenActive` – `ai.provider.active=openai-compatible` → the selector returns the OpenAI implementation.
2. `bootstrapFailsHardWhenActiveProviderUnknown` – the value is syntactically set but not a valid provider → hard start error, exit code 1.
3. `bootstrapFailsHardWhenSelectedProviderHasNoImplementation` – the value is `claude`, but the implementation is not yet registered (the state after AP-003) → hard start error with a clear message. This test is adapted in AP-005 once Claude is registered.
4. `openAiAdapterReadsValuesFromNewNamespace` – adapter test: given `ai.provider.openai-compatible.*` values end up 1:1 in the HTTP request to the previous endpoint URL.
5. `openAiAdapterBehaviorIsUnchanged` – the existing adapter behavior test (request form, response mapping, error classification) is switched to the new configuration source and stays green.
6. `activeProviderIsLoggedAtRunStart` – a smoke or bootstrap test proves that the active provider appears in a defined log entry at run start.
7. `existingDocumentProcessingTestsRemainGreen` – all existing end-to-end/integration tests of the existing OpenAI path stay green, with adjusted configuration where needed.
8. `legacyFileEndToEndStillRuns` – test double: the application starts with a legacy file → the migration from AP-002 runs → the bootstrap from AP-003 selects OpenAI → the run completes in business terms as before.

Additional test categories: bootstrap/wiring tests, plus a smoke test without a real external call where applicable.

### Explicitly NOT part
- Claude adapter
- persistence extension for the provider identifier
- new error semantics
- refactoring outside the adapter integration

### Definition of Done
- build error-free, mandatory test cases green
- existing OpenAI path unchanged in business terms
- active provider is logged at run start
- no remaining references to the old flat schema in the production path
- mandatory output block emitted

---

## AP-004 – Persistence: provider identifier, additive

### Prerequisite
AP-003 completed.

### Goal
The SQLite schema is extended **additively** with a column for the provider identifier per attempt. Existing records remain readable and correctly interpretable (default value for legacy data). New attempts write the identifier of the provider active for that attempt.

### Concrete steps
1. Add a new column to the attempt-history SQLite schema, e.g. `ai_provider TEXT NULL` (choose the column name per existing repo convention, otherwise as proposed here). The column is nullable.
2. Implement the schema migration:
   - at program start, check whether the column exists; if not, add it via `ALTER TABLE`.
   - existing rows keep the value `NULL`.
   - the migration must be idempotent (repeated starts without errors).
3. Extend the attempt-history write logic so that when a new attempt is created, the **identifier of the actively selected provider** is written along with it (`openai-compatible` or `claude`). The value comes from the provider selection already available since AP-003.
4. The document master record is **not** changed.
5. Adapt the read path so that the new value is read as well; existing mappers/domain types are minimally extended with one optional field. Application and domain gain no provider-specific code through this – the field remains an opaque string.
6. Add JavaDoc and a short section on the schema extension in the repo docs.
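The idempotent migration in step 2 boils down to a check-then-alter decision. A minimal sketch over an already-fetched column list; a real implementation would read the columns via JDBC (e.g. `PRAGMA table_info`), and the table name `attempt_history` is an illustrative assumption alongside the column name proposed above:

```java
import java.util.List;
import java.util.Optional;

/** Decides whether the additive provider column still has to be added. Illustrative sketch. */
public final class AttemptHistoryMigration {

    /** Returns the ALTER statement to execute, or empty if the column already exists (idempotent). */
    public static Optional<String> alterStatement(List<String> existingColumns) {
        if (existingColumns.contains("ai_provider")) {
            return Optional.empty(); // nothing to do on repeated starts
        }
        return Optional.of("ALTER TABLE attempt_history ADD COLUMN ai_provider TEXT NULL");
    }
}
```

Because the column is added as nullable and no backfill is performed, existing rows keep `NULL`, which the read path interprets as "provider unknown (legacy data)".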

### Mandatory test cases
1. `addsProviderColumnOnFreshDb` – fresh DB → the schema contains the new column.
2. `addsProviderColumnOnExistingDbWithoutColumn` – DB without the column (simulated legacy stock) → the migration adds the column as nullable.
3. `migrationIsIdempotent` – repeated starts change nothing and throw no error.
4. `existingRowsKeepNullProvider` – legacy rows keep `NULL`.
5. `newAttemptsWriteOpenAiCompatibleProvider` – active provider OpenAI → a new attempt has `ai_provider='openai-compatible'`.
6. `newAttemptsWriteClaudeProvider` – active provider Claude (the provider selection is mocked for this test; in AP-005 the same test is repeated with the real Claude adapter) → `ai_provider='claude'`.
7. `repositoryReadsProviderColumn` – repository test: a stored value is read back correctly.
8. `legacyDataReadingDoesNotFail` – test with a DB file from the pre-V1.1 state: reading succeeds without errors, the new value is Optional/empty.
9. `existingHistoryTestsRemainGreen` – all existing tests around the attempt history stay green, with minimal adjustments where needed.

Additional test categories: repository tests against a real SQLite instance (in-memory or temporary), schema migration tests.

### Explicitly NOT part
- Claude adapter (follows in AP-005)
- changes to the document master record
- new sources of truth
- reporting/statistics

### Definition of Done
- build error-free, mandatory test cases green
- existing data stocks remain readable
- provider identifier is written for new attempts
- mandatory output block emitted

---

## AP-005 – Implement and wire the native Anthropic adapter

### Prerequisite
AP-001 through AP-004 completed.

### Goal
A second `AiNamingPort` implementation is created in the outbound adapter module that talks to the **native Anthropic Messages API** (see the fact block in section 4). It is registered as the second option in the provider selector from AP-003. The adapter maps the Anthropic response onto the **existing** business contract; no special path arises in application or domain.

### Concrete steps
1. Create a new class in the outbound adapter module that implements `AiNamingPort`. Naming per existing repo convention; check via type search how the OpenAI implementation is named and proceed analogously.
2. Implement the HTTP call per fact block 4:
   - URL from `ai.provider.claude.baseUrl` (default `https://api.anthropic.com`) plus the path `/v1/messages`
   - method `POST`
   - headers `x-api-key`, `anthropic-version: 2023-06-01`, `content-type: application/json`
   - request body with `model`, `max_tokens`, `messages` (one `user` message with the existing prompt text), optionally `system` if the existing prompt mechanism has a system segment
   - timeout from `ai.provider.claude.timeoutSeconds`
3. API key resolution exactly per table 3.4: first `ANTHROPIC_API_KEY`, then `ai.provider.claude.apiKey`.
4. Response processing per 4.4: concatenation of all `content[*].text` blocks in order. If every `text` block is missing or the response is not parseable → technical adapter error per table 4.5.
5. Pass the text obtained this way **unchanged** to the application's existing response processing (`NamingProposal` validation happens in application/domain as before).
6. Error classification strictly per table 4.5. No new error classes.
7. Extend the provider selector from AP-003 with the new implementation. **No** shared base class between the two adapters, **no** helper class sharing HTTP logic. What both adapters need comes from the repo's usual HTTP/JSON standard, not from a new adapter intermediate layer.
8. Adapt the test `bootstrapFailsHardWhenSelectedProviderHasNoImplementation` created in AP-003 so that from now on it tests against a new, still **unknown** provider value (the negative case is preserved, but `claude` is now registered).
9. Extend the configuration example in the repo with descriptive Claude example values.
10. Add JavaDoc for the new class and any new helper types.
|
||||
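Steps 2 and 4 can be sketched with JDK-only types (a minimal illustration: the class, method, and record names here are hypothetical, and the JSON body is hand-assembled for brevity, whereas the real adapter uses the repo's established JSON library):

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;
import java.util.List;
import java.util.stream.Collectors;

public class AnthropicRequestSketch {

    /** Step 2: POST to {baseUrl}/v1/messages with the three mandatory headers. */
    static HttpRequest buildRequest(String baseUrl, String apiKey, String model,
                                    String prompt, int timeoutSeconds) {
        // Hand-assembled body; a real adapter must JSON-escape the prompt.
        String body = """
                {"model":"%s","max_tokens":1024,"messages":[{"role":"user","content":"%s"}]}"""
                .formatted(model, prompt);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/v1/messages"))
                .header("x-api-key", apiKey)
                .header("anthropic-version", "2023-06-01")
                .header("content-type", "application/json")
                .timeout(Duration.ofSeconds(timeoutSeconds))
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    /** A content block of the Anthropic response, reduced to the two relevant fields. */
    record ContentBlock(String type, String text) {}

    /** Step 4: concatenate all text-type blocks in order; null signals the technical-error case. */
    static String concatenateTextBlocks(List<ContentBlock> blocks) {
        List<String> texts = blocks.stream()
                .filter(b -> "text".equals(b.type()))
                .map(ContentBlock::text)
                .collect(Collectors.toList());
        return texts.isEmpty() ? null : String.join("", texts);
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("https://api.anthropic.com", "sk-test",
                "claude-3-5-sonnet-20241022", "prompt", 60);
        System.out.println(request.uri()); // https://api.anthropic.com/v1/messages
        System.out.println(concatenateTextBlocks(List.of(
                new ContentBlock("text", "Hello "),
                new ContentBlock("tool_use", "ignored"),
                new ContentBlock("text", "world")))); // Hello world
    }
}
```

Note that non-`text` blocks are silently skipped, matching test case 7 (`claudeAdapterIgnoresNonTextBlocks`), while an entirely text-free response maps to the technical error of test case 8.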
### Mandatory test cases

1. `claudeAdapterBuildsCorrectRequest` – given prompt → HTTP request with the correct URL (`<baseUrl>/v1/messages`), method POST, all three mandatory headers; body contains `model`, `max_tokens > 0`, and `messages` with exactly one `user` message carrying the correct prompt.

2. `claudeAdapterUsesEnvVarApiKey` – `ANTHROPIC_API_KEY` set, properties value also set → header `x-api-key` contains the env value.

3. `claudeAdapterFallsBackToPropertiesApiKey` – env var empty, properties value set → header `x-api-key` contains the properties value.

4. `claudeAdapterFailsValidationWhenBothKeysMissing` – both empty → configuration error at startup (handled by the AP-001 validation).

5. `claudeAdapterParsesSingleTextBlock` – mock response with one block `{type:"text", text:"..."}` → response text equals the block text.

6. `claudeAdapterConcatenatesMultipleTextBlocks` – several `text` blocks → response text equals the concatenation in order.

7. `claudeAdapterIgnoresNonTextBlocks` – mix of `text` and non-`text` blocks → only the `text` contents end up in the response text.

8. `claudeAdapterFailsOnEmptyTextContent` – response without any `text` block → technical adapter error.

9. `claudeAdapterMapsHttp401AsTechnical` – mock response 401 → technical error per table 4.5.

10. `claudeAdapterMapsHttp429AsTechnical` – mock response 429 → technical error.

11. `claudeAdapterMapsHttp500AsTechnical` – mock response 500 → technical error.

12. `claudeAdapterMapsTimeoutAsTechnical` – simulated timeout → technical error.

13. `claudeAdapterMapsUnparseableJsonAsTechnical` – response body is not valid JSON → technical error.

14. `bootstrapSelectsClaudeWhenActive` – `ai.provider.active=claude` → the selector returns the Claude implementation.

15. `claudeProviderIdentifierLandsInAttemptHistory` – end-to-end with a mocked HTTP layer: after a successful run, the new attempt has `ai_provider='claude'` (ties in with AP-004).

16. `existingOpenAiPathRemainsGreen` – all existing tests of the OpenAI path stay green unchanged.

Additional test categories: adapter tests with a mocked HTTP client (no real network access), bootstrap wiring tests.
### Explicitly NOT in scope

- automatic fallback logic between providers
- shared adapter base class
- persistence schema extensions beyond AP-004
- prompt changes (an existing system/user split of the prompt file may be used, but the prompt content itself must not change)

### Definition of Done

- build passes, all mandatory test cases green
- the native Anthropic adapter is selectable via configuration and produces correct results against mocks
- existing OpenAI path stays green unchanged
- mandatory output block emitted

---
## AP-006 – Regression, smoke, documentation consolidation, final evidence

### Prerequisite

AP-001 through AP-005 completed.

### Goal

The complete extension state is secured by automated tests, consolidated in the documentation, and demonstrably proven to be a minimal, architecture-faithful extension of the baseline.

### Concrete steps

1. **Smoke test per provider:** Set up two smoke tests that, for one provider configuration each, run the bootstrap path up to the successful wiring of the `AiNamingPort`, **without** any real external HTTP call (mocked HTTP layer). Both must be green.

2. **OpenAI regression:** All existing end-to-end/integration tests of the OpenAI path pass. If adjustments in earlier APs touched tests, this is the final consistency check.

3. **Migration smoke:** One end-to-end test that starts with a legacy file (content from the known demo config) and after a first run proves the following:

   - `.bak` exists with the original content
   - the properties file is in the new schema
   - `ai.provider.active=openai-compatible`
   - the run behaved functionally the same as with the new schema

4. Run **PIT/mutation tests** in the directly affected modules, where already established. Close gaps in the new code that fall clearly below the existing level in a targeted way. No arbitrary coverage cosmetics.

5. **Documentation consolidation:**

   - The example properties file shows the complete new schema for **both** providers with descriptive placeholders.
   - The repo documentation contains a short section "Selecting the AI provider" with the permitted values and the env var convention (`OPENAI_COMPATIBLE_API_KEY`, `ANTHROPIC_API_KEY`).
   - The repo documentation contains a short section "Migrating from the previous version" with a note on `.bak`.
   - JavaDoc exists for all classes newly introduced or substantially changed in this extension.

6. **Final evidence:** Create a short Markdown file that stays in the repository under `docs/workpackages/V1.1 - Abschlussnachweis.md`, containing at least:

   - date, affected modules
   - list of executed mandatory test cases per AP (may be tabular)
   - proven properties: two providers supported, exactly one active, no fallback, domain contract unchanged, persistence backward compatible, migration proven, `.bak` proven, active provider logged
   - explicit confirmation: no architecture violations, no new libraries beyond those already established in the repo for HTTP/JSON
   - note on the operator task of switching the OpenAI key's environment variable to `OPENAI_COMPATIBLE_API_KEY` where applicable

7. Run the full reactor build and record the result in the AP output.
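The consolidated example file from step 5 could look like this (a sketch: `ai.provider.active`, the `claude` keys, and both env var names are fixed by the spec above, while the `openai-compatible` key names merely mirror the `claude` keys and are assumptions, as are all placeholder values):

```properties
# Exactly one provider is active; there is no fallback between providers.
ai.provider.active=openai-compatible

# OpenAI-compatible provider (key names assumed to mirror the claude block)
ai.provider.openai-compatible.baseUrl=https://your-openai-compatible-host.example
ai.provider.openai-compatible.model=your-model-name
ai.provider.openai-compatible.timeoutSeconds=60
# Env var OPENAI_COMPATIBLE_API_KEY takes precedence over this property.
ai.provider.openai-compatible.apiKey=

# Native Anthropic provider
ai.provider.claude.baseUrl=https://api.anthropic.com
ai.provider.claude.model=claude-3-5-sonnet-20241022
ai.provider.claude.timeoutSeconds=60
# Env var ANTHROPIC_API_KEY takes precedence over this property.
ai.provider.claude.apiKey=
```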
### Mandatory test cases

1. `smokeBootstrapWithOpenAiCompatibleActive`

2. `smokeBootstrapWithClaudeActive`

3. `e2eMigrationFromLegacyDemoConfig`

4. `regressionExistingOpenAiSuiteGreen` (aggregate evidence, not a single test)

5. `e2eClaudeRunWritesProviderIdentifierToHistory`

6. `e2eOpenAiRunWritesProviderIdentifierToHistory`

7. `legacyDataFromBeforeV11RemainsReadable`

Additional test categories: mutation tests in affected modules, consistency checks of the documentation examples against the real parser (e.g. "the example properties file loads in the parser without errors").
### Explicitly NOT in scope

- additional providers
- convenience features
- large-scale refactoring

### Definition of Done

- full reactor build passes
- all mandatory test cases green
- smoke tests per provider green
- documentation consolidated
- final evidence file in the repo
- mandatory output block emitted
@@ -1,68 +0,0 @@
pdf-umbenenner-adapter-in-cli/src/main/java/de/gecheckt/pdf/umbenenner/adapter/in/cli/package-info.java | de.gecheckt.pdf.umbenenner.adapter.in.cli | |
pdf-umbenenner-adapter-in-cli/src/main/java/de/gecheckt/pdf/umbenenner/adapter/in/cli/SchedulerBatchCommand.java | de.gecheckt.pdf.umbenenner.adapter.in.cli | class | SchedulerBatchCommand
pdf-umbenenner-adapter-in-cli/src/test/java/de/gecheckt/pdf/umbenenner/adapter/in/cli/SchedulerBatchCommandTest.java | de.gecheckt.pdf.umbenenner.adapter.in.cli | class | SchedulerBatchCommandTest
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/configuration/package-info.java | de.gecheckt.pdf.umbenenner.adapter.out.configuration | |
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/configuration/PropertiesConfigurationPortAdapter.java | de.gecheckt.pdf.umbenenner.adapter.out.configuration | class | PropertiesConfigurationPortAdapter
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/lock/FilesystemRunLockPortAdapter.java | de.gecheckt.pdf.umbenenner.adapter.out.lock | class | FilesystemRunLockPortAdapter
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/lock/package-info.java | de.gecheckt.pdf.umbenenner.adapter.out.lock | |
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/package-info.java | de.gecheckt.pdf.umbenenner.adapter.out | |
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/pdfextraction/package-info.java | de.gecheckt.pdf.umbenenner.adapter.out.pdfextraction | |
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/pdfextraction/PdfTextExtractionPortAdapter.java | de.gecheckt.pdf.umbenenner.adapter.out.pdfextraction | class | PdfTextExtractionPortAdapter
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/sourcedocument/package-info.java | de.gecheckt.pdf.umbenenner.adapter.out.sourcedocument | |
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/sourcedocument/SourceDocumentCandidatesPortAdapter.java | de.gecheckt.pdf.umbenenner.adapter.out.sourcedocument | class | SourceDocumentCandidatesPortAdapter
pdf-umbenenner-adapter-out/src/test/java/de/gecheckt/pdf/umbenenner/adapter/out/configuration/PropertiesConfigurationPortAdapterTest.java | de.gecheckt.pdf.umbenenner.adapter.out.configuration | class | PropertiesConfigurationPortAdapterTest
pdf-umbenenner-adapter-out/src/test/java/de/gecheckt/pdf/umbenenner/adapter/out/lock/FilesystemRunLockPortAdapterTest.java | de.gecheckt.pdf.umbenenner.adapter.out.lock | class | FilesystemRunLockPortAdapterTest
pdf-umbenenner-adapter-out/src/test/java/de/gecheckt/pdf/umbenenner/adapter/out/pdfextraction/PdfTextExtractionPortAdapterTest.java | de.gecheckt.pdf.umbenenner.adapter.out.pdfextraction | class | PdfTextExtractionPortAdapterTest
pdf-umbenenner-adapter-out/src/test/java/de/gecheckt/pdf/umbenenner/adapter/out/sourcedocument/SourceDocumentCandidatesPortAdapterTest.java | de.gecheckt.pdf.umbenenner.adapter.out.sourcedocument | class | SourceDocumentCandidatesPortAdapterTest
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/config/InvalidStartConfigurationException.java | de.gecheckt.pdf.umbenenner.application.config | class | InvalidStartConfigurationException
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/config/package-info.java | de.gecheckt.pdf.umbenenner.application.config | |
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/config/StartConfiguration.java | de.gecheckt.pdf.umbenenner.application.config | record | StartConfiguration
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/config/StartConfigurationValidator.java | de.gecheckt.pdf.umbenenner.application.config | class | StartConfigurationValidator
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/package-info.java | de.gecheckt.pdf.umbenenner.application | |
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/in/BatchRunOutcome.java | de.gecheckt.pdf.umbenenner.application.port.in | enum | BatchRunOutcome
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/in/package-info.java | de.gecheckt.pdf.umbenenner.application.port.in | |
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/in/RunBatchProcessingUseCase.java | de.gecheckt.pdf.umbenenner.application.port.in | interface | RunBatchProcessingUseCase
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/out/ClockPort.java | de.gecheckt.pdf.umbenenner.application.port.out | interface | ClockPort
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/out/ConfigurationPort.java | de.gecheckt.pdf.umbenenner.application.port.out | interface | ConfigurationPort
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/out/package-info.java | de.gecheckt.pdf.umbenenner.application.port.out | |
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/out/PdfTextExtractionPort.java | de.gecheckt.pdf.umbenenner.application.port.out | interface | PdfTextExtractionPort
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/out/RunLockPort.java | de.gecheckt.pdf.umbenenner.application.port.out | interface | RunLockPort
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/out/RunLockUnavailableException.java | de.gecheckt.pdf.umbenenner.application.port.out | class | RunLockUnavailableException
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/out/SourceDocumentAccessException.java | de.gecheckt.pdf.umbenenner.application.port.out | |
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/out/SourceDocumentCandidatesPort.java | de.gecheckt.pdf.umbenenner.application.port.out | interface | SourceDocumentCandidatesPort
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/service/DocumentProcessingService.java | de.gecheckt.pdf.umbenenner.application.service | class | DocumentProcessingService
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/service/package-info.java | de.gecheckt.pdf.umbenenner.application.service | |
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/service/PreCheckEvaluator.java | de.gecheckt.pdf.umbenenner.application.service | class | PreCheckEvaluator
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/usecase/BatchRunProcessingUseCase.java | de.gecheckt.pdf.umbenenner.application.usecase | class | BatchRunProcessingUseCase
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/usecase/package-info.java | de.gecheckt.pdf.umbenenner.application.usecase | |
pdf-umbenenner-application/src/test/java/de/gecheckt/pdf/umbenenner/application/config/StartConfigurationValidatorTest.java | de.gecheckt.pdf.umbenenner.application.config | class | StartConfigurationValidatorTest
pdf-umbenenner-application/src/test/java/de/gecheckt/pdf/umbenenner/application/service/DocumentProcessingServiceTest.java | de.gecheckt.pdf.umbenenner.application.service | class | DocumentProcessingServiceTest
pdf-umbenenner-application/src/test/java/de/gecheckt/pdf/umbenenner/application/service/PreCheckEvaluatorTest.java | de.gecheckt.pdf.umbenenner.application.service | class | PreCheckEvaluatorTest
pdf-umbenenner-application/src/test/java/de/gecheckt/pdf/umbenenner/application/usecase/BatchRunProcessingUseCaseTest.java | de.gecheckt.pdf.umbenenner.application.usecase | class | BatchRunProcessingUseCaseTest
pdf-umbenenner-bootstrap/src/main/java/de/gecheckt/pdf/umbenenner/bootstrap/BootstrapRunner.java | de.gecheckt.pdf.umbenenner.bootstrap | class | BootstrapRunner
pdf-umbenenner-bootstrap/src/main/java/de/gecheckt/pdf/umbenenner/bootstrap/package-info.java | de.gecheckt.pdf.umbenenner.bootstrap | |
pdf-umbenenner-bootstrap/src/main/java/de/gecheckt/pdf/umbenenner/bootstrap/PdfUmbenennerApplication.java | de.gecheckt.pdf.umbenenner.bootstrap | class | PdfUmbenennerApplication
pdf-umbenenner-bootstrap/src/test/java/de/gecheckt/pdf/umbenenner/bootstrap/BootstrapRunnerTest.java | de.gecheckt.pdf.umbenenner.bootstrap | class | BootstrapRunnerTest
pdf-umbenenner-bootstrap/src/test/java/de/gecheckt/pdf/umbenenner/bootstrap/ExecutableJarSmokeTestIT.java | de.gecheckt.pdf.umbenenner.bootstrap | class | ExecutableJarSmokeTestIT
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/BatchRunContext.java | de.gecheckt.pdf.umbenenner.domain.model | |
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/DocumentProcessingOutcome.java | de.gecheckt.pdf.umbenenner.domain.model | |
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/package-info.java | de.gecheckt.pdf.umbenenner.domain.model | |
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/PdfExtractionContentError.java | de.gecheckt.pdf.umbenenner.domain.model | record | PdfExtractionContentError
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/PdfExtractionResult.java | de.gecheckt.pdf.umbenenner.domain.model | |
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/PdfExtractionSuccess.java | de.gecheckt.pdf.umbenenner.domain.model | record | PdfExtractionSuccess
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/PdfExtractionTechnicalError.java | de.gecheckt.pdf.umbenenner.domain.model | record | PdfExtractionTechnicalError
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/PdfPageCount.java | de.gecheckt.pdf.umbenenner.domain.model | record | PdfPageCount
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/PreCheckFailed.java | de.gecheckt.pdf.umbenenner.domain.model | record | PreCheckFailed
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/PreCheckFailureReason.java | de.gecheckt.pdf.umbenenner.domain.model | enum | PreCheckFailureReason
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/PreCheckPassed.java | de.gecheckt.pdf.umbenenner.domain.model | record | PreCheckPassed
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/ProcessingDecision.java | de.gecheckt.pdf.umbenenner.domain.model | |
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/ProcessingStatus.java | de.gecheckt.pdf.umbenenner.domain.model | enum | ProcessingStatus
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/RunId.java | de.gecheckt.pdf.umbenenner.domain.model | |
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/SourceDocumentCandidate.java | de.gecheckt.pdf.umbenenner.domain.model | record | SourceDocumentCandidate
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/SourceDocumentLocator.java | de.gecheckt.pdf.umbenenner.domain.model | record | SourceDocumentLocator
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/TechnicalDocumentError.java | de.gecheckt.pdf.umbenenner.domain.model | record | TechnicalDocumentError
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/package-info.java | de.gecheckt.pdf.umbenenner.domain | |
pdf-umbenenner-domain/src/test/java/de/gecheckt/pdf/umbenenner/domain/model/BatchRunContextTest.java | de.gecheckt.pdf.umbenenner.domain.model | class | BatchRunContextTest
pdf-umbenenner-domain/src/test/java/de/gecheckt/pdf/umbenenner/domain/model/DocumentProcessingOutcomeTest.java | de.gecheckt.pdf.umbenenner.domain.model | class | DocumentProcessingOutcomeTest
pdf-umbenenner-domain/src/test/java/de/gecheckt/pdf/umbenenner/domain/model/ProcessingStatusTest.java | de.gecheckt.pdf.umbenenner.domain.model | class | ProcessingStatusTest
pdf-umbenenner-domain/src/test/java/de/gecheckt/pdf/umbenenner/domain/model/RunIdTest.java | de.gecheckt.pdf.umbenenner.domain.model | class | RunIdTest
@@ -1,17 +1,19 @@
package de.gecheckt.pdf.umbenenner.adapter.in.cli;

import de.gecheckt.pdf.umbenenner.adapter.in.cli.SchedulerBatchCommand;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.time.Instant;

import org.junit.jupiter.api.Test;

import de.gecheckt.pdf.umbenenner.application.port.in.BatchRunOutcome;
import de.gecheckt.pdf.umbenenner.application.port.in.BatchRunProcessingUseCase;
import de.gecheckt.pdf.umbenenner.domain.model.BatchRunContext;
import de.gecheckt.pdf.umbenenner.domain.model.RunId;

import org.junit.jupiter.api.Test;

import java.time.Instant;

import static org.junit.jupiter.api.Assertions.*;

/**
 * Unit tests for {@link SchedulerBatchCommand}.
 * <p>
@@ -41,6 +41,17 @@
		</dependency>

		<!-- Test dependencies -->
		<dependency>
			<groupId>org.apache.logging.log4j</groupId>
			<artifactId>log4j-core</artifactId>
			<scope>test</scope>
		</dependency>
		<dependency>
			<groupId>org.apache.logging.log4j</groupId>
			<artifactId>log4j-slf4j-impl</artifactId>
			<version>${log4j.version}</version>
			<scope>test</scope>
		</dependency>
		<dependency>
			<groupId>org.junit.jupiter</groupId>
			<artifactId>junit-jupiter</artifactId>
@@ -65,6 +76,18 @@

	<build>
		<plugins>
			<plugin>
				<groupId>org.pitest</groupId>
				<artifactId>pitest-maven</artifactId>
				<configuration>
					<!-- Exclude heavy pipeline integration tests from mutation analysis.
					     These tests run the full batch pipeline (SQLite, PDFBox, filesystem)
					     and exceed PIT minion timeouts. They remain in the normal surefire run. -->
					<excludedTestClasses>
						<param>de.gecheckt.pdf.umbenenner.adapter.out.ai.AnthropicClaudeAdapterIntegrationTest</param>
					</excludedTestClasses>
				</configuration>
			</plugin>
			<plugin>
				<groupId>org.jacoco</groupId>
				<artifactId>jacoco-maven-plugin</artifactId>
@@ -0,0 +1,394 @@
package de.gecheckt.pdf.umbenenner.adapter.out.ai;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Objects;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.json.JSONArray;
import org.json.JSONException;
import org.json.JSONObject;

import de.gecheckt.pdf.umbenenner.application.config.provider.ProviderConfiguration;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationPort;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationResult;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationSuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationTechnicalFailure;
import de.gecheckt.pdf.umbenenner.domain.model.AiRawResponse;
import de.gecheckt.pdf.umbenenner.domain.model.AiRequestRepresentation;

/**
 * Adapter implementing the native Anthropic Messages API for AI service invocation.
 * <p>
 * This adapter:
 * <ul>
 *   <li>Translates an abstract {@link AiRequestRepresentation} into an Anthropic
 *       Messages API request (POST {@code /v1/messages})</li>
 *   <li>Configures HTTP connection, timeout, and authentication from the provider
 *       configuration using the Anthropic-specific authentication scheme
 *       ({@code x-api-key} header, not {@code Authorization: Bearer})</li>
 *   <li>Extracts the response text by concatenating all {@code text}-type content
 *       blocks from the Anthropic response, returning the result as a raw response
 *       for Application-layer parsing and validation</li>
 *   <li>Classifies technical failures (HTTP errors, timeouts, missing content blocks,
 *       unparseable JSON) according to the existing transient error semantics</li>
 * </ul>
 *
 * <h2>Configuration</h2>
 * <ul>
 *   <li>{@code baseUrl} — the HTTP(S) base URL; defaults to {@code https://api.anthropic.com}
 *       when absent or blank</li>
 *   <li>{@code model} — the Claude model identifier (e.g., {@code claude-3-5-sonnet-20241022})</li>
 *   <li>{@code timeoutSeconds} — connection and read timeout in seconds</li>
 *   <li>{@code apiKey} — the authentication token, resolved from environment variable
 *       {@code ANTHROPIC_API_KEY} or property {@code ai.provider.claude.apiKey};
 *       environment variable takes precedence (resolved by the configuration layer
 *       before this adapter is constructed)</li>
 * </ul>
 *
 * <h2>HTTP request structure</h2>
 * <p>
 * The adapter sends a POST request to {@code {baseUrl}/v1/messages} with:
 * <ul>
 *   <li>Header {@code x-api-key} containing the resolved API key</li>
 *   <li>Header {@code anthropic-version: 2023-06-01}</li>
 *   <li>Header {@code content-type: application/json}</li>
 *   <li>JSON body containing:
 *     <ul>
 *       <li>{@code model} — the configured model name</li>
 *       <li>{@code max_tokens} — fixed at 1024; sufficient for the expected JSON response
 *           without requiring a separate configuration property</li>
 *       <li>{@code system} — the prompt content (if non-blank); Anthropic uses a
 *           top-level field instead of a {@code role=system} message</li>
 *       <li>{@code messages} — an array with exactly one {@code user} message containing
 *           the document text</li>
 *     </ul>
 *   </li>
 * </ul>
 *
 * <h2>Response handling</h2>
 * <ul>
 *   <li><strong>HTTP 200:</strong> All {@code content} blocks with {@code type=="text"}
 *       are concatenated in order; the result is returned as {@link AiInvocationSuccess}
 *       with an {@link AiRawResponse} containing the concatenated text. The Application
 *       layer then parses and validates this text as a NamingProposal JSON object.</li>
 *   <li><strong>No text blocks in HTTP 200 response:</strong> Classified as a technical
 *       failure; the Application layer cannot derive a naming proposal without text.</li>
 *   <li><strong>Unparseable response JSON:</strong> Classified as a technical failure.</li>
 *   <li><strong>HTTP non-200:</strong> Classified as a technical failure.</li>
 * </ul>
 *
 * <h2>Technical error classification</h2>
 * <p>
 * All errors are mapped to {@link AiInvocationTechnicalFailure} and follow the existing
 * transient error semantics. No new error categories are introduced:
 * <ul>
 *   <li>HTTP 4xx (including 401, 403, 429) and 5xx — technical failure</li>
 *   <li>Connection timeout, read timeout — {@code TIMEOUT}</li>
 *   <li>Connection failure — {@code CONNECTION_ERROR}</li>
 *   <li>DNS failure — {@code DNS_ERROR}</li>
 *   <li>IO errors — {@code IO_ERROR}</li>
 *   <li>Interrupted operation — {@code INTERRUPTED}</li>
 *   <li>JSON not parseable — {@code UNPARSEABLE_JSON}</li>
 *   <li>No {@code text}-type content block in response — {@code NO_TEXT_CONTENT}</li>
 * </ul>
 *
 * <h2>Non-goals</h2>
 * <ul>
 *   <li>NamingProposal JSON parsing or validation — the Application layer owns this</li>
 *   <li>Retry logic — this adapter executes a single request only</li>
 *   <li>Shared implementation with the OpenAI-compatible adapter — no common base class</li>
 * </ul>
 */
public class AnthropicClaudeHttpAdapter implements AiInvocationPort {

    private static final Logger LOG = LogManager.getLogger(AnthropicClaudeHttpAdapter.class);

    private static final String MESSAGES_ENDPOINT = "/v1/messages";
    private static final String ANTHROPIC_VERSION_HEADER = "anthropic-version";
    private static final String ANTHROPIC_VERSION_VALUE = "2023-06-01";
    private static final String API_KEY_HEADER = "x-api-key";
    private static final String CONTENT_TYPE = "application/json";
    private static final String DEFAULT_BASE_URL = "https://api.anthropic.com";

    /**
     * Fixed max_tokens value for the Anthropic request.
     * <p>
     * This value is sufficient for the expected NamingProposal JSON response
     * ({@code date}, {@code title}, {@code reasoning}) without requiring a separate
     * configuration property. Anthropic's API requires this field to be present.
     */
    private static final int MAX_TOKENS = 1024;

    private final HttpClient httpClient;
    private final URI apiBaseUrl;
    private final String apiModel;
    private final String apiKey;
    private final int apiTimeoutSeconds;

    // Test-only field to capture the last built JSON body for assertion
    private volatile String lastBuiltJsonBody;

    /**
     * Creates an adapter from the Claude provider configuration.
     * <p>
     * If {@code config.baseUrl()} is absent or blank, the default Anthropic endpoint
     * {@code https://api.anthropic.com} is used. The HTTP client is initialized with
     * the configured timeout.
     *
     * @param config the provider configuration for the Claude family; must not be null
     * @throws NullPointerException if config is null
     * @throws IllegalArgumentException if the model is missing or blank
     */
    public AnthropicClaudeHttpAdapter(ProviderConfiguration config) {
        this(config, HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(config.timeoutSeconds()))
                .build());
    }

    /**
     * Creates an adapter with a custom HTTP client (primarily for testing).
     * <p>
     * This constructor allows tests to inject a mock or configurable HTTP client
     * while keeping configuration validation consistent with the production constructor.
     * <p>
     * <strong>For testing only:</strong> This is package-private to remain internal to the adapter.
     *
     * @param config the provider configuration; must not be null
     * @param httpClient the HTTP client to use; must not be null
     * @throws NullPointerException if config or httpClient is null
     * @throws IllegalArgumentException if the model is missing or blank
     */
    AnthropicClaudeHttpAdapter(ProviderConfiguration config, HttpClient httpClient) {
        Objects.requireNonNull(config, "config must not be null");
        Objects.requireNonNull(httpClient, "httpClient must not be null");
        if (config.model() == null || config.model().isBlank()) {
            throw new IllegalArgumentException("API model must not be null or empty");
        }

        String baseUrlStr = (config.baseUrl() != null && !config.baseUrl().isBlank())
                ? config.baseUrl()
                : DEFAULT_BASE_URL;

        this.apiBaseUrl = URI.create(baseUrlStr);
        this.apiModel = config.model();
        this.apiKey = config.apiKey() != null ? config.apiKey() : "";
        this.apiTimeoutSeconds = config.timeoutSeconds();
        this.httpClient = httpClient;

        LOG.debug("AnthropicClaudeHttpAdapter initialized with base URL: {}, model: {}, timeout: {}s",
                apiBaseUrl, apiModel, apiTimeoutSeconds);
    }

    /**
     * Invokes the Anthropic Claude AI service with the given request.
     * <p>
     * Constructs an Anthropic Messages API request from the request representation,
     * executes it, extracts the text content from the response, and returns either
     * a successful response or a classified technical failure.
     *
     * @param request the AI request with prompt and document text; must not be null
||||
* @return an {@link AiInvocationResult} encoding either success (with extracted text)
|
||||
* or a technical failure with classified reason
|
||||
* @throws NullPointerException if request is null
|
||||
*/
|
||||
@Override
|
||||
public AiInvocationResult invoke(AiRequestRepresentation request) {
|
||||
Objects.requireNonNull(request, "request must not be null");
|
||||
|
||||
try {
|
||||
HttpRequest httpRequest = buildRequest(request);
|
||||
HttpResponse<String> response = executeRequest(httpRequest);
|
||||
|
||||
if (response.statusCode() == 200) {
|
||||
return extractTextFromResponse(request, response.body());
|
||||
} else {
|
||||
String reason = "HTTP_" + response.statusCode();
|
||||
String message = "Anthropic AI service returned status " + response.statusCode();
|
||||
LOG.warn("Claude AI invocation returned non-200 status: {}", response.statusCode());
|
||||
return new AiInvocationTechnicalFailure(request, reason, message);
|
||||
}
|
||||
} catch (java.net.http.HttpTimeoutException e) {
|
||||
String message = "HTTP timeout: " + e.getClass().getSimpleName();
|
||||
LOG.warn("Claude AI invocation timeout: {}", message);
|
||||
return new AiInvocationTechnicalFailure(request, "TIMEOUT", message);
|
||||
} catch (java.net.ConnectException e) {
|
||||
String message = "Failed to connect to endpoint: " + e.getMessage();
|
||||
LOG.warn("Claude AI invocation connection error: {}", message);
|
||||
return new AiInvocationTechnicalFailure(request, "CONNECTION_ERROR", message);
|
||||
} catch (java.net.UnknownHostException e) {
|
||||
String message = "Endpoint hostname not resolvable: " + e.getMessage();
|
||||
LOG.warn("Claude AI invocation DNS error: {}", message);
|
||||
return new AiInvocationTechnicalFailure(request, "DNS_ERROR", message);
|
||||
} catch (java.io.IOException e) {
|
||||
String message = "IO error during AI invocation: " + e.getMessage();
|
||||
LOG.warn("Claude AI invocation IO error: {}", message);
|
||||
return new AiInvocationTechnicalFailure(request, "IO_ERROR", message);
|
||||
} catch (InterruptedException e) {
|
||||
Thread.currentThread().interrupt();
|
||||
String message = "AI invocation interrupted: " + e.getMessage();
|
||||
LOG.warn("Claude AI invocation interrupted: {}", message);
|
||||
return new AiInvocationTechnicalFailure(request, "INTERRUPTED", message);
|
||||
} catch (Exception e) {
|
||||
String message = "Unexpected error during AI invocation: " + e.getClass().getSimpleName()
|
||||
+ " - " + e.getMessage();
|
||||
LOG.error("Unexpected error in Claude AI invocation", e);
|
||||
return new AiInvocationTechnicalFailure(request, "UNEXPECTED_ERROR", message);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Builds an Anthropic Messages API request from the request representation.
|
||||
* <p>
|
||||
* Constructs:
|
||||
* <ul>
|
||||
* <li>Endpoint URL: {@code {apiBaseUrl}/v1/messages}</li>
|
||||
* <li>Headers: {@code x-api-key}, {@code anthropic-version: 2023-06-01},
|
||||
* {@code content-type: application/json}</li>
|
||||
* <li>Body: JSON with {@code model}, {@code max_tokens}, optional {@code system}
|
||||
* (prompt content), and {@code messages} with a single user message
|
||||
* (document text)</li>
|
||||
* <li>Timeout: configured timeout from provider configuration</li>
|
||||
* </ul>
|
||||
*
|
||||
* @param request the request representation with prompt and document text
|
||||
* @return an {@link HttpRequest} ready to send
|
||||
*/
|
||||
private HttpRequest buildRequest(AiRequestRepresentation request) {
|
||||
URI endpoint = buildEndpointUri();
|
||||
String requestBody = buildJsonRequestBody(request);
|
||||
// Capture for test inspection (test-only field)
|
||||
this.lastBuiltJsonBody = requestBody;
|
||||
|
||||
return HttpRequest.newBuilder(endpoint)
|
||||
.header("content-type", CONTENT_TYPE)
|
||||
.header(API_KEY_HEADER, apiKey)
|
||||
.header(ANTHROPIC_VERSION_HEADER, ANTHROPIC_VERSION_VALUE)
|
||||
.POST(HttpRequest.BodyPublishers.ofString(requestBody))
|
||||
.timeout(Duration.ofSeconds(apiTimeoutSeconds))
|
||||
.build();
|
||||
}
|
||||
|
||||
/**
|
||||
* Composes the endpoint URI from the configured base URL.
|
||||
* <p>
|
||||
* Resolves {@code {apiBaseUrl}/v1/messages}.
|
||||
*
|
||||
* @return the complete endpoint URI
|
||||
*/
|
||||
private URI buildEndpointUri() {
|
||||
String endpointPath = apiBaseUrl.getPath().replaceAll("/$", "") + MESSAGES_ENDPOINT;
|
||||
return URI.create(apiBaseUrl.getScheme() + "://" +
|
||||
apiBaseUrl.getHost() +
|
||||
(apiBaseUrl.getPort() > 0 ? ":" + apiBaseUrl.getPort() : "") +
|
||||
endpointPath);
|
||||
}
|
||||
|
||||
/**
|
||||
* Builds the JSON request body for the Anthropic Messages API.
|
||||
* <p>
|
||||
* The body contains:
|
||||
* <ul>
|
||||
* <li>{@code model} — the configured model name</li>
|
||||
* <li>{@code max_tokens} — fixed value sufficient for the expected response</li>
|
||||
* <li>{@code system} — the prompt content as a top-level field (only when non-blank;
|
||||
* Anthropic does not accept {@code role=system} inside the {@code messages} array)</li>
|
||||
* <li>{@code messages} — an array with exactly one user message containing the
|
||||
* document text</li>
|
||||
* </ul>
|
||||
* <p>
|
||||
* <strong>Package-private for testing:</strong> This method is accessible to tests
|
||||
* in the same package to verify the actual JSON body structure and content.
|
||||
*
|
||||
* @param request the request with prompt and document text
|
||||
* @return JSON string ready to send in HTTP body
|
||||
*/
|
||||
String buildJsonRequestBody(AiRequestRepresentation request) {
|
||||
JSONObject body = new JSONObject();
|
||||
body.put("model", apiModel);
|
||||
body.put("max_tokens", MAX_TOKENS);
|
||||
|
||||
// Prompt content goes to the top-level system field (not a role=system message)
|
||||
if (request.promptContent() != null && !request.promptContent().isBlank()) {
|
||||
body.put("system", request.promptContent());
|
||||
}
|
||||
|
||||
JSONObject userMessage = new JSONObject();
|
||||
userMessage.put("role", "user");
|
||||
userMessage.put("content", request.documentText());
|
||||
body.put("messages", new JSONArray().put(userMessage));
|
||||
|
||||
return body.toString();
|
||||
}
|
||||
|
||||
/**
|
||||
* Extracts the text content from a successful (HTTP 200) Anthropic response.
|
||||
* <p>
|
||||
* Concatenates all {@code content} blocks with {@code type=="text"} in order.
|
||||
* Blocks of other types (e.g., tool use) are ignored.
|
||||
* If no {@code text} blocks are present, a technical failure is returned.
|
||||
*
|
||||
* @param request the original request (carried through to the result)
|
||||
* @param responseBody the raw HTTP response body
|
||||
* @return success with the concatenated text, or a technical failure
|
||||
*/
|
||||
private AiInvocationResult extractTextFromResponse(AiRequestRepresentation request, String responseBody) {
|
||||
try {
|
||||
JSONObject json = new JSONObject(responseBody);
|
||||
JSONArray contentArray = json.getJSONArray("content");
|
||||
|
||||
StringBuilder textBuilder = new StringBuilder();
|
||||
for (int i = 0; i < contentArray.length(); i++) {
|
||||
JSONObject block = contentArray.getJSONObject(i);
|
||||
if ("text".equals(block.optString("type"))) {
|
||||
textBuilder.append(block.getString("text"));
|
||||
}
|
||||
}
|
||||
|
||||
String extractedText = textBuilder.toString();
|
||||
if (extractedText.isEmpty()) {
|
||||
LOG.warn("Claude AI response contained no text-type content blocks");
|
||||
return new AiInvocationTechnicalFailure(request, "NO_TEXT_CONTENT",
|
||||
"Anthropic response contained no text-type content blocks");
|
||||
}
|
||||
|
||||
return new AiInvocationSuccess(request, new AiRawResponse(extractedText));
|
||||
} catch (JSONException e) {
|
||||
LOG.warn("Claude AI response could not be parsed as JSON: {}", e.getMessage());
|
||||
return new AiInvocationTechnicalFailure(request, "UNPARSEABLE_JSON",
|
||||
"Anthropic response body is not valid JSON: " + e.getMessage());
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Package-private accessor for the last constructed JSON body.
|
||||
* <p>
|
||||
* <strong>For testing only:</strong> Allows tests to verify the actual
|
||||
* JSON body sent in HTTP requests without exposing the BodyPublisher internals.
|
||||
*
|
||||
* @return the last JSON body string constructed by {@link #buildRequest(AiRequestRepresentation)},
|
||||
* or null if no request has been built yet
|
||||
*/
|
||||
String getLastBuiltJsonBodyForTesting() {
|
||||
return lastBuiltJsonBody;
|
||||
}
|
||||
|
||||
/**
|
||||
* Executes the HTTP request and returns the response.
|
||||
*
|
||||
* @param httpRequest the HTTP request to execute
|
||||
* @return the HTTP response with status code and body
|
||||
* @throws java.net.http.HttpTimeoutException if the request times out
|
||||
* @throws java.net.ConnectException if connection fails
|
||||
* @throws java.io.IOException on other IO errors
|
||||
* @throws InterruptedException if the request is interrupted
|
||||
*/
|
||||
private HttpResponse<String> executeRequest(HttpRequest httpRequest)
|
||||
throws java.io.IOException, InterruptedException {
|
||||
return httpClient.send(httpRequest, HttpResponse.BodyHandlers.ofString());
|
||||
}
|
||||
}
|
||||
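The endpoint composition in `buildEndpointUri` above can be exercised in isolation. The following is a standalone sketch (the class name `EndpointUriDemo` is invented for illustration, not part of the change) that mirrors the same logic: strip a single trailing slash from the base URL's path, append the fixed `/v1/messages` suffix, and preserve scheme, host, and any explicit port.

```java
import java.net.URI;

// Standalone sketch of the adapter's endpoint composition logic.
public class EndpointUriDemo {

    // Strip one trailing slash from the base path, then append the fixed suffix.
    static URI compose(URI base) {
        String path = base.getPath().replaceAll("/$", "") + "/v1/messages";
        return URI.create(base.getScheme() + "://"
                + base.getHost()
                + (base.getPort() > 0 ? ":" + base.getPort() : "")
                + path);
    }

    public static void main(String[] args) {
        // No path, no port: https://api.anthropic.com/v1/messages
        System.out.println(compose(URI.create("https://api.anthropic.com")));
        // Proxy prefix with trailing slash and explicit port:
        // http://localhost:8080/proxy/v1/messages
        System.out.println(compose(URI.create("http://localhost:8080/proxy/")));
    }
}
```

The `getPort() > 0` guard works because `java.net.URI#getPort()` returns `-1` when no port is present in the URI.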
@@ -11,7 +11,7 @@ import org.apache.logging.log4j.LogManager;
 import org.apache.logging.log4j.Logger;
 import org.json.JSONObject;
 
-import de.gecheckt.pdf.umbenenner.application.config.startup.StartConfiguration;
+import de.gecheckt.pdf.umbenenner.application.config.provider.ProviderConfiguration;
 import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationPort;
 import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationResult;
 import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationSuccess;
@@ -26,7 +26,7 @@ import de.gecheckt.pdf.umbenenner.domain.model.AiRequestRepresentation;
  * <ul>
  * <li>Translates an abstract {@link AiRequestRepresentation} into an OpenAI Chat
  *     Completions API request</li>
- * <li>Configures HTTP connection, timeout, and authentication from the startup configuration</li>
+ * <li>Configures HTTP connection, timeout, and authentication from the provider configuration</li>
  * <li>Executes the HTTP request against the configured AI endpoint</li>
  * <li>Distinguishes between successful HTTP responses (200) and technical failures
  *     (timeout, unreachable, connection error, etc.)</li>
@@ -36,16 +36,16 @@ import de.gecheckt.pdf.umbenenner.domain.model.AiRequestRepresentation;
  * <p>
  * <strong>Configuration:</strong>
  * <ul>
- * <li>{@code apiBaseUrl} — the HTTP(S) base URL of the AI service endpoint</li>
- * <li>{@code apiModel} — the model identifier requested from the AI service</li>
- * <li>{@code apiTimeoutSeconds} — connection and read timeout in seconds</li>
- * <li>{@code apiKey} — the authentication token (already resolved from environment
- *     variable {@code PDF_UMBENENNER_API_KEY} or property {@code api.key},
+ * <li>{@code baseUrl} — the HTTP(S) base URL of the AI service endpoint</li>
+ * <li>{@code model} — the model identifier requested from the AI service</li>
+ * <li>{@code timeoutSeconds} — connection and read timeout in seconds</li>
+ * <li>{@code apiKey} — the authentication token (resolved from environment variable
+ *     {@code OPENAI_COMPATIBLE_API_KEY} or property {@code ai.provider.openai-compatible.apiKey},
  *     environment variable takes precedence)</li>
  * </ul>
  * <p>
  * <strong>HTTP request structure:</strong>
- * The adapter sends a POST request to the endpoint {@code {apiBaseUrl}/v1/chat/completions}
+ * The adapter sends a POST request to the endpoint {@code {baseUrl}/v1/chat/completions}
  * with:
  * <ul>
  * <li>Authorization header containing the API key</li>
@@ -106,19 +106,18 @@ public class OpenAiHttpAdapter implements AiInvocationPort {
     private volatile String lastBuiltJsonBody;
 
     /**
-     * Creates an adapter with configuration from startup configuration.
+     * Creates an adapter from the OpenAI-compatible provider configuration.
      * <p>
-     * The adapter initializes an HTTP client with the configured timeout and creates
-     * the endpoint URL from the base URL. Configuration values are validated for
-     * null/empty during initialization.
+     * The adapter initializes an HTTP client with the configured timeout and parses
+     * the endpoint URI from the configured base URL string.
      *
-     * @param config the startup configuration containing API settings; must not be null
+     * @param config the provider configuration for the OpenAI-compatible family; must not be null
      * @throws NullPointerException if config is null
-     * @throws IllegalArgumentException if API base URL or model is missing/empty
+     * @throws IllegalArgumentException if the base URL or model is missing/blank
      */
-    public OpenAiHttpAdapter(StartConfiguration config) {
+    public OpenAiHttpAdapter(ProviderConfiguration config) {
         this(config, HttpClient.newBuilder()
-                .connectTimeout(Duration.ofSeconds(config.apiTimeoutSeconds()))
+                .connectTimeout(Duration.ofSeconds(config.timeoutSeconds()))
                 .build());
     }
 
@@ -130,25 +129,25 @@ public class OpenAiHttpAdapter implements AiInvocationPort {
      * <p>
      * <strong>For testing only:</strong> This is package-private to remain internal to the adapter.
      *
-     * @param config the startup configuration containing API settings; must not be null
+     * @param config the provider configuration; must not be null
      * @param httpClient the HTTP client to use; must not be null
      * @throws NullPointerException if config or httpClient is null
-     * @throws IllegalArgumentException if API base URL or model is missing/empty
+     * @throws IllegalArgumentException if the base URL or model is missing/blank
      */
-    OpenAiHttpAdapter(StartConfiguration config, HttpClient httpClient) {
+    OpenAiHttpAdapter(ProviderConfiguration config, HttpClient httpClient) {
         Objects.requireNonNull(config, "config must not be null");
         Objects.requireNonNull(httpClient, "httpClient must not be null");
-        if (config.apiBaseUrl() == null) {
+        if (config.baseUrl() == null || config.baseUrl().isBlank()) {
             throw new IllegalArgumentException("API base URL must not be null");
         }
-        if (config.apiModel() == null || config.apiModel().isBlank()) {
+        if (config.model() == null || config.model().isBlank()) {
             throw new IllegalArgumentException("API model must not be null or empty");
         }
 
-        this.apiBaseUrl = config.apiBaseUrl();
-        this.apiModel = config.apiModel();
+        this.apiBaseUrl = URI.create(config.baseUrl());
+        this.apiModel = config.model();
         this.apiKey = config.apiKey() != null ? config.apiKey() : "";
-        this.apiTimeoutSeconds = config.apiTimeoutSeconds();
+        this.apiTimeoutSeconds = config.timeoutSeconds();
         this.httpClient = httpClient;
 
         LOG.debug("OpenAiHttpAdapter initialized with base URL: {}, model: {}, timeout: {}s",
@@ -229,7 +228,7 @@ public class OpenAiHttpAdapter implements AiInvocationPort {
      * <li>Endpoint URL: {@code {apiBaseUrl}/v1/chat/completions}</li>
      * <li>Headers: Authorization with Bearer token, Content-Type application/json</li>
      * <li>Body: JSON with model, messages (system = prompt, user = document text)</li>
-     * <li>Timeout: configured timeout from startup configuration</li>
+     * <li>Timeout: configured timeout from provider configuration</li>
      * </ul>
      *
      * @param request the request representation with prompt and document text
@@ -1,15 +1,17 @@
 package de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation;
 
-import de.gecheckt.pdf.umbenenner.application.config.startup.StartConfiguration;
-import org.apache.logging.log4j.LogManager;
-import org.apache.logging.log4j.Logger;
-
 import java.io.IOException;
 import java.nio.file.Files;
 import java.nio.file.Path;
 import java.util.ArrayList;
 import java.util.List;
 
+import org.apache.logging.log4j.LogManager;
+import org.apache.logging.log4j.Logger;
+
+import de.gecheckt.pdf.umbenenner.application.config.startup.StartConfiguration;
+
+
 /**
  * Validates {@link StartConfiguration} before processing can begin.
  * <p>
@@ -156,13 +158,13 @@ public class StartConfigurationValidator {
         validateSourceFolder(config.sourceFolder(), errors);
         validateTargetFolder(config.targetFolder(), errors);
         validateSqliteFile(config.sqliteFile(), errors);
-        validateApiBaseUrl(config.apiBaseUrl(), errors);
-        validateApiModel(config.apiModel(), errors);
         validatePromptTemplateFile(config.promptTemplateFile(), errors);
+        if (config.multiProviderConfiguration() == null) {
+            errors.add("- ai provider configuration: must not be null");
+        }
     }
 
     private void validateNumericConstraints(StartConfiguration config, List<String> errors) {
         validateApiTimeoutSeconds(config.apiTimeoutSeconds(), errors);
         validateMaxRetriesTransient(config.maxRetriesTransient(), errors);
         validateMaxPages(config.maxPages(), errors);
         validateMaxTextCharacters(config.maxTextCharacters(), errors);
@@ -199,36 +201,9 @@ public class StartConfigurationValidator {
         validateRequiredFileParentDirectory(sqliteFile, "sqlite.file", errors);
     }
 
-    private void validateApiBaseUrl(java.net.URI apiBaseUrl, List<String> errors) {
-        if (apiBaseUrl == null) {
-            errors.add("- api.baseUrl: must not be null");
-            return;
-        }
-        if (!apiBaseUrl.isAbsolute()) {
-            errors.add("- api.baseUrl: must be an absolute URI: " + apiBaseUrl);
-            return;
-        }
-        String scheme = apiBaseUrl.getScheme();
-        if (scheme == null || (!"http".equalsIgnoreCase(scheme) && !"https".equalsIgnoreCase(scheme))) {
-            errors.add("- api.baseUrl: scheme must be http or https, got: " + scheme);
-        }
-    }
-
-    private void validateApiModel(String apiModel, List<String> errors) {
-        if (apiModel == null || apiModel.isBlank()) {
-            errors.add("- api.model: must not be null or blank");
-        }
-    }
-
-    private void validateApiTimeoutSeconds(int apiTimeoutSeconds, List<String> errors) {
-        if (apiTimeoutSeconds <= 0) {
-            errors.add("- api.timeoutSeconds: must be > 0, got: " + apiTimeoutSeconds);
-        }
-    }
-
     private void validateMaxRetriesTransient(int maxRetriesTransient, List<String> errors) {
-        if (maxRetriesTransient < 0) {
-            errors.add("- max.retries.transient: must be >= 0, got: " + maxRetriesTransient);
+        if (maxRetriesTransient < 1) {
+            errors.add("- max.retries.transient: must be >= 1, got: " + maxRetriesTransient);
         }
     }
@@ -0,0 +1,18 @@
/**
 * Outbound adapter for system time access.
 * <p>
 * Components:
 * <ul>
 * <li>{@link de.gecheckt.pdf.umbenenner.adapter.out.clock.SystemClockAdapter}
 *     — Production implementation of {@link de.gecheckt.pdf.umbenenner.application.port.out.ClockPort}
 *     that delegates to the JVM system clock ({@code Instant.now()}).</li>
 * </ul>
 * <p>
 * The {@link de.gecheckt.pdf.umbenenner.application.port.out.ClockPort} abstraction ensures that
 * all application-layer and domain-layer code obtains the current instant through the port,
 * enabling deterministic time injection in tests without coupling to wall-clock time.
 * <p>
 * No date/time logic or formatting is performed in this package; that responsibility
 * belongs to the application layer.
 */
package de.gecheckt.pdf.umbenenner.adapter.out.clock;
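The package-info above describes the ClockPort idea without showing its signature. The sketch below assumes a single `Instant`-returning `now()` method (the real port's interface is not shown in this diff, so `ClockPort`, `SYSTEM`, and `fixed` here are illustrative names only) to show how a fixed clock makes time-dependent logic deterministic in tests.

```java
import java.time.Instant;

// Hypothetical sketch of the ClockPort pattern; the production port may differ.
public class ClockPortDemo {

    interface ClockPort {
        Instant now();
    }

    // Production wiring: delegate to the JVM system clock.
    static final ClockPort SYSTEM = Instant::now;

    // Test double: always returns the same instant.
    static ClockPort fixed(Instant instant) {
        return () -> instant;
    }

    public static void main(String[] args) {
        ClockPort testClock = fixed(Instant.parse("2024-01-15T10:00:00Z"));
        System.out.println(testClock.now()); // deterministic in every run
    }
}
```

Code under test receives the port via its constructor, so production uses `SYSTEM` while tests inject `fixed(...)` without touching wall-clock time.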
@@ -0,0 +1,306 @@
package de.gecheckt.pdf.umbenenner.adapter.out.configuration;

import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException;
import de.gecheckt.pdf.umbenenner.application.config.provider.MultiProviderConfiguration;

/**
 * Detects and migrates a legacy flat-key configuration file to the multi-provider schema.
 *
 * <h2>Legacy form</h2>
 * A configuration file is considered legacy if it contains at least one of the flat property
 * keys ({@code api.baseUrl}, {@code api.model}, {@code api.timeoutSeconds}, {@code api.key})
 * and does <em>not</em> already contain {@code ai.provider.active}.
 *
 * <h2>Migration procedure</h2>
 * <ol>
 * <li>Detect legacy form; if absent, return immediately without any I/O side effect.</li>
 * <li>Create a {@code .bak} backup of the original file before any changes. If a {@code .bak}
 *     file already exists, a numbered suffix is appended ({@code .bak.1}, {@code .bak.2}, …).
 *     Existing backups are never overwritten.</li>
 * <li>Rewrite the file:
 *     <ul>
 *     <li>{@code api.baseUrl} → {@code ai.provider.openai-compatible.baseUrl}</li>
 *     <li>{@code api.model} → {@code ai.provider.openai-compatible.model}</li>
 *     <li>{@code api.timeoutSeconds} → {@code ai.provider.openai-compatible.timeoutSeconds}</li>
 *     <li>{@code api.key} → {@code ai.provider.openai-compatible.apiKey}</li>
 *     <li>{@code ai.provider.active=openai-compatible} is appended.</li>
 *     <li>A commented placeholder section for the Claude provider is appended.</li>
 *     <li>All other keys are carried over unchanged in stable order.</li>
 *     </ul>
 * </li>
 * <li>Write the migrated content via a temporary file ({@code <file>.tmp}) followed by an
 *     atomic move/rename. The original file is never partially overwritten.</li>
 * <li>Reload the migrated file and validate it with {@link MultiProviderConfigurationParser}
 *     and {@link MultiProviderConfigurationValidator}. If validation fails, a
 *     {@link ConfigurationLoadingException} is thrown; the {@code .bak} is preserved.</li>
 * </ol>
 */
public class LegacyConfigurationMigrator {

    private static final Logger LOG = LogManager.getLogger(LegacyConfigurationMigrator.class);

    /** Legacy flat key for base URL, replaced during migration. */
    static final String LEGACY_BASE_URL = "api.baseUrl";

    /** Legacy flat key for model name, replaced during migration. */
    static final String LEGACY_MODEL = "api.model";

    /** Legacy flat key for timeout, replaced during migration. */
    static final String LEGACY_TIMEOUT = "api.timeoutSeconds";

    /** Legacy flat key for API key, replaced during migration. */
    static final String LEGACY_API_KEY = "api.key";

    private static final String[][] LEGACY_KEY_MAPPINGS = {
            {LEGACY_BASE_URL, "ai.provider.openai-compatible.baseUrl"},
            {LEGACY_MODEL, "ai.provider.openai-compatible.model"},
            {LEGACY_TIMEOUT, "ai.provider.openai-compatible.timeoutSeconds"},
            {LEGACY_API_KEY, "ai.provider.openai-compatible.apiKey"},
    };

    private final MultiProviderConfigurationParser parser;
    private final MultiProviderConfigurationValidator validator;

    /**
     * Creates a migrator backed by default parser and validator instances.
     */
    public LegacyConfigurationMigrator() {
        this(new MultiProviderConfigurationParser(), new MultiProviderConfigurationValidator());
    }

    /**
     * Creates a migrator with injected parser and validator.
     * <p>
     * Intended for testing, where a controlled (e.g. always-failing) validator can be supplied
     * to verify that the {@code .bak} backup is preserved when post-migration validation fails.
     *
     * @param parser parser used to re-read the migrated file; must not be {@code null}
     * @param validator validator used to verify the migrated file; must not be {@code null}
     */
    public LegacyConfigurationMigrator(MultiProviderConfigurationParser parser,
            MultiProviderConfigurationValidator validator) {
        this.parser = parser;
        this.validator = validator;
    }

    /**
     * Migrates the configuration file at {@code configFilePath} if it is in legacy form.
     * <p>
     * If the file does not contain legacy flat keys or already contains
     * {@code ai.provider.active}, this method returns immediately without any I/O side effect.
     *
     * @param configFilePath path to the configuration file; must exist and be readable
     * @throws ConfigurationLoadingException if the file cannot be read, the backup cannot be
     *         created, the migrated file cannot be written, or post-migration validation fails
     */
    public void migrateIfLegacy(Path configFilePath) {
        String originalContent = readFile(configFilePath);
        Properties props = parsePropertiesFromContent(originalContent);

        if (!isLegacyForm(props)) {
            return;
        }

        LOG.info("Legacy configuration format detected. Migrating: {}", configFilePath);

        createBakBackup(configFilePath, originalContent);

        String migratedContent = generateMigratedContent(originalContent);
        writeAtomically(configFilePath, migratedContent);

        LOG.info("Configuration file migrated to multi-provider schema: {}", configFilePath);

        validateMigratedFile(configFilePath);
    }

    /**
     * Returns {@code true} if the given properties are in legacy form.
     * <p>
     * A properties set is considered legacy when it contains at least one of the four
     * flat legacy keys and does not already contain {@code ai.provider.active}.
     *
     * @param props the parsed properties to inspect; must not be {@code null}
     * @return {@code true} if migration is required, {@code false} otherwise
     */
    boolean isLegacyForm(Properties props) {
        boolean hasLegacyKey = props.containsKey(LEGACY_BASE_URL)
                || props.containsKey(LEGACY_MODEL)
                || props.containsKey(LEGACY_TIMEOUT)
                || props.containsKey(LEGACY_API_KEY);
        boolean hasNewKey = props.containsKey(MultiProviderConfigurationParser.PROP_ACTIVE_PROVIDER);
        return hasLegacyKey && !hasNewKey;
    }

    /**
     * Creates a backup of the original file before overwriting it.
     * <p>
     * If {@code <file>.bak} does not yet exist, it is written directly. Otherwise,
     * numbered suffixes ({@code .bak.1}, {@code .bak.2}, …) are tried in ascending order
     * until a free slot is found. Existing backups are never overwritten.
     */
    private void createBakBackup(Path configFilePath, String content) {
        Path bakPath = configFilePath.resolveSibling(configFilePath.getFileName() + ".bak");
        if (!Files.exists(bakPath)) {
            writeFile(bakPath, content);
            LOG.info("Backup created: {}", bakPath);
            return;
        }
        for (int i = 1; ; i++) {
            Path numbered = configFilePath.resolveSibling(configFilePath.getFileName() + ".bak." + i);
            if (!Files.exists(numbered)) {
                writeFile(numbered, content);
                LOG.info("Backup created: {}", numbered);
                return;
            }
        }
    }

    /**
     * Produces the migrated file content from the given original content string.
     * <p>
     * Each line is inspected: lines that define a legacy key are rewritten with the
     * corresponding new namespaced key; all other lines (comments, blank lines, other keys)
     * pass through unchanged. After all original lines, a {@code ai.provider.active} entry
     * and a commented Claude-provider placeholder block are appended.
     *
     * @param originalContent the raw original file content; must not be {@code null}
     * @return the migrated content ready to be written to disk
     */
    String generateMigratedContent(String originalContent) {
        String[] lines = originalContent.split("\\r?\\n", -1);
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(transformLine(line)).append("\n");
        }
        sb.append("\n");
        sb.append("# Aktiver KI-Provider: openai-compatible oder claude\n");
        sb.append("ai.provider.active=openai-compatible\n");
        sb.append("\n");
        sb.append("# Anthropic Claude-Provider (nur benoetigt wenn ai.provider.active=claude)\n");
        sb.append("# ai.provider.claude.model=\n");
        sb.append("# ai.provider.claude.timeoutSeconds=\n");
        sb.append("# ai.provider.claude.apiKey=\n");
        return sb.toString();
    }

    /**
     * Transforms a single properties-file line, replacing a legacy key with its new equivalent.
     * <p>
     * Comment lines, blank lines, and lines defining keys other than the four legacy keys
     * are returned unchanged.
     */
    private String transformLine(String line) {
        for (String[] mapping : LEGACY_KEY_MAPPINGS) {
            String legacyKey = mapping[0];
            String newKey = mapping[1];
            if (lineDefinesKey(line, legacyKey)) {
                int keyStart = line.indexOf(legacyKey);
                return line.substring(0, keyStart) + newKey + line.substring(keyStart + legacyKey.length());
            }
        }
        return line;
    }

    /**
     * Returns {@code true} when {@code line} defines the given {@code key}.
     * <p>
     * A line defines a key if — after stripping any leading whitespace — it starts with
     * the exact key string followed by {@code =}, {@code :}, whitespace, or end-of-string.
     * Comment-introducing characters ({@code #} or {@code !}) cause an immediate {@code false}.
     */
    private boolean lineDefinesKey(String line, String key) {
        String trimmed = line.stripLeading();
        if (trimmed.isEmpty() || trimmed.startsWith("#") || trimmed.startsWith("!")) {
            return false;
        }
        if (!trimmed.startsWith(key)) {
            return false;
        }
        if (trimmed.length() == key.length()) {
            return true;
        }
        char next = trimmed.charAt(key.length());
        return next == '=' || next == ':' || Character.isWhitespace(next);
    }

    /**
     * Writes {@code content} to {@code target} via a temporary file and an atomic rename.
     * <p>
     * The temporary file is created as {@code <target>.tmp} in the same directory.
     * After the content is fully written, the temporary file is moved to {@code target},
     * replacing it. The original file is therefore never partially overwritten.
     */
    private void writeAtomically(Path target, String content) {
        Path tmpPath = target.resolveSibling(target.getFileName() + ".tmp");
        try {
            Files.writeString(tmpPath, content, StandardCharsets.UTF_8);
            Files.move(tmpPath, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new ConfigurationLoadingException(
"Failed to write migrated configuration to " + target, e);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Re-reads the migrated file and validates it using the injected parser and validator.
|
||||
* <p>
|
||||
* A parse or validation failure is treated as a hard startup error. The {@code .bak} backup
|
||||
* created before migration is preserved in this case.
|
||||
*/
|
||||
private void validateMigratedFile(Path configFilePath) {
|
||||
String content = readFile(configFilePath);
|
||||
Properties props = parsePropertiesFromContent(content);
|
||||
|
||||
MultiProviderConfiguration config;
|
||||
try {
|
||||
config = parser.parse(props);
|
||||
} catch (ConfigurationLoadingException e) {
|
||||
throw new ConfigurationLoadingException(
|
||||
"Migrated configuration failed to parse: " + e.getMessage(), e);
|
||||
}
|
||||
|
||||
try {
|
||||
validator.validate(config);
|
||||
} catch (InvalidStartConfigurationException e) {
|
||||
throw new ConfigurationLoadingException(
|
||||
"Migrated configuration failed validation (backup preserved): " + e.getMessage(), e);
|
||||
}
|
||||
}
|
||||
|
||||
private String readFile(Path path) {
|
||||
try {
|
||||
return Files.readString(path, StandardCharsets.UTF_8);
|
||||
} catch (IOException e) {
|
||||
throw new ConfigurationLoadingException("Failed to read file: " + path, e);
|
||||
}
|
||||
}
|
||||
|
||||
private void writeFile(Path path, String content) {
|
||||
try {
|
||||
Files.writeString(path, content, StandardCharsets.UTF_8);
|
||||
} catch (IOException e) {
|
||||
throw new ConfigurationLoadingException("Failed to write file: " + path, e);
|
||||
}
|
||||
}
|
||||
|
||||
private Properties parsePropertiesFromContent(String content) {
|
||||
Properties props = new Properties();
|
||||
try {
|
||||
props.load(new StringReader(content));
|
||||
} catch (IOException e) {
|
||||
throw new ConfigurationLoadingException("Failed to parse properties content", e);
|
||||
}
|
||||
return props;
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,239 @@
package de.gecheckt.pdf.umbenenner.adapter.out.configuration;

import java.util.Properties;
import java.util.function.Function;

import de.gecheckt.pdf.umbenenner.application.config.provider.AiProviderFamily;
import de.gecheckt.pdf.umbenenner.application.config.provider.MultiProviderConfiguration;
import de.gecheckt.pdf.umbenenner.application.config.provider.ProviderConfiguration;

/**
 * Parses the multi-provider configuration schema from a {@link Properties} object.
 * <p>
 * Recognises the following property keys:
 * <pre>
 * ai.provider.active – required; must be "openai-compatible" or "claude"
 * ai.provider.openai-compatible.baseUrl – required for active OpenAI-compatible provider
 * ai.provider.openai-compatible.model – required for active OpenAI-compatible provider
 * ai.provider.openai-compatible.timeoutSeconds
 * ai.provider.openai-compatible.apiKey
 * ai.provider.claude.baseUrl – optional; defaults to https://api.anthropic.com
 * ai.provider.claude.model – required for active Claude provider
 * ai.provider.claude.timeoutSeconds
 * ai.provider.claude.apiKey
 * </pre>
 *
 * <h2>Environment-variable precedence for API keys</h2>
 * <ul>
 * <li>{@code OPENAI_COMPATIBLE_API_KEY} overrides {@code ai.provider.openai-compatible.apiKey}</li>
 * <li>{@code ANTHROPIC_API_KEY} overrides {@code ai.provider.claude.apiKey}</li>
 * </ul>
 * Each environment variable is applied only to its own provider family; the variables
 * of different families are never mixed.
 *
 * <h2>Error handling</h2>
 * <ul>
 * <li>If {@code ai.provider.active} is absent or blank, a {@link ConfigurationLoadingException}
 *     is thrown.</li>
 * <li>If {@code ai.provider.active} holds an unrecognised value, a
 *     {@link ConfigurationLoadingException} is thrown.</li>
 * <li>If a {@code timeoutSeconds} property is present but not a valid integer, a
 *     {@link ConfigurationLoadingException} is thrown.</li>
 * <li>Missing optional fields result in {@code null} (String) or {@code 0} (int) stored in
 *     the returned record; the validator enforces required fields for the active provider.</li>
 * </ul>
 *
 * <p>The returned {@link MultiProviderConfiguration} is not yet validated. Use
 * {@link MultiProviderConfigurationValidator} after parsing.
 */
public class MultiProviderConfigurationParser {

    /** Property key selecting the active provider family. */
    static final String PROP_ACTIVE_PROVIDER = "ai.provider.active";

    static final String PROP_OPENAI_BASE_URL = "ai.provider.openai-compatible.baseUrl";
    static final String PROP_OPENAI_MODEL = "ai.provider.openai-compatible.model";
    static final String PROP_OPENAI_TIMEOUT = "ai.provider.openai-compatible.timeoutSeconds";
    static final String PROP_OPENAI_API_KEY = "ai.provider.openai-compatible.apiKey";

    static final String PROP_CLAUDE_BASE_URL = "ai.provider.claude.baseUrl";
    static final String PROP_CLAUDE_MODEL = "ai.provider.claude.model";
    static final String PROP_CLAUDE_TIMEOUT = "ai.provider.claude.timeoutSeconds";
    static final String PROP_CLAUDE_API_KEY = "ai.provider.claude.apiKey";

    /** Environment variable for the OpenAI-compatible provider API key. */
    static final String ENV_OPENAI_API_KEY = "OPENAI_COMPATIBLE_API_KEY";

    /**
     * Legacy environment variable for the OpenAI-compatible provider API key.
     * <p>
     * Accepted as a fallback when {@code OPENAI_COMPATIBLE_API_KEY} is not set.
     * Existing installations that set this variable continue to work without change.
     * New installations should prefer {@code OPENAI_COMPATIBLE_API_KEY}.
     */
    static final String ENV_LEGACY_OPENAI_API_KEY = "PDF_UMBENENNER_API_KEY";

    /** Environment variable for the Anthropic Claude provider API key. */
    static final String ENV_CLAUDE_API_KEY = "ANTHROPIC_API_KEY";

    /** Default base URL for the Anthropic Claude provider when not explicitly configured. */
    static final String CLAUDE_DEFAULT_BASE_URL = "https://api.anthropic.com";

    private final Function<String, String> environmentLookup;

    /**
     * Creates a parser that uses the real system environment for API key resolution.
     */
    public MultiProviderConfigurationParser() {
        this(System::getenv);
    }

    /**
     * Creates a parser with a custom environment lookup function.
     * <p>
     * This constructor is intended for testing to allow deterministic control over
     * environment variable values without modifying the real process environment.
     *
     * @param environmentLookup a function that maps environment variable names to their values;
     *     must not be {@code null}
     */
    public MultiProviderConfigurationParser(Function<String, String> environmentLookup) {
        this.environmentLookup = environmentLookup;
    }

    /**
     * Parses the multi-provider configuration from the given properties.
     * <p>
     * The Claude default base URL ({@code https://api.anthropic.com}) is applied when
     * {@code ai.provider.claude.baseUrl} is absent. API keys are resolved with environment
     * variable precedence. The resulting configuration is not yet validated; call
     * {@link MultiProviderConfigurationValidator#validate(MultiProviderConfiguration)} afterward.
     *
     * @param props the properties to parse; must not be {@code null}
     * @return the parsed (but not yet validated) multi-provider configuration
     * @throws ConfigurationLoadingException if {@code ai.provider.active} is absent, blank,
     *     or holds an unrecognised value, or if any present timeout property is not a
     *     valid integer
     */
    public MultiProviderConfiguration parse(Properties props) {
        AiProviderFamily activeFamily = parseActiveProvider(props);
        ProviderConfiguration openAiConfig = parseOpenAiCompatibleConfig(props);
        ProviderConfiguration claudeConfig = parseClaudeConfig(props);
        return new MultiProviderConfiguration(activeFamily, openAiConfig, claudeConfig);
    }

    private AiProviderFamily parseActiveProvider(Properties props) {
        String raw = props.getProperty(PROP_ACTIVE_PROVIDER);
        if (raw == null || raw.isBlank()) {
            throw new ConfigurationLoadingException(
                    "Required property missing or blank: " + PROP_ACTIVE_PROVIDER
                            + ". Valid values: openai-compatible, claude");
        }
        String trimmed = raw.trim();
        return AiProviderFamily.fromIdentifier(trimmed).orElseThrow(() ->
                new ConfigurationLoadingException(
                        "Unknown provider identifier for " + PROP_ACTIVE_PROVIDER + ": '" + trimmed
                                + "'. Valid values: openai-compatible, claude"));
    }

    private ProviderConfiguration parseOpenAiCompatibleConfig(Properties props) {
        String model = getOptionalString(props, PROP_OPENAI_MODEL);
        int timeout = parseTimeoutSeconds(props, PROP_OPENAI_TIMEOUT);
        String baseUrl = getOptionalString(props, PROP_OPENAI_BASE_URL);
        String apiKey = resolveOpenAiApiKey(props);
        return new ProviderConfiguration(model, timeout, baseUrl, apiKey);
    }

    private ProviderConfiguration parseClaudeConfig(Properties props) {
        String model = getOptionalString(props, PROP_CLAUDE_MODEL);
        int timeout = parseTimeoutSeconds(props, PROP_CLAUDE_TIMEOUT);
        String baseUrl = getStringOrDefault(props, PROP_CLAUDE_BASE_URL, CLAUDE_DEFAULT_BASE_URL);
        String apiKey = resolveApiKey(props, PROP_CLAUDE_API_KEY, ENV_CLAUDE_API_KEY);
        return new ProviderConfiguration(model, timeout, baseUrl, apiKey);
    }

    /**
     * Returns the trimmed property value, or {@code null} if absent or blank.
     */
    private String getOptionalString(Properties props, String key) {
        String value = props.getProperty(key);
        return (value == null || value.isBlank()) ? null : value.trim();
    }

    /**
     * Returns the trimmed property value, or the {@code defaultValue} if absent or blank.
     */
    private String getStringOrDefault(Properties props, String key, String defaultValue) {
        String value = props.getProperty(key);
        return (value == null || value.isBlank()) ? defaultValue : value.trim();
    }

    /**
     * Parses a timeout property as a positive integer.
     * <p>
     * Returns {@code 0} when the property is absent or blank (indicating "not configured").
     * Throws {@link ConfigurationLoadingException} when the property is present but not
     * parseable as an integer.
     */
    private int parseTimeoutSeconds(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.isBlank()) {
            return 0;
        }
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            throw new ConfigurationLoadingException(
                    "Invalid integer value for property " + key + ": '" + value.trim() + "'", e);
        }
    }

    /**
     * Resolves the effective API key for the OpenAI-compatible provider.
     * <p>
     * Resolution order:
     * <ol>
     * <li>{@code OPENAI_COMPATIBLE_API_KEY} environment variable</li>
     * <li>{@code PDF_UMBENENNER_API_KEY} environment variable (legacy fallback;
     *     accepted for backward compatibility with existing installations)</li>
     * <li>{@code ai.provider.openai-compatible.apiKey} property</li>
     * </ol>
     *
     * @param props the configuration properties
     * @return the resolved API key; never {@code null}, but may be blank
     */
    private String resolveOpenAiApiKey(Properties props) {
        String primary = environmentLookup.apply(ENV_OPENAI_API_KEY);
        if (primary != null && !primary.isBlank()) {
            return primary.trim();
        }
        String legacy = environmentLookup.apply(ENV_LEGACY_OPENAI_API_KEY);
        if (legacy != null && !legacy.isBlank()) {
            return legacy.trim();
        }
        String propsValue = props.getProperty(PROP_OPENAI_API_KEY);
        return (propsValue != null) ? propsValue.trim() : "";
    }

    /**
     * Resolves the effective API key for a provider family.
     * <p>
     * The environment variable value takes precedence over the properties value.
     * If the environment variable is absent or blank, the properties value is used.
     * If both are absent or blank, an empty string is returned (the validator will
     * reject this for the active provider).
     *
     * @param props the configuration properties
     * @param propertyKey the property key for the API key of this provider family
     * @param envVarName the environment variable name for this provider family
     * @return the resolved API key; never {@code null}, but may be blank
     */
    private String resolveApiKey(Properties props, String propertyKey, String envVarName) {
        String envValue = environmentLookup.apply(envVarName);
        if (envValue != null && !envValue.isBlank()) {
            return envValue.trim();
        }
        String propsValue = props.getProperty(propertyKey);
        return (propsValue != null) ? propsValue.trim() : "";
    }
}
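The property schema accepted by `MultiProviderConfigurationParser` can be illustrated with a minimal configuration sketch. This is a hedged example, not a file from the repository: the model names and local base URL are placeholders, and only the keys and precedence rules come from the parser's documentation above.

```properties
# Select the active provider family (required): openai-compatible or claude
ai.provider.active=claude

# OpenAI-compatible family (baseUrl and model required only when this family is active)
# ai.provider.openai-compatible.baseUrl=http://localhost:11434/v1
# ai.provider.openai-compatible.model=example-model
# ai.provider.openai-compatible.timeoutSeconds=60
# apiKey may be omitted here and supplied via OPENAI_COMPATIBLE_API_KEY
# (or the legacy PDF_UMBENENNER_API_KEY) instead

# Claude family; baseUrl defaults to https://api.anthropic.com when absent
ai.provider.claude.model=example-claude-model
ai.provider.claude.timeoutSeconds=60
# apiKey may be supplied via the ANTHROPIC_API_KEY environment variable
```

With this file, the parser would select the Claude family, apply the default base URL, and leave API-key resolution to the `ANTHROPIC_API_KEY` environment variable; the validator then rejects the configuration if the resolved key is blank.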
@@ -0,0 +1,132 @@
package de.gecheckt.pdf.umbenenner.adapter.out.configuration;

import java.net.URI;
import java.util.ArrayList;
import java.util.List;

import de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException;
import de.gecheckt.pdf.umbenenner.application.config.provider.AiProviderFamily;
import de.gecheckt.pdf.umbenenner.application.config.provider.MultiProviderConfiguration;
import de.gecheckt.pdf.umbenenner.application.config.provider.ProviderConfiguration;

/**
 * Validates a {@link MultiProviderConfiguration} before the application run begins.
 * <p>
 * Enforces all requirements for the active provider:
 * <ul>
 * <li>{@code ai.provider.active} refers to a recognised provider family.</li>
 * <li>{@code model} is non-blank.</li>
 * <li>{@code timeoutSeconds} is a positive integer.</li>
 * <li>{@code baseUrl} is a syntactically valid absolute URI with scheme {@code http} or
 *     {@code https} (required for the OpenAI-compatible family; the Claude family always
 *     has a default, but it is validated with the same rules).</li>
 * <li>{@code apiKey} is non-blank after environment-variable precedence has been applied
 *     by {@link MultiProviderConfigurationParser}.</li>
 * </ul>
 * Required fields of the <em>inactive</em> provider are intentionally not enforced.
 * <p>
 * Validation errors are aggregated and reported together in a single
 * {@link InvalidStartConfigurationException}.
 */
public class MultiProviderConfigurationValidator {

    /**
     * Validates the given multi-provider configuration.
     * <p>
     * Only the active provider's required fields are validated. The inactive provider's
     * configuration may be incomplete.
     *
     * @param config the configuration to validate; must not be {@code null}
     * @throws InvalidStartConfigurationException if any validation rule fails, with an aggregated
     *     message listing all problems found
     */
    public void validate(MultiProviderConfiguration config) {
        List<String> errors = new ArrayList<>();

        validateActiveProvider(config, errors);

        if (!errors.isEmpty()) {
            throw new InvalidStartConfigurationException(
                    "Invalid AI provider configuration:\n" + String.join("\n", errors));
        }
    }

    private void validateActiveProvider(MultiProviderConfiguration config, List<String> errors) {
        AiProviderFamily activeFamily = config.activeProviderFamily();
        if (activeFamily == null) {
            // Parser already throws for missing/unknown ai.provider.active,
            // but guard defensively in case the record is constructed directly in tests.
            errors.add("- ai.provider.active: must be set to a supported provider "
                    + "(openai-compatible, claude)");
            return;
        }

        ProviderConfiguration providerConfig = config.activeProviderConfiguration();
        String providerLabel = "ai.provider." + activeFamily.getIdentifier();

        validateModel(providerConfig, providerLabel, errors);
        validateTimeoutSeconds(providerConfig, providerLabel, errors);
        validateBaseUrl(activeFamily, providerConfig, providerLabel, errors);
        validateApiKey(providerConfig, providerLabel, errors);
    }

    private void validateModel(ProviderConfiguration config, String providerLabel, List<String> errors) {
        if (config.model() == null || config.model().isBlank()) {
            errors.add("- " + providerLabel + ".model: must not be blank");
        }
    }

    private void validateTimeoutSeconds(ProviderConfiguration config, String providerLabel,
            List<String> errors) {
        if (config.timeoutSeconds() <= 0) {
            errors.add("- " + providerLabel + ".timeoutSeconds: must be a positive integer, got: "
                    + config.timeoutSeconds());
        }
    }

    /**
     * Validates the base URL of the active provider.
     * <p>
     * The URL must be:
     * <ul>
     * <li>non-blank</li>
     * <li>a syntactically valid URI</li>
     * <li>an absolute URI (has a scheme component)</li>
     * <li>using scheme {@code http} or {@code https}</li>
     * </ul>
     * The OpenAI-compatible family requires an explicit base URL.
     * The Claude family always has a default ({@code https://api.anthropic.com}) applied by the
     * parser, so this check serves both as a primary and safety-net enforcement.
     */
    private void validateBaseUrl(AiProviderFamily family, ProviderConfiguration config,
            String providerLabel, List<String> errors) {
        String baseUrl = config.baseUrl();
        if (baseUrl == null || baseUrl.isBlank()) {
            errors.add("- " + providerLabel + ".baseUrl: must not be blank");
            return;
        }
        try {
            URI uri = URI.create(baseUrl);
            if (!uri.isAbsolute()) {
                errors.add("- " + providerLabel + ".baseUrl: must be an absolute URI with http or https scheme, got: '"
                        + baseUrl + "'");
                return;
            }
            String scheme = uri.getScheme();
            if (!"http".equalsIgnoreCase(scheme) && !"https".equalsIgnoreCase(scheme)) {
                errors.add("- " + providerLabel + ".baseUrl: scheme must be http or https, got: '"
                        + scheme + "' in '" + baseUrl + "'");
            }
        } catch (IllegalArgumentException e) {
            errors.add("- " + providerLabel + ".baseUrl: not a valid URI: '" + baseUrl + "' (" + e.getMessage() + ")");
        }
    }

    private void validateApiKey(ProviderConfiguration config, String providerLabel,
            List<String> errors) {
        if (config.apiKey() == null || config.apiKey().isBlank()) {
            errors.add("- " + providerLabel + ".apiKey: must not be blank "
                    + "(set via environment variable or properties)");
        }
    }
}
@@ -2,8 +2,6 @@ package de.gecheckt.pdf.umbenenner.adapter.out.configuration;

import java.io.IOException;
import java.io.StringReader;
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
@@ -14,22 +12,24 @@ import java.util.function.Function;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import de.gecheckt.pdf.umbenenner.application.config.provider.MultiProviderConfiguration;
import de.gecheckt.pdf.umbenenner.application.config.startup.StartConfiguration;
import de.gecheckt.pdf.umbenenner.application.port.out.ConfigurationPort;

/**
 * Properties-based implementation of {@link ConfigurationPort}.
 * <p>
 * Loads configuration from config/application.properties as the primary source.
 * For sensitive values, environment variables take precedence: if the environment variable
 * {@code PDF_UMBENENNER_API_KEY} is set, it overrides the {@code api.key} property from the file.
 * This allows credentials to be managed securely without storing them in the configuration file.
 * Loads configuration from {@code config/application.properties} as the primary source.
 * The multi-provider AI configuration is parsed via {@link MultiProviderConfigurationParser}
 * and validated via {@link MultiProviderConfigurationValidator}. Environment variables
 * for API keys are resolved by the parser with provider-specific precedence rules:
 * {@code OPENAI_COMPATIBLE_API_KEY} for the OpenAI-compatible family and
 * {@code ANTHROPIC_API_KEY} for the Anthropic Claude family.
 */
public class PropertiesConfigurationPortAdapter implements ConfigurationPort {

    private static final Logger LOG = LogManager.getLogger(PropertiesConfigurationPortAdapter.class);
    private static final String DEFAULT_CONFIG_FILE_PATH = "config/application.properties";
    private static final String API_KEY_ENV_VAR = "PDF_UMBENENNER_API_KEY";

    private final Function<String, String> environmentLookup;
    private final Path configFilePath;
@@ -81,8 +81,9 @@ public class PropertiesConfigurationPortAdapter implements ConfigurationPort {
    @Override
    public StartConfiguration loadConfiguration() {
        Properties props = loadPropertiesFile();
        String apiKey = getApiKey(props);
        return buildStartConfiguration(props, apiKey);
        MultiProviderConfiguration multiProviderConfig = parseAndValidateProviders(props);
        boolean logAiSensitive = parseAiContentSensitivity(props);
        return buildStartConfiguration(props, multiProviderConfig, logAiSensitive);
    }

    private Properties loadPropertiesFile() {
@@ -100,21 +101,28 @@ public class PropertiesConfigurationPortAdapter implements ConfigurationPort {
        return props;
    }

    private String escapeBackslashes(String content) {
        // Escape backslashes to prevent Java Properties from interpreting them as escape sequences.
        // This is needed because Windows paths use backslashes (e.g., C:\temp\...)
        // and Java Properties interprets \t as tab, \n as newline, etc.
        return content.replace("\\", "\\\\");
    /**
     * Parses and validates the multi-provider AI configuration from the given properties.
     * <p>
     * Uses {@link MultiProviderConfigurationParser} for parsing and
     * {@link MultiProviderConfigurationValidator} for validation. Throws on any
     * configuration error before returning.
     */
    private MultiProviderConfiguration parseAndValidateProviders(Properties props) {
        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(environmentLookup);
        MultiProviderConfiguration config = parser.parse(props);
        new MultiProviderConfigurationValidator().validate(config);
        return config;
    }

    private StartConfiguration buildStartConfiguration(Properties props, String apiKey) {
    private StartConfiguration buildStartConfiguration(Properties props,
            MultiProviderConfiguration multiProviderConfig,
            boolean logAiSensitive) {
        return new StartConfiguration(
                Paths.get(getRequiredProperty(props, "source.folder")),
                Paths.get(getRequiredProperty(props, "target.folder")),
                Paths.get(getRequiredProperty(props, "sqlite.file")),
                parseUri(getRequiredProperty(props, "api.baseUrl")),
                getRequiredProperty(props, "api.model"),
                parseInt(getRequiredProperty(props, "api.timeoutSeconds")),
                multiProviderConfig,
                parseInt(getRequiredProperty(props, "max.retries.transient")),
                parseInt(getRequiredProperty(props, "max.pages")),
                parseInt(getRequiredProperty(props, "max.text.characters")),
@@ -122,18 +130,15 @@ public class PropertiesConfigurationPortAdapter implements ConfigurationPort {
                Paths.get(getOptionalProperty(props, "runtime.lock.file", "")),
                Paths.get(getOptionalProperty(props, "log.directory", "")),
                getOptionalProperty(props, "log.level", "INFO"),
                apiKey
                logAiSensitive
        );
    }

    private String getApiKey(Properties props) {
        String envApiKey = environmentLookup.apply(API_KEY_ENV_VAR);
        if (envApiKey != null && !envApiKey.isBlank()) {
            LOG.info("Using API key from environment variable {}", API_KEY_ENV_VAR);
            return envApiKey;
        }
        String propsApiKey = props.getProperty("api.key");
        return propsApiKey != null ? propsApiKey : "";
    private String escapeBackslashes(String content) {
        // Escape backslashes to prevent Java Properties from interpreting them as escape sequences.
        // This is needed because Windows paths use backslashes (e.g., C:\temp\...)
        // and Java Properties interprets \t as tab, \n as newline, etc.
        return content.replace("\\", "\\\\");
    }

    private String getRequiredProperty(Properties props, String key) {
@@ -167,11 +172,39 @@
    }
    }

    private URI parseUri(String value) {
        try {
            return new URI(value.trim());
        } catch (URISyntaxException e) {
            throw new ConfigurationLoadingException("Invalid URI value for property: " + value, e);
    /**
     * Parses the {@code log.ai.sensitive} configuration property with strict validation.
     * <p>
     * This property controls whether sensitive AI-generated content (raw response, reasoning)
     * may be written to log files. It must be either the literal string "true" or "false"
     * (case-insensitive). Any other value is rejected as an invalid startup configuration.
     * <p>
     * The default value (when the property is absent) is {@code false}, which is the safe default.
     *
     * @return {@code true} if the property is explicitly set to "true", {@code false} otherwise
     * @throws ConfigurationLoadingException if the property is present but contains an invalid value
     */
    private boolean parseAiContentSensitivity(Properties props) {
        String value = props.getProperty("log.ai.sensitive");

        // If absent, return safe default
        if (value == null) {
            return false;
        }

        String trimmedValue = value.trim().toLowerCase();

        // Only accept literal "true" or "false"
        if ("true".equals(trimmedValue)) {
            return true;
        } else if ("false".equals(trimmedValue)) {
            return false;
        } else {
            // Reject any other value as invalid configuration
            throw new ConfigurationLoadingException(
                    "Invalid value for log.ai.sensitive: '" + value + "'. "
                            + "Must be either 'true' or 'false' (case-insensitive). "
                            + "Default is 'false' (sensitive content not logged).");
        }
    }
    }
}

@@ -14,7 +14,7 @@ import de.gecheckt.pdf.umbenenner.application.port.out.RunLockUnavailableExcepti
|
||||
/**
|
||||
* File-based implementation of {@link RunLockPort} that uses a lock file to prevent concurrent runs.
|
||||
* <p>
|
||||
* AP-006 Implementation: Creates an exclusive lock file on acquire and deletes it on release.
|
||||
* Creates an exclusive lock file on acquire and deletes it on release.
|
||||
* If the lock file already exists, {@link #acquire()} throws {@link RunLockUnavailableException}
|
||||
* to signal that another instance is already running.
|
||||
* <p>
|
||||
|
||||
@@ -102,7 +102,7 @@ public class PdfTextExtractionPortAdapter implements PdfTextExtractionPort {
|
||||
try {
|
||||
int pageCount = document.getNumberOfPages();
|
||||
|
||||
// AP-003: Handle case of zero pages as technical error
|
||||
// Handle case of zero pages as technical error
|
||||
// (PdfPageCount requires >= 1, so this is a constraint violation)
|
||||
if (pageCount < 1) {
|
||||
return new PdfExtractionTechnicalError(
|
||||
@@ -124,7 +124,7 @@ public class PdfTextExtractionPortAdapter implements PdfTextExtractionPort {
|
||||
}
|
||||
|
||||
} catch (IOException e) {
|
||||
// All I/O and PDFBox loading/parsing errors are technical errors in AP-003
|
||||
// All I/O and PDFBox loading/parsing errors are technical errors
|
||||
String errorMessage = e.getMessage() != null ? e.getMessage() : e.toString();
|
||||
return new PdfExtractionTechnicalError(
|
||||
"Failed to load or parse PDF: " + errorMessage,
|
||||
|
||||
@@ -14,7 +14,7 @@ import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;
/**
 * File-system based implementation of {@link SourceDocumentCandidatesPort}.
 * <p>
 * AP-002 Implementation: Scans a configured source folder and returns only PDF files
 * Scans a configured source folder and returns only PDF files
 * (by extension) as {@link SourceDocumentCandidate} objects.
 * <p>
 * Design:
@@ -29,13 +29,11 @@ import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;
 * <p>
 * Non-goals:
 * <ul>
 *   <li>No PDF validation (that is AP-003)</li>
 *   <li>No PDF structure validation</li>
 *   <li>No recursion into subdirectories</li>
 *   <li>No content evaluation (that happens in AP-004: brauchbarer Text assessment)</li>
 *   <li>No content evaluation (text usability is assessed during document processing)</li>
 *   <li>No fachlich evaluation of candidates</li>
 * </ul>
 *
 * @since M3-AP-002
 */
public class SourceDocumentCandidatesPortAdapter implements SourceDocumentCandidatesPort {

@@ -31,9 +31,9 @@ import de.gecheckt.pdf.umbenenner.domain.model.RunId;
 * including all AI traceability fields added during schema evolution.
 * <p>
 * <strong>Schema compatibility:</strong> This adapter writes all columns including
 * the AI traceability columns. When reading rows that were written before schema
 * evolution, those columns contain {@code NULL} and are mapped to {@code null}
 * in the Java record.
 * the AI traceability columns and the provider-identifier column ({@code ai_provider}).
 * When reading rows that were written before schema evolution, those columns contain
 * {@code NULL} and are mapped to {@code null} in the Java record.
 * <p>
 * <strong>Architecture boundary:</strong> All JDBC and SQLite details are strictly
 * confined to this class. No JDBC types appear in the port interface or in any
@@ -129,6 +129,7 @@ public class SqliteProcessingAttemptRepositoryAdapter implements ProcessingAttem
                failure_class,
                failure_message,
                retryable,
                ai_provider,
                model_name,
                prompt_identifier,
                processed_page_count,
@@ -139,7 +140,7 @@ public class SqliteProcessingAttemptRepositoryAdapter implements ProcessingAttem
                date_source,
                validated_title,
                final_target_file_name
            ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
            ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
            """;

        try (Connection connection = getConnection();
@@ -157,19 +158,20 @@ public class SqliteProcessingAttemptRepositoryAdapter implements ProcessingAttem
            setNullableString(statement, 7, attempt.failureClass());
            setNullableString(statement, 8, attempt.failureMessage());
            statement.setBoolean(9, attempt.retryable());
            // AI traceability fields
            setNullableString(statement, 10, attempt.modelName());
            setNullableString(statement, 11, attempt.promptIdentifier());
            setNullableInteger(statement, 12, attempt.processedPageCount());
            setNullableInteger(statement, 13, attempt.sentCharacterCount());
            setNullableString(statement, 14, attempt.aiRawResponse());
            setNullableString(statement, 15, attempt.aiReasoning());
            setNullableString(statement, 16,
                    attempt.resolvedDate() != null ? attempt.resolvedDate().toString() : null);
            // AI provider identifier and AI traceability fields
            setNullableString(statement, 10, attempt.aiProvider());
            setNullableString(statement, 11, attempt.modelName());
            setNullableString(statement, 12, attempt.promptIdentifier());
            setNullableInteger(statement, 13, attempt.processedPageCount());
            setNullableInteger(statement, 14, attempt.sentCharacterCount());
            setNullableString(statement, 15, attempt.aiRawResponse());
            setNullableString(statement, 16, attempt.aiReasoning());
            setNullableString(statement, 17,
                    attempt.resolvedDate() != null ? attempt.resolvedDate().toString() : null);
            setNullableString(statement, 18,
                    attempt.dateSource() != null ? attempt.dateSource().name() : null);
            setNullableString(statement, 18, attempt.validatedTitle());
            setNullableString(statement, 19, attempt.finalTargetFileName());
            setNullableString(statement, 19, attempt.validatedTitle());
            setNullableString(statement, 20, attempt.finalTargetFileName());

            int rowsAffected = statement.executeUpdate();
            if (rowsAffected != 1) {
@@ -204,7 +206,7 @@ public class SqliteProcessingAttemptRepositoryAdapter implements ProcessingAttem
            SELECT
                fingerprint, run_id, attempt_number, started_at, ended_at,
                status, failure_class, failure_message, retryable,
                model_name, prompt_identifier, processed_page_count, sent_character_count,
                ai_provider, model_name, prompt_identifier, processed_page_count, sent_character_count,
                ai_raw_response, ai_reasoning, resolved_date, date_source, validated_title,
                final_target_file_name
            FROM processing_attempt
@@ -247,6 +249,7 @@ public class SqliteProcessingAttemptRepositoryAdapter implements ProcessingAttem
     * @return the most recent {@code PROPOSAL_READY} attempt, or {@code null}
     * @throws DocumentPersistenceException if the query fails
     */
    @Override
    public ProcessingAttempt findLatestProposalReadyAttempt(DocumentFingerprint fingerprint) {
        Objects.requireNonNull(fingerprint, "fingerprint must not be null");

@@ -254,12 +257,12 @@ public class SqliteProcessingAttemptRepositoryAdapter implements ProcessingAttem
            SELECT
                fingerprint, run_id, attempt_number, started_at, ended_at,
                status, failure_class, failure_message, retryable,
                model_name, prompt_identifier, processed_page_count, sent_character_count,
                ai_provider, model_name, prompt_identifier, processed_page_count, sent_character_count,
                ai_raw_response, ai_reasoning, resolved_date, date_source, validated_title,
                final_target_file_name
            FROM processing_attempt
            WHERE fingerprint = ?
              AND status = 'PROPOSAL_READY'
              AND status = ?
            ORDER BY attempt_number DESC
            LIMIT 1
            """;
@@ -270,6 +273,7 @@ public class SqliteProcessingAttemptRepositoryAdapter implements ProcessingAttem

            pragmaStmt.execute(PRAGMA_FOREIGN_KEYS_ON);
            statement.setString(1, fingerprint.sha256Hex());
            statement.setString(2, ProcessingStatus.PROPOSAL_READY.name());

            try (ResultSet rs = statement.executeQuery()) {
                if (rs.next()) {
@@ -310,6 +314,7 @@ public class SqliteProcessingAttemptRepositoryAdapter implements ProcessingAttem
                rs.getString("failure_class"),
                rs.getString("failure_message"),
                rs.getBoolean("retryable"),
                rs.getString("ai_provider"),
                rs.getString("model_name"),
                rs.getString("prompt_identifier"),
                processedPageCount,

@@ -41,9 +41,12 @@ import de.gecheckt.pdf.umbenenner.application.port.out.PersistenceSchemaInitiali
 *   <li>Target-copy columns ({@code last_target_path}, {@code last_target_file_name}) to
 *       {@code document_record}</li>
 *   <li>Target-copy column ({@code final_target_file_name}) to {@code processing_attempt}</li>
 *   <li>Provider-identifier column ({@code ai_provider}) to {@code processing_attempt};
 *       existing rows receive {@code NULL} as the default, which is the correct value for
 *       attempts recorded before provider tracking was introduced.</li>
 * </ul>
 *
 * <h2>M4→current-schema status migration</h2>
 * <h2>Legacy-state migration</h2>
 * <p>
 * Documents in an earlier positive intermediate state ({@code SUCCESS} recorded without
 * a validated naming proposal) are idempotently migrated to {@code READY_FOR_AI} so that
@@ -150,6 +153,9 @@ public class SqliteSchemaInitializationAdapter implements PersistenceSchemaIniti
    /**
     * Columns to add idempotently to {@code processing_attempt}.
     * Each entry is {@code [column_name, column_type]}.
     * <p>
     * {@code ai_provider} is nullable; existing rows receive {@code NULL}, which is the
     * correct sentinel for attempts recorded before provider tracking was introduced.
     */
    private static final String[][] EVOLUTION_ATTEMPT_COLUMNS = {
        {"model_name", "TEXT"},
@@ -162,6 +168,7 @@ public class SqliteSchemaInitializationAdapter implements PersistenceSchemaIniti
        {"date_source", "TEXT"},
        {"validated_title", "TEXT"},
        {"final_target_file_name", "TEXT"},
        {"ai_provider", "TEXT"},
    };

    // -------------------------------------------------------------------------
@@ -178,7 +185,7 @@ public class SqliteSchemaInitializationAdapter implements PersistenceSchemaIniti
    };

    // -------------------------------------------------------------------------
    // M4→current-schema status migration
    // Legacy-state status migration
    // -------------------------------------------------------------------------

    /**
@@ -229,7 +236,8 @@ public class SqliteSchemaInitializationAdapter implements PersistenceSchemaIniti
     *   <li>Create {@code document_record} table (if not exists).</li>
     *   <li>Create {@code processing_attempt} table (if not exists).</li>
     *   <li>Create all indexes (if not exist).</li>
     *   <li>Add AI-traceability columns to {@code processing_attempt} (idempotent evolution).</li>
     *   <li>Add AI-traceability and provider-identifier columns to {@code processing_attempt}
     *       (idempotent evolution).</li>
     *   <li>Migrate earlier positive intermediate state to {@code READY_FOR_AI} (idempotent).</li>
     * </ol>
     * <p>

@@ -1,5 +1,7 @@
package de.gecheckt.pdf.umbenenner.adapter.out.sqlite;

import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
@@ -93,53 +95,70 @@ public class SqliteUnitOfWorkAdapter implements UnitOfWorkPort {
        }
    }

    /**
     * Wraps a shared transaction connection so that {@code close()} becomes a no-op.
     * <p>
     * Repository adapters manage their own connection lifecycle via try-with-resources,
     * which would close the shared transaction connection prematurely if not wrapped.
     * All other {@link Connection} methods are delegated unchanged to the underlying connection.
     *
     * @param underlying the real shared connection; must not be null
     * @return a proxy connection that ignores {@code close()} calls
     */
    private static Connection nonClosingWrapper(Connection underlying) {
        return (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class<?>[] { Connection.class },
                (proxy, method, args) -> {
                    if ("close".equals(method.getName())) {
                        return null;
                    }
                    try {
                        return method.invoke(underlying, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause();
                    }
                });
    }
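The non-closing wrapper above is an instance of a general dynamic-proxy pattern. As a minimal, self-contained sketch of the same idea — using AutoCloseable instead of Connection so it runs without a database; the class name is illustrative and not part of this repository:

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

public class NonClosingProxySketch {
    /** Returns a proxy that swallows close() and delegates every other call. */
    static AutoCloseable nonClosing(AutoCloseable underlying) {
        return (AutoCloseable) Proxy.newProxyInstance(
                AutoCloseable.class.getClassLoader(),
                new Class<?>[] { AutoCloseable.class },
                (proxy, method, args) -> {
                    if ("close".equals(method.getName())) {
                        return null; // no-op: keep the shared resource open
                    }
                    try {
                        return method.invoke(underlying, args);
                    } catch (InvocationTargetException e) {
                        // unwrap so callers see the original exception type
                        throw e.getCause();
                    }
                });
    }
}
```

A try-with-resources block around the proxy then leaves the underlying resource untouched, which is exactly why the adapter hands such a wrapper to the repository classes.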

    private class TransactionOperationsImpl implements TransactionOperations {
        private final Connection connection;

        TransactionOperationsImpl(Connection connection) {
            this.connection = connection;
        }

        @Override
        public void saveProcessingAttempt(ProcessingAttempt attempt) {
            // Repository methods declare DocumentPersistenceException as the only thrown exception.
            // Any other exception (NullPointerException, etc.) will propagate to the outer try-catch
            // and be caught there.
            SqliteProcessingAttemptRepositoryAdapter repo =
                    new SqliteProcessingAttemptRepositoryAdapter(jdbcUrl) {
                        @Override
                        protected Connection getConnection() throws SQLException {
                            return connection;
                            return nonClosingWrapper(connection);
                        }
                    };
            repo.save(attempt);
        }

        @Override
        public void createDocumentRecord(DocumentRecord record) {
            // Repository methods declare DocumentPersistenceException as the only thrown exception.
            // Any other exception (NullPointerException, etc.) will propagate to the outer try-catch
            // and be caught there.
            SqliteDocumentRecordRepositoryAdapter repo =
                    new SqliteDocumentRecordRepositoryAdapter(jdbcUrl) {
                        @Override
                        protected Connection getConnection() throws SQLException {
                            return connection;
                            return nonClosingWrapper(connection);
                        }
                    };
            repo.create(record);
        }

        @Override
        public void updateDocumentRecord(DocumentRecord record) {
            // Repository methods declare DocumentPersistenceException as the only thrown exception.
            // Any other exception (NullPointerException, etc.) will propagate to the outer try-catch
            // and be caught there.
            SqliteDocumentRecordRepositoryAdapter repo =
                    new SqliteDocumentRecordRepositoryAdapter(jdbcUrl) {
                        @Override
                        protected Connection getConnection() throws SQLException {
                            return connection;
                            return nonClosingWrapper(connection);
                        }
                    };
            repo.update(record);

@@ -1,13 +1,5 @@
package de.gecheckt.pdf.umbenenner.adapter.out.targetcopy;

import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyPort;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyResult;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopySuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyTechnicalFailure;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
@@ -16,6 +8,15 @@ import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Objects;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyPort;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyResult;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopySuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyTechnicalFailure;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

/**
 * Filesystem-based implementation of {@link TargetFileCopyPort}.
 * <p>

@@ -0,0 +1,24 @@
/**
 * Outbound adapter for writing the target file copy.
 * <p>
 * Components:
 * <ul>
 *   <li>{@link de.gecheckt.pdf.umbenenner.adapter.out.targetcopy.FilesystemTargetFileCopyAdapter}
 *       — Filesystem-based implementation of
 *       {@link de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyPort}.</li>
 * </ul>
 * <p>
 * The adapter uses a two-step write pattern: the source is first copied to a temporary
 * file ({@code resolvedFilename + ".tmp"}) in the target folder, then renamed/moved to
 * the final filename. An atomic move is attempted first; a standard move is used as a
 * fallback when the filesystem does not support atomic cross-directory moves.
 * <p>
 * <strong>Source integrity:</strong> The source file is never modified, moved, or deleted.
 * Only a copy is created in the target folder.
 * <p>
 * <strong>Architecture boundary:</strong> All NIO ({@code Path}, {@code Files}) operations
 * are strictly confined to this package. The port interface
 * {@link de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyPort} contains no
 * filesystem types, preserving the hexagonal architecture boundary.
 */
package de.gecheckt.pdf.umbenenner.adapter.out.targetcopy;
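The two-step write pattern described in this package Javadoc can be sketched as a small standalone helper; the class and method names here are illustrative assumptions, not the adapter's actual API:

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class TwoStepCopySketch {
    /** Copies source into targetFolder under resolvedFilename via a ".tmp" intermediate. */
    static void copyToTarget(Path source, Path targetFolder, String resolvedFilename) throws IOException {
        Path tmp = targetFolder.resolve(resolvedFilename + ".tmp");
        Path finalPath = targetFolder.resolve(resolvedFilename);
        // Step 1: write the full content to a temporary file in the target folder.
        Files.copy(source, tmp, StandardCopyOption.REPLACE_EXISTING);
        try {
            // Step 2: rename to the final name, atomically where supported.
            Files.move(tmp, finalPath, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // Fallback for filesystems without atomic move support.
            Files.move(tmp, finalPath, StandardCopyOption.REPLACE_EXISTING);
        }
        // The source file is never touched: only a copy is created.
    }
}
```

Writing to a ".tmp" name first means a crash mid-copy never leaves a partially written file under the final name.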

@@ -1,17 +1,18 @@
package de.gecheckt.pdf.umbenenner.adapter.out.targetfolder;

import de.gecheckt.pdf.umbenenner.application.port.out.ResolvedTargetFilename;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFilenameResolutionResult;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFolderPort;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFolderTechnicalFailure;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Objects;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import de.gecheckt.pdf.umbenenner.application.port.out.ResolvedTargetFilename;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFilenameResolutionResult;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFolderPort;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFolderTechnicalFailure;

/**
 * Filesystem-based implementation of {@link TargetFolderPort}.
 * <p>

@@ -0,0 +1,26 @@
/**
 * Outbound adapter for target folder management and unique filename resolution.
 * <p>
 * Components:
 * <ul>
 *   <li>{@link de.gecheckt.pdf.umbenenner.adapter.out.targetfolder.FilesystemTargetFolderAdapter}
 *       — Filesystem-based implementation of
 *       {@link de.gecheckt.pdf.umbenenner.application.port.out.TargetFolderPort}.</li>
 * </ul>
 * <p>
 * <strong>Duplicate resolution:</strong> Given a base name such as
 * {@code 2024-01-15 - Rechnung.pdf}, the adapter checks whether the file exists in the
 * target folder and appends a numeric suffix ({@code (1)}, {@code (2)}, …) directly
 * before {@code .pdf} until a free name is found. The 20-character base-title limit
 * does not apply to the suffix.
 * <p>
 * <strong>Rollback support:</strong> The adapter provides a best-effort deletion method
 * used by the application layer to remove a successfully written target copy when
 * subsequent persistence fails, preventing orphaned target files.
 * <p>
 * <strong>Architecture boundary:</strong> All NIO ({@code Path}, {@code Files}) operations
 * are strictly confined to this package. The port interface
 * {@link de.gecheckt.pdf.umbenenner.application.port.out.TargetFolderPort} contains no
 * filesystem types, preserving the hexagonal architecture boundary.
 */
package de.gecheckt.pdf.umbenenner.adapter.out.targetfolder;
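The duplicate-resolution rule described above (numeric suffix directly before {@code .pdf}) can be sketched as a standalone helper; the class name and the exact spacing of the suffix are assumptions for illustration, not the adapter's actual implementation:

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class UniqueNameSketch {
    /** Returns baseName if free in targetFolder, else appends " (1)", " (2)", ... before ".pdf". */
    static String resolveUnique(Path targetFolder, String baseName) {
        if (!Files.exists(targetFolder.resolve(baseName))) {
            return baseName;
        }
        String stem = baseName.substring(0, baseName.length() - ".pdf".length());
        int n = 1;
        // The 20-character base-title limit does not constrain the suffix,
        // so the resolved name may exceed that limit.
        while (Files.exists(targetFolder.resolve(stem + " (" + n + ").pdf"))) {
            n++;
        }
        return stem + " (" + n + ").pdf";
    }
}
```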

@@ -0,0 +1,221 @@
package de.gecheckt.pdf.umbenenner.adapter.out.ai;

import static org.assertj.core.api.Assertions.assertThat;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.doReturn;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.util.List;
import java.util.UUID;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.font.Standard14Fonts;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.junit.jupiter.api.io.TempDir;
import org.mockito.junit.jupiter.MockitoExtension;

import de.gecheckt.pdf.umbenenner.adapter.out.clock.SystemClockAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.fingerprint.Sha256FingerprintAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.lock.FilesystemRunLockPortAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.pdfextraction.PdfTextExtractionPortAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.prompt.FilesystemPromptPortAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.sourcedocument.SourceDocumentCandidatesPortAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.sqlite.SqliteDocumentRecordRepositoryAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.sqlite.SqliteProcessingAttemptRepositoryAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.sqlite.SqliteSchemaInitializationAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.sqlite.SqliteUnitOfWorkAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.targetcopy.FilesystemTargetFileCopyAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.targetfolder.FilesystemTargetFolderAdapter;
import de.gecheckt.pdf.umbenenner.application.config.RuntimeConfiguration;
import de.gecheckt.pdf.umbenenner.application.config.provider.ProviderConfiguration;
import de.gecheckt.pdf.umbenenner.application.port.out.AiContentSensitivity;
import de.gecheckt.pdf.umbenenner.application.port.out.FingerprintSuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingAttempt;
import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingLogger;
import de.gecheckt.pdf.umbenenner.application.service.AiNamingService;
import de.gecheckt.pdf.umbenenner.application.service.AiResponseValidator;
import de.gecheckt.pdf.umbenenner.application.service.DocumentProcessingCoordinator;
import de.gecheckt.pdf.umbenenner.application.usecase.DefaultBatchRunProcessingUseCase;
import de.gecheckt.pdf.umbenenner.domain.model.BatchRunContext;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.RunId;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

/**
 * Integration test verifying that the Anthropic Claude adapter integrates correctly
 * with the full batch processing pipeline and that the provider identifier
 * {@code "claude"} is persisted in the processing attempt history.
 * <p>
 * Uses a mocked HTTP client to simulate the Anthropic API without real network calls.
 * All other adapters (SQLite, filesystem, PDF extraction, fingerprinting) are real
 * production implementations.
 */
@ExtendWith(MockitoExtension.class)
@DisplayName("AnthropicClaudeAdapter integration")
class AnthropicClaudeAdapterIntegrationTest {

    /**
     * Pflicht-Testfall 15: claudeProviderIdentifierLandsInAttemptHistory
     * <p>
     * Verifies the end-to-end integration: the Claude adapter with a mocked HTTP layer
     * is wired into the batch pipeline, and after a successful run, the processing attempt
     * record contains {@code ai_provider='claude'}.
     */
    @Test
    @DisplayName("claudeProviderIdentifierLandsInAttemptHistory: ai_provider=claude in attempt history after successful run")
    void claudeProviderIdentifierLandsInAttemptHistory(@TempDir Path tempDir) throws Exception {
        // --- Infrastructure setup ---
        Path sourceFolder = Files.createDirectories(tempDir.resolve("source"));
        Path targetFolder = Files.createDirectories(tempDir.resolve("target"));
        Path promptFile = tempDir.resolve("prompt.txt");
        Files.writeString(promptFile, "Analysiere das Dokument und liefere JSON.");

        String jdbcUrl = "jdbc:sqlite:" + tempDir.resolve("test.db")
                .toAbsolutePath().toString().replace('\\', '/');
        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();

        // --- Create a searchable PDF in the source folder ---
        Path pdfPath = sourceFolder.resolve("testdokument.pdf");
        createSearchablePdf(pdfPath, "Testinhalt Rechnung Datum 15.01.2024 Betrag 99 EUR");

        // --- Compute fingerprint for later verification ---
        Sha256FingerprintAdapter fingerprintAdapter = new Sha256FingerprintAdapter();
        SourceDocumentCandidate candidate = new SourceDocumentCandidate(
                pdfPath.getFileName().toString(), 0L,
                new SourceDocumentLocator(pdfPath.toAbsolutePath().toString()));
        DocumentFingerprint fingerprint = switch (fingerprintAdapter.computeFingerprint(candidate)) {
            case FingerprintSuccess s -> s.fingerprint();
            default -> throw new IllegalStateException("Fingerprint computation failed");
        };

        // --- Mock the HTTP client for the Claude adapter ---
        HttpClient mockHttpClient = mock(HttpClient.class);
        // Build a valid Anthropic response with the NamingProposal JSON as text content
        String namingProposalJson =
                "{\\\"date\\\":\\\"2024-01-15\\\",\\\"title\\\":\\\"Testrechnung\\\","
                + "\\\"reasoning\\\":\\\"Rechnung vom 15.01.2024\\\"}";
        String anthropicResponseBody = "{"
                + "\"id\":\"msg_integration_test\","
                + "\"type\":\"message\","
                + "\"role\":\"assistant\","
                + "\"content\":[{\"type\":\"text\",\"text\":\"" + namingProposalJson + "\"}],"
                + "\"stop_reason\":\"end_turn\""
                + "}";

        HttpResponse<String> mockHttpResponse = mockStringResponse(200, anthropicResponseBody);
        doReturn(mockHttpResponse).when(mockHttpClient).send(any(HttpRequest.class), any());

        // --- Create the Claude adapter with the mocked HTTP client ---
        ProviderConfiguration claudeConfig = new ProviderConfiguration(
                "claude-3-5-sonnet-20241022", 60, "https://api.anthropic.com", "sk-ant-test");
        AnthropicClaudeHttpAdapter claudeAdapter =
                new AnthropicClaudeHttpAdapter(claudeConfig, mockHttpClient);

        // --- Wire the full pipeline with provider identifier "claude" ---
        SqliteDocumentRecordRepositoryAdapter documentRepo =
                new SqliteDocumentRecordRepositoryAdapter(jdbcUrl);
        SqliteProcessingAttemptRepositoryAdapter attemptRepo =
                new SqliteProcessingAttemptRepositoryAdapter(jdbcUrl);
        SqliteUnitOfWorkAdapter unitOfWork = new SqliteUnitOfWorkAdapter(jdbcUrl);

        ProcessingLogger noOpLogger = new NoOpProcessingLogger();
        DocumentProcessingCoordinator coordinator = new DocumentProcessingCoordinator(
                documentRepo, attemptRepo, unitOfWork,
                new FilesystemTargetFolderAdapter(targetFolder),
                new FilesystemTargetFileCopyAdapter(targetFolder),
                noOpLogger,
                3,
                "claude"); // provider identifier for Claude

        AiNamingService aiNamingService = new AiNamingService(
                claudeAdapter,
                new FilesystemPromptPortAdapter(promptFile),
                new AiResponseValidator(new SystemClockAdapter()),
                "claude-3-5-sonnet-20241022",
                10_000);

        DefaultBatchRunProcessingUseCase useCase = new DefaultBatchRunProcessingUseCase(
                new RuntimeConfiguration(50, 3, AiContentSensitivity.PROTECT_SENSITIVE_CONTENT),
                new FilesystemRunLockPortAdapter(tempDir.resolve("run.lock")),
                new SourceDocumentCandidatesPortAdapter(sourceFolder),
                new PdfTextExtractionPortAdapter(),
                fingerprintAdapter,
                coordinator,
                aiNamingService,
                noOpLogger);

        // --- Run the batch ---
        BatchRunContext context = new BatchRunContext(
                new RunId(UUID.randomUUID().toString()), Instant.now());
        useCase.execute(context);

        // --- Verify: ai_provider='claude' is stored in the attempt history ---
        List<ProcessingAttempt> attempts = attemptRepo.findAllByFingerprint(fingerprint);
        assertThat(attempts)
                .as("At least one attempt must be recorded")
                .isNotEmpty();
        assertThat(attempts.get(0).aiProvider())
                .as("Provider identifier must be 'claude' in the attempt history")
                .isEqualTo("claude");
    }

    // =========================================================================
    // Helpers
    // =========================================================================

    /**
     * Creates a typed mock {@link HttpResponse} to avoid unchecked-cast warnings at call sites.
     * The suppression is confined to this helper because the raw-type cast is technically
     * unavoidable due to type erasure when mocking generic interfaces.
     */
    @SuppressWarnings("unchecked")
    private static HttpResponse<String> mockStringResponse(int statusCode, String body) {
        HttpResponse<String> response = (HttpResponse<String>) mock(HttpResponse.class);
        when(response.statusCode()).thenReturn(statusCode);
        when(response.body()).thenReturn(body);
        return response;
    }

    /**
     * Creates a single-page searchable PDF with embedded text using PDFBox.
     */
    private static void createSearchablePdf(Path pdfPath, String text) throws Exception {
        try (PDDocument doc = new PDDocument()) {
            PDPage page = new PDPage();
            doc.addPage(page);
            try (PDPageContentStream cs = new PDPageContentStream(doc, page)) {
                cs.beginText();
                cs.setFont(new PDType1Font(Standard14Fonts.FontName.HELVETICA), 12);
                cs.newLineAtOffset(50, 700);
                cs.showText(text);
                cs.endText();
            }
            doc.save(pdfPath.toFile());
        }
    }

    /**
     * No-op implementation of {@link ProcessingLogger} for use in integration tests
     * where log output is not relevant to the assertion.
     */
    private static class NoOpProcessingLogger implements ProcessingLogger {
        @Override public void info(String message, Object... args) {}
        @Override public void debug(String message, Object... args) {}
        @Override public void warn(String message, Object... args) {}
        @Override public void error(String message, Object... args) {}
        @Override public void debugSensitiveAiContent(String message, Object... args) {}
    }
}

@@ -0,0 +1,702 @@
package de.gecheckt.pdf.umbenenner.adapter.out.ai;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.doReturn;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import java.net.ConnectException;
import java.net.UnknownHostException;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;

import org.json.JSONArray;
import org.json.JSONObject;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.ArgumentCaptor;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;

import de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException;
import de.gecheckt.pdf.umbenenner.adapter.out.configuration.MultiProviderConfigurationValidator;
import de.gecheckt.pdf.umbenenner.application.config.provider.AiProviderFamily;
import de.gecheckt.pdf.umbenenner.application.config.provider.MultiProviderConfiguration;
import de.gecheckt.pdf.umbenenner.application.config.provider.ProviderConfiguration;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationResult;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationSuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationTechnicalFailure;
import de.gecheckt.pdf.umbenenner.domain.model.AiRequestRepresentation;
import de.gecheckt.pdf.umbenenner.domain.model.PromptIdentifier;

/**
 * Unit tests for {@link AnthropicClaudeHttpAdapter}.
 * <p>
 * Tests inject a mock {@link HttpClient} via the package-private constructor
 * to exercise the adapter path without requiring network access.
 * Configuration is supplied via {@link ProviderConfiguration}.
 * <p>
 * Covered scenarios:
 * <ul>
 * <li>Correct HTTP request structure (URL, method, headers, body)</li>
 * <li>API key resolution (env var vs. properties value)</li>
 * <li>Configuration validation for missing API key</li>
 * <li>Single and multiple text-block extraction from the Anthropic response</li>
 * <li>Ignoring non-text content blocks</li>
 * <li>Technical failure when no text blocks are present</li>
 * <li>HTTP 4xx (401, 429) and 5xx (500) mapped to technical failure</li>
 * <li>Timeout mapped to technical failure</li>
 * <li>Unparseable JSON response mapped to technical failure</li>
 * </ul>
 */
@ExtendWith(MockitoExtension.class)
@DisplayName("AnthropicClaudeHttpAdapter")
class AnthropicClaudeHttpAdapterTest {

    private static final String API_BASE_URL = "https://api.anthropic.com";
    private static final String API_MODEL = "claude-3-5-sonnet-20241022";
    private static final String API_KEY = "sk-ant-test-key-12345";
    private static final int TIMEOUT_SECONDS = 60;

    @Mock
    private HttpClient httpClient;

    private ProviderConfiguration testConfiguration;
    private AnthropicClaudeHttpAdapter adapter;

    @BeforeEach
    void setUp() {
        testConfiguration = new ProviderConfiguration(API_MODEL, TIMEOUT_SECONDS, API_BASE_URL, API_KEY);
        adapter = new AnthropicClaudeHttpAdapter(testConfiguration, httpClient);
    }

    // =========================================================================
    // Pflicht-Testfall 1: claudeAdapterBuildsCorrectRequest
    // =========================================================================

    /**
     * Verifies that the adapter constructs the correct HTTP request:
     * URL with {@code /v1/messages} path, method POST, all three required headers
     * ({@code x-api-key}, {@code anthropic-version}, {@code content-type}), and
     * a body with {@code model}, {@code max_tokens > 0}, and {@code messages} containing
     * exactly one user message with the document text.
     */
    @Test
    @DisplayName("claudeAdapterBuildsCorrectRequest: correct URL, method, headers, and body")
    void claudeAdapterBuildsCorrectRequest() throws Exception {
        HttpResponse<String> httpResponse = mockHttpResponse(200, buildAnthropicSuccessResponse(
                "{\"date\":\"2024-01-15\",\"title\":\"Testititel\",\"reasoning\":\"Test\"}"));
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiRequestRepresentation request = createTestRequest("System-Prompt", "Dokumenttext");
        adapter.invoke(request);

        ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
        verify(httpClient).send(requestCaptor.capture(), any());
        HttpRequest capturedRequest = requestCaptor.getValue();

        // URL must point to /v1/messages
        assertThat(capturedRequest.uri().toString())
                .as("URL must be based on configured baseUrl")
                .startsWith(API_BASE_URL)
                .endsWith("/v1/messages");

        // Method must be POST
        assertThat(capturedRequest.method()).isEqualTo("POST");

        // All three required headers must be present
        assertThat(capturedRequest.headers().firstValue("x-api-key"))
                .as("x-api-key header must be present")
                .isPresent();
        assertThat(capturedRequest.headers().firstValue("anthropic-version"))
                .as("anthropic-version header must be present")
                .isPresent()
                .hasValue("2023-06-01");
        assertThat(capturedRequest.headers().firstValue("content-type"))
                .as("content-type header must be present")
                .isPresent();

        // Body must contain model, max_tokens > 0, and messages with one user message
        String sentBody = adapter.getLastBuiltJsonBodyForTesting();
        JSONObject body = new JSONObject(sentBody);
        assertThat(body.getString("model"))
                .as("model must match configuration")
                .isEqualTo(API_MODEL);
        assertThat(body.getInt("max_tokens"))
                .as("max_tokens must be positive")
                .isGreaterThan(0);
        assertThat(body.getJSONArray("messages").length())
                .as("messages must contain exactly one entry")
                .isEqualTo(1);
        assertThat(body.getJSONArray("messages").getJSONObject(0).getString("role"))
                .as("the single message must be a user message")
                .isEqualTo("user");
        assertThat(body.getJSONArray("messages").getJSONObject(0).getString("content"))
                .as("user message content must be the document text")
                .isEqualTo("Dokumenttext");
    }

    // =========================================================================
    // Pflicht-Testfall 2: claudeAdapterUsesEnvVarApiKey
    // =========================================================================

    /**
     * Verifies that when the {@code ANTHROPIC_API_KEY} environment variable is the source
     * of the resolved API key (represented in ProviderConfiguration after env-var precedence
     * was applied by the configuration layer), the adapter uses that key in the
     * {@code x-api-key} header.
     */
    @Test
    @DisplayName("claudeAdapterUsesEnvVarApiKey: env var value reaches x-api-key header")
    void claudeAdapterUsesEnvVarApiKey() throws Exception {
        String envVarValue = "sk-ant-from-env-variable";
        // Env var takes precedence: the configuration layer resolves this into apiKey
        ProviderConfiguration configWithEnvKey = new ProviderConfiguration(
                API_MODEL, TIMEOUT_SECONDS, API_BASE_URL, envVarValue);
        AnthropicClaudeHttpAdapter adapterWithEnvKey =
                new AnthropicClaudeHttpAdapter(configWithEnvKey, httpClient);

        HttpResponse<String> httpResponse = mockHttpResponse(200,
                buildAnthropicSuccessResponse("{\"title\":\"T\",\"reasoning\":\"R\"}"));
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        adapterWithEnvKey.invoke(createTestRequest("prompt", "doc"));

        ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
        verify(httpClient).send(requestCaptor.capture(), any());
        assertThat(requestCaptor.getValue().headers().firstValue("x-api-key"))
                .as("x-api-key header must contain the env var value")
                .hasValue(envVarValue);
    }

    // =========================================================================
    // Pflicht-Testfall 3: claudeAdapterFallsBackToPropertiesApiKey
    // =========================================================================

    /**
     * Verifies that when no environment variable is set, the API key from the
     * properties configuration is used in the {@code x-api-key} header.
     */
    @Test
    @DisplayName("claudeAdapterFallsBackToPropertiesApiKey: properties key reaches x-api-key header")
    void claudeAdapterFallsBackToPropertiesApiKey() throws Exception {
        String propertiesKey = "sk-ant-from-properties";
        ProviderConfiguration configWithPropertiesKey = new ProviderConfiguration(
                API_MODEL, TIMEOUT_SECONDS, API_BASE_URL, propertiesKey);
        AnthropicClaudeHttpAdapter adapterWithPropertiesKey =
                new AnthropicClaudeHttpAdapter(configWithPropertiesKey, httpClient);

        HttpResponse<String> httpResponse = mockHttpResponse(200,
                buildAnthropicSuccessResponse("{\"title\":\"T\",\"reasoning\":\"R\"}"));
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        adapterWithPropertiesKey.invoke(createTestRequest("prompt", "doc"));

        ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
        verify(httpClient).send(requestCaptor.capture(), any());
        assertThat(requestCaptor.getValue().headers().firstValue("x-api-key"))
                .as("x-api-key header must contain the properties value")
                .hasValue(propertiesKey);
    }

    // =========================================================================
    // Pflicht-Testfall 4: claudeAdapterFailsValidationWhenBothKeysMissing
    // =========================================================================

    /**
     * Verifies that when both the environment variable and the properties API key for the
     * Claude provider are empty, the {@link MultiProviderConfigurationValidator} rejects the
     * configuration with an {@link InvalidStartConfigurationException}.
     * <p>
     * This confirms that the adapter is protected by startup validation (from AP-001)
     * and will never be constructed with a truly missing API key in production.
     */
    @Test
    @DisplayName("claudeAdapterFailsValidationWhenBothKeysMissing: validator rejects empty API key for Claude")
    void claudeAdapterFailsValidationWhenBothKeysMissing() {
        // Simulate both env var and properties key being absent (empty resolved key)
        ProviderConfiguration claudeConfigWithoutKey = new ProviderConfiguration(
                API_MODEL, TIMEOUT_SECONDS, API_BASE_URL, "");
        ProviderConfiguration inactiveOpenAiConfig = new ProviderConfiguration(
                "unused-model", 0, null, null);
        MultiProviderConfiguration config = new MultiProviderConfiguration(
                AiProviderFamily.CLAUDE, inactiveOpenAiConfig, claudeConfigWithoutKey);

        MultiProviderConfigurationValidator validator = new MultiProviderConfigurationValidator();

        assertThatThrownBy(() -> validator.validate(config))
                .as("Validator must reject Claude configuration with empty API key")
                .isInstanceOf(InvalidStartConfigurationException.class);
    }

    // =========================================================================
    // Pflicht-Testfall 5: claudeAdapterParsesSingleTextBlock
    // =========================================================================

    /**
     * Verifies that a response with a single text block is correctly extracted.
     */
    @Test
    @DisplayName("claudeAdapterParsesSingleTextBlock: single text block becomes raw response")
    void claudeAdapterParsesSingleTextBlock() throws Exception {
        String blockText = "{\"date\":\"2024-01-15\",\"title\":\"Rechnung\",\"reasoning\":\"Test\"}";
        String responseBody = buildAnthropicSuccessResponse(blockText);
        HttpResponse<String> httpResponse = mockHttpResponse(200, responseBody);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationSuccess.class);
        AiInvocationSuccess success = (AiInvocationSuccess) result;
        assertThat(success.rawResponse().content())
                .as("Raw response must equal the text block content")
                .isEqualTo(blockText);
    }

    // =========================================================================
    // Pflicht-Testfall 6: claudeAdapterConcatenatesMultipleTextBlocks
    // =========================================================================

    /**
     * Verifies that multiple text blocks are concatenated in order.
     */
    @Test
    @DisplayName("claudeAdapterConcatenatesMultipleTextBlocks: text blocks are concatenated in order")
    void claudeAdapterConcatenatesMultipleTextBlocks() throws Exception {
        String part1 = "Erster Teil der Antwort. ";
        String part2 = "Zweiter Teil der Antwort.";

        // Build the response using JSONObject to ensure correct escaping
        JSONObject block1 = new JSONObject();
        block1.put("type", "text");
        block1.put("text", part1);
        JSONObject block2 = new JSONObject();
        block2.put("type", "text");
        block2.put("text", part2);
        JSONObject responseJson = new JSONObject();
        responseJson.put("id", "msg_test");
        responseJson.put("type", "message");
        responseJson.put("role", "assistant");
        responseJson.put("content", new JSONArray().put(block1).put(block2));
        responseJson.put("stop_reason", "end_turn");

        HttpResponse<String> httpResponse = mockHttpResponse(200, responseJson.toString());
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationSuccess.class);
        assertThat(((AiInvocationSuccess) result).rawResponse().content())
                .as("Multiple text blocks must be concatenated in order")
                .isEqualTo(part1 + part2);
    }

    // =========================================================================
    // Pflicht-Testfall 7: claudeAdapterIgnoresNonTextBlocks
    // =========================================================================

    /**
     * Verifies that non-text content blocks (e.g., tool_use) are ignored and only
     * the text blocks contribute to the raw response.
     */
    @Test
    @DisplayName("claudeAdapterIgnoresNonTextBlocks: only text-type blocks contribute to response")
    void claudeAdapterIgnoresNonTextBlocks() throws Exception {
        String textContent = "Nur dieser Text zaehlt als Antwort.";

        // Build response with a tool_use block before and a tool_result-like block after the text block
        JSONObject toolUseBlock = new JSONObject();
        toolUseBlock.put("type", "tool_use");
        toolUseBlock.put("id", "tool_1");
        toolUseBlock.put("name", "get_weather");
        toolUseBlock.put("input", new JSONObject());

        JSONObject textBlock = new JSONObject();
        textBlock.put("type", "text");
        textBlock.put("text", textContent);

        JSONObject ignoredBlock = new JSONObject();
        ignoredBlock.put("type", "tool_result");
        ignoredBlock.put("content", "irrelevant");

        JSONObject responseJson = new JSONObject();
        responseJson.put("id", "msg_test");
        responseJson.put("type", "message");
        responseJson.put("role", "assistant");
        responseJson.put("content", new JSONArray().put(toolUseBlock).put(textBlock).put(ignoredBlock));
        responseJson.put("stop_reason", "end_turn");

        HttpResponse<String> httpResponse = mockHttpResponse(200, responseJson.toString());
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationSuccess.class);
        assertThat(((AiInvocationSuccess) result).rawResponse().content())
                .as("Only text-type blocks must contribute to the raw response")
                .isEqualTo(textContent);
    }

    // =========================================================================
    // Pflicht-Testfall 8: claudeAdapterFailsOnEmptyTextContent
    // =========================================================================

    /**
     * Verifies that a response with no text-type content blocks results in a
     * technical failure.
     */
    @Test
    @DisplayName("claudeAdapterFailsOnEmptyTextContent: no text blocks yields technical failure")
    void claudeAdapterFailsOnEmptyTextContent() throws Exception {
        String noTextBlockResponse = "{"
                + "\"id\":\"msg_test\","
                + "\"type\":\"message\","
                + "\"role\":\"assistant\","
                + "\"content\":["
                + "{\"type\":\"tool_use\",\"id\":\"tool_1\",\"name\":\"unused\",\"input\":{}}"
                + "],"
                + "\"stop_reason\":\"tool_use\""
                + "}";

        HttpResponse<String> httpResponse = mockHttpResponse(200, noTextBlockResponse);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason())
                .isEqualTo("NO_TEXT_CONTENT");
    }

    // =========================================================================
    // Pflicht-Testfall 9: claudeAdapterMapsHttp401AsTechnical
    // =========================================================================

    /**
     * Verifies that HTTP 401 (Unauthorized) is classified as a technical failure.
     */
    @Test
    @DisplayName("claudeAdapterMapsHttp401AsTechnical: HTTP 401 yields technical failure")
    void claudeAdapterMapsHttp401AsTechnical() throws Exception {
        HttpResponse<String> httpResponse = mockHttpResponse(401, null);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("HTTP_401");
    }

    // =========================================================================
    // Pflicht-Testfall 10: claudeAdapterMapsHttp429AsTechnical
    // =========================================================================

    /**
     * Verifies that HTTP 429 (Too Many Requests, i.e. rate limiting) is classified as a technical failure.
     */
    @Test
    @DisplayName("claudeAdapterMapsHttp429AsTechnical: HTTP 429 yields technical failure")
    void claudeAdapterMapsHttp429AsTechnical() throws Exception {
        HttpResponse<String> httpResponse = mockHttpResponse(429, null);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("HTTP_429");
    }

    // =========================================================================
    // Pflicht-Testfall 11: claudeAdapterMapsHttp500AsTechnical
    // =========================================================================

    /**
     * Verifies that HTTP 500 (Internal Server Error) is classified as a technical failure.
     */
    @Test
    @DisplayName("claudeAdapterMapsHttp500AsTechnical: HTTP 500 yields technical failure")
    void claudeAdapterMapsHttp500AsTechnical() throws Exception {
        HttpResponse<String> httpResponse = mockHttpResponse(500, null);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("HTTP_500");
    }

    // =========================================================================
    // Pflicht-Testfall 12: claudeAdapterMapsTimeoutAsTechnical
    // =========================================================================

    /**
     * Verifies that a simulated HTTP timeout results in a technical failure with
     * reason {@code TIMEOUT}.
     */
    @Test
    @DisplayName("claudeAdapterMapsTimeoutAsTechnical: timeout yields TIMEOUT technical failure")
    void claudeAdapterMapsTimeoutAsTechnical() throws Exception {
        when(httpClient.send(any(HttpRequest.class), any()))
                .thenThrow(new HttpTimeoutException("Connection timed out"));

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("TIMEOUT");
    }

    // =========================================================================
    // Pflicht-Testfall 13: claudeAdapterMapsUnparseableJsonAsTechnical
    // =========================================================================

    /**
     * Verifies that a non-JSON response body (e.g., an HTML error page or plain text)
     * returned with HTTP 200 results in a technical failure.
     */
    @Test
    @DisplayName("claudeAdapterMapsUnparseableJsonAsTechnical: non-JSON body yields technical failure")
    void claudeAdapterMapsUnparseableJsonAsTechnical() throws Exception {
        HttpResponse<String> httpResponse = mockHttpResponse(200,
                "<html><body>Service unavailable</body></html>");
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("UNPARSEABLE_JSON");
    }

    // =========================================================================
    // Additional behavioral tests
    // =========================================================================

    @Test
    @DisplayName("should use configured model in request body")
    void testConfiguredModelIsUsedInRequestBody() throws Exception {
        HttpResponse<String> httpResponse = mockHttpResponse(200,
                buildAnthropicSuccessResponse("{\"title\":\"T\",\"reasoning\":\"R\"}"));
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        adapter.invoke(createTestRequest("prompt", "doc"));

        String sentBody = adapter.getLastBuiltJsonBodyForTesting();
        assertThat(new JSONObject(sentBody).getString("model")).isEqualTo(API_MODEL);
    }

    @Test
    @DisplayName("should use configured timeout in request")
    void testConfiguredTimeoutIsUsedInRequest() throws Exception {
        HttpResponse<String> httpResponse = mockHttpResponse(200,
                buildAnthropicSuccessResponse("{\"title\":\"T\",\"reasoning\":\"R\"}"));
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        adapter.invoke(createTestRequest("prompt", "doc"));

        ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
        verify(httpClient).send(requestCaptor.capture(), any());
        assertThat(requestCaptor.getValue().timeout())
                .isPresent()
                .get()
                .isEqualTo(Duration.ofSeconds(TIMEOUT_SECONDS));
    }

    @Test
    @DisplayName("should place prompt content in system field and document text in user message")
    void testPromptContentGoesToSystemFieldDocumentTextToUserMessage() throws Exception {
        HttpResponse<String> httpResponse = mockHttpResponse(200,
                buildAnthropicSuccessResponse("{\"title\":\"T\",\"reasoning\":\"R\"}"));
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        String promptContent = "Du bist ein Assistent zur Dokumentenbenennung.";
        String documentText = "Rechnungstext des Dokuments.";
        adapter.invoke(createTestRequest(promptContent, documentText));

        String sentBody = adapter.getLastBuiltJsonBodyForTesting();
        JSONObject body = new JSONObject(sentBody);

        assertThat(body.getString("system"))
                .as("Prompt content must be placed in the top-level system field")
                .isEqualTo(promptContent);
        assertThat(body.getJSONArray("messages").getJSONObject(0).getString("content"))
                .as("Document text must be placed in the user message content")
                .isEqualTo(documentText);
    }

    @Test
    @DisplayName("should map CONNECTION_ERROR when ConnectException is thrown")
    void testConnectionExceptionIsMappedToConnectionError() throws Exception {
        when(httpClient.send(any(HttpRequest.class), any()))
                .thenThrow(new ConnectException("Connection refused"));

        AiInvocationResult result = adapter.invoke(createTestRequest("p", "d"));

        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("CONNECTION_ERROR");
    }

    @Test
    @DisplayName("should map DNS_ERROR when UnknownHostException is thrown")
    void testUnknownHostExceptionIsMappedToDnsError() throws Exception {
        when(httpClient.send(any(HttpRequest.class), any()))
                .thenThrow(new UnknownHostException("api.anthropic.com"));

        AiInvocationResult result = adapter.invoke(createTestRequest("p", "d"));

        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("DNS_ERROR");
    }

    @Test
    @DisplayName("should throw NullPointerException when request is null")
    void testNullRequestThrowsException() {
        assertThatThrownBy(() -> adapter.invoke(null))
                .isInstanceOf(NullPointerException.class)
                .hasMessageContaining("request must not be null");
    }

    @Test
    @DisplayName("should throw NullPointerException when configuration is null")
    void testNullConfigurationThrowsException() {
        assertThatThrownBy(() -> new AnthropicClaudeHttpAdapter(null, httpClient))
                .isInstanceOf(NullPointerException.class)
                .hasMessageContaining("config must not be null");
    }

    @Test
    @DisplayName("should throw IllegalArgumentException when API model is blank")
    void testBlankApiModelThrowsException() {
        ProviderConfiguration invalidConfig = new ProviderConfiguration(
                " ", TIMEOUT_SECONDS, API_BASE_URL, API_KEY);

        assertThatThrownBy(() -> new AnthropicClaudeHttpAdapter(invalidConfig, httpClient))
                .isInstanceOf(IllegalArgumentException.class)
                .hasMessageContaining("API model must not be null or empty");
    }

    @Test
    @DisplayName("should use default base URL when baseUrl is null")
    void testDefaultBaseUrlUsedWhenNull() throws Exception {
        ProviderConfiguration configWithoutBaseUrl = new ProviderConfiguration(
                API_MODEL, TIMEOUT_SECONDS, null, API_KEY);
        AnthropicClaudeHttpAdapter adapterWithDefault =
                new AnthropicClaudeHttpAdapter(configWithoutBaseUrl, httpClient);

        HttpResponse<String> httpResponse = mockHttpResponse(200,
                buildAnthropicSuccessResponse("{\"title\":\"T\",\"reasoning\":\"R\"}"));
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        adapterWithDefault.invoke(createTestRequest("p", "d"));

        ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
        verify(httpClient).send(requestCaptor.capture(), any());
        assertThat(requestCaptor.getValue().uri().toString())
                .as("Default base URL https://api.anthropic.com must be used when baseUrl is null")
                .startsWith("https://api.anthropic.com");
    }

    /**
     * Verifies that a custom, non-default base URL is used in the request.
     * <p>
     * This test uses a URL that differs from the default {@code https://api.anthropic.com},
     * ensuring the conditional that selects between the configured URL and the default
     * is correctly evaluated. If the conditional were negated, the request would be sent
     * to the default URL instead of the custom one.
     */
    @Test
    @DisplayName("should use custom non-default base URL when provided")
    void customNonDefaultBaseUrlIsUsedInRequest() throws Exception {
        String customBaseUrl = "http://internal.proxy.example.com:8080";
        ProviderConfiguration configWithCustomUrl = new ProviderConfiguration(
                API_MODEL, TIMEOUT_SECONDS, customBaseUrl, API_KEY);
        AnthropicClaudeHttpAdapter adapterWithCustomUrl =
                new AnthropicClaudeHttpAdapter(configWithCustomUrl, httpClient);

        HttpResponse<String> httpResponse = mockHttpResponse(200,
                buildAnthropicSuccessResponse("{\"title\":\"T\",\"reasoning\":\"R\"}"));
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        adapterWithCustomUrl.invoke(createTestRequest("p", "d"));

        ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
        verify(httpClient).send(requestCaptor.capture(), any());
        assertThat(requestCaptor.getValue().uri().toString())
                .as("Custom non-default base URL must be used, not the default api.anthropic.com")
                .startsWith("http://internal.proxy.example.com:8080");
    }

    /**
     * Verifies that a port value of 0 in the base URL is not included in the endpoint URI.
     * <p>
     * {@link java.net.URI#getPort()} returns {@code 0} when the URL explicitly specifies
     * port 0. The endpoint builder must only include the port when it is greater than 0,
     * not when it is equal to 0 or negative.
     */
    @Test
    @DisplayName("should not include port 0 in the endpoint URI")
    void buildEndpointUri_doesNotIncludePortZero() throws Exception {
        ProviderConfiguration configWithPortZero = new ProviderConfiguration(
                API_MODEL, TIMEOUT_SECONDS, "http://example.com:0", API_KEY);
        AnthropicClaudeHttpAdapter adapterWithPortZero =
                new AnthropicClaudeHttpAdapter(configWithPortZero, httpClient);

        HttpResponse<String> httpResponse = mockHttpResponse(200,
                buildAnthropicSuccessResponse("{\"title\":\"T\",\"reasoning\":\"R\"}"));
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        adapterWithPortZero.invoke(createTestRequest("p", "d"));

        ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
        verify(httpClient).send(requestCaptor.capture(), any());
        assertThat(requestCaptor.getValue().uri().toString())
                .as("Port 0 must not appear in the endpoint URI")
                .doesNotContain(":0");
    }

    // =========================================================================
    // Helper methods
    // =========================================================================

    /**
     * Builds a minimal valid Anthropic Messages API response body with a single text block.
     */
    private static String buildAnthropicSuccessResponse(String textContent) {
        // Escape the textContent for embedding in JSON string
        String escaped = textContent
                .replace("\\", "\\\\")
                .replace("\"", "\\\"");
        return "{"
                + "\"id\":\"msg_test\","
                + "\"type\":\"message\","
                + "\"role\":\"assistant\","
                + "\"content\":[{\"type\":\"text\",\"text\":\"" + escaped + "\"}],"
                + "\"stop_reason\":\"end_turn\""
                + "}";
    }

    @SuppressWarnings("unchecked")
    private HttpResponse<String> mockHttpResponse(int statusCode, String body) {
        HttpResponse<String> response = (HttpResponse<String>) mock(HttpResponse.class);
        when(response.statusCode()).thenReturn(statusCode);
        if (body != null) {
            when(response.body()).thenReturn(body);
        }
        return response;
    }

    private AiRequestRepresentation createTestRequest(String promptContent, String documentText) {
        return new AiRequestRepresentation(
                new PromptIdentifier("test-v1"),
                promptContent,
                documentText,
                documentText.length()
        );
    }
}
@@ -1,19 +1,23 @@
package de.gecheckt.pdf.umbenenner.adapter.out.ai;

import static org.assertj.core.api.Assertions.*;
import static org.mockito.ArgumentMatchers.*;
import static org.mockito.Mockito.*;
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.doReturn;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import java.net.ConnectException;
import java.net.URI;
import java.net.UnknownHostException;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.nio.file.Paths;
import java.time.Duration;

import org.json.JSONArray;
import org.json.JSONObject;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;
@@ -22,14 +26,10 @@ import org.mockito.ArgumentCaptor;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;

import org.json.JSONArray;
import org.json.JSONObject;

import de.gecheckt.pdf.umbenenner.application.config.startup.StartConfiguration;
import de.gecheckt.pdf.umbenenner.application.config.provider.ProviderConfiguration;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationResult;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationSuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationTechnicalFailure;
import de.gecheckt.pdf.umbenenner.domain.model.AiRawResponse;
import de.gecheckt.pdf.umbenenner.domain.model.AiRequestRepresentation;
import de.gecheckt.pdf.umbenenner.domain.model.PromptIdentifier;

@@ -39,6 +39,7 @@ import de.gecheckt.pdf.umbenenner.domain.model.PromptIdentifier;
 * <strong>Test strategy:</strong>
 * Tests inject a mock {@link HttpClient} via the package-private constructor
 * to exercise the real HTTP adapter path without requiring network access.
 * Configuration is supplied via {@link ProviderConfiguration}.
 * <p>
 * <strong>Coverage goals:</strong>
 * <ul>
@@ -56,6 +57,8 @@ import de.gecheckt.pdf.umbenenner.domain.model.PromptIdentifier;
 * <li>Effective API key is actually used in the Authorization header</li>
 * <li>Full document text is sent (not truncated)</li>
 * <li>Null request raises NullPointerException</li>
 * <li>Adapter reads all values from ProviderConfiguration (AP-003)</li>
 * <li>Behavioral contracts are unchanged after constructor change (AP-003)</li>
 * </ul>
 */
@ExtendWith(MockitoExtension.class)
@@ -70,27 +73,12 @@ class OpenAiHttpAdapterTest {
    @Mock
    private HttpClient httpClient;

    private StartConfiguration testConfiguration;
    private ProviderConfiguration testConfiguration;
    private OpenAiHttpAdapter adapter;

    @BeforeEach
    void setUp() {
        testConfiguration = new StartConfiguration(
                Paths.get("/source"),
                Paths.get("/target"),
                Paths.get("/db.sqlite"),
                URI.create(API_BASE_URL),
                API_MODEL,
                TIMEOUT_SECONDS,
                5,
                100,
                5000,
                Paths.get("/prompt.txt"),
                Paths.get("/lock"),
                Paths.get("/logs"),
                "INFO",
                API_KEY
        );
        testConfiguration = new ProviderConfiguration(API_MODEL, TIMEOUT_SECONDS, API_BASE_URL, API_KEY);
        // Use the package-private constructor with injected mock HttpClient
        adapter = new OpenAiHttpAdapter(testConfiguration, httpClient);
    }
@@ -101,7 +89,7 @@ class OpenAiHttpAdapterTest {
        // Arrange
        String responseBody = "{\"choices\":[{\"message\":{\"content\":\"test response\"}}]}";
        HttpResponse<String> httpResponse = mockHttpResponse(200, responseBody);
        when(httpClient.send(any(HttpRequest.class), any())).thenReturn((HttpResponse) httpResponse);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");

@@ -120,7 +108,7 @@ class OpenAiHttpAdapterTest {
    void testNon200HttpStatusReturnsTechnicalFailure() throws Exception {
        // Arrange
        HttpResponse<String> httpResponse = mockHttpResponse(500, null);
        when(httpClient.send(any(HttpRequest.class), any())).thenReturn((HttpResponse) httpResponse);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");

@@ -229,7 +217,7 @@ class OpenAiHttpAdapterTest {
    void testConfiguredTimeoutIsUsedInRequest() throws Exception {
        // Arrange
        HttpResponse<String> httpResponse = mockHttpResponse(200, "{}");
        when(httpClient.send(any(HttpRequest.class), any())).thenReturn((HttpResponse) httpResponse);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");

@@ -241,7 +229,6 @@ class OpenAiHttpAdapterTest {
        verify(httpClient).send(requestCaptor.capture(), any());

        HttpRequest capturedRequest = requestCaptor.getValue();
        // Verify the timeout was actually configured on the request
        assertThat(capturedRequest.timeout())
                .as("HttpRequest timeout should be present")
                .isPresent()
@@ -254,7 +241,7 @@ class OpenAiHttpAdapterTest {
    void testConfiguredBaseUrlIsUsedInEndpoint() throws Exception {
        // Arrange
        HttpResponse<String> httpResponse = mockHttpResponse(200, "{}");
        when(httpClient.send(any(HttpRequest.class), any())).thenReturn((HttpResponse) httpResponse);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");

@@ -277,7 +264,7 @@ class OpenAiHttpAdapterTest {
        // Arrange
        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");
        HttpResponse<String> httpResponse = mockHttpResponse(200, "{}");
        when(httpClient.send(any(HttpRequest.class), any())).thenReturn((HttpResponse) httpResponse);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        // Act - invoke to trigger actual request building
        adapter.invoke(request);
@@ -303,7 +290,7 @@ class OpenAiHttpAdapterTest {
    void testEffectiveApiKeyIsUsedInAuthorizationHeader() throws Exception {
        // Arrange
        HttpResponse<String> httpResponse = mockHttpResponse(200, "{}");
        when(httpClient.send(any(HttpRequest.class), any())).thenReturn((HttpResponse) httpResponse);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");

@@ -340,7 +327,7 @@ class OpenAiHttpAdapterTest {
        );

        HttpResponse<String> httpResponse = mockHttpResponse(200, "{}");
        when(httpClient.send(any(HttpRequest.class), any())).thenReturn((HttpResponse) httpResponse);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        // Act - invoke to trigger actual request building
        adapter.invoke(request);
@@ -378,7 +365,7 @@ class OpenAiHttpAdapterTest {
    void testSuccessPreservesRequest() throws Exception {
        // Arrange
        HttpResponse<String> httpResponse = mockHttpResponse(200, "{\"result\":\"ok\"}");
        when(httpClient.send(any(HttpRequest.class), any())).thenReturn((HttpResponse) httpResponse);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");

@@ -436,22 +423,8 @@ class OpenAiHttpAdapterTest {
    @Test
    @DisplayName("should throw IllegalArgumentException when API base URL is null")
    void testNullApiBaseUrlThrowsException() {
        StartConfiguration invalidConfig = new StartConfiguration(
                Paths.get("/source"),
                Paths.get("/target"),
                Paths.get("/db.sqlite"),
                null, // Invalid: null base URL
                API_MODEL,
                TIMEOUT_SECONDS,
                5,
                100,
                5000,
                Paths.get("/prompt.txt"),
                Paths.get("/lock"),
                Paths.get("/logs"),
                "INFO",
                API_KEY
        );
        ProviderConfiguration invalidConfig = new ProviderConfiguration(
                API_MODEL, TIMEOUT_SECONDS, null, API_KEY);

        assertThatThrownBy(() -> new OpenAiHttpAdapter(invalidConfig, httpClient))
                .isInstanceOf(IllegalArgumentException.class)
@@ -461,22 +434,8 @@ class OpenAiHttpAdapterTest {
    @Test
    @DisplayName("should throw IllegalArgumentException when API model is null")
    void testNullApiModelThrowsException() {
        StartConfiguration invalidConfig = new StartConfiguration(
                Paths.get("/source"),
                Paths.get("/target"),
                Paths.get("/db.sqlite"),
                URI.create(API_BASE_URL),
                null, // Invalid: null model
                TIMEOUT_SECONDS,
                5,
                100,
                5000,
                Paths.get("/prompt.txt"),
                Paths.get("/lock"),
                Paths.get("/logs"),
                "INFO",
                API_KEY
        );
        ProviderConfiguration invalidConfig = new ProviderConfiguration(
                null, TIMEOUT_SECONDS, API_BASE_URL, API_KEY);

        assertThatThrownBy(() -> new OpenAiHttpAdapter(invalidConfig, httpClient))
                .isInstanceOf(IllegalArgumentException.class)
@@ -486,22 +445,8 @@ class OpenAiHttpAdapterTest {
    @Test
    @DisplayName("should throw IllegalArgumentException when API model is blank")
    void testBlankApiModelThrowsException() {
        StartConfiguration invalidConfig = new StartConfiguration(
                Paths.get("/source"),
                Paths.get("/target"),
                Paths.get("/db.sqlite"),
                URI.create(API_BASE_URL),
                " ", // Invalid: blank model
                TIMEOUT_SECONDS,
                5,
                100,
                5000,
                Paths.get("/prompt.txt"),
                Paths.get("/lock"),
                Paths.get("/logs"),
                "INFO",
                API_KEY
        );
        ProviderConfiguration invalidConfig = new ProviderConfiguration(
                " ", TIMEOUT_SECONDS, API_BASE_URL, API_KEY);

        assertThatThrownBy(() -> new OpenAiHttpAdapter(invalidConfig, httpClient))
                .isInstanceOf(IllegalArgumentException.class)
@@ -512,27 +457,12 @@ class OpenAiHttpAdapterTest {
    @DisplayName("should handle empty API key gracefully")
    void testEmptyApiKeyHandled() throws Exception {
        // Arrange
        StartConfiguration configWithEmptyKey = new StartConfiguration(
                Paths.get("/source"),
                Paths.get("/target"),
                Paths.get("/db.sqlite"),
                URI.create(API_BASE_URL),
                API_MODEL,
                TIMEOUT_SECONDS,
                5,
                100,
                5000,
                Paths.get("/prompt.txt"),
                Paths.get("/lock"),
                Paths.get("/logs"),
                "INFO",
                "" // Empty key
        );

        OpenAiHttpAdapter adapterWithEmptyKey = new OpenAiHttpAdapter(configWithEmptyKey, httpClient);
        OpenAiHttpAdapter adapterWithEmptyKey = new OpenAiHttpAdapter(
                new ProviderConfiguration(API_MODEL, TIMEOUT_SECONDS, API_BASE_URL, ""),
                httpClient);

        HttpResponse<String> httpResponse = mockHttpResponse(200, "{}");
        when(httpClient.send(any(HttpRequest.class), any())).thenReturn((HttpResponse) httpResponse);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");

@@ -543,18 +473,119 @@ class OpenAiHttpAdapterTest {
        assertThat(result).isInstanceOf(AiInvocationSuccess.class);
    }

    // =========================================================================
    // Mandatory AP-003 test cases
    // =========================================================================

    /**
     * Verifies that the adapter reads all values from the new {@link ProviderConfiguration}
     * namespace and uses them correctly in outgoing HTTP requests.
     */
    @Test
    @DisplayName("openAiAdapterReadsValuesFromNewNamespace: all ProviderConfiguration fields are used")
    void openAiAdapterReadsValuesFromNewNamespace() throws Exception {
        // Arrange: ProviderConfiguration with values distinct from setUp defaults
        ProviderConfiguration nsConfig = new ProviderConfiguration(
                "ns-model-v2", 20, "https://provider-ns.example.com", "ns-api-key-abc");
        OpenAiHttpAdapter nsAdapter = new OpenAiHttpAdapter(nsConfig, httpClient);

        HttpResponse<String> httpResponse = mockHttpResponse(200, "{}");
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiRequestRepresentation request = createTestRequest("prompt", "document");
        nsAdapter.invoke(request);

        ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
        verify(httpClient).send(requestCaptor.capture(), any());
        HttpRequest capturedRequest = requestCaptor.getValue();

        // Verify baseUrl from ProviderConfiguration
        assertThat(capturedRequest.uri().toString())
                .as("baseUrl must come from ProviderConfiguration")
                .startsWith("https://provider-ns.example.com");

        // Verify apiKey from ProviderConfiguration
        assertThat(capturedRequest.headers().firstValue("Authorization").orElse(""))
                .as("apiKey must come from ProviderConfiguration")
                .contains("ns-api-key-abc");

        // Verify model from ProviderConfiguration
        String body = nsAdapter.getLastBuiltJsonBodyForTesting();
        assertThat(new JSONObject(body).getString("model"))
                .as("model must come from ProviderConfiguration")
                .isEqualTo("ns-model-v2");

        // Verify timeout from ProviderConfiguration
        assertThat(capturedRequest.timeout())
                .as("timeout must come from ProviderConfiguration")
                .isPresent()
                .get()
                .isEqualTo(Duration.ofSeconds(20));
    }

    /**
     * Verifies that adapter behavioral contracts (success mapping, error classification)
     * are unchanged after the constructor was changed from StartConfiguration to
     * ProviderConfiguration.
     */
    @Test
    @DisplayName("openAiAdapterBehaviorIsUnchanged: HTTP success and error mapping contracts are preserved")
    void openAiAdapterBehaviorIsUnchanged() throws Exception {
        // Success case: HTTP 200 must produce AiInvocationSuccess with raw body
        String successBody = "{\"choices\":[{\"message\":{\"content\":\"result\"}}]}";
        HttpResponse<String> successResponse = mockHttpResponse(200, successBody);
        doReturn(successResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("p", "d"));
        assertThat(result).isInstanceOf(AiInvocationSuccess.class);
        assertThat(((AiInvocationSuccess) result).rawResponse().content()).isEqualTo(successBody);

        // Non-200 case: HTTP 429 must produce AiInvocationTechnicalFailure with HTTP_429 reason
        HttpResponse<String> rateLimitedResponse = mockHttpResponse(429, null);
        doReturn(rateLimitedResponse).when(httpClient).send(any(HttpRequest.class), any());
        result = adapter.invoke(createTestRequest("p", "d"));
        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("HTTP_429");

        // Timeout case: HttpTimeoutException must produce TIMEOUT reason
        when(httpClient.send(any(HttpRequest.class), any()))
                .thenThrow(new HttpTimeoutException("timed out"));
        result = adapter.invoke(createTestRequest("p", "d"));
        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("TIMEOUT");
    }

    /**
     * Verifies that a port value of 0 in the base URL is not included in the endpoint URI.
     * <p>
     * {@link java.net.URI#getPort()} returns {@code 0} when the URL explicitly specifies
     * port 0. The endpoint builder must only include the port when it is greater than 0,
     * not when it is equal to 0 or negative.
     */
    @Test
    @DisplayName("should not include port 0 in the endpoint URI")
    void buildEndpointUri_doesNotIncludePortZero() throws Exception {
        ProviderConfiguration configWithPortZero = new ProviderConfiguration(
                API_MODEL, TIMEOUT_SECONDS, "http://example.com:0", API_KEY);
        OpenAiHttpAdapter adapterWithPortZero = new OpenAiHttpAdapter(configWithPortZero, httpClient);

        HttpResponse<String> httpResponse = mockHttpResponse(200,
                "{\"choices\":[{\"message\":{\"content\":\"test\"}}]}");
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        adapterWithPortZero.invoke(createTestRequest("p", "d"));

        ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
        verify(httpClient).send(requestCaptor.capture(), any());
        assertThat(requestCaptor.getValue().uri().toString())
                .as("Port 0 must not appear in the endpoint URI")
                .doesNotContain(":0");
    }

    // Helper methods

    /**
     * Creates a mock HttpResponse with the specified status code and optional body.
     * <p>
     * This helper method works around Mockito's type variance issues with generics
     * by creating the mock with proper type handling. If body is null, the body()
     * method is not stubbed to avoid unnecessary stubs.
     *
     * @param statusCode the HTTP status code
     * @param body the response body (null to skip body stubbing)
     * @return a mock HttpResponse configured with the given status and body
     */
    @SuppressWarnings("unchecked")
    private HttpResponse<String> mockHttpResponse(int statusCode, String body) {

File diff suppressed because it is too large
@@ -0,0 +1,447 @@
package de.gecheckt.pdf.umbenenner.adapter.out.configuration;

import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException;
import de.gecheckt.pdf.umbenenner.application.config.provider.MultiProviderConfiguration;

/**
 * Tests for {@link LegacyConfigurationMigrator}.
 * <p>
 * Covers all mandatory test cases for the legacy-to-multi-provider configuration migration.
 * Temporary files are managed via {@link TempDir} so no test artifacts remain on the file system.
 */
class LegacyConfigurationMigratorTest {

    @TempDir
    Path tempDir;

    // -------------------------------------------------------------------------
    // Helpers
    // -------------------------------------------------------------------------

    /** Full legacy configuration containing all four api.* keys plus other required keys. */
    private static String fullLegacyContent() {
        return "source.folder=./source\n"
                + "target.folder=./target\n"
                + "sqlite.file=./db.sqlite\n"
                + "api.baseUrl=https://api.openai.com/v1\n"
                + "api.model=gpt-4o\n"
                + "api.timeoutSeconds=30\n"
                + "max.retries.transient=3\n"
                + "max.pages=10\n"
                + "max.text.characters=5000\n"
                + "prompt.template.file=./prompt.txt\n"
                + "api.key=sk-test-legacy-key\n"
                + "log.level=INFO\n"
                + "log.ai.sensitive=false\n";
    }

    private Path writeLegacyFile(String name, String content) throws IOException {
        Path file = tempDir.resolve(name);
        Files.writeString(file, content, StandardCharsets.UTF_8);
        return file;
    }

    private Properties loadProperties(Path file) throws IOException {
        Properties props = new Properties();
        props.load(Files.newBufferedReader(file, StandardCharsets.UTF_8));
        return props;
    }

    private LegacyConfigurationMigrator defaultMigrator() {
        return new LegacyConfigurationMigrator();
    }

    // =========================================================================
    // Mandatory test case 1
    // =========================================================================

    /**
     * Legacy file with all four {@code api.*} keys is correctly migrated.
     * Values in the migrated file must be identical to the originals; all other keys survive.
     */
    @Test
    void migratesLegacyFileWithAllFlatKeys() throws IOException {
        Path file = writeLegacyFile("app.properties", fullLegacyContent());

        defaultMigrator().migrateIfLegacy(file);

        Properties migrated = loadProperties(file);
        assertEquals("https://api.openai.com/v1", migrated.getProperty("ai.provider.openai-compatible.baseUrl"));
        assertEquals("gpt-4o", migrated.getProperty("ai.provider.openai-compatible.model"));
        assertEquals("30", migrated.getProperty("ai.provider.openai-compatible.timeoutSeconds"));
        assertEquals("sk-test-legacy-key", migrated.getProperty("ai.provider.openai-compatible.apiKey"));
        assertEquals("openai-compatible", migrated.getProperty("ai.provider.active"));

        // Old flat keys must be gone
        assertFalse(migrated.containsKey("api.baseUrl"), "api.baseUrl must be removed");
        assertFalse(migrated.containsKey("api.model"), "api.model must be removed");
        assertFalse(migrated.containsKey("api.timeoutSeconds"), "api.timeoutSeconds must be removed");
        assertFalse(migrated.containsKey("api.key"), "api.key must be removed");
    }

    // =========================================================================
    // Mandatory test case 2
    // =========================================================================

    /**
     * A {@code .bak} backup is created with the exact original content before any changes.
     */
    @Test
    void createsBakBeforeOverwriting() throws IOException {
        String original = fullLegacyContent();
        Path file = writeLegacyFile("app.properties", original);
        Path bakFile = tempDir.resolve("app.properties.bak");

        assertFalse(Files.exists(bakFile), "No .bak should exist before migration");

        defaultMigrator().migrateIfLegacy(file);

        assertTrue(Files.exists(bakFile), ".bak must be created during migration");
        assertEquals(original, Files.readString(bakFile, StandardCharsets.UTF_8),
                ".bak must contain the exact original content");
    }

    // =========================================================================
    // Mandatory test case 3
    // =========================================================================

    /**
     * When {@code .bak} already exists, the new backup is written as {@code .bak.1}.
     * Neither the existing {@code .bak} nor the new {@code .bak.1} is overwritten.
     */
    @Test
    void bakSuffixIsIncrementedIfBakExists() throws IOException {
        String original = fullLegacyContent();
        Path file = writeLegacyFile("app.properties", original);

        // Pre-create .bak with different content
        Path existingBak = tempDir.resolve("app.properties.bak");
        Files.writeString(existingBak, "# existing bak", StandardCharsets.UTF_8);

        defaultMigrator().migrateIfLegacy(file);

        // Existing .bak must be untouched
        assertEquals("# existing bak", Files.readString(existingBak, StandardCharsets.UTF_8),
                "Existing .bak must not be overwritten");

        // New backup must be .bak.1 with original content
        Path newBak = tempDir.resolve("app.properties.bak.1");
        assertTrue(Files.exists(newBak), ".bak.1 must be created when .bak already exists");
        assertEquals(original, Files.readString(newBak, StandardCharsets.UTF_8),
                ".bak.1 must contain the original content");
    }

    // =========================================================================
    // Mandatory test case 4
    // =========================================================================

    /**
     * A file already in the new multi-provider schema triggers no write and no {@code .bak}.
     */
    @Test
    void noOpForAlreadyMigratedFile() throws IOException {
        String newSchema = "ai.provider.active=openai-compatible\n"
                + "ai.provider.openai-compatible.baseUrl=https://api.openai.com/v1\n"
                + "ai.provider.openai-compatible.model=gpt-4o\n"
                + "ai.provider.openai-compatible.timeoutSeconds=30\n"
                + "ai.provider.openai-compatible.apiKey=sk-key\n";
        Path file = writeLegacyFile("app.properties", newSchema);
        long modifiedBefore = Files.getLastModifiedTime(file).toMillis();

        defaultMigrator().migrateIfLegacy(file);

        // File must not have been rewritten
        assertEquals(modifiedBefore, Files.getLastModifiedTime(file).toMillis(),
                "File modification time must not change for already-migrated files");

        // No .bak should exist
        Path bakFile = tempDir.resolve("app.properties.bak");
        assertFalse(Files.exists(bakFile), "No .bak must be created for already-migrated files");
    }

    // =========================================================================
    // Mandatory test case 5
    // =========================================================================

    /**
     * After migration, the new parser and validator load the file without error.
     */
    @Test
    void reloadAfterMigrationSucceeds() throws IOException {
        Path file = writeLegacyFile("app.properties", fullLegacyContent());

        defaultMigrator().migrateIfLegacy(file);

        // Reload and parse with the new parser and validator; must not throw
        Properties props = loadProperties(file);
        MultiProviderConfiguration config = assertDoesNotThrow(
                () -> new MultiProviderConfigurationParser().parse(props),
                "Migrated file must be parseable by MultiProviderConfigurationParser");
        assertDoesNotThrow(
                () -> new MultiProviderConfigurationValidator().validate(config),
                "Migrated file must pass MultiProviderConfigurationValidator");
    }

    // =========================================================================
    // Mandatory test case 6
    // =========================================================================

    /**
     * When post-migration validation fails, a {@link ConfigurationLoadingException} is thrown
     * and the {@code .bak} backup is preserved with the original content.
     */
    @Test
    void migrationFailureKeepsBak() throws IOException {
        String original = fullLegacyContent();
        Path file = writeLegacyFile("app.properties", original);

        // Validator that always rejects
        MultiProviderConfigurationValidator failingValidator = new MultiProviderConfigurationValidator() {
            @Override
            public void validate(MultiProviderConfiguration config) {
                throw new InvalidStartConfigurationException("Simulated validation failure");
            }
        };

        LegacyConfigurationMigrator migrator = new LegacyConfigurationMigrator(
                new MultiProviderConfigurationParser(), failingValidator);

        assertThrows(ConfigurationLoadingException.class,
                () -> migrator.migrateIfLegacy(file),
                "Migration must throw ConfigurationLoadingException when post-migration validation fails");

        // .bak must be preserved with original content
        Path bakFile = tempDir.resolve("app.properties.bak");
        assertTrue(Files.exists(bakFile), ".bak must be preserved after migration failure");
        assertEquals(original, Files.readString(bakFile, StandardCharsets.UTF_8),
                ".bak content must match the original file content");
    }

    // =========================================================================
    // Mandatory test case 7
    // =========================================================================

    /**
     * A file that contains {@code ai.provider.active} but no legacy {@code api.*} keys
     * is not considered legacy and triggers no migration.
     */
    @Test
    void legacyDetectionRequiresAtLeastOneFlatKey() throws IOException {
        String notLegacy = "ai.provider.active=openai-compatible\n"
                + "source.folder=./source\n"
                + "max.pages=10\n";
        Path file = writeLegacyFile("app.properties", notLegacy);

        Properties props = new Properties();
        props.load(Files.newBufferedReader(file, StandardCharsets.UTF_8));

        boolean detected = defaultMigrator().isLegacyForm(props);

        assertFalse(detected, "File with ai.provider.active and no api.* keys must not be detected as legacy");
    }

    // =========================================================================
    // Mandatory test case 8
    // =========================================================================

    /**
     * The four legacy values land in exactly the target keys in the openai-compatible namespace,
     * and {@code ai.provider.active} is set to {@code openai-compatible}.
     */
    @Test
    void legacyValuesEndUpInOpenAiCompatibleNamespace() throws IOException {
        String content = "api.baseUrl=https://legacy.example.com/v1\n"
                + "api.model=legacy-model\n"
                + "api.timeoutSeconds=42\n"
                + "api.key=legacy-key\n"
                + "source.folder=./src\n";
        Path file = writeLegacyFile("app.properties", content);

        defaultMigrator().migrateIfLegacy(file);

        Properties migrated = loadProperties(file);
        assertEquals("https://legacy.example.com/v1", migrated.getProperty("ai.provider.openai-compatible.baseUrl"),
                "api.baseUrl must map to ai.provider.openai-compatible.baseUrl");
        assertEquals("legacy-model", migrated.getProperty("ai.provider.openai-compatible.model"),
                "api.model must map to ai.provider.openai-compatible.model");
        assertEquals("42", migrated.getProperty("ai.provider.openai-compatible.timeoutSeconds"),
                "api.timeoutSeconds must map to ai.provider.openai-compatible.timeoutSeconds");
        assertEquals("legacy-key", migrated.getProperty("ai.provider.openai-compatible.apiKey"),
                "api.key must map to ai.provider.openai-compatible.apiKey");
        assertEquals("openai-compatible", migrated.getProperty("ai.provider.active"),
                "ai.provider.active must be set to openai-compatible");
    }

    // =========================================================================
    // Mandatory test case 9
    // =========================================================================

    /**
     * Keys unrelated to the legacy api.* set survive the migration with identical values.
     */
    @Test
void unrelatedKeysSurviveUnchanged() throws IOException {
|
||||
String content = "source.folder=./my/source\n"
|
||||
+ "target.folder=./my/target\n"
|
||||
+ "sqlite.file=./my/db.sqlite\n"
|
||||
+ "max.pages=15\n"
|
||||
+ "max.text.characters=3000\n"
|
||||
+ "log.level=DEBUG\n"
|
||||
+ "log.ai.sensitive=false\n"
|
||||
+ "api.baseUrl=https://api.openai.com/v1\n"
|
||||
+ "api.model=gpt-4o\n"
|
||||
+ "api.timeoutSeconds=30\n"
|
||||
+ "api.key=sk-unrelated-test\n";
|
||||
Path file = writeLegacyFile("app.properties", content);
|
||||
|
||||
defaultMigrator().migrateIfLegacy(file);
|
||||
|
||||
Properties migrated = loadProperties(file);
|
||||
assertEquals("./my/source", migrated.getProperty("source.folder"), "source.folder must be unchanged");
|
||||
assertEquals("./my/target", migrated.getProperty("target.folder"), "target.folder must be unchanged");
|
||||
assertEquals("./my/db.sqlite", migrated.getProperty("sqlite.file"), "sqlite.file must be unchanged");
|
||||
assertEquals("15", migrated.getProperty("max.pages"), "max.pages must be unchanged");
|
||||
assertEquals("3000", migrated.getProperty("max.text.characters"), "max.text.characters must be unchanged");
|
||||
assertEquals("DEBUG", migrated.getProperty("log.level"), "log.level must be unchanged");
|
||||
assertEquals("false", migrated.getProperty("log.ai.sensitive"), "log.ai.sensitive must be unchanged");
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
// Mandatory test case 10
|
||||
// =========================================================================
|
||||
|
||||
/**
|
||||
* Migration writes via a temporary {@code .tmp} file followed by a move/rename.
|
||||
* After successful migration, no {@code .tmp} file remains, and the original path
|
||||
* holds the fully migrated content (never partially overwritten).
|
||||
*/
|
||||
@Test
|
||||
void inPlaceWriteIsAtomic() throws IOException {
|
||||
Path file = writeLegacyFile("app.properties", fullLegacyContent());
|
||||
Path tmpFile = tempDir.resolve("app.properties.tmp");
|
||||
|
||||
defaultMigrator().migrateIfLegacy(file);
|
||||
|
||||
// .tmp must have been cleaned up (moved to target, not left behind)
|
||||
assertFalse(Files.exists(tmpFile),
|
||||
".tmp file must not exist after migration (must have been moved to target)");
|
||||
|
||||
// Target must contain migrated content
|
||||
Properties migrated = loadProperties(file);
|
||||
assertTrue(migrated.containsKey("ai.provider.active"),
|
||||
"Migrated file must contain ai.provider.active (complete write confirmed)");
|
||||
assertTrue(migrated.containsKey("ai.provider.openai-compatible.model"),
|
||||
"Migrated file must contain the new namespaced model key (complete write confirmed)");
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
// Tests: isLegacyForm – each individual legacy key triggers detection
|
||||
// =========================================================================
|
||||
|
||||
/**
|
||||
* A properties set containing only {@code api.baseUrl} (without {@code ai.provider.active})
|
||||
* must be detected as legacy.
|
||||
*/
|
||||
@Test
|
||||
void isLegacyForm_detectedWhenOnlyBaseUrlPresent() {
|
||||
Properties props = new Properties();
|
||||
props.setProperty(LegacyConfigurationMigrator.LEGACY_BASE_URL, "https://api.example.com");
|
||||
assertTrue(defaultMigrator().isLegacyForm(props),
|
||||
"Properties with only api.baseUrl must be detected as legacy");
|
||||
}
|
||||
|
||||
/**
|
||||
* A properties set containing only {@code api.model} (without {@code ai.provider.active})
|
||||
* must be detected as legacy.
|
||||
*/
|
||||
@Test
|
||||
void isLegacyForm_detectedWhenOnlyModelPresent() {
|
||||
Properties props = new Properties();
|
||||
props.setProperty(LegacyConfigurationMigrator.LEGACY_MODEL, "gpt-4o");
|
||||
assertTrue(defaultMigrator().isLegacyForm(props),
|
||||
"Properties with only api.model must be detected as legacy");
|
||||
}
|
||||
|
||||
/**
|
||||
* A properties set containing only {@code api.timeoutSeconds} (without {@code ai.provider.active})
|
||||
* must be detected as legacy.
|
||||
*/
|
||||
@Test
|
||||
void isLegacyForm_detectedWhenOnlyTimeoutPresent() {
|
||||
Properties props = new Properties();
|
||||
props.setProperty(LegacyConfigurationMigrator.LEGACY_TIMEOUT, "30");
|
||||
assertTrue(defaultMigrator().isLegacyForm(props),
|
||||
"Properties with only api.timeoutSeconds must be detected as legacy");
|
||||
}
|
||||
|
||||
/**
|
||||
* A properties set containing only {@code api.key} (without {@code ai.provider.active})
|
||||
* must be detected as legacy.
|
||||
*/
|
||||
@Test
|
||||
void isLegacyForm_detectedWhenOnlyApiKeyPresent() {
|
||||
Properties props = new Properties();
|
||||
props.setProperty(LegacyConfigurationMigrator.LEGACY_API_KEY, "sk-test");
|
||||
assertTrue(defaultMigrator().isLegacyForm(props),
|
||||
"Properties with only api.key must be detected as legacy");
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
// Tests: lineDefinesKey / generateMigratedContent – prefix-only match must not fire
|
||||
// =========================================================================
|
||||
|
||||
    /**
     * A line whose key merely extends a legacy key (e.g. {@code api.baseUrlExtra}, which
     * starts with {@code api.baseUrl}) must not be treated as defining that legacy key and
     * must survive migration unchanged, while the actual legacy key is correctly replaced.
     */
    @Test
    void generateMigratedContent_doesNotReplacePrefixMatchKey() {
        String content = "api.baseUrlExtra=should-not-change\n"
                + "api.baseUrl=https://real.example.com\n"
                + "api.model=gpt-4o\n"
                + "api.timeoutSeconds=30\n"
                + "api.key=sk-real\n";

        String migrated = defaultMigrator().generateMigratedContent(content);

        assertTrue(migrated.contains("api.baseUrlExtra=should-not-change"),
                "Line whose key merely extends a legacy key must not be modified");
        assertTrue(migrated.contains("ai.provider.openai-compatible.baseUrl=https://real.example.com"),
                "The actual legacy key api.baseUrl must be replaced with the namespaced key");
    }

    /**
     * A line that defines a legacy key with no value (key only, no separator)
     * must be recognized as defining that key and be replaced in migration.
     */
    @Test
    void generateMigratedContent_handlesKeyWithoutValue() {
        String content = "api.baseUrl\n"
                + "api.model=gpt-4o\n"
                + "api.timeoutSeconds=30\n"
                + "api.key=sk-test\n";

        String migrated = defaultMigrator().generateMigratedContent(content);

        assertTrue(migrated.contains("ai.provider.openai-compatible.baseUrl"),
                "Key-only line (no value, no separator) must still be recognized and replaced");
        assertFalse(migrated.contains("api.baseUrl\n") || migrated.contains("api.baseUrl\r"),
                "Original key-only line must not survive unchanged");
    }
}
@@ -0,0 +1,463 @@
package de.gecheckt.pdf.umbenenner.adapter.out.configuration;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.util.Properties;
import java.util.function.Function;

import org.junit.jupiter.api.Test;

import de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException;
import de.gecheckt.pdf.umbenenner.application.config.provider.AiProviderFamily;
import de.gecheckt.pdf.umbenenner.application.config.provider.MultiProviderConfiguration;

/**
 * Tests for the multi-provider configuration parsing and validation pipeline.
 * <p>
 * Covers all mandatory test cases for the new configuration schema as defined
 * in the active work package specification.
 */
class MultiProviderConfigurationTest {

    private static final Function<String, String> NO_ENV = key -> null;

    // -------------------------------------------------------------------------
    // Helpers
    // -------------------------------------------------------------------------

    private Properties fullOpenAiProperties() {
        Properties props = new Properties();
        props.setProperty("ai.provider.active", "openai-compatible");
        props.setProperty("ai.provider.openai-compatible.baseUrl", "https://api.openai.com");
        props.setProperty("ai.provider.openai-compatible.model", "gpt-4o");
        props.setProperty("ai.provider.openai-compatible.timeoutSeconds", "30");
        props.setProperty("ai.provider.openai-compatible.apiKey", "sk-openai-test");
        // Claude side intentionally not set (inactive)
        return props;
    }

    private Properties fullClaudeProperties() {
        Properties props = new Properties();
        props.setProperty("ai.provider.active", "claude");
        props.setProperty("ai.provider.claude.baseUrl", "https://api.anthropic.com");
        props.setProperty("ai.provider.claude.model", "claude-3-5-sonnet-20241022");
        props.setProperty("ai.provider.claude.timeoutSeconds", "60");
        props.setProperty("ai.provider.claude.apiKey", "sk-ant-test");
        // OpenAI side intentionally not set (inactive)
        return props;
    }

    private MultiProviderConfiguration parseAndValidate(Properties props,
            Function<String, String> envLookup) {
        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(envLookup);
        MultiProviderConfiguration config = parser.parse(props);
        new MultiProviderConfigurationValidator().validate(config);
        return config;
    }

    private MultiProviderConfiguration parseAndValidate(Properties props) {
        return parseAndValidate(props, NO_ENV);
    }

    // =========================================================================
    // Mandatory test case 1
    // =========================================================================

    /**
     * Full new schema with OpenAI-compatible active, all required values present.
     * Parser and validator must both succeed.
     */
    @Test
    void parsesNewSchemaWithOpenAiCompatibleActive() {
        MultiProviderConfiguration config = parseAndValidate(fullOpenAiProperties());

        assertEquals(AiProviderFamily.OPENAI_COMPATIBLE, config.activeProviderFamily());
        assertEquals("gpt-4o", config.openAiCompatibleConfig().model());
        assertEquals(30, config.openAiCompatibleConfig().timeoutSeconds());
        assertEquals("https://api.openai.com", config.openAiCompatibleConfig().baseUrl());
        assertEquals("sk-openai-test", config.openAiCompatibleConfig().apiKey());
    }

    // =========================================================================
    // Mandatory test case 2
    // =========================================================================

    /**
     * Full new schema with Claude active, all required values present.
     * Parser and validator must both succeed.
     */
    @Test
    void parsesNewSchemaWithClaudeActive() {
        MultiProviderConfiguration config = parseAndValidate(fullClaudeProperties());

        assertEquals(AiProviderFamily.CLAUDE, config.activeProviderFamily());
        assertEquals("claude-3-5-sonnet-20241022", config.claudeConfig().model());
        assertEquals(60, config.claudeConfig().timeoutSeconds());
        assertEquals("https://api.anthropic.com", config.claudeConfig().baseUrl());
        assertEquals("sk-ant-test", config.claudeConfig().apiKey());
    }

    // =========================================================================
    // Mandatory test case 3
    // =========================================================================

    /**
     * Claude active, {@code ai.provider.claude.baseUrl} absent.
     * The default {@code https://api.anthropic.com} must be applied; validation must pass.
     */
    @Test
    void claudeBaseUrlDefaultsWhenMissing() {
        Properties props = fullClaudeProperties();
        props.remove("ai.provider.claude.baseUrl");

        MultiProviderConfiguration config = parseAndValidate(props);

        assertNotNull(config.claudeConfig().baseUrl(),
                "baseUrl must not be null when Claude default is applied");
        assertEquals(MultiProviderConfigurationParser.CLAUDE_DEFAULT_BASE_URL,
                config.claudeConfig().baseUrl(),
                "Default Claude baseUrl must be https://api.anthropic.com");
    }

    // =========================================================================
    // Mandatory test case 4
    // =========================================================================

    /**
     * {@code ai.provider.active} is absent. Parser must throw with a clear message.
     */
    @Test
    void rejectsMissingActiveProvider() {
        Properties props = fullOpenAiProperties();
        props.remove("ai.provider.active");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        ConfigurationLoadingException ex = assertThrows(
                ConfigurationLoadingException.class,
                () -> parser.parse(props));

        assertTrue(ex.getMessage().contains("ai.provider.active"),
                "Error message must reference the missing property");
    }

    // =========================================================================
    // Mandatory test case 5
    // =========================================================================

    /**
     * {@code ai.provider.active=foo} – unrecognised value. Parser must throw.
     */
    @Test
    void rejectsUnknownActiveProvider() {
        Properties props = fullOpenAiProperties();
        props.setProperty("ai.provider.active", "foo");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        ConfigurationLoadingException ex = assertThrows(
                ConfigurationLoadingException.class,
                () -> parser.parse(props));

        assertTrue(ex.getMessage().contains("foo"),
                "Error message must include the unrecognised value");
    }

    // =========================================================================
    // Mandatory test case 6
    // =========================================================================

    /**
     * Active provider has a mandatory field blank (model removed). Validation must fail.
     */
    @Test
    void rejectsMissingMandatoryFieldForActiveProvider() {
        Properties props = fullOpenAiProperties();
        props.remove("ai.provider.openai-compatible.model");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        MultiProviderConfiguration config = parser.parse(props);

        InvalidStartConfigurationException ex = assertThrows(
                InvalidStartConfigurationException.class,
                () -> new MultiProviderConfigurationValidator().validate(config));

        assertTrue(ex.getMessage().contains("model"),
                "Error message must mention the missing field");
    }

    // =========================================================================
    // Mandatory test case 7
    // =========================================================================

    /**
     * Inactive provider has incomplete configuration (Claude fields missing while OpenAI is active).
     * Validation must pass; inactive provider fields are not required.
     */
    @Test
    void acceptsMissingMandatoryFieldForInactiveProvider() {
        // OpenAI active, Claude completely unconfigured
        Properties props = fullOpenAiProperties();
        // No ai.provider.claude.* keys set

        MultiProviderConfiguration config = parseAndValidate(props);

        assertEquals(AiProviderFamily.OPENAI_COMPATIBLE, config.activeProviderFamily(),
                "Active provider must be openai-compatible");
        // Claude config may have null/blank fields – no exception expected
    }

    // =========================================================================
    // Mandatory test case 8
    // =========================================================================

    /**
     * Environment variable for the active provider overrides the properties value.
     * <p>
     * Sub-case A: {@code OPENAI_COMPATIBLE_API_KEY} set, OpenAI active.
     * Sub-case B: {@code ANTHROPIC_API_KEY} set, Claude active.
     */
    @Test
    void envVarOverridesPropertiesApiKeyForActiveProvider() {
        // Sub-case A: OpenAI active, OPENAI_COMPATIBLE_API_KEY set
        Properties openAiProps = fullOpenAiProperties();
        openAiProps.setProperty("ai.provider.openai-compatible.apiKey", "properties-key");

        Function<String, String> envWithOpenAiKey = key ->
                MultiProviderConfigurationParser.ENV_OPENAI_API_KEY.equals(key)
                        ? "env-openai-key" : null;

        MultiProviderConfiguration openAiConfig = parseAndValidate(openAiProps, envWithOpenAiKey);
        assertEquals("env-openai-key", openAiConfig.openAiCompatibleConfig().apiKey(),
                "Env var must override properties API key for OpenAI-compatible");

        // Sub-case B: Claude active, ANTHROPIC_API_KEY set
        Properties claudeProps = fullClaudeProperties();
        claudeProps.setProperty("ai.provider.claude.apiKey", "properties-key");

        Function<String, String> envWithClaudeKey = key ->
                MultiProviderConfigurationParser.ENV_CLAUDE_API_KEY.equals(key)
                        ? "env-claude-key" : null;

        MultiProviderConfiguration claudeConfig = parseAndValidate(claudeProps, envWithClaudeKey);
        assertEquals("env-claude-key", claudeConfig.claudeConfig().apiKey(),
                "Env var must override properties API key for Claude");
    }

    // =========================================================================
    // Test: legacy env var PDF_UMBENENNER_API_KEY
    // =========================================================================

    /**
     * {@code PDF_UMBENENNER_API_KEY} is set, {@code OPENAI_COMPATIBLE_API_KEY} is absent.
     * The legacy variable must be accepted as a fallback for the OpenAI-compatible provider.
     */
    @Test
    void legacyEnvVarPdfUmbenennerApiKeyUsedWhenPrimaryAbsent() {
        Properties props = fullOpenAiProperties();
        props.remove("ai.provider.openai-compatible.apiKey");

        Function<String, String> envWithLegacy = key ->
                MultiProviderConfigurationParser.ENV_LEGACY_OPENAI_API_KEY.equals(key)
                        ? "legacy-env-key" : null;

        MultiProviderConfiguration config = parseAndValidate(props, envWithLegacy);
        assertEquals("legacy-env-key", config.openAiCompatibleConfig().apiKey(),
                "Legacy env var PDF_UMBENENNER_API_KEY must be used when OPENAI_COMPATIBLE_API_KEY is absent");
    }

    /**
     * {@code OPENAI_COMPATIBLE_API_KEY} takes precedence over {@code PDF_UMBENENNER_API_KEY}.
     */
    @Test
    void primaryEnvVarTakesPrecedenceOverLegacyEnvVar() {
        Properties props = fullOpenAiProperties();
        props.remove("ai.provider.openai-compatible.apiKey");

        Function<String, String> envBoth = key -> {
            if (MultiProviderConfigurationParser.ENV_OPENAI_API_KEY.equals(key)) return "primary-key";
            if (MultiProviderConfigurationParser.ENV_LEGACY_OPENAI_API_KEY.equals(key)) return "legacy-key";
            return null;
        };

        MultiProviderConfiguration config = parseAndValidate(props, envBoth);
        assertEquals("primary-key", config.openAiCompatibleConfig().apiKey(),
                "OPENAI_COMPATIBLE_API_KEY must take precedence over PDF_UMBENENNER_API_KEY");
    }

    /**
     * Neither env var is set; the properties value is used as final fallback.
     */
    @Test
    void propertiesApiKeyUsedWhenNoEnvVarSet() {
        Properties props = fullOpenAiProperties();
        props.setProperty("ai.provider.openai-compatible.apiKey", "props-only-key");

        MultiProviderConfiguration config = parseAndValidate(props, NO_ENV);
        assertEquals("props-only-key", config.openAiCompatibleConfig().apiKey(),
                "Properties API key must be used when no env var is set");
    }

    // =========================================================================
    // Tests: base URL validation
    // =========================================================================

    /**
     * OpenAI-compatible provider with an invalid (non-URI) base URL must be rejected.
     */
    @Test
    void rejectsInvalidBaseUrlForActiveOpenAiProvider() {
        Properties props = fullOpenAiProperties();
        props.setProperty("ai.provider.openai-compatible.baseUrl", "not a valid url at all ://");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        MultiProviderConfiguration config = parser.parse(props);

        InvalidStartConfigurationException ex = assertThrows(
                InvalidStartConfigurationException.class,
                () -> new MultiProviderConfigurationValidator().validate(config));

        assertTrue(ex.getMessage().contains("baseUrl"),
                "Error message must reference baseUrl");
    }

    /**
     * Claude provider with an invalid base URL must be rejected when Claude is active.
     */
    @Test
    void rejectsInvalidBaseUrlForActiveClaudeProvider() {
        Properties props = fullClaudeProperties();
        props.setProperty("ai.provider.claude.baseUrl", "ftp://api.anthropic.com");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        MultiProviderConfiguration config = parser.parse(props);

        InvalidStartConfigurationException ex = assertThrows(
                InvalidStartConfigurationException.class,
                () -> new MultiProviderConfigurationValidator().validate(config));

        assertTrue(ex.getMessage().contains("baseUrl"),
                "Error message must reference baseUrl");
        assertTrue(ex.getMessage().contains("ftp"),
                "Error message must mention the invalid scheme");
    }

    /**
     * A relative URI (no scheme, no host) must be rejected.
     */
    @Test
    void rejectsRelativeUriAsBaseUrl() {
        Properties props = fullOpenAiProperties();
        props.setProperty("ai.provider.openai-compatible.baseUrl", "/v1/chat");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        MultiProviderConfiguration config = parser.parse(props);

        InvalidStartConfigurationException ex = assertThrows(
                InvalidStartConfigurationException.class,
                () -> new MultiProviderConfigurationValidator().validate(config));

        assertTrue(ex.getMessage().contains("baseUrl"),
                "Error message must reference baseUrl");
    }

    /**
     * A non-http/https scheme (e.g. {@code ftp://}) must be rejected.
     */
    @Test
    void rejectsNonHttpSchemeAsBaseUrl() {
        Properties props = fullOpenAiProperties();
        props.setProperty("ai.provider.openai-compatible.baseUrl", "ftp://api.example.com");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        MultiProviderConfiguration config = parser.parse(props);

        InvalidStartConfigurationException ex = assertThrows(
                InvalidStartConfigurationException.class,
                () -> new MultiProviderConfigurationValidator().validate(config));

        assertTrue(ex.getMessage().contains("baseUrl"),
                "Error message must reference baseUrl");
        assertTrue(ex.getMessage().contains("ftp"),
                "Error message must mention the invalid scheme");
    }

    // =========================================================================
    // Mandatory test case 9
    // =========================================================================

    /**
     * Environment variable is set only for the inactive provider.
     * The active provider must use its own properties value; the inactive provider's
     * env var must not affect the active provider's resolved key.
     */
    @Test
    void envVarOnlyResolvesForActiveProvider() {
        // OpenAI is active with a properties apiKey.
        // ANTHROPIC_API_KEY is set (for the inactive Claude provider).
        // The OpenAI config must use its properties key, not the Anthropic env var.
        Properties props = fullOpenAiProperties();
        props.setProperty("ai.provider.openai-compatible.apiKey", "openai-properties-key");

        Function<String, String> envWithClaudeKeyOnly = key ->
                MultiProviderConfigurationParser.ENV_CLAUDE_API_KEY.equals(key)
                        ? "anthropic-env-key" : null;

        MultiProviderConfiguration config = parseAndValidate(props, envWithClaudeKeyOnly);

        assertEquals("openai-properties-key",
                config.openAiCompatibleConfig().apiKey(),
                "Active provider (OpenAI) must use its own properties key, "
                        + "not the inactive provider's env var");
        // The Anthropic env var IS applied to the Claude config (inactive),
        // but that does not affect the active provider.
        assertEquals("anthropic-env-key",
                config.claudeConfig().apiKey(),
                "Inactive Claude config should still pick up its own env var");
    }

    // =========================================================================
    // Tests: timeout validation
    // =========================================================================

    /**
     * Active provider has timeout set to 0. Validation must fail and mention timeoutSeconds.
     * This verifies that validateTimeoutSeconds is called and that the boundary is strictly
     * positive (i.e. 0 is rejected, not just negative values).
     */
    @Test
    void rejectsZeroTimeoutForActiveProvider() {
        Properties props = fullOpenAiProperties();
        props.setProperty("ai.provider.openai-compatible.timeoutSeconds", "0");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        MultiProviderConfiguration config = parser.parse(props);

        InvalidStartConfigurationException ex = assertThrows(
                InvalidStartConfigurationException.class,
                () -> new MultiProviderConfigurationValidator().validate(config));

        assertTrue(ex.getMessage().contains("timeoutSeconds"),
                "Error message must reference timeoutSeconds");
    }

    /**
     * Active Claude provider has timeout set to 0. Same invariant for the other provider family.
     */
    @Test
    void rejectsZeroTimeoutForActiveClaudeProvider() {
        Properties props = fullClaudeProperties();
        props.setProperty("ai.provider.claude.timeoutSeconds", "0");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        MultiProviderConfiguration config = parser.parse(props);

        InvalidStartConfigurationException ex = assertThrows(
                InvalidStartConfigurationException.class,
                () -> new MultiProviderConfigurationValidator().validate(config));

        assertTrue(ex.getMessage().contains("timeoutSeconds"),
                "Error message must reference timeoutSeconds");
    }
}
|
||||
@@ -1,6 +1,7 @@
|
||||
package de.gecheckt.pdf.umbenenner.adapter.out.configuration;
|
||||
|
||||
import static org.junit.jupiter.api.Assertions.assertEquals;
|
||||
import static org.junit.jupiter.api.Assertions.assertFalse;
|
||||
import static org.junit.jupiter.api.Assertions.assertNotNull;
|
||||
import static org.junit.jupiter.api.Assertions.assertThrows;
|
||||
import static org.junit.jupiter.api.Assertions.assertTrue;
|
||||
@@ -14,11 +15,14 @@ import org.junit.jupiter.api.BeforeEach;
|
||||
import org.junit.jupiter.api.Test;
|
||||
import org.junit.jupiter.api.io.TempDir;
|
||||
|
||||
import de.gecheckt.pdf.umbenenner.application.config.provider.AiProviderFamily;
|
||||
|
||||
/**
|
||||
* Unit tests for {@link PropertiesConfigurationPortAdapter}.
|
||||
* <p>
|
||||
* Tests cover valid configuration loading, missing mandatory properties,
|
||||
* invalid property values, and API-key environment variable precedence.
|
||||
* invalid property values, and API-key environment variable precedence
|
||||
* for the multi-provider schema.
|
||||
*/
|
||||
class PropertiesConfigurationPortAdapterTest {
|
||||
|
||||
@@ -41,13 +45,20 @@ class PropertiesConfigurationPortAdapterTest {
var config = adapter.loadConfiguration();

assertNotNull(config);
// Use endsWith to handle platform-specific path separators
assertTrue(config.sourceFolder().toString().endsWith("source"));
assertTrue(config.targetFolder().toString().endsWith("target"));
assertTrue(config.sqliteFile().toString().endsWith("db.sqlite"));
assertEquals("https://api.example.com", config.apiBaseUrl().toString());
assertEquals("gpt-4", config.apiModel());
assertEquals(30, config.apiTimeoutSeconds());
assertNotNull(config.multiProviderConfiguration());
assertEquals(AiProviderFamily.OPENAI_COMPATIBLE,
config.multiProviderConfiguration().activeProviderFamily());
assertEquals("https://api.example.com",
config.multiProviderConfiguration().activeProviderConfiguration().baseUrl());
assertEquals("gpt-4",
config.multiProviderConfiguration().activeProviderConfiguration().model());
assertEquals(30,
config.multiProviderConfiguration().activeProviderConfiguration().timeoutSeconds());
assertEquals("test-api-key-from-properties",
config.multiProviderConfiguration().activeProviderConfiguration().apiKey());
assertEquals(3, config.maxRetriesTransient());
assertEquals(100, config.maxPages());
assertEquals(50000, config.maxTextCharacters());
@@ -55,57 +66,60 @@ class PropertiesConfigurationPortAdapterTest {
assertTrue(config.runtimeLockFile().toString().endsWith("lock.lock"));
assertTrue(config.logDirectory().toString().endsWith("logs"));
assertEquals("DEBUG", config.logLevel());
assertEquals("test-api-key-from-properties", config.apiKey());
}

@Test
void loadConfiguration_usesPropertiesApiKeyWhenEnvVarIsAbsent() throws Exception {
void loadConfiguration_rejectsBlankApiKeyWhenAbsentAndNoEnvVar() throws Exception {
Path configFile = createConfigFile("no-api-key.properties");

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertEquals("", config.apiKey(), "API key should be empty when not in properties and no env var");
assertThrows(
de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException.class,
adapter::loadConfiguration,
"Missing API key must be rejected as invalid configuration");
}

@Test
void loadConfiguration_usesPropertiesApiKeyWhenEnvVarIsNull() throws Exception {
void loadConfiguration_rejectsBlankApiKeyWhenEnvVarIsNull() throws Exception {
Path configFile = createConfigFile("no-api-key.properties");

Function<String, String> envLookup = key -> null;

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(envLookup, configFile);

var config = adapter.loadConfiguration();

assertEquals("", config.apiKey());
assertThrows(
de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException.class,
adapter::loadConfiguration,
"Null env var with no properties API key must be rejected as invalid configuration");
}

@Test
void loadConfiguration_usesPropertiesApiKeyWhenEnvVarIsEmpty() throws Exception {
void loadConfiguration_rejectsBlankApiKeyWhenEnvVarIsEmpty() throws Exception {
Path configFile = createConfigFile("no-api-key.properties");

Function<String, String> envLookup = key -> "";

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(envLookup, configFile);

var config = adapter.loadConfiguration();

assertEquals("", config.apiKey(), "Empty env var should fall back to empty string");
assertThrows(
de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException.class,
adapter::loadConfiguration,
"Empty env var with no properties API key must be rejected as invalid configuration");
}

@Test
void loadConfiguration_usesPropertiesApiKeyWhenEnvVarIsBlank() throws Exception {
void loadConfiguration_rejectsBlankApiKeyWhenEnvVarIsBlank() throws Exception {
Path configFile = createConfigFile("no-api-key.properties");

Function<String, String> envLookup = key -> " ";

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(envLookup, configFile);

var config = adapter.loadConfiguration();

assertEquals("", config.apiKey(), "Blank env var should fall back to empty string");
assertThrows(
de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException.class,
adapter::loadConfiguration,
"Blank env var with no properties API key must be rejected as invalid configuration");
}

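The four rejection tests above all pin down the same resolution rule: a non-blank environment variable wins over the properties file, and a value that is still null, empty, or blank after that precedence step is rejected at startup. A minimal sketch of that rule (class and method names here are illustrative, not the production adapter's API):

```java
import java.util.function.Function;

/** Sketch of API-key resolution: env var overrides properties; blank results are rejected. */
public final class ApiKeyResolution {

    /** Stand-in for the adapter's InvalidStartConfigurationException. */
    static final class MissingApiKeyException extends RuntimeException {
        MissingApiKeyException(String message) {
            super(message);
        }
    }

    static String resolveApiKey(Function<String, String> envLookup,
                                String envVarName,
                                String propertiesValue) {
        String fromEnv = envLookup.apply(envVarName);
        // The environment variable takes precedence only when it carries a usable value.
        String candidate = (fromEnv != null && !fromEnv.isBlank()) ? fromEnv : propertiesValue;
        if (candidate == null || candidate.isBlank()) {
            // null, "" and " " all end up here, matching the four test variants above.
            throw new MissingApiKeyException(
                "API key missing: neither " + envVarName + " nor the properties file provide a value");
        }
        return candidate.trim();
    }
}
```

With this shape, the absent/null/empty/blank environment-variable cases collapse into one branch, which is why the four tests can share a single failure message pattern.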
@Test
@@ -113,7 +127,7 @@ class PropertiesConfigurationPortAdapterTest {
Path configFile = createConfigFile("valid-config.properties");

Function<String, String> envLookup = key -> {
if ("PDF_UMBENENNER_API_KEY".equals(key)) {
if (MultiProviderConfigurationParser.ENV_OPENAI_API_KEY.equals(key)) {
return "env-api-key-override";
}
return null;
@@ -123,7 +137,9 @@ class PropertiesConfigurationPortAdapterTest {

var config = adapter.loadConfiguration();

assertEquals("env-api-key-override", config.apiKey(), "Environment variable should override properties");
assertEquals("env-api-key-override",
config.multiProviderConfiguration().activeProviderConfiguration().apiKey(),
"Environment variable must override properties API key");
}

@Test
@@ -162,21 +178,22 @@ class PropertiesConfigurationPortAdapterTest {
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"api.baseUrl=https://api.example.com\n" +
"api.model=gpt-4\n" +
"api.timeoutSeconds=60\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=60\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=5\n" +
"max.pages=200\n" +
"max.text.characters=100000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"api.key=test-key\n"
"prompt.template.file=/tmp/prompt.txt\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertEquals(60, config.apiTimeoutSeconds());
assertEquals(60, config.multiProviderConfiguration().activeProviderConfiguration().timeoutSeconds());
assertEquals(5, config.maxRetriesTransient());
assertEquals(200, config.maxPages());
assertEquals(100000, config.maxTextCharacters());
@@ -188,21 +205,24 @@ class PropertiesConfigurationPortAdapterTest {
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"api.baseUrl=https://api.example.com\n" +
"api.model=gpt-4\n" +
"api.timeoutSeconds= 45 \n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds= 45 \n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=2\n" +
"max.pages=150\n" +
"max.text.characters=75000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"api.key=test-key\n"
"prompt.template.file=/tmp/prompt.txt\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertEquals(45, config.apiTimeoutSeconds(), "Whitespace should be trimmed from integer values");
assertEquals(45,
config.multiProviderConfiguration().activeProviderConfiguration().timeoutSeconds(),
"Whitespace should be trimmed from integer values");
}

@Test
@@ -211,14 +231,15 @@ class PropertiesConfigurationPortAdapterTest {
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"api.baseUrl=https://api.example.com\n" +
"api.model=gpt-4\n" +
"api.timeoutSeconds=not-a-number\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=not-a-number\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=2\n" +
"max.pages=150\n" +
"max.text.characters=75000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"api.key=test-key\n"
"prompt.template.file=/tmp/prompt.txt\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);
@@ -232,26 +253,28 @@ class PropertiesConfigurationPortAdapterTest {
}

@Test
void loadConfiguration_parsesUriCorrectly() throws Exception {
void loadConfiguration_parsesBaseUrlStringCorrectly() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"api.baseUrl=https://api.example.com:8080/v1\n" +
"api.model=gpt-4\n" +
"api.timeoutSeconds=30\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com:8080/v1\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"api.key=test-key\n"
"prompt.template.file=/tmp/prompt.txt\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertEquals("https://api.example.com:8080/v1", config.apiBaseUrl().toString());
assertEquals("https://api.example.com:8080/v1",
config.multiProviderConfiguration().activeProviderConfiguration().baseUrl());
}

@Test
@@ -260,14 +283,15 @@ class PropertiesConfigurationPortAdapterTest {
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"api.baseUrl=https://api.example.com\n" +
"api.model=gpt-4\n" +
"api.timeoutSeconds=30\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"api.key=test-key\n"
"prompt.template.file=/tmp/prompt.txt\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);
@@ -281,26 +305,28 @@ class PropertiesConfigurationPortAdapterTest {

@Test
void allConfigurationFailuresAreClassifiedAsConfigurationLoadingException() throws Exception {
// Verify that file I/O failure uses ConfigurationLoadingException
// File I/O failure
Path nonExistentFile = tempDir.resolve("nonexistent.properties");
PropertiesConfigurationPortAdapter adapter1 = new PropertiesConfigurationPortAdapter(emptyEnvLookup, nonExistentFile);
assertThrows(ConfigurationLoadingException.class, () -> adapter1.loadConfiguration(),
"File I/O failure should throw ConfigurationLoadingException");

// Verify that missing required property uses ConfigurationLoadingException
// Missing required property
Path missingPropFile = createConfigFile("missing-required.properties");
PropertiesConfigurationPortAdapter adapter2 = new PropertiesConfigurationPortAdapter(emptyEnvLookup, missingPropFile);
assertThrows(ConfigurationLoadingException.class, () -> adapter2.loadConfiguration(),
"Missing required property should throw ConfigurationLoadingException");

// Verify that invalid integer value uses ConfigurationLoadingException
// Invalid integer value
Path invalidIntFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"api.baseUrl=https://api.example.com\n" +
"api.model=gpt-4\n" +
"api.timeoutSeconds=invalid\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=invalid\n" +
"ai.provider.openai-compatible.apiKey=key\n" +
"max.retries.transient=2\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
@@ -310,29 +336,245 @@ class PropertiesConfigurationPortAdapterTest {
assertThrows(ConfigurationLoadingException.class, () -> adapter3.loadConfiguration(),
"Invalid integer value should throw ConfigurationLoadingException");

// Verify that invalid URI value uses ConfigurationLoadingException
Path invalidUriFile = createInlineConfig(
// Unknown ai.provider.active value
Path unknownProviderFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"api.baseUrl=not a valid uri\n" +
"api.model=gpt-4\n" +
"api.timeoutSeconds=30\n" +
"ai.provider.active=unknown-provider\n" +
"max.retries.transient=2\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n"
);
PropertiesConfigurationPortAdapter adapter4 = new PropertiesConfigurationPortAdapter(emptyEnvLookup, invalidUriFile);
PropertiesConfigurationPortAdapter adapter4 = new PropertiesConfigurationPortAdapter(emptyEnvLookup, unknownProviderFile);
assertThrows(ConfigurationLoadingException.class, () -> adapter4.loadConfiguration(),
"Invalid URI value should throw ConfigurationLoadingException");
"Unknown provider identifier should throw ConfigurationLoadingException");
}

@Test
void loadConfiguration_logAiSensitiveDefaultsFalseWhenAbsent() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n"
// log.ai.sensitive intentionally omitted
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertFalse(config.logAiSensitive(),
"log.ai.sensitive must default to false when the property is absent");
}

@Test
void loadConfiguration_logAiSensitiveParsedTrueWhenExplicitlySet() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"log.ai.sensitive=true\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertTrue(config.logAiSensitive(),
"log.ai.sensitive must be parsed as true when explicitly set to 'true'");
}

@Test
void loadConfiguration_logAiSensitiveParsedFalseWhenExplicitlySet() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"log.ai.sensitive=false\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertFalse(config.logAiSensitive(),
"log.ai.sensitive must be parsed as false when explicitly set to 'false'");
}

@Test
void loadConfiguration_logAiSensitiveHandlesCaseInsensitiveTrue() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"log.ai.sensitive=TRUE\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertTrue(config.logAiSensitive(),
"log.ai.sensitive must handle case-insensitive 'TRUE'");
}

@Test
void loadConfiguration_logAiSensitiveHandlesCaseInsensitiveFalse() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"log.ai.sensitive=FALSE\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertFalse(config.logAiSensitive(),
"log.ai.sensitive must handle case-insensitive 'FALSE'");
}

@Test
void loadConfiguration_throwsConfigurationLoadingExceptionForInvalidLogAiSensitive() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"log.ai.sensitive=maybe\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

ConfigurationLoadingException exception = assertThrows(
ConfigurationLoadingException.class,
() -> adapter.loadConfiguration()
);

assertTrue(exception.getMessage().contains("Invalid value for log.ai.sensitive"),
"Invalid log.ai.sensitive value should throw ConfigurationLoadingException");
assertTrue(exception.getMessage().contains("'maybe'"),
"Error message should include the invalid value");
}

@Test
void loadConfiguration_throwsConfigurationLoadingExceptionForInvalidLogAiSensitiveYes() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"log.ai.sensitive=yes\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

ConfigurationLoadingException exception = assertThrows(
ConfigurationLoadingException.class,
() -> adapter.loadConfiguration()
);

assertTrue(exception.getMessage().contains("Invalid value for log.ai.sensitive"),
"Invalid log.ai.sensitive value 'yes' should throw ConfigurationLoadingException");
}

@Test
void loadConfiguration_throwsConfigurationLoadingExceptionForInvalidLogAiSensitive1() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"log.ai.sensitive=1\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

ConfigurationLoadingException exception = assertThrows(
ConfigurationLoadingException.class,
() -> adapter.loadConfiguration()
);

assertTrue(exception.getMessage().contains("Invalid value for log.ai.sensitive"),
"Invalid log.ai.sensitive value '1' should throw ConfigurationLoadingException");
}

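The `log.ai.sensitive` tests above demand stricter behaviour than `Boolean.parseBoolean`, which silently maps anything other than `"true"` to `false` and would accept `yes`, `1`, and `maybe` without complaint. A sketch of the strict parsing the tests imply (names are illustrative, not the adapter's actual API):

```java
import java.util.Locale;

/** Sketch: strict boolean parsing for log.ai.sensitive (only "true"/"false", case-insensitive). */
public final class StrictBooleanParsing {

    /** Stand-in for the adapter's ConfigurationLoadingException. */
    static final class InvalidBooleanException extends RuntimeException {
        InvalidBooleanException(String message) {
            super(message);
        }
    }

    static boolean parseStrictBoolean(String key, String raw, boolean defaultWhenAbsent) {
        if (raw == null) {
            // Absent property falls back to the documented default (false for log.ai.sensitive).
            return defaultWhenAbsent;
        }
        String normalized = raw.trim().toLowerCase(Locale.ROOT);
        return switch (normalized) {
            case "true" -> true;
            case "false" -> false;
            // "yes", "1", "maybe", ... are rejected instead of being coerced to false.
            default -> throw new InvalidBooleanException(
                "Invalid value for " + key + ": '" + raw.trim() + "' (expected true or false)");
        };
    }
}
```

Rejecting unrecognised values keeps a typo in a security-relevant flag from silently disabling (or enabling) sensitive logging.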
private Path createConfigFile(String resourceName) throws Exception {
Path sourceResource = Path.of("src/test/resources", resourceName);
Path targetConfigFile = tempDir.resolve("application.properties");

// Copy content from resource file
Files.copy(sourceResource, targetConfigFile);
return targetConfigFile;
}
@@ -344,4 +586,4 @@ class PropertiesConfigurationPortAdapterTest {
}
return configFile;
}
}
}

@@ -19,8 +19,6 @@ import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

/**
* Unit tests for {@link Sha256FingerprintAdapter}.
*
* @since M4-AP-002
*/
class Sha256FingerprintAdapterTest {

@@ -28,12 +28,10 @@ import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;
/**
* Tests for {@link PdfTextExtractionPortAdapter}.
* <p>
* M3-AP-003: Minimal tests validating basic extraction functionality and technical error handling.
* In AP-003 scope: all extraction problems are treated as TechnicalError, not ContentError.
* No domain-level validation of text content (that is AP-004).
* Validates basic extraction functionality and technical error handling.
* All extraction problems are treated as {@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError},
* not content errors. Content usability (text quality assessment) is handled in the application layer.
* PDFs are created programmatically using PDFBox to avoid external dependencies on test files.
*
* @since M3-AP-003
*/
class PdfTextExtractionPortAdapterTest {

@@ -170,8 +168,8 @@ class PdfTextExtractionPortAdapterTest {

PdfExtractionResult result = adapter.extractTextAndPageCount(candidate);

// AP-003: Empty text is SUCCESS, not an error
// Domain-level evaluation of text content happens in AP-004
// Empty text is SUCCESS at extraction level, not an error
// Domain-level evaluation of text content happens in the application layer
assertInstanceOf(PdfExtractionSuccess.class, result);
PdfExtractionSuccess success = (PdfExtractionSuccess) result;
assertEquals(1, success.pageCount().value());

@@ -20,8 +20,6 @@ import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;

/**
* Tests for {@link SourceDocumentCandidatesPortAdapter}.
*
* @since M3-AP-002
*/
class SourceDocumentCandidatesPortAdapterTest {

@@ -198,7 +196,7 @@ class SourceDocumentCandidatesPortAdapterTest {

@Test
void testLoadCandidates_EmptyPdfFilesAreIncluded() throws IOException {
// Create empty PDF files (M3-AP-002 requirement: PDF files in the source folder)
// Create empty PDF files
Files.createFile(tempDir.resolve("empty1.pdf"));
Files.createFile(tempDir.resolve("empty2.pdf"));
// Also add a non-empty PDF for contrast
@@ -207,8 +205,26 @@ class SourceDocumentCandidatesPortAdapterTest {
List<SourceDocumentCandidate> candidates = adapter.loadCandidates();

assertEquals(3, candidates.size(),
"Empty PDF files should be included as candidates; content evaluation happens in AP-004");
"Empty PDF files should be included as candidates; content evaluation happens during document processing");
assertTrue(candidates.stream().allMatch(c -> c.uniqueIdentifier().endsWith(".pdf")),
"All candidates should be PDF files");
}

/**
* A directory whose name ends with {@code .pdf} must not be included as a candidate.
* <p>
* The regular-file filter must exclude directories even when their name matches the
* PDF extension, so that only actual PDF files are returned.
*/
@Test
void testLoadCandidates_DirectoryWithPdfExtensionIsExcluded() throws IOException {
Files.write(tempDir.resolve("real.pdf"), "content".getBytes());
Files.createDirectory(tempDir.resolve("looks-like.pdf"));

List<SourceDocumentCandidate> candidates = adapter.loadCandidates();

assertEquals(1, candidates.size(),
"A directory with .pdf extension must not be included as a candidate");
assertEquals("real.pdf", candidates.get(0).uniqueIdentifier());
}
}

@@ -0,0 +1,394 @@
package de.gecheckt.pdf.umbenenner.adapter.out.sqlite;

import static org.assertj.core.api.Assertions.assertThat;

import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.List;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingAttempt;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
import de.gecheckt.pdf.umbenenner.domain.model.RunId;

/**
* Tests for the additive {@code ai_provider} column in {@code processing_attempt}.
* <p>
* Covers schema migration (idempotency, nullable default for existing rows),
* write/read round-trips for both supported provider identifiers, and
* backward compatibility with databases created before provider tracking was introduced.
*/
class SqliteAttemptProviderPersistenceTest {

private String jdbcUrl;
private SqliteSchemaInitializationAdapter schemaAdapter;
private SqliteProcessingAttemptRepositoryAdapter repository;

@TempDir
Path tempDir;

@BeforeEach
void setUp() {
Path dbFile = tempDir.resolve("provider-test.db");
jdbcUrl = "jdbc:sqlite:" + dbFile.toAbsolutePath();
schemaAdapter = new SqliteSchemaInitializationAdapter(jdbcUrl);
repository = new SqliteProcessingAttemptRepositoryAdapter(jdbcUrl);
}

// -------------------------------------------------------------------------
// Schema migration tests
// -------------------------------------------------------------------------

/**
* A fresh database must contain the {@code ai_provider} column after schema initialisation.
*/
@Test
void addsProviderColumnOnFreshDb() throws SQLException {
schemaAdapter.initializeSchema();

assertThat(columnExists("processing_attempt", "ai_provider"))
.as("ai_provider column must exist in processing_attempt after fresh schema init")
.isTrue();
}

/**
* A database that already has the {@code processing_attempt} table without
* {@code ai_provider} (simulating an existing installation before this column was added)
* must receive the column via the idempotent schema evolution.
*/
@Test
void addsProviderColumnOnExistingDbWithoutColumn() throws SQLException {
// Bootstrap schema without the ai_provider column (simulate legacy DB)
createLegacySchema();

assertThat(columnExists("processing_attempt", "ai_provider"))
.as("ai_provider must not be present before evolution")
.isFalse();

// Running initializeSchema must add the column
schemaAdapter.initializeSchema();

assertThat(columnExists("processing_attempt", "ai_provider"))
.as("ai_provider column must be added by schema evolution")
.isTrue();
}

/**
* Running schema initialisation multiple times must not fail and must not change the schema.
*/
@Test
void migrationIsIdempotent() throws SQLException {
schemaAdapter.initializeSchema();
// Second and third init must not throw or change the schema
schemaAdapter.initializeSchema();
schemaAdapter.initializeSchema();

assertThat(columnExists("processing_attempt", "ai_provider"))
.as("Column must still be present after repeated init calls")
.isTrue();
}

/**
* Rows that existed before the {@code ai_provider} column was added must have
* {@code NULL} as the column value, not a non-null default.
*/
@Test
void existingRowsKeepNullProvider() throws SQLException {
// Create legacy schema and insert a row without ai_provider
createLegacySchema();
DocumentFingerprint fp = fingerprint("aa");
insertLegacyDocumentRecord(fp);
insertLegacyAttemptRow(fp, "READY_FOR_AI");

// Now evolve the schema
schemaAdapter.initializeSchema();

// Read the existing row: ai_provider must be NULL
List<ProcessingAttempt> attempts = repository.findAllByFingerprint(fp);
assertThat(attempts).hasSize(1);
assertThat(attempts.get(0).aiProvider())
.as("Existing rows must have NULL ai_provider after schema evolution")
.isNull();
}

// -------------------------------------------------------------------------
||||
// Write tests
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
/**
|
||||
* A new attempt written with an active OpenAI-compatible provider must
|
||||
* persist {@code "openai-compatible"} in {@code ai_provider}.
|
||||
*/
|
||||
@Test
|
||||
void newAttemptsWriteOpenAiCompatibleProvider() {
|
||||
schemaAdapter.initializeSchema();
|
||||
DocumentFingerprint fp = fingerprint("bb");
|
||||
insertDocumentRecord(fp);
|
||||
|
||||
Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
|
||||
ProcessingAttempt attempt = new ProcessingAttempt(
|
||||
fp, new RunId("run-oai"), 1, now, now.plusSeconds(1),
|
||||
ProcessingStatus.READY_FOR_AI,
|
||||
null, null, false,
|
||||
"openai-compatible",
|
||||
null, null, null, null, null, null,
|
||||
null, null, null, null);
|
||||
|
||||
repository.save(attempt);
|
||||
|
||||
List<ProcessingAttempt> saved = repository.findAllByFingerprint(fp);
|
||||
assertThat(saved).hasSize(1);
|
||||
assertThat(saved.get(0).aiProvider()).isEqualTo("openai-compatible");
|
||||
}
|
||||
|
||||
/**
|
||||
* A new attempt written with an active Claude provider must persist
|
||||
* {@code "claude"} in {@code ai_provider}.
|
||||
* <p>
|
||||
* The provider selection is simulated at the data level here; the actual
|
||||
* Claude adapter is wired in a later step.
|
||||
*/
|
||||
@Test
|
||||
void newAttemptsWriteClaudeProvider() {
|
||||
schemaAdapter.initializeSchema();
|
||||
DocumentFingerprint fp = fingerprint("cc");
|
||||
insertDocumentRecord(fp);
|
||||
|
||||
Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
|
||||
ProcessingAttempt attempt = new ProcessingAttempt(
|
||||
fp, new RunId("run-claude"), 1, now, now.plusSeconds(1),
|
||||
ProcessingStatus.READY_FOR_AI,
|
||||
null, null, false,
|
||||
"claude",
|
||||
null, null, null, null, null, null,
|
||||
null, null, null, null);
|
||||
|
||||
repository.save(attempt);
|
||||
|
||||
List<ProcessingAttempt> saved = repository.findAllByFingerprint(fp);
|
||||
assertThat(saved).hasSize(1);
|
||||
assertThat(saved.get(0).aiProvider()).isEqualTo("claude");
|
||||
}
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// Read tests
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
/**
|
||||
* The repository must correctly return the persisted provider identifier
|
||||
* when reading an attempt back from the database.
|
||||
*/
|
||||
@Test
|
||||
void repositoryReadsProviderColumn() {
|
||||
schemaAdapter.initializeSchema();
|
||||
DocumentFingerprint fp = fingerprint("dd");
|
||||
insertDocumentRecord(fp);
|
||||
|
||||
Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
|
||||
repository.save(new ProcessingAttempt(
|
||||
fp, new RunId("run-read"), 1, now, now.plusSeconds(2),
|
||||
ProcessingStatus.FAILED_RETRYABLE,
|
||||
"Timeout", "Connection timed out", true,
|
||||
"openai-compatible",
|
||||
null, null, null, null, null, null,
|
||||
null, null, null, null));
|
||||
|
||||
List<ProcessingAttempt> loaded = repository.findAllByFingerprint(fp);
|
||||
assertThat(loaded).hasSize(1);
|
||||
assertThat(loaded.get(0).aiProvider())
|
||||
.as("Repository must return the persisted ai_provider value")
|
||||
.isEqualTo("openai-compatible");
|
||||
}
|
||||
|
||||
/**
|
||||
* Reading a database that was created without the {@code ai_provider} column
|
||||
* (a pre-extension database) must succeed; the new field must be empty/null
|
||||
* for historical attempts.
|
||||
*/
|
||||
@Test
|
||||
void legacyDataReadingDoesNotFail() throws SQLException {
|
||||
// Set up legacy schema with a row that has no ai_provider column
|
||||
createLegacySchema();
|
||||
DocumentFingerprint fp = fingerprint("ee");
|
||||
insertLegacyDocumentRecord(fp);
|
||||
insertLegacyAttemptRow(fp, "FAILED_RETRYABLE");
|
||||
|
||||
// Evolve schema — now ai_provider column exists but legacy rows have NULL
|
||||
schemaAdapter.initializeSchema();
|
||||
|
||||
// Reading must not throw and must return null for ai_provider
|
||||
List<ProcessingAttempt> attempts = repository.findAllByFingerprint(fp);
|
||||
assertThat(attempts).hasSize(1);
|
||||
assertThat(attempts.get(0).aiProvider())
|
||||
.as("Legacy attempt (from before provider tracking) must have null aiProvider")
|
||||
.isNull();
|
||||
// Other fields must still be readable
|
||||
assertThat(attempts.get(0).status()).isEqualTo(ProcessingStatus.FAILED_RETRYABLE);
|
||||
}
|
||||
|
||||
/**
|
||||
* All existing attempt history tests must remain green: the repository
|
||||
* handles null {@code ai_provider} values transparently without errors.
|
||||
*/
|
||||
@Test
|
||||
void existingHistoryTestsRemainGreen() {
|
||||
schemaAdapter.initializeSchema();
|
||||
DocumentFingerprint fp = fingerprint("ff");
|
||||
insertDocumentRecord(fp);
|
||||
|
||||
Instant base = Instant.now().truncatedTo(ChronoUnit.MICROS);
|
||||
|
||||
// Save attempt with null provider (as in legacy path or non-AI attempt)
|
||||
ProcessingAttempt nullProviderAttempt = ProcessingAttempt.withoutAiFields(
|
||||
fp, new RunId("run-legacy"), 1,
|
||||
base, base.plusSeconds(1),
|
||||
ProcessingStatus.FAILED_RETRYABLE,
|
||||
"Err", "msg", true);
|
||||
repository.save(nullProviderAttempt);
|
||||
|
||||
// Save attempt with explicit provider
|
||||
ProcessingAttempt withProvider = new ProcessingAttempt(
|
||||
fp, new RunId("run-new"), 2,
|
||||
base.plusSeconds(10), base.plusSeconds(11),
|
||||
ProcessingStatus.READY_FOR_AI,
|
||||
null, null, false,
|
||||
"openai-compatible",
|
||||
null, null, null, null, null, null,
|
||||
null, null, null, null);
|
||||
repository.save(withProvider);
|
||||
|
||||
List<ProcessingAttempt> all = repository.findAllByFingerprint(fp);
|
||||
assertThat(all).hasSize(2);
|
||||
assertThat(all.get(0).aiProvider()).isNull();
|
||||
assertThat(all.get(1).aiProvider()).isEqualTo("openai-compatible");
|
||||
// Ordering preserved
|
||||
assertThat(all.get(0).attemptNumber()).isEqualTo(1);
|
||||
assertThat(all.get(1).attemptNumber()).isEqualTo(2);
|
||||
}
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// Helpers
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
private boolean columnExists(String table, String column) throws SQLException {
|
||||
try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
|
||||
DatabaseMetaData meta = conn.getMetaData();
|
||||
try (ResultSet rs = meta.getColumns(null, null, table, column)) {
|
||||
return rs.next();
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Creates the base tables that existed before the {@code ai_provider} column was added,
|
||||
* without running the schema evolution that adds that column.
|
||||
*/
|
||||
private void createLegacySchema() throws SQLException {
|
||||
try (Connection conn = DriverManager.getConnection(jdbcUrl);
|
||||
Statement stmt = conn.createStatement()) {
|
||||
stmt.execute("PRAGMA foreign_keys = ON");
|
||||
stmt.execute("""
|
||||
CREATE TABLE IF NOT EXISTS document_record (
|
||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
fingerprint TEXT NOT NULL,
|
||||
last_known_source_locator TEXT NOT NULL,
|
||||
last_known_source_file_name TEXT NOT NULL,
|
||||
overall_status TEXT NOT NULL,
|
||||
content_error_count INTEGER NOT NULL DEFAULT 0,
|
||||
transient_error_count INTEGER NOT NULL DEFAULT 0,
|
||||
last_failure_instant TEXT,
|
||||
last_success_instant TEXT,
|
||||
created_at TEXT NOT NULL,
|
||||
updated_at TEXT NOT NULL,
|
||||
CONSTRAINT uq_document_record_fingerprint UNIQUE (fingerprint)
|
||||
)""");
|
||||
stmt.execute("""
|
||||
CREATE TABLE IF NOT EXISTS processing_attempt (
|
||||
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
fingerprint TEXT NOT NULL,
|
||||
run_id TEXT NOT NULL,
|
||||
attempt_number INTEGER NOT NULL,
|
||||
started_at TEXT NOT NULL,
|
||||
ended_at TEXT NOT NULL,
|
||||
status TEXT NOT NULL,
|
||||
failure_class TEXT,
|
||||
failure_message TEXT,
|
||||
retryable INTEGER NOT NULL DEFAULT 0,
|
||||
model_name TEXT,
|
||||
prompt_identifier TEXT,
|
||||
processed_page_count INTEGER,
|
||||
sent_character_count INTEGER,
|
||||
ai_raw_response TEXT,
|
||||
ai_reasoning TEXT,
|
||||
resolved_date TEXT,
|
||||
date_source TEXT,
|
||||
validated_title TEXT,
|
||||
final_target_file_name TEXT,
|
||||
CONSTRAINT fk_processing_attempt_fingerprint
|
||||
FOREIGN KEY (fingerprint) REFERENCES document_record (fingerprint),
|
||||
CONSTRAINT uq_processing_attempt_fingerprint_number
|
||||
UNIQUE (fingerprint, attempt_number)
|
||||
)""");
|
||||
}
|
||||
}
|
||||
|
||||
private void insertLegacyDocumentRecord(DocumentFingerprint fp) throws SQLException {
|
||||
try (Connection conn = DriverManager.getConnection(jdbcUrl);
|
||||
PreparedStatement ps = conn.prepareStatement("""
|
||||
INSERT INTO document_record
|
||||
(fingerprint, last_known_source_locator, last_known_source_file_name,
|
||||
overall_status, created_at, updated_at)
|
||||
VALUES (?, '/tmp/test.pdf', 'test.pdf', 'READY_FOR_AI',
|
||||
strftime('%Y-%m-%dT%H:%M:%SZ', 'now'),
|
||||
strftime('%Y-%m-%dT%H:%M:%SZ', 'now'))""")) {
|
||||
ps.setString(1, fp.sha256Hex());
|
||||
ps.executeUpdate();
|
||||
}
|
||||
}
|
||||
|
||||
private void insertLegacyAttemptRow(DocumentFingerprint fp, String status) throws SQLException {
|
||||
try (Connection conn = DriverManager.getConnection(jdbcUrl);
|
||||
PreparedStatement ps = conn.prepareStatement("""
|
||||
INSERT INTO processing_attempt
|
||||
(fingerprint, run_id, attempt_number, started_at, ended_at, status, retryable)
|
||||
VALUES (?, 'run-legacy', 1, strftime('%Y-%m-%dT%H:%M:%SZ', 'now'),
|
||||
strftime('%Y-%m-%dT%H:%M:%SZ', 'now'), ?, 1)""")) {
|
||||
ps.setString(1, fp.sha256Hex());
|
||||
ps.setString(2, status);
|
||||
ps.executeUpdate();
|
||||
}
|
||||
}
|
||||
|
||||
private void insertDocumentRecord(DocumentFingerprint fp) {
|
||||
try (Connection conn = DriverManager.getConnection(jdbcUrl);
|
||||
PreparedStatement ps = conn.prepareStatement("""
|
||||
INSERT INTO document_record
|
||||
(fingerprint, last_known_source_locator, last_known_source_file_name,
|
||||
overall_status, created_at, updated_at)
|
||||
VALUES (?, '/tmp/test.pdf', 'test.pdf', 'READY_FOR_AI',
|
||||
strftime('%Y-%m-%dT%H:%M:%SZ', 'now'),
|
||||
strftime('%Y-%m-%dT%H:%M:%SZ', 'now'))""")) {
|
||||
ps.setString(1, fp.sha256Hex());
|
||||
ps.executeUpdate();
|
||||
} catch (SQLException e) {
|
||||
throw new RuntimeException("Failed to insert test document record", e);
|
||||
}
|
||||
}
|
||||
|
||||
private static DocumentFingerprint fingerprint(String suffix) {
|
||||
return new DocumentFingerprint(
|
||||
("0".repeat(64 - suffix.length()) + suffix));
|
||||
}
|
||||
}
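The migration behaviour these tests pin down is: add `ai_provider` only when it is missing, so repeated initialisation runs are no-ops and pre-existing rows keep `NULL`. The following is an illustrative simulation of that guard, not the project's `SqliteSchemaInitializationAdapter`: an in-memory column set stands in for the real `DatabaseMetaData.getColumns(...)` lookup used in `columnExists` above.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative sketch: an idempotent "add column if missing" guard.
// A Set<String> simulates the real column lookup via DatabaseMetaData.
class IdempotentColumnAddSketch {

    /** Adds the column only if it is not already present; returns true if it was added. */
    static boolean addColumnIfMissing(Set<String> existingColumns, String column) {
        if (existingColumns.contains(column)) {
            return false; // already migrated: repeated init must be a no-op
        }
        // The real adapter would execute something like:
        //   ALTER TABLE processing_attempt ADD COLUMN ai_provider TEXT
        // (nullable, so pre-existing rows keep NULL, as the tests require)
        existingColumns.add(column);
        return true;
    }

    public static void main(String[] args) {
        Set<String> columns = new LinkedHashSet<>(Set.of("id", "fingerprint", "status"));
        System.out.println(addColumnIfMissing(columns, "ai_provider")); // first run adds it
        System.out.println(addColumnIfMissing(columns, "ai_provider")); // second run: no-op
        System.out.println(addColumnIfMissing(columns, "ai_provider")); // third run: still a no-op
    }
}
```

Because `ALTER TABLE ... ADD COLUMN` in SQLite fails if the column already exists, the existence check before the statement is what makes `initializeSchema()` safe to call any number of times.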
@@ -391,6 +391,7 @@ class SqliteProcessingAttemptRepositoryAdapterTest {
                fingerprint, runId, 1, startedAt, endedAt,
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                "openai-compatible",
                "gpt-4o", "prompt-v1.txt",
                5, 1234,
                "{\"date\":\"2026-03-15\",\"title\":\"Stromabrechnung\",\"reasoning\":\"Invoice date found.\"}",
@@ -434,6 +435,7 @@ class SqliteProcessingAttemptRepositoryAdapterTest {
                fingerprint, runId, 1, now, now.plusSeconds(5),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                "openai-compatible",
                "claude-sonnet-4-6", "prompt-v2.txt",
                3, 800,
                "{\"title\":\"Kontoauszug\",\"reasoning\":\"No date in document.\"}",
@@ -531,6 +533,7 @@ class SqliteProcessingAttemptRepositoryAdapterTest {
                fingerprint, new RunId("run-p"), 1, now, now.plusSeconds(2),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                null,
                "gpt-4o", "prompt-v1.txt", 2, 500,
                "{\"title\":\"Rechnung\",\"reasoning\":\"Found.\"}",
                "Found.", date, DateSource.AI_PROVIDED, "Rechnung",
@@ -560,6 +563,7 @@ class SqliteProcessingAttemptRepositoryAdapterTest {
                fingerprint, new RunId("run-1"), 1, base, base.plusSeconds(1),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                null,
                "model-a", "prompt-v1.txt", 1, 100,
                "{}", "First.", LocalDate.of(2026, 1, 1), DateSource.AI_PROVIDED, "TitelEins",
                null
@@ -577,6 +581,7 @@ class SqliteProcessingAttemptRepositoryAdapterTest {
                fingerprint, new RunId("run-3"), 3, base.plusSeconds(20), base.plusSeconds(21),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                null,
                "model-b", "prompt-v2.txt", 2, 200,
                "{}", "Second.", LocalDate.of(2026, 2, 2), DateSource.AI_PROVIDED, "TitelZwei",
                null
@@ -606,6 +611,7 @@ class SqliteProcessingAttemptRepositoryAdapterTest {
                fingerprint, runId, 1, now, now.plusSeconds(3),
                ProcessingStatus.SUCCESS,
                null, null, false,
                null,
                "gpt-4", "prompt-v1.txt", 2, 600,
                "{\"title\":\"Rechnung\",\"reasoning\":\"Invoice.\"}",
                "Invoice.",
@@ -637,6 +643,7 @@ class SqliteProcessingAttemptRepositoryAdapterTest {
                fingerprint, new RunId("run-prop"), 1, now, now.plusSeconds(1),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                null,
                "gpt-4", "prompt-v1.txt", 1, 200,
                "{}", "reason",
                LocalDate.of(2026, 3, 1), DateSource.AI_PROVIDED,
@@ -667,6 +674,7 @@ class SqliteProcessingAttemptRepositoryAdapterTest {
                fingerprint, new RunId("run-1"), 1, base, base.plusSeconds(2),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                null,
                "model-a", "prompt-v1.txt", 3, 700,
                "{}", "reason.", date, DateSource.AI_PROVIDED, "Bescheid", null
        );
@@ -679,7 +687,7 @@ class SqliteProcessingAttemptRepositoryAdapterTest {
                ProcessingStatus.SUCCESS,
                null, null, false,
                null, null, null, null, null, null,
                null, null, null,
                null, null, null, null,
                "2026-02-10 - Bescheid.pdf"
        );
        repository.save(successAttempt);
@@ -742,6 +750,7 @@ class SqliteProcessingAttemptRepositoryAdapterTest {
                fingerprint, new RunId("run-p2"), 1, now, now.plusSeconds(1),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                null,
                "model-x", "prompt-v1.txt", 1, 50,
                "{}", "Reasoning.", LocalDate.of(2026, 1, 15), DateSource.AI_PROVIDED, "Titel",
                null
@@ -753,6 +762,62 @@ class SqliteProcessingAttemptRepositoryAdapterTest {
        assertThat(saved.get(0).status()).isEqualTo(ProcessingStatus.PROPOSAL_READY);
    }

    // -------------------------------------------------------------------------
    // AI field persistence is independent of logging configuration
    // -------------------------------------------------------------------------

    /**
     * Verifies that the repository always stores the complete AI raw response and reasoning,
     * independent of any logging sensitivity configuration.
     * <p>
     * The {@code AiContentSensitivity} setting controls only whether sensitive content is
     * written to log files. It has no influence on what the repository persists. This test
     * demonstrates that full AI fields are stored regardless of any logging configuration by
     * verifying a round-trip with both full content and long reasoning text.
     */
    @Test
    void save_persistsFullAiResponseAndReasoning_unaffectedByLoggingConfiguration() {
        // The repository has no dependency on AiContentSensitivity.
        // It always stores the complete AI raw response and reasoning.
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "d1d2d3d4d5d6d7d8d9dadbdcdddedfd0d1d2d3d4d5d6d7d8d9dadbdcdddedfd0".substring(0, 64));
        RunId runId = new RunId("persistence-independence-run");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);

        // Deliberately long and complete AI raw response — must be stored in full
        String fullRawResponse = "{\"date\":\"2026-03-01\",\"title\":\"Stromabrechnung\","
                + "\"reasoning\":\"Invoice date clearly stated on page 1. Utility provider named.\"}";
        // Deliberately complete reasoning — must be stored in full
        String fullReasoning = "Invoice date clearly stated on page 1. Utility provider named.";

        insertDocumentRecord(fingerprint);

        ProcessingAttempt attempt = new ProcessingAttempt(
                fingerprint, runId, 1, now, now.plusSeconds(5),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                null,
                "gpt-4o", "prompt-v1.txt",
                3, 750,
                fullRawResponse,
                fullReasoning,
                LocalDate.of(2026, 3, 1), DateSource.AI_PROVIDED,
                "Stromabrechnung",
                null
        );

        repository.save(attempt);

        List<ProcessingAttempt> saved = repository.findAllByFingerprint(fingerprint);
        assertThat(saved).hasSize(1);
        ProcessingAttempt result = saved.get(0);

        // Full raw response is stored completely — not truncated, not suppressed
        assertThat(result.aiRawResponse()).isEqualTo(fullRawResponse);
        // Full reasoning is stored completely — not truncated, not suppressed
        assertThat(result.aiReasoning()).isEqualTo(fullReasoning);
    }

    // -------------------------------------------------------------------------
    // Integration with document records (FK constraints)
    // -------------------------------------------------------------------------
@@ -119,7 +119,8 @@ class SqliteSchemaInitializationAdapterTest {
                "resolved_date",
                "date_source",
                "validated_title",
                "final_target_file_name"
                "final_target_file_name",
                "ai_provider"
        );
    }

@@ -1,22 +1,24 @@
package de.gecheckt.pdf.umbenenner.adapter.out.sqlite;

import de.gecheckt.pdf.umbenenner.application.port.out.DocumentPersistenceException;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentRecord;
import de.gecheckt.pdf.umbenenner.application.port.out.FailureCounters;
import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingAttempt;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
import de.gecheckt.pdf.umbenenner.domain.model.RunId;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertSame;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.nio.file.Path;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

import static org.junit.jupiter.api.Assertions.*;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.port.out.DocumentPersistenceException;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentRecord;
import de.gecheckt.pdf.umbenenner.application.port.out.FailureCounters;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

/**
 * Unit tests for {@link SqliteUnitOfWorkAdapter}.
@@ -24,7 +26,6 @@ import static org.junit.jupiter.api.Assertions.*;
 * Tests verify transactional semantics: successful commits, rollback on first-write failure,
 * rollback on second-write failure, and proper handling of DocumentPersistenceException.
 *
 * @since M4-AP-006
 */
class SqliteUnitOfWorkAdapterTest {

@@ -194,4 +195,40 @@ class SqliteUnitOfWorkAdapterTest {
        });
    }

    /**
     * Verifies that a document record written inside a successful transaction is persisted.
     * <p>
     * This confirms that the actual write operation is invoked and the transaction is
     * committed. Without an actual call to the underlying repository, the record would
     * not be retrievable after the transaction completes.
     */
    @Test
    void executeInTransaction_committedRecordIsRetrievable() {
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        DocumentRecord record = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/commit-test.pdf"),
                "commit-test.pdf",
                ProcessingStatus.PROCESSING,
                FailureCounters.zero(),
                null,
                null,
                now,
                now,
                null,
                null
        );

        SqliteDocumentRecordRepositoryAdapter docRepository =
                new SqliteDocumentRecordRepositoryAdapter(jdbcUrl);

        unitOfWorkAdapter.executeInTransaction(txOps -> txOps.createDocumentRecord(record));

        var result = docRepository.findByFingerprint(fingerprint);
        assertFalse(result instanceof de.gecheckt.pdf.umbenenner.application.port.out.DocumentUnknown,
                "Record must be persisted and retrievable after a successfully committed transaction");
    }

}
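The commit test above exercises the unit-of-work contract: writes happen inside a transaction scope and only become visible once the scope commits; a failure in the scope must leave committed state untouched. A minimal in-memory sketch of that contract (deliberately not the SQLite adapter; `executeInTransaction` and the buffering are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// In-memory sketch of the unit-of-work contract: writes buffered inside a
// transaction become visible only on commit; a failure rolls them back.
class UnitOfWorkSketch {

    private final List<String> committed = new ArrayList<>();

    void executeInTransaction(Consumer<List<String>> work) {
        List<String> buffer = new ArrayList<>();
        try {
            work.accept(buffer);
            committed.addAll(buffer); // commit: make the buffered writes visible
        } catch (RuntimeException e) {
            // rollback: the buffer is discarded, committed state is untouched
        }
    }

    public static void main(String[] args) {
        UnitOfWorkSketch uow = new UnitOfWorkSketch();

        uow.executeInTransaction(tx -> tx.add("record-1")); // commits
        uow.executeInTransaction(tx -> {
            tx.add("record-2");
            throw new RuntimeException("second write fails"); // rolls back
        });

        System.out.println(uow.committed); // only the committed record survives
    }
}
```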
@@ -1,20 +1,20 @@
package de.gecheckt.pdf.umbenenner.adapter.out.targetcopy;

import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyResult;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopySuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyTechnicalFailure;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatNullPointerException;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatNullPointerException;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyResult;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopySuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyTechnicalFailure;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

/**
 * Tests for {@link FilesystemTargetFileCopyAdapter}.

@@ -1,19 +1,19 @@
package de.gecheckt.pdf.umbenenner.adapter.out.targetfolder;

import de.gecheckt.pdf.umbenenner.application.port.out.ResolvedTargetFilename;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFilenameResolutionResult;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFolderTechnicalFailure;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatNullPointerException;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatNullPointerException;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.port.out.ResolvedTargetFilename;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFilenameResolutionResult;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFolderTechnicalFailure;

/**
 * Tests for {@link FilesystemTargetFolderAdapter}.

@@ -1,11 +1,12 @@
source.folder=/tmp/source
target.folder=/tmp/target
# sqlite.file is missing
api.baseUrl=https://api.example.com
api.model=gpt-4
api.timeoutSeconds=30
ai.provider.active=openai-compatible
ai.provider.openai-compatible.baseUrl=https://api.example.com
ai.provider.openai-compatible.model=gpt-4
ai.provider.openai-compatible.timeoutSeconds=30
ai.provider.openai-compatible.apiKey=test-api-key
max.retries.transient=3
max.pages=100
max.text.characters=50000
prompt.template.file=/tmp/prompt.txt
api.key=test-api-key
@@ -1,10 +1,11 @@
source.folder=/tmp/source
target.folder=/tmp/target
sqlite.file=/tmp/db.sqlite
api.baseUrl=https://api.example.com
api.model=gpt-4
api.timeoutSeconds=30
ai.provider.active=openai-compatible
ai.provider.openai-compatible.baseUrl=https://api.example.com
ai.provider.openai-compatible.model=gpt-4
ai.provider.openai-compatible.timeoutSeconds=30
max.retries.transient=3
max.pages=100
max.text.characters=50000
prompt.template.file=/tmp/prompt.txt
prompt.template.file=/tmp/prompt.txt

@@ -1,9 +1,11 @@
source.folder=/tmp/source
target.folder=/tmp/target
sqlite.file=/tmp/db.sqlite
api.baseUrl=https://api.example.com
api.model=gpt-4
api.timeoutSeconds=30
ai.provider.active=openai-compatible
ai.provider.openai-compatible.baseUrl=https://api.example.com
ai.provider.openai-compatible.model=gpt-4
ai.provider.openai-compatible.timeoutSeconds=30
ai.provider.openai-compatible.apiKey=test-api-key-from-properties
max.retries.transient=3
max.pages=100
max.text.characters=50000
@@ -11,4 +13,3 @@ prompt.template.file=/tmp/prompt.txt
runtime.lock.file=/tmp/lock.lock
log.directory=/tmp/logs
log.level=DEBUG
api.key=test-api-key-from-properties
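The fixture changes above move provider settings from flat `api.*` keys to namespaced `ai.provider.<family>.*` keys, selected at runtime by `ai.provider.active`. A minimal sketch of reading that layout, assuming only the key names visible in these fixtures:

```java
import java.io.StringReader;
import java.util.Properties;

// Sketch: resolve provider-specific settings via the ai.provider.active selector.
// Only key names visible in the test fixtures above are assumed.
class ProviderConfigSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.load(new StringReader("""
                ai.provider.active=openai-compatible
                ai.provider.openai-compatible.baseUrl=https://api.example.com
                ai.provider.openai-compatible.model=gpt-4
                ai.provider.openai-compatible.timeoutSeconds=30
                """));

        // Build the namespace prefix from the active provider family
        String active = props.getProperty("ai.provider.active");
        String prefix = "ai.provider." + active + ".";

        // Look up settings under the active provider's namespace
        System.out.println(props.getProperty(prefix + "baseUrl"));
        System.out.println(props.getProperty(prefix + "model"));
        System.out.println(props.getProperty(prefix + "timeoutSeconds"));
    }
}
```

Switching `ai.provider.active=claude` would resolve the same lookups against the `ai.provider.claude.*` namespace without touching any other code path.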
@@ -1,5 +1,7 @@
package de.gecheckt.pdf.umbenenner.application.config;

import de.gecheckt.pdf.umbenenner.application.port.out.AiContentSensitivity;

/**
 * Minimal runtime configuration for the application layer.
 * <p>
@@ -9,12 +11,59 @@ package de.gecheckt.pdf.umbenenner.application.config;
 * <p>
 * This intentionally small contract ensures the application layer depends only on
 * the configuration values it actually uses, following hexagonal architecture principles.
 *
 * <h2>Validation invariants</h2>
 * <ul>
 * <li>{@link #maxPages()} must be ≥ 1.</li>
 * <li>{@link #maxRetriesTransient()} must be ≥ 1. The value {@code 0} is invalid
 * start configuration and must prevent the batch run from starting with exit
 * code 1.</li>
 * <li>{@link #aiContentSensitivity()} must not be {@code null}. The safe default is
 * {@link AiContentSensitivity#PROTECT_SENSITIVE_CONTENT}.</li>
 * </ul>
 *
 * <h2>AI content sensitivity</h2>
 * <p>
 * The {@link #aiContentSensitivity()} field is derived from the {@code log.ai.sensitive}
 * configuration property (default: {@code false}). It governs whether the complete AI raw
 * response and complete AI {@code reasoning} may be written to log files. Sensitive AI
 * content is always persisted in SQLite regardless of this setting; only log output is
 * affected.
 * <p>
 * The safe default ({@link AiContentSensitivity#PROTECT_SENSITIVE_CONTENT}) must be used
 * whenever {@code log.ai.sensitive} is absent, {@code false}, or set to any value other
 * than the explicit opt-in.
 */
public record RuntimeConfiguration(
        /**
         * Maximum number of pages a document can have to be processed.
         * Documents exceeding this limit are rejected during pre-checks.
         */
        int maxPages
        int maxPages,

        /**
         * Maximum number of historised transient technical errors allowed per fingerprint
         * across all scheduler runs.
         * <p>
         * The attempt that causes the counter to reach this value finalises the document
         * to {@code FAILED_FINAL}. Must be an Integer ≥ 1; the value {@code 0} is
         * invalid start configuration.
         * <p>
         * Example: {@code maxRetriesTransient = 1} means the first transient error
         * immediately finalises the document.
         */
        int maxRetriesTransient,

        /**
         * Sensitivity decision governing whether AI-generated content may be written to log files.
         * <p>
         * Derived from the {@code log.ai.sensitive} configuration property. The default is
         * {@link AiContentSensitivity#PROTECT_SENSITIVE_CONTENT} (do not log sensitive content).
         * Only {@link AiContentSensitivity#LOG_SENSITIVE_CONTENT} is produced when
         * {@code log.ai.sensitive = true} is explicitly set.
         * <p>
         * Must not be {@code null}.
         */
        AiContentSensitivity aiContentSensitivity
)
{ }
|
||||
|
||||
@@ -0,0 +1,59 @@
package de.gecheckt.pdf.umbenenner.application.config.provider;

import java.util.Arrays;
import java.util.Optional;

/**
 * Supported AI provider families for the PDF renaming process.
 * <p>
 * Each constant represents a distinct API protocol family. Exactly one provider family
 * is active per application run, selected via the {@code ai.provider.active} configuration property.
 * <p>
 * The {@link #getIdentifier()} method returns the string that must appear as the value of
 * {@code ai.provider.active} to activate the corresponding provider family.
 * Use {@link #fromIdentifier(String)} to resolve a configuration string to the enum constant.
 */
public enum AiProviderFamily {

    /** OpenAI-compatible Chat Completions API – usable with OpenAI itself and compatible third-party endpoints. */
    OPENAI_COMPATIBLE("openai-compatible"),

    /** Native Anthropic Messages API for Claude models. */
    CLAUDE("claude");

    private final String identifier;

    AiProviderFamily(String identifier) {
        this.identifier = identifier;
    }

    /**
     * Returns the configuration identifier string for this provider family.
     * <p>
     * This value corresponds to valid values of the {@code ai.provider.active} property.
     *
     * @return the configuration identifier, never {@code null}
     */
    public String getIdentifier() {
        return identifier;
    }

    /**
     * Resolves a provider family from its configuration identifier string.
     * <p>
     * The comparison is case-sensitive and matches the exact identifier strings
     * defined by each constant (e.g., {@code "openai-compatible"}, {@code "claude"}).
     *
     * @param identifier the identifier as it appears in the {@code ai.provider.active} property;
     *                   {@code null} returns an empty Optional
     * @return the matching provider family, or {@link Optional#empty()} if not recognized
     */
    public static Optional<AiProviderFamily> fromIdentifier(String identifier) {
        if (identifier == null) {
            return Optional.empty();
        }
        return Arrays.stream(values())
                .filter(f -> f.identifier.equals(identifier))
                .findFirst();
    }
}
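The identifier resolution above can be exercised standalone. This is a self-contained sketch: the enum body mirrors the diff, and only the `ProviderFamilyDemo` wrapper class is added so the snippet compiles and runs on its own.

```java
import java.util.Arrays;
import java.util.Optional;

// Self-contained mirror of AiProviderFamily's identifier resolution.
class ProviderFamilyDemo {

    enum AiProviderFamily {
        OPENAI_COMPATIBLE("openai-compatible"),
        CLAUDE("claude");

        private final String identifier;

        AiProviderFamily(String identifier) {
            this.identifier = identifier;
        }

        static Optional<AiProviderFamily> fromIdentifier(String identifier) {
            if (identifier == null) {
                return Optional.empty();
            }
            return Arrays.stream(values())
                    .filter(f -> f.identifier.equals(identifier))
                    .findFirst();
        }
    }

    public static void main(String[] args) {
        // Exact, case-sensitive identifiers resolve to a constant ...
        System.out.println(AiProviderFamily.fromIdentifier("claude"));            // Optional[CLAUDE]
        // ... while unknown or differently-cased values yield an empty Optional.
        System.out.println(AiProviderFamily.fromIdentifier("Claude").isEmpty());  // true
        System.out.println(AiProviderFamily.fromIdentifier(null).isEmpty());      // true
    }
}
```

Returning `Optional` instead of throwing keeps the "unrecognised value" case in the validator's hands, which matches the invariant documented on `MultiProviderConfiguration`.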
@@ -0,0 +1,43 @@
package de.gecheckt.pdf.umbenenner.application.config.provider;

/**
 * Immutable multi-provider configuration model.
 * <p>
 * Represents the resolved configuration for both supported AI provider families
 * together with the selection of the one provider family that is active for this
 * application run.
 *
 * <h2>Invariants</h2>
 * <ul>
 *   <li>Exactly one provider family is active per run.</li>
 *   <li>Required fields are enforced only for the active provider; the inactive
 *       provider's configuration may be incomplete.</li>
 *   <li>Validation of these invariants is performed by the corresponding validator
 *       in the adapter layer, not by this record itself.</li>
 * </ul>
 *
 * @param activeProviderFamily   the selected provider family for this run; {@code null}
 *                               indicates that {@code ai.provider.active} was absent or
 *                               held an unrecognised value – the validator will reject this
 * @param openAiCompatibleConfig configuration for the OpenAI-compatible provider family
 * @param claudeConfig           configuration for the Anthropic Claude provider family
 */
public record MultiProviderConfiguration(
        AiProviderFamily activeProviderFamily,
        ProviderConfiguration openAiCompatibleConfig,
        ProviderConfiguration claudeConfig) {

    /**
     * Returns the {@link ProviderConfiguration} for the currently active provider family.
     *
     * @return the active provider's configuration, never {@code null} when
     *         {@link #activeProviderFamily()} is not {@code null}
     * @throws NullPointerException if {@code activeProviderFamily} is {@code null}
     */
    public ProviderConfiguration activeProviderConfiguration() {
        return switch (activeProviderFamily) {
            case OPENAI_COMPATIBLE -> openAiCompatibleConfig;
            case CLAUDE -> claudeConfig;
        };
    }
}
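The active-provider lookup can be demonstrated in isolation. The enum and records below are pared-down mirrors of the diff; the model names and URLs in `main()` are purely illustrative placeholders, not values from the repository.

```java
// Sketch of MultiProviderConfiguration.activeProviderConfiguration().
class MultiProviderDemo {

    enum AiProviderFamily { OPENAI_COMPATIBLE, CLAUDE }

    record ProviderConfiguration(String model, int timeoutSeconds, String baseUrl, String apiKey) {}

    record MultiProviderConfiguration(
            AiProviderFamily activeProviderFamily,
            ProviderConfiguration openAiCompatibleConfig,
            ProviderConfiguration claudeConfig) {

        ProviderConfiguration activeProviderConfiguration() {
            // Switch expression over the enum: evaluating it with a null
            // activeProviderFamily throws NullPointerException, matching the
            // documented @throws contract.
            return switch (activeProviderFamily) {
                case OPENAI_COMPATIBLE -> openAiCompatibleConfig;
                case CLAUDE -> claudeConfig;
            };
        }
    }

    public static void main(String[] args) {
        var openAi = new ProviderConfiguration("example-model-a", 30, "https://example.invalid/v1", "key-a");
        var claude = new ProviderConfiguration("example-model-b", 30, "https://api.anthropic.com", "key-b");
        var config = new MultiProviderConfiguration(AiProviderFamily.CLAUDE, openAi, claude);

        // Exactly one family is active; the lookup returns its configuration.
        System.out.println(config.activeProviderConfiguration().model()); // example-model-b
    }
}
```

The exhaustive switch over the enum means the compiler, not a runtime default branch, guarantees every provider family is mapped.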
@@ -0,0 +1,34 @@
package de.gecheckt.pdf.umbenenner.application.config.provider;

/**
 * Immutable configuration for a single AI provider family.
 * <p>
 * Holds all parameters needed to connect to and authenticate with one AI provider endpoint.
 * Instances are created by the configuration parser in the adapter layer; validation
 * of required fields is performed by the corresponding validator.
 *
 * <h2>Field semantics</h2>
 * <ul>
 *   <li>{@code model} – the AI model name; required for the active provider, may be {@code null}
 *       for the inactive provider.</li>
 *   <li>{@code timeoutSeconds} – HTTP connection/read timeout in seconds; must be positive for
 *       the active provider. {@code 0} indicates the value was not configured.</li>
 *   <li>{@code baseUrl} – the base URL of the API endpoint. For the Anthropic Claude family a
 *       default of {@code https://api.anthropic.com} is applied by the parser when the property
 *       is absent; for the OpenAI-compatible family it is required and may not be {@code null}.</li>
 *   <li>{@code apiKey} – the resolved API key after environment-variable precedence has been
 *       applied; may be blank for the inactive provider, must not be blank for the active provider.</li>
 * </ul>
 *
 * @param model          the AI model name; {@code null} when not configured
 * @param timeoutSeconds HTTP timeout in seconds; {@code 0} when not configured
 * @param baseUrl        the base URL of the API endpoint; {@code null} when not configured
 *                       (only applicable to providers without a built-in default)
 * @param apiKey         the resolved API key; blank when not configured
 */
public record ProviderConfiguration(
        String model,
        int timeoutSeconds,
        String baseUrl,
        String apiKey) {
}
@@ -1,23 +1,39 @@
package de.gecheckt.pdf.umbenenner.application.config.startup;

import java.net.URI;
import java.nio.file.Path;

import de.gecheckt.pdf.umbenenner.application.config.provider.MultiProviderConfiguration;

/**
 * Typed immutable configuration model for PDF Umbenenner startup parameters.
 * <p>
 * Contains all technical infrastructure and runtime configuration parameters
 * loaded and validated at bootstrap time. This is a complete configuration model
 * for the entire application startup, including paths, API settings, persistence,
 * for the entire application startup, including paths, AI provider selection, persistence,
 * and operational parameters.
 *
 * <h2>AI provider configuration</h2>
 * <p>
 * The {@link MultiProviderConfiguration} encapsulates the active provider selection
 * together with the per-provider connection parameters for all supported provider families.
 * Exactly one provider family is active per run; the selection is driven by the
 * {@code ai.provider.active} configuration property.
 *
 * <h2>AI content sensitivity ({@code log.ai.sensitive})</h2>
 * <p>
 * The boolean property {@code log.ai.sensitive} controls whether sensitive AI-generated
 * content (complete raw AI response, complete AI {@code reasoning}) may be written to
 * log files. The default is {@code false} (safe/protect). Set to {@code true} only when
 * explicit diagnostic logging of AI content is required.
 * <p>
 * Sensitive AI content is always persisted in SQLite regardless of this setting.
 * Only log output is affected.
 */
public record StartConfiguration(
        Path sourceFolder,
        Path targetFolder,
        Path sqliteFile,
        URI apiBaseUrl,
        String apiModel,
        int apiTimeoutSeconds,
        MultiProviderConfiguration multiProviderConfiguration,
        int maxRetriesTransient,
        int maxPages,
        int maxTextCharacters,
@@ -25,6 +41,12 @@ public record StartConfiguration(
        Path runtimeLockFile,
        Path logDirectory,
        String logLevel,
        String apiKey

        /**
         * Whether sensitive AI content (raw response, reasoning) may be written to log files.
         * Corresponds to the {@code log.ai.sensitive} configuration property.
         * Default: {@code false} (do not log sensitive content).
         */
        boolean logAiSensitive
)
{ }
@@ -0,0 +1,46 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

/**
 * Sensitivity decision governing whether AI-generated content may be written to log files.
 * <p>
 * The following AI-generated content items are classified as sensitive and are subject to
 * this decision:
 * <ul>
 *   <li>The <strong>complete AI raw response</strong> (full JSON body returned by the
 *       AI service)</li>
 *   <li>The <strong>complete AI {@code reasoning}</strong> field extracted from the
 *       AI response</li>
 * </ul>
 * <p>
 * Sensitive AI content is always written to SQLite (for traceability) regardless of
 * this decision. The decision controls only whether the content is also emitted into
 * log files.
 * <p>
 * <strong>Default behaviour:</strong> The default is {@link #PROTECT_SENSITIVE_CONTENT}.
 * Logging of sensitive AI content must be explicitly enabled by setting the boolean
 * configuration property {@code log.ai.sensitive = true}. Any other value, or the
 * absence of the property, results in {@link #PROTECT_SENSITIVE_CONTENT}.
 * <p>
 * <strong>Non-sensitive AI content</strong> (e.g. the resolved title, the resolved date,
 * the date source) is not covered by this decision and may always be logged.
 */
public enum AiContentSensitivity {

    /**
     * Sensitive AI content (raw response, reasoning) must <strong>not</strong> be written
     * to log files.
     * <p>
     * This is the safe default. It is active whenever {@code log.ai.sensitive} is absent,
     * {@code false}, or set to any value other than the explicit opt-in.
     */
    PROTECT_SENSITIVE_CONTENT,

    /**
     * Sensitive AI content (raw response, reasoning) <strong>may</strong> be written
     * to log files.
     * <p>
     * This value is only produced when {@code log.ai.sensitive = true} is explicitly set
     * in the application configuration. It must never be the implicit default.
     */
    LOG_SENSITIVE_CONTENT
}
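One way a configuration parser might derive this decision from the raw property value is sketched below. The `fromProperty` helper is illustrative, not part of the diff; note that `Boolean.parseBoolean` accepts `"true"` case-insensitively, which is slightly looser than a strict literal-`true` opt-in, so a stricter parser could compare against the exact string instead.

```java
// Hedged sketch: mapping the raw log.ai.sensitive value to the safe-default enum.
class SensitivityMappingDemo {

    enum AiContentSensitivity { PROTECT_SENSITIVE_CONTENT, LOG_SENSITIVE_CONTENT }

    // Boolean.parseBoolean is null-safe and returns true only for "true"
    // (case-insensitively), so an absent property, "false", and garbage values
    // all fall back to the protective default.
    static AiContentSensitivity fromProperty(String rawValue) {
        return Boolean.parseBoolean(rawValue)
                ? AiContentSensitivity.LOG_SENSITIVE_CONTENT
                : AiContentSensitivity.PROTECT_SENSITIVE_CONTENT;
    }

    public static void main(String[] args) {
        System.out.println(fromProperty("true"));  // LOG_SENSITIVE_CONTENT
        System.out.println(fromProperty(null));    // PROTECT_SENSITIVE_CONTENT
        System.out.println(fromProperty("yes"));   // PROTECT_SENSITIVE_CONTENT
    }
}
```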
@@ -1,8 +1,9 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import java.util.Objects;

import de.gecheckt.pdf.umbenenner.domain.model.AiRawResponse;
import de.gecheckt.pdf.umbenenner.domain.model.AiRequestRepresentation;
import java.util.Objects;

/**
 * Represents successful HTTP communication with an AI service.

@@ -1,8 +1,9 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import de.gecheckt.pdf.umbenenner.domain.model.AiRequestRepresentation;
import java.util.Objects;

import de.gecheckt.pdf.umbenenner.domain.model.AiRequestRepresentation;

/**
 * Represents a technical failure during AI service invocation.
 * <p>
@@ -0,0 +1,90 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

/**
 * Unified classification of all document-level errors in the end state.
 * <p>
 * This enumeration provides a single, exhaustive taxonomy for every error category
 * that the retry policy and logging infrastructure must distinguish. It replaces
 * any ad-hoc string-based classification where an authoritative type is needed.
 * <p>
 * <strong>Mapping to failure counters:</strong>
 * <ul>
 *   <li>{@link #DETERMINISTIC_CONTENT_ERROR} → increments the content-error counter
 *       ({@link FailureCounters#contentErrorCount()}). The first occurrence leads to
 *       {@code FAILED_RETRYABLE}; the second leads to {@code FAILED_FINAL}.
 *       There is no further retry after the second deterministic content error.</li>
 *   <li>{@link #TRANSIENT_TECHNICAL_ERROR} → increments the transient-error counter
 *       ({@link FailureCounters#transientErrorCount()}). Remains retryable until the
 *       counter reaches the configured {@code max.retries.transient} limit (Integer ≥ 1).
 *       The attempt that reaches the limit finalises the document to {@code FAILED_FINAL}.</li>
 *   <li>{@link #TARGET_COPY_TECHNICAL_ERROR} → signals a failure on the physical target
 *       file copy path. Within the same run, exactly one immediate technical retry is
 *       allowed. If the immediate retry also fails, the error is treated as a
 *       {@link #TRANSIENT_TECHNICAL_ERROR} for the purposes of counter updates and
 *       cross-run retry evaluation.</li>
 * </ul>
 * <p>
 * <strong>Scope of deterministic content errors:</strong>
 * <ul>
 *   <li>No usable PDF text extracted</li>
 *   <li>Page limit exceeded</li>
 *   <li>AI response functionally invalid (generic/unusable title, unparseable date)</li>
 *   <li>Document content ambiguous or not uniquely interpretable</li>
 * </ul>
 * <p>
 * <strong>Scope of transient technical errors:</strong>
 * <ul>
 *   <li>AI service unreachable, HTTP timeout, network error</li>
 *   <li>Unparseable or structurally invalid AI JSON</li>
 *   <li>Temporary I/O error during PDF text extraction</li>
 *   <li>Temporary SQLite lock or persistence failure</li>
 *   <li>Any other non-deterministic infrastructure failure</li>
 * </ul>
 * <p>
 * <strong>Architecture note:</strong> This type carries no infrastructure dependencies.
 * It is safe to reference from Domain, Application and Adapter layers.
 */
public enum DocumentErrorClassification {

    /**
     * A deterministic content error that cannot be resolved by retrying with the same
     * document content.
     * <p>
     * Examples: no extractable text, page limit exceeded, AI-returned title is generic
     * or unusable, document content is ambiguous.
     * <p>
     * Retry rule: the first historised occurrence of this error for a fingerprint leads
     * to {@code FAILED_RETRYABLE} (one later run may retry). The second historised
     * occurrence leads to {@code FAILED_FINAL} (no further retries).
     */
    DETERMINISTIC_CONTENT_ERROR,

    /**
     * A transient technical infrastructure failure unrelated to the document content.
     * <p>
     * Examples: AI endpoint not reachable, HTTP timeout, malformed or non-parseable
     * JSON, temporary I/O failure, temporary SQLite lock.
     * <p>
     * Retry rule: remains {@code FAILED_RETRYABLE} until the transient-error counter
     * reaches the configured {@code max.retries.transient} limit. The attempt that
     * reaches the limit finalises the document to {@code FAILED_FINAL}.
     * The configured limit must be an Integer ≥ 1; the value {@code 0} is an invalid
     * start configuration and prevents the batch run from starting.
     */
    TRANSIENT_TECHNICAL_ERROR,

    /**
     * A technical failure specifically on the physical target-file copy path.
     * <p>
     * This error class is distinct from {@link #TRANSIENT_TECHNICAL_ERROR} because it
     * triggers a special within-run handling: exactly one immediate technical retry of
     * the copy operation is allowed within the same document run. No new AI call and no
     * new naming proposal derivation occur during the immediate retry.
     * <p>
     * If the immediate retry succeeds, the document proceeds to {@code SUCCESS}.
     * If the immediate retry also fails, the combined failure is recorded as a
     * {@link #TRANSIENT_TECHNICAL_ERROR} for counter and cross-run retry evaluation.
     * The immediate retry is not counted in the cross-run transient-error counter.
     */
    TARGET_COPY_TECHNICAL_ERROR
}
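The counter-to-status rules spelled out in this Javadoc can be condensed into two pure functions. The `Status` enum and both helpers below are illustrative names; the real decision logic lives in the application layer, not in the classification enum itself.

```java
// Hedged sketch of the documented retry rules, as pure counter -> status mappings.
class RetryRuleSketch {

    enum Status { FAILED_RETRYABLE, FAILED_FINAL }

    // Deterministic content error: the first historised occurrence stays
    // retryable, the second finalises the document.
    static Status afterContentError(int contentErrorCount) {
        return contentErrorCount >= 2 ? Status.FAILED_FINAL : Status.FAILED_RETRYABLE;
    }

    // Transient technical error: retryable while the counter is below the
    // configured max.retries.transient limit (an integer >= 1); the attempt
    // that reaches the limit finalises the document.
    static Status afterTransientError(int transientErrorCount, int maxRetriesTransient) {
        return transientErrorCount >= maxRetriesTransient
                ? Status.FAILED_FINAL
                : Status.FAILED_RETRYABLE;
    }

    public static void main(String[] args) {
        System.out.println(afterContentError(1));       // FAILED_RETRYABLE
        System.out.println(afterContentError(2));       // FAILED_FINAL
        // With max.retries.transient = 1, the very first transient error finalises.
        System.out.println(afterTransientError(1, 1));  // FAILED_FINAL
        System.out.println(afterTransientError(1, 3));  // FAILED_RETRYABLE
    }
}
```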
@@ -0,0 +1,81 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.RunId;

/**
 * Sealed type carrying the correlation context for all document-related log entries.
 * <p>
 * The logging correlation rule distinguishes two phases of document processing:
 * <ol>
 *   <li><strong>Pre-fingerprint phase:</strong> Before a {@link DocumentFingerprint} has
 *       been successfully computed (e.g. the source file cannot be read for hashing),
 *       log entries are correlated via the batch run identifier and a stable candidate
 *       description derived from the candidate's own identifier (typically its source
 *       file path or name). Use {@link CandidateCorrelation}.</li>
 *   <li><strong>Post-fingerprint phase:</strong> Once the fingerprint has been
 *       successfully computed, all subsequent document-related log entries are correlated
 *       via the batch run identifier and the fingerprint. Use
 *       {@link FingerprintCorrelation}.</li>
 * </ol>
 * <p>
 * <strong>Architecture constraints:</strong>
 * <ul>
 *   <li>This type contains no filesystem ({@code Path}, {@code File}) or NIO types.</li>
 *   <li>This type introduces no additional persistence truth source.</li>
 *   <li>The correlation is a logging concern only and does not influence the processing
 *       outcome, retry decision, or persistence model.</li>
 * </ul>
 */
public sealed interface DocumentLogCorrelation {

    /**
     * Returns the batch run identifier shared by all log entries within one run.
     *
     * @return run identifier; never {@code null}
     */
    RunId runId();

    // -------------------------------------------------------------------------
    // Pre-fingerprint correlation
    // -------------------------------------------------------------------------

    /**
     * Correlation context available before a {@link DocumentFingerprint} has been
     * successfully computed.
     * <p>
     * Used when the fingerprint computation itself fails or when a log entry must be
     * emitted at the very start of candidate processing (before any hashing result is
     * available).
     * <p>
     * The {@code candidateDescription} is a stable, human-readable identifier for the
     * candidate derived from the candidate's own unique identifier, typically the
     * source file name or path representation. It must not change between log entries
     * for the same candidate within a single run.
     *
     * @param runId                batch run identifier; never {@code null}
     * @param candidateDescription stable human-readable candidate identifier;
     *                             never {@code null} or blank
     */
    record CandidateCorrelation(RunId runId, String candidateDescription)
            implements DocumentLogCorrelation {}

    // -------------------------------------------------------------------------
    // Post-fingerprint correlation
    // -------------------------------------------------------------------------

    /**
     * Correlation context available after a {@link DocumentFingerprint} has been
     * successfully computed.
     * <p>
     * Used for all document-related log entries from the point at which the fingerprint
     * is known. The fingerprint is the authoritative, content-stable document identity
     * and must appear in or be unambiguously derivable from every subsequent log entry
     * for this document.
     *
     * @param runId       batch run identifier; never {@code null}
     * @param fingerprint content-based document identity; never {@code null}
     */
    record FingerprintCorrelation(RunId runId, DocumentFingerprint fingerprint)
            implements DocumentLogCorrelation {}
}
@@ -1,12 +1,12 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import java.time.Instant;
import java.util.Objects;

import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

import java.time.Instant;
import java.util.Objects;

/**
 * Application-facing representation of the document master record (Dokument-Stammsatz).
 * <p>
@@ -7,24 +7,34 @@ package de.gecheckt.pdf.umbenenner.application.port.out;
 * <ul>
 *   <li><strong>Content error counter</strong> ({@link #contentErrorCount()}):
 *       counts how many times a deterministic content error occurred for this document
 *       (no usable text, page limit exceeded). At count 1 the document is
 *       {@code FAILED_RETRYABLE}; at count 2 it becomes {@code FAILED_FINAL}.
 *       (no usable text, page limit exceeded, AI functional failure, ambiguous content).
 *       At count 1 the document transitions to {@code FAILED_RETRYABLE};
 *       at count 2 it transitions to {@code FAILED_FINAL}.
 *       Skip events do <em>not</em> increase this counter.</li>
 *   <li><strong>Transient error counter</strong> ({@link #transientErrorCount()}):
 *       counts how many times a technical infrastructure error occurred after a
 *       successful fingerprint was computed. The document remains
 *       {@code FAILED_RETRYABLE} until the configured maximum is reached in later
 *       milestones. Skip events do <em>not</em> increase this counter.</li>
 *       counts how many times a transient technical error occurred after a successful
 *       fingerprint was computed. The document remains {@code FAILED_RETRYABLE} while
 *       this counter is strictly less than the configured {@code max.retries.transient}
 *       value. The attempt that causes the counter to reach {@code max.retries.transient}
 *       transitions the document to {@code FAILED_FINAL}.
 *       The configured limit must be an Integer ≥ 1.
 *       Skip events do <em>not</em> increase this counter.</li>
 * </ul>
 * <p>
 * A freshly discovered document starts with both counters at zero.
 * Counters are only written by the repository layer on the instructions of the
 * application use case; they never change as a side-effect of a read operation.
 * <strong>Immediate within-run target copy retry:</strong>
 * The physical target-copy retry within the same run is not tracked in either counter.
 * It is a purely technical within-run mechanism and does not affect the
 * cross-run counter state.
 * <p>
 * <strong>Counter invariant:</strong>
 * Both counters start at zero for a newly discovered document and only increase
 * monotonically. The counters are written by the repository layer on the instructions
 * of the application use case; they never change as a side-effect of a read operation.
 *
 * @param contentErrorCount   number of deterministic content errors recorded so far;
 *                            must be >= 0
 * @param transientErrorCount number of transient technical errors recorded so far;
 *                            must be >= 0
 * @param contentErrorCount   number of historised deterministic content errors;
 *                            must be ≥ 0
 * @param transientErrorCount number of historised transient technical errors;
 *                            must be ≥ 0
 */
public record FailureCounters(int contentErrorCount, int transientErrorCount) {
@@ -1,9 +1,9 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;

import java.util.Objects;

import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;

/**
 * Successful outcome of a fingerprint computation.
 * <p>
@@ -0,0 +1,44 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

/**
 * Decision governing whether a within-run immediate technical retry of the target copy
 * operation is permitted.
 * <p>
 * The immediate retry mechanism is strictly scoped:
 * <ul>
 *   <li>It applies <strong>only</strong> to the physical target-file copy path.</li>
 *   <li>It is permitted <strong>at most once</strong> per document per run (first copy
 *       attempt failed; one additional attempt is allowed).</li>
 *   <li>It does <strong>not</strong> involve a new AI call, a new naming-proposal
 *       derivation, or any other pipeline stage.</li>
 *   <li>It does <strong>not</strong> increment the cross-run
 *       transient-error counter regardless of outcome.</li>
 *   <li>It is a purely technical within-run recovery mechanism and is
 *       <strong>not</strong> counted as a cross-run retry in the sense of
 *       {@code max.retries.transient}.</li>
 * </ul>
 * <p>
 * The concrete retry decision for the subsequent persistence step is derived from the
 * combined outcome after the immediate retry completes (see {@link RetryDecision}).
 */
public enum ImmediateRetryDecision {

    /**
     * An immediate within-run retry of the target copy operation is permitted.
     * <p>
     * This value is produced when the first physical copy attempt within the current
     * document run has failed. The copy must be retried exactly once more.
     * No other pipeline stage is repeated.
     */
    ALLOWED,

    /**
     * No immediate within-run retry is permitted.
     * <p>
     * This value is produced when the immediate retry quota for this document run has
     * already been consumed (i.e. the immediate retry attempt itself has failed), or
     * when the failure did not occur on the target copy path.
     * The error must be escalated to the cross-run retry evaluation.
     */
    DENIED
}
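The within-run protocol this enum encodes, at most one immediate retry of the physical copy, no counter update either way, and escalation to the cross-run evaluation if the retry also fails, can be sketched as a small control flow. The method name and structure below are illustrative and not taken from the repository.

```java
import java.util.function.BooleanSupplier;

// Hedged sketch of the within-run target-copy retry protocol.
class ImmediateRetrySketch {

    enum ImmediateRetryDecision { ALLOWED, DENIED }

    /** Returns true if the copy succeeded within this run (first try or immediate retry). */
    static boolean copyWithImmediateRetry(BooleanSupplier copyAttempt) {
        if (copyAttempt.getAsBoolean()) {
            return true; // first attempt succeeded, no retry needed
        }
        // First attempt failed and the per-run quota is untouched: ALLOWED.
        ImmediateRetryDecision decision = ImmediateRetryDecision.ALLOWED;
        if (decision == ImmediateRetryDecision.ALLOWED && copyAttempt.getAsBoolean()) {
            return true; // immediate retry succeeded -> document proceeds to SUCCESS
        }
        // Quota consumed (DENIED from here on): the combined failure is escalated
        // as a TRANSIENT_TECHNICAL_ERROR to the cross-run retry evaluation.
        return false;
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        // Fails once, then succeeds on the single permitted immediate retry.
        boolean ok = copyWithImmediateRetry(() -> ++attempts[0] > 1);
        System.out.println(ok + " after " + attempts[0] + " attempts"); // true after 2 attempts
    }
}
```

Keeping the decision local to the run means no persistence write happens between the two copy attempts, which is what keeps the immediate retry out of the cross-run counters.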
@@ -1,14 +1,14 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import java.time.Instant;
import java.time.LocalDate;
import java.util.Objects;

import de.gecheckt.pdf.umbenenner.domain.model.DateSource;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
import de.gecheckt.pdf.umbenenner.domain.model.RunId;

import java.time.Instant;
import java.time.LocalDate;
import java.util.Objects;

/**
 * Application-facing representation of exactly one historised processing attempt
 * (Versuchshistorie-Eintrag) for an identified document.
@@ -42,6 +42,10 @@ import java.util.Objects;
 *       successful or skip attempts.</li>
 *   <li>{@link #retryable()} — {@code true} if the failure is considered retryable in a
 *       later run; {@code false} for final failures, successes, and skip attempts.</li>
 *   <li>{@link #aiProvider()} — opaque identifier of the AI provider that was active
 *       during this attempt (e.g. {@code "openai-compatible"} or {@code "claude"});
 *       {@code null} for attempts that did not involve an AI call (skip, pre-check
 *       failure) or for historical attempts recorded before this field was introduced.</li>
 *   <li>{@link #modelName()} — the AI model name used in this attempt; {@code null} if
 *       no AI call was made (e.g. pre-check failures or skip attempts).</li>
 *   <li>{@link #promptIdentifier()} — stable identifier of the prompt template used;
@@ -74,6 +78,7 @@ import java.util.Objects;
 * @param failureClass       failure classification, or {@code null} for non-failure statuses
 * @param failureMessage     failure description, or {@code null} for non-failure statuses
 * @param retryable          whether this failure should be retried in a later run
 * @param aiProvider         opaque AI provider identifier for this attempt, or {@code null}
 * @param modelName          AI model name, or {@code null} if no AI call was made
 * @param promptIdentifier   prompt identifier, or {@code null} if no AI call was made
 * @param processedPageCount number of PDF pages processed, or {@code null}
@@ -97,6 +102,7 @@ public record ProcessingAttempt(
        String failureMessage,
        boolean retryable,
        // AI traceability fields (null for non-AI attempts)
        String aiProvider,
        String modelName,
        String promptIdentifier,
        Integer processedPageCount,
@@ -131,7 +137,8 @@ public record ProcessingAttempt(
     * Creates a {@link ProcessingAttempt} with no AI traceability fields set.
     * <p>
     * Convenience factory for pre-check failures, skip events, and any attempt
     * that does not involve an AI call.
     * that does not involve an AI call. The {@link #aiProvider()} field is set
     * to {@code null}.
     *
     * @param fingerprint   document identity; must not be null
     * @param runId         batch run identifier; must not be null
@@ -157,6 +164,6 @@ public record ProcessingAttempt(
        return new ProcessingAttempt(
                fingerprint, runId, attemptNumber, startedAt, endedAt,
                status, failureClass, failureMessage, retryable,
                null, null, null, null, null, null, null, null, null, null);
                null, null, null, null, null, null, null, null, null, null, null);
    }
}
@@ -1,10 +1,10 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import java.util.List;

import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;

import java.util.List;

/**
 * Outbound port for writing and reading the processing attempt history
 * (Versuchshistorie).
@@ -5,6 +5,22 @@ package de.gecheckt.pdf.umbenenner.application.port.out;
|
||||
* <p>
|
||||
* The application delegates all logging to this port to remain decoupled from
|
||||
* specific logging frameworks. Concrete implementations are provided by adapters.
|
||||
* <p>
|
||||
* <h2>Sensitive AI content logging</h2>
|
||||
* <p>
|
||||
* The {@link #debugSensitiveAiContent(String, Object[])} method allows logging
|
||||
* of sensitive AI-generated content (complete raw response, complete reasoning)
|
||||
* subject to the {@link AiContentSensitivity} setting:
|
||||
* <ul>
|
||||
* <li>When {@link AiContentSensitivity#PROTECT_SENSITIVE_CONTENT} is active (default),
|
||||
* this method logs nothing.</li>
|
||||
* <li>When {@link AiContentSensitivity#LOG_SENSITIVE_CONTENT} is explicitly enabled,
|
||||
* this method logs the content to DEBUG level.</li>
|
||||
* </ul>
|
||||
* <p>
|
||||
* The complete sensitive content is always persisted in SQLite for traceability,
|
||||
* regardless of this logging setting. The logging decision controls only whether
|
||||
* the content also appears in log files.
|
||||
*/
|
||||
public interface ProcessingLogger {
|
||||
|
||||
@@ -24,6 +40,29 @@ public interface ProcessingLogger {
|
||||
*/
|
||||
void debug(String message, Object... args);
|
||||
|
||||
/**
|
||||
* Logs a debug-level message containing sensitive AI-generated content,
|
||||
* subject to the configured {@link AiContentSensitivity}.
|
||||
* <p>
|
||||
* This method is called with message and arguments containing sensitive AI content
|
||||
* (e.g., complete raw response, complete reasoning). The implementation must:
|
||||
* <ul>
|
||||
* <li>Check the current {@link AiContentSensitivity} setting.</li>
|
||||
* <li>If set to {@link AiContentSensitivity#PROTECT_SENSITIVE_CONTENT} (default),
|
||||
* emit nothing to log.</li>
|
||||
* <li>If set to {@link AiContentSensitivity#LOG_SENSITIVE_CONTENT}, log the
|
||||
* message and arguments at DEBUG level normally.</li>
|
||||
* </ul>
|
||||
* <p>
|
||||
* This is the only method where sensitive AI content may be logged based on
|
||||
* configuration. Other logging methods ({@link #info}, {@link #debug}, etc.)
|
||||
* must never log sensitive content.
|
||||
*
|
||||
* @param message the message template (may contain {} placeholders)
|
||||
* @param args optional message arguments that may include sensitive AI content
|
||||
*/
|
||||
void debugSensitiveAiContent(String message, Object... args);
|
||||
|
||||
/**
|
||||
* Logs a warning-level message.
|
||||
*
|
||||
|
||||
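The sensitivity gate described in the Javadoc above can be sketched as follows. The `AiContentSensitivity` values mirror the diff, but the class name, the string-returning method signature, and the manual `{}` substitution are illustrative stand-ins for the repository's actual logger wiring, not its API.

```java
// Minimal sketch (assumed names) of the PROTECT/LOG sensitivity gate.
public class SensitiveLoggingSketch {

    enum AiContentSensitivity { PROTECT_SENSITIVE_CONTENT, LOG_SENSITIVE_CONTENT }

    private final AiContentSensitivity sensitivity;

    SensitiveLoggingSketch(AiContentSensitivity sensitivity) {
        this.sensitivity = sensitivity;
    }

    // Returns the rendered message only when sensitive logging is explicitly
    // enabled; under the default PROTECT_SENSITIVE_CONTENT setting it returns
    // null, i.e. nothing reaches the log.
    String debugSensitiveAiContent(String message, Object... args) {
        if (sensitivity == AiContentSensitivity.PROTECT_SENSITIVE_CONTENT) {
            return null; // suppressed
        }
        String rendered = message;
        for (Object arg : args) {
            // Substitute the first remaining {} placeholder, SLF4J-style.
            rendered = rendered.replaceFirst("\\{}", String.valueOf(arg));
        }
        return rendered;
    }

    public static void main(String[] args) {
        SensitiveLoggingSketch protect =
                new SensitiveLoggingSketch(AiContentSensitivity.PROTECT_SENSITIVE_CONTENT);
        SensitiveLoggingSketch log =
                new SensitiveLoggingSketch(AiContentSensitivity.LOG_SENSITIVE_CONTENT);
        System.out.println(protect.debugSensitiveAiContent("raw response: {}", "secret")); // null
        System.out.println(log.debugSensitiveAiContent("raw response: {}", "secret")); // raw response: secret
    }
}
```

Note that either way the content is still persisted in SQLite; only the log emission is gated.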
@@ -1,8 +1,9 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import de.gecheckt.pdf.umbenenner.domain.model.PromptIdentifier;
import java.util.Objects;

import de.gecheckt.pdf.umbenenner.domain.model.PromptIdentifier;

/**
* Represents successful loading of an external prompt template.
* <p>

@@ -0,0 +1,172 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

/**
* Sealed type representing the complete, authoritative retry decision for a document
* after an error has been classified.
* <p>
* A {@code RetryDecision} is the output of the retry policy evaluation. It unambiguously
* encodes what must happen next for the document: which status to persist, which counter
* to increment, and whether a within-run immediate retry is still possible.
* <p>
* <strong>Decision cases and their semantics:</strong>
* <ol>
* <li>{@link ContentErrorRetryable} — first deterministic content error. Document moves
* to {@code FAILED_RETRYABLE}; content-error counter is incremented by 1. One later
* scheduler run may retry.</li>
* <li>{@link ContentErrorFinal} — second (or later) deterministic content error. Document
* moves to {@code FAILED_FINAL}; content-error counter is incremented by 1. No further
* processing in any future run.</li>
* <li>{@link TransientErrorRetryable} — transient technical error with remaining retry budget.
* Document moves to {@code FAILED_RETRYABLE}; transient-error counter is incremented by 1.
* A later scheduler run may retry, as long as the counter stays below
* {@code max.retries.transient}.</li>
* <li>{@link TransientErrorFinal} — transient technical error that exhausts the configured
* {@code max.retries.transient} budget. Document moves to {@code FAILED_FINAL};
* transient-error counter is incremented by 1. No further processing in any future run.</li>
* <li>{@link TargetCopyWithImmediateRetry} — first physical copy failure within the current
* run. The document has not yet changed status; exactly one immediate within-run retry
* of the copy step is permitted. No new AI call and no new naming-proposal derivation
* occur. This decision does not yet modify any counter or status; the outcome of the
* immediate retry determines which subsequent decision applies.</li>
* </ol>
* <p>
* <strong>What this type does NOT cover:</strong>
* <ul>
* <li>Skip decisions ({@code SKIPPED_ALREADY_PROCESSED}, {@code SKIPPED_FINAL_FAILURE})
* — skips are not retry decisions; they are pure historisation events.</li>
* <li>Success — a successful outcome is not a retry decision.</li>
* <li>Pre-fingerprint failures — errors before the fingerprint is computed are not
* historised as attempts and therefore do not produce a {@code RetryDecision}.</li>
* </ul>
* <p>
* <strong>Counter invariant:</strong> Skip decisions ({@code SKIPPED_ALREADY_PROCESSED},
* {@code SKIPPED_FINAL_FAILURE}) never produce a {@code RetryDecision} and never change
* any failure counter.
* <p>
* <strong>Single-truth rule:</strong> The retry decision is derived exclusively from the
* document master record and the attempt history. No additional, parallel truth source
* for retry state is introduced.
*/
public sealed interface RetryDecision {

/**
* Returns the failure class identifier for persistence and logging.
* <p>
* The failure class is a short, stable string identifying the type of failure,
* typically the enum constant name of the original error or exception class name.
*
* @return failure class string; never {@code null} or blank
*/
String failureClass();

/**
* Returns a human-readable failure message for persistence and logging.
*
* @return failure message; never {@code null} or blank
*/
String failureMessage();

// -------------------------------------------------------------------------
// Deterministic content error cases
// -------------------------------------------------------------------------

/**
* First historised deterministic content error for this fingerprint.
* <p>
* The document must be persisted with status {@code FAILED_RETRYABLE} and the
* content-error counter incremented by 1. Exactly one later scheduler run is
* permitted to retry.
*
* @param failureClass failure class identifier; never {@code null} or blank
* @param failureMessage human-readable failure description; never {@code null} or blank
*/
record ContentErrorRetryable(String failureClass, String failureMessage)
implements RetryDecision {}

/**
* Second (or subsequent) historised deterministic content error for this fingerprint.
* <p>
* The document must be persisted with status {@code FAILED_FINAL} and the
* content-error counter incremented by 1. No further processing is allowed in
* any future run.
*
* @param failureClass failure class identifier; never {@code null} or blank
* @param failureMessage human-readable failure description; never {@code null} or blank
*/
record ContentErrorFinal(String failureClass, String failureMessage)
implements RetryDecision {}

// -------------------------------------------------------------------------
// Transient technical error cases
// -------------------------------------------------------------------------

/**
* Transient technical error with remaining retry budget.
* <p>
* The transient-error counter after incrementing is strictly less than
* {@code max.retries.transient}. The document must be persisted with status
* {@code FAILED_RETRYABLE} and the transient-error counter incremented by 1.
* A later scheduler run may retry.
*
* @param failureClass failure class identifier; never {@code null} or blank
* @param failureMessage human-readable failure description; never {@code null} or blank
*/
record TransientErrorRetryable(String failureClass, String failureMessage)
implements RetryDecision {}

/**
* Transient technical error that exhausts the configured {@code max.retries.transient}
* budget.
* <p>
* The transient-error counter after incrementing equals {@code max.retries.transient}.
* The document must be persisted with status {@code FAILED_FINAL} and the
* transient-error counter incremented by 1. No further processing is allowed in
* any future run.
* <p>
* Example: with {@code max.retries.transient = 1}, the very first transient error
* produces this decision immediately.
*
* @param failureClass failure class identifier; never {@code null} or blank
* @param failureMessage human-readable failure description; never {@code null} or blank
*/
record TransientErrorFinal(String failureClass, String failureMessage)
implements RetryDecision {}

// -------------------------------------------------------------------------
// Target copy immediate retry case
// -------------------------------------------------------------------------

/**
* First physical target-file copy failure within the current run.
* <p>
* Exactly one immediate technical retry of the copy operation is permitted within
* the same document run. This decision does not change any counter or document
* status — it defers the final outcome until the immediate retry completes:
* <ul>
* <li>If the immediate retry succeeds → document proceeds to {@code SUCCESS}.</li>
* <li>If the immediate retry also fails → the combined failure is classified as
* a transient technical error and a {@link TransientErrorRetryable} or
* {@link TransientErrorFinal} decision is produced for the final persistence
* step.</li>
* </ul>
* <p>
* The immediate retry is strictly limited to the physical copy path. No new AI call
* and no new naming-proposal derivation occur. This mechanism does not increment the
* cross-run transient-error counter.
*
* @param failureMessage human-readable description of the initial copy failure;
* never {@code null} or blank
*/
record TargetCopyWithImmediateRetry(String failureMessage) implements RetryDecision {

/**
* Returns the constant failure class identifier for target copy failures.
*
* @return {@code "TARGET_COPY_TECHNICAL_ERROR"}
*/
@Override
public String failureClass() {
return DocumentErrorClassification.TARGET_COPY_TECHNICAL_ERROR.name();
}
}
}
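A sealed hierarchy like `RetryDecision` lets callers branch exhaustively with a pattern-matching switch: adding a new decision case then fails compilation until every consumer handles it. The sketch below illustrates this with two simplified stand-in cases (the local declarations and the `statusFor` mapping are hypothetical, not the repository's code) and assumes a Java 21 toolchain.

```java
// Hypothetical consumer of a sealed retry-decision type, showing the
// exhaustiveness guarantee of pattern-matching switch over sealed interfaces.
public class RetryDecisionSwitchSketch {

    sealed interface RetryDecision permits ContentErrorRetryable, ContentErrorFinal {}

    record ContentErrorRetryable(String failureClass, String failureMessage)
            implements RetryDecision {}

    record ContentErrorFinal(String failureClass, String failureMessage)
            implements RetryDecision {}

    // Maps a decision to the status string to persist. No default branch is
    // needed: the compiler verifies all permitted subtypes are covered.
    static String statusFor(RetryDecision decision) {
        return switch (decision) {
            case ContentErrorRetryable r -> "FAILED_RETRYABLE";
            case ContentErrorFinal f -> "FAILED_FINAL";
        };
    }

    public static void main(String[] args) {
        System.out.println(statusFor(new ContentErrorRetryable("X", "first content error")));
        System.out.println(statusFor(new ContentErrorFinal("X", "second content error")));
    }
}
```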
@@ -1,9 +1,9 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;

import java.util.List;

import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;

/**
* Outbound port for loading PDF document candidates from the source folder.
* <p>

@@ -1,7 +1,5 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;

import java.util.function.Consumer;

/**

@@ -62,6 +62,20 @@
* — Sealed result of parsing raw response into JSON structure (success or parsing failure)</li>
* </ul>
* <p>
* Retry policy and logging types:
* <ul>
* <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.DocumentErrorClassification}
* — Unified classification of all document-level errors (content, transient, target copy)</li>
* <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.RetryDecision}
* — Sealed type representing the authoritative retry decision for a document error</li>
* <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.ImmediateRetryDecision}
* — Decision governing whether a within-run target copy retry is permitted</li>
* <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.AiContentSensitivity}
* — Sensitivity decision governing whether AI-generated content may be logged</li>
* <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.DocumentLogCorrelation}
* — Sealed type carrying the correlation context for document-related log entries</li>
* </ul>
* <p>
* Exception types:
* <ul>
* <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.RunLockUnavailableException}

@@ -85,30 +85,6 @@ public class AiRequestComposer {
Objects.requireNonNull(promptContent, "promptContent must not be null");
Objects.requireNonNull(documentText, "documentText must not be null");

// The complete request text is composed in a fixed, deterministic order:
// 1. Prompt content (instruction)
// 2. Newline separator
// 3. Prompt identifier marker (for traceability)
// 4. Newline separator
// 5. Document text section marker
// 6. Newline separator
// 7. Document text content
// 8. Newline separator
// 9. Response format specification (JSON-only with required fields)
//
// This order is fixed so that another implementation knows exactly where
// each part is positioned and what to expect.
StringBuilder requestBuilder = new StringBuilder();
requestBuilder.append(promptContent);
requestBuilder.append("\n");
requestBuilder.append("--- Prompt-ID: ").append(promptIdentifier.identifier()).append(" ---");
requestBuilder.append("\n");
requestBuilder.append("--- Document Text ---");
requestBuilder.append("\n");
requestBuilder.append(documentText);
requestBuilder.append("\n");
appendJsonResponseFormat(requestBuilder);

// Record the exact character count of the document text that was included.
// This is the length of the document text (not the complete request).
int sentCharacterCount = documentText.length();

@@ -6,6 +6,7 @@ import java.util.Objects;
import java.util.Set;

import de.gecheckt.pdf.umbenenner.application.port.out.ClockPort;
import de.gecheckt.pdf.umbenenner.application.service.AiResponseValidator.AiValidationResult;
import de.gecheckt.pdf.umbenenner.domain.model.AiErrorClassification;
import de.gecheckt.pdf.umbenenner.domain.model.DateSource;
import de.gecheckt.pdf.umbenenner.domain.model.NamingProposal;

@@ -0,0 +1,200 @@
package de.gecheckt.pdf.umbenenner.application.service;

import java.util.Objects;

import de.gecheckt.pdf.umbenenner.application.port.out.DocumentErrorClassification;
import de.gecheckt.pdf.umbenenner.application.port.out.FailureCounters;
import de.gecheckt.pdf.umbenenner.application.port.out.ImmediateRetryDecision;
import de.gecheckt.pdf.umbenenner.application.port.out.RetryDecision;

/**
* Default implementation of the {@link RetryDecisionEvaluator} interface.
* <p>
* Applies the binding retry policy rules exactly as specified:
* <ul>
* <li><strong>Deterministic content errors</strong>: the first historised occurrence
* for a fingerprint leads to {@link RetryDecision.ContentErrorRetryable} (one later
* scheduler run may retry); the second occurrence leads to
* {@link RetryDecision.ContentErrorFinal} (no further retries).</li>
* <li><strong>Transient technical errors</strong>: the error remains
* {@link RetryDecision.TransientErrorRetryable} while the counter after incrementing
* is strictly less than {@code maxRetriesTransient}. When the counter after
* incrementing reaches {@code maxRetriesTransient}, the result is
* {@link RetryDecision.TransientErrorFinal}.</li>
* <li><strong>Target copy failures</strong>: the first copy failure within a run
* produces {@link RetryDecision.TargetCopyWithImmediateRetry}, allowing exactly
* one immediate within-run retry of the physical copy step. This decision does
* not modify any counter.</li>
* </ul>
* <p>
* <strong>Counter semantics:</strong> The {@code currentCounters} passed to
* {@link #evaluate} reflect the state <em>before</em> the current attempt's counter
* increment. This evaluator computes what the counter will be after incrementing and
* applies the threshold check accordingly.
* <p>
* <strong>Skip events</strong> ({@code SKIPPED_ALREADY_PROCESSED},
* {@code SKIPPED_FINAL_FAILURE}) are not routed through this evaluator and never
* produce a {@link RetryDecision}. No failure counter is changed by skip events.
* <p>
* <strong>Immediate within-run retry</strong> for the target copy path is a purely
* technical within-run mechanism. It does not increment the cross-run
* transient-error counter regardless of outcome, and it is not part of the
* cross-run retry budget governed by {@code max.retries.transient}.
* <p>
* <strong>Single-truth rule:</strong> Evaluations are derived solely from the document
* master record's failure counters and the configured limit. No additional, parallel
* persistence source for retry decisions is introduced.
* <p>
* This class is stateless and thread-safe.
*/
public final class DefaultRetryDecisionEvaluator implements RetryDecisionEvaluator {

/**
* Derives the authoritative retry decision for a document-level error.
* <p>
* Decision rules by error class:
* <ul>
* <li>{@link DocumentErrorClassification#DETERMINISTIC_CONTENT_ERROR}:
* {@code contentErrorCount} before increment = 0 →
* {@link RetryDecision.ContentErrorRetryable}; else →
* {@link RetryDecision.ContentErrorFinal}.</li>
* <li>{@link DocumentErrorClassification#TRANSIENT_TECHNICAL_ERROR}:
* {@code transientErrorCount + 1 < maxRetriesTransient} →
* {@link RetryDecision.TransientErrorRetryable};
* {@code transientErrorCount + 1 >= maxRetriesTransient} →
* {@link RetryDecision.TransientErrorFinal}.</li>
* <li>{@link DocumentErrorClassification#TARGET_COPY_TECHNICAL_ERROR}:
* always → {@link RetryDecision.TargetCopyWithImmediateRetry}.
* No counter is modified by this decision.</li>
* </ul>
*
* @param errorClass classification of the error that occurred; never {@code null}
* @param currentCounters failure counters <em>before</em> incrementing for this
* attempt; never {@code null}
* @param maxRetriesTransient configured maximum number of historised transient errors
* allowed per fingerprint; must be ≥ 1
* @param failureClass short, stable failure class identifier; never {@code null} or blank
* @param failureMessage human-readable description of the error; never {@code null} or blank
* @return the authoritative {@link RetryDecision}; never {@code null}
* @throws IllegalArgumentException if {@code maxRetriesTransient} is less than 1
* @throws NullPointerException if any reference parameter is {@code null}
*/
@Override
public RetryDecision evaluate(
DocumentErrorClassification errorClass,
FailureCounters currentCounters,
int maxRetriesTransient,
String failureClass,
String failureMessage) {

Objects.requireNonNull(errorClass, "errorClass must not be null");
Objects.requireNonNull(currentCounters, "currentCounters must not be null");
Objects.requireNonNull(failureClass, "failureClass must not be null");
Objects.requireNonNull(failureMessage, "failureMessage must not be null");
if (failureClass.isBlank()) {
throw new IllegalArgumentException("failureClass must not be blank");
}
if (failureMessage.isBlank()) {
throw new IllegalArgumentException("failureMessage must not be blank");
}
if (maxRetriesTransient < 1) {
throw new IllegalArgumentException(
"maxRetriesTransient must be >= 1, but was: " + maxRetriesTransient);
}

return switch (errorClass) {
case DETERMINISTIC_CONTENT_ERROR -> evaluateContentError(
currentCounters, failureClass, failureMessage);
case TRANSIENT_TECHNICAL_ERROR -> evaluateTransientError(
currentCounters, maxRetriesTransient, failureClass, failureMessage);
case TARGET_COPY_TECHNICAL_ERROR ->
new RetryDecision.TargetCopyWithImmediateRetry(failureMessage);
};
}

/**
* Determines whether an immediate within-run retry of the target copy operation
* is permitted.
* <p>
* {@link ImmediateRetryDecision#ALLOWED} is returned only when the copy has failed
* on its first attempt within the current run. If this is the second copy attempt
* (the immediate retry itself has failed), {@link ImmediateRetryDecision#DENIED} is
* returned and the failure must be escalated to the cross-run retry evaluation.
*
* @param isFirstCopyAttemptInThisRun {@code true} if the failing copy attempt was
* the first copy attempt for this document in
* the current run
* @return {@link ImmediateRetryDecision#ALLOWED} or {@link ImmediateRetryDecision#DENIED};
* never {@code null}
*/
@Override
public ImmediateRetryDecision evaluateImmediateRetry(boolean isFirstCopyAttemptInThisRun) {
return isFirstCopyAttemptInThisRun
? ImmediateRetryDecision.ALLOWED
: ImmediateRetryDecision.DENIED;
}

/**
* Evaluates the retry decision for a deterministic content error.
* <p>
* The content-error counter before this attempt determines the decision:
* <ul>
* <li>Count = 0 (first error) → {@link RetryDecision.ContentErrorRetryable};
* one later scheduler run may retry.</li>
* <li>Count ≥ 1 (second or subsequent error) → {@link RetryDecision.ContentErrorFinal};
* no further retries.</li>
* </ul>
*
* @param currentCounters failure counters before incrementing
* @param failureClass failure class identifier
* @param failureMessage failure description
* @return the appropriate content-error retry decision
*/
private static RetryDecision evaluateContentError(
FailureCounters currentCounters,
String failureClass,
String failureMessage) {

if (currentCounters.contentErrorCount() == 0) {
return new RetryDecision.ContentErrorRetryable(failureClass, failureMessage);
}
return new RetryDecision.ContentErrorFinal(failureClass, failureMessage);
}

/**
* Evaluates the retry decision for a transient technical error.
* <p>
* The transient-error counter after incrementing determines the decision:
* <ul>
* <li>Counter after increment strictly less than {@code maxRetriesTransient} →
* {@link RetryDecision.TransientErrorRetryable}; a later scheduler run may retry.</li>
* <li>Counter after increment equals or exceeds {@code maxRetriesTransient} →
* {@link RetryDecision.TransientErrorFinal}; no further retries.</li>
* </ul>
* <p>
* Example with {@code maxRetriesTransient = 1}: counter before = 0,
* counter after = 1 = limit → {@link RetryDecision.TransientErrorFinal} immediately.
* <p>
* Example with {@code maxRetriesTransient = 2}: counter before = 0,
* counter after = 1 < 2 → {@link RetryDecision.TransientErrorRetryable};
* counter before = 1, counter after = 2 = limit → {@link RetryDecision.TransientErrorFinal}.
*
* @param currentCounters failure counters before incrementing
* @param maxRetriesTransient configured maximum historised transient errors (≥ 1)
* @param failureClass failure class identifier
* @param failureMessage failure description
* @return the appropriate transient-error retry decision
*/
private static RetryDecision evaluateTransientError(
FailureCounters currentCounters,
int maxRetriesTransient,
String failureClass,
String failureMessage) {

int counterAfterIncrement = currentCounters.transientErrorCount() + 1;
if (counterAfterIncrement < maxRetriesTransient) {
return new RetryDecision.TransientErrorRetryable(failureClass, failureMessage);
}
return new RetryDecision.TransientErrorFinal(failureClass, failureMessage);
}
}
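The transient-error threshold rule implemented above reduces to one comparison on the counter after incrementing. A minimal sketch of just that arithmetic (the class and method names here are illustrative, not the repository's API):

```java
// Sketch of the cross-run transient-error threshold check.
public class TransientRetrySketch {

    // Returns true when the document must be finalised to FAILED_FINAL,
    // i.e. when the counter after incrementing reaches the configured limit.
    static boolean isFinalAfterIncrement(int transientErrorCountBefore, int maxRetriesTransient) {
        if (maxRetriesTransient < 1) {
            throw new IllegalArgumentException("maxRetriesTransient must be >= 1");
        }
        return transientErrorCountBefore + 1 >= maxRetriesTransient;
    }

    public static void main(String[] args) {
        // With max = 1, the very first transient error is final.
        System.out.println(isFinalAfterIncrement(0, 1)); // true
        // With max = 2, the first error stays retryable, the second is final.
        System.out.println(isFinalAfterIncrement(0, 2)); // false
        System.out.println(isFinalAfterIncrement(1, 2)); // true
    }
}
```

This mirrors the Javadoc examples: the attempt that makes the counter reach the limit is itself the finalising attempt.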
@@ -1,5 +1,10 @@
|
||||
package de.gecheckt.pdf.umbenenner.application.service;
|
||||
|
||||
import java.time.Instant;
|
||||
import java.util.Objects;
|
||||
import java.util.function.Consumer;
|
||||
import java.util.function.Function;
|
||||
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentKnownProcessable;
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentPersistenceException;
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentRecord;
|
||||
@@ -16,7 +21,6 @@ import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingLogger;
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.ResolvedTargetFilename;
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyPort;
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyResult;
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopySuccess;
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyTechnicalFailure;
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFilenameResolutionResult;
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFolderPort;
|
||||
@@ -33,50 +37,82 @@ import de.gecheckt.pdf.umbenenner.domain.model.NamingProposalReady;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;
|
||||
|
||||
import java.time.Instant;
|
||||
import java.util.Objects;
|
||||
import java.util.function.Consumer;
|
||||
import java.util.function.Function;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.TechnicalDocumentError;
|
||||
|
||||
/**
|
||||
* Application-level service that implements the per-document processing logic.
|
||||
* <p>
|
||||
* This service is the single authoritative place for the decision rules:
|
||||
* idempotency checks, status/counter mapping, target-copy finalization, and consistent
|
||||
* two-level persistence.
|
||||
* idempotency checks, status/counter mapping, target-copy finalization, retry
|
||||
* finalization, skip semantics, and consistent two-level persistence.
|
||||
*
|
||||
* <h2>Processing order per candidate</h2>
|
||||
* <ol>
|
||||
* <li>Load the document master record by fingerprint.</li>
|
||||
* <li>If the overall status is {@link ProcessingStatus#SUCCESS} → create and persist
|
||||
* a skip attempt with {@link ProcessingStatus#SKIPPED_ALREADY_PROCESSED}.</li>
|
||||
* <li>If the overall status is {@link ProcessingStatus#FAILED_FINAL} → create and persist
|
||||
* a skip attempt with {@link ProcessingStatus#SKIPPED_FINAL_FAILURE}.</li>
|
||||
* <li>If the overall status is {@link ProcessingStatus#SUCCESS} →
|
||||
* <strong>log skip at INFO with fingerprint</strong>;
|
||||
* persist a skip attempt with {@link ProcessingStatus#SKIPPED_ALREADY_PROCESSED}.
|
||||
* Failure counters are not changed.</li>
|
||||
* <li>If the overall status is {@link ProcessingStatus#FAILED_FINAL} →
|
||||
* <strong>log skip at INFO with fingerprint</strong>;
|
||||
* persist a skip attempt with {@link ProcessingStatus#SKIPPED_FINAL_FAILURE}.
|
||||
* Failure counters are not changed.</li>
|
||||
* <li>If the overall status is {@link ProcessingStatus#PROPOSAL_READY} → load the
|
||||
* leading proposal attempt and execute the target-copy finalization flow:
|
||||
* build the base filename, resolve duplicates, write the copy, persist SUCCESS or
|
||||
* FAILED_RETRYABLE.</li>
|
||||
* build the base filename, resolve duplicates,
|
||||
* <strong>log generated target filename at INFO with fingerprint</strong>,
|
||||
* write the copy, persist SUCCESS or FAILED_RETRYABLE.</li>
|
||||
* <li>Otherwise execute the pipeline (extraction + pre-checks + AI naming) and map
|
||||
* the result into status, counters, and retryable flag.</li>
|
||||
* <li><strong>Log retry decision at INFO with fingerprint and error classification</strong>:
|
||||
* FAILED_RETRYABLE (will retry in a later scheduler run) or
|
||||
* FAILED_FINAL (retry budget exhausted, no further processing).</li>
|
||||
* <li>Persist exactly one historised processing attempt for the identified document.</li>
|
||||
* <li>Persist the updated document master record.</li>
|
||||
* </ol>
|
||||
*
|
||||
* <h2>Retry finalization rules</h2>
|
||||
* <ul>
|
||||
* <li><strong>Deterministic content errors:</strong> The first historised occurrence
|
||||
* leads to {@link ProcessingStatus#FAILED_RETRYABLE} (content-error counter incremented
|
||||
* by 1). The second historised occurrence leads to {@link ProcessingStatus#FAILED_FINAL}
|
||||
* (content-error counter incremented by 1). No further retry is possible.</li>
|
||||
* <li><strong>Transient technical errors:</strong> The transient-error counter is
|
||||
* incremented by 1 per occurrence. The document remains
|
||||
 * {@link ProcessingStatus#FAILED_RETRYABLE} as long as the counter is strictly less
 * than {@code maxRetriesTransient}. The attempt that causes the counter to reach
 * {@code maxRetriesTransient} finalises the document to
 * {@link ProcessingStatus#FAILED_FINAL}. Valid values of {@code maxRetriesTransient}
 * are integers ≥ 1; the value 0 is invalid startup configuration.</li>
 * <li><strong>Skip events</strong> ({@code SKIPPED_ALREADY_PROCESSED},
 * {@code SKIPPED_FINAL_FAILURE}) never change any failure counter.</li>
 * </ul>
 *
 * <h2>Status transitions</h2>
 * <ul>
 * <li>Pre-check passed + AI naming proposal ready → {@link ProcessingStatus#PROPOSAL_READY}</li>
 * <li>First deterministic content failure → {@link ProcessingStatus#FAILED_RETRYABLE}</li>
 * <li>Second deterministic content failure → {@link ProcessingStatus#FAILED_FINAL}</li>
 * <li>Technical infrastructure failure → {@link ProcessingStatus#FAILED_RETRYABLE}</li>
 * <li>Technical failure at transient retry limit → {@link ProcessingStatus#FAILED_FINAL}</li>
 * <li>{@link ProcessingStatus#PROPOSAL_READY} + successful target copy + consistent
 *     persistence → {@link ProcessingStatus#SUCCESS}</li>
 * <li>{@link ProcessingStatus#PROPOSAL_READY} + first copy failure + successful immediate retry
 *     → treated as successful copy, proceeds to {@link ProcessingStatus#SUCCESS}</li>
 * <li>{@link ProcessingStatus#PROPOSAL_READY} + both copy attempts fail → cross-run
 *     {@link ProcessingStatus#FAILED_RETRYABLE}, transient error counter +1</li>
 * <li>{@link ProcessingStatus#PROPOSAL_READY} + technical failure → {@link ProcessingStatus#FAILED_RETRYABLE},
 *     transient error counter +1</li>
 * <li>{@link ProcessingStatus#SUCCESS} → {@link ProcessingStatus#SKIPPED_ALREADY_PROCESSED} skip</li>
 * <li>{@link ProcessingStatus#FAILED_FINAL} → {@link ProcessingStatus#SKIPPED_FINAL_FAILURE} skip</li>
 * </ul>
 *
 * <h2>Log correlation</h2>
 * <p>
 * All log entries emitted by this coordinator are post-fingerprint: the fingerprint is
 * available for every document that reaches this coordinator. Relevant log entries carry
 * the document fingerprint for unambiguous correlation across runs.
 *
 * <h2>Leading source for the naming proposal (binding)</h2>
 * <p>
 * When a document is in {@code PROPOSAL_READY} state, the authoritative source for the
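The counter rules spelled out in the Javadoc above can be condensed into a small sketch. It is illustrative only: `Status` and the two method names are hypothetical stand-ins, not the coordinator's real types.

```java
// Illustrative sketch of the documented counter rules; Status and the method
// names are hypothetical stand-ins for the project's real types.
enum Status { FAILED_RETRYABLE, FAILED_FINAL }

final class CounterRuleSketch {

    // Deterministic content errors: the first historised failure stays
    // retryable, the second finalises the document.
    static Status afterContentError(int contentErrorsBefore) {
        return contentErrorsBefore == 0 ? Status.FAILED_RETRYABLE : Status.FAILED_FINAL;
    }

    // Transient technical errors: the attempt whose increment makes the
    // counter reach maxRetriesTransient finalises the document.
    static Status afterTransientError(int transientErrorsBefore, int maxRetriesTransient) {
        if (maxRetriesTransient < 1) {
            throw new IllegalArgumentException("maxRetriesTransient must be >= 1");
        }
        int incremented = transientErrorsBefore + 1;
        return incremented >= maxRetriesTransient ? Status.FAILED_FINAL : Status.FAILED_RETRYABLE;
    }
}
```

Note that skip events deliberately have no counterpart here: per the documentation, they never change a counter.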
@@ -116,9 +152,23 @@ public class DocumentProcessingCoordinator {
  private final TargetFolderPort targetFolderPort;
  private final TargetFileCopyPort targetFileCopyPort;
  private final ProcessingLogger logger;
+ private final int maxRetriesTransient;
+ private final String activeProviderIdentifier;

  /**
- * Creates the document processing coordinator with all required ports and the logger.
+ * Creates the document processing coordinator with all required ports, logger,
+ * the transient retry limit, and the active AI provider identifier.
+ * <p>
+ * {@code maxRetriesTransient} is the maximum number of historised transient error attempts
+ * per fingerprint before the document is finalised to
+ * {@link ProcessingStatus#FAILED_FINAL}. The attempt that causes the counter to
+ * reach this value finalises the document. Must be >= 1.
+ * <p>
+ * {@code activeProviderIdentifier} is the opaque string identifier of the AI provider
+ * that is active for this run (e.g. {@code "openai-compatible"} or {@code "claude"}).
+ * It is written to the attempt history for every attempt that involves an AI call,
+ * enabling provider-level traceability per attempt without introducing
+ * provider-specific logic in the application layer.
  *
  * @param documentRecordRepository port for reading and writing the document master record;
  * must not be null
@@ -130,7 +180,13 @@ public class DocumentProcessingCoordinator {
  * @param targetFileCopyPort port for copying source files to the target folder;
  * must not be null
  * @param logger for processing-related logging; must not be null
- * @throws NullPointerException if any parameter is null
+ * @param maxRetriesTransient maximum number of historised transient error attempts
+ * before finalisation; must be >= 1
+ * @param activeProviderIdentifier opaque identifier of the active AI provider for this run;
+ * must not be null or blank
+ * @throws NullPointerException if any object parameter is null
+ * @throws IllegalArgumentException if {@code maxRetriesTransient} is less than 1, or
+ * if {@code activeProviderIdentifier} is blank
  */
  public DocumentProcessingCoordinator(
  DocumentRecordRepository documentRecordRepository,
@@ -138,7 +194,17 @@ public class DocumentProcessingCoordinator {
  UnitOfWorkPort unitOfWorkPort,
  TargetFolderPort targetFolderPort,
  TargetFileCopyPort targetFileCopyPort,
- ProcessingLogger logger) {
+ ProcessingLogger logger,
+ int maxRetriesTransient,
+ String activeProviderIdentifier) {
+ if (maxRetriesTransient < 1) {
+ throw new IllegalArgumentException(
+ "maxRetriesTransient must be >= 1, got: " + maxRetriesTransient);
+ }
+ Objects.requireNonNull(activeProviderIdentifier, "activeProviderIdentifier must not be null");
+ if (activeProviderIdentifier.isBlank()) {
+ throw new IllegalArgumentException("activeProviderIdentifier must not be blank");
+ }
  this.documentRecordRepository =
  Objects.requireNonNull(documentRecordRepository, "documentRecordRepository must not be null");
  this.processingAttemptRepository =
@@ -150,6 +216,8 @@ public class DocumentProcessingCoordinator {
  this.targetFileCopyPort =
  Objects.requireNonNull(targetFileCopyPort, "targetFileCopyPort must not be null");
  this.logger = Objects.requireNonNull(logger, "logger must not be null");
+ this.maxRetriesTransient = maxRetriesTransient;
+ this.activeProviderIdentifier = activeProviderIdentifier;
  }

  /**
@@ -285,7 +353,7 @@ public class DocumentProcessingCoordinator {
  }

  // =========================================================================
- // M6 target-copy finalization path
+ // Target-copy finalization path
  // =========================================================================

  /**
@@ -297,6 +365,10 @@ public class DocumentProcessingCoordinator {
  * <li>Build the base filename from the proposal's date and title.</li>
  * <li>Resolve the first available unique filename in the target folder.</li>
  * <li>Copy the source file to the target folder.</li>
+ * <li>If the copy fails: attempt exactly one immediate within-run retry of the same
+ * physical copy step. No new AI call and no new naming-proposal derivation occur.
+ * If the retry also fails, treat the combined failure as a transient error and
+ * skip the SUCCESS path.</li>
  * <li>Persist a new {@code SUCCESS} attempt and update the master record.</li>
  * <li>If persistence fails after a successful copy: attempt best-effort rollback
  * of the copy and persist {@code FAILED_RETRYABLE} instead.</li>
@@ -337,6 +409,16 @@ public class DocumentProcessingCoordinator {
  "Status is PROPOSAL_READY but no PROPOSAL_READY attempt exists in history");
  }

+ // Log sensitive AI content (raw response, reasoning) if configured
+ if (proposalAttempt.aiRawResponse() != null) {
+ logger.debugSensitiveAiContent("AI raw response for '{}' (fingerprint: {}): {}",
+ candidate.uniqueIdentifier(), fingerprint.sha256Hex(), proposalAttempt.aiRawResponse());
+ }
+ if (proposalAttempt.aiReasoning() != null) {
+ logger.debugSensitiveAiContent("AI reasoning for '{}' (fingerprint: {}): {}",
+ candidate.uniqueIdentifier(), fingerprint.sha256Hex(), proposalAttempt.aiReasoning());
+ }
+
  // --- Step 2: Build base filename from the proposal ---
  TargetFilenameBuildingService.BaseFilenameResult filenameResult =
  TargetFilenameBuildingService.buildBaseFilename(proposalAttempt);
@@ -365,19 +447,41 @@ public class DocumentProcessingCoordinator {

  String resolvedFilename =
  ((ResolvedTargetFilename) resolutionResult).resolvedFilename();
- logger.info("Resolved target filename for '{}': '{}'.",
- candidate.uniqueIdentifier(), resolvedFilename);
+ logger.info("Generated target filename for '{}' (fingerprint: {}): '{}'.",
+ candidate.uniqueIdentifier(), fingerprint.sha256Hex(), resolvedFilename);

- // --- Step 4: Copy file to target ---
+ // --- Step 4: Copy file to target (with one immediate within-run retry) ---
  TargetFileCopyResult copyResult =
  targetFileCopyPort.copyToTarget(candidate.locator(), resolvedFilename);

- if (copyResult instanceof TargetFileCopyTechnicalFailure copyFailure) {
- logger.error("Target copy failed for '{}': {}",
- candidate.uniqueIdentifier(), copyFailure.errorMessage());
- return persistTransientError(
- candidate, fingerprint, existingRecord, context, attemptStart, now,
- "Target file copy failed: " + copyFailure.errorMessage());
+ if (copyResult instanceof TargetFileCopyTechnicalFailure firstCopyFailure) {
+ // First copy attempt failed — perform exactly one immediate within-run retry.
+ // The retry reuses the same resolved filename and document context; no new AI
+ // call, no new naming-proposal derivation. This mechanism does not increment
+ // the cross-run transient-error counter by itself.
+ logger.warn("First target copy attempt failed for '{}': {}. Performing immediate within-run retry.",
+ candidate.uniqueIdentifier(), firstCopyFailure.errorMessage());
+
+ TargetFileCopyResult retryCopyResult =
+ targetFileCopyPort.copyToTarget(candidate.locator(), resolvedFilename);
+
+ if (retryCopyResult instanceof TargetFileCopyTechnicalFailure retryCopyFailure) {
+ // Immediate retry also failed — the combined failure is escalated as a
+ // cross-run transient technical error. No further within-run retry is
+ // attempted. This is the only document-level result for persistence.
+ logger.error("Immediate within-run retry also failed for '{}': {}",
+ candidate.uniqueIdentifier(), retryCopyFailure.errorMessage());
+ String combinedMessage = "Target file copy failed after immediate within-run retry."
+ + " First attempt: " + firstCopyFailure.errorMessage()
+ + "; Retry attempt: " + retryCopyFailure.errorMessage();
+ return persistTransientError(
+ candidate, fingerprint, existingRecord, context, attemptStart, now,
+ combinedMessage);
+ }
+
+ // Immediate retry succeeded — proceed to SUCCESS path as if the copy
+ // had succeeded on the first attempt.
+ logger.info("Immediate within-run retry succeeded for '{}'.", candidate.uniqueIdentifier());
  }

  // Copy succeeded — attempt to persist SUCCESS
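The single immediate within-run copy retry can be sketched independently of the real ports. This is a sketch under assumptions: `Supplier<Boolean>` stands in for the project's `TargetFileCopyPort.copyToTarget` call, and only the control flow is shown.

```java
import java.util.function.Supplier;

// Sketch of the single immediate within-run copy retry described above;
// Supplier<Boolean> is a hypothetical stand-in for the real copy port call.
final class CopyWithOneRetrySketch {

    // Returns true if the copy succeeds on the first attempt or on the one
    // immediate retry; at most two physical copy attempts are made, and no
    // new AI call or naming-proposal derivation happens in between.
    static boolean copyWithOneImmediateRetry(Supplier<Boolean> copyAttempt) {
        if (copyAttempt.get()) {
            return true; // first attempt succeeded
        }
        return copyAttempt.get(); // exactly one immediate retry, same filename
    }
}
```

If both attempts fail, the caller (per the diff above) reports one combined transient error; the retry itself never touches the cross-run counter.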
@@ -414,7 +518,7 @@ public class DocumentProcessingCoordinator {
  ProcessingAttempt successAttempt = new ProcessingAttempt(
  fingerprint, context.runId(), attemptNumber, attemptStart, now,
  ProcessingStatus.SUCCESS, null, null, false,
- null, null, null, null, null, null, null, null, null,
+ null, null, null, null, null, null, null, null, null, null,
  resolvedFilename);

  DocumentRecord successRecord = buildSuccessRecord(
@@ -447,8 +551,14 @@ public class DocumentProcessingCoordinator {
  }

  /**
- * Persists a {@code FAILED_RETRYABLE} attempt with an incremented transient error counter
- * for a document-level technical error during the target-copy finalization stage.
+ * Persists a transient error for a document-level technical failure during the
+ * target-copy finalization stage.
+ * <p>
+ * The retry decision (status and updated counters) is derived via the central
+ * rule in {@link ProcessingOutcomeTransition}, keeping the target-copy finalization
+ * path consistent with the AI pipeline path. The transient error counter is always
+ * incremented by exactly one. This method does not count the within-run immediate
+ * retry — only the combined outcome of the retry is reported here.
  *
  * @return true if the error was persisted; false if the error persistence itself failed
  */
@@ -461,28 +571,44 @@ public class DocumentProcessingCoordinator {
  Instant now,
  String errorMessage) {

- FailureCounters updatedCounters =
- existingRecord.failureCounters().withIncrementedTransientErrorCount();
+ // Delegate to the central retry rule so the target-copy path and the AI pipeline
+ // path are governed by the same logic without duplication.
+ ProcessingOutcomeTransition.ProcessingOutcome transition =
+ ProcessingOutcomeTransition.forKnownDocument(
+ new TechnicalDocumentError(candidate, errorMessage, null),
+ existingRecord.failureCounters(),
+ maxRetriesTransient);
+ FailureCounters updatedCounters = transition.counters();
+ ProcessingStatus errorStatus = transition.overallStatus();
+ boolean retryable = transition.retryable();

  try {
  int attemptNumber = processingAttemptRepository.loadNextAttemptNumber(fingerprint);
  ProcessingAttempt errorAttempt = ProcessingAttempt.withoutAiFields(
  fingerprint, context.runId(), attemptNumber, attemptStart, now,
- ProcessingStatus.FAILED_RETRYABLE,
- ProcessingStatus.FAILED_RETRYABLE.name(),
- errorMessage, true);
+ errorStatus,
+ errorStatus.name(),
+ errorMessage, retryable);

  DocumentRecord errorRecord = buildTransientErrorRecord(
- existingRecord, candidate, updatedCounters, now);
+ existingRecord, candidate, updatedCounters, errorStatus, now);

  unitOfWorkPort.executeInTransaction(txOps -> {
  txOps.saveProcessingAttempt(errorAttempt);
  txOps.updateDocumentRecord(errorRecord);
  });

- logger.debug("Transient error persisted for '{}': status=FAILED_RETRYABLE, "
- + "transientErrors={}.",
- candidate.uniqueIdentifier(),
- updatedCounters.transientErrorCount());
+ if (!retryable) {
+ logger.info("Retry decision for '{}' (fingerprint: {}): FAILED_FINAL — "
+ + "transient error limit reached ({}/{} attempts). No further retry.",
+ candidate.uniqueIdentifier(), fingerprint.sha256Hex(),
+ updatedCounters.transientErrorCount(), maxRetriesTransient);
+ } else {
+ logger.info("Retry decision for '{}' (fingerprint: {}): FAILED_RETRYABLE — "
+ + "transient error, will retry in later run ({}/{} attempts).",
+ candidate.uniqueIdentifier(), fingerprint.sha256Hex(),
+ updatedCounters.transientErrorCount(), maxRetriesTransient);
+ }
  return true;

  } catch (DocumentPersistenceException persistEx) {
@@ -493,9 +619,14 @@ public class DocumentProcessingCoordinator {
  }

  /**
- * Attempts to persist a {@code FAILED_RETRYABLE} attempt after a persistence failure
- * that occurred following a successful target copy. This is a secondary persistence
- * effort; its failure is logged but does not change the return value.
+ * Attempts to persist a transient error after a persistence failure that occurred
+ * following a successful target copy. This is a secondary persistence effort;
+ * its failure is logged but does not change the return value.
+ * <p>
+ * Applies the same transient limit check as {@link #persistTransientError} via the
+ * central rule in {@link ProcessingOutcomeTransition}: if the incremented counter
+ * reaches {@code maxRetriesTransient}, the secondary attempt is persisted as
+ * {@link ProcessingStatus#FAILED_FINAL}.
  */
  private void persistTransientErrorAfterPersistenceFailure(
  SourceDocumentCandidate candidate,
@@ -506,18 +637,24 @@ public class DocumentProcessingCoordinator {
  Instant now,
  String errorMessage) {

- FailureCounters updatedCounters =
- existingRecord.failureCounters().withIncrementedTransientErrorCount();
+ ProcessingOutcomeTransition.ProcessingOutcome transition =
+ ProcessingOutcomeTransition.forKnownDocument(
+ new TechnicalDocumentError(candidate, errorMessage, null),
+ existingRecord.failureCounters(),
+ maxRetriesTransient);
+ FailureCounters updatedCounters = transition.counters();
+ ProcessingStatus errorStatus = transition.overallStatus();

  try {
  int attemptNumber = processingAttemptRepository.loadNextAttemptNumber(fingerprint);
  ProcessingAttempt errorAttempt = ProcessingAttempt.withoutAiFields(
  fingerprint, context.runId(), attemptNumber, attemptStart, now,
- ProcessingStatus.FAILED_RETRYABLE,
- ProcessingStatus.FAILED_RETRYABLE.name(),
- errorMessage, true);
+ errorStatus,
+ errorStatus.name(),
+ errorMessage, transition.retryable());

  DocumentRecord errorRecord = buildTransientErrorRecord(
- existingRecord, candidate, updatedCounters, now);
+ existingRecord, candidate, updatedCounters, errorStatus, now);

  unitOfWorkPort.executeInTransaction(txOps -> {
  txOps.saveProcessingAttempt(errorAttempt);
@@ -618,13 +755,13 @@ public class DocumentProcessingCoordinator {

  private ProcessingOutcomeTransition.ProcessingOutcome mapOutcomeForNewDocument(
  DocumentProcessingOutcome pipelineOutcome) {
- return ProcessingOutcomeTransition.forNewDocument(pipelineOutcome);
+ return ProcessingOutcomeTransition.forNewDocument(pipelineOutcome, maxRetriesTransient);
  }

  private ProcessingOutcomeTransition.ProcessingOutcome mapOutcomeForKnownDocument(
  DocumentProcessingOutcome pipelineOutcome,
  FailureCounters existingCounters) {
- return ProcessingOutcomeTransition.forKnownDocument(pipelineOutcome, existingCounters);
+ return ProcessingOutcomeTransition.forKnownDocument(pipelineOutcome, existingCounters, maxRetriesTransient);
  }

  // =========================================================================
@@ -717,12 +854,13 @@ public class DocumentProcessingCoordinator {
  DocumentRecord existingRecord,
  SourceDocumentCandidate candidate,
  FailureCounters updatedCounters,
+ ProcessingStatus targetStatus,
  Instant now) {
  return new DocumentRecord(
  existingRecord.fingerprint(),
  new SourceDocumentLocator(candidate.locator().value()),
  candidate.uniqueIdentifier(),
- ProcessingStatus.FAILED_RETRYABLE,
+ targetStatus,
  updatedCounters,
  now, // lastFailureInstant
  existingRecord.lastSuccessInstant(),
@@ -764,11 +902,27 @@ public class DocumentProcessingCoordinator {
  recordWriter.accept(txOps);
  });

- logger.info("Document '{}' processed: status={}, contentErrors={}, transientErrors={}.",
- candidate.uniqueIdentifier(),
- outcome.overallStatus(),
- outcome.counters().contentErrorCount(),
- outcome.counters().transientErrorCount());
+ if (outcome.overallStatus() == ProcessingStatus.FAILED_RETRYABLE) {
+ logger.info("Retry decision for '{}' (fingerprint: {}): FAILED_RETRYABLE — "
+ + "will retry in later scheduler run. "
+ + "ContentErrors={}, TransientErrors={}.",
+ candidate.uniqueIdentifier(), fingerprint.sha256Hex(),
+ outcome.counters().contentErrorCount(),
+ outcome.counters().transientErrorCount());
+ } else if (outcome.overallStatus() == ProcessingStatus.FAILED_FINAL) {
+ logger.info("Retry decision for '{}' (fingerprint: {}): FAILED_FINAL — "
+ + "permanently failed, no further retry. "
+ + "ContentErrors={}, TransientErrors={}.",
+ candidate.uniqueIdentifier(), fingerprint.sha256Hex(),
+ outcome.counters().contentErrorCount(),
+ outcome.counters().transientErrorCount());
+ } else {
+ logger.info("Document '{}' processed: status={} (fingerprint: {}). "
+ + "ContentErrors={}, TransientErrors={}.",
+ candidate.uniqueIdentifier(), outcome.overallStatus(), fingerprint.sha256Hex(),
+ outcome.counters().contentErrorCount(),
+ outcome.counters().transientErrorCount());
+ }
  return true;

  } catch (DocumentPersistenceException e) {
@@ -812,6 +966,7 @@ public class DocumentProcessingCoordinator {
  yield new ProcessingAttempt(
  fingerprint, context.runId(), attemptNumber, startedAt, endedAt,
  outcome.overallStatus(), failureClass, failureMessage, outcome.retryable(),
+ activeProviderIdentifier,
  ctx.modelName(), ctx.promptIdentifier(),
  ctx.processedPageCount(), ctx.sentCharacterCount(),
  ctx.aiRawResponse(),
@@ -825,6 +980,7 @@ public class DocumentProcessingCoordinator {
  yield new ProcessingAttempt(
  fingerprint, context.runId(), attemptNumber, startedAt, endedAt,
  outcome.overallStatus(), failureClass, failureMessage, outcome.retryable(),
+ activeProviderIdentifier,
  ctx.modelName(), ctx.promptIdentifier(),
  ctx.processedPageCount(), ctx.sentCharacterCount(),
  ctx.aiRawResponse(),
@@ -837,6 +993,7 @@ public class DocumentProcessingCoordinator {
  yield new ProcessingAttempt(
  fingerprint, context.runId(), attemptNumber, startedAt, endedAt,
  outcome.overallStatus(), failureClass, failureMessage, outcome.retryable(),
+ activeProviderIdentifier,
  ctx.modelName(), ctx.promptIdentifier(),
  ctx.processedPageCount(), ctx.sentCharacterCount(),
  ctx.aiRawResponse(),
@@ -1,17 +1,17 @@
  package de.gecheckt.pdf.umbenenner.application.service;

- import java.util.Objects;
-
  import de.gecheckt.pdf.umbenenner.application.config.RuntimeConfiguration;
  import de.gecheckt.pdf.umbenenner.domain.model.DocumentProcessingOutcome;
- import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailed;
- import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailureReason;
- import de.gecheckt.pdf.umbenenner.domain.model.TechnicalDocumentError;
  import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionContentError;
  import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionResult;
  import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionSuccess;
  import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError;
+ import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailed;
+ import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailureReason;
  import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;

+ import java.util.Objects;
+ import de.gecheckt.pdf.umbenenner.domain.model.TechnicalDocumentError;
+
  /**
  * Orchestrates document processing pipeline: extraction → pre-checks → outcome classification.
@@ -1,15 +1,15 @@
  package de.gecheckt.pdf.umbenenner.application.service;

- import java.util.Objects;
-
  import de.gecheckt.pdf.umbenenner.application.config.RuntimeConfiguration;
  import de.gecheckt.pdf.umbenenner.domain.model.DocumentProcessingOutcome;
- import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailureReason;
- import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailed;
- import de.gecheckt.pdf.umbenenner.domain.model.PreCheckPassed;
  import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionSuccess;
+ import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailed;
+ import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailureReason;
+ import de.gecheckt.pdf.umbenenner.domain.model.PreCheckPassed;
  import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;

+ import java.util.Objects;
+
  /**
  * Evaluates whether a successfully extracted PDF passes pre-checks.
  * <p>
@@ -10,11 +10,13 @@ import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
  import de.gecheckt.pdf.umbenenner.domain.model.TechnicalDocumentError;

  /**
- * Pure status and counter transition policy for document processing outcomes.
+ * Authoritative, stateless retry decision rule for all document processing outcomes.
  * <p>
- * This class encapsulates the deterministic rules for mapping a pipeline outcome
- * (pre-check, naming proposal, or failure) to a processing status, updated
- * failure counters, and retryability flag.
+ * This class is the single production source of truth for mapping a pipeline outcome
+ * (pre-check, naming proposal, or failure) to a processing status, updated failure
+ * counters, and retryability flag. Both the AI pipeline path and the target-copy
+ * finalization path in {@link DocumentProcessingCoordinator} delegate to this class,
+ * so that no duplicate retry logic exists elsewhere.
  * <p>
  * The transition logic is independent of persistence, orchestration, or any
  * infrastructure concern. It is purely declarative and stateless.
@@ -36,9 +38,29 @@ import de.gecheckt.pdf.umbenenner.domain.model.TechnicalDocumentError;
  * <li><strong>AI functional failure (second or later occurrence):</strong>
  * Status becomes {@link ProcessingStatus#FAILED_FINAL},
  * content error counter incremented by 1, {@code retryable=false}.</li>
- * <li><strong>Technical error (pre-fingerprint / extraction / AI infrastructure):</strong>
+ * <li><strong>Technical error below the transient retry limit:</strong>
  * Status becomes {@link ProcessingStatus#FAILED_RETRYABLE},
  * transient error counter incremented by 1, {@code retryable=true}.</li>
+ * <li><strong>Technical error at or above the transient retry limit:</strong>
+ * Status becomes {@link ProcessingStatus#FAILED_FINAL},
+ * transient error counter incremented by 1, {@code retryable=false}.</li>
  * </ul>
  *
+ * <h2>Transient retry limit semantics</h2>
+ * <p>
+ * {@code maxRetriesTransient} is interpreted as the maximum number of historised
+ * transient error attempts per fingerprint. The attempt that causes the counter
+ * to reach {@code maxRetriesTransient} finalises the document status to
+ * {@link ProcessingStatus#FAILED_FINAL}. Valid values are integers >= 1;
+ * the value 0 is invalid startup configuration and must be rejected before
+ * the batch run begins.
+ * <p>
+ * Examples:
+ * <ul>
+ * <li>{@code maxRetriesTransient = 1}: the first historised transient error
+ * immediately finalises to {@code FAILED_FINAL}.</li>
+ * <li>{@code maxRetriesTransient = 2}: the first transient error yields
+ * {@code FAILED_RETRYABLE}; the second finalises to {@code FAILED_FINAL}.</li>
+ * </ul>
  */
 final class ProcessingOutcomeTransition {
@@ -52,24 +74,33 @@ final class ProcessingOutcomeTransition {
  * <p>
  * For new documents, all failure counters start at zero.
  *
- * @param pipelineOutcome the outcome from the processing pipeline
+ * @param pipelineOutcome the outcome from the processing pipeline
+ * @param maxRetriesTransient maximum number of historised transient error attempts
+ * before the document is finalised to {@code FAILED_FINAL};
+ * must be >= 1
  * @return the mapped outcome with status, counters, and retryability
  */
- static ProcessingOutcome forNewDocument(DocumentProcessingOutcome pipelineOutcome) {
- return forKnownDocument(pipelineOutcome, FailureCounters.zero());
+ static ProcessingOutcome forNewDocument(
+ DocumentProcessingOutcome pipelineOutcome,
+ int maxRetriesTransient) {
+ return forKnownDocument(pipelineOutcome, FailureCounters.zero(), maxRetriesTransient);
  }

  /**
  * Maps a pipeline outcome to a processing outcome, considering the existing
  * failure counter state from a known document's history.
  *
- * @param pipelineOutcome the outcome from the processing pipeline
- * @param existingCounters the current failure counter values from the document's master record
+ * @param pipelineOutcome the outcome from the processing pipeline
+ * @param existingCounters the current failure counter values from the document's master record
+ * @param maxRetriesTransient maximum number of historised transient error attempts
+ * before the document is finalised to {@code FAILED_FINAL};
+ * must be >= 1
  * @return the mapped outcome with updated status, counters, and retryability
  */
  static ProcessingOutcome forKnownDocument(
  DocumentProcessingOutcome pipelineOutcome,
- FailureCounters existingCounters) {
+ FailureCounters existingCounters,
+ int maxRetriesTransient) {

  return switch (pipelineOutcome) {
  case NamingProposalReady ignored -> {
@@ -106,31 +137,37 @@ final class ProcessingOutcomeTransition {
  }

  case TechnicalDocumentError ignored4 -> {
- // Technical error (extraction / infrastructure): retryable, transient counter +1
+ // Technical error (extraction / infrastructure): apply transient retry limit
+ FailureCounters updatedCounters = existingCounters.withIncrementedTransientErrorCount();
+ boolean limitReached = updatedCounters.transientErrorCount() >= maxRetriesTransient;
  yield new ProcessingOutcome(
- ProcessingStatus.FAILED_RETRYABLE,
- existingCounters.withIncrementedTransientErrorCount(),
- true
+ limitReached ? ProcessingStatus.FAILED_FINAL : ProcessingStatus.FAILED_RETRYABLE,
+ updatedCounters,
+ !limitReached
  );
  }

  case AiTechnicalFailure ignored5 -> {
- // Technical AI error (timeout, unreachable, bad JSON): retryable, transient counter +1
+ // Technical AI error (timeout, unreachable, bad JSON): apply transient retry limit
+ FailureCounters updatedCounters = existingCounters.withIncrementedTransientErrorCount();
+ boolean limitReached = updatedCounters.transientErrorCount() >= maxRetriesTransient;
  yield new ProcessingOutcome(
- ProcessingStatus.FAILED_RETRYABLE,
- existingCounters.withIncrementedTransientErrorCount(),
- true
+ limitReached ? ProcessingStatus.FAILED_FINAL : ProcessingStatus.FAILED_RETRYABLE,
+ updatedCounters,
+ !limitReached
  );
  }

  case de.gecheckt.pdf.umbenenner.domain.model.PreCheckPassed ignored6 -> {
  // Pre-check passed without AI step: in normal flow this should not appear at
  // the outcome transition level once the AI pipeline is fully wired. Treat it
- // as a technical error to avoid silent inconsistency.
+ // as a technical error and apply the transient retry limit.
+ FailureCounters updatedCounters = existingCounters.withIncrementedTransientErrorCount();
+ boolean limitReached = updatedCounters.transientErrorCount() >= maxRetriesTransient;
  yield new ProcessingOutcome(
- ProcessingStatus.FAILED_RETRYABLE,
- existingCounters.withIncrementedTransientErrorCount(),
- true
+ limitReached ? ProcessingStatus.FAILED_FINAL : ProcessingStatus.FAILED_RETRYABLE,
+ updatedCounters,
+ !limitReached
  );
  }
  };

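Each switch arm above repeats the same increment-then-compare pattern. A minimal stand-alone sketch of that pattern, with `Outcome` and a bare `int` counter as hypothetical stand-ins for the real `ProcessingOutcome` and `FailureCounters` types:

```java
// Sketch of the limitReached derivation shared by the switch arms above;
// Outcome and the int counter are simplified stand-ins, not the real types.
record Outcome(String status, int transientErrorCount, boolean retryable) {}

final class TransitionSketch {
    // Increment first, then compare against maxRetriesTransient: the attempt
    // that makes the counter reach the limit is the finalising one.
    static Outcome technicalError(int transientErrorsBefore, int maxRetriesTransient) {
        int updated = transientErrorsBefore + 1;
        boolean limitReached = updated >= maxRetriesTransient;
        return new Outcome(
                limitReached ? "FAILED_FINAL" : "FAILED_RETRYABLE",
                updated,
                !limitReached);
    }
}
```

With `maxRetriesTransient = 2` this reproduces the documented examples: the first transient error stays retryable, the second finalises.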
@@ -0,0 +1,107 @@
package de.gecheckt.pdf.umbenenner.application.service;

import de.gecheckt.pdf.umbenenner.application.port.out.DocumentErrorClassification;
import de.gecheckt.pdf.umbenenner.application.port.out.FailureCounters;
import de.gecheckt.pdf.umbenenner.application.port.out.ImmediateRetryDecision;
import de.gecheckt.pdf.umbenenner.application.port.out.RetryDecision;

/**
 * Application service contract for deriving authoritative retry decisions from
 * document error state and configuration.
 * <p>
 * This interface defines the single, testable entry point for all retry policy
 * evaluations. Implementations must apply the binding retry rules exactly
 * as specified:
 * <ul>
 * <li><strong>Deterministic content errors</strong> ({@link DocumentErrorClassification#DETERMINISTIC_CONTENT_ERROR}):
 * the <em>first</em> historised content error for a fingerprint results in
 * {@link RetryDecision.ContentErrorRetryable}; the <em>second</em> results in
 * {@link RetryDecision.ContentErrorFinal}.</li>
 * <li><strong>Transient technical errors</strong> ({@link DocumentErrorClassification#TRANSIENT_TECHNICAL_ERROR}):
 * the error remains retryable while the transient-error counter after incrementing
 * stays strictly below {@code maxRetriesTransient}. When the counter after
 * incrementing reaches {@code maxRetriesTransient}, the result is
 * {@link RetryDecision.TransientErrorFinal}.</li>
 * <li><strong>Target copy failures</strong> ({@link DocumentErrorClassification#TARGET_COPY_TECHNICAL_ERROR})
 * on the <em>first</em> copy attempt within a run: result is
 * {@link RetryDecision.TargetCopyWithImmediateRetry}. After the immediate retry
 * has itself failed, the failure is re-evaluated as a
 * {@link DocumentErrorClassification#TRANSIENT_TECHNICAL_ERROR}.</li>
 * </ul>
 * <p>
 * <strong>Counter semantics:</strong>
 * <ul>
 * <li>The {@code currentCounters} passed to {@link #evaluate} reflect the state
 * <em>before</em> the current attempt's counter increment. The evaluator is
 * responsible for determining what the counter will be after incrementing and
 * applying the threshold check accordingly.</li>
 * <li>Skip events ({@code SKIPPED_ALREADY_PROCESSED}, {@code SKIPPED_FINAL_FAILURE})
 * are not routed through this evaluator and never produce a
 * {@link RetryDecision}.</li>
 * </ul>
 * <p>
 * <strong>{@code maxRetriesTransient} invariant:</strong>
 * The value must be an integer ≥ 1. A value of {@code 0} is invalid configuration
 * and must be rejected at startup before any batch run begins. Implementations of
 * this interface may assume the value is always ≥ 1 when called.
 * <p>
 * Example for {@code maxRetriesTransient = 1}:
 * <ul>
 * <li>transient-error counter before = 0 → after increment = 1 = limit → {@link RetryDecision.TransientErrorFinal}</li>
 * </ul>
 * Example for {@code maxRetriesTransient = 2}:
 * <ul>
 * <li>transient-error counter before = 0 → after increment = 1 < 2 → {@link RetryDecision.TransientErrorRetryable}</li>
 * <li>transient-error counter before = 1 → after increment = 2 = limit → {@link RetryDecision.TransientErrorFinal}</li>
 * </ul>
 * <p>
 * <strong>Single-truth rule:</strong> No parallel persistence source for retry
 * decisions is introduced. Evaluations are derived solely from the document master
 * record's failure counters and the configured limit.
 */
public interface RetryDecisionEvaluator {

    /**
     * Derives the authoritative retry decision for a document-level error.
     * <p>
     * The decision is determined by the error classification, the existing failure
     * counters (before any increment for the current attempt), and the configured
     * transient-retry limit.
     *
     * @param errorClass          classification of the error that occurred; never {@code null}
     * @param currentCounters     failure counters <em>before</em> incrementing for this
     *                            attempt; never {@code null}
     * @param maxRetriesTransient configured maximum number of historised transient errors
     *                            allowed per fingerprint; must be ≥ 1
     * @param failureClass        short, stable failure class identifier for persistence
     *                            and logging; never {@code null} or blank
     * @param failureMessage      human-readable description of the error; never {@code null}
     *                            or blank
     * @return the authoritative {@link RetryDecision}; never {@code null}
     * @throws IllegalArgumentException if {@code maxRetriesTransient} is less than 1
     */
    RetryDecision evaluate(
            DocumentErrorClassification errorClass,
            FailureCounters currentCounters,
            int maxRetriesTransient,
            String failureClass,
            String failureMessage);

    /**
     * Determines whether an immediate within-run retry of the target copy operation
     * is permitted.
     * <p>
     * An immediate retry is {@link ImmediateRetryDecision#ALLOWED} only when the copy
     * has failed on its first attempt within the current run. If this is the second
     * copy attempt within the same run (i.e. the immediate retry itself has failed),
     * the result is {@link ImmediateRetryDecision#DENIED}.
     *
     * @param isFirstCopyAttemptInThisRun {@code true} if the failing copy attempt was
     *                                    the first copy attempt for this document in
     *                                    the current run; {@code false} if it was the
     *                                    immediate retry attempt
     * @return {@link ImmediateRetryDecision#ALLOWED} or {@link ImmediateRetryDecision#DENIED};
     *         never {@code null}
     */
    ImmediateRetryDecision evaluateImmediateRetry(boolean isFirstCopyAttemptInThisRun);
}
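The transient-counter rule documented in the Javadoc above can be condensed into a standalone sketch. The class, enum, and method names here are illustrative and not part of the repository; the real evaluator works on `FailureCounters` and the sealed `RetryDecision` hierarchy.

```java
// Sketch of the transient-error threshold rule, assuming simplified stand-in types.
class TransientRetrySketch {

    enum Decision { TRANSIENT_ERROR_RETRYABLE, TRANSIENT_ERROR_FINAL }

    // The counter reflects the state BEFORE the current attempt; the evaluator
    // applies the increment itself and then checks it against the limit.
    static Decision evaluateTransient(int transientErrorsBefore, int maxRetriesTransient) {
        if (maxRetriesTransient < 1) {
            throw new IllegalArgumentException("maxRetriesTransient must be >= 1");
        }
        int afterIncrement = transientErrorsBefore + 1;
        return afterIncrement < maxRetriesTransient
                ? Decision.TRANSIENT_ERROR_RETRYABLE
                : Decision.TRANSIENT_ERROR_FINAL;
    }

    public static void main(String[] args) {
        // maxRetriesTransient = 2: the first transient error stays retryable,
        // the second reaches the limit and becomes final.
        System.out.println(evaluateTransient(0, 2)); // TRANSIENT_ERROR_RETRYABLE
        System.out.println(evaluateTransient(1, 2)); // TRANSIENT_ERROR_FINAL
        // maxRetriesTransient = 1: the very first transient error is already final.
        System.out.println(evaluateTransient(0, 1)); // TRANSIENT_ERROR_FINAL
    }
}
```

This mirrors the worked examples in the Javadoc: the check is "counter after increment strictly below the limit", so a limit of 1 leaves no retryable window at all.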
@@ -1,10 +1,10 @@
package de.gecheckt.pdf.umbenenner.application.service;

import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingAttempt;

import java.time.LocalDate;
import java.util.Objects;

import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingAttempt;

/**
 * Stateless service for building the base target filename from a leading naming proposal.
 * <p>
@@ -91,12 +91,17 @@ public final class TargetFilenameBuildingService {
     * <ul>
     * <li>Resolved date must be non-null.</li>
     * <li>Validated title must be non-null and non-blank.</li>
     * <li>Validated title must not exceed 20 characters.</li>
     * <li>Validated title must contain only letters, digits, and spaces.</li>
     * <li>Validated title must not exceed 20 characters (before Windows cleaning).</li>
     * <li>After Windows-character cleaning, title must contain only letters, digits, and spaces.</li>
     * </ul>
     * If any rule is violated, the state is treated as an
     * {@link InconsistentProposalState}.
     * <p>
     * Windows compatibility: Windows-incompatible characters
     * (e.g., {@code < > : " / \ | ? *}) are removed from the title before final validation.
     * This ensures the resulting filename can be created on Windows systems.
     * The 20-character rule is applied to the original title before cleaning.
     * <p>
     * The 20-character limit applies exclusively to the base title. A duplicate-avoidance
     * suffix (e.g., {@code (1)}) may be appended by the target folder adapter after this
     * method returns and is not counted against the 20 characters.
@@ -127,15 +132,25 @@ public final class TargetFilenameBuildingService {
                    + title + "'");
        }

        if (!isAllowedTitleCharacters(title)) {
        // Remove Windows-incompatible characters to enable technical Windows compatibility
        String cleanedTitle = removeWindowsIncompatibleCharacters(title);

        if (cleanedTitle.isBlank()) {
            return new InconsistentProposalState(
                    "Leading PROPOSAL_READY attempt has title with disallowed characters "
                            + "(only letters, digits, and spaces are permitted): '"
                    "Title becomes empty after Windows-compatibility cleaning: '"
                            + title + "'");
        }

        // After cleaning, verify that only letters, digits, and spaces remain
        if (!isAllowedTitleCharacters(cleanedTitle)) {
            return new InconsistentProposalState(
                    "After Windows-compatibility cleaning, title contains disallowed characters "
                            + "(only letters, digits, and spaces are permitted): '"
                            + cleanedTitle + "'");
        }

        // Build: YYYY-MM-DD - Titel.pdf
        String baseFilename = date + " - " + title + ".pdf";
        String baseFilename = date + " - " + cleanedTitle + ".pdf";
        return new BaseFilenameReady(baseFilename);
    }

@@ -156,4 +171,21 @@ public final class TargetFilenameBuildingService {
        }
        return true;
    }

    /**
     * Removes characters that are incompatible with Windows filenames.
     * <p>
     * Windows-incompatible characters are: {@code < > : " / \ | ? *}
     * <p>
     * This is a defensive measure for ensuring Windows compatibility. The characters are
     * simply removed; no replacement is performed. Unicode letters (including Umlauts and ß)
     * and spaces are retained.
     *
     * @param title the title to clean; must not be null
     * @return the cleaned title with Windows-incompatible characters removed
     */
    private static String removeWindowsIncompatibleCharacters(String title) {
        // Windows-incompatible characters: < > : " / \ | ? *
        return title.replaceAll("[<>:\"/\\\\|?*]", "");
    }
}

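The Windows-character cleaning added in this diff boils down to a single regex removal. A minimal standalone sketch (the wrapping class is illustrative; the method body matches the diff):

```java
// Sketch of the Windows filename cleaning step from the diff above.
class WindowsFilenameCleaningSketch {

    // Windows-incompatible filename characters: < > : " / \ | ? *
    // The backslash is escaped twice: once for the Java string literal,
    // once for the regex character class.
    static String removeWindowsIncompatibleCharacters(String title) {
        return title.replaceAll("[<>:\"/\\\\|?*]", "");
    }

    public static void main(String[] args) {
        // Umlauts, digits, and spaces survive; reserved characters are dropped.
        System.out.println(removeWindowsIncompatibleCharacters("Rechnung: Müller/2026?"));
        // -> "Rechnung Müller2026"
    }
}
```

Note that removal (rather than replacement with `_`) can make two distinct titles collide after cleaning; the duplicate-avoidance suffix mentioned in the Javadoc covers that case downstream.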
@@ -1,5 +1,9 @@
package de.gecheckt.pdf.umbenenner.application.usecase;

import java.time.Instant;
import java.util.List;
import java.util.Objects;

import de.gecheckt.pdf.umbenenner.application.config.RuntimeConfiguration;
import de.gecheckt.pdf.umbenenner.application.port.in.BatchRunOutcome;
import de.gecheckt.pdf.umbenenner.application.port.in.BatchRunProcessingUseCase;
@@ -23,10 +27,6 @@ import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionResult;
import de.gecheckt.pdf.umbenenner.domain.model.PreCheckPassed;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;

import java.time.Instant;
import java.util.List;
import java.util.Objects;

/**
 * Batch processing implementation of {@link BatchRunProcessingUseCase}.
 * <p>
@@ -159,13 +159,19 @@ public class DefaultBatchRunProcessingUseCase implements BatchRunProcessingUseCa
    /**
     * Loads candidates and processes them one by one.
     * <p>
     * Tracks whether any document-level persistence failures occur during processing.
     * A persistence failure for a single document causes the overall batch outcome
     * to be FAILURE instead of SUCCESS.
     * Document-level failures — including content errors, transient technical errors,
     * and individual persistence failures — do not affect the batch outcome. The batch
     * completes with {@link BatchRunOutcome#SUCCESS} as long as the source folder is accessible
     * and the processing loop runs to completion without a hard infrastructure error.
     * Document-level persistence failures are logged by the coordinator and retried in
     * subsequent runs; they must not escalate to a hard batch failure.
     * <p>
     * Only a hard source folder access failure ({@link SourceDocumentAccessException}) prevents
     * the batch from running at all, in which case {@link BatchRunOutcome#FAILURE} is returned.
     *
     * @param context the current batch run context
     * @return SUCCESS if all candidates were processed without persistence failures,
     *         FAILURE if source access fails or any document-level persistence failure occurred
     * @return {@link BatchRunOutcome#SUCCESS} after all candidates have been processed,
     *         or {@link BatchRunOutcome#FAILURE} if the source folder is inaccessible
     */
    private BatchRunOutcome processCandidates(BatchRunContext context) {
        List<SourceDocumentCandidate> candidates;
@@ -177,24 +183,13 @@ public class DefaultBatchRunProcessingUseCase implements BatchRunProcessingUseCa
        }
        logger.info("Found {} PDF candidate(s) in source folder.", candidates.size());

        // Track whether any document-level persistence failures occurred
        boolean anyPersistenceFailure = false;

        // Process each candidate
        for (SourceDocumentCandidate candidate : candidates) {
            if (!processCandidate(candidate, context)) {
                anyPersistenceFailure = true;
            }
            processCandidate(candidate, context);
        }

        logger.info("Batch run completed. Processed {} candidate(s). RunId: {}",
                candidates.size(), context.runId());

        if (anyPersistenceFailure) {
            logger.warn("Batch run completed with document-level persistence failure(s).");
            return BatchRunOutcome.FAILURE;
        }

        return BatchRunOutcome.SUCCESS;
    }

@@ -222,23 +217,30 @@ public class DefaultBatchRunProcessingUseCase implements BatchRunProcessingUseCa
     * <p>
     * Processing order:
     * <ol>
     * <li><strong>Log:</strong> detected source file at INFO level with run-ID (pre-fingerprint
     * correlation via run-ID and candidate description).</li>
     * <li>Record the attempt start instant.</li>
     * <li>Compute the SHA-256 fingerprint of the candidate file content.</li>
     * <li>If fingerprint computation fails: log as non-identifiable run event and
     * return true — no SQLite record is created, but no persistence failure occurred.</li>
     * <li>If fingerprint computation fails: log as non-identifiable run event with run-ID
     * and return true — no SQLite record is created, no persistence failure.</li>
     * <li>Load document master record.</li>
     * <li>If already {@code SUCCESS} → persist skip attempt with
     * {@code SKIPPED_ALREADY_PROCESSED}.</li>
     * <li>If already {@code FAILED_FINAL} → persist skip attempt with
     * {@code SKIPPED_FINAL_FAILURE}.</li>
     * <li>Otherwise execute the pipeline (extraction + pre-checks).</li>
     * <li>Map result into status, counters and retryable flag.</li>
     * <li>If already {@code SUCCESS} → log skip at INFO level with fingerprint;
     * persist skip attempt with {@code SKIPPED_ALREADY_PROCESSED}.</li>
     * <li>If already {@code FAILED_FINAL} → log skip at INFO level with fingerprint;
     * persist skip attempt with {@code SKIPPED_FINAL_FAILURE}.</li>
     * <li>Otherwise execute the pipeline (extraction + pre-checks + AI naming).</li>
     * <li>Map result into status, counters, and retryable flag.</li>
     * <li><strong>Log:</strong> retry decision at INFO level with fingerprint and error
     * classification (FAILED_RETRYABLE or FAILED_FINAL).</li>
     * <li>Persist exactly one historised processing attempt.</li>
     * <li>Persist the updated document master record.</li>
     * </ol>
     * <p>
     * Per-document errors do not abort the overall batch run. Each candidate finishes
     * in a controlled manner regardless of its outcome.
     * <p>
     * Post-fingerprint log entries carry the document fingerprint for correlation.
     * Pre-fingerprint log entries (steps 1–4) use run-ID and candidate description.
     *
     * @param candidate the candidate to process
     * @param context   the current batch run context
@@ -246,14 +248,15 @@ public class DefaultBatchRunProcessingUseCase implements BatchRunProcessingUseCa
     *         errors return true; persistence failures return false)
     */
    private boolean processCandidate(SourceDocumentCandidate candidate, BatchRunContext context) {
        logger.debug("Processing candidate: {}", candidate.uniqueIdentifier());
        logger.info("Detected source file '{}' for processing (RunId: {}).",
                candidate.uniqueIdentifier(), context.runId());

        Instant attemptStart = Instant.now();
        FingerprintResult fingerprintResult = fingerprintPort.computeFingerprint(candidate);

        return switch (fingerprintResult) {
            case FingerprintTechnicalError fingerprintError -> {
                handleFingerprintError(candidate, fingerprintError);
                handleFingerprintError(candidate, fingerprintError, context);
                yield true; // fingerprint errors are not persistence failures
            }
            case FingerprintSuccess fingerprintSuccess ->
@@ -262,15 +265,23 @@ public class DefaultBatchRunProcessingUseCase implements BatchRunProcessingUseCa
    }

    /**
     * Handles a fingerprint computation error by logging it as a non-identifiable event.
     * Handles a fingerprint computation error by logging it as a non-identifiable run event.
     * No SQLite record is created for this candidate.
     * <p>
     * Log entries before a successful fingerprint are correlated via the batch run identifier
     * and the candidate description, as no fingerprint is available for document-level
     * correlation.
     *
     * @param candidate the candidate that could not be fingerprinted
     * @param error     the fingerprint error
     * @param error     the fingerprint error
     * @param context   the current batch run context; used for run-level log correlation
     */
    private void handleFingerprintError(SourceDocumentCandidate candidate, FingerprintTechnicalError error) {
        logger.warn("Fingerprint computation failed for '{}': {} — candidate skipped (not historised).",
                candidate.uniqueIdentifier(), error.errorMessage());
    private void handleFingerprintError(
            SourceDocumentCandidate candidate,
            FingerprintTechnicalError error,
            BatchRunContext context) {
        logger.warn("Fingerprint computation failed for '{}' (RunId: {}): {} — candidate not historised.",
                candidate.uniqueIdentifier(), context.runId(), error.errorMessage());
    }

    /**
@@ -382,10 +393,10 @@ public class DefaultBatchRunProcessingUseCase implements BatchRunProcessingUseCa
    private void logProcessingOutcome(SourceDocumentCandidate candidate, DocumentProcessingOutcome outcome) {
        switch (outcome) {
            case de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailed failed ->
                logger.info("Pre-checks FAILED for '{}': {} (Deterministic content error).",
                logger.info("Pre-checks failed for '{}': {} (deterministic content error).",
                        candidate.uniqueIdentifier(), failed.failureReasonDescription());
            case de.gecheckt.pdf.umbenenner.domain.model.TechnicalDocumentError technicalError ->
                logger.warn("Processing FAILED for '{}': {} (Technical error – retryable).",
                logger.warn("Processing failed for '{}': {} (transient technical error – retryable).",
                        candidate.uniqueIdentifier(), technicalError.errorMessage());
            case de.gecheckt.pdf.umbenenner.domain.model.NamingProposalReady ready ->
                logger.info("AI naming proposal ready for '{}': title='{}', date={}.",
@@ -393,10 +404,10 @@ public class DefaultBatchRunProcessingUseCase implements BatchRunProcessingUseCa
                        ready.proposal().validatedTitle(),
                        ready.proposal().resolvedDate());
            case de.gecheckt.pdf.umbenenner.domain.model.AiTechnicalFailure aiTechnical ->
                logger.warn("AI technical failure for '{}': {} (Transient – retryable).",
                logger.warn("AI invocation failed for '{}': {} (transient technical error – retryable).",
                        candidate.uniqueIdentifier(), aiTechnical.errorMessage());
            case de.gecheckt.pdf.umbenenner.domain.model.AiFunctionalFailure aiFunctional ->
                logger.info("AI functional failure for '{}': {} (Deterministic content error).",
                logger.info("AI naming failed for '{}': {} (deterministic content error).",
                        candidate.uniqueIdentifier(), aiFunctional.errorMessage());
            default -> { /* other outcomes are handled elsewhere */ }
        }

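The batch-outcome change above (per-document failures no longer flip the batch to FAILURE; only an inaccessible source folder does) can be sketched in isolation. All names here are illustrative stand-ins, not the repository's types:

```java
// Sketch of the revised batch-outcome semantics, under simplified assumptions:
// candidates are modeled as Runnables and document-level errors as RuntimeExceptions.
class BatchOutcomeSketch {

    enum Outcome { SUCCESS, FAILURE }

    static Outcome runBatch(java.util.List<Runnable> candidates, boolean sourceFolderAccessible) {
        if (!sourceFolderAccessible) {
            return Outcome.FAILURE; // hard infrastructure error: the batch cannot run at all
        }
        for (Runnable candidate : candidates) {
            try {
                candidate.run();
            } catch (RuntimeException documentLevelError) {
                // Document-level error: handled (logged, historised, retried in a
                // later run) per document; it does not escalate to the batch.
            }
        }
        return Outcome.SUCCESS;
    }

    public static void main(String[] args) {
        java.util.List<Runnable> candidates = java.util.List.of(
                () -> { },                                          // succeeds
                () -> { throw new RuntimeException("transient"); }  // fails per-document
        );
        System.out.println(runBatch(candidates, true));  // SUCCESS despite one failure
        System.out.println(runBatch(candidates, false)); // FAILURE: source inaccessible
    }
}
```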
@@ -1,8 +1,14 @@
package de.gecheckt.pdf.umbenenner.application.port.in;

import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertNotEquals;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertNotSame;
import static org.junit.jupiter.api.Assertions.assertSame;
import static org.junit.jupiter.api.Assertions.assertTrue;

import static org.junit.jupiter.api.Assertions.*;
import org.junit.jupiter.api.Test;

/**
 * Unit tests for {@link BatchRunOutcome} enumeration.

@@ -0,0 +1,191 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertInstanceOf;
import static org.junit.jupiter.api.Assertions.assertNotEquals;

import org.junit.jupiter.api.Test;

import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.RunId;

/**
 * Tests for the {@link DocumentLogCorrelation} sealed type and its two permitted implementations.
 * <p>
 * Verifies:
 * <ul>
 * <li>{@link DocumentLogCorrelation.CandidateCorrelation} stores the run identifier and
 * candidate description correctly (pre-fingerprint phase).</li>
 * <li>{@link DocumentLogCorrelation.FingerprintCorrelation} stores the run identifier and
 * fingerprint correctly (post-fingerprint phase).</li>
 * <li>The sealed type contract: only the two permitted subtypes exist.</li>
 * </ul>
 */
class DocumentLogCorrelationTest {

    private static final String RUN_ID_VALUE = "run-correlation-test-001";
    private static final String CANDIDATE_DESCRIPTION = "invoice-2026-01-15.pdf";
    private static final String FINGERPRINT_HEX = "a".repeat(64);

    // -------------------------------------------------------------------------
    // CandidateCorrelation – pre-fingerprint phase
    // -------------------------------------------------------------------------

    @Test
    void candidateCorrelation_storesRunId() {
        RunId runId = new RunId(RUN_ID_VALUE);
        DocumentLogCorrelation.CandidateCorrelation correlation =
                new DocumentLogCorrelation.CandidateCorrelation(runId, CANDIDATE_DESCRIPTION);

        assertEquals(runId, correlation.runId());
    }

    @Test
    void candidateCorrelation_storesCandidateDescription() {
        RunId runId = new RunId(RUN_ID_VALUE);
        DocumentLogCorrelation.CandidateCorrelation correlation =
                new DocumentLogCorrelation.CandidateCorrelation(runId, CANDIDATE_DESCRIPTION);

        assertEquals(CANDIDATE_DESCRIPTION, correlation.candidateDescription());
    }

    @Test
    void candidateCorrelation_runIdAccessibleViaInterface() {
        RunId runId = new RunId(RUN_ID_VALUE);
        DocumentLogCorrelation correlation =
                new DocumentLogCorrelation.CandidateCorrelation(runId, CANDIDATE_DESCRIPTION);

        // runId() is declared on the sealed interface and must be accessible polymorphically
        assertEquals(runId, correlation.runId());
    }

    @Test
    void candidateCorrelation_twoInstancesWithSameDataAreEqual() {
        RunId runId = new RunId(RUN_ID_VALUE);
        DocumentLogCorrelation.CandidateCorrelation first =
                new DocumentLogCorrelation.CandidateCorrelation(runId, CANDIDATE_DESCRIPTION);
        DocumentLogCorrelation.CandidateCorrelation second =
                new DocumentLogCorrelation.CandidateCorrelation(runId, CANDIDATE_DESCRIPTION);

        assertEquals(first, second);
    }

    @Test
    void candidateCorrelation_implementsDocumentLogCorrelation() {
        RunId runId = new RunId(RUN_ID_VALUE);
        DocumentLogCorrelation.CandidateCorrelation correlation =
                new DocumentLogCorrelation.CandidateCorrelation(runId, CANDIDATE_DESCRIPTION);

        assertInstanceOf(DocumentLogCorrelation.class, correlation);
    }

    // -------------------------------------------------------------------------
    // FingerprintCorrelation – post-fingerprint phase
    // -------------------------------------------------------------------------

    @Test
    void fingerprintCorrelation_storesRunId() {
        RunId runId = new RunId(RUN_ID_VALUE);
        DocumentFingerprint fingerprint = new DocumentFingerprint(FINGERPRINT_HEX);
        DocumentLogCorrelation.FingerprintCorrelation correlation =
                new DocumentLogCorrelation.FingerprintCorrelation(runId, fingerprint);

        assertEquals(runId, correlation.runId());
    }

    @Test
    void fingerprintCorrelation_storesFingerprint() {
        RunId runId = new RunId(RUN_ID_VALUE);
        DocumentFingerprint fingerprint = new DocumentFingerprint(FINGERPRINT_HEX);
        DocumentLogCorrelation.FingerprintCorrelation correlation =
                new DocumentLogCorrelation.FingerprintCorrelation(runId, fingerprint);

        assertEquals(fingerprint, correlation.fingerprint());
    }

    @Test
    void fingerprintCorrelation_runIdAccessibleViaInterface() {
        RunId runId = new RunId(RUN_ID_VALUE);
        DocumentFingerprint fingerprint = new DocumentFingerprint(FINGERPRINT_HEX);
        DocumentLogCorrelation correlation =
                new DocumentLogCorrelation.FingerprintCorrelation(runId, fingerprint);

        // runId() is declared on the sealed interface and must be accessible polymorphically
        assertEquals(runId, correlation.runId());
    }

    @Test
    void fingerprintCorrelation_twoInstancesWithSameDataAreEqual() {
        RunId runId = new RunId(RUN_ID_VALUE);
        DocumentFingerprint fingerprint = new DocumentFingerprint(FINGERPRINT_HEX);
        DocumentLogCorrelation.FingerprintCorrelation first =
                new DocumentLogCorrelation.FingerprintCorrelation(runId, fingerprint);
        DocumentLogCorrelation.FingerprintCorrelation second =
                new DocumentLogCorrelation.FingerprintCorrelation(runId, fingerprint);

        assertEquals(first, second);
    }

    @Test
    void fingerprintCorrelation_implementsDocumentLogCorrelation() {
        RunId runId = new RunId(RUN_ID_VALUE);
        DocumentFingerprint fingerprint = new DocumentFingerprint(FINGERPRINT_HEX);
        DocumentLogCorrelation.FingerprintCorrelation correlation =
                new DocumentLogCorrelation.FingerprintCorrelation(runId, fingerprint);

        assertInstanceOf(DocumentLogCorrelation.class, correlation);
    }

    // -------------------------------------------------------------------------
    // Sealed type structural contract
    // -------------------------------------------------------------------------

    @Test
    void sealedType_patternMatchExhaustsAllPermittedSubtypes() {
        RunId runId = new RunId(RUN_ID_VALUE);

        DocumentLogCorrelation candidatePhase =
                new DocumentLogCorrelation.CandidateCorrelation(runId, CANDIDATE_DESCRIPTION);
        DocumentLogCorrelation fingerprintPhase =
                new DocumentLogCorrelation.FingerprintCorrelation(runId, new DocumentFingerprint(FINGERPRINT_HEX));

        // Pattern match on the sealed type must compile exhaustively for exactly these two cases
        String candidatePhaseResult = describe(candidatePhase);
        String fingerprintPhaseResult = describe(fingerprintPhase);

        assertEquals("candidate", candidatePhaseResult);
        assertEquals("fingerprint", fingerprintPhaseResult);
    }

    /** Helper method using an exhaustive switch over the sealed type. */
    private static String describe(DocumentLogCorrelation correlation) {
        return switch (correlation) {
            case DocumentLogCorrelation.CandidateCorrelation ignored -> "candidate";
            case DocumentLogCorrelation.FingerprintCorrelation ignored -> "fingerprint";
        };
    }

    @Test
    void candidateCorrelation_differentDescriptions_areNotEqual() {
        RunId runId = new RunId(RUN_ID_VALUE);
        DocumentLogCorrelation.CandidateCorrelation withFirst =
                new DocumentLogCorrelation.CandidateCorrelation(runId, "first.pdf");
        DocumentLogCorrelation.CandidateCorrelation withSecond =
                new DocumentLogCorrelation.CandidateCorrelation(runId, "second.pdf");

        assertNotEquals(withFirst, withSecond);
    }

    @Test
    void fingerprintCorrelation_differentFingerprints_areNotEqual() {
        RunId runId = new RunId(RUN_ID_VALUE);
        DocumentFingerprint first = new DocumentFingerprint("a".repeat(64));
        DocumentFingerprint second = new DocumentFingerprint("b".repeat(64));
        DocumentLogCorrelation.FingerprintCorrelation withFirst =
                new DocumentLogCorrelation.FingerprintCorrelation(runId, first);
        DocumentLogCorrelation.FingerprintCorrelation withSecond =
                new DocumentLogCorrelation.FingerprintCorrelation(runId, second);

        assertNotEquals(withFirst, withSecond);
    }
}

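The sealed type itself does not appear in this diff, only its tests. A hypothetical sketch of its shape can be inferred from them; the real `DocumentLogCorrelation` lives in `application.port.out` and uses the `RunId` and `DocumentFingerprint` value objects, for which plain Strings stand in here:

```java
// Illustrative reconstruction only — inferred from the tests, not the actual source.
class SealedCorrelationDemo {

    static String describe(DocumentLogCorrelation correlation) {
        // Exhaustive switch: no default branch needed because the type is sealed
        return switch (correlation) {
            case DocumentLogCorrelation.CandidateCorrelation ignored -> "candidate";
            case DocumentLogCorrelation.FingerprintCorrelation ignored -> "fingerprint";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(
                new DocumentLogCorrelation.CandidateCorrelation("run-1", "invoice.pdf")));
        System.out.println(describe(
                new DocumentLogCorrelation.FingerprintCorrelation("run-1", "a".repeat(64))));
    }
}

sealed interface DocumentLogCorrelation
        permits DocumentLogCorrelation.CandidateCorrelation,
                DocumentLogCorrelation.FingerprintCorrelation {

    String runId(); // common accessor declared on the sealed interface

    // Pre-fingerprint phase: correlation via run id and candidate description
    record CandidateCorrelation(String runId, String candidateDescription)
            implements DocumentLogCorrelation { }

    // Post-fingerprint phase: correlation via run id and document fingerprint
    record FingerprintCorrelation(String runId, String fingerprint)
            implements DocumentLogCorrelation { }
}
```

Sealing the interface is what makes the switch in the tests compile without a default branch: the compiler proves the two cases are exhaustive.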
@@ -1,5 +1,20 @@
package de.gecheckt.pdf.umbenenner.application.service;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.when;

import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;

import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationPort;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationSuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationTechnicalFailure;
@@ -19,20 +34,6 @@ import de.gecheckt.pdf.umbenenner.domain.model.PreCheckPassed;
import de.gecheckt.pdf.umbenenner.domain.model.PromptIdentifier;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;

import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.when;

/**
 * Unit tests for {@link AiNamingService}.
@@ -314,4 +315,13 @@ class AiNamingServiceTest {
                .isInstanceOf(IllegalArgumentException.class)
                .hasMessageContaining("maxTextCharacters must be >= 1");
    }

    @Test
    void constructor_maxTextCharactersOne_doesNotThrow() {
        // maxTextCharacters=1 is the minimum valid value (boundary test).
        // A changed-conditional-boundary mutation that changes '< 1' to '<= 1' would
        // cause this constructor call to throw — this test detects that mutation.
        new AiNamingService(aiInvocationPort, promptPort, validator, MODEL_NAME, 1);
        // No exception expected; reaching this line means the boundary is correct
    }
}

@@ -1,14 +1,15 @@
package de.gecheckt.pdf.umbenenner.application.service;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

import org.junit.jupiter.api.Test;

import de.gecheckt.pdf.umbenenner.domain.model.AiRawResponse;
import de.gecheckt.pdf.umbenenner.domain.model.AiResponseParsingFailure;
import de.gecheckt.pdf.umbenenner.domain.model.AiResponseParsingResult;
import de.gecheckt.pdf.umbenenner.domain.model.AiResponseParsingSuccess;
import de.gecheckt.pdf.umbenenner.domain.model.ParsedAiResponse;
import org.junit.jupiter.api.Test;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

/**
 * Unit tests for {@link AiResponseParser}.

@@ -1,18 +1,19 @@
package de.gecheckt.pdf.umbenenner.application.service;

import de.gecheckt.pdf.umbenenner.application.port.out.ClockPort;
import de.gecheckt.pdf.umbenenner.domain.model.DateSource;
import de.gecheckt.pdf.umbenenner.domain.model.NamingProposal;
import de.gecheckt.pdf.umbenenner.domain.model.ParsedAiResponse;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

import de.gecheckt.pdf.umbenenner.application.port.out.ClockPort;
import de.gecheckt.pdf.umbenenner.domain.model.DateSource;
import de.gecheckt.pdf.umbenenner.domain.model.NamingProposal;
import de.gecheckt.pdf.umbenenner.domain.model.ParsedAiResponse;

/**
 * Unit tests for {@link AiResponseValidator}.

@@ -0,0 +1,322 @@
package de.gecheckt.pdf.umbenenner.application.service;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertInstanceOf;
import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

import de.gecheckt.pdf.umbenenner.application.port.out.DocumentErrorClassification;
import de.gecheckt.pdf.umbenenner.application.port.out.FailureCounters;
import de.gecheckt.pdf.umbenenner.application.port.out.ImmediateRetryDecision;
import de.gecheckt.pdf.umbenenner.application.port.out.RetryDecision;

/**
 * Tests for {@link DefaultRetryDecisionEvaluator}.
 * <p>
 * Verifies the binding retry policy rules for deterministic content errors,
 * transient technical errors, target copy failures, and the within-run
 * immediate retry mechanism.
 */
class DefaultRetryDecisionEvaluatorTest {

    private static final String FAILURE_CLASS = "SOME_FAILURE";
    private static final String FAILURE_MESSAGE = "Something went wrong";

    private DefaultRetryDecisionEvaluator evaluator;

    @BeforeEach
    void setUp() {
        evaluator = new DefaultRetryDecisionEvaluator();
    }

    // -------------------------------------------------------------------------
    // Deterministic content error rules
    // -------------------------------------------------------------------------

    @Test
    void evaluate_firstContentError_returnsContentErrorRetryable() {
        FailureCounters counters = new FailureCounters(0, 0);

        RetryDecision decision = evaluator.evaluate(
                DocumentErrorClassification.DETERMINISTIC_CONTENT_ERROR,
                counters, 1, FAILURE_CLASS, FAILURE_MESSAGE);

        assertInstanceOf(RetryDecision.ContentErrorRetryable.class, decision);
        RetryDecision.ContentErrorRetryable retryable = (RetryDecision.ContentErrorRetryable) decision;
        assertEquals(FAILURE_CLASS, retryable.failureClass());
        assertEquals(FAILURE_MESSAGE, retryable.failureMessage());
    }

    @Test
    void evaluate_secondContentError_returnsContentErrorFinal() {
        FailureCounters counters = new FailureCounters(1, 0);

        RetryDecision decision = evaluator.evaluate(
                DocumentErrorClassification.DETERMINISTIC_CONTENT_ERROR,
                counters, 1, FAILURE_CLASS, FAILURE_MESSAGE);

        assertInstanceOf(RetryDecision.ContentErrorFinal.class, decision);
        RetryDecision.ContentErrorFinal finalDecision = (RetryDecision.ContentErrorFinal) decision;
        assertEquals(FAILURE_CLASS, finalDecision.failureClass());
        assertEquals(FAILURE_MESSAGE, finalDecision.failureMessage());
    }

    @Test
    void evaluate_subsequentContentErrors_alwaysReturnContentErrorFinal() {
        // Any count >= 1 results in final (covers legacy data with higher counts)
        for (int count = 1; count <= 5; count++) {
            FailureCounters counters = new FailureCounters(count, 0);

            RetryDecision decision = evaluator.evaluate(
                    DocumentErrorClassification.DETERMINISTIC_CONTENT_ERROR,
                    counters, 1, FAILURE_CLASS, FAILURE_MESSAGE);

            assertInstanceOf(RetryDecision.ContentErrorFinal.class, decision,
                    "Expected ContentErrorFinal for contentErrorCount=" + count);
        }
    }

    @Test
    void evaluate_contentError_transientCounterIsIrrelevant() {
        // Non-zero transient counter must not affect content error decision
        FailureCounters counters = new FailureCounters(0, 5);

        RetryDecision decision = evaluator.evaluate(
                DocumentErrorClassification.DETERMINISTIC_CONTENT_ERROR,
                counters, 1, FAILURE_CLASS, FAILURE_MESSAGE);

        assertInstanceOf(RetryDecision.ContentErrorRetryable.class, decision);
    }

    // -------------------------------------------------------------------------
    // Transient technical error rules
    // -------------------------------------------------------------------------

    @Test
    void evaluate_transientError_maxRetriesTransientOne_firstError_returnsTransientErrorFinal() {
        // maxRetriesTransient=1: counter before=0, after=1=limit → final immediately
        FailureCounters counters = new FailureCounters(0, 0);

        RetryDecision decision = evaluator.evaluate(
                DocumentErrorClassification.TRANSIENT_TECHNICAL_ERROR,
                counters, 1, FAILURE_CLASS, FAILURE_MESSAGE);

        assertInstanceOf(RetryDecision.TransientErrorFinal.class, decision,
                "With maxRetriesTransient=1, first transient error must be final");
        RetryDecision.TransientErrorFinal finalDecision = (RetryDecision.TransientErrorFinal) decision;
        assertEquals(FAILURE_CLASS, finalDecision.failureClass());
        assertEquals(FAILURE_MESSAGE, finalDecision.failureMessage());
    }

    @Test
    void evaluate_transientError_maxRetriesTransientTwo_firstError_returnsTransientErrorRetryable() {
        // maxRetriesTransient=2: counter before=0, after=1 < 2 → retryable
        FailureCounters counters = new FailureCounters(0, 0);

        RetryDecision decision = evaluator.evaluate(
                DocumentErrorClassification.TRANSIENT_TECHNICAL_ERROR,
                counters, 2, FAILURE_CLASS, FAILURE_MESSAGE);

        assertInstanceOf(RetryDecision.TransientErrorRetryable.class, decision);
        RetryDecision.TransientErrorRetryable retryable = (RetryDecision.TransientErrorRetryable) decision;
        assertEquals(FAILURE_CLASS, retryable.failureClass());
        assertEquals(FAILURE_MESSAGE, retryable.failureMessage());
    }

    @Test
    void evaluate_transientError_maxRetriesTransientTwo_secondError_returnsTransientErrorFinal() {
        // maxRetriesTransient=2: counter before=1, after=2=limit → final
        FailureCounters counters = new FailureCounters(0, 1);

        RetryDecision decision = evaluator.evaluate(
                DocumentErrorClassification.TRANSIENT_TECHNICAL_ERROR,
                counters, 2, FAILURE_CLASS, FAILURE_MESSAGE);

        assertInstanceOf(RetryDecision.TransientErrorFinal.class, decision,
                "With maxRetriesTransient=2, second transient error must be final");
    }

    @Test
    void evaluate_transientError_maxRetriesTransientThree_firstError_returnsRetryable() {
        FailureCounters counters = new FailureCounters(0, 0);

        RetryDecision decision = evaluator.evaluate(
                DocumentErrorClassification.TRANSIENT_TECHNICAL_ERROR,
                counters, 3, FAILURE_CLASS, FAILURE_MESSAGE);

        assertInstanceOf(RetryDecision.TransientErrorRetryable.class, decision);
    }

    @Test
    void evaluate_transientError_maxRetriesTransientThree_secondError_returnsRetryable() {
        FailureCounters counters = new FailureCounters(0, 1);

        RetryDecision decision = evaluator.evaluate(
                DocumentErrorClassification.TRANSIENT_TECHNICAL_ERROR,
                counters, 3, FAILURE_CLASS, FAILURE_MESSAGE);

        assertInstanceOf(RetryDecision.TransientErrorRetryable.class, decision);
    }

    @Test
    void evaluate_transientError_maxRetriesTransientThree_thirdError_returnsFinal() {
        // counter before=2, after=3=limit → final
        FailureCounters counters = new FailureCounters(0, 2);

        RetryDecision decision = evaluator.evaluate(
                DocumentErrorClassification.TRANSIENT_TECHNICAL_ERROR,
                counters, 3, FAILURE_CLASS, FAILURE_MESSAGE);

        assertInstanceOf(RetryDecision.TransientErrorFinal.class, decision,
                "Third transient error with maxRetriesTransient=3 must be final");
    }

    @Test
    void evaluate_transientError_contentCounterIsIrrelevant() {
        // Non-zero content error counter must not affect transient error decision
        FailureCounters counters = new FailureCounters(1, 0);

        RetryDecision decision = evaluator.evaluate(
                DocumentErrorClassification.TRANSIENT_TECHNICAL_ERROR,
                counters, 2, FAILURE_CLASS, FAILURE_MESSAGE);

        assertInstanceOf(RetryDecision.TransientErrorRetryable.class, decision);
    }

    @Test
    void evaluate_transientError_legacyDataWithHigherCounts_finalizesCorrectly() {
        // Existing legacy data may have counter values beyond normal expectations;
        // the evaluator must still apply the threshold check consistently
        FailureCounters counters = new FailureCounters(3, 5);

        RetryDecision decision = evaluator.evaluate(
                DocumentErrorClassification.TRANSIENT_TECHNICAL_ERROR,
                counters, 3, FAILURE_CLASS, FAILURE_MESSAGE);

        // counter before=5, after=6 >= 3 → final
        assertInstanceOf(RetryDecision.TransientErrorFinal.class, decision);
    }

    // -------------------------------------------------------------------------
    // Target copy technical error rule
    // -------------------------------------------------------------------------

    @Test
    void evaluate_targetCopyError_returnsTargetCopyWithImmediateRetry() {
        FailureCounters counters = new FailureCounters(0, 0);

        RetryDecision decision = evaluator.evaluate(
                DocumentErrorClassification.TARGET_COPY_TECHNICAL_ERROR,
                counters, 1, FAILURE_CLASS, FAILURE_MESSAGE);

        assertInstanceOf(RetryDecision.TargetCopyWithImmediateRetry.class, decision);
        RetryDecision.TargetCopyWithImmediateRetry immediate =
                (RetryDecision.TargetCopyWithImmediateRetry) decision;
        assertEquals(FAILURE_MESSAGE, immediate.failureMessage());
        assertEquals(DocumentErrorClassification.TARGET_COPY_TECHNICAL_ERROR.name(),
                immediate.failureClass());
    }

    @Test
    void evaluate_targetCopyError_countersAndMaxRetriesAreIgnored() {
        // Target copy decision is independent of counters and maxRetriesTransient
        FailureCounters counters = new FailureCounters(2, 3);

        RetryDecision decision = evaluator.evaluate(
                DocumentErrorClassification.TARGET_COPY_TECHNICAL_ERROR,
                counters, 5, FAILURE_CLASS, FAILURE_MESSAGE);

        assertInstanceOf(RetryDecision.TargetCopyWithImmediateRetry.class, decision);
    }

    // -------------------------------------------------------------------------
    // Immediate within-run retry decision
    // -------------------------------------------------------------------------

    @Test
    void evaluateImmediateRetry_firstAttempt_returnsAllowed() {
        ImmediateRetryDecision decision = evaluator.evaluateImmediateRetry(true);

        assertEquals(ImmediateRetryDecision.ALLOWED, decision);
    }

    @Test
    void evaluateImmediateRetry_secondAttempt_returnsDenied() {
        ImmediateRetryDecision decision = evaluator.evaluateImmediateRetry(false);

        assertEquals(ImmediateRetryDecision.DENIED, decision);
    }

    // -------------------------------------------------------------------------
    // Guard conditions
    // -------------------------------------------------------------------------

    @Test
    void evaluate_throwsWhenMaxRetriesTransientIsZero() {
        FailureCounters counters = FailureCounters.zero();

        assertThrows(IllegalArgumentException.class, () ->
                evaluator.evaluate(
                        DocumentErrorClassification.TRANSIENT_TECHNICAL_ERROR,
                        counters, 0, FAILURE_CLASS, FAILURE_MESSAGE));
    }

    @Test
    void evaluate_throwsWhenMaxRetriesTransientIsNegative() {
        FailureCounters counters = FailureCounters.zero();

        assertThrows(IllegalArgumentException.class, () ->
                evaluator.evaluate(
                        DocumentErrorClassification.TRANSIENT_TECHNICAL_ERROR,
                        counters, -1, FAILURE_CLASS, FAILURE_MESSAGE));
    }

    @Test
    void evaluate_throwsWhenErrorClassIsNull() {
        assertThrows(NullPointerException.class, () ->
                evaluator.evaluate(null, FailureCounters.zero(), 1,
                        FAILURE_CLASS, FAILURE_MESSAGE));
    }

    @Test
    void evaluate_throwsWhenCountersAreNull() {
        assertThrows(NullPointerException.class, () ->
                evaluator.evaluate(
                        DocumentErrorClassification.DETERMINISTIC_CONTENT_ERROR,
                        null, 1, FAILURE_CLASS, FAILURE_MESSAGE));
    }

    @Test
    void evaluate_throwsWhenFailureClassIsNull() {
        assertThrows(NullPointerException.class, () ->
                evaluator.evaluate(
                        DocumentErrorClassification.DETERMINISTIC_CONTENT_ERROR,
                        FailureCounters.zero(), 1, null, FAILURE_MESSAGE));
    }

    @Test
    void evaluate_throwsWhenFailureClassIsBlank() {
        assertThrows(IllegalArgumentException.class, () ->
                evaluator.evaluate(
                        DocumentErrorClassification.DETERMINISTIC_CONTENT_ERROR,
                        FailureCounters.zero(), 1, " ", FAILURE_MESSAGE));
    }

    @Test
    void evaluate_throwsWhenFailureMessageIsNull() {
        assertThrows(NullPointerException.class, () ->
                evaluator.evaluate(
                        DocumentErrorClassification.DETERMINISTIC_CONTENT_ERROR,
                        FailureCounters.zero(), 1, FAILURE_CLASS, null));
    }

    @Test
    void evaluate_throwsWhenFailureMessageIsBlank() {
        assertThrows(IllegalArgumentException.class, () ->
                evaluator.evaluate(
                        DocumentErrorClassification.DETERMINISTIC_CONTENT_ERROR,
                        FailureCounters.zero(), 1, FAILURE_CLASS, " "));
    }
}
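The rules that the test class above pins down can be condensed into a small decision function. The following is an illustrative sketch only; `RetryRuleSketch`, `ErrorClass`, and the string results are hypothetical stand-ins for the real `DefaultRetryDecisionEvaluator` and its `RetryDecision` hierarchy:

```java
// Hypothetical condensation of the retry policy the tests describe.
enum ErrorClass { CONTENT, TRANSIENT, TARGET_COPY }

final class RetryRuleSketch {
    // Content errors: the first occurrence is retryable, every later one is final.
    // Transient errors: the error being recorded counts too, so the outcome is
    // final once (transientErrorsSoFar + 1) reaches maxRetriesTransient.
    // Target copy errors: always an immediate within-run retry, counters ignored.
    static String decide(ErrorClass errorClass, int contentErrorsSoFar,
                         int transientErrorsSoFar, int maxRetriesTransient) {
        if (maxRetriesTransient < 1) {
            throw new IllegalArgumentException("maxRetriesTransient must be >= 1");
        }
        switch (errorClass) {
            case CONTENT:
                return contentErrorsSoFar >= 1 ? "CONTENT_FINAL" : "CONTENT_RETRYABLE";
            case TRANSIENT:
                return (transientErrorsSoFar + 1) >= maxRetriesTransient
                        ? "TRANSIENT_FINAL" : "TRANSIENT_RETRYABLE";
            default:
                return "TARGET_COPY_IMMEDIATE_RETRY";
        }
    }
}
```

Note the asymmetry the tests enforce: with `maxRetriesTransient = 1` the very first transient error is already final, because the failed attempt itself consumes the budget.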
File diff suppressed because it is too large
@@ -1,27 +1,31 @@
package de.gecheckt.pdf.umbenenner.application.service;

import de.gecheckt.pdf.umbenenner.application.config.RuntimeConfiguration;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentProcessingOutcome;
import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailed;
import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailureReason;
import de.gecheckt.pdf.umbenenner.domain.model.PreCheckPassed;
import de.gecheckt.pdf.umbenenner.domain.model.TechnicalDocumentError;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionContentError;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionSuccess;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError;
import de.gecheckt.pdf.umbenenner.domain.model.PdfPageCount;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertInstanceOf;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertNull;
import static org.junit.jupiter.api.Assertions.assertThrows;

import java.nio.file.Files;
import java.nio.file.Path;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;

import static org.junit.jupiter.api.Assertions.*;
import de.gecheckt.pdf.umbenenner.application.config.RuntimeConfiguration;
import de.gecheckt.pdf.umbenenner.application.port.out.AiContentSensitivity;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentProcessingOutcome;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionContentError;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionSuccess;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError;
import de.gecheckt.pdf.umbenenner.domain.model.PdfPageCount;
import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailed;
import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailureReason;
import de.gecheckt.pdf.umbenenner.domain.model.PreCheckPassed;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;
import de.gecheckt.pdf.umbenenner.domain.model.TechnicalDocumentError;

/**
 * Tests for {@link DocumentProcessingService}.

@@ -44,8 +48,8 @@ class DocumentProcessingServiceTest {
        SourceDocumentLocator locator = new SourceDocumentLocator(pdfFile.toString());
        candidate = new SourceDocumentCandidate("document.pdf", 2048L, locator);

        // Create runtime configuration with maxPages limit
        runtimeConfig = new RuntimeConfiguration(10);
        // Create runtime configuration with maxPages limit and default transient retry limit
        runtimeConfig = new RuntimeConfiguration(10, 3, AiContentSensitivity.PROTECT_SENSITIVE_CONTENT);
    }

    @Test

@@ -1,10 +1,10 @@
package de.gecheckt.pdf.umbenenner.application.service;

import org.junit.jupiter.api.Test;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

import org.junit.jupiter.api.Test;

/**
 * Unit tests for {@link DocumentTextLimiter}.
 */

@@ -1,23 +1,27 @@
package de.gecheckt.pdf.umbenenner.application.service;

import de.gecheckt.pdf.umbenenner.application.config.RuntimeConfiguration;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentProcessingOutcome;
import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailed;
import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailureReason;
import de.gecheckt.pdf.umbenenner.domain.model.PreCheckPassed;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionSuccess;
import de.gecheckt.pdf.umbenenner.domain.model.PdfPageCount;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertSame;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.nio.file.Files;
import java.nio.file.Path;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;

import static org.junit.jupiter.api.Assertions.*;
import de.gecheckt.pdf.umbenenner.application.config.RuntimeConfiguration;
import de.gecheckt.pdf.umbenenner.application.port.out.AiContentSensitivity;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentProcessingOutcome;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionSuccess;
import de.gecheckt.pdf.umbenenner.domain.model.PdfPageCount;
import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailed;
import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailureReason;
import de.gecheckt.pdf.umbenenner.domain.model.PreCheckPassed;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

/**
 * Tests for {@link PreCheckEvaluator}.

@@ -236,7 +240,7 @@ class PreCheckEvaluatorTest {
    // =========================================================================

    private RuntimeConfiguration buildConfig(int maxPages) throws Exception {
        return new RuntimeConfiguration(maxPages);
        return new RuntimeConfiguration(maxPages, 3, AiContentSensitivity.PROTECT_SENSITIVE_CONTENT);
    }

    private int maxPages(int limit) {

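The hunks above migrate every call site from the old single-argument `RuntimeConfiguration(maxPages)` to a three-argument form. As a rough sketch of the shape those call sites now assume (a hypothetical record; the real `RuntimeConfiguration` class and its validation may differ):

```java
// Hypothetical stand-in for the extended configuration the tests construct.
enum Sensitivity { PROTECT_SENSITIVE_CONTENT }

record RuntimeConfigSketch(int maxPages, int maxRetriesTransient, Sensitivity sensitivity) {
    RuntimeConfigSketch {
        // Guard both limits; maxRetriesTransient >= 1 matches the evaluator's guard above.
        if (maxPages < 1) {
            throw new IllegalArgumentException("maxPages must be >= 1");
        }
        if (maxRetriesTransient < 1) {
            throw new IllegalArgumentException("maxRetriesTransient must be >= 1");
        }
    }
}
```

The test fixtures consistently pass `3` as the transient retry limit, which matches the default the comment in `DocumentProcessingServiceTest` refers to.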
@@ -0,0 +1,376 @@
|
||||
package de.gecheckt.pdf.umbenenner.application.service;
|
||||
|
||||
import static org.junit.jupiter.api.Assertions.assertEquals;
|
||||
import static org.junit.jupiter.api.Assertions.assertFalse;
|
||||
import static org.junit.jupiter.api.Assertions.assertTrue;
|
||||
|
||||
import java.time.LocalDate;
|
||||
|
||||
import org.junit.jupiter.api.Test;
|
||||
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.FailureCounters;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.AiAttemptContext;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.AiFunctionalFailure;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.AiTechnicalFailure;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.DateSource;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.NamingProposal;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.NamingProposalReady;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionSuccess;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.PdfPageCount;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailed;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.PreCheckFailureReason;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.PreCheckPassed;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.TechnicalDocumentError;
|
||||
|
||||
/**
|
||||
* Unit tests for {@link ProcessingOutcomeTransition} — the authoritative central retry rule.
|
||||
* <p>
|
||||
* These tests prove that:
|
||||
* <ul>
|
||||
* <li>Deterministic content errors follow the first-retryable / second-final rule.</li>
|
||||
* <li>Transient technical errors respect the configured {@code maxRetriesTransient} limit.</li>
|
||||
* <li>{@code maxRetriesTransient = 1} immediately finalises on the first transient error.</li>
|
||||
* <li>Naming proposals yield {@code PROPOSAL_READY} with unchanged counters.</li>
|
||||
* <li>AI functional failures are governed by the same content-error rule.</li>
|
||||
* <li>AI technical failures are governed by the same transient-error rule.</li>
|
||||
* </ul>
|
||||
* <p>
|
||||
* Skip events ({@code SKIPPED_ALREADY_PROCESSED}, {@code SKIPPED_FINAL_FAILURE}) are not
|
||||
* routed through this class — they never modify any failure counter.
|
||||
*/
|
||||
class ProcessingOutcomeTransitionTest {
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// Test fixtures
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
private static final int LIMIT_1 = 1;
|
||||
private static final int LIMIT_2 = 2;
|
||||
private static final int LIMIT_3 = 3;
|
||||
|
||||
private static SourceDocumentCandidate candidate() {
|
||||
return new SourceDocumentCandidate("test.pdf", 1024L,
|
||||
new SourceDocumentLocator("/tmp/test.pdf"));
|
||||
}
|
||||
|
||||
private static AiAttemptContext aiContext() {
|
||||
return new AiAttemptContext("model", "prompt.txt", 1, 100, "{}");
|
||||
}
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// Naming proposal → PROPOSAL_READY (counters unchanged)
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
@Test
|
||||
void forNewDocument_namingProposalReady_returnsProposalReady_countersUnchanged() {
|
||||
NamingProposal proposal = new NamingProposal(
|
||||
LocalDate.of(2026, 1, 15), DateSource.AI_PROVIDED, "Rechnung", "reason");
|
||||
NamingProposalReady outcome = new NamingProposalReady(candidate(), proposal, aiContext());
|
||||
|
||||
ProcessingOutcomeTransition.ProcessingOutcome result =
|
||||
ProcessingOutcomeTransition.forNewDocument(outcome, LIMIT_1);
|
||||
|
||||
assertEquals(ProcessingStatus.PROPOSAL_READY, result.overallStatus());
|
||||
assertFalse(result.retryable());
|
||||
assertEquals(0, result.counters().contentErrorCount());
|
||||
assertEquals(0, result.counters().transientErrorCount());
|
||||
}
|
||||
|
||||
@Test
|
||||
void forKnownDocument_namingProposalReady_countersUnchangedFromExisting() {
|
||||
NamingProposal proposal = new NamingProposal(
|
||||
LocalDate.of(2026, 1, 15), DateSource.AI_PROVIDED, "Rechnung", "reason");
|
||||
NamingProposalReady outcome = new NamingProposalReady(candidate(), proposal, aiContext());
|
||||
FailureCounters existing = new FailureCounters(1, 2);
|
||||
|
||||
ProcessingOutcomeTransition.ProcessingOutcome result =
|
||||
ProcessingOutcomeTransition.forKnownDocument(outcome, existing, LIMIT_3);
|
||||
|
||||
assertEquals(ProcessingStatus.PROPOSAL_READY, result.overallStatus());
|
||||
assertFalse(result.retryable());
|
||||
// Counters must be preserved unchanged on a successful naming proposal
|
||||
assertEquals(1, result.counters().contentErrorCount(),
|
||||
"Content error counter must remain unchanged on PROPOSAL_READY");
|
||||
assertEquals(2, result.counters().transientErrorCount(),
|
||||
"Transient error counter must remain unchanged on PROPOSAL_READY");
|
||||
}
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// Deterministic content errors (PreCheckFailed)
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
@Test
|
||||
void forNewDocument_firstPreCheckFailed_returnsFailedRetryable_contentCounterOne() {
|
||||
PreCheckFailed outcome = new PreCheckFailed(candidate(), PreCheckFailureReason.NO_USABLE_TEXT);
|
||||
|
||||
ProcessingOutcomeTransition.ProcessingOutcome result =
|
||||
ProcessingOutcomeTransition.forNewDocument(outcome, LIMIT_1);
|
||||
|
||||
assertEquals(ProcessingStatus.FAILED_RETRYABLE, result.overallStatus());
|
||||
assertTrue(result.retryable());
|
||||
assertEquals(1, result.counters().contentErrorCount());
|
||||
assertEquals(0, result.counters().transientErrorCount());
|
||||
}
|
||||
|
||||
@Test
|
||||
void forKnownDocument_secondPreCheckFailed_returnsFailedFinal_contentCounterTwo() {
|
||||
PreCheckFailed outcome = new PreCheckFailed(candidate(), PreCheckFailureReason.PAGE_LIMIT_EXCEEDED);
|
||||
FailureCounters existing = new FailureCounters(1, 0);
|
||||
|
||||
ProcessingOutcomeTransition.ProcessingOutcome result =
|
||||
ProcessingOutcomeTransition.forKnownDocument(outcome, existing, LIMIT_1);
|
||||
|
||||
assertEquals(ProcessingStatus.FAILED_FINAL, result.overallStatus());
|
||||
assertFalse(result.retryable());
|
||||
assertEquals(2, result.counters().contentErrorCount());
|
||||
assertEquals(0, result.counters().transientErrorCount());
|
||||
}
|
||||
|
||||
@Test
|
||||
void forKnownDocument_anySubsequentContentError_remainsFailedFinal() {
|
||||
// Count >= 1 before the attempt always leads to FAILED_FINAL (covers legacy data)
|
||||
for (int priorCount = 1; priorCount <= 5; priorCount++) {
|
||||
FailureCounters existing = new FailureCounters(priorCount, 0);
|
||||
PreCheckFailed outcome =
|
||||
new PreCheckFailed(candidate(), PreCheckFailureReason.NO_USABLE_TEXT);
|
||||
|
||||
ProcessingOutcomeTransition.ProcessingOutcome result =
|
||||
ProcessingOutcomeTransition.forKnownDocument(outcome, existing, LIMIT_2);
|
||||
|
||||
assertEquals(ProcessingStatus.FAILED_FINAL, result.overallStatus(),
|
||||
"Expected FAILED_FINAL for prior contentErrorCount=" + priorCount);
|
||||
assertFalse(result.retryable());
|
||||
assertEquals(priorCount + 1, result.counters().contentErrorCount());
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
void forNewDocument_contentError_transientCounterIsIrrelevant() {
|
||||
PreCheckFailed outcome = new PreCheckFailed(candidate(), PreCheckFailureReason.NO_USABLE_TEXT);
|
||||
|
||||
// Counter before: 0 content errors (first occurrence), transient ignored
|
||||
ProcessingOutcomeTransition.ProcessingOutcome result =
|
||||
ProcessingOutcomeTransition.forKnownDocument(
|
||||
outcome, new FailureCounters(0, 5), LIMIT_1);
|
||||
|
||||
assertEquals(ProcessingStatus.FAILED_RETRYABLE, result.overallStatus(),
|
||||
"Non-zero transient counter must not affect the content error decision");
|
||||
assertTrue(result.retryable());
|
||||
}
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// Deterministic content errors (AiFunctionalFailure)
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
@Test
|
||||
void forNewDocument_firstAiFunctionalFailure_returnsFailedRetryable_contentCounterOne() {
|
||||
AiFunctionalFailure outcome = new AiFunctionalFailure(candidate(), "Generic title", aiContext());
|
||||
|
||||
ProcessingOutcomeTransition.ProcessingOutcome result =
|
||||
ProcessingOutcomeTransition.forNewDocument(outcome, LIMIT_1);
|
||||
|
||||
assertEquals(ProcessingStatus.FAILED_RETRYABLE, result.overallStatus());
|
||||
assertTrue(result.retryable());
|
||||
assertEquals(1, result.counters().contentErrorCount());
|
||||
}
|
||||
|
||||
@Test
|
||||
void forKnownDocument_secondAiFunctionalFailure_returnsFailedFinal() {
|
||||
AiFunctionalFailure outcome = new AiFunctionalFailure(candidate(), "Generic title", aiContext());
|
||||
FailureCounters existing = new FailureCounters(1, 0);
|
||||
|
||||
ProcessingOutcomeTransition.ProcessingOutcome result =
|
||||
ProcessingOutcomeTransition.forKnownDocument(outcome, existing, LIMIT_1);
|
||||
|
||||
assertEquals(ProcessingStatus.FAILED_FINAL, result.overallStatus());
|
||||
assertFalse(result.retryable());
|
||||
assertEquals(2, result.counters().contentErrorCount());
|
||||
}
|
||||
|
||||
    // -------------------------------------------------------------------------
    // Transient technical errors (TechnicalDocumentError)
    // -------------------------------------------------------------------------

    @Test
    void forNewDocument_transientError_limitOne_immediatelyFinal() {
        TechnicalDocumentError outcome = new TechnicalDocumentError(candidate(), "I/O error", null);

        ProcessingOutcomeTransition.ProcessingOutcome result =
                ProcessingOutcomeTransition.forNewDocument(outcome, LIMIT_1);

        assertEquals(ProcessingStatus.FAILED_FINAL, result.overallStatus(),
                "With limit=1, the first transient error must immediately finalise");
        assertFalse(result.retryable());
        assertEquals(1, result.counters().transientErrorCount());
        assertEquals(0, result.counters().contentErrorCount());
    }

    @Test
    void forNewDocument_transientError_limitTwo_firstErrorRetryable() {
        TechnicalDocumentError outcome = new TechnicalDocumentError(candidate(), "Timeout", null);

        ProcessingOutcomeTransition.ProcessingOutcome result =
                ProcessingOutcomeTransition.forNewDocument(outcome, LIMIT_2);

        assertEquals(ProcessingStatus.FAILED_RETRYABLE, result.overallStatus());
        assertTrue(result.retryable());
        assertEquals(1, result.counters().transientErrorCount());
    }

    @Test
    void forKnownDocument_transientError_limitTwo_secondErrorFinal() {
        TechnicalDocumentError outcome = new TechnicalDocumentError(candidate(), "Timeout again", null);
        FailureCounters existing = new FailureCounters(0, 1);

        ProcessingOutcomeTransition.ProcessingOutcome result =
                ProcessingOutcomeTransition.forKnownDocument(outcome, existing, LIMIT_2);

        assertEquals(ProcessingStatus.FAILED_FINAL, result.overallStatus(),
                "Second transient error with limit=2 must finalise");
        assertFalse(result.retryable());
        assertEquals(2, result.counters().transientErrorCount());
    }

    @Test
    void forKnownDocument_transientError_limitThree_sequence() {
        TechnicalDocumentError outcome = new TechnicalDocumentError(candidate(), "error", null);

        // First error: counter 0→1, 1 < 3 → retryable
        ProcessingOutcomeTransition.ProcessingOutcome first =
                ProcessingOutcomeTransition.forKnownDocument(
                        outcome, new FailureCounters(0, 0), LIMIT_3);
        assertEquals(ProcessingStatus.FAILED_RETRYABLE, first.overallStatus());
        assertEquals(1, first.counters().transientErrorCount());

        // Second error: counter 1→2, 2 < 3 → retryable
        ProcessingOutcomeTransition.ProcessingOutcome second =
                ProcessingOutcomeTransition.forKnownDocument(
                        outcome, new FailureCounters(0, 1), LIMIT_3);
        assertEquals(ProcessingStatus.FAILED_RETRYABLE, second.overallStatus());
        assertEquals(2, second.counters().transientErrorCount());

        // Third error: counter 2→3 = limit=3 → FAILED_FINAL
        ProcessingOutcomeTransition.ProcessingOutcome third =
                ProcessingOutcomeTransition.forKnownDocument(
                        outcome, new FailureCounters(0, 2), LIMIT_3);
        assertEquals(ProcessingStatus.FAILED_FINAL, third.overallStatus(),
                "Third transient error with limit=3 must finalise");
        assertFalse(third.retryable());
        assertEquals(3, third.counters().transientErrorCount());
    }

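The counter arithmetic that the sequence test above walks through can be summarised in a few lines. This is an illustrative sketch, not the production `ProcessingOutcomeTransition` API; the class, enum, and method names here are assumptions made for the example. Each error increments the stored counter, and the document stays retryable only while the incremented counter is still below the configured limit.

```java
// Illustrative sketch of the retry-threshold rule exercised by the tests above.
// RetryThresholdSketch and nextStatus are hypothetical names, not production code.
public class RetryThresholdSketch {

    enum Status { FAILED_RETRYABLE, FAILED_FINAL }

    /** Increment the error counter and decide whether the document stays retryable. */
    static Status nextStatus(int previousErrorCount, int limit) {
        int incremented = previousErrorCount + 1;
        return incremented < limit ? Status.FAILED_RETRYABLE : Status.FAILED_FINAL;
    }

    public static void main(String[] args) {
        // Mirrors forKnownDocument_transientError_limitThree_sequence:
        System.out.println(nextStatus(0, 3));  // FAILED_RETRYABLE (1 < 3)
        System.out.println(nextStatus(1, 3));  // FAILED_RETRYABLE (2 < 3)
        System.out.println(nextStatus(2, 3));  // FAILED_FINAL    (3 == 3)
        // Legacy counters already above the limit still finalise:
        System.out.println(nextStatus(10, 3)); // FAILED_FINAL
    }
}
```

The strict `<` comparison is what makes the legacy-counter case in the next test work without extra handling: any incremented value at or above the limit finalises.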
    @Test
    void forKnownDocument_transientError_legacyHighCounters_stillFinalise() {
        // Legacy data may have counters well above normal expectations.
        // The threshold check must still apply correctly.
        TechnicalDocumentError outcome = new TechnicalDocumentError(candidate(), "error", null);
        FailureCounters existing = new FailureCounters(3, 10); // already far above limit

        ProcessingOutcomeTransition.ProcessingOutcome result =
                ProcessingOutcomeTransition.forKnownDocument(outcome, existing, LIMIT_3);

        assertEquals(ProcessingStatus.FAILED_FINAL, result.overallStatus(),
                "Transient counter above limit must still produce FAILED_FINAL");
        assertEquals(11, result.counters().transientErrorCount());
    }

    @Test
    void forKnownDocument_transientError_contentCounterIsIrrelevant() {
        TechnicalDocumentError outcome = new TechnicalDocumentError(candidate(), "error", null);

        // Non-zero content error counter must not affect the transient error decision
        ProcessingOutcomeTransition.ProcessingOutcome result =
                ProcessingOutcomeTransition.forKnownDocument(
                        outcome, new FailureCounters(2, 0), LIMIT_2);

        assertEquals(ProcessingStatus.FAILED_RETRYABLE, result.overallStatus(),
                "Content error counter must not affect transient error decision");
        assertEquals(1, result.counters().transientErrorCount());
    }

    // -------------------------------------------------------------------------
    // Transient technical errors (AiTechnicalFailure)
    // -------------------------------------------------------------------------

    @Test
    void forNewDocument_aiTechnicalFailure_limitOne_immediatelyFinal() {
        AiTechnicalFailure outcome = new AiTechnicalFailure(candidate(), "HTTP timeout", null, aiContext());

        ProcessingOutcomeTransition.ProcessingOutcome result =
                ProcessingOutcomeTransition.forNewDocument(outcome, LIMIT_1);

        assertEquals(ProcessingStatus.FAILED_FINAL, result.overallStatus(),
                "With limit=1, first AI technical failure must immediately finalise");
        assertFalse(result.retryable());
        assertEquals(1, result.counters().transientErrorCount());
    }

    @Test
    void forKnownDocument_aiTechnicalFailure_limitTwo_secondFinal() {
        AiTechnicalFailure outcome = new AiTechnicalFailure(candidate(), "HTTP timeout", null, aiContext());
        FailureCounters existing = new FailureCounters(0, 1);

        ProcessingOutcomeTransition.ProcessingOutcome result =
                ProcessingOutcomeTransition.forKnownDocument(outcome, existing, LIMIT_2);

        assertEquals(ProcessingStatus.FAILED_FINAL, result.overallStatus());
        assertEquals(2, result.counters().transientErrorCount());
    }

    // -------------------------------------------------------------------------
    // PreCheckPassed routed through transition (edge case: no AI step taken)
    // -------------------------------------------------------------------------

    @Test
    void forNewDocument_preCheckPassed_limitOne_immediatelyFinal() {
        // PreCheckPassed without an AI outcome is treated as a transient error by the transition.
        // With limit=1 the first such error must immediately finalise to FAILED_FINAL.
        PreCheckPassed outcome = new PreCheckPassed(
                candidate(), new PdfExtractionSuccess("text", new PdfPageCount(1)));

        ProcessingOutcomeTransition.ProcessingOutcome result =
                ProcessingOutcomeTransition.forNewDocument(outcome, LIMIT_1);

        assertEquals(ProcessingStatus.FAILED_FINAL, result.overallStatus(),
                "With limit=1 a PreCheckPassed-routed transient error must immediately finalise");
        assertFalse(result.retryable());
        assertEquals(1, result.counters().transientErrorCount());
        assertEquals(0, result.counters().contentErrorCount());
    }

    @Test
    void forNewDocument_preCheckPassed_limitTwo_firstErrorRetryable() {
        // With limit=2 the first PreCheckPassed-routed transient error is retryable.
        PreCheckPassed outcome = new PreCheckPassed(
                candidate(), new PdfExtractionSuccess("text", new PdfPageCount(1)));

        ProcessingOutcomeTransition.ProcessingOutcome result =
                ProcessingOutcomeTransition.forNewDocument(outcome, LIMIT_2);

        assertEquals(ProcessingStatus.FAILED_RETRYABLE, result.overallStatus(),
                "With limit=2 the first PreCheckPassed-routed transient error must be retryable");
        assertTrue(result.retryable());
        assertEquals(1, result.counters().transientErrorCount());
        assertEquals(0, result.counters().contentErrorCount());
    }

    @Test
    void forKnownDocument_preCheckPassed_limitTwo_secondErrorFinal() {
        // With limit=2 and an existing transient error count of 1,
        // the next PreCheckPassed-routed error increments to 2 = limit → FAILED_FINAL.
        PreCheckPassed outcome = new PreCheckPassed(
                candidate(), new PdfExtractionSuccess("text", new PdfPageCount(1)));
        FailureCounters existing = new FailureCounters(0, 1);

        ProcessingOutcomeTransition.ProcessingOutcome result =
                ProcessingOutcomeTransition.forKnownDocument(outcome, existing, LIMIT_2);

        assertEquals(ProcessingStatus.FAILED_FINAL, result.overallStatus(),
                "PreCheckPassed-routed error at transient limit must finalise to FAILED_FINAL");
        assertFalse(result.retryable());
        assertEquals(2, result.counters().transientErrorCount());
    }
}
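The test class above pins down which of the two failure counters each outcome kind feeds. As a compact sketch (the type, enum, and method names here are illustrative assumptions, not the production `ProcessingOutcomeTransition` types): `AiFunctionalFailure` increments the content-error counter, while `TechnicalDocumentError`, `AiTechnicalFailure`, and a `PreCheckPassed` that never reached the AI step all increment the transient-error counter.

```java
// Sketch of the outcome-to-counter routing exercised by the tests above.
// All names in this class are illustrative stand-ins for the production types.
public class CounterRoutingSketch {

    enum OutcomeKind {
        AI_FUNCTIONAL_FAILURE,        // content problem: generic title etc.
        TECHNICAL_DOCUMENT_ERROR,     // transient: I/O error, timeout
        AI_TECHNICAL_FAILURE,         // transient: HTTP timeout towards the AI
        PRE_CHECK_PASSED_WITHOUT_AI   // transient: pre-check passed but no AI step ran
    }

    record Counters(int contentErrorCount, int transientErrorCount) {}

    /** AiFunctionalFailure feeds the content counter; everything else here is transient. */
    static Counters increment(Counters c, OutcomeKind kind) {
        return kind == OutcomeKind.AI_FUNCTIONAL_FAILURE
                ? new Counters(c.contentErrorCount() + 1, c.transientErrorCount())
                : new Counters(c.contentErrorCount(), c.transientErrorCount() + 1);
    }
}
```

Keeping the two counters independent is what the `contentCounterIsIrrelevant` test relies on: a high content counter never tips a transient-error decision.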
@@ -1,5 +1,13 @@
package de.gecheckt.pdf.umbenenner.application.service;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatNullPointerException;

import java.time.Instant;
import java.time.LocalDate;

import org.junit.jupiter.api.Test;

import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingAttempt;
import de.gecheckt.pdf.umbenenner.application.service.TargetFilenameBuildingService.BaseFilenameReady;
import de.gecheckt.pdf.umbenenner.application.service.TargetFilenameBuildingService.BaseFilenameResult;
@@ -9,20 +17,13 @@ import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
import de.gecheckt.pdf.umbenenner.domain.model.RunId;

import org.junit.jupiter.api.Test;

import java.time.Instant;
import java.time.LocalDate;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatNullPointerException;

/**
 * Unit tests for {@link TargetFilenameBuildingService}.
 * <p>
 * Covers the binding target format {@code YYYY-MM-DD - Titel.pdf}, the 20-character
 * base-title rule, the business title rule (only letters, digits, and spaces), and the
 * detection of inconsistent persistence states.
 * base-title rule, the business title rule (only letters, digits, and spaces),
 * Windows-compatibility character removal, and the detection of inconsistent persistence
 * states.
 */
class TargetFilenameBuildingServiceTest {

@@ -201,21 +202,120 @@ class TargetFilenameBuildingServiceTest {
    }

    @Test
    void buildBaseFilename_titleWithSlash_returnsInconsistentProposalState() {
    void buildBaseFilename_titleWithSlash_removesWindowsIncompatibleCharacterAndSucceeds() {
        // Slash (/) is a Windows-incompatible character. It should be removed,
        // leaving "RgStrom" which is valid (letters only)
        ProcessingAttempt attempt = proposalAttempt(LocalDate.of(2026, 1, 1), "Rg/Strom");

        BaseFilenameResult result = TargetFilenameBuildingService.buildBaseFilename(attempt);

        assertThat(result).isInstanceOf(BaseFilenameReady.class);
        assertThat(((BaseFilenameReady) result).baseFilename())
                .isEqualTo("2026-01-01 - RgStrom.pdf");
    }

    @Test
    void buildBaseFilename_titleWithMultipleWindowsChars_removesAllAndSucceeds() {
        // Multiple Windows-incompatible characters (: and ") should be removed,
        // leaving "Rechnung 2026" which is valid (letters, digits, and spaces)
        ProcessingAttempt attempt = proposalAttempt(LocalDate.of(2026, 1, 1), "Rechnung: \"2026\"");

        BaseFilenameResult result = TargetFilenameBuildingService.buildBaseFilename(attempt);

        assertThat(result).isInstanceOf(BaseFilenameReady.class);
        assertThat(((BaseFilenameReady) result).baseFilename())
                .isEqualTo("2026-01-01 - Rechnung 2026.pdf");
    }

    @Test
    void buildBaseFilename_titleWithDot_returnsInconsistentProposalState() {
        // Dot (.) is NOT a Windows-incompatible character (as per our list < > : " / \ | ? *)
        // So it remains in the cleaned title and causes validation to fail
        ProcessingAttempt attempt = proposalAttempt(LocalDate.of(2026, 1, 1), "Rechnung.pdf");

        BaseFilenameResult result = TargetFilenameBuildingService.buildBaseFilename(attempt);

        assertThat(result).isInstanceOf(InconsistentProposalState.class);
    }

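The cleaning behaviour the three tests above describe can be sketched as a single regex replacement. This is an assumption-level illustration, not the production `TargetFilenameBuildingService` implementation: only the nine Windows-reserved characters `< > : " / \ | ? *` are stripped, and the dot is deliberately not on that list, so a title like `Rechnung.pdf` survives cleaning and then fails title validation.

```java
// Sketch of the Windows-incompatible character removal described above.
// WindowsCharCleaningSketch and removeWindowsIncompatible are hypothetical names.
public class WindowsCharCleaningSketch {

    /** Remove the nine characters Windows forbids in filenames: < > : " / \ | ? * */
    static String removeWindowsIncompatible(String title) {
        // In the regex character class, backslash must be escaped twice (string + regex).
        return title.replaceAll("[<>:\"/\\\\|?*]", "");
    }

    public static void main(String[] args) {
        System.out.println(removeWindowsIncompatible("Rg/Strom"));           // RgStrom
        System.out.println(removeWindowsIncompatible("Rechnung: \"2026\"")); // Rechnung 2026
        System.out.println(removeWindowsIncompatible("Rechnung.pdf"));       // unchanged: dot is allowed
    }
}
```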
    // -------------------------------------------------------------------------
    // Windows compatibility – removal of incompatible characters
    // (defensive measure; characters should not appear in validated title)
    // -------------------------------------------------------------------------

    @Test
    void buildBaseFilename_titleWithDot_returnsInconsistentProposalState() {
        ProcessingAttempt attempt = proposalAttempt(LocalDate.of(2026, 1, 1), "Rechnung.pdf");
    void buildBaseFilename_validTitleProducesWindowsCompatibleFilename() {
        // Valid titles containing only letters, digits, and spaces should produce
        // correct Windows-compatible filenames
        ProcessingAttempt attempt = proposalAttempt(LocalDate.of(2026, 5, 20), "Versicherung");

        BaseFilenameResult result = TargetFilenameBuildingService.buildBaseFilename(attempt);

        assertThat(result).isInstanceOf(InconsistentProposalState.class);
        assertThat(result).isInstanceOf(BaseFilenameReady.class);
        assertThat(((BaseFilenameReady) result).baseFilename())
                .isEqualTo("2026-05-20 - Versicherung.pdf");
    }

    @Test
    void buildBaseFilename_germanUmlautsAreRetainedInFilename() {
        // German Umlauts (ä, ö, ü) and ß are valid filename characters on Windows
        // and must be retained in the output filename
        ProcessingAttempt attempt = proposalAttempt(LocalDate.of(2026, 6, 15), "Überprüfung");

        BaseFilenameResult result = TargetFilenameBuildingService.buildBaseFilename(attempt);

        assertThat(result).isInstanceOf(BaseFilenameReady.class);
        assertThat(((BaseFilenameReady) result).baseFilename())
                .isEqualTo("2026-06-15 - Überprüfung.pdf");
    }

    @Test
    void buildBaseFilename_germanSzligIsRetainedInFilename() {
        // German ß is a valid filename character on Windows and must be retained
        ProcessingAttempt attempt = proposalAttempt(LocalDate.of(2026, 3, 10), "Straße");

        BaseFilenameResult result = TargetFilenameBuildingService.buildBaseFilename(attempt);

        assertThat(result).isInstanceOf(BaseFilenameReady.class);
        assertThat(((BaseFilenameReady) result).baseFilename())
                .isEqualTo("2026-03-10 - Straße.pdf");
    }

    @Test
    void buildBaseFilename_completeFormatIsWindowsCompatible() {
        // The complete filename format (YYYY-MM-DD - Titel.pdf) is Windows-compatible
        // The hyphen in the date and the dot in the extension are valid Windows characters
        ProcessingAttempt attempt = proposalAttempt(LocalDate.of(2026, 12, 31), "Bericht");

        BaseFilenameResult result = TargetFilenameBuildingService.buildBaseFilename(attempt);

        assertThat(result).isInstanceOf(BaseFilenameReady.class);
        String filename = ((BaseFilenameReady) result).baseFilename();
        assertThat(filename).matches("\\d{4}-\\d{2}-\\d{2} - .+\\.pdf");
        // Verify no Windows-incompatible characters: < > : " / \ | ? *
        assertThat(filename).doesNotContain("<");
        assertThat(filename).doesNotContain(">");
        assertThat(filename).doesNotContain(":");
        assertThat(filename).doesNotContain("\"");
        assertThat(filename).doesNotContain("/");
        assertThat(filename).doesNotContain("\\");
        assertThat(filename).doesNotContain("|");
        assertThat(filename).doesNotContain("?");
        assertThat(filename).doesNotContain("*");
    }

    @Test
    void buildBaseFilename_twentyCharacterRuleUnaffectedByWindowsCompatibility() {
        // The 20-character rule applies to the base title only.
        // Windows-compatibility cleaning does not change the length counting mechanism.
        String title = "Stromabrechnung 2026"; // exactly 20 characters
        ProcessingAttempt attempt = proposalAttempt(LocalDate.of(2026, 3, 31), title);

        BaseFilenameResult result = TargetFilenameBuildingService.buildBaseFilename(attempt);

        assertThat(result).isInstanceOf(BaseFilenameReady.class);
        assertThat(((BaseFilenameReady) result).baseFilename())
                .startsWith("2026-03-31 - Stromabrechnung 2026");
    }

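The target format the tests above verify, `YYYY-MM-DD - Titel.pdf`, can be assembled in one line once the title has passed validation. A minimal sketch, assuming a validated title and using a hypothetical `buildBaseFilename` helper rather than the production service:

```java
import java.time.LocalDate;

// Sketch of the binding target-filename format: "YYYY-MM-DD - Titel.pdf".
// FilenameFormatSketch is an illustrative stand-in, not TargetFilenameBuildingService.
public class FilenameFormatSketch {

    static String buildBaseFilename(LocalDate date, String validatedTitle) {
        // LocalDate.toString() already yields ISO-8601 "YYYY-MM-DD".
        return date + " - " + validatedTitle + ".pdf";
    }

    public static void main(String[] args) {
        System.out.println(buildBaseFilename(LocalDate.of(2026, 5, 20), "Versicherung"));
        // 2026-05-20 - Versicherung.pdf
    }
}
```

The hyphen in the ISO date and the dot before the extension are the only non-alphanumeric characters the format itself introduces, and both are valid on Windows, which is what `buildBaseFilename_completeFormatIsWindowsCompatible` checks.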
    // -------------------------------------------------------------------------
@@ -256,6 +356,7 @@ class TargetFilenameBuildingServiceTest {
                Instant.now(), Instant.now(),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                "openai-compatible",
                "gpt-4", "prompt-v1.txt", 1, 100,
                "{}", "reasoning text",
                date, DateSource.AI_PROVIDED, title,

@@ -1,14 +1,34 @@
package de.gecheckt.pdf.umbenenner.application.usecase;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertNotEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.nio.file.Path;
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.config.RuntimeConfiguration;
import de.gecheckt.pdf.umbenenner.application.port.in.BatchRunOutcome;
import de.gecheckt.pdf.umbenenner.application.port.out.AiContentSensitivity;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationPort;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationTechnicalFailure;
import de.gecheckt.pdf.umbenenner.application.port.out.ClockPort;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentRecord;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentRecordLookupResult;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentRecordRepository;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentTerminalFinalFailure;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentTerminalSuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentUnknown;
import de.gecheckt.pdf.umbenenner.application.port.out.FailureCounters;
import de.gecheckt.pdf.umbenenner.application.port.out.FingerprintPort;
import de.gecheckt.pdf.umbenenner.application.port.out.FingerprintResult;
import de.gecheckt.pdf.umbenenner.application.port.out.FingerprintSuccess;
@@ -18,22 +38,21 @@ import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingAttempt;
import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingAttemptRepository;
import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingLogger;
import de.gecheckt.pdf.umbenenner.application.port.out.PromptLoadingSuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.PromptPort;
import de.gecheckt.pdf.umbenenner.application.port.out.ResolvedTargetFilename;
import de.gecheckt.pdf.umbenenner.application.port.out.RunLockPort;
import de.gecheckt.pdf.umbenenner.application.port.out.RunLockUnavailableException;
import de.gecheckt.pdf.umbenenner.application.port.out.SourceDocumentAccessException;
import de.gecheckt.pdf.umbenenner.application.port.out.SourceDocumentCandidatesPort;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyPort;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyResult;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopySuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFilenameResolutionResult;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFolderPort;
import de.gecheckt.pdf.umbenenner.application.port.out.PromptPort;
import de.gecheckt.pdf.umbenenner.application.port.out.RunLockPort;
import de.gecheckt.pdf.umbenenner.application.port.out.RunLockUnavailableException;
import de.gecheckt.pdf.umbenenner.application.port.out.SourceDocumentAccessException;
import de.gecheckt.pdf.umbenenner.application.port.out.SourceDocumentCandidatesPort;
import de.gecheckt.pdf.umbenenner.application.port.out.UnitOfWorkPort;
import de.gecheckt.pdf.umbenenner.application.service.AiNamingService;
import de.gecheckt.pdf.umbenenner.application.service.AiResponseValidator;
import de.gecheckt.pdf.umbenenner.application.service.DocumentProcessingCoordinator;
import de.gecheckt.pdf.umbenenner.domain.model.PromptIdentifier;
import de.gecheckt.pdf.umbenenner.domain.model.BatchRunContext;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionContentError;
@@ -41,24 +60,12 @@ import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionResult;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionSuccess;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError;
import de.gecheckt.pdf.umbenenner.domain.model.PdfPageCount;
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
import de.gecheckt.pdf.umbenenner.domain.model.PromptIdentifier;
import de.gecheckt.pdf.umbenenner.domain.model.RunId;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

import static org.junit.jupiter.api.Assertions.*;

/**
 * Tests for {@link DefaultBatchRunProcessingUseCase}.
 * <p>
@@ -443,11 +450,13 @@ class BatchRunProcessingUseCaseTest {
    // -------------------------------------------------------------------------

    /**
     * Regression test: when a document-level persistence failure occurs,
     * the batch outcome must be FAILURE, not SUCCESS.
     * Document-level persistence failures must not escalate to a hard batch failure.
     * The batch outcome must be SUCCESS even when a document's persistence step fails,
     * because the batch loop ran to completion and the failed document will be retried
     * in a subsequent run.
     */
    @Test
    void execute_documentPersistenceFailure_batchOutcomeIsFailure() throws Exception {
    void execute_documentPersistenceFailure_batchOutcomeIsSuccess() throws Exception {
        MockRunLockPort lockPort = new MockRunLockPort();
        RuntimeConfiguration config = buildConfig(tempDir);

@@ -460,7 +469,7 @@ class BatchRunProcessingUseCaseTest {
        DocumentProcessingCoordinator failingProcessor = new DocumentProcessingCoordinator(
                new NoOpDocumentRecordRepository(), new NoOpProcessingAttemptRepository(),
                new NoOpUnitOfWorkPort(), new NoOpTargetFolderPort(), new NoOpTargetFileCopyPort(),
                new NoOpProcessingLogger()) {
                new NoOpProcessingLogger(), 3, "openai-compatible") {
            @Override
            public boolean processDeferredOutcome(
                    de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate candidate,
@@ -480,16 +489,20 @@ class BatchRunProcessingUseCaseTest {

        BatchRunOutcome outcome = useCase.execute(context);

        assertTrue(outcome.isFailure(), "Document persistence failure should yield FAILURE outcome");
        assertFalse(outcome.isSuccess(), "Batch must not succeed when document persistence failed");
        assertTrue(outcome.isSuccess(),
                "Document-level persistence failure must not escalate to batch FAILURE — "
                + "the batch ran to completion and the document will be retried");
        assertFalse(outcome.isFailure(),
                "Batch must not signal FAILURE when only a document-level persistence error occurred");
    }

    /**
     * Regression test: mixed batch where one document succeeds and one has persistence failure.
     * The batch outcome must be FAILURE due to the persistence failure.
     * Mixed batch: one document completes normally, one has a persistence failure.
     * The batch outcome must still be SUCCESS because both documents were processed
     * by the loop. Document-level failures do not escalate to exit code 1.
     */
    @Test
    void execute_mixedBatch_oneCandidateSuccess_oneDocumentPersistenceFails_batchIsFailure() throws Exception {
    void execute_mixedBatch_oneCandidateSuccess_oneDocumentPersistenceFails_batchIsSuccess() throws Exception {
        MockRunLockPort lockPort = new MockRunLockPort();
        RuntimeConfiguration config = buildConfig(tempDir);

@@ -504,7 +517,7 @@ class BatchRunProcessingUseCaseTest {
        DocumentProcessingCoordinator selectiveFailingProcessor = new DocumentProcessingCoordinator(
                new NoOpDocumentRecordRepository(), new NoOpProcessingAttemptRepository(),
                new NoOpUnitOfWorkPort(), new NoOpTargetFolderPort(), new NoOpTargetFileCopyPort(),
                new NoOpProcessingLogger()) {
                new NoOpProcessingLogger(), 3, "openai-compatible") {
            private int callCount = 0;

            @Override
@@ -527,9 +540,11 @@ class BatchRunProcessingUseCaseTest {

        BatchRunOutcome outcome = useCase.execute(context);

        assertTrue(outcome.isFailure(),
                "Batch must fail when any document has a persistence failure, even if others succeeded");
        assertFalse(outcome.isSuccess(), "Cannot be SUCCESS when persistence failed for any document");
        assertTrue(outcome.isSuccess(),
                "Batch must be SUCCESS even when one document had a persistence failure — "
                + "the batch loop ran to completion");
        assertFalse(outcome.isFailure(),
                "Document-level persistence failure in one candidate must not make the batch FAILURE");
    }

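The batch-outcome rule that the two rewritten tests above pin down can be stated in one function: per-document failures are recorded and retried later, and only a batch loop that fails to run to completion yields a batch-level FAILURE (exit code 1). A minimal sketch with illustrative names, not the production `BatchRunOutcome` logic:

```java
import java.util.List;

// Sketch of the batch-outcome rule: document-level failures do not escalate;
// only an aborted batch loop is a batch FAILURE. Names here are illustrative.
public class BatchOutcomeSketch {

    enum BatchOutcome { SUCCESS, FAILURE }

    /**
     * perDocumentPersisted holds one flag per candidate (false = persistence
     * failed for that document). It is deliberately ignored by the decision:
     * failed documents are retried in a later run instead of failing the batch.
     */
    static BatchOutcome outcomeOf(List<Boolean> perDocumentPersisted, boolean loopCompleted) {
        return loopCompleted ? BatchOutcome.SUCCESS : BatchOutcome.FAILURE;
    }
}
```

Making the decision depend only on `loopCompleted` is the design point of the regression tests: a mixed batch with one persistence failure still exits 0.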
    // -------------------------------------------------------------------------
@@ -580,43 +595,6 @@ class BatchRunProcessingUseCaseTest {
                "Bei Quellordner-Zugriffsfehler muss ein Fehler geloggt werden");
    }

    @Test
    void execute_withPersistenceFailure_logsWarning() throws Exception {
        // Verifies that a warning is logged after a batch run with a persistence failure
        CapturingProcessingLogger capturingLogger = new CapturingProcessingLogger();
        RuntimeConfiguration config = buildConfig(tempDir);

        SourceDocumentCandidate candidate = makeCandidate("doc.pdf");
        FixedCandidatesPort candidatesPort = new FixedCandidatesPort(List.of(candidate));
        FixedExtractionPort extractionPort = new FixedExtractionPort(
                new PdfExtractionSuccess("text", new PdfPageCount(1)));

        // Coordinator that always reports a persistence failure
        DocumentProcessingCoordinator failingCoordinator = new DocumentProcessingCoordinator(
                new NoOpDocumentRecordRepository(), new NoOpProcessingAttemptRepository(),
                new NoOpUnitOfWorkPort(), new NoOpTargetFolderPort(), new NoOpTargetFileCopyPort(),
                new NoOpProcessingLogger()) {
            @Override
            public boolean processDeferredOutcome(
                    de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate c,
                    de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint fp,
                    de.gecheckt.pdf.umbenenner.domain.model.BatchRunContext ctx,
                    java.time.Instant start,
                    java.util.function.Function<de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate, de.gecheckt.pdf.umbenenner.domain.model.DocumentProcessingOutcome> exec) {
                return false;
            }
        };

        DefaultBatchRunProcessingUseCase useCase = new DefaultBatchRunProcessingUseCase(
                config, new MockRunLockPort(), candidatesPort, extractionPort,
                new AlwaysSuccessFingerprintPort(), failingCoordinator, buildStubAiNamingService(), capturingLogger);

        useCase.execute(new BatchRunContext(new RunId("persist-warn"), Instant.now()));

        assertTrue(capturingLogger.warnCallCount > 0,
                "Nach Batch-Lauf mit Persistenzfehler muss eine Warnung geloggt werden");
    }

    @Test
    void execute_batchStart_logsInfo() throws Exception {
        // Verifies that at batch start at least the expected info entries are logged.
@@ -660,11 +638,12 @@ class BatchRunProcessingUseCaseTest {
        // Verifies that for a successfully processed file, debug() is called via
        // logExtractionResult and info() via logProcessingOutcome.
        // Expected debug() calls for one candidate (success path):
        // L138 (lock acquired) + L249 (processCandidate) + L293 (fingerprint) + L337 (logExtractionResult) + L213 (lock released) = 5
        // Without the logExtractionResult call: 4
        // lock acquired + fingerprint computed + logExtractionResult + lock released = 4
        // Without the logExtractionResult call there would be only 3 debug() calls.
        // Expected info() calls for one candidate (success path):
        // L130 (initiated) + L145 (started) + L178 (candidates found) + L365 (PreCheckPassed) + L190 (completed) = 5
        // Without the logProcessingOutcome call: 4
        // batch initiated + batch started + candidates found + source file detected
        // + logProcessingOutcome (PreCheckPassed) + batch completed = 6
        // Without the logProcessingOutcome call there would be 5 info() calls.
        CapturingProcessingLogger capturingLogger = new CapturingProcessingLogger();
        RuntimeConfiguration config = buildConfig(tempDir);

@@ -680,21 +659,21 @@ class BatchRunProcessingUseCaseTest {

        useCase.execute(new BatchRunContext(new RunId("log-precheck"), Instant.now()));

        // Without logExtractionResult there would be at least 4 debug() calls; with logExtractionResult 5
        assertTrue(capturingLogger.debugCallCount >= 5,
                "logExtractionResult muss bei PdfExtractionSuccess debug() aufrufen (erwartet >= 5, war: "
        // Without logExtractionResult there would be only 3 debug() calls; with logExtractionResult >= 4
        assertTrue(capturingLogger.debugCallCount >= 4,
                "logExtractionResult muss bei PdfExtractionSuccess debug() aufrufen (erwartet >= 4, war: "
                + capturingLogger.debugCallCount + ")");
        // Without logProcessingOutcome there would be 4 info() calls; with logProcessingOutcome 5
        assertTrue(capturingLogger.infoCallCount >= 5,
                "logProcessingOutcome muss bei PreCheckPassed info() aufrufen (erwartet >= 5, war: "
        // Without logProcessingOutcome there would be 5 info() calls; with logProcessingOutcome >= 6
        assertTrue(capturingLogger.infoCallCount >= 6,
                "logProcessingOutcome muss bei PreCheckPassed info() aufrufen (erwartet >= 6, war: "
                + capturingLogger.infoCallCount + ")");
    }

    @Test
    void execute_extractionContentError_logsDebugAndPreCheckFailedInfo() throws Exception {
        // Verifies that for PdfExtractionContentError, debug (logExtractionResult)
        // and info (logProcessingOutcome) are logged.
        // Expected debug() calls: 5 (lock + processCandidate + fingerprint + logExtractionResult (content) + lock released)
        // Expected info() calls: 5 (L130 + L145 + L178 + L369 PreCheckFailed + L190)
        // Expected debug() calls: 4 (lock acquired + fingerprint + logExtractionResult (content) + lock released)
        // Expected info() calls: 6 (batch initiated + started + candidates found + source file detected + PreCheckFailed + completed)
        CapturingProcessingLogger capturingLogger = new CapturingProcessingLogger();
        RuntimeConfiguration config = buildConfig(tempDir);

@@ -710,20 +689,20 @@ class BatchRunProcessingUseCaseTest {

        useCase.execute(new BatchRunContext(new RunId("log-content-error"), Instant.now()));

        // Without logExtractionResult there would be 4 debug() calls; with logExtractionResult 5
        assertTrue(capturingLogger.debugCallCount >= 5,
                "logExtractionResult muss bei PdfExtractionContentError debug() aufrufen (erwartet >= 5, war: "
        // Without logExtractionResult there would be only 3 debug() calls; with logExtractionResult >= 4
        assertTrue(capturingLogger.debugCallCount >= 4,
                "logExtractionResult muss bei PdfExtractionContentError debug() aufrufen (erwartet >= 4, war: "
                + capturingLogger.debugCallCount + ")");
        // Without logProcessingOutcome (PreCheckFailed) there would be 4 info() calls; with 5
        assertTrue(capturingLogger.infoCallCount >= 5,
                "logProcessingOutcome muss bei PreCheckFailed info() aufrufen (erwartet >= 5, war: "
        // Without logProcessingOutcome (PreCheckFailed) there would be 5 info() calls; with >= 6
        assertTrue(capturingLogger.infoCallCount >= 6,
                "logProcessingOutcome muss bei PreCheckFailed info() aufrufen (erwartet >= 6, war: "
                + capturingLogger.infoCallCount + ")");
    }

    @Test
    void execute_extractionTechnicalError_logsDebugAndWarn() throws Exception {
        // Prüft, dass bei PdfExtractionTechnicalError debug (logExtractionResult) und warn (logProcessingOutcome) geloggt wird.
-        // Erwartete debug()-Aufrufe: 5 (lock + processCandidate + fingerprint + logExtractionResult + lock released)
+        // Erwartete debug()-Aufrufe: 4 (lock acquired + fingerprint + logExtractionResult + lock released)
        CapturingProcessingLogger capturingLogger = new CapturingProcessingLogger();
        RuntimeConfiguration config = buildConfig(tempDir);

@@ -739,15 +718,254 @@ class BatchRunProcessingUseCaseTest {

        useCase.execute(new BatchRunContext(new RunId("log-tech-error"), Instant.now()));

-        // Ohne logExtractionResult wären es 4 debug()-Aufrufe; mit logExtractionResult 5
-        assertTrue(capturingLogger.debugCallCount >= 5,
-                "logExtractionResult muss bei PdfExtractionTechnicalError debug() aufrufen (erwartet >= 5, war: "
+        // Ohne logExtractionResult wären es nur 3 debug()-Aufrufe; mit logExtractionResult >= 4
+        assertTrue(capturingLogger.debugCallCount >= 4,
+                "logExtractionResult muss bei PdfExtractionTechnicalError debug() aufrufen (erwartet >= 4, war: "
                        + capturingLogger.debugCallCount + ")");
        // logProcessingOutcome ruft warn() auf für TechnicalDocumentError
        assertTrue(capturingLogger.warnCallCount > 0,
                "logProcessingOutcome muss bei TechnicalDocumentError warn() aufrufen");
    }

    // -------------------------------------------------------------------------
    // Batch-level integration tests (real coordinator + capturing repos)
    // These prove that skip and finalization semantics work in the actual batch run,
    // not just at the coordinator unit-test level.
    // -------------------------------------------------------------------------

    /**
     * Full batch integration: a candidate whose fingerprint maps to a SUCCESS record
     * must be historised as SKIPPED_ALREADY_PROCESSED without triggering the pipeline.
     */
    @Test
    void execute_successDocument_batchRunHistorisesSkippedAlreadyProcessed() throws Exception {
        MockRunLockPort lockPort = new MockRunLockPort();
        RuntimeConfiguration config = buildConfig(tempDir);

        SourceDocumentCandidate candidate = makeCandidate("already-done.pdf");
        DocumentFingerprint fingerprint = makeFingerprint(candidate.uniqueIdentifier());

        // Capturing repos to inspect what was persisted
        BatchCapturingDocumentRecordRepository recordRepo = new BatchCapturingDocumentRecordRepository();
        BatchCapturingProcessingAttemptRepository attemptRepo = new BatchCapturingProcessingAttemptRepository();
        BatchCapturingUnitOfWorkPort unitOfWork = new BatchCapturingUnitOfWorkPort(recordRepo, attemptRepo);

        // Repo returns SUCCESS for this fingerprint
        DocumentRecord successRecord = new DocumentRecord(
                fingerprint, new SourceDocumentLocator("/tmp/already-done.pdf"), "already-done.pdf",
                ProcessingStatus.SUCCESS, FailureCounters.zero(),
                null, java.time.Instant.now(), java.time.Instant.now(), java.time.Instant.now(),
                "/target", "2026-01-15 - Rechnung.pdf");
        recordRepo.setLookupResult(new DocumentTerminalSuccess(successRecord));

        DocumentProcessingCoordinator realCoordinator = new DocumentProcessingCoordinator(
                recordRepo, attemptRepo, unitOfWork,
                new NoOpTargetFolderPort(), new NoOpTargetFileCopyPort(), new NoOpProcessingLogger(), 3,
                "openai-compatible");

        // Fingerprint port returns the pre-defined fingerprint for this candidate
        FingerprintPort fixedFingerprintPort = c -> new FingerprintSuccess(fingerprint);

        DefaultBatchRunProcessingUseCase useCase = buildUseCase(
                config, lockPort, new FixedCandidatesPort(List.of(candidate)), new NoOpExtractionPort(),
                fixedFingerprintPort, realCoordinator);
        BatchRunContext context = new BatchRunContext(new RunId("skip-success-run"), Instant.now());

        useCase.execute(context);

        // Exactly one skip attempt must be recorded
        assertEquals(1, attemptRepo.savedAttempts.size(),
                "Exactly one skip attempt must be historised for a SUCCESS document");
        assertEquals(ProcessingStatus.SKIPPED_ALREADY_PROCESSED, attemptRepo.savedAttempts.get(0).status(),
                "Skip attempt status must be SKIPPED_ALREADY_PROCESSED");
        assertFalse(attemptRepo.savedAttempts.get(0).retryable(),
                "Skip attempt must not be retryable");
    }
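The terminal-skip rule these integration tests exercise reduces to a small decision function. The sketch below is hypothetical and simplified: plain strings stand in for the project's DocumentRecordLookupResult and ProcessingStatus types, which are not reproduced here.

```java
// Hypothetical sketch of the terminal-skip rule; names are illustrative,
// not the project's real API.
public class SkipDecisionSketch {

    /** Maps a stored terminal status to the skip status historised for it. */
    static String skipStatusFor(String storedStatus) {
        if ("SUCCESS".equals(storedStatus)) {
            return "SKIPPED_ALREADY_PROCESSED"; // pipeline must not run again
        }
        if ("FAILED_FINAL".equals(storedStatus)) {
            return "SKIPPED_FINAL_FAILURE"; // permanently failed, never retried
        }
        return "PROCESS"; // not terminal: run the normal pipeline
    }

    public static void main(String[] args) {
        System.out.println(skipStatusFor("SUCCESS"));      // SKIPPED_ALREADY_PROCESSED
        System.out.println(skipStatusFor("FAILED_FINAL")); // SKIPPED_FINAL_FAILURE
        System.out.println(skipStatusFor("UNKNOWN"));      // PROCESS
    }
}
```

Either terminal branch yields exactly one non-retryable skip attempt, which is what the assertions on `savedAttempts` verify.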

    /**
     * Full batch integration: a candidate whose fingerprint maps to a FAILED_FINAL record
     * must be historised as SKIPPED_FINAL_FAILURE without triggering the pipeline.
     */
    @Test
    void execute_finallyFailedDocument_batchRunHistorisesSkippedFinalFailure() throws Exception {
        MockRunLockPort lockPort = new MockRunLockPort();
        RuntimeConfiguration config = buildConfig(tempDir);

        SourceDocumentCandidate candidate = makeCandidate("permanent-failure.pdf");
        DocumentFingerprint fingerprint = makeFingerprint(candidate.uniqueIdentifier());

        BatchCapturingDocumentRecordRepository recordRepo = new BatchCapturingDocumentRecordRepository();
        BatchCapturingProcessingAttemptRepository attemptRepo = new BatchCapturingProcessingAttemptRepository();
        BatchCapturingUnitOfWorkPort unitOfWork = new BatchCapturingUnitOfWorkPort(recordRepo, attemptRepo);

        // Repo returns FAILED_FINAL for this fingerprint
        DocumentRecord failedFinalRecord = new DocumentRecord(
                fingerprint, new SourceDocumentLocator("/tmp/permanent-failure.pdf"), "permanent-failure.pdf",
                ProcessingStatus.FAILED_FINAL, new FailureCounters(2, 0),
                java.time.Instant.now(), null, java.time.Instant.now(), java.time.Instant.now(),
                null, null);
        recordRepo.setLookupResult(new DocumentTerminalFinalFailure(failedFinalRecord));

        DocumentProcessingCoordinator realCoordinator = new DocumentProcessingCoordinator(
                recordRepo, attemptRepo, unitOfWork,
                new NoOpTargetFolderPort(), new NoOpTargetFileCopyPort(), new NoOpProcessingLogger(), 3,
                "openai-compatible");

        FingerprintPort fixedFingerprintPort = c -> new FingerprintSuccess(fingerprint);

        DefaultBatchRunProcessingUseCase useCase = buildUseCase(
                config, lockPort, new FixedCandidatesPort(List.of(candidate)), new NoOpExtractionPort(),
                fixedFingerprintPort, realCoordinator);
        BatchRunContext context = new BatchRunContext(new RunId("skip-final-failure-run"), Instant.now());

        useCase.execute(context);

        // Exactly one skip attempt must be recorded
        assertEquals(1, attemptRepo.savedAttempts.size(),
                "Exactly one skip attempt must be historised for a FAILED_FINAL document");
        assertEquals(ProcessingStatus.SKIPPED_FINAL_FAILURE, attemptRepo.savedAttempts.get(0).status(),
                "Skip attempt status must be SKIPPED_FINAL_FAILURE");
        assertFalse(attemptRepo.savedAttempts.get(0).retryable(),
                "Skip attempt must not be retryable");
    }

    /**
     * Full batch integration: a batch with one terminal (SUCCESS) and one processable document
     * handles each independently — the terminal document is skipped, the processable one is processed.
     * Document errors on one candidate must not block processing of the other.
     */
    @Test
    void execute_mixedTerminalAndProcessable_eachHandledIndependentlyInBatch() throws Exception {
        MockRunLockPort lockPort = new MockRunLockPort();
        RuntimeConfiguration config = buildConfig(tempDir);

        SourceDocumentCandidate terminalCandidate = makeCandidate("terminal.pdf");
        SourceDocumentCandidate processableCandidate = makeCandidate("processable.pdf");

        DocumentFingerprint terminalFp = makeFingerprint(terminalCandidate.uniqueIdentifier());
        DocumentFingerprint processableFp = makeFingerprint(processableCandidate.uniqueIdentifier());

        BatchCapturingDocumentRecordRepository recordRepo = new BatchCapturingDocumentRecordRepository();
        BatchCapturingProcessingAttemptRepository attemptRepo = new BatchCapturingProcessingAttemptRepository();
        BatchCapturingUnitOfWorkPort unitOfWork = new BatchCapturingUnitOfWorkPort(recordRepo, attemptRepo);

        // Terminal candidate: SUCCESS in repo
        DocumentRecord successRecord = new DocumentRecord(
                terminalFp, new SourceDocumentLocator("/tmp/terminal.pdf"), "terminal.pdf",
                ProcessingStatus.SUCCESS, FailureCounters.zero(),
                null, java.time.Instant.now(), java.time.Instant.now(), java.time.Instant.now(),
                "/target", "2026-01-15 - Rechnung.pdf");

        // Per-fingerprint lookup: terminal gets SUCCESS, processable gets Unknown
        recordRepo.setLookupByFingerprint(terminalFp, new DocumentTerminalSuccess(successRecord));
        recordRepo.setLookupByFingerprint(processableFp, new DocumentUnknown());

        DocumentProcessingCoordinator realCoordinator = new DocumentProcessingCoordinator(
                recordRepo, attemptRepo, unitOfWork,
                new NoOpTargetFolderPort(), new NoOpTargetFileCopyPort(), new NoOpProcessingLogger(), 3,
                "openai-compatible");

        FingerprintPort perCandidateFingerprintPort = candidate -> {
            if (candidate.uniqueIdentifier().equals("terminal.pdf")) return new FingerprintSuccess(terminalFp);
            return new FingerprintSuccess(processableFp);
        };

        FixedCandidatesPort candidatesPort = new FixedCandidatesPort(
                List.of(terminalCandidate, processableCandidate));
        FixedExtractionPort extractionPort = new FixedExtractionPort(
                new PdfExtractionSuccess("Invoice text", new PdfPageCount(1)));

        DefaultBatchRunProcessingUseCase useCase = buildUseCase(
                config, lockPort, candidatesPort, extractionPort,
                perCandidateFingerprintPort, realCoordinator);
        BatchRunContext context = new BatchRunContext(new RunId("mixed-batch-run"), Instant.now());

        useCase.execute(context);

        // Two attempts must be recorded (one skip + one pipeline attempt)
        assertEquals(2, attemptRepo.savedAttempts.size(),
                "Two attempts must be recorded: one skip and one pipeline attempt");

        // First attempt: skip for the terminal candidate
        assertEquals(ProcessingStatus.SKIPPED_ALREADY_PROCESSED, attemptRepo.savedAttempts.get(0).status(),
                "First attempt must be SKIPPED_ALREADY_PROCESSED for the terminal candidate");

        // Second attempt: pipeline result for the processable candidate (AI fails → FAILED_RETRYABLE transient)
        assertNotEquals(ProcessingStatus.SKIPPED_ALREADY_PROCESSED, attemptRepo.savedAttempts.get(1).status(),
                "Second attempt must not be a skip — it must be the pipeline result for the processable candidate");
    }

    // -------------------------------------------------------------------------
    // Log correlation tests
    // -------------------------------------------------------------------------

    @Test
    void execute_preFingerprintError_logContainsRunIdAndCandidateDescription() throws Exception {
        // When fingerprint computation fails, the warning log must reference both the run-ID
        // and the candidate's unique identifier (pre-fingerprint correlation rule).
        String runIdValue = "run-correlation-pre-fp";
        String candidateFilename = "unreadable-candidate.pdf";

        MessageCapturingProcessingLogger capturingLogger = new MessageCapturingProcessingLogger();
        RuntimeConfiguration config = buildConfig(tempDir);

        FixedCandidatesPort candidatesPort = new FixedCandidatesPort(
                List.of(makeCandidate(candidateFilename)));

        // Fingerprint port that always fails
        FingerprintPort failingFingerprintPort = c ->
                new FingerprintTechnicalError("File not readable", null);

        DefaultBatchRunProcessingUseCase useCase = new DefaultBatchRunProcessingUseCase(
                config, new MockRunLockPort(), candidatesPort, new NoOpExtractionPort(),
                failingFingerprintPort, new NoOpDocumentProcessingCoordinator(),
                buildStubAiNamingService(), capturingLogger);

        useCase.execute(new BatchRunContext(new RunId(runIdValue), Instant.now()));

        // At least one warning message must contain both run-ID and candidate filename
        boolean correlationPresent = capturingLogger.warnMessages.stream()
                .anyMatch(m -> m.contains(runIdValue) && m.contains(candidateFilename));
        assertTrue(correlationPresent,
                "Pre-fingerprint warning must reference both run-ID '" + runIdValue
                        + "' and candidate '" + candidateFilename + "'. "
                        + "Captured warn messages: " + capturingLogger.warnMessages);
    }

    @Test
    void execute_postFingerprintProcessing_logContainsFingerprintHex() throws Exception {
        // After a successful fingerprint computation, at least one log message must contain
        // the fingerprint's SHA-256 hex value (post-fingerprint correlation rule).
        String candidateFilename = "identifiable.pdf";

        MessageCapturingProcessingLogger capturingLogger = new MessageCapturingProcessingLogger();
        RuntimeConfiguration config = buildConfig(tempDir);

        SourceDocumentCandidate candidate = makeCandidate(candidateFilename);
        FixedCandidatesPort candidatesPort = new FixedCandidatesPort(List.of(candidate));
        FixedExtractionPort extractionPort = new FixedExtractionPort(
                new PdfExtractionSuccess("Some invoice text", new PdfPageCount(1)));

        // Deterministic fingerprint port so we can verify the exact hex in the log
        AlwaysSuccessFingerprintPort fingerprintPort = new AlwaysSuccessFingerprintPort();
        DocumentFingerprint expectedFingerprint = ((FingerprintSuccess) fingerprintPort.computeFingerprint(candidate)).fingerprint();

        DefaultBatchRunProcessingUseCase useCase = new DefaultBatchRunProcessingUseCase(
                config, new MockRunLockPort(), candidatesPort, extractionPort,
                fingerprintPort, new TrackingDocumentProcessingCoordinator(),
                buildStubAiNamingService(), capturingLogger);

        useCase.execute(new BatchRunContext(new RunId("run-correlation-post-fp"), Instant.now()));

        String fingerprintHex = expectedFingerprint.sha256Hex();
        boolean fingerprintInLog = capturingLogger.allMessages().stream()
                .anyMatch(m -> m.contains(fingerprintHex));
        assertTrue(fingerprintInLog,
                "At least one log message must contain the fingerprint hex '" + fingerprintHex
                        + "' after successful fingerprint computation. "
                        + "Captured messages: " + capturingLogger.allMessages());
    }

    // -------------------------------------------------------------------------
    // Helpers
    // -------------------------------------------------------------------------
@@ -779,8 +997,8 @@ class BatchRunProcessingUseCaseTest {
    }

    private static RuntimeConfiguration buildConfig(Path tempDir) throws Exception {
-        // maxPages set to 3 – useful for page-limit tests
-        return new RuntimeConfiguration(3);
+        // maxPages set to 3 – useful for page-limit tests; maxRetriesTransient set to 3
+        return new RuntimeConfiguration(3, 3, AiContentSensitivity.PROTECT_SENSITIVE_CONTENT);
    }

    private static SourceDocumentCandidate makeCandidate(String filename) {
@@ -937,7 +1155,8 @@ class BatchRunProcessingUseCaseTest {
    private static class NoOpDocumentProcessingCoordinator extends DocumentProcessingCoordinator {
        NoOpDocumentProcessingCoordinator() {
            super(new NoOpDocumentRecordRepository(), new NoOpProcessingAttemptRepository(), new NoOpUnitOfWorkPort(),
-                    new NoOpTargetFolderPort(), new NoOpTargetFileCopyPort(), new NoOpProcessingLogger());
+                    new NoOpTargetFolderPort(), new NoOpTargetFileCopyPort(), new NoOpProcessingLogger(), 3,
+                    "openai-compatible");
        }
    }

@@ -949,7 +1168,8 @@ class BatchRunProcessingUseCaseTest {

        TrackingDocumentProcessingCoordinator() {
            super(new NoOpDocumentRecordRepository(), new NoOpProcessingAttemptRepository(), new NoOpUnitOfWorkPort(),
-                    new NoOpTargetFolderPort(), new NoOpTargetFileCopyPort(), new NoOpProcessingLogger());
+                    new NoOpTargetFolderPort(), new NoOpTargetFileCopyPort(), new NoOpProcessingLogger(), 3,
+                    "openai-compatible");
        }

        @Override
@@ -1083,6 +1303,11 @@ class BatchRunProcessingUseCaseTest {
            // No-op
        }

+        @Override
+        public void debugSensitiveAiContent(String message, Object... args) {
+            // No-op: sensitivity is controlled by the Log4jProcessingLogger adapter
+        }
+
        @Override
        public void warn(String message, Object... args) {
            // No-op
@@ -1094,10 +1319,176 @@ class BatchRunProcessingUseCaseTest {
        }
    }

    /**
     * Captures formatted log messages for each log level.
     * Used by log-correlation tests that must inspect message content.
     */
    private static class MessageCapturingProcessingLogger implements ProcessingLogger {
        final List<String> infoMessages = new ArrayList<>();
        final List<String> debugMessages = new ArrayList<>();
        final List<String> debugSensitiveAiContentMessages = new ArrayList<>();
        final List<String> warnMessages = new ArrayList<>();
        final List<String> errorMessages = new ArrayList<>();

        /** Formats a message template with its arguments the same way SLF4J/Log4j2 does. */
        private static String format(String message, Object... args) {
            if (args == null || args.length == 0) return message;
            StringBuilder sb = new StringBuilder();
            int argIndex = 0;
            int start = 0;
            int pos;
            while ((pos = message.indexOf("{}", start)) != -1 && argIndex < args.length) {
                sb.append(message, start, pos);
                sb.append(args[argIndex++]);
                start = pos + 2;
            }
            sb.append(message, start, message.length());
            return sb.toString();
        }

        @Override
        public void info(String message, Object... args) {
            infoMessages.add(format(message, args));
        }

        @Override
        public void debug(String message, Object... args) {
            debugMessages.add(format(message, args));
        }

        @Override
        public void debugSensitiveAiContent(String message, Object... args) {
            debugSensitiveAiContentMessages.add(format(message, args));
        }

        @Override
        public void warn(String message, Object... args) {
            warnMessages.add(format(message, args));
        }

        @Override
        public void error(String message, Object... args) {
            errorMessages.add(format(message, args));
        }

        List<String> allMessages() {
            List<String> all = new ArrayList<>();
            all.addAll(infoMessages);
            all.addAll(debugMessages);
            all.addAll(debugSensitiveAiContentMessages);
            all.addAll(warnMessages);
            all.addAll(errorMessages);
            return all;
        }
    }
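The `format` helper in MessageCapturingProcessingLogger reimplements SLF4J-style `{}` placeholder substitution so the capturing logger stores the same strings a real logger would emit. A standalone copy of that helper demonstrates the behavior, including the surplus-placeholder edge case:

```java
// Standalone demo of the "{}" placeholder formatting used by the
// capturing test logger above.
public class PlaceholderFormatDemo {

    /** Replaces each "{}" with the next argument; surplus placeholders stay literal. */
    static String format(String message, Object... args) {
        if (args == null || args.length == 0) return message;
        StringBuilder sb = new StringBuilder();
        int argIndex = 0;
        int start = 0;
        int pos;
        while ((pos = message.indexOf("{}", start)) != -1 && argIndex < args.length) {
            sb.append(message, start, pos);   // copy text before the placeholder
            sb.append(args[argIndex++]);      // substitute the next argument
            start = pos + 2;                  // continue after "{}"
        }
        sb.append(message, start, message.length());
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(format("run {}: {} candidates", "run-1", 7));
        // → run run-1: 7 candidates
        System.out.println(format("a {} and {}", "x"));
        // → a x and {}  (second placeholder stays literal: only one argument)
    }
}
```

Capturing the formatted string, rather than the raw template, is what lets the correlation tests assert that a run-ID or fingerprint hex actually appears in the emitted message.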

    /**
     * DocumentRecordRepository for batch integration tests.
     * Supports per-fingerprint lookup results and records written records.
     */
    private static class BatchCapturingDocumentRecordRepository implements DocumentRecordRepository {
        private final Map<String, DocumentRecordLookupResult> lookupByFp = new java.util.HashMap<>();
        private DocumentRecordLookupResult defaultResult = new DocumentUnknown();
        final List<DocumentRecord> createdRecords = new ArrayList<>();
        final List<DocumentRecord> updatedRecords = new ArrayList<>();

        void setLookupResult(DocumentRecordLookupResult result) {
            this.defaultResult = result;
        }

        void setLookupByFingerprint(DocumentFingerprint fp, DocumentRecordLookupResult result) {
            lookupByFp.put(fp.sha256Hex(), result);
        }

        @Override
        public DocumentRecordLookupResult findByFingerprint(DocumentFingerprint fingerprint) {
            return lookupByFp.getOrDefault(fingerprint.sha256Hex(), defaultResult);
        }

        @Override
        public void create(DocumentRecord record) {
            createdRecords.add(record);
        }

        @Override
        public void update(DocumentRecord record) {
            updatedRecords.add(record);
        }
    }

    /**
     * ProcessingAttemptRepository for batch integration tests. Records all saved attempts.
     */
    private static class BatchCapturingProcessingAttemptRepository implements ProcessingAttemptRepository {
        final List<ProcessingAttempt> savedAttempts = new ArrayList<>();

        @Override
        public int loadNextAttemptNumber(DocumentFingerprint fingerprint) {
            return savedAttempts.size() + 1;
        }

        @Override
        public void save(ProcessingAttempt attempt) {
            savedAttempts.add(attempt);
        }

        @Override
        public List<ProcessingAttempt> findAllByFingerprint(DocumentFingerprint fingerprint) {
            return savedAttempts.stream()
                    .filter(a -> a.fingerprint().sha256Hex().equals(fingerprint.sha256Hex()))
                    .toList();
        }

        @Override
        public ProcessingAttempt findLatestProposalReadyAttempt(DocumentFingerprint fingerprint) {
            return savedAttempts.stream()
                    .filter(a -> a.fingerprint().sha256Hex().equals(fingerprint.sha256Hex())
                            && a.status() == ProcessingStatus.PROPOSAL_READY)
                    .reduce((first, second) -> second)
                    .orElse(null);
        }
    }

    /**
     * UnitOfWorkPort for batch integration tests. Executes operations directly
     * against the capturing repos.
     */
    private static class BatchCapturingUnitOfWorkPort implements UnitOfWorkPort {
        private final BatchCapturingDocumentRecordRepository recordRepo;
        private final BatchCapturingProcessingAttemptRepository attemptRepo;

        BatchCapturingUnitOfWorkPort(BatchCapturingDocumentRecordRepository recordRepo,
                BatchCapturingProcessingAttemptRepository attemptRepo) {
            this.recordRepo = recordRepo;
            this.attemptRepo = attemptRepo;
        }

        @Override
        public void executeInTransaction(Consumer<TransactionOperations> operations) {
            operations.accept(new TransactionOperations() {
                @Override
                public void saveProcessingAttempt(ProcessingAttempt attempt) {
                    attemptRepo.save(attempt);
                }

                @Override
                public void createDocumentRecord(DocumentRecord record) {
                    recordRepo.create(record);
                }

                @Override
                public void updateDocumentRecord(DocumentRecord record) {
                    recordRepo.update(record);
                }
            });
        }
    }
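BatchCapturingUnitOfWorkPort follows the callback-style unit-of-work pattern: the port hands the caller's lambda a scoped operations object, so all writes flow through one transactional boundary. A minimal generic sketch of that shape, with purely illustrative names (TxOps and InMemoryUnitOfWork are not part of the project):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of the callback-style unit-of-work pattern.
public class UnitOfWorkSketch {

    /** Scoped operations handed to the caller's lambda (functional interface). */
    interface TxOps {
        void save(String row);
    }

    static class InMemoryUnitOfWork {
        final List<String> committed = new ArrayList<>();

        void executeInTransaction(Consumer<TxOps> operations) {
            List<String> buffer = new ArrayList<>();
            operations.accept(buffer::add); // all writes go through the scoped TxOps
            committed.addAll(buffer);       // "commit" only after the lambda completes
        }
    }

    public static void main(String[] args) {
        InMemoryUnitOfWork uow = new InMemoryUnitOfWork();
        uow.executeInTransaction(tx -> {
            tx.save("attempt#1");
            tx.save("record#1");
        });
        System.out.println(uow.committed); // [attempt#1, record#1]
    }
}
```

The test double in the diff skips the buffering and writes straight through to the capturing repos, which is exactly what an integration test wants: every write the coordinator makes becomes inspectable.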

    /** Zählt Logger-Aufrufe je Level, um VoidMethodCallMutator-Mutationen zu erkennen. */
    private static class CapturingProcessingLogger implements ProcessingLogger {
        int infoCallCount = 0;
        int debugCallCount = 0;
        int debugSensitiveAiContentCallCount = 0;
        int warnCallCount = 0;
        int errorCallCount = 0;

@@ -1111,6 +1502,11 @@ class BatchRunProcessingUseCaseTest {
            debugCallCount++;
        }

+        @Override
+        public void debugSensitiveAiContent(String message, Object... args) {
+            debugSensitiveAiContentCallCount++;
+        }
+
        @Override
        public void warn(String message, Object... args) {
            warnCallCount++;
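CapturingProcessingLogger counts calls per log level so that PIT's VoidMethodCallMutator, which deletes calls to void methods, cannot survive: if a mutation removes a logging call, the counter assertion fails. A minimal, self-contained illustration of that idea (all names here are hypothetical):

```java
// Hypothetical sketch of the call-counting pattern used to kill
// void-method-call mutations in mutation testing.
public class CallCountingSketch {

    interface Logger {
        void info(String msg);
    }

    static class CountingLogger implements Logger {
        int infoCalls = 0;

        @Override
        public void info(String msg) {
            infoCalls++; // only the fact that the call happened matters
        }
    }

    // Stand-in for production code: a mutator deleting the logger.info(...)
    // call would be detected by an assertion on the counter.
    static void processCandidate(Logger logger) {
        logger.info("candidate processed");
    }

    public static void main(String[] args) {
        CountingLogger logger = new CountingLogger();
        processCandidate(logger);
        System.out.println(logger.infoCalls); // 1
    }
}
```

This is also why the diff relaxes the exact expected counts to `>=` bounds: the assertions only need to prove that the specific logging call is present, not pin the total number of log lines.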
@@ -62,6 +62,11 @@
        <artifactId>mockito-junit-jupiter</artifactId>
        <scope>test</scope>
    </dependency>
+    <dependency>
+        <groupId>org.assertj</groupId>
+        <artifactId>assertj-core</artifactId>
+        <scope>test</scope>
+    </dependency>
</dependencies>

<build>
@@ -0,0 +1,62 @@
package de.gecheckt.pdf.umbenenner.bootstrap;

import java.util.Objects;

import de.gecheckt.pdf.umbenenner.adapter.out.ai.AnthropicClaudeHttpAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.ai.OpenAiHttpAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException;
import de.gecheckt.pdf.umbenenner.application.config.provider.AiProviderFamily;
import de.gecheckt.pdf.umbenenner.application.config.provider.ProviderConfiguration;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationPort;

/**
 * Selects and instantiates the active {@link AiInvocationPort} implementation
 * based on the configured provider family.
 * <p>
 * This component lives in the bootstrap layer and is the single point where
 * the active provider family is mapped to its corresponding adapter implementation.
 * Exactly one provider is selected per application run; the selection is driven
 * by the value of {@code ai.provider.active}.
 *
 * <h2>Registered providers</h2>
 * <ul>
 *   <li>{@link AiProviderFamily#OPENAI_COMPATIBLE} — {@link OpenAiHttpAdapter}</li>
 *   <li>{@link AiProviderFamily#CLAUDE} — {@link AnthropicClaudeHttpAdapter}</li>
 * </ul>
 *
 * <h2>Hard start failure</h2>
 * <p>
 * If the requested provider family has no registered implementation, an
 * {@link InvalidStartConfigurationException} is thrown immediately, which the
 * bootstrap runner maps to exit code 1.
 */
public class AiProviderSelector {

    /**
     * Selects and constructs the {@link AiInvocationPort} implementation for the given
     * provider family using the supplied provider configuration.
     *
     * @param family the active provider family; must not be {@code null}
     * @param config the configuration for the active provider; must not be {@code null}
     * @return the constructed adapter instance; never {@code null}
     * @throws InvalidStartConfigurationException if no implementation is registered
     *         for the requested provider family
     */
    public AiInvocationPort select(AiProviderFamily family, ProviderConfiguration config) {
        Objects.requireNonNull(family, "provider family must not be null");
        Objects.requireNonNull(config, "provider configuration must not be null");

        if (family == AiProviderFamily.OPENAI_COMPATIBLE) {
            return new OpenAiHttpAdapter(config);
        }

        if (family == AiProviderFamily.CLAUDE) {
            return new AnthropicClaudeHttpAdapter(config);
        }

        throw new InvalidStartConfigurationException(
                "No AI adapter implementation registered for provider family: "
                        + family.getIdentifier()
                        + ". Supported in the current build: openai-compatible, claude");
    }
}
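The selection logic in AiProviderSelector is a total mapping from provider family to adapter, with a hard failure for any unregistered family. Its shape can be reduced to the following sketch; the enum values, adapter names, and exception type here are illustrative stand-ins, not the project's real classes:

```java
// Hypothetical sketch of one-of-N provider selection with hard start failure.
public class ProviderSelectionSketch {

    enum Family { OPENAI_COMPATIBLE, CLAUDE, UNSUPPORTED_EXAMPLE }

    /** Returns an adapter name for the family, or throws for unregistered families. */
    static String select(Family family) {
        switch (family) {
            case OPENAI_COMPATIBLE:
                return "OpenAiHttpAdapter";
            case CLAUDE:
                return "AnthropicClaudeHttpAdapter";
            default:
                // hard start failure: the bootstrap runner maps this to exit code 1
                throw new IllegalStateException(
                        "No AI adapter implementation registered for provider family: " + family);
        }
    }

    public static void main(String[] args) {
        System.out.println(select(Family.OPENAI_COMPATIBLE)); // OpenAiHttpAdapter
        System.out.println(select(Family.CLAUDE));            // AnthropicClaudeHttpAdapter
        try {
            select(Family.UNSUPPORTED_EXAMPLE);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing at construction time rather than at first use is the key design choice: a misconfigured `ai.provider.active` value is caught before any document is touched.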