Compare commits
93 commits
5441d15b41...main
| SHA1 | Author | Date | |
|---|---|---|---|
| 3f1d50d356 | |||
| ca91749a04 | |||
| 57ea9cf649 | |||
| 9c8ba2170e | |||
| b13d8ba0e1 | |||
| 7b7af28d12 | |||
| f4bf76652a | |||
| 67ab91cd70 | |||
| 4a21b23312 | |||
| cd1deb9f92 | |||
| 8fd9e350e5 | |||
| 5099ff4aca | |||
| 39800b6ea8 | |||
| 0e65ae32ff | |||
| a51fcf7055 | |||
| 9c2a205137 | |||
| 559b051ab3 | |||
| 03689802dd | |||
| d61316c699 | |||
| a3f47ba560 | |||
| 8d915e7ded | |||
| e91cfb9ec2 | |||
| a5d687d625 | |||
| cab9fed5b0 | |||
| f2bbc8a884 | |||
| c7818ce920 | |||
| ac3662e758 | |||
| 788f6110d4 | |||
| e9e9b2d17a | |||
| ffd91c766d | |||
| 7e4193a173 | |||
| df0a3ad07b | |||
| 7e4201b651 | |||
| f81f30c7ea | |||
| 2dc07d16d5 | |||
| 8bcd80d70a | |||
| 9874fdb1ba | |||
| 506f5ac32e | |||
| 0246699e77 | |||
| 167b56bec5 | |||
| d8d7657a29 | |||
| ab267d5df4 | |||
| 3a772c20c0 | |||
| 9ea6c3aaa5 | |||
| cd5b6253df | |||
| c15fb6b18d | |||
| c77a6f06af | |||
| cd2389f3e1 | |||
| d1dfc75d4e | |||
| ac02057991 | |||
| 2d7be60057 | |||
| efc13d841e | |||
| 707364d912 | |||
| e9bf9231e3 | |||
| 7bac60c66c | |||
| b5db3fb361 | |||
| 6437ef38af | |||
| 00daa9cb74 | |||
| 9fd6bc469d | |||
| 8f1e41c1a6 | |||
| ca17e0a082 | |||
| 8278a16bbb | |||
| 9ddb32912c | |||
| 94728c270f | |||
| 5b95cc2561 | |||
| 3657b0c3de | |||
| 7764a50308 | |||
| 8e6d745e4b | |||
| deaa8c9fa3 | |||
| 73824544b6 | |||
| 9f4449546d | |||
| dd5082bfef | |||
| efb4d0b222 | |||
| 3ab10a89f0 | |||
| cb7ed57721 | |||
| 3e65eff6e6 | |||
| 3a14bcb0d0 | |||
| 9ba29aaba5 | |||
| 62f9542e50 | |||
| c3d207b742 | |||
| 326e739e45 | |||
| a35ac5c8f1 | |||
| 049aa361db | |||
| 30f070f2a6 | |||
| d61299f892 | |||
| fc30d1effd | |||
| ca91a60cad | |||
| 00c4cf1e5c | |||
| 8ee4041feb | |||
| 29ea56d2cf | |||
| 6a44def89b | |||
| cae9c944d7 | |||
| 7448d1340b | |||
.claude/.gitignore (vendored, new file, +1)
@@ -0,0 +1 @@
/settings.local.json
@@ -1,24 +0,0 @@
{
  "permissions": {
    "allow": [
      "Bash(xargs grep:*)",
      "Bash(xargs wc:*)",
      "Bash(mvn clean:*)",
      "Bash(mvn verify:*)",
      "Bash(mvn test:*)",
      "Bash(find D:/Dev/Projects/pdf-umbenenner-parent -not -path */target/* -type d)",
      "Bash(mvn -pl pdf-umbenenner-adapter-out clean compile)",
      "Bash(mvn dependency:tree -pl pdf-umbenenner-adapter-out)",
      "Bash(mvn -pl pdf-umbenenner-domain clean compile)",
      "Bash(mvn help:describe -Dplugin=org.apache.pdfbox:pdfbox -Ddetail=false)",
      "Bash(cd /d D:/Dev/Projects/pdf-umbenenner-parent)",
      "Bash(mvn -v)",
      "Bash(grep -E \"\\\\.java$\")",
      "Bash(grep \"\\\\.java$\")",
      "Bash(mvn -q clean compile -DskipTests)",
      "Bash(mvn -q test)",
      "Bash(mvn -q clean test)",
      "Bash(./mvnw.cmd:*)"
    ]
  }
}
.gitignore (vendored, 2 changed lines)
@@ -72,3 +72,5 @@ Desktop.ini
hs_err_pid*
replay_pid*
/review-input.zip
/run-milestone.ps1
/run-v11.ps1
CLAUDE.md (271 changed lines)
@@ -6,7 +6,6 @@ This repository implements a locally launched **PDF renamer with AI**

## Authoritative documents
@docs/specs/technik-und-architektur.md
@docs/specs/fachliche-anforderungen.md
@docs/specs/meilensteine.md

For the implementation, the currently active work package under `docs/workpackages/` is always authoritative as well.
Do not guess when documents are missing, unclear, or contradictory.
@@ -16,21 +15,17 @@ The documents have the following fixed meaning:

- `docs/specs/technik-und-architektur.md` = binding technical target architecture
- `docs/specs/fachliche-anforderungen.md` = binding business rules
- `docs/specs/meilensteine.md` = permitted feature scope per milestone
- `docs/workpackages/...` = binding scope, order, and content of the currently processed work package

In case of conflicts, the following priority applies:

1. **Technology and architecture document**
   Binding technical target architecture. Architecture violations are not permitted.

2. **Business requirements**
   Binding business rules and target business behavior.

3. **Milestones**
   Limit the permissible feature scope to the current development stage.

4. **Work packages**
   Define the concretely permitted implementation scope of the current step.

If documents are missing, unclear, or contradictory, do not guess and do not make silent assumptions.
@@ -46,8 +41,11 @@ If documents are missing, unclear, or contradictory, do not guess and do not make silent assumptions.

- no internal scheduler
- Log4j2 for logging
- SQLite as local persistence store
- OpenAI-compatible HTTP interface for AI access
- API provider, base URL, and model name are **configuration**, not an architecture decision
- AI integration via exactly **one** of the two supported provider families:
  - **OpenAI-compatible HTTP interface** (chat-completions style)
  - **native Anthropic Messages API** (Claude models)
- Exactly one provider is active per run. No fallback, no parallel use.
- The concrete provider family, base URL, and model name are **configuration**, not an architecture decision.

## Binding module structure
- `pdf-umbenenner-domain`
@@ -66,11 +64,16 @@ If documents are missing, unclear, or contradictory, do not guess and do not make silent assumptions.

- Adapters must not depend on each other directly
- No mixing of file system, PDF extraction, SQLite, AI HTTP, configuration, logging, naming logic, and retry decisions
- Logging is technical infrastructure, not a business port
- Port contracts contain neither `Path`/`File` nor NIO or JDBC types
- The `AiNamingPort` stays provider-neutral; provider-specific types, headers, URLs, and response structures live exclusively in the respective adapter-out implementation
- There is no shared "abstract AI adapter" layer between the port and the concrete adapters
- The bootstrap layer selects the **one** active `AiNamingPort` implementation based on the configuration

## Global business guardrails
- Target format: `YYYY-MM-DD - Titel.pdf`
- On name collisions: `YYYY-MM-DD - Titel(1).pdf`, `YYYY-MM-DD - Titel(2).pdf`, ...
- The **20 characters** apply only to the **base title**; the duplicate suffix does not count
- The duplicate suffix is appended immediately before `.pdf`
- Titles are **German**, understandable, unambiguous, and contain no special characters other than spaces
- Proper names remain unchanged
- Date determination follows the priority from the business requirements; if no reliable date can be derived unambiguously, the **current date** is allowed as a fallback
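The collision rule above can be sketched as a small pure function. This is an illustrative sketch only; the class and method names are assumptions, not the project's actual identifiers.

```java
import java.util.Set;

public final class TargetFileNames {

    /** Builds "YYYY-MM-DD - Titel.pdf", appending "(n)" before ".pdf" on collision. */
    public static String resolve(String isoDate, String baseTitle, Set<String> existing) {
        String candidate = isoDate + " - " + baseTitle + ".pdf";
        int suffix = 1;
        while (existing.contains(candidate)) {
            // The duplicate suffix goes immediately before ".pdf" and does not
            // count toward the 20-character limit of the base title.
            candidate = isoDate + " - " + baseTitle + "(" + suffix++ + ").pdf";
        }
        return candidate;
    }
}
```

Passing the set of names already present in the target folder keeps the function free of file-system access, which matches the port/adapter separation.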
@@ -81,34 +84,261 @@ If documents are missing, unclear, or contradictory, do not guess and do not make silent assumptions.

- Identification is **not** based on file names
- Source files are **never** overwritten, modified, moved, or deleted

## Active implementation state

The business and technical foundation is fully implemented, documented, tested (incl. PIT mutation tests, smoke tests, end-to-end tests), and released.

The active state is the extension **"Additional AI provider Anthropic Claude via the native Messages API"**. It is a deliberately minimal extension of the released baseline.

### Goal of the active extension
- The existing OpenAI-compatible AI path remains usable unchanged.
- In addition, the **native Anthropic Messages API** is integrated as a second, equally supported provider family.
- Exactly **one** provider is active per run, selected exclusively via configuration.
- **No** automatic fallback, **no** parallel use, **no** profile management.
- The business AI contract (`NamingProposal` from the application/domain perspective) remains unchanged.
- Existing properties files from the previous state are migrated to the new schema in a controlled way on first start; a `.bak` backup is created automatically beforehand.
- Architecture boundaries, persistence model, status semantics, retry semantics, exit-code behavior, and minimum logging scope remain unchanged; they are only extended by the provider identifier and the provider selection.
## Status semantics

| Status | Meaning |
|---|---|
| `READY_FOR_AI` | Processable, AI path not yet taken |
| `FAILED_RETRYABLE` | Processable, failed transiently |
| `PROPOSAL_READY` | Entry state for file-name building and target copy |
| `SUCCESS` | Terminal success; only permitted after target copy and consistent persistence |
| `FAILED_FINAL` | Terminal, never reprocessed on the business level |
| `SKIPPED_ALREADY_PROCESSED` | Historized skip for SUCCESS documents |
| `SKIPPED_FINAL_FAILURE` | Historized skip for FAILED_FINAL documents |

### SUCCESS condition (binding)
`SUCCESS` may only be set once:
1. the target copy has been written successfully,
2. the final target file name has been determined,
3. the persistence has been updated consistently.

### Leading source of the naming proposal (binding)
- The leading source for date, date source, validated title, and reasoning is the **newest attempt-history entry with status `PROPOSAL_READY`**.
- No reconstruction from the document master record.
- No new AI call if a usable `PROPOSAL_READY` attempt already exists.
- Status `PROPOSAL_READY` without a readable, consistent proposal attempt = document-level technical error.
- A proposal attempt with a business-level unusable title or date = inconsistent persistence state = document-level technical error.
- Inconsistent proposal states are **not silently healed**; they are treated as technical document errors.
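The status table can be captured as a Java enum with the two classifications the runs care about. A minimal sketch; the enum and method names are assumptions for illustration, not the project's code.

```java
public enum DocumentStatus {
    READY_FOR_AI,
    FAILED_RETRYABLE,
    PROPOSAL_READY,
    SUCCESS,
    FAILED_FINAL,
    SKIPPED_ALREADY_PROCESSED,
    SKIPPED_FINAL_FAILURE;

    /** Terminal states are never reprocessed; later runs only historize them as skips. */
    public boolean isTerminal() {
        return this == SUCCESS || this == FAILED_FINAL;
    }

    /** States that a new run may pick up for processing. */
    public boolean isProcessable() {
        return this == READY_FOR_AI || this == FAILED_RETRYABLE || this == PROPOSAL_READY;
    }
}
```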
## Retry semantics

### Deterministic content errors
Deterministic content errors are in particular:
- no usable text
- page limit exceeded
- business-level unusable or generic title
- present but unusable AI date

Rule:
- **first** historized deterministic content error → `FAILED_RETRYABLE`
- **second** historized deterministic content error → `FAILED_FINAL`

### Transient technical errors
- Transient errors run through the transient-error counter in the document master record.
- They stay retryable until the configured limit `max.retries.transient` is reached.
- The failed attempt that **reaches** the limit finalizes the document status to `FAILED_FINAL`.
- `max.retries.transient` = **integer >= 1**; the value `0` is invalid start configuration.
- Classification is provider-independent: technical errors from the active AI provider are placed in the same transient category as before. The inactive provider is never used as a backup in any error situation.

### Immediate technical retry
- **Exactly one** additional technical write attempt within the same document run.
- **Exclusively** for errors on the physical target-copy path.
- No new AI call, no new business derivation.
- Does **not** count toward the cross-run transient-error counter.
- Yields exactly one document-level result for persistence and status updates.

### Skip semantics
- `SUCCESS` → historize `SKIPPED_ALREADY_PROCESSED` in later runs, no counter change.
- `FAILED_FINAL` → historize `SKIPPED_FINAL_FAILURE` in later runs, no counter change.
- `FAILED_RETRYABLE`, `READY_FOR_AI`, `PROPOSAL_READY` → processable.
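The two counting rules above can be sketched as pure decision functions. The class, the method names, and the string-valued statuses are illustrative choices, not the project's real types.

```java
public final class RetryPolicy {

    /** The first historized deterministic content error stays retryable; the second is final. */
    public static String afterContentError(int previousContentErrors) {
        return previousContentErrors == 0 ? "FAILED_RETRYABLE" : "FAILED_FINAL";
    }

    /** Transient errors stay retryable until the attempt that reaches the configured limit. */
    public static String afterTransientError(int newTransientCount, int maxRetriesTransient) {
        if (maxRetriesTransient < 1) {
            // max.retries.transient = 0 is invalid start configuration
            throw new IllegalArgumentException("max.retries.transient must be >= 1");
        }
        return newTransientCount >= maxRetriesTransient ? "FAILED_FINAL" : "FAILED_RETRYABLE";
    }
}
```

Keeping both decisions free of I/O makes the asymmetry easy to test: content errors finalize on the second occurrence, transient errors on the attempt that reaches the configured limit.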
## Minimum logging scope

The following information must be logged traceably:
- run start with run ID
- active AI provider for the run
- run end
- detected source file
- skipping of already successful files
- skipping of finally failed files
- generated target name
- retry decision
- errors with classification

### Correlation rule
- Before successful fingerprint determination: correlation via run ID and candidate reference.
- After successful fingerprint determination: document-level logs contain the fingerprint or an unambiguously derivable reference.
- No new persistence truth and no additional tracking layer.

### Sensitivity rule for AI content
- Full raw AI response: **not** logged by default, stays in SQLite.
- Full AI `reasoning`: **not** logged by default, stays in SQLite.
- Enabled only via an explicit boolean configuration value.
- Default: safe / do not log.
- The sensitivity rule is provider-independent.
## Processing order per document

1. Compute fingerprint
2. Load document master record
3. Decide terminal skip cases (`SUCCESS` → `SKIPPED_ALREADY_PROCESSED`, `FAILED_FINAL` → `SKIPPED_FINAL_FAILURE`)
4. If needed: run the path up to `PROPOSAL_READY` (incl. AI call via the active provider)
5. Load the leading `PROPOSAL_READY` attempt
6. Build the final base file name
7. Determine the duplicate suffix in the target folder
8. Write the target copy (temporary file + final move/rename; on error: exactly one immediate retry)
9. Derive the retry decision
10. Historize a new attempt, update the master record consistently
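Step 1 relies on a content-based fingerprint. A minimal sketch, assuming SHA-256 over the raw file bytes; the documents only require an identity that is independent of the file name, so the concrete hash is an assumption here.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

public final class Fingerprints {

    /** Content-based identity over the raw file bytes; the file name plays no role. */
    public static String of(byte[] fileBytes) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(fileBytes);
            return HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is guaranteed on every JDK", e);
        }
    }
}
```

Because the fingerprint depends only on content, renaming or moving a source file cannot create a second identity, which is what makes the skip decisions in step 3 idempotent.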
## Target-copy semantics
- Copy first to a temporary target file in the target context
- Final move/rename to the planned target file name
- The source file always remains **unchanged**
- On a technical write error: exactly one immediate retry (target-copy path only)
- On a persistence error after a successful target copy: do not set `SUCCESS`, roll back the target copy best-effort, the result remains a document-level technical error
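The copy semantics can be sketched with Java NIO. The class name and the `.part` suffix for the temporary file are assumptions for illustration; the rollback handling on persistence errors is omitted.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public final class TargetCopy {

    /** Temporary file name used in the target context before the final rename. */
    static String tempNameFor(String finalName) {
        return finalName + ".part";
    }

    /** Copy to a temp file, then move to the final name; one immediate retry on error. */
    public static void copyThenRename(Path source, Path finalTarget) {
        try {
            attempt(source, finalTarget);
        } catch (IOException first) {
            // Exactly one immediate technical retry, only for the copy path;
            // no new AI call, no new business derivation.
            try {
                attempt(source, finalTarget);
            } catch (IOException second) {
                throw new UncheckedIOException(second);
            }
        }
    }

    private static void attempt(Path source, Path finalTarget) throws IOException {
        Path tmp = finalTarget.resolveSibling(tempNameFor(finalTarget.getFileName().toString()));
        Files.copy(source, tmp, StandardCopyOption.REPLACE_EXISTING);
        Files.move(tmp, finalTarget); // fails rather than overwriting an existing target
    }
}
```

The final `Files.move` deliberately omits `REPLACE_EXISTING`: collisions are already resolved via the duplicate suffix, so an existing target at this point is an error, never something to overwrite.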
## Error semantics
- Technical errors → `FAILED_RETRYABLE`, transient-error counter +1
- On reaching `max.retries.transient` → `FAILED_FINAL`
- No abort of the batch run for other documents
- No new final error category
- Pre-fingerprint errors are **not** historized as SQLite attempts
- Provider-specific error shapes (HTTP errors, auth errors, response-schema errors) are classified in the respective adapter and mapped onto the existing error categories. No new error classes are introduced.
## Persistence

The two-level model remains unchanged; there is no third source of truth.

**Document master record** contains, among other things:
- last target path, last target file name
- content-error and transient-error counters
- overall status

**Attempt history** contains, among other things:
- final target file name
- error class, error message, retryable flag
- **provider identifier of the AI provider active for the attempt**

**Invariant:** The leading `PROPOSAL_READY` attempt is never overwritten.
Each run creates an **additional** new attempt entry.

**Backward compatibility:** Existing data remains readable, updatable, and correctly interpretable. Schema extensions are additive, with defined default values for historical attempts without a provider identifier.
## Naming rule (binding for all work packages)
No milestone or work-package identifiers may appear in implementations, comments, or JavaDoc:

- Forbidden: `M1`, `M2`, …, `M8`
- Forbidden: `AP-001`, `AP-002`, … `AP-00x`
- Forbidden: version identifiers such as `V1.0`, `V1.1` in code/JavaDoc

Instead, **timeless technical names** are used.
Existing comments with such identifiers that are touched by your own changes must be replaced.
## Working method
- Always work only in the **explicitly active milestone** and the **explicitly active work package**
- **No anticipation** of later milestones or work packages
- Keep changes small, focused, and true to the architecture
- No unnecessary renames, no large-scale refactorings without need
- Before changing anything, first understand the affected files and dependencies
- **No assumptions about file paths.** Types and classes are found by searching for the type name, not via guessed paths.
- No guessing: with genuine ambiguity or document conflicts, ask briefly or name the conflict
- No silent changes to the existing OpenAI-compatible AI path
## Definition of Done per work package
A work package is only done when:
- the target scope of the current work package is fully implemented
- the state is consistent, error-free, and buildable
- implementation, configuration, JavaDoc, and tests have been added, **where sensible for the state**
- no content of later milestones has been anticipated
- no content of later work packages has been anticipated
- the intermediate state is self-contained and ready for handover
## Mandatory output format after each work package

```
- Scope fulfilled: yes/no
- Changed files:
  - <file path>
  - ...
- Build command: <command used>
- Build status: SUCCESSFUL / FAILED
- Open points: none / <description>
- Risks: none / <description>
```
## Quality and verification order
- Change only the scope needed for the current work package
- After changes, run the smallest sensible build/test scope
- Build validation from the parent root:
  `.\mvnw.cmd clean verify -pl pdf-umbenenner-domain,pdf-umbenenner-application,pdf-umbenenner-adapter-out,pdf-umbenenner-adapter-in-cli,pdf-umbenenner-bootstrap --also-make`
- If the build fails: fix the errors, build again, only then continue
- Before finishing, make sure the relevant Maven reactor state is error-free
- Do not paper over errors; fix the causes cleanly or name them openly
## Important operating rules
- Invalid start configuration prevents the processing run and leads to exit code `1`
- An invalid or missing provider selection is invalid start configuration
- A run lock prevents parallel instances; if an instance is already running, the new instance terminates immediately
- Exit code `0`: run executed properly on the technical level, even if individual files failed on the business level or transiently
- Exit code `1`: hard start/bootstrap error
- For the API key, the environment variable takes precedence over properties
- API keys: one dedicated environment variable per provider, taking precedence over the properties of the same provider family. Keys of different providers are never mixed.
- Document-level errors do **not** lead to exit code `1`
## Configuration parameters
Binding, fit-for-purpose parameters:

- `source.folder` – source folder
- `target.folder` – target folder (must exist or be creatable, write access required)
- `sqlite.file` – SQLite database file
- `ai.provider.active` – active AI provider (mandatory; permitted values are the identifiers of the supported provider families)
- `max.retries.transient` – max. historized transient failed attempts per fingerprint (**integer >= 1**, `0` is invalid)
- `max.pages` – page limit
- `max.text.characters` – maximum number of characters for AI input
- `prompt.template.file` – external prompt file
- `log.ai.sensitive` – enable sensitive AI log output (boolean, default: `false`)
- `runtime.lock.file` – lock file (optional)
- `log.directory` – log directory (optional)

Each provider family has its own parameter namespace, typically with:
- model name
- API key (environment variable takes precedence)
- timeout
- base URL (optional, where operationally sensible)
Concrete schema (fit for purpose):

```properties
ai.provider.active=openai-compatible

ai.provider.openai-compatible.baseUrl=...
ai.provider.openai-compatible.model=...
ai.provider.openai-compatible.timeoutSeconds=...
ai.provider.openai-compatible.apiKey=...

ai.provider.claude.baseUrl=...
ai.provider.claude.model=...
ai.provider.claude.timeoutSeconds=...
ai.provider.claude.apiKey=...
```
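The rule that the bootstrap layer selects the one active `AiNamingPort` implementation from `ai.provider.active` could look like the following. The port signature and the adapter classes here are illustrative stand-ins, not the project's real types.

```java
import java.util.Properties;

interface AiNamingPort {
    String proposeName(String documentText);
}

// Stand-in adapters; the real ones live in the adapter-out module.
final class OpenAiCompatibleAdapter implements AiNamingPort {
    public String proposeName(String documentText) { return "proposal-from-openai-compatible"; }
}

final class ClaudeMessagesAdapter implements AiNamingPort {
    public String proposeName(String documentText) { return "proposal-from-claude"; }
}

final class ProviderSelection {

    /** Exactly one provider per run; anything else is invalid start configuration (exit code 1). */
    static AiNamingPort select(Properties config) {
        String active = config.getProperty("ai.provider.active", "");
        return switch (active) {
            case "openai-compatible" -> new OpenAiCompatibleAdapter();
            case "claude"            -> new ClaudeMessagesAdapter();
            default -> throw new IllegalStateException("invalid ai.provider.active: " + active);
        };
    }
}
```

Because the selection happens once in bootstrap, no fallback or parallel use can occur further down: the application only ever sees the single chosen port instance.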
### Migration of historical configuration
Existing properties files from the previous state (with flat keys such as `api.baseUrl`, `api.model`, `api.timeoutSeconds`, `api.key`) are detected on first start and migrated to the new schema in a controlled way.

Binding procedure:
1. Detect the legacy form
2. Create a **`.bak` backup** of the original file
3. Convert the content to the new schema
   - Legacy values end up in the **`openai-compatible`** namespace
   - `ai.provider.active` is set to `openai-compatible`
4. Write the file in place
5. Reload and validate the file
6. Only then continue with the normal run

The old and new structures are **not** permanently equal end formats.
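The key mapping of the migration (step 3) can be sketched as a pure transformation; the `.bak` backup, the in-place write, and the re-validation are omitted, and the helper class is hypothetical.

```java
import java.util.Map;
import java.util.Properties;

final class LegacyConfigMigration {

    // Flat legacy keys and their namespaced replacements; unknown keys pass through unchanged.
    private static final Map<String, String> KEY_MAP = Map.of(
            "api.baseUrl", "ai.provider.openai-compatible.baseUrl",
            "api.model", "ai.provider.openai-compatible.model",
            "api.timeoutSeconds", "ai.provider.openai-compatible.timeoutSeconds",
            "api.key", "ai.provider.openai-compatible.apiKey");

    static Properties migrate(Properties legacy) {
        Properties migrated = new Properties();
        legacy.forEach((k, v) ->
                migrated.setProperty(KEY_MAP.getOrDefault((String) k, (String) k), (String) v));
        // Legacy files predate provider selection, so the migrated file pins it explicitly.
        migrated.setProperty("ai.provider.active", "openai-compatible");
        return migrated;
    }
}
```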
## Non-goals / prohibitions
- no web UI

@@ -119,4 +349,15 @@ A work package is only done when:

- no internal scheduler logic
- no architecture violations
- no new libraries or frameworks without clear necessity and justification
- no silent changes to provider binding or architecture principles
- **no** automatic fallback switching between AI providers
- **no** parallel use of multiple AI providers in one run
- **no** profile management with multiple configurations per provider family
- **no** provider families beyond the explicitly supported ones (OpenAI-compatible, Anthropic Messages API)
- no silent changes to the existing OpenAI-compatible AI path
- no immediate retry outside the target-copy path
- no reporting or statistics features
- no new third persistence source of truth for retry decisions
- no new business functionality beyond the defined target picture
- no large-scale refactoring without a demonstrable defect link
- no speculative rebuilds without a concrete quality or consistency link
- no mixing of API keys of different provider families
README.md (new file, +195)
@@ -0,0 +1,195 @@

# PDF-Umbenenner

A locally launched Java program for AI-assisted renaming of already OCR-processed, searchable PDF files.

The application reads PDF files from a configurable source folder, extracts their text, uses AI to derive a normalized file name, and places **a copy** in the target folder. The source files remain unchanged.

## Target picture

The PDF-Umbenenner is deliberately designed as a lean batch application:

- **Java 21**
- **Maven multi-module**
- **executable standalone JAR**
- **local start**, e.g. via the **Windows Task Scheduler**
- **no web server**
- **no application server**
- **no long-running application**
- **no internal scheduler**
- **SQLite** as local persistence store
- **Log4j2** for logging
- strict **hexagonal architecture / ports and adapters**
## Business overview

The application processes documents in a robust, traceable flow:

1. read the source folder
2. detect PDF candidates
3. determine the fingerprint of the source file
4. skip documents that already succeeded or finally failed
5. extract the PDF text
6. generate an AI-based naming proposal
7. build the normalized target file name
8. resolve collisions in the target folder via duplicate suffixes
9. place the copy in the target folder
10. persist the result and attempt history in SQLite
## File-naming rules

The target format is:

```text
YYYY-MM-DD - Titel.pdf
```

On name collisions, suffixes are added directly before `.pdf`:

```text
YYYY-MM-DD - Titel(1).pdf
YYYY-MM-DD - Titel(2).pdf
```

Important rules:

- the **20 characters** refer only to the **base title**
- the duplicate suffix does **not** count toward these 20 characters
- titles are generated in **German**
- proper names remain unchanged
- source files are **never** overwritten, moved, or modified
## AI integration

The AI integration is configuration-driven. The business contract stays the same regardless of the vendor: a structured naming proposal is derived from the document content, from which the application builds the final file name.

The current state supports several providers via configuration, including:

- **OpenAI-compatible endpoints**
- **Claude API**

The provider selection is **configuration**, not an architecture decision.
## Important assumptions and limits

- The application expects **already OCR-processed, searchable PDFs**.
- Non-searchable PDFs, or PDFs with unusable content, are treated as errors.
- Ambiguous documents do **not** produce an uncertain result.
- Successfully processed files are not processed again in later runs.
- Finally failed files are skipped in later runs.
## Architecture

The project is built strictly according to **ports and adapters / hexagonal architecture**.

### Modules

- `pdf-umbenenner-domain`
- `pdf-umbenenner-application`
- `pdf-umbenenner-adapter-in-cli`
- `pdf-umbenenner-adapter-out`
- `pdf-umbenenner-bootstrap`

### Core principles

- dependencies always point **inward**
- the domain knows **no infrastructure**
- external access happens exclusively through **ports**
- technical implementations live in **adapters**
- no direct adapter-to-adapter coupling
## Configuration

The application is configured via a `.properties` file.

Typical areas are:

- source folder
- target folder
- SQLite file
- AI provider and model
- timeout
- page limit
- text limit for AI calls
- prompt file
- logging

A sample configuration for a local start is provided at:

```text
config/application-local.example.properties
```
## Build

Project-wide:

```bash
./mvnw clean verify
```

On Windows:

```powershell
.\mvnw.cmd clean verify
```
## Start

The executable artifact is produced in the bootstrap module. It is started as a regular Java program:

```bash
java -jar <bootstrap-jar>.jar
```

The concrete JAR file depends on the current build state.
## Logging, status, and traceability

The PDF-Umbenenner is designed for traceability and repeatability:

- persistent document history in **SQLite**
- status and retry semantics for robust batch runs
- idempotency via a content-based fingerprint
- logging via **Log4j2**
- protection of sensitive AI content in the log
## Documentation in the repository

The authoritative documents are:

- `CLAUDE.md`
- `docs/specs/technik-und-architektur.md`
- `docs/specs/fachliche-anforderungen.md`
- `docs/specs/meilensteine.md`
- `docs/workpackages/...`

Recommended reading order:

1. `CLAUDE.md`
2. technical target architecture
3. business requirements
4. milestones
5. active work package
## Development guardrails

- small, focused changes
- no silent assumptions on document conflicts
- no unnecessary refactorings
- architecture fidelity takes precedence
- no milestone or work-package identifiers in production code, comments, or JavaDoc
## Project status

The repository follows an incremental, milestone-based build-out. The current product state builds on a fully implemented core for:

- configuration and start validation
- source-folder scan and PDF text extraction
- fingerprint, SQLite persistence, and idempotency
- AI integration for naming proposals
- file-name building and target copy
- retry logic, logging, and operational robustness
## License / usage

If a concrete license is intended for this repository, it should be added here.
@@ -1,21 +1,89 @@
-# PDF Umbenenner Local Configuration Example
-# AP-005: Copy this file to config/application.properties and adjust values for local development
+# PDF Umbenenner - configuration example for local development
+# Copy this file to config/application.properties and adjust the values.

-# Mandatory M1 properties
+# ---------------------------------------------------------------------------
+# Mandatory parameters (general)
+# ---------------------------------------------------------------------------
+
+# Source folder: folder from which OCR-processed PDF files are read.
+# The folder must exist and be readable.
 source.folder=./work/local/source
-target.folder=./work/local/target
-sqlite.file=./work/local/pdf-umbenenner.db
-api.baseUrl=http://localhost:8080/api
-api.model=gpt-4o-mini
-api.timeoutSeconds=30
-max.retries.transient=3
-max.pages=10
-max.text.characters=5000
-prompt.template.file=./config/prompts/local-template.txt

-# Optional properties
-runtime.lock.file=./work/local/lock.pid
+# Target folder: folder where the renamed copies are placed.
+# Created automatically if it does not exist yet.
+target.folder=./work/local/target
+
+# SQLite database file for processing status and attempt history.
+# The parent directory must exist.
+sqlite.file=./work/local/pdf-umbenenner.db
+
+# Maximum number of recorded transient failed attempts per document.
+# Must be an integer >= 1.
+max.retries.transient=3
+
+# Maximum page count per document. Documents with more pages are treated as
+# a deterministic content error (no AI call).
+max.pages=10
+
+# Maximum number of characters of document text sent to the AI.
+max.text.characters=5000
+
+# Path to the external prompt file. The file name serves as the prompt
+# identifier in the attempt history.
+prompt.template.file=./config/prompts/template.txt
+
+# ---------------------------------------------------------------------------
+# Optional parameters
+# ---------------------------------------------------------------------------
+
+# Path to the lock file for the startup guard (prevents parallel instances).
+runtime.lock.file=./work/local/pdf-umbenenner.lock
+
+# Log directory. If omitted, Log4j2 writes to ./logs/.
 log.directory=./work/local/logs
+
+# Log level (DEBUG, INFO, WARN, ERROR). Default is INFO.
 log.level=INFO
-# api.key can also be set via environment variable PDF_UMBENENNER_API_KEY
-api.key=your-local-api-key-here
+
+# Write sensitive AI content (full raw response and reasoning) to the log.
+# Allowed values: true or false. Default is false (protected).
+log.ai.sensitive=false
+
+# ---------------------------------------------------------------------------
+# Active AI provider
+# ---------------------------------------------------------------------------
+# Allowed values: openai-compatible, claude
+ai.provider.active=openai-compatible
+
+# ---------------------------------------------------------------------------
+# OpenAI-compatible provider
+# ---------------------------------------------------------------------------
+# Base URL of the AI service (without a path suffix such as /chat/completions).
+ai.provider.openai-compatible.baseUrl=https://api.openai.com/v1
+
+# Model name of the AI service.
+ai.provider.openai-compatible.model=gpt-4o-mini
+
+# HTTP timeout for AI requests in seconds (must be > 0).
+ai.provider.openai-compatible.timeoutSeconds=30
+
+# API key.
+# Precedence: OPENAI_COMPATIBLE_API_KEY (environment variable) >
+# PDF_UMBENENNER_API_KEY (deprecated environment variable, still accepted) >
+# ai.provider.openai-compatible.apiKey (this value)
+ai.provider.openai-compatible.apiKey=your-openai-api-key-here
+
+# ---------------------------------------------------------------------------
+# Anthropic Claude provider (only needed when ai.provider.active=claude)
+# ---------------------------------------------------------------------------
+# Base URL (optional; default: https://api.anthropic.com)
+# ai.provider.claude.baseUrl=https://api.anthropic.com
+
+# Model name (e.g. claude-3-5-sonnet-20241022)
+# ai.provider.claude.model=claude-3-5-sonnet-20241022
+
+# HTTP timeout for AI requests in seconds (must be > 0).
+# ai.provider.claude.timeoutSeconds=60
+
+# API key. The environment variable ANTHROPIC_API_KEY takes precedence.
+# ai.provider.claude.apiKey=
@@ -1,21 +1,46 @@
-# PDF Umbenenner Test Configuration Example
-# AP-005: Copy this file to config/application.properties and adjust values for testing
+# PDF Umbenenner - configuration example for test runs
+# Copy this file to config/application.properties and adjust the values.
+# This template uses shorter timeouts and lower limits for test runs.
+
+# ---------------------------------------------------------------------------
+# Mandatory parameters (general)
+# ---------------------------------------------------------------------------

-# Mandatory M1 properties
 source.folder=./work/test/source
 target.folder=./work/test/target
 sqlite.file=./work/test/pdf-umbenenner-test.db
-api.baseUrl=http://localhost:8081/api
-api.model=gpt-4o-mini-test
-api.timeoutSeconds=10
+
 max.retries.transient=1
 max.pages=5
 max.text.characters=2000
-prompt.template.file=./config/prompts/test-template.txt
+prompt.template.file=./config/prompts/template.txt

-# Optional properties
-runtime.lock.file=./work/test/lock.pid
+# ---------------------------------------------------------------------------
+# Optional parameters
+# ---------------------------------------------------------------------------
+
+runtime.lock.file=./work/test/pdf-umbenenner.lock
 log.directory=./work/test/logs
 log.level=DEBUG
-# api.key can also be set via environment variable PDF_UMBENENNER_API_KEY
-api.key=test-api-key-placeholder
+log.ai.sensitive=false
+
+# ---------------------------------------------------------------------------
+# Active AI provider
+# ---------------------------------------------------------------------------
+ai.provider.active=openai-compatible
+
+# ---------------------------------------------------------------------------
+# OpenAI-compatible provider
+# ---------------------------------------------------------------------------
+ai.provider.openai-compatible.baseUrl=https://api.openai.com/v1
+ai.provider.openai-compatible.model=gpt-4o-mini
+ai.provider.openai-compatible.timeoutSeconds=10
+ai.provider.openai-compatible.apiKey=test-api-key-placeholder
+
+# ---------------------------------------------------------------------------
+# Anthropic Claude provider (only needed when ai.provider.active=claude)
+# ---------------------------------------------------------------------------
+# ai.provider.claude.baseUrl=https://api.anthropic.com
+# ai.provider.claude.model=claude-3-5-sonnet-20241022
+# ai.provider.claude.timeoutSeconds=60
+# ai.provider.claude.apiKey=your-anthropic-api-key-here
@@ -1 +1,22 @@
-This is a test prompt template for AP-006 validation.
+You are an assistant for automatically naming scanned PDF documents.
+
+Analyze the following document text and determine:
+
+1. A German title that fits the content (at most 20 characters, letters and spaces only, no abbreviations, no generic labels such as "Dokument", "Datei", "Scan", or "PDF")
+2. The document's most relevant date
+
+Date selection by priority:
+- invoice date
+- document date
+- issue date or notice date
+- writing date or end of a service period
+- give no date if no reliable date can be derived unambiguously
+
+Title rules:
+- phrase the title in German
+- keep proper names (people, companies, places) unchanged
+- at most 20 characters (the base title only, without the date prefix)
+- no special characters other than spaces
+- distinct and understandable, not generic
+
+If the document cannot be interpreted unambiguously, describe this in the reasoning.
docs/befundliste.md — new file, 175 lines
@@ -0,0 +1,175 @@
# Findings List – Integrated Overall Review and Release of the Final State

**Created:** 2026-04-08
**Updated:** 2026-04-08 (naming-convention cleanup B1 completed, final release)
**Basis:** full Maven reactor build, unit tests, E2E tests, integration tests (smoke),
PIT mutation analysis, code review against the binding specifications (technik-und-architektur.md,
fachliche-anforderungen.md, CLAUDE.md)

---

## Checks Performed

| Check area | Performed | Result |
|---|---|---|
| Maven reactor build (clean verify, all modules) | yes | GREEN |
| Unit tests (domain, application, adapter-out, bootstrap) | yes | GREEN |
| E2E tests (BatchRunEndToEndTest, 11 scenarios) | yes | GREEN |
| Integration tests / smoke IT (ExecutableJarSmokeTestIT, 2 tests) | yes | GREEN |
| PIT mutation analysis (all modules) | yes | see individual findings |
| Hexagonal architecture – domain isolation | yes | GREEN |
| Hexagonal architecture – port contracts (no Path/NIO/JDBC) | yes | GREEN |
| Hexagonal architecture – no adapter-to-adapter dependencies | yes | GREEN |
| Status model (8 values, semantics per CLAUDE.md) | yes | GREEN |
| Naming-convention rule (no M1–M8, no AP-xxx in code) | yes | GREEN |
| Logging sensitivity rule (log.ai.sensitive) | yes | GREEN |
| Exit-code semantics (0 / 1) | yes | GREEN |
| Configuration examples (mandatory and optional parameters) | yes | GREEN |
| Operations documentation (docs/betrieb.md) | yes | GREEN |
| Prompt template in the repository | yes | GREEN |
| Backward compatibility M4–M7 (status model, schema) | yes (static) | GREEN |

---

## Green Areas (no findings)

### Build and tests

- Full Maven reactor build successful (`BUILD SUCCESS`, total runtime ~4 minutes)
- **827+ tests** passed, 0 failures, 0 skipped:
  - domain: 227 tests
  - application: 295 tests
  - adapter-out: 227 tests
  - bootstrap (unit): 76 tests
  - smoke IT: 2 tests

### E2E scenarios (BatchRunEndToEndTest)

All required core scenarios from the E2E test basis are covered and green:

- happy path: two runs → `SUCCESS`
- deterministic content error: two runs → `FAILED_FINAL`
- transient AI error → `FAILED_RETRYABLE`
- skip after `SUCCESS` → `SKIPPED_ALREADY_PROCESSED`
- skip after `FAILED_FINAL` → `SKIPPED_FINAL_FAILURE`
- `PROPOSAL_READY` finalization without a renewed AI call in the second run
- target-copy error with immediate retry → `SUCCESS`
- transient errors across several runs → exhaustion → `FAILED_FINAL`
- target-copy error, both attempts failed → `FAILED_RETRYABLE`
- two different documents, same proposed name → duplicate suffix `(1)`
- mixed batch: one success, one content error → batch outcome `SUCCESS` (exit code 0)
### Hexagonal architecture

- **Domain** fully infrastructure-free: no imports from `java.nio`, `java.io.File`,
  JDBC, Log4j, or HTTP libraries
- **Port contracts** (all interfaces in `application.port.out`) contain no `Path`,
  `File`, NIO, or JDBC types; only domain types appear in signatures
- **No adapter-to-adapter dependencies** in `adapter-out`: no module directly
  references another adapter implementation package
- **Dependency direction** correct: adapter-out → application → domain

### Business rules

- Status model complete (8 values: `READY_FOR_AI`, `PROPOSAL_READY`, `SUCCESS`,
  `FAILED_RETRYABLE`, `FAILED_FINAL`, `SKIPPED_ALREADY_PROCESSED`,
  `SKIPPED_FINAL_FAILURE`, `PROCESSING`)
- Retry semantics implemented correctly (deterministic: 1 retry → final;
  transient: up to `max.retries.transient`)
- Skip semantics correct (SUCCESS → skip, FAILED_FINAL → skip, no counter changes)
- Leading proposal source: the `PROPOSAL_READY` attempt is correctly used as the source
- SUCCESS condition: only after the target copy and consistent persistence

### Logging and sensitivity

- `log.ai.sensitive` mechanism fully implemented and tested
- default `false` (safe): AI raw response and reasoning are not logged
- persistence in SQLite is independent of this setting
- configuration documented in both example files

### Configuration and documentation

- `config/application-local.example.properties`: complete, all mandatory and
  optional parameters present
- `config/application-test.example.properties`: complete
- `config/prompts/template.txt`: prompt template present in the repository
- `docs/betrieb.md`: operations documentation covering start, configuration, exit codes,
  basic retry behavior, and logging sensitivity
- configuration parameter names consistent between documentation and code

### Exit-code semantics

- exit code `0`: technically proper run (even with partial failures of individual documents)
- exit code `1`: hard startup/bootstrap errors, invalid configuration, lock errors
- implementation in `PdfUmbenennerApplication` and `BootstrapRunner` correct

### PIT mutation analysis (overall state)

- domain: 83% mutation kill rate
- adapter-out: 83% mutation kill rate
- application: 87% test strength
- bootstrap: 76% kill rate (34 mutations, 26 killed)
---

## Completed Items

### B1 – Naming-convention violations in code, tests, and configuration (CLAUDE.md, naming rule)

**Topic area:** documentation / code quality
**Norm:** CLAUDE.md explicitly forbids milestone (M1–M8) and work-package (AP-xxx)
identifiers in implementations, comments, and JavaDoc.
**Status:** **FIXED** – all 43 occurrences in `.java` files as well as the comment header in
`config/application.properties` were replaced with timeless technical wording.

---

## Documented Edge Items (no action required, release-compatible)

#### B2 – StartConfiguration in the application layer contains java.nio.file.Path (architectural edge case)

**Topic area:** architecture
**Norm:** "Application orchestrates use cases and contains no technical
implementation details" (technik-und-architektur.md §3.1); port contracts must not contain
NIO types (CLAUDE.md).
**Finding:** `StartConfiguration` (in `application/config/startup/`) is a Java record
with `java.nio.file.Path` fields for `sourceFolder`, `targetFolder`, `sqliteFile`,
`promptTemplateFile`, `runtimeLockFile`, and `logDirectory`.
**Context:** `StartConfiguration` is not a port contract but an immutable
configuration DTO, created exclusively by bootstrap and handed to adapters.
The port contracts themselves are clean (no Path types in port interfaces).
**Assessment:** Edge case. `Path` is not a business object, but it is also not a severe
architecture violation in this context. The alternative (a string representation resolved
in the adapter) would add no value for the operating model.
**Decision:** No action required. Moving `StartConfiguration` into the bootstrap module
would be an option but is not mandatory, since there is no functional defect.

---

#### B3 – Surviving PIT mutants in bootstrap (bootstrap: 76% kill rate)

**Topic area:** test quality
**Finding:** 8 surviving mutants in the bootstrap module (34 generated, 26 killed).
Main category: `VoidMethodCallMutator` (2 survivors, 2 without coverage).
**Assessment:** Affects mainly logging calls and non-critical helper methods;
no functionally significant decision paths are involved.
**Decision:** No action required; consolidated at an acceptable level.

---

## Summary and Release

| Classification | Count | Description |
|---|---|---|
| Release blockers | **0** | – |
| Completed (was not blocking) | **1** | B1 naming-convention cleanup |
| Documented edge items (release-compatible) | **2** | B2 Path edge case, B3 PIT bootstrap |

**Release decision: the final state is production-ready and released.**

All core business, technical, and architectural requirements from the binding
specifications (technik-und-architektur.md, fachliche-anforderungen.md, CLAUDE.md) are
fully implemented and covered by automated tests. The Maven build is error-free.
The CLAUDE.md naming-convention rule (no M1–M8, no AP-xxx in production or test code)
is fully observed. No known specification-relevant blockers remain open.
docs/betrieb.md — new file, 289 lines
@@ -0,0 +1,289 @@
# Operations Documentation – PDF Umbenenner

## Purpose

PDF Umbenenner reads already OCR-processed, searchable PDF files from a
configured source folder, determines a normalized German file name via an AI call,
and places a copy in the configured target folder. The source file remains unchanged.

---

## Prerequisites

- Java 21 (JRE or JDK)
- access to an OpenAI-compatible AI service (API key required)
- a source folder containing OCR-processed PDF files
- write access to the target folder and the database directory

---

## Starting the Executable JAR

The executable JAR is produced by the Maven build in
`pdf-umbenenner-bootstrap/target/`:

```
java -jar pdf-umbenenner-bootstrap/target/pdf-umbenenner-bootstrap-0.0.1-SNAPSHOT.jar
```

The application reads its configuration from `config/application.properties` relative to
the working directory in which the command is executed.

### Starting via Windows Task Scheduler

Recommended start sequence for the Windows Task Scheduler:

1. Action: start a program/script
2. Program: `java`
3. Arguments: `-jar C:\Pfad\zur\Installation\pdf-umbenenner-bootstrap\target\pdf-umbenenner-bootstrap-0.0.1-SNAPSHOT.jar`
4. Start in: `C:\Pfad\zur\Installation` (must be the directory containing `config\application.properties` and `config\prompts\`)

> **Note:** The "Start in" directory is the application's working directory.
> The configuration file `config/application.properties` and the prompt directory
> `config/prompts/` must be reachable relative to this directory. The JAR path
> in the arguments must be absolute or correct relative to the "Start in" directory.
---

## Configuration

The configuration is loaded from `config/application.properties`.
Templates for local and test configurations are available in:

- `config/application-local.example.properties`
- `config/application-test.example.properties`

### Mandatory parameters (general)

| Parameter | Description |
|-------------------------|--------------|
| `source.folder` | source folder with OCR PDFs (must exist and be readable) |
| `target.folder` | target folder for renamed copies (created if missing) |
| `sqlite.file` | SQLite database file (parent directory must exist) |
| `ai.provider.active` | active AI provider: `openai-compatible` or `claude` |
| `max.retries.transient` | maximum transient failed attempts per document (integer, >= 1) |
| `max.pages` | maximum page count per document (integer, > 0) |
| `max.text.characters` | maximum number of characters of document text sent to the AI (integer, > 0) |
| `prompt.template.file` | path to the external prompt file (must exist) |

### Provider parameters

Only the **active** provider must be fully configured. The inactive provider is not validated.

**OpenAI-compatible provider** (`ai.provider.active=openai-compatible`):

| Parameter | Description |
|-----------|--------------|
| `ai.provider.openai-compatible.baseUrl` | base URL of the AI service (e.g. `https://api.openai.com/v1`) |
| `ai.provider.openai-compatible.model` | model name (e.g. `gpt-4o-mini`) |
| `ai.provider.openai-compatible.timeoutSeconds` | HTTP timeout in seconds (integer, > 0) |
| `ai.provider.openai-compatible.apiKey` | API key (environment variable `OPENAI_COMPATIBLE_API_KEY` takes precedence) |

**Anthropic Claude provider** (`ai.provider.active=claude`):

| Parameter | Description |
|-----------|--------------|
| `ai.provider.claude.baseUrl` | base URL (optional; default: `https://api.anthropic.com`) |
| `ai.provider.claude.model` | model name (e.g. `claude-3-5-sonnet-20241022`) |
| `ai.provider.claude.timeoutSeconds` | HTTP timeout in seconds (integer, > 0) |
| `ai.provider.claude.apiKey` | API key (environment variable `ANTHROPIC_API_KEY` takes precedence) |

### Optional parameters

| Parameter | Description | Default |
|---------------------|--------------|---------|
| `runtime.lock.file` | lock file for the startup guard | `pdf-umbenenner.lock` in the working directory |
| `log.directory` | log directory | `./logs/` |
| `log.level` | log level (`DEBUG`, `INFO`, `WARN`, `ERROR`) | `INFO` |
| `log.ai.sensitive` | write AI raw response and reasoning to the log (`true`/`false`) | `false` |

### API keys

Each provider family has its own environment variable, which takes precedence over the properties value:

| Provider | Environment variable |
|---|---|
| `openai-compatible` | `OPENAI_COMPATIBLE_API_KEY` |
| `claude` | `ANTHROPIC_API_KEY` |

Keys of different provider families are never mixed.
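The documented key-resolution order for the OpenAI-compatible provider (environment variable, then the deprecated legacy variable, then the properties value) can be sketched as a small resolver. The class and method names here are illustrative, not taken from the actual codebase:

```java
import java.util.Map;
import java.util.Optional;

public class ApiKeyResolver {

    /** Resolves the effective API key for the openai-compatible provider.
     *  Precedence: OPENAI_COMPATIBLE_API_KEY > PDF_UMBENENNER_API_KEY (deprecated,
     *  still accepted) > properties value; blank values count as absent. */
    static Optional<String> resolveOpenAiCompatibleKey(Map<String, String> env, String propertiesValue) {
        String primary = env.get("OPENAI_COMPATIBLE_API_KEY");
        if (primary != null && !primary.isBlank()) {
            return Optional.of(primary);
        }
        String legacy = env.get("PDF_UMBENENNER_API_KEY"); // legacy variable, see migration section
        if (legacy != null && !legacy.isBlank()) {
            return Optional.of(legacy);
        }
        return Optional.ofNullable(propertiesValue).filter(v -> !v.isBlank());
    }
}
```

Passing the environment as a `Map` rather than reading `System.getenv()` directly keeps the sketch testable.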
---

## Migrating Older Configuration Files

Older configuration files that still use the flat keys `api.baseUrl`, `api.model`,
`api.timeoutSeconds`, and `api.key` are migrated **automatically** to the current
schema on first start.

### What happens

1. The application detects the legacy form by the flat `api.*` keys.
2. **Before any change**, a backup copy of the original file is created:
   - default case: `config/application.properties.bak`
   - if `.bak` already exists: `config/application.properties.bak.1`, `.bak.2`, …
   - existing backups are **never overwritten**.
3. The file is migrated in place to the new schema:
   - `api.baseUrl` → `ai.provider.openai-compatible.baseUrl`
   - `api.model` → `ai.provider.openai-compatible.model`
   - `api.timeoutSeconds` → `ai.provider.openai-compatible.timeoutSeconds`
   - `api.key` → `ai.provider.openai-compatible.apiKey`
   - `ai.provider.active=openai-compatible` is added.
   - all other keys remain unchanged.
4. The migrated file is written via a temporary file (`*.tmp`) and an atomic
   move/rename. The original is never left partially written.
5. The migrated file is immediately re-read and validated.
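The key mapping from step 3 can be sketched as follows. The names are illustrative, and this sketch covers only the renaming step; the real migration additionally performs the backup and the atomic tmp-file write described above:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LegacyConfigMigration {

    // Flat legacy keys and their provider-scoped replacements.
    private static final Map<String, String> KEY_MAP = Map.of(
            "api.baseUrl", "ai.provider.openai-compatible.baseUrl",
            "api.model", "ai.provider.openai-compatible.model",
            "api.timeoutSeconds", "ai.provider.openai-compatible.timeoutSeconds",
            "api.key", "ai.provider.openai-compatible.apiKey");

    /** Rewrites legacy flat api.* keys to the current schema; all other keys
     *  pass through unchanged, and ai.provider.active is added if missing. */
    static Map<String, String> migrate(Map<String, String> props) {
        Map<String, String> migrated = new LinkedHashMap<>();
        props.forEach((key, value) -> migrated.put(KEY_MAP.getOrDefault(key, key), value));
        migrated.putIfAbsent("ai.provider.active", "openai-compatible");
        return migrated;
    }
}
```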
### On migration failure

If validation of the migrated file fails, the application aborts with exit code `1`.
In that case the backup copy (`.bak`) is preserved and contains the unmodified
original file. The configuration must then be corrected manually.

### Operator note

The environment variable `PDF_UMBENENNER_API_KEY` from the previous state is **not**
renamed automatically. If this variable has been used so far, it must be switched to
`OPENAI_COMPATIBLE_API_KEY`.

---

## Prompt Configuration

The prompt is loaded from the external text file configured in `prompt.template.file`.
The prompt file's name serves as the prompt identifier in the attempt history
(SQLite), making it traceable which prompt version was used for which
processing attempt.

A template is provided in `config/prompts/template.txt`; it can be used directly or
adapted to the AI service in use.

The application automatically extends the prompt with:
- a document-text section
- an explicit JSON response specification with the fields `title`, `reasoning`, and `date`

The prompt in `template.txt` therefore does **not** need to contain a JSON format
instruction – only the substantive task for the AI.
---

## Target Format

Every successfully processed PDF file is placed in the target folder under the name:

```
YYYY-MM-DD - Titel.pdf
```

On name collisions, a running suffix is appended:

```
YYYY-MM-DD - Titel(1).pdf
YYYY-MM-DD - Titel(2).pdf
```

The suffix does not count toward the 20 characters of the base title.
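A minimal sketch of this naming scheme including the collision suffix. The class name and the collision check against a set of existing names are assumptions for illustration, not the actual implementation:

```java
import java.time.LocalDate;
import java.util.Set;

public class TargetNameBuilder {

    /** Builds "YYYY-MM-DD - Title.pdf", appending (1), (2), ... until the name is free.
     *  The suffix is outside the 20-character limit, which applies to the base title only. */
    static String buildTargetName(String title, LocalDate date, Set<String> existingNames) {
        String base = date + " - " + title; // LocalDate prints as ISO YYYY-MM-DD
        String candidate = base + ".pdf";
        int suffix = 0;
        while (existingNames.contains(candidate)) {
            suffix++;
            candidate = base + "(" + suffix + ").pdf";
        }
        return candidate;
    }
}
```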
---

## Retry and Skip Behavior

### Document status

The following table describes the persistent status values of the document master records
in the SQLite database. These values are stored durably once a processing attempt
completes.

| Status | Meaning |
|-----------------------------|-----------|
| `READY_FOR_AI` | processable, AI path not yet taken |
| `PROPOSAL_READY` | AI naming proposal available, target copy not yet written |
| `SUCCESS` | processed and copied successfully (terminal end state) |
| `FAILED_RETRYABLE` | failed, a further attempt in a later run is possible |
| `FAILED_FINAL` | failed terminally, will not be processed again |
| `SKIPPED_ALREADY_PROCESSED` | skipped – document already processed successfully |
| `SKIPPED_FINAL_FAILURE` | skipped – document failed terminally |

In addition, the system knows the transient state `PROCESSING`, which may be set in the
master record while a document is actively being processed. It is always replaced by one
of the states above once the processing attempt completes and is not a valid
end state in the database.
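The status model above can be captured as an enum. The eight values and the `PROCESSING` semantics follow the table; the helper methods are illustrative additions, not the actual domain code:

```java
/** The status values of a document master record, per the table above. */
public enum DocumentStatus {
    READY_FOR_AI,
    PROPOSAL_READY,
    SUCCESS,
    FAILED_RETRYABLE,
    FAILED_FINAL,
    SKIPPED_ALREADY_PROCESSED,
    SKIPPED_FINAL_FAILURE,
    PROCESSING; // transient only, never a valid end state in the database

    /** PROCESSING must always be replaced before an attempt completes. */
    boolean isValidEndState() {
        return this != PROCESSING;
    }

    /** Terminal documents are never processed again. */
    boolean isTerminal() {
        return this == SUCCESS || this == FAILED_FINAL;
    }
}
```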
### Retry rules

**Deterministic content errors** (e.g. no extractable text, page limit exceeded,
unusable AI title):

- first failure → `FAILED_RETRYABLE` (one retry in a later run allowed)
- second failure → `FAILED_FINAL` (no further attempt)

**Transient technical errors** (e.g. AI unreachable, HTTP timeout):

- retryable up to the limit `max.retries.transient`
- when the limit is reached → `FAILED_FINAL`

**Immediate technical retry:**

On a write error of the target copy, exactly one immediate retry is made within the
same run. It does not count toward the cross-run failure counter.
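The two cross-run retry rules can be sketched as a single decision function. All names are assumptions, as is the convention that the counter already includes the current failure:

```java
public class RetryPolicy {

    enum Failure { DETERMINISTIC_CONTENT, TRANSIENT_TECHNICAL }

    /** Returns the persistent status after a failed attempt.
     *  failureCount counts failures of this kind *including* the current one. */
    static String nextStatus(Failure kind, int failureCount, int maxRetriesTransient) {
        if (kind == Failure.DETERMINISTIC_CONTENT) {
            // first deterministic failure allows one retry in a later run; the second is final
            return failureCount >= 2 ? "FAILED_FINAL" : "FAILED_RETRYABLE";
        }
        // transient failures stay retryable until the configured limit is exhausted
        return failureCount >= maxRetriesTransient ? "FAILED_FINAL" : "FAILED_RETRYABLE";
    }
}
```

The immediate in-run retry for target-copy write errors is deliberately not modeled here, since it does not touch the cross-run counter.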
---

## Logging

Logs are written to the configured `log.directory` (default: `./logs/`).
Log rotation happens daily and when a file reaches 10 MB.

### Sensitive AI content

By default, the full AI raw response and the AI reasoning are **not** written to the log;
they are stored exclusively in the SQLite database.

For diagnostic purposes, this output can be enabled with `log.ai.sensitive=true`.
Allowed values: `true` or `false`. Any other value is invalid and prevents startup.

---

## Exit Codes

| Code | Meaning |
|------|-----------|
| `0` | run executed technically properly (even with document-level partial failures) |
| `1` | hard startup or bootstrap error (invalid configuration, lock not acquirable, schema initialization error) |

Document-level errors of individual PDF files do **not** lead to exit code `1`.

---

## Startup Guard (parallel-instance protection)

The application uses an exclusive lock file to prevent parallel instances.
If an instance is already running, the new instance terminates immediately with exit code `1`.

The path of the lock file is configurable via `runtime.lock.file`.
Without configuration, `pdf-umbenenner.lock` in the working directory is used.
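A startup guard of this kind is commonly built on an exclusive OS-level file lock. This sketch (all names assumed, not the actual implementation) returns an empty result when the lock is already held, in which case the caller would exit with code `1`:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Optional;

public class RunLock {

    /** Tries to acquire the exclusive run lock.
     *  An empty result means the lock could not be taken (another instance is running,
     *  or the lock file is not accessible). */
    static Optional<FileLock> tryAcquire(Path lockFile) {
        try {
            FileChannel channel = FileChannel.open(lockFile,
                    StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            FileLock lock = channel.tryLock(); // null: held by another process
            if (lock == null) {
                channel.close();
                return Optional.empty();
            }
            return Optional.of(lock); // held until released or the JVM exits
        } catch (OverlappingFileLockException e) {
            return Optional.empty(); // already locked within this JVM
        } catch (IOException e) {
            return Optional.empty(); // treat I/O failure as "not acquirable"
        }
    }
}
```

The lock is released automatically when the JVM exits, so a crashed run does not leave the guard permanently engaged.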
---

## SQLite Database

The SQLite file contains:

- **document master records**: overall status, failure counters, last target file name, timestamps
- **attempt history**: every processing attempt with model, prompt identifier,
  AI raw response, reasoning, date, title, and error status

The database is the leading source of truth for processing status and traceability.
It does not need to be managed manually – the schema is initialized automatically at startup.

---

## System Boundaries

- only OCR-processed, searchable PDF files are processed
- no built-in OCR capability
- no web UI, no REST API, no interactive operation
- no internal scheduler – starts are triggered externally (e.g. Windows Task Scheduler)
- source files are never overwritten, moved, or deleted
- identification uses the SHA-256 fingerprint of the file content, not the file name
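A content fingerprint of this kind can be sketched with the JDK's `MessageDigest`; the class name is an assumption, and the real implementation reads file bytes rather than an in-memory array:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

public class Fingerprint {

    /** SHA-256 of the raw content bytes, rendered as lowercase hex.
     *  Two files with identical bytes get the same fingerprint regardless of name. */
    static String sha256(byte[] content) {
        try {
            return HexFormat.of().formatHex(MessageDigest.getInstance("SHA-256").digest(content));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is available on every JVM", e);
        }
    }
}
```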
@@ -1,5 +1,11 @@
 # Technology and Architecture – PDF-Umbenenner with AI

+> **Version note, v2**
+> This revision extends the AI integration with a second, equally supported provider.
+> Only the sections required for multi-provider capability were changed:
+> technology stack (section 5), AI integration (section 11), configuration (section 14), and the closing assessment (section 19).
+> All other sections remain unchanged in content.
+
 ## 1. Goal and Scope

 This document describes the binding technical target architecture for the **PDF-Umbenenner**.
@@ -130,7 +136,7 @@ Contains technical implementations of the outbound ports, in particular:
 - file system
 - PDFBox
 - SQLite
-- OpenAI-compatible HTTP client
+- AI HTTP clients (one implementation per supported provider, see section 11)
 - properties/environment configuration
 - run lock
 - clock
@@ -139,7 +145,8 @@ Contains technical implementations of the outbound ports, in particular:
 Responsible for:
 - loading and validating the configuration
 - creating the object graph
-- wiring all adapters and ports
+- selecting and wiring the **single** active AI provider implementation
+- wiring all remaining adapters and ports
 - starting the CLI adapter
 - setting the exit code
@@ -162,13 +169,18 @@ Mandatory technology choices:
 - **SQLite** as the local persistence store
 - **SQLite JDBC driver**
 - **Log4j2** for logging
-- **OpenAI-compatible HTTP API** for AI access
 - **Java HTTP Client** or a technically equivalent standard HTTP component
 - **JSON library** for robust JSON serialization and validation

+For the AI integration, **two equally supported provider families** are technically permitted:
+- the **OpenAI-compatible HTTP interface** (chat-completions style)
+- the **native Anthropic Messages API** for Claude models
+
+Exactly **one** of these provider implementations is active per run. The selection is made exclusively via configuration (see section 14).
+
 Not bindingly fixed are:
-- the concrete AI provider
-- the concrete AI base URL
+- the concrete AI provider within a provider family
+- the concrete base URL
 - the concrete model name

 These three points are **pure configuration** and explicitly **not an architecture decision**.
@@ -196,6 +208,8 @@ Binding outbound ports:
 - `RunLockPort`
 - `ClockPort`

+The `AiNamingPort` remains **provider-neutral**. It knows neither OpenAI- nor Anthropic-specific types, headers, URLs, or response formats. Provider-specific details (endpoint, authentication, request/response format) live exclusively in the respective adapter-out implementations.
|
||||
|
||||
### 6.3 Logging
|
||||
Logging ist **kein fachlicher Port**. Logging ist technische Infrastruktur.
|
||||
|
||||
@@ -234,6 +248,8 @@ Die Verarbeitung einer einzelnen Datei erfolgt in dieser Reihenfolge:
|
||||
16. temporäre Zieldatei final verschieben/umbenennen
|
||||
17. Erfolg und Versuchshistorie persistent speichern
|
||||
|
||||
Die Verarbeitungsschritte sind **provider-unabhängig**. Welcher konkrete KI-Adapter Schritt 9 ausführt, ist außerhalb der Application nicht sichtbar.
|
||||
|
||||
### 7.3 Erfolgskriterium
|
||||
Ein Dokument gilt genau dann als erfolgreich verarbeitet, wenn:
|
||||
1. brauchbarer PDF-Text vorliegt,
|
||||
@@ -288,63 +304,15 @@ Beispiele:
|
||||
- `2026-03-31 - Stromabrechnung(1).pdf`
|
||||
- `2026-03-31 - Stromabrechnung(2).pdf`
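The duplicate rule shown in the examples can be sketched as a pure function; the class name is illustrative, and the set of existing names stands in for the actual target-folder check:

```java
import java.util.Set;

/** Resolves target-name collisions by appending (1), (2), ... before the extension. */
public class DuplicateNameResolver {

    /** Returns baseName + ".pdf", or the first "baseName(n).pdf" that is not yet taken. */
    public static String resolve(String baseName, Set<String> existingNames) {
        String candidate = baseName + ".pdf";
        int counter = 0;
        while (existingNames.contains(candidate)) {
            counter++;
            candidate = baseName + "(" + counter + ").pdf";
        }
        return candidate;
    }
}
```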

### 8.6 Windows Compatibility
The application additionally ensures that the target name is valid on Windows.

---

Invalid characters must be removed or replaced in a controlled manner.
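A minimal sketch of such a sanitization step, assuming underscore as the replacement character (the spec only requires controlled removal or replacement):

```java
/** Removes or replaces characters that are not allowed in Windows filenames. */
public class WindowsNameSanitizer {

    // Characters forbidden by Windows in filenames: \ / : * ? " < > | plus control chars.
    private static final String FORBIDDEN = "\\/:*?\"<>|";

    /** Replaces every forbidden character with an underscore and trims trailing dots/spaces. */
    public static String sanitize(String name) {
        StringBuilder out = new StringBuilder(name.length());
        for (char c : name.toCharArray()) {
            if (c < 0x20 || FORBIDDEN.indexOf(c) >= 0) {
                out.append('_');
            } else {
                out.append(c);
            }
        }
        // Windows also rejects names ending in a dot or a space.
        int end = out.length();
        while (end > 0 && (out.charAt(end - 1) == '.' || out.charAt(end - 1) == ' ')) {
            end--;
        }
        return out.substring(0, end);
    }
}
```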

## 9. Retry and Error Semantics

> Unchanged in content from the previous revision. Only the rule that technical AI errors are classified as transient regardless of the concrete provider now applies equally to **both** provider families.

---

## 9. Error Classification and Retry Rules

### 9.1 Principle
Only **retryable** errors may be processed again in later runs.

**Final** errors are skipped in later runs.

### 9.2 Deterministic Content Errors
Deterministic content errors are in particular:
- no usable PDF text
- page limit exceeded
- document ambiguous in content
- no usable title
- generic or invalid title
- a date value supplied by the AI is present but unusable or uninterpretable

Rule:
- exactly **1 retry** in a later scheduler run
- thereafter a **final error**

### 9.3 Transient Technical Errors
Transient technical errors are in particular:
- AI unreachable
- HTTP timeout
- temporary I/O errors
- temporary SQLite lock
- invalid or unparseable AI JSON
- other temporary technical infrastructure errors

Rule:
- retry in later runs up to the configured maximum

### 9.4 Immediate Technical Retry
Additionally, exactly **one immediate technical retry** within the same run is permitted for the target-copy operation if writing the target file fails.

This mechanism is **not a domain-level retry** and is handled separately from the cross-run retry model.

### 9.5 Status Model
Binding status values:
- `SUCCESS`
- `FAILED_RETRYABLE`
- `FAILED_FINAL`
- `SKIPPED_ALREADY_PROCESSED`
- `SKIPPED_FINAL_FAILURE`

An additional technical interim status `PROCESSING` is permitted and useful.

---

## 10. Idempotency and Identification
## 10. Identification and Reproducibility

### 10.1 Identification
A document is identified **not** by its filename.
@@ -362,35 +330,63 @@ It follows that:
Technically, reproducibility means:
- after a successful run, the stored result remains stable
- successful files are not re-evaluated by the AI
- AI calls are configured with as little variance as the API allows
- prompt version and model name are persisted
- AI calls are configured with as little variance as the respective API allows
- prompt version, model name, **and the name of the active provider** are persisted

---

## 11. AI Integration

### 11.1 Interface
The AI is connected exclusively via an **OpenAI-compatible HTTP interface**.
### 11.1 Supported Provider Families
The AI is connected via exactly **one** of the following provider families:

Base URL, model name, and API key are pure configuration.
1. **OpenAI-compatible HTTP interface**
   Chat-completions style. Suitable for OpenAI itself and for any API-compatible third-party provider.
2. **Native Anthropic Messages API**
   The official Anthropic interface for using Claude models.

### 11.2 Prompt
Exactly **one** provider is active per run. There is:
- **no** automatic fallback switching
- **no** parallel use of multiple providers within a run
- **no** profile management with multiple configurations per provider family

Selection happens exclusively via configuration. An error of the active provider is and remains an error of that single path and follows the existing retry and error semantics.

### 11.2 Architectural Embedding
- Per provider family there is **exactly one** implementation of the `AiNamingPort` in the `pdf-umbenenner-adapter-out` module.
- Provider-specific endpoints, headers, authentication schemes, and request/response structures live exclusively in the respective adapter implementation.
- Application and domain remain provider-neutral. They know neither the term "OpenAI" nor "Claude".
- The **bootstrap module** selects the single active implementation based on the configuration and wires it as the `AiNamingPort`.
- Adapters must not depend on each other. There is no shared "abstract AI adapter" as an infrastructure layer between the port and the concrete adapters.
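The bootstrap selection described above can be sketched as follows; the simplified `AiNamingPort` signature and the class name are illustrative, while the provider identifiers match the configuration example in section 14:

```java
/** Provider-neutral port as seen by the application core (signature simplified for the sketch). */
interface AiNamingPort {
    String proposeName(String documentText);
}

/** Sketch of the bootstrap wiring: exactly one adapter is chosen from configuration. */
public class AiProviderWiring {

    /** Selects the single active AiNamingPort implementation; unknown values are a hard startup error. */
    public static AiNamingPort select(String activeProvider,
                                      AiNamingPort openAiCompatibleAdapter,
                                      AiNamingPort claudeAdapter) {
        switch (activeProvider) {
            case "openai-compatible":
                return openAiCompatibleAdapter;
            case "claude":
                return claudeAdapter;
            default:
                // Maps to exit code 1: invalid configuration, no processing run starts.
                throw new IllegalArgumentException("Unknown ai.provider.active: " + activeProvider);
        }
    }
}
```

Because the application only ever sees the returned `AiNamingPort`, no fallback or parallel use can occur by construction.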

### 11.3 Uniform Domain Contract
Regardless of the active provider, the same domain contract applies:
- the same domain input (prompt, text excerpt, model reference)
- the same domain output (domain type `NamingProposal`)
- the same validation and follow-up processes in the application layer
- no provider-specific branching in the domain core

Every provider-specific response is mapped to the same domain type in the adapter. Special handling in the use case or the domain is not permitted.

### 11.4 Prompt
The prompt is **not** hard-wired in code.

Binding:
- external prompt file
- the prompt version or prompt filename is persisted as well
- the prompt may instruct the AI to output a German title
- the same prompt is used across providers; provider-specific adjustments happen exclusively in the adapter implementation

### 11.3 Text Volume
### 11.5 Text Volume
The complete extracted PDF text is not necessarily sent to the AI.

Binding:
- the maximum character count is configurable
- the limit must be technically applied before the AI call
- the limit applies regardless of provider
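The text limit is a simple, provider-independent truncation applied before the call; a minimal sketch (the class name is illustrative):

```java
/** Applies the configurable character limit before the AI call, regardless of provider. */
public class AiInputLimiter {

    /** Truncates the extracted text to at most maxCharacters; the original text stays untouched. */
    public static String limit(String extractedText, int maxCharacters) {
        if (maxCharacters < 0) {
            throw new IllegalArgumentException("max.text.characters must be >= 0");
        }
        return extractedText.length() <= maxCharacters
                ? extractedText
                : extractedText.substring(0, maxCharacters);
    }
}
```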

### 11.4 Response Format
The AI must deliver exactly one parseable JSON object.
### 11.6 Response Format
Regardless of the active provider, the AI must deliver exactly one parseable JSON object.

Recommended schema:

@@ -408,7 +404,9 @@ Rules:
- `date` is optional if no reliable date can be derived
- if the AI delivers no `date`, the application sets the current date as a fallback

### 11.5 Response Validation
How the adapter extracts this schema from the respective provider response (e.g. from `choices[].message.content` for OpenAI-compatible interfaces, or from the content-block array of the Anthropic Messages API) is purely an adapter implementation concern.

### 11.7 Response Validation
The response counts as technically usable only if:
- the JSON is parseable
- `title` is present
@@ -418,6 +416,11 @@ In addition, on the domain level:
- `title` must be validatable and usable
- a present `date` must be interpretable in the format `YYYY-MM-DD`

This validation is provider-independent and lives in application/domain.
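The provider-independent checks can be sketched with `java.time` alone; class and method names are illustrative, and strict resolution rejects impossible dates such as month 13:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.Optional;

/** Provider-independent validation of the parsed AI response (application/domain level). */
public class NamingResponseValidator {

    private static final DateTimeFormatter ISO_DATE =
            DateTimeFormatter.ofPattern("uuuu-MM-dd").withResolverStyle(ResolverStyle.STRICT);

    /** A title is usable only if present and non-blank. */
    public static boolean isTitleUsable(String title) {
        return title != null && !title.isBlank();
    }

    /** A present date must be strictly interpretable as YYYY-MM-DD; an absent date is allowed. */
    public static Optional<LocalDate> parseDate(String date) {
        if (date == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(LocalDate.parse(date, ISO_DATE));
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("uninterpretable date: " + date, e);
        }
    }
}
```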

### 11.8 Error Classification
Technical errors of the active provider (HTTP errors, timeouts, invalid response structures, authentication errors) are detected in the adapter and mapped onto the project's existing technical error semantics (transient vs. deterministic). No new error category is introduced. The inactive provider is never used as a backup in any error situation.

---

## 12. PDF Processing
@@ -457,6 +460,8 @@ Persistence is maintained on **two levels**:
1. **Document master record** per fingerprint
2. **Attempt history** with one record per processing attempt

The existing schema is retained. It is extended only by the information on **which provider** produced the respective attempt (see 13.4). No new source of truth is created.

### 13.3 Document Master Record
At minimum, the following should be stored:
- internal ID
@@ -485,6 +490,7 @@ To be stored **separately for each attempt**:
- error class
- error message
- retryable flag
- **provider identifier of the active AI provider for this attempt**
- model name
- prompt identifier
- number of pages processed
@@ -496,11 +502,16 @@ To be stored **separately for each attempt**:
- final title
- final target filename

The provider identifier makes every attempt unambiguously attributable to a provider family without changing the domain contract.

### 13.5 Sensitive Content
The complete raw AI response is stored in SQLite.

By default, it should **not be written to log files in full**.

### 13.6 Backward Compatibility
Existing data from the pre-v2 state must remain readable, updatable, and correctly interpretable. Schema extensions are additive, with defined default values for historical attempts without a provider identifier.

---

## 14. Configuration
@@ -508,33 +519,77 @@ By default, it should **not be written to log files in full**

### 14.1 Format
Technical configuration uses `.properties` files.

### 14.2 Minimum Parameters
### 14.2 Provider Selection
Exactly one provider is active. Selection happens via a single mandatory parameter that names the active provider. Permitted values are the identifiers of the supported provider families from section 11.1.

### 14.3 Minimum Parameters
Binding parameters:

- `source.folder`
- `target.folder`
- `sqlite.file`
- `api.baseUrl`
- `api.model`
- `api.timeoutSeconds`
- **`ai.provider.active`** – selection of the active provider (mandatory)
- `max.retries.transient`
- `max.pages`
- `max.text.characters`
- `prompt.template.file`

Each supported provider family has its own parameter namespace with at least:
- model name
- API key
- timeout
- base URL (optional, where operationally useful)

Concrete schema (recommended; identifiers freely choosable):

```properties
ai.provider.active=openai-compatible

ai.provider.openai-compatible.baseUrl=...
ai.provider.openai-compatible.model=...
ai.provider.openai-compatible.timeoutSeconds=...
ai.provider.openai-compatible.apiKey=...

ai.provider.claude.baseUrl=...
ai.provider.claude.model=...
ai.provider.claude.timeoutSeconds=...
ai.provider.claude.apiKey=...
```

Additionally recommended:
- `runtime.lock.file`
- `log.directory`
- `log.level`
- `api.key`
- `log.ai.sensitive`

### 14.3 API Key
The API key may be supplied via an environment variable or properties.
### 14.4 API Keys
API keys may be supplied via environment variables or properties.

Binding:
- the environment variable takes precedence
- each provider family has its **own defined environment variable**
- the environment variable takes **precedence** over the properties value of the same provider family
- keys of different provider families are never mixed
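The precedence rule can be sketched as a pure function; the environment-variable and property names used in the test are hypothetical examples, since the spec only requires that each provider family has its own defined variable:

```java
import java.util.Map;

/** Resolves the effective API key for one provider family: environment variable wins over properties. */
public class ApiKeyResolver {

    /**
     * env and properties are passed in as maps so the rule is testable in isolation;
     * envVarName is the provider family's own dedicated environment variable.
     */
    public static String resolve(String envVarName, String propertyKey,
                                 Map<String, String> env, Map<String, String> properties) {
        String fromEnv = env.get(envVarName);
        if (fromEnv != null && !fromEnv.isBlank()) {
            return fromEnv;
        }
        String fromProperties = properties.get(propertyKey);
        if (fromProperties != null && !fromProperties.isBlank()) {
            return fromProperties;
        }
        throw new IllegalStateException("No API key configured for " + envVarName);
    }
}
```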

### 14.4 Configuration Validation
At startup, all mandatory parameters must be validated.
### 14.5 Migration of Historical Configurations
Existing properties files from the pre-v2 state (with flat keys such as `api.baseUrl`, `api.model`, `api.timeoutSeconds`, `api.key`) are an unambiguously recognizable legacy form.

On the first start with a detected legacy form, the following is binding:
1. detect the legacy form
2. create a **`.bak` backup** of the original file
3. convert the content to the new schema
   - the legacy values are moved into the namespace of the **`openai-compatible`** provider family
   - `ai.provider.active` is set to `openai-compatible`
4. write the new file (in-place update)
5. reload and validate the file
6. only then continue with the normal run

It is **not** a goal to maintain the old and the new structure permanently as equally valid end formats.
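The key mapping of step 3 can be sketched as a pure transformation over the key/value pairs; file I/O and the `.bak` backup are deliberately left out, and the class name is illustrative:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Maps the flat pre-v2 keys into the openai-compatible namespace (step 3 of the migration). */
public class LegacyConfigMigration {

    /** A configuration is legacy if it has flat api.* keys and no active provider yet. */
    public static boolean isLegacy(Map<String, String> config) {
        return config.containsKey("api.baseUrl") && !config.containsKey("ai.provider.active");
    }

    /** Returns the migrated key/value view; writing the file and the .bak backup happen elsewhere. */
    public static Map<String, String> migrate(Map<String, String> legacy) {
        Map<String, String> migrated = new LinkedHashMap<>();
        migrated.put("ai.provider.active", "openai-compatible");
        for (Map.Entry<String, String> entry : legacy.entrySet()) {
            switch (entry.getKey()) {
                case "api.baseUrl" -> migrated.put("ai.provider.openai-compatible.baseUrl", entry.getValue());
                case "api.model" -> migrated.put("ai.provider.openai-compatible.model", entry.getValue());
                case "api.timeoutSeconds" -> migrated.put("ai.provider.openai-compatible.timeoutSeconds", entry.getValue());
                case "api.key" -> migrated.put("ai.provider.openai-compatible.apiKey", entry.getValue());
                default -> migrated.put(entry.getKey(), entry.getValue()); // non-AI keys pass through unchanged
            }
        }
        return migrated;
    }
}
```

Because the output of `migrate` no longer satisfies `isLegacy`, the migration runs at most once.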

### 14.6 Configuration Validation
At startup, all mandatory parameters must be validated, in particular:
- `ai.provider.active` is set and names a supported provider
- all mandatory values for the active provider are present and technically consistent
- no mandatory values are enforced for the **inactive** provider

With an invalid startup configuration:
- no processing run begins
@@ -553,6 +608,7 @@ Logging must include at least:
- run start
- run end
- run ID
- **the active AI provider for the run**
- the detected source file
- skipping of already successful files
- skipping of finally failed files
@@ -566,6 +622,7 @@ By default:
- the complete raw AI response **in SQLite**
- `reasoning` may be logged if operationally desired
- the output of sensitive content must be configurable
- the sensitivity rule applies regardless of provider

### 15.4 Storage Location
The log directory is configurable. Without explicit configuration, a local `logs/` directory in the program context is appropriate.
@@ -598,7 +655,7 @@ Binding interpretation:
- `1`: the run could not begin or continue properly due to a hard startup/bootstrap error

Typical `1` cases:
- invalid configuration
- invalid configuration (including a missing or unknown `ai.provider.active`)
- run lock cannot be acquired
- essential resources unavailable at startup

@@ -616,23 +673,30 @@ Not part of this architecture are:
- a human review workflow
- internal scheduler logic
- domain identification via filenames
- automatic fallback switching between AI providers
- parallel use of multiple AI providers in one run
- multiple competing configurations per provider family (profile management)
- provider families beyond those explicitly named in section 11.1

---

## 19. Final Assessment

With the rules laid down here, the technical target state is:
With the rules laid down in this revision, the technical target state is:
- consistent
- free of contradictions
- cleanly cut along hexagonal lines
- appropriate for a minimal production PDF renamer
- open to exactly two equally supported AI provider families without changing the domain core

Now clarified in a particularly binding way are:
Clarified in a particularly binding way are:
- the filename format `YYYY-MM-DD - Titel.pdf`
- the duplicate rule with `(1)`, `(2)`, ...
- the separation between final and retryable errors
- the fallback date set by the application
- two-level persistence with attempt history
- two-level persistence with attempt history including the provider identifier
- the exit-code rule for hard startup errors
- an OpenAI-compatible interface without a hard-wired provider
- support for the OpenAI-compatible interface **and** the native Anthropic Messages API
- exactly **one** active provider per run, without fallback
- moving technical persistence objects out of the domain
- migration of historical flat properties configuration with a `.bak` backup

669
docs/workpackages/M5 - Arbeitspakete.md
Normal file
@@ -0,0 +1,669 @@

# M5 - Work Packages

## Scope

This document describes exclusively the work packages for the defined milestone **M5 – AI integration, prompt sourcing, and validated naming proposal**.

Milestones **M1**, **M2**, **M3**, and **M4** are assumed to be fully implemented.

The work packages are deliberately cut so that:

- **AI 1** can derive a clear individual prompt from each work package,
- **AI 2** can fully implement exactly that one work package in **a single pass**,
- after **every** work package there is again an **error-free, buildable state**.

The order of the work packages is binding.

## Additional Cutting Rules for AI Processing

- Per work package, change only the **minimally necessary cross-sections** through domain, application, adapters, and bootstrap.
- Make no assumptions that are not covered by this document or the binding specifications.
- No anticipation of **M6+**.
- No restructuring of existing M1–M4 structures without a direct M5 connection.
- Cut new types, ports, status values, migrations, and adapters so that they are **clearly nameable, testable, and reviewable** from within a single work package.
- M5 may **deliberately evolve** existing M4 persistence, but must not silently reinvent it.
- Every positive M5 intermediate state must already be modeled so that **M6 can build on it without a status break**.
- In domain terms, M5 ends with a **persisted, validated naming proposal**; **not** with a target copy.

## Explicitly Not Part of M5

- physical target copy into the target folder
- the final filename format `YYYY-MM-DD - Titel.pdf` as a technical output operation
- duplicate handling `(1)`, `(2)` in the target folder
- Windows character sanitization for the final target filename
- target path and target filename persistence
- atomic writes or temporary target files
- the complete end-to-end success semantics of the later production end state from M6
- the logging fine-tuning of the end state from M7
- the complete cross-run retry logic of later milestones for all error types beyond the scope concretely needed in M5
- manual post-processing or user interaction

## Binding M5 Rules for **All** Work Packages

### 1. Status and Transition Semantics between M4, M5, and M6

The positive interim meaning of `SUCCESS` used in M4 is **no longer sufficient** from M5 onward, because M5 produces a validated naming proposal while only M6 writes the target copy.

Therefore, from M5 onward, the following is binding:

- `SUCCESS` is **reserved for the real end-to-end success after M6** from M5 onward.
- M5 introduces the two **non-terminal positive status values**:
  - `READY_FOR_AI` = the document is prepared up to and including M3, but **no** valid M5 naming proposal exists yet.
  - `PROPOSAL_READY` = a valid M5 naming proposal is persistently present; the document is **finished for M5** but **not yet** successfully copied in the sense of M6.
- Existing **M4 legacy records** with the positive status `SUCCESS` but **without** an M5 naming proposal must be **transferred to `READY_FOR_AI` in a controlled way** in M5.
- The existing negative and skip status values are retained:
  - `FAILED_RETRYABLE`
  - `FAILED_FINAL`
  - `SKIPPED_ALREADY_PROCESSED`
  - `SKIPPED_FINAL_FAILURE`
- For M5-specific repeat runs:
  - `PROPOSAL_READY` is **not processed again by the AI**,
  - `SUCCESS` is **not processed again**,
  - `FAILED_FINAL` is **not processed again**,
  - `READY_FOR_AI` and `FAILED_RETRYABLE` remain processable.
- As a binding handover rule for M6 already:
  - `PROPOSAL_READY` is **not a terminal overall success** but the correct domain input state for filename construction and target copy.
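The status rules above can be sketched as an enum; the status names are taken from this document, while the predicate names are illustrative:

```java
/** Document status values as extended by M5; names taken from the milestone rules. */
public enum DocumentStatus {
    READY_FOR_AI,
    PROPOSAL_READY,
    SUCCESS,
    FAILED_RETRYABLE,
    FAILED_FINAL,
    SKIPPED_ALREADY_PROCESSED,
    SKIPPED_FINAL_FAILURE;

    /** Only READY_FOR_AI and FAILED_RETRYABLE may be picked up by an M5 repeat run. */
    public boolean isProcessableInM5() {
        return this == READY_FOR_AI || this == FAILED_RETRYABLE;
    }

    /** PROPOSAL_READY is the correct (non-terminal) input state for the M6 target copy. */
    public boolean isInputForM6() {
        return this == PROPOSAL_READY;
    }
}
```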

### 2. External Prompt Sourcing

- In M5, the prompt is **not** hard-wired in code.
- The application loads the prompt actually used from an **external file**.
- The prompt used for an AI attempt must be traceable via a **stable prompt identifier**, at minimum via the prompt filename or an equally stable identifier.

### 3. Deterministic AI Request Composition

M5 introduces **no freely invented templating system**.

Instead:

- The application produces a **fully deterministic request representation** for the AI port.
- This representation consists of at least:
  - the prompt content,
  - the prompt identifier,
  - the limited document text,
  - the character count actually sent,
  - the binding JSON-only expectation for `date`, `title`, `reasoning`.
- Prompt and document text are combined via a **fixed, documented technical layout**, so that AI 2 can implement this without room for interpretation.
- Provider-specific features for structured output may be used **optionally internally**, but are **not** a prerequisite of the M5 design.

### 4. AI Configuration and Effective API Key

For M5, the AI-related configuration values already foreseen in the target picture must be technically usable and wired, in particular:

- `api.baseUrl`
- `api.model`
- `api.timeoutSeconds`
- `max.text.characters`
- `prompt.template.file`

The API key remains configuration; the existing priority rule **environment variable over properties** remains binding.

Additionally, for M5:

- the **effective** API key must not only be resolved but also **actually used in the HTTP adapter**,
- its use must be demonstrable through automated tests.

### 5. Limiting the Content Sent to the AI

- The maximum character count of the document content given to the AI is configurable.
- The limit must be technically applied **before the AI call**.
- The character count actually sent to the AI must be available and persistable.
- The limit does **not** modify the extracted original text in the M3/M4 sense, only the **AI input text used for M5**.

### 6. Complete M5 Rule for Title and Date

M5 must implement the domain rule **operationally in full**, without introducing unreliable heuristic experiments.

Therefore, in combination:

- The **prompt contract** bindingly commits the AI to the expectations that are not purely technically checkable:
  - title in German,
  - understandable,
  - sufficiently unambiguous,
  - proper names unchanged.
- The **application validation** additionally checks all rules that are **objectively checkable** in M5:
  - `title` present,
  - `reasoning` present,
  - base title at most 20 characters,
  - no invalid special characters other than spaces,
  - no generic placeholder titles,
  - a present `date` is interpretable as `YYYY-MM-DD`.
- M5 introduces **no** speculative language detection, no unsafe proper-name normalization, and no soft heuristics for "understandability" that would, in domain terms, rather create new errors.
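The objectively checkable title rules can be sketched as a pure validator; note that the list of generic placeholder titles is an illustrative assumption, not fixed by this document:

```java
import java.util.Set;

/** Objectively checkable M5 title rules; the generic-placeholder list is illustrative. */
public class TitleValidator {

    private static final int MAX_BASE_TITLE_LENGTH = 20;

    // Example placeholders; the real list is a project decision, not fixed by this sketch.
    private static final Set<String> GENERIC_TITLES = Set.of("dokument", "scan", "pdf", "datei");

    /** Letters, digits, and spaces only; umlauts count as ordinary letters here. */
    private static boolean hasOnlyAllowedCharacters(String title) {
        for (char c : title.toCharArray()) {
            if (!Character.isLetterOrDigit(c) && c != ' ') {
                return false;
            }
        }
        return true;
    }

    public static boolean isValid(String title) {
        return title != null
                && !title.isBlank()
                && title.length() <= MAX_BASE_TITLE_LENGTH
                && hasOnlyAllowedCharacters(title)
                && !GENERIC_TITLES.contains(title.toLowerCase());
    }
}
```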

### 7. Technical AI Response Contract

In M5, the AI response must be processed as **exactly one parseable JSON object**.

Recommended schema in the M5 sense:

```json
{
  "date": "2026-02-11",
  "title": "Stromabrechnung",
  "reasoning": "..."
}
```

Here:

- `title` is mandatory,
- `reasoning` is mandatory,
- `date` is optional,
- additional fields may be tolerated technically but are irrelevant for M5 in domain terms,
- additional free text outside the JSON object makes the response **technically unusable**.
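The "exactly one JSON object, no surrounding free text" part of the contract can be checked structurally before handing the text to a JSON library; this sketch only verifies the single-object shape (brace balance with string awareness), not full JSON validity, and the class name is illustrative:

```java
/** Checks the "exactly one JSON object, no surrounding free text" part of the response contract. */
public class SingleJsonObjectCheck {

    /** Returns true if the response is one {...} object with only whitespace around it. */
    public static boolean isSingleJsonObject(String response) {
        String trimmed = response.trim();
        if (trimmed.isEmpty() || trimmed.charAt(0) != '{') {
            return false;
        }
        int depth = 0;
        boolean inString = false;
        boolean escaped = false;
        for (int i = 0; i < trimmed.length(); i++) {
            char c = trimmed.charAt(i);
            if (inString) {
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    // The object must end exactly where the (trimmed) response ends.
                    return i == trimmed.length() - 1;
                }
            }
        }
        return false;
    }
}
```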

### 8. Technically Unusable vs. Domain-Unusable AI Results

For M5, this distinction is binding:

**Technically unusable AI results** are in particular:
- AI unreachable,
- timeout,
- technically failed HTTP call,
- unparseable AI response,
- technically incomplete response lacking mandatory structural parts,
- additional text outside the expected single JSON object.

In the M5 sense, these cases are **document-related technical errors**.

**Domain-unusable AI results** are in particular:
- an unusable or generic title,
- a title violating the technically checkable title rules of the target picture,
- the AI delivers a `date`, but it is uninterpretable or unusable.

In the M5 sense, these cases are **deterministic content errors**.

### 9. Date Resolution and Date Source

- If the AI delivers a valid `date`, it is used as the resolved date.
- If the AI delivers **no** `date`, the application sets the fallback to the current date via the technical clock (`ClockPort`).
- If the AI delivers a `date` that is unusable, that is **not** a fallback case but a domain error case.
- The date source must be traceable and persistable in the M5 state.
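The three date cases above can be sketched with a `java.time.Clock` standing in for the `ClockPort`; the type names `ResolvedDate` and `DateSource` are illustrative:

```java
import java.time.Clock;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

/** Resolves the M5 date together with its source, with the clock fallback for an absent date. */
public class DateResolution {

    public enum DateSource { AI, CLOCK_FALLBACK }

    public record ResolvedDate(LocalDate date, DateSource source) {}

    /**
     * aiDate == null means "the AI delivered no date" (fallback case);
     * an unparseable aiDate is a domain error, not a fallback.
     */
    public static ResolvedDate resolve(String aiDate, Clock clock) {
        if (aiDate == null) {
            return new ResolvedDate(LocalDate.now(clock), DateSource.CLOCK_FALLBACK);
        }
        try {
            return new ResolvedDate(LocalDate.parse(aiDate), DateSource.AI);
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("unusable AI date: " + aiDate, e);
        }
    }
}
```

Injecting the clock keeps the fallback path deterministic and testable, which is exactly why the architecture routes time through a port.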
|
||||
|
||||
### 10. M5-Benennungsvorschlag und führende Quelle für M6
|
||||
|
||||
Der M5-Benennungsvorschlag besteht mindestens aus:
|
||||
|
||||
- aufgelöstem Datum,
|
||||
- Datumsquelle,
|
||||
- validiertem Basistitel,
|
||||
- KI-Begründung (`reasoning`).
|
||||
|
||||
**Nicht** Bestandteil des M5-Benennungsvorschlags sind:
|
||||
|
||||
- finaler vollständiger Dateiname,
|
||||
- Dubletten-Suffix,
|
||||
- Zielpfad,
|
||||
- physische Zielkopie.
|
||||
|
||||
Für die Übergabe an M6 gilt verbindlich:
|
||||
|
||||
- die **führende Quelle** des gültigen M5-Benennungsvorschlags ist der **neueste Versuchshistorieneintrag mit Ergebnisstatus `PROPOSAL_READY`**,
|
||||
- M5 führt **keine parallele zweite Wahrheitsquelle** für Datum/Titel/Reasoning im Dokument-Stammsatz ein,
|
||||
- der Dokument-Stammsatz enthält in M5 weiterhin den Gesamtstatus und die Zähler, nicht aber redundante M6-Zieldaten.
|
||||
|
||||
### 11. Extending the attempt history in M5

The attempt history already present in M4 is extended in M5 in a targeted way. For every document-related, identified M5 attempt, at least the following must be storable:

- model name,
- prompt identifier or prompt file name,
- number of pages processed,
- number of characters sent to the AI,
- raw AI response,
- AI reasoning,
- resolved date,
- date source,
- validated title.

The M5 attempt history does not include the target path or the final target file name.

### 12. Error semantics and counter updates in M5

M5 extends the error semantics already present in M4:

- document-related **technical** AI errors after successful fingerprint determination remain retryable and run through the **transient-error counter**,
- deterministic **business-level** AI result errors run through the **content-error counter**,
- the existing M4 rule "first deterministic content error retryable, second final" is retained and also applies to M5 deterministic content errors,
- skip cases from M4 remain unchanged,
- pre-fingerprint errors remain **non-historized** run events.

### 13. No anticipation of M6

In business terms, M5 ends with a **persisted, validated naming proposal**. M5 performs **no** target copy and creates **no** final target file name as a technical file-system operation.

---
## AP-001 Specify the M5 core objects, the status model, and the AI/prompt port contracts

### Prerequisite
None. This work package is the M5 starting point.

### Goal
The M5-relevant types, positive intermediate statuses, AI/prompt contracts, and result semantics are introduced unambiguously so that later work packages can be implemented without room for interpretation.

### Must be implemented
- Create the new M5-relevant core objects or application-level types, in particular for:
  - the external prompt source,
  - the prompt identifier,
  - a deterministic AI request representation,
  - the raw AI response,
  - a parseable AI response model,
  - the validated naming proposal,
  - the date source,
  - AI-related traceability data per attempt,
  - the technical vs. business-level AI error classification.
- Extend the status model specifically with the non-terminal positive intermediate statuses:
  - `READY_FOR_AI`
  - `PROPOSAL_READY`
- Document the semantics of these status values in JavaDoc and, where appropriate, `package-info` so that it is clear that:
  - `READY_FOR_AI` is **processable by M5**,
  - `PROPOSAL_READY` is **complete for M5** but **processable further by M6**,
  - from M5 on, `SUCCESS` remains reserved for the real M6 end-to-end success.
- Define outbound ports for:
  - loading the external prompt,
  - the AI call via an OpenAI-compatible interface.
- Cut the port contracts so that **neither `Path`/`File` nor HTTP/JSON library types** leak into the domain or application layers.
- Design the return models so that later work packages can distinguish, without additional assumptions, between:
  - a technically successful AI call with a raw response,
  - a technical AI error,
  - a parseable response,
  - a technically unusable response,
  - a business-validated naming proposal,
  - a business-unusable naming proposal.
- Describe the M5-specific semantics of `date`, `title`, and `reasoning` unambiguously in JavaDoc.
- Explicitly document that M5 creates **no final target file name** and **no target copy**.
- Explicitly document that the leading source for M6 is **the most recent attempt with `PROPOSAL_READY`**.

### Explicitly not part of this package
- loading the prompt file
- the HTTP adapter
- AI JSON parsing
- business-level title and date validation
- persistence changes
- batch integration

### Done when
- the M5-relevant types and port contracts exist,
- technical and business-level M5 result kinds are modeled so they are clearly distinguishable,
- the positive intermediate statuses are defined without contradicting M6,
- domain and application remain free of infrastructure types,
- the build still compiles without errors.
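The result kinds listed above can, for example, be modeled as small sealed hierarchies so that callers must handle every case explicitly; the type names below are illustrative assumptions, not the spec's mandated identifiers:

```java
/** Illustrative sketch: outcome of the technical AI call (assumed names). */
sealed interface AiCallResult permits AiCallResult.RawResponse, AiCallResult.TechnicalError {
    record RawResponse(String body) implements AiCallResult {}
    record TechnicalError(String detail) implements AiCallResult {}
}

/** Illustrative sketch: outcome of parsing and validating the response (assumed names). */
sealed interface ProposalResult
        permits ProposalResult.Valid, ProposalResult.TechnicallyUnusable, ProposalResult.BusinessInvalid {
    record Valid(String date, String title, String reasoning) implements ProposalResult {}
    record TechnicallyUnusable(String detail) implements ProposalResult {} // e.g. not exactly one JSON object
    record BusinessInvalid(String detail) implements ProposalResult {}     // e.g. generic or over-long title
}
```

A sealed split like this keeps the "no additional assumptions" requirement enforceable by the compiler: a `switch` over the result type cannot silently ignore a case.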
---

## AP-002 Load the external prompt, identify it stably, and assemble the deterministic AI request

### Prerequisite
AP-001 is complete.

### Goal
The prompt actually used is loaded from an external file, made stably identifiable, and, together with the document text, transformed into a clearly defined, deterministic AI request representation.

### Must be implemented
- Implement the outbound adapter for loading the external prompt file.
- Derive a **stable prompt identifier** from the loaded prompt, at minimum via the prompt file name or an equally stable identifier.
- Ensure that an empty or technically unusable prompt is not passed on as valid M5 input.
- Implement a **firmly documented technical layout** of the AI request representation that contains at least:
  - the prompt content,
  - the prompt identifier,
  - the document text as its own block,
  - the JSON-only expectation for `date`, `title`, `reasoning`.
- Cut the assembly so that KI 2 knows **without any implicit decision** in which order prompt and document text are combined.
- Cut the mechanism so that **no freely invented templating system** has to be introduced.
- Add JavaDoc for the prompt source, the identifier, the deterministic assembly, the JSON-only contract, and the M5 non-goals.

### Explicitly not part of this package
- the HTTP call to the AI
- text limiting
- AI response validation
- batch integration
- persistence of the prompt metadata

### Done when
- the prompt can technically be loaded from an external file,
- a stable prompt identifier is available,
- the AI request representation is assembled deterministically,
- no ad-hoc templating was introduced,
- the state still builds without errors.
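A minimal sketch of such a deterministic assembly, assuming a fixed prompt-first order and a plain separator line (both the order and the separator are assumptions, not mandated by the spec):

```java
/** Illustrative sketch: deterministic, order-fixed AI request assembly (assumed layout). */
final class AiRequestComposer {
    private AiRequestComposer() {}

    /** Prompt first, then a fixed separator, then the document text as its own block. */
    static String compose(String promptText, String documentText) {
        return promptText
                + "\n\n--- DOCUMENT TEXT ---\n"
                + documentText;
    }
}
```

Because the function is pure, equal inputs always yield the same request string, which is exactly what makes the assembly reproducible and testable without any templating system.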
---

## AP-003 Implement an OpenAI-compatible AI HTTP adapter with effective configuration and controlled technical error behavior

### Prerequisite
AP-001 and AP-002 are complete.

### Goal
A technically encapsulated adapter can execute an M5 AI call against an OpenAI-compatible HTTP interface, actually use the **effective** API key, and deliver a raw response or a controlled technical error.

### Must be implemented
- Implement the OpenAI-compatible HTTP adapter in the outbound adapter layer.
- Make base URL, model name, timeout, and API access technically usable from the existing configuration.
- Cut the adapter so that domain and application only work with the **abstract AI port**.
- Deliver a technical raw response that can be parsed and validated in later work packages.
- Implement controlled error behavior at least for the following cases:
  - timeout,
  - unreachable endpoint,
  - technically failed HTTP call,
  - other technical communication errors.
- Ensure that the **effective API key** is **actually used** in the adapter according to the existing priority rule.
- Ensure that no HTTP, authentication, or JSON implementation details leak into the domain or application layers.
- Add JavaDoc for OpenAI compatibility, technical error boundaries, configuration usage, and the M5 non-goals.

### Explicitly not part of this package
- business-level validation of the AI response
- persistence
- batch orchestration
- text limiting
- date fallback
- the final naming proposal

### Done when
- the AI port is technically implemented,
- the configuration for base URL, model name, timeout, and effective API key is effectively used,
- raw responses and technical errors are delivered in a controlled way via the port,
- the build is still error-free.
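Request construction with the JDK's built-in HTTP client might look like this; the `/chat/completions` endpoint path, the bearer header, and the JSON body are assumptions about a typical OpenAI-compatible API, not requirements fixed by the spec:

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Illustrative sketch: building the OpenAI-compatible request (assumed endpoint and body). */
final class OpenAiRequestFactory {
    private OpenAiRequestFactory() {}

    static HttpRequest chatCompletionRequest(String baseUrl, String apiKey,
                                             long timeoutSeconds, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/chat/completions"))
                .timeout(Duration.ofSeconds(timeoutSeconds))       // configured api.timeoutSeconds
                .header("Authorization", "Bearer " + apiKey)       // the effective API key must really be used
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

Keeping request construction in a factory like this makes the "effective API key is really used" requirement directly testable on the built `HttpRequest`, without any network call.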
---

## AP-004 Implement text limiting, AI JSON parsing, the full M5 validation, and date resolution

### Prerequisite
AP-001 through AP-003 are complete.

### Goal
The application can prepare a limited document text for the AI call, interpret the raw AI response as JSON, and derive from it either a validated naming proposal or an unambiguously classified error.

### Must be implemented
- Implement a configurable limit on the document text sent to the AI.
- Ensure that the limit is applied **before the AI call**.
- Technically record the number of characters actually sent and make it available for later persistence.
- Implement parsing of the raw AI response as **exactly one JSON object**.
- Implement the technical response validation:
  - `title` present and not empty,
  - `reasoning` present and not empty,
  - `date` optional,
  - additional free text outside the JSON object makes the response technically invalid.
- Implement the business-level M5 validation for the **objectively checkable** title and date rules, in particular:
  - base title at most 20 characters,
  - no disallowed special characters other than spaces,
  - no generic placeholder titles,
  - a `date`, if present, must be interpretable as `YYYY-MM-DD`.
- Secure the part of the business title rule that is not purely technically checkable explicitly via the **prompt contract**, and document this cleanly in code/JavaDoc:
  - German,
  - understandable,
  - sufficiently unambiguous,
  - proper names unchanged.
- Implement explicitly:
  - missing `date` → fallback via the `ClockPort`,
  - `date` present but unusable → business-level error,
  - parseable but business-unusable titles → business-level error.
- Provide a validated M5 naming proposal that contains at least:
  - the resolved date,
  - the date source,
  - the validated title,
  - the AI reasoning.
- Add JavaDoc for the limiting rule, the response validation, the date fallback, and the operative completeness of the title rule.

### Explicitly not part of this package
- the HTTP adapter
- the SQLite schema extension
- batch integration
- status and counter updates
- final file name construction
- the target copy

### Done when
- document text can be limited,
- parseable and unparseable AI responses are cleanly distinguishable,
- a missing date correctly falls back to the clock,
- business-unusable titles and date values are cleanly detected,
- a validated M5 naming proposal is technically available,
- the build is still error-free.
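The objectively checkable title and date rules can be sketched as pure functions; the allowed-character pattern and the concrete generic-title list are assumptions about one plausible reading of the rules:

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Set;

/** Illustrative sketch of the objectively checkable M5 title and date rules (assumed details). */
final class ProposalValidation {
    private ProposalValidation() {}

    // Assumed: letters (incl. German umlauts), digits, and spaces are allowed; everything else is not.
    private static final String ALLOWED = "[\\p{L}\\p{Nd} ]+";
    // Assumed examples of generic placeholder titles; the real list is a project decision.
    private static final Set<String> GENERIC = Set.of("dokument", "scan", "datei", "pdf");

    static boolean isValidTitle(String title) {
        return title != null
                && !title.isBlank()
                && title.length() <= 20                       // base-title limit
                && title.matches(ALLOWED)                     // no special characters except spaces
                && !GENERIC.contains(title.trim().toLowerCase());
    }

    /** A present date must be strict YYYY-MM-DD; a missing date is handled by the clock fallback elsewhere. */
    static boolean isUsableDate(String date) {
        try {
            LocalDate.parse(date); // ISO-8601, i.e. YYYY-MM-DD
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }
}
```

The non-technical parts of the title rule (German, understandable, unambiguous, proper names unchanged) deliberately stay out of this code: per the spec they are secured via the prompt contract.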
---

## AP-005 Evolve the SQLite schema from M4 to M5, migrate legacy data, and extend the attempt history with M5 traceability

### Prerequisite
AP-001 through AP-004 are complete.

### Goal
The existing M4 persistence is extended to the M5 state in a controlled way, including the **idempotent migration of positive M4 legacy records** and the extension of the attempt history with the M5 traceability data.

### Must be implemented
- **Evolve** the existing SQLite schema; do not reinvent it.
- Extend the schema initialization so that an existing M4 schema can be brought to the M5 state in a controlled way.
- Extend the attempt history with the fields needed for M5, in particular for:
  - model name,
  - prompt identifier or prompt file name,
  - number of pages processed,
  - number of characters sent to the AI,
  - raw AI response,
  - AI reasoning,
  - resolved date,
  - date source,
  - validated title.
- Extend the persistence so that the new non-terminal positive status values can be stored consistently:
  - `READY_FOR_AI`
  - `PROPOSAL_READY`
- Provide an **idempotent M4→M5 legacy migration** that ensures at least that:
  - M4 documents with the positive legacy status `SUCCESS` but without an M5 naming proposal are transferred to `READY_FOR_AI` in a controlled way,
  - this migration can be executed multiple times without corrupting data,
  - existing negative and terminal error states remain untouched.
- Ensure that:
  - existing M4 data remains readable,
  - the M5 extension can be initialized idempotently,
  - no M6+ fields for the target path or the final target file name are created.
- Extend the repository mapping of the attempt history to the new M5 fields.
- If needed for tests or use-case integration, add matching read capabilities without anticipating later reporting functionality.
- Add JavaDoc for the schema evolution, the legacy migration, backward compatibility, and the M5 traceability.

### Explicitly not part of this package
- batch use-case integration
- the AI call
- status and counter decisions in the running batch
- target path or file name persistence
- M6 file-system functionality

### Done when
- the M4 schema can be extended to M5 in a controlled way,
- M4 legacy records with a positive intermediate status are transferred idempotently to the M5 semantics,
- the new M5 fields in the attempt history can technically be written and read,
- existing M4 data does not become unusable,
- no M6+ persistence fields were introduced,
- the state still builds without errors.
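The idempotent legacy migration could, for example, be a guarded `UPDATE`; the table and column names below are assumptions for illustration only, since the spec does not fix them:

```sql
-- Illustrative sketch only (assumed table/column names).
-- Additive, re-runnable schema extension: the initializer must tolerate
-- duplicate-column errors, since SQLite has no IF NOT EXISTS for columns.
ALTER TABLE attempt_history ADD COLUMN model_name TEXT;
ALTER TABLE attempt_history ADD COLUMN prompt_id TEXT;

-- Idempotent M4->M5 legacy migration: running it twice changes nothing,
-- because migrated rows no longer match the WHERE clause.
UPDATE document
SET status = 'READY_FOR_AI'
WHERE status = 'SUCCESS'
  AND NOT EXISTS (SELECT 1 FROM attempt_history a
                  WHERE a.document_id = document.id
                    AND a.result_status = 'PROPOSAL_READY');
```

The `NOT EXISTS` guard is what makes the migration re-runnable: a migrated row has status `READY_FOR_AI` and therefore falls out of the `WHERE` clause, and terminal error states are never touched.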
---

## AP-006 Implement the M5 decision logic and batch integration for the AI call, error classification, status updates, and the persisted naming proposal

### Prerequisite
AP-001 through AP-005 are complete.

### Goal
The existing M4 run is extended into a real M5 run that, after a successful M3 pre-check, produces a validated naming proposal via the AI, persists the AI-related attempt data, and advances the positive status semantics so that M6 can attach without a status break.

### Must be implemented
- Extend the existing batch use case so that for each suitable document, after M4 identification and after a passed M3 pre-check, additionally:
  1. the prompt is loaded,
  2. the document text is limited,
  3. the deterministic AI request is created,
  4. the AI call is executed,
  5. the raw AI response is processed technically,
  6. the naming proposal is validated,
  7. the attempt history is advanced with the M5 traceability data,
  8. status and counters are advanced consistently within the existing framework.
- Ensure that **no AI call** happens for:
  - pre-fingerprint errors,
  - terminal skip cases from M4,
  - M3 content errors,
  - other cases that already cleanly ended the document run before the AI.
- Handle the positive document states explicitly as follows:
  - an M4 legacy `SUCCESS` without an M5 naming proposal is **not** skipped as terminally successful but counts, after migration or normalization, as `READY_FOR_AI`,
  - a valid M5 naming proposal leads to `PROPOSAL_READY`, **not** to `SUCCESS`,
  - an existing `PROPOSAL_READY` leads to **no renewed AI call** in M5.
- Map the M5 error cases explicitly into the existing status and counter framework as follows:
  - technical AI error or technically unusable AI response → document-related technical error, retryable, **transient-error counter +1**,
  - parseable but business-unusable AI response → deterministic content error, advance the **content-error counter** per the existing rule,
  - valid naming proposal → `PROPOSAL_READY`, error counters unchanged.
- Ensure that the batch run continues in a controlled way for the other documents when a document-related AI error occurs.
- For identified documents, ensure that the attempt history and the master record are advanced consistently and no partially persisted state is left behind.
- Ensure that the leading source for M6 remains **the most recent attempt with `PROPOSAL_READY`** and does not have to be reconstructed implicitly from other data.
- Ensure that the raw AI response is stored **in SQLite** by default but is not mismodeled as M6/M7 functionality.
- Add JavaDoc for the M5 run order, the AI boundary, the error classification, the positive status progression, and persistence consistency.

### Explicitly not part of this package
- the physical target copy
- final file name construction
- duplicate handling
- target path and target file name persistence
- M6 or M7 polishing

### Done when
- the batch run actually produces an AI-based naming proposal for suitable documents,
- M4 legacy records with a positive intermediate status are correctly fed into the M5 processing,
- valid M5 results are persisted as `PROPOSAL_READY` and **not** as `SUCCESS`,
- M5-specific technical and business-level AI errors are cleanly distinguished,
- the validated naming proposal is persisted,
- AI calls happen only at the places allowed by the business rules,
- the run keeps working in a controlled way despite document-related AI errors,
- still no M6+ functionality is included.
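The outcome-to-counter mapping above can be expressed as one pure decision function; the type names are illustrative assumptions, and the first-vs-second content-error escalation rule is deliberately left out to keep the sketch small:

```java
/** Illustrative sketch: mapping M5 outcome kinds to status and counter deltas (assumed names). */
final class M5OutcomeClassifier {
    enum Outcome { TECHNICAL_AI_ERROR, TECHNICALLY_UNUSABLE_RESPONSE, BUSINESS_INVALID_PROPOSAL, VALID_PROPOSAL }

    record Decision(String status, int transientDelta, int contentDelta) {}

    static Decision classify(Outcome outcome) {
        return switch (outcome) {
            // technical error or technically unusable response: retryable, transient counter +1
            case TECHNICAL_AI_ERROR, TECHNICALLY_UNUSABLE_RESPONSE -> new Decision("FAILED_RETRYABLE", 1, 0);
            // parseable but business-invalid: deterministic content error, content counter +1
            // (simplified: the "first retryable, second final" escalation lives outside this sketch)
            case BUSINESS_INVALID_PROPOSAL -> new Decision("FAILED_RETRYABLE", 0, 1);
            // valid proposal: PROPOSAL_READY, counters unchanged
            case VALID_PROPOSAL -> new Decision("PROPOSAL_READY", 0, 0);
        };
    }
}
```

Keeping this decision pure and side-effect-free is what makes the counter rules unit-testable without a database or an AI stub.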
---

## AP-007 Adapt bootstrap and CLI for the M5 configuration, startup validation, schema evolution, and complete wiring

### Prerequisite
AP-001 through AP-006 are complete.

### Goal
The program entry point is cleanly adapted to the M5 run; the AI-relevant configuration is validated, the M5 schema evolution including the legacy migration takes effect at startup, all M5 building blocks are wired, and hard startup errors still lead to exit code 1 in a controlled way.

### Must be implemented
- Extend the bootstrap wiring to the new M5 ports, adapters, validation, and persistence building blocks.
- Add or wire the M5-relevant configuration, in particular for:
  - `api.baseUrl`
  - `api.model`
  - `api.timeoutSeconds`
  - `max.text.characters`
  - `prompt.template.file`
- Extend the startup validation to check at least that:
  - the prompt file exists and is technically readable,
  - `max.text.characters` is valid and technically usable,
  - the timeout is valid,
  - the AI base URL and model name are present,
  - an effective API key is available according to the existing priority rule.
- Cleanly combine the existing M4 schema initialization with the M5 schema evolution and the M4→M5 legacy migration.
- Ensure that hard startup, wiring, configuration, or initialization errors still lead to **exit code 1**.
- Ensure that document-related AI errors are **not** mismodeled as startup errors.
- Add JavaDoc and `package-info` for the updated wiring, configuration, schema evolution, and module boundaries.

### Explicitly not part of this package
- logging polish of the final state
- M6 file-system wiring
- the target copy
- duplicate logic
- later operational optimizations

### Done when
- the program is fully startable in the M5 state,
- all M5 building blocks are wired correctly,
- the M5 startup validation takes effect,
- the M5 schema evolution including the legacy migration takes effect at startup,
- hard startup errors still lead to exit code 1 in a controlled way,
- the build remains error-free.
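A sketch of these startup checks as one pure validation step; the record shape is an assumption, the key-priority detail (environment variable wins over `api.key` from the properties) follows the existing priority rule, and the prompt-file readability check is omitted here because it needs file-system access:

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of the M5 startup validation (assumed record shape). */
record M5Config(String baseUrl, String model, long timeoutSeconds, int maxTextCharacters,
                String apiKeyFromEnv, String apiKeyFromProperties) {

    /** Returns all violations; an empty list means the configuration is usable. */
    List<String> validate() {
        List<String> errors = new ArrayList<>();
        if (baseUrl == null || baseUrl.isBlank()) errors.add("api.baseUrl missing");
        if (model == null || model.isBlank()) errors.add("api.model missing");
        if (timeoutSeconds <= 0) errors.add("api.timeoutSeconds invalid");
        if (maxTextCharacters <= 0) errors.add("max.text.characters invalid");
        if (effectiveApiKey() == null) errors.add("no effective API key available");
        return errors;
    }

    /** Priority rule: the environment variable wins over the properties value. */
    String effectiveApiKey() {
        if (apiKeyFromEnv != null && !apiKeyFromEnv.isBlank()) return apiKeyFromEnv;
        if (apiKeyFromProperties != null && !apiKeyFromProperties.isBlank()) return apiKeyFromProperties;
        return null;
    }
}
```

Bootstrap can then map a non-empty violation list to exit code 1, while document-related AI errors never pass through this path at all.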
---

## AP-008 Complete the tests for the prompt source, AI adapter, validation, legacy migration, status semantics, schema evolution, and the M5 flow

### Prerequisite
AP-001 through AP-007 are complete.

### Goal
The complete M5 target state is secured by automated tests and demonstrated as a consistent handover state.

### Must be implemented
- Implement unit tests for the external prompt source, in particular that:
  - the prompt file is loaded,
  - an empty or technically invalid prompt is rejected,
  - a stable prompt identifier is delivered,
  - the deterministic assembly of the AI request stays reproducible.
- Implement tests for text limiting and the number of characters actually sent.
- Implement tests for AI JSON parsing and the technical response validation, in particular for:
  - valid JSON,
  - invalid JSON,
  - missing `title`,
  - missing `reasoning`,
  - optional `date`,
  - additional text outside the JSON object.
- Implement tests for the business-level M5 validation, in particular for:
  - a valid title,
  - a generic title,
  - a title with disallowed special characters,
  - a title over 20 characters,
  - a valid AI date,
  - a missing date with clock fallback,
  - an unusable present date.
- Add adapter tests for the AI port, in particular for:
  - a successful raw response,
  - a timeout,
  - a technically failed HTTP call,
  - usage of the **effective API key** in the actual HTTP request,
  - precedence of the environment variable over `api.key` from the properties in the actually used adapter path.
- Add repository and schema tests against SQLite, in particular for:
  - evolution of an M4 schema to M5,
  - persisting and reading the new M5 attempt-history fields,
  - backward compatibility of existing M4 data,
  - idempotent migration of M4 legacy records from `SUCCESS` to `READY_FOR_AI`.
- Add integration tests for the M5 flow, in particular that:
  - the valid M5 happy path with a persisted naming proposal ends in `PROPOSAL_READY`, **not** in `SUCCESS`,
  - a technically unusable AI response leads to a retryable technical error,
  - a business-unusable AI response leads to a deterministic content error,
  - the clock fallback is used when `date` is missing,
  - no AI call happens on M3 content errors,
  - an existing `PROPOSAL_READY` is not processed again by the AI in a repeated M5 run,
  - M4 legacy records with a positive intermediate status are not wrongly skipped as finally successful in the M5 run,
  - the leading source for M6 is the most recent attempt with `PROPOSAL_READY`,
  - still **no target copy** is made.
- Add tests for bootstrap and startup behavior, in particular that:
  - an invalid M5 configuration leads to exit code 1,
  - the M5 schema evolution including the legacy migration takes effect at startup,
  - document-related AI errors do **not** lead to exit code 1.
- Finally check the M5 state for consistency, architectural fidelity, and the absence of any anticipation of M6+.

### Explicitly not part of this package
- tests for the target copy
- tests for duplicate handling in the target folder
- tests for the final target file name
- tests for M6/M7 end behavior

### Done when
- the test suite for the M5 scope is green,
- the most important M5 edge cases are covered automatically,
- the positive status semantics M4→M5→M6 are demonstrated without gaps,
- the defined M5 target state is fully reached,
- an error-free, handover-ready state exists.
---

## Final assessment

The work packages cover the complete M5 target scope from the binding specifications and additionally close the previously open **status and transition semantics between M4, M5, and M6**:

- external prompt source
- deterministic AI request assembly
- OpenAI-compatible AI call
- configurable AI access including an effectively used API key
- limiting of the content sent to the AI
- technical and business-level validation of the AI response
- date fallback performed by the application
- persisted, validated naming proposal
- extension of the attempt history with M5 traceability
- idempotent M4→M5 legacy migration
- positive intermediate statuses `READY_FOR_AI` and `PROPOSAL_READY` as a clean bridge to M6
- tests for prompt, AI adapter, validation, date resolution, schema evolution, legacy migration, and the M5 flow

At the same time, the boundaries to M1–M4 and to M6+ are preserved. In particular, **no** target copy, **no** duplicate handling in the target folder, and **no** final file name construction are anticipated.
---

`docs/workpackages/M6 - Arbeitspakete.md` (new file, 525 lines)
# M6 - Work Packages

## Scope

This document describes exclusively the work packages for the defined milestone **M6 – file name construction, duplicate handling, and target copy**.

The milestones **M1**, **M2**, **M3**, **M4**, and **M5** are assumed to be fully implemented.

The work packages are deliberately cut so that:

- **KI 1** can derive a clear single prompt from each work package,
- **KI 2** can implement exactly that one work package completely in **one pass**,
- after **every** work package there is again an **error-free, buildable state**.

The order of the work packages is binding.

## Additional cutting rules for AI processing

- Per work package, change only the **minimally necessary cross-sections** through domain, application, adapters, and bootstrap.
- Make no assumptions that are not covered by this document or the binding specifications.
- No anticipation of **M7+**.
- No restructuring of existing M1–M5 structures without a direct M6 connection.
- Cut new types, ports, status transitions, migrations, and adapters so that they are **clearly nameable, testable, and reviewable** from a single work package.
- M6 builds on the validated naming proposal persisted in M5 and introduces **no second source of truth** for date, title, or reasoning.
- Every work package must be precise enough that **KI 1** does not have to delegate open business or architecture decisions to **KI 2**.
- In business terms, M6 ends with the **real end-to-end success**: a correctly named target copy exists and the success is persisted consistently.
## Explicitly not part of M6

- business-level retry logic of the final state across multiple later runs, insofar as it goes beyond the minimal error semantics already present
- an immediate technical retry for target-copy errors within the same run
- logging polish and sensitivity rules of the final state from M7
- new AI functionality, prompt evolution, or M5 business logic beyond the reuse needed for M6
- manual post-processing or user interaction
- reporting, statistics, or analysis functions
- later operational optimizations that are not necessary for the M6 target state

## Binding M6 rules for **all** work packages

### 1. Leading source of the naming proposal

- In M6, the leading source for date, date source, validated title, and reasoning remains the **most recent attempt-history entry with result status `PROPOSAL_READY`**.
- M6 does **not** reconstruct this naming proposal from the document master record.
- M6 makes **no new AI call** when a usable `PROPOSAL_READY` attempt already exists.
- A document state of `PROPOSAL_READY` without a readable, consistent `PROPOSAL_READY` attempt counts in M6 as a **document-related technical error**, not as a silent occasion for a covert reinterpretation.
- A loaded `PROPOSAL_READY` attempt with business-level or technically unusable core values for date or title likewise counts as an **inconsistent persistence state** and thus as a **document-related technical error**.
### 2. Positive status and transition semantics in M6

From M6 on, the following is binding:

- `READY_FOR_AI` remains processable.
- `FAILED_RETRYABLE` remains processable.
- `PROPOSAL_READY` is the **correct business-level entry state** for file name construction, duplicate handling, and target copy.
- From M6 on, `SUCCESS` is the **real terminal end-to-end success** after a successful target copy and consistent persistence.
- `FAILED_FINAL` remains terminal and is not processed again.
- `SUCCESS` is not processed again in later runs but historized with `SKIPPED_ALREADY_PROCESSED`.
- `FAILED_FINAL` is not processed again in later runs but historized with `SKIPPED_FINAL_FAILURE`.

### 3. Binding file name rules in M6

The final target file name technically and bindingly follows this pattern:

```text
YYYY-MM-DD - Titel.pdf
```

Where:

- the date comes from the leading M5 naming proposal,
- the title comes from the leading M5 naming proposal,
- the **20-character** limit applies only to the **base title**,
- the duplicate suffix does **not** count toward these 20 characters,
- the business-level title rule **"no special characters except spaces"** remains bindingly enforced in M6 as well,
- characters disallowed on Windows are removed or replaced in a controlled way only within the scope of **technical file-system validity**,
- this technical cleanup must **not** produce a new business-level title interpretation,
- if a loaded proposal title violates the business-level title rules contrary to the M5 semantics, this state is **not silently healed** but treated as an **inconsistent technical document state**,
- the source file remains unchanged.
### 4. Duplicate rule in M6

- The duplicate rule is resolved physically in the **target folder** against files already present there.
- The first free name is to be used:
  - `YYYY-MM-DD - Titel.pdf`
  - `YYYY-MM-DD - Titel(1).pdf`
  - `YYYY-MM-DD - Titel(2).pdf`
  - and so on.
- The suffix is appended immediately before `.pdf`.
- The duplicate resolution is purely technical and introduces **no** new business-level title variant.
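Base name construction and the first-free-name resolution can be sketched as pure functions over the set of names already present; listing the real target folder is left to the file-system adapter, and the class name is an illustrative assumption:

```java
import java.util.Set;

/** Illustrative sketch: final name pattern and duplicate-suffix resolution (assumed names). */
final class TargetNameResolver {
    private TargetNameResolver() {}

    /** Pattern: YYYY-MM-DD - Titel.pdf */
    static String baseName(String isoDate, String title) {
        return isoDate + " - " + title + ".pdf";
    }

    /** First free name: the base name, then (1), (2), ... inserted immediately before ".pdf". */
    static String firstFreeName(String isoDate, String title, Set<String> existingNames) {
        String base = baseName(isoDate, title);
        if (!existingNames.contains(base)) return base;
        String stem = isoDate + " - " + title;
        for (int i = 1; ; i++) {
            String candidate = stem + "(" + i + ").pdf";
            if (!existingNames.contains(candidate)) return candidate;
        }
    }
}
```

The suffix only ever touches the file name, never the title itself, which keeps the resolution purely technical as required.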
### 5. Target copy and file-system semantics

- On success, M6 produces **one copy** in the target folder.
- The source file is **not** modified, moved, deleted, or overwritten.
- The target is created via a temporary target file followed by a final move/rename, insofar as this is technically possible in the target context.
- M6 introduces **no** business-level or technical immediate retry for the target copy; that is reserved for M7.
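The temp-file-then-rename step can be sketched with `java.nio.file`; the temp-name scheme and the atomic-move fallback are assumptions about one reasonable implementation, not the spec's mandated approach:

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Illustrative sketch: copy to a temporary file, then move to the final name (assumed scheme). */
final class TargetWriter {
    private TargetWriter() {}

    static void copyToTarget(Path source, Path targetDir, String finalName) throws IOException {
        Path tmp = targetDir.resolve(finalName + ".tmp");   // assumed temp-name scheme
        Path target = targetDir.resolve(finalName);
        Files.copy(source, tmp, StandardCopyOption.REPLACE_EXISTING);
        try {
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            Files.move(tmp, target);                        // fallback where atomic move is unavailable
        }
        // The source file is deliberately left untouched.
    }
}
```

Writing to a temp file first means a reader of the target folder never observes a half-written file under the final name.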
### 6. Persistenzerweiterung und Historisierung in M6
|
||||
|
||||
Die Persistenz wird in M6 gezielt erweitert:
|
||||
|
||||
- Der **Dokument-Stammsatz** speichert zusätzlich mindestens:
|
||||
- letzten Zielpfad,
|
||||
- letzten Zieldateinamen.
|
||||
- Die **Versuchshistorie** speichert zusätzlich mindestens:
|
||||
- finalen Zieldateinamen.
|
||||
- Datum, Datumsquelle, validierter Titel und Reasoning bleiben weiterhin führend in der Versuchshistorie des `PROPOSAL_READY`-Versuchs und werden nicht redundant in den Stammsatz gespiegelt.
|
||||
- Ein M6-Enderfolg oder M6-Fehler wird als **zusätzlicher neuer Versuch** historisiert; der führende `PROPOSAL_READY`-Versuch bleibt dabei **unverändert erhalten**.
|
||||
- M6 überschreibt oder ersetzt **nicht** nachträglich den führenden Proposal-Versuch, sondern baut fachlich und historisch auf ihm auf.

### 7. Per-document processing order in M6

In M6, a single candidate is processed in this binding order:

1. compute the fingerprint
2. load the document master record
3. decide the terminal skip cases
4. if necessary, run the existing M5 path up to a valid naming proposal
5. load the leading `PROPOSAL_READY` attempt
6. build the final base file name
7. determine the duplicate suffix in the target folder
8. technically write the target copy
9. record a **new** M6 attempt for the final success or a technical error
10. update the document master record consistently

### 8. Success and consistency in M6

- `SUCCESS` may only be set once:
  1. the target copy has been written successfully,
  2. the final target file name has been determined,
  3. the associated persistence has been updated consistently.
- There must be **no** case in which a document is persisted as `SUCCESS` without the target copy actually being in place.
- If persistence fails after a successful target copy, `SUCCESS` must **not** be set; a best-effort rollback of the newly created target copy is to be provided in M6, as this is both practical and faithful to the architecture.
- If this rollback itself succeeds only partially, the case remains a **document-level technical error**; M6 invents neither a success nor a new final error category out of it.

### 9. Error semantics in M6

- Technical errors in proposal source reading, target path construction, duplicate resolution, the target copy, or M6-relevant persistence after successful fingerprint determination are **document-level technical errors** in M6.
- These errors remain retryable and are tracked via the **transient error counter**.
- M6 introduces **no** new final error category solely for target copy errors.
- Document-level M6 errors must not needlessly abort the batch run for other documents.

### 10. No anticipation of M7

M6 delivers the complete success path for file-name construction, duplicate handling, and the target copy, but explicitly **not**:

- the technical immediate retry when writing,
- the final logging polish,
- the full operational robustness and retry elaboration of the final state.

---

## AP-001 Specify the M6 core objects, target success semantics, and port contracts precisely

### Prerequisite

None. This work package is the M6 starting point.

### Goal

The M6-relevant types, success criteria, target artifact concepts, and port contracts are introduced unambiguously, so that later work packages can be implemented without room for interpretation.

### Must be implemented

- Create new M6-relevant core objects or application-level types, in particular for:
  - the final file-name candidate,
  - duplicate resolution and name collisions,
  - target artifact planning,
  - the target write result,
  - M6-related persistence data for the final success,
  - the readable leading naming proposal from the newest `PROPOSAL_READY` attempt,
  - an inconsistent proposal source state.
- Sharpen the M6 status semantics in JavaDoc and, where applicable, `package-info` so that it is clear that:
  - `PROPOSAL_READY` is processable by M6,
  - `SUCCESS` is only permitted after an actual target copy plus consistent persistence,
  - `SUCCESS` and `FAILED_FINAL` remain terminal skip states,
  - an inconsistent state of `PROPOSAL_READY` without readable leading attempt data is a technical document error,
  - an M6 final attempt is recorded in addition to the proposal attempt and does not replace it.
- Define or selectively extend outbound ports for:
  - loading the leading `PROPOSAL_READY` attempt,
  - technical duplicate resolution in the target folder,
  - the target copy/write operation,
  - persistence of the M6 target data.
- Cut the port contracts so that **neither `Path`/`File` nor NIO/JDBC types** leak into the domain or application layers.
- Design the return models so that later work packages can distinguish, without additional assumptions, between:
  - a usable leading naming proposal,
  - a missing or inconsistent proposal source state,
  - successful duplicate resolution,
  - a technical target write error,
  - a technical persistence error after the target copy,
  - a consistent M6 final success.
- Explicitly document that M6 introduces **no second source of truth** for date, title, and reasoning.
- Explicitly document that M6 introduces **no immediate retry** of the target copy.
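
One way to make such outcomes distinguishable without additional assumptions is a sealed result hierarchy. This is only a sketch of the design idea; the type and field names are illustrative and not prescribed by the specification.

```java
// Sketch (Java 17+ sealed types, Java 21 switch patterns): one result variant
// per outcome that later work packages must tell apart. Infrastructure types
// such as Path or JDBC classes deliberately do not appear in the signatures.
sealed interface M6Outcome
        permits M6Outcome.Finalized, M6Outcome.InconsistentProposal,
                M6Outcome.TargetWriteError, M6Outcome.PersistenceError {

    // consistent M6 final success: target copy written and persisted
    record Finalized(String targetFileName) implements M6Outcome {}

    // missing or inconsistent proposal source state (technical document error)
    record InconsistentProposal(String reason) implements M6Outcome {}

    // technical error while writing the target copy (retryable)
    record TargetWriteError(String detail) implements M6Outcome {}

    // technical persistence error after the target copy (retryable, no SUCCESS)
    record PersistenceError(String detail) implements M6Outcome {}
}

class M6OutcomeDemo {
    static String describe(M6Outcome outcome) {
        // exhaustive switch: the compiler forces every variant to be handled
        return switch (outcome) {
            case M6Outcome.Finalized f -> "SUCCESS: " + f.targetFileName();
            case M6Outcome.InconsistentProposal p -> "technical document error: " + p.reason();
            case M6Outcome.TargetWriteError e -> "retryable write error: " + e.detail();
            case M6Outcome.PersistenceError e -> "retryable persistence error: " + e.detail();
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new M6Outcome.Finalized("2024-03-01 - Invoice.pdf")));
    }
}
```

Because the interface is sealed, an exhaustive `switch` over the outcome fails to compile if a later work package adds a variant without handling it everywhere.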

### Explicitly not included

- the concrete file-system implementation
- concrete SQLite schema changes
- batch integration
- the actual target copy
- tests for the final behavior

### Done when

- the M6-relevant types and port contracts exist,
- target success, the proposal source, inconsistent proposal states, and the technical error kinds are modeled in a clearly distinguishable way,
- the domain and application layers remain free of infrastructure types,
- the build is still error-free.

---

## AP-002 Implement the technical file-name construction for the final M6 base name

### Prerequisite

AP-001 is complete.

### Goal

From the leading M5 naming proposal, a technical, final M6 base file name in the target format can be produced without yet performing a physical target copy or any duplicate resolution in the file system.

### Must be implemented

- Implement an M6 component for technical file-name construction.
- Implement the binding target format exactly:

  ```text
  YYYY-MM-DD - Title.pdf
  ```

- Take the base title from the leading M5 naming proposal and technically turn it into a final, usable M6 base file name.
- Explicitly enforce the domain title rule **"no special characters except spaces"** in the M6 context:
  - regular happy path: the already validated M5 title is reused unchanged,
  - inconsistent persistence case: a loaded proposal title that violates this rule is **not** silently reinterpreted at the domain level but treated as a technical document error.
- Remove or replace characters that Windows disallows in a controlled manner, **only insofar as this serves technical file-system validity**.
- Ensure that the **20-character rule** applies exclusively to the **base title** and never to a later duplicate suffix.
- Introduce no new domain-level title interpretation; M6 builds on the already validated M5 title.
- Provide a defensive technical safeguard for inconsistent persistence states, in case a loaded proposal title or proposal date value is unusable contrary to M5 semantics.
- Add JavaDoc for the target format, the domain title rule, Windows compatibility, the base-title concept, and the non-goals of M6.
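
The separation of concerns required here can be sketched as follows, under the assumption that the domain rule is only *checked* and never repaired. All names are illustrative; the real rule set lives in the validated M5 title.

```java
import java.util.Optional;

// Sketch of the two clearly separated AP-002 concerns:
// 1) the domain title rule ("no special characters except spaces") is only
//    checked; a violating proposal title is rejected, never repaired,
// 2) Windows-invalid characters would then be a purely technical concern;
//    a title that already passed rule 1 has nothing left to strip.
class FinalBaseName {

    // Domain rule check (sketch): letters, digits, and spaces only.
    static boolean satisfiesTitleRule(String title) {
        return !title.isBlank() && title.chars()
                .allMatch(c -> Character.isLetterOrDigit(c) || c == ' ');
    }

    // Build "YYYY-MM-DD - Title.pdf"; an empty Optional signals the
    // inconsistent-persistence case (a technical document error, no repair).
    static Optional<String> build(String isoDate, String proposalTitle) {
        if (!satisfiesTitleRule(proposalTitle)) {
            return Optional.empty();
        }
        return Optional.of(isoDate + " - " + proposalTitle + ".pdf");
    }

    public static void main(String[] args) {
        System.out.println(build("2024-03-01", "Invoice March"));  // usable name
        System.out.println(build("2024-03-01", "Invoice: March")); // empty -> technical error
    }
}
```

The empty `Optional` stands in for whatever richer error model AP-001 introduces; the point is that the inconsistent case is surfaced, not silently cleaned.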

### Explicitly not included

- duplicate checking against the target folder
- the physical target copy
- persistence changes
- batch orchestration
- target path construction in the file system

### Done when

- an M6 base file name can be derived deterministically from a usable M5 naming proposal,
- the target format is honored exactly,
- the domain title rule and technical Windows compatibility are treated as **clearly separate** concerns,
- inconsistent proposal data is not silently cleaned up at the domain level,
- still no physical target copy or duplicate resolution takes place,
- the state still builds without errors.

---

## AP-003 Implement target-folder access, duplicate resolution, and target path planning in the outbound adapter

### Prerequisite

AP-001 and AP-002 are complete.

### Goal

The target folder can be evaluated technically, the first free final target file name in the target context can be determined, and a consistent target path plan is available for the later copy operation.

### Must be implemented

- Implement a file-system adapter for target-folder access.
- Implement technical duplicate resolution in the target folder.
- Determine the first free name according to this rule:
  - without a suffix,
  - then `(1)`, `(2)`, … immediately before `.pdf`.
- Ensure that the duplicate suffix is **not** counted toward the 20-character rule for the base title.
- Cut the target path planning so that it can prepare the final target name and a technical temporary target file in the target context.
- Translate technical errors while reading the target folder or resolving names into the port contract in a controlled way.
- Add JavaDoc for target-folder access, collision detection, the suffix logic, and the technical limits.

### Explicitly not included

- actually copying the file
- persisting the target path or target file name
- batch use-case integration
- the M7 immediate retry

### Done when

- the target folder can be evaluated technically,
- the first free final target file name can be determined correctly,
- temporary and final target path planning is available,
- technical errors are delivered through the port in a controlled way,
- the build is still error-free.

---

## AP-004 Implement the target copy with a temporary target file and final move/rename

### Prerequisite

AP-001 through AP-003 are complete.

### Goal

A source file can be technically copied into the target folder as the M6 target artifact, with temporary target creation and the final move/rename cleanly encapsulated.

### Must be implemented

- Implement an outbound adapter for the physical target copy.
- Implement target creation so that at least the following technical sequence is possible:
  1. copy into a temporary target file in the target context,
  2. final move/rename onto the previously planned final target file name.
- Ensure that the source file remains unchanged.
- Implement controlled technical error behavior for at least the following cases:
  - target folder not writable,
  - temporary target file cannot be created,
  - the copy fails,
  - the final move/rename fails,
  - technical cleanup after an error is only partially possible.
- Model the result so that later work packages can distinguish between successful target creation, a technical write error, and a technical partial cleanup.
- Add JavaDoc for the target copy, the temporary file, the final move/rename, and source immutability.
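
The temp-file-plus-move sequence might look like this minimal `java.nio.file` sketch; the class name and the temp-file prefix are illustrative, and the real adapter would map the exceptions into the port contract instead of rethrowing them.

```java
import java.io.IOException;
import java.nio.file.*;

// Sketch of the AP-004 write path: copy the source to a temporary file inside
// the target directory, then move/rename it onto the final name. The source
// file is only ever read. On failure the temporary file is removed best-effort.
class TargetCopyAdapter {

    static Path copyViaTempFile(Path source, Path targetDir, String finalName)
            throws IOException {
        Path tmp = Files.createTempFile(targetDir, "m6-", ".tmp"); // temp file in the target context
        try {
            Files.copy(source, tmp, StandardCopyOption.REPLACE_EXISTING);
            // ATOMIC_MOVE where the file system supports it; source and target
            // are in the same directory, so this is effectively a rename.
            return Files.move(tmp, targetDir.resolve(finalName),
                    StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            Files.deleteIfExists(tmp);  // best-effort cleanup; may itself only partially succeed
            throw e;                    // surfaces as a document-level technical error
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("m6-demo");
        Path src = Files.writeString(dir.resolve("source.pdf"), "dummy");
        Path result = copyViaTempFile(src, dir, "2024-03-01 - Invoice.pdf");
        System.out.println(Files.exists(src) && Files.exists(result)); // true
    }
}
```

The move is the last step on purpose: a half-written file never becomes visible under the final target name.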

### Explicitly not included

- status updates in the use case
- persisting the final success or the target path
- batch integration
- a technical immediate retry within the same run

### Done when

- a target copy can be created technically,
- the temporary and final write steps are cleanly encapsulated,
- the source file remains unchanged,
- technical target write errors are represented in a controlled way,
- the build is still error-free.

---

## AP-005 Evolve the SQLite schema from M5 to M6 and make the M6 target data and the proposal source usable in a targeted way

### Prerequisite

AP-001 through AP-004 are complete.

### Goal

The existing M5 persistence is extended to the M6 state in a controlled way, so that the target path, the final target file name, and the leading `PROPOSAL_READY` attempt are technically usable in a clean way, without reinterpreting or overwriting the M5 history.

### Must be implemented

- **Evolve** the existing SQLite schema; do not reinvent it.
- Extend the schema initialization so that an existing M5 schema can be brought to the M6 state in a controlled way.
- Extend the document master record with the fields M6 needs, at least for:
  - the last target path,
  - the last target file name.
- Extend the attempt history with the field M6 needs, at least for:
  - the final target file name.
- Extend the repository mapping so that the new M6 target data can technically be written and read.
- Provide or extend a targeted read capability to load the **newest attempt with `PROPOSAL_READY`** as the leading proposal source for M6.
- Explicitly ensure that on final success or a technical M6 error, M6 creates **a new attempt entry** and does **not overwrite** the leading proposal attempt.
- Ensure that:
  - existing M5 data remains readable,
  - the M6 extension can be initialized idempotently,
  - no M7+ fields are anticipated,
  - no redundant second persistence truth for date, title, and reasoning arises in the master record.
- Add JavaDoc for the schema evolution, the leading proposal source, the new M6 attempt entries, the M6 target data, and backward compatibility.
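
An idempotent evolution step could be sketched as below. Note that SQLite has no `ADD COLUMN IF NOT EXISTS`, so the bootstrap would first read `PRAGMA table_info` and apply only the missing columns; the table and column names here are assumptions, not prescribed by the specification.

```java
import java.util.List;
import java.util.Set;

// Sketch of an idempotent M5 -> M6 schema evolution: the DDL to run is
// computed from the columns that already exist, so initializing twice is safe
// and existing M5 rows simply keep NULL in the new columns.
class M6SchemaEvolution {

    record ColumnAddition(String table, String column, String ddl) {}

    static final List<ColumnAddition> M6_COLUMNS = List.of(
        new ColumnAddition("document", "last_target_path",
            "ALTER TABLE document ADD COLUMN last_target_path TEXT"),
        new ColumnAddition("document", "last_target_file_name",
            "ALTER TABLE document ADD COLUMN last_target_file_name TEXT"),
        new ColumnAddition("attempt", "final_target_file_name",
            "ALTER TABLE attempt ADD COLUMN final_target_file_name TEXT"));

    // existingColumns holds "table.column" entries as reported by
    // PRAGMA table_info; only the still-missing additions are returned.
    static List<String> pendingDdl(Set<String> existingColumns) {
        return M6_COLUMNS.stream()
            .filter(c -> !existingColumns.contains(c.table() + "." + c.column()))
            .map(ColumnAddition::ddl)
            .toList();
    }

    public static void main(String[] args) {
        // fresh M5 schema: all three additions are pending
        System.out.println(pendingDdl(Set.of()).size());
        // already evolved schema: nothing left to do -> idempotent
        System.out.println(pendingDdl(Set.of("document.last_target_path",
            "document.last_target_file_name", "attempt.final_target_file_name")).size());
    }
}
```

Loading the leading proposal source then reduces to a query along the lines of `SELECT … WHERE status = 'PROPOSAL_READY' ORDER BY <attempt timestamp> DESC LIMIT 1` per fingerprint, with the ordering column being whatever the existing attempt history already uses.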

### Explicitly not included

- batch use-case integration
- the actual target copy
- status and counter decisions in the running document process
- the M7 retry elaboration

### Done when

- the M5 schema can be extended to M6 in a controlled way,
- the target path and target file name can technically be persisted,
- the newest `PROPOSAL_READY` attempt can be loaded in a targeted way,
- an M6 final success or error can be stored **additionally** in the history without replacing the proposal attempt,
- existing M5 data does not become unusable,
- the state still builds without errors.

---

## AP-006 Implement the M6 decision logic and batch integration for final success, the proposal source, skip semantics, and technical error tracking

### Prerequisite

AP-001 through AP-005 are complete.

### Goal

The existing M5 run is extended into a true M6 run that produces a correctly named target copy from suitable documents, persists the final success as `SUCCESS`, and tracks technical M6 errors cleanly within the existing counter and status framework.

### Must be implemented

- Extend the existing batch use case so that, per suitable document, the following additionally applies:
  1. evaluate the terminal skip cases,
  2. if necessary, run the M5 path up to a valid `PROPOSAL_READY`,
  3. if `PROPOSAL_READY` exists, load the leading proposal attempt,
  4. build the final base file name,
  5. determine the duplicate suffix in the target folder,
  6. create the target copy,
  7. record **a new M6 attempt** for the final success or a technical error,
  8. update the master record consistently.
- Ensure that **no new AI call** is made when a usable `PROPOSAL_READY` attempt already exists.
- Ensure that a document with status `PROPOSAL_READY` is **not** wrongly skipped in the M6 run but proceeds into M6 finalization.
- Implement the following rules explicitly:
  - `SUCCESS` → no renewed domain-level pass; instead `SKIPPED_ALREADY_PROCESSED`
  - `FAILED_FINAL` → no renewed domain-level pass; instead `SKIPPED_FINAL_FAILURE`
  - valid target copy plus consistent persistence → `SUCCESS`
  - technical error in proposal source reading, an inconsistent proposal state, target path construction, duplicate resolution, the target copy, or M6 persistence → `FAILED_RETRYABLE`, **transient error counter +1**
- Ensure that the final target file name and the target path are persisted consistently on a true final success.
- Ensure that a successful target copy with failing persistence does **not** produce `SUCCESS`.
- For a downstream persistence failure after the target copy has already been created, provide a **best-effort rollback** of the newly created target artifact, without anticipating M7 retry behavior.
- Ensure that document-level M6 errors let the batch run continue in a controlled way for the other documents.
- Add JavaDoc for the M6 run order, the proposal source, the true final-success semantics, the new M6 history recording, the skip rules, and the error tracking.

### Explicitly not included

- a technical immediate retry of the target copy
- the final-state logging polish
- M7-specific retry elaboration
- reporting or evaluation

### Done when

- the batch run can process suitable documents all the way to a correctly named target copy,
- existing `PROPOSAL_READY` documents are finalized correctly in M6,
- inconsistent proposal states are handled in a controlled way as technical document errors,
- `SUCCESS` is only set after a true target copy plus consistent persistence,
- M6 final successes and M6 errors arise as **additional** history entries,
- technical M6 errors are tracked cleanly as retryable,
- still no M7+ functionality is included.

---

## AP-007 Perform the bootstrap and CLI adjustments for the target-folder configuration, the M6 schema evolution, and the complete wiring

### Prerequisite

AP-001 through AP-006 are complete.

### Goal

The program entry point is cleanly adapted to the M6 run; the target-folder configuration, the M6 schema evolution, and all new M6 components are wired up, and hard startup errors still lead to exit code 1 in a controlled way.

### Must be implemented

- Extend the bootstrap wiring to the new M6 ports, adapters, and persistence components.
- Add or wire the M6-relevant configuration, in particular for:
  - `target.folder`
- Extend the startup validation so that at least the following is checked:
  - the target folder exists or can technically be created,
  - the target folder is usable as a directory,
  - the target folder is technically usable for the M6 write path,
  - the M6-relevant persistence configuration remains usable.
- Cleanly combine the existing M5 schema initialization with the M6 schema evolution.
- Ensure that hard startup, wiring, configuration, or initialization errors still lead to **exit code 1**.
- Ensure that document-level M6 errors are **not** mismodeled as startup errors.
- Add JavaDoc and `package-info` for the updated wiring, configuration, target-folder validation, and module boundaries.
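
The target-folder checks listed above can be sketched as follows; the class name is illustrative, and a real bootstrap would distinguish the individual failure reasons rather than collapsing them into one boolean.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the AP-007 startup validation: the target folder must exist (or be
// technically creatable), be usable as a directory, and be writable for the
// M6 write path. A hard startup failure is reported via exit code 1.
class TargetFolderValidation {

    static boolean usableTargetFolder(Path folder) {
        try {
            Files.createDirectories(folder);     // present, or technically creatable
            return Files.isDirectory(folder)     // usable as a directory
                    && Files.isWritable(folder); // usable for the M6 write path
        } catch (IOException | SecurityException e) {
            return false;                        // hard startup error
        }
    }

    public static void main(String[] args) throws IOException {
        Path folder = Files.createTempDirectory("m6-target");
        if (!usableTargetFolder(folder)) {
            System.exit(1); // hard startup error -> exit code 1
        }
        System.out.println("target folder ok");
    }
}
```

Note that this check runs once at startup; a folder that later becomes unwritable during the run is a document-level technical error, not a startup error.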

### Explicitly not included

- the final-state logging polish
- the M7 retry mechanics
- later operational optimizations

### Done when

- the program is fully startable in the M6 state,
- all M6 components are wired correctly,
- the M6 startup validation takes effect,
- hard startup errors still lead to exit code 1 in a controlled way,
- the build remains error-free.

---

## AP-008 Complete the tests for file-name construction, duplicate handling, the target copy, schema evolution, proposal consistency, status semantics, and the M6 flow

### Prerequisite

AP-001 through AP-007 are complete.

### Goal

The complete M6 target state is safeguarded automatically and demonstrated as a consistent hand-over state.

### Must be implemented

- Implement unit tests for the technical file-name construction, in particular for:
  - the correct target format `YYYY-MM-DD - Title.pdf`,
  - the domain title rule "no special characters except spaces" in the M6 context,
  - the Windows character cleanup,
  - the unchanged 20-character rule for the base title,
  - the unchanged effect of the duplicate suffix outside the base title.
- Implement tests for duplicate resolution in the target folder, in particular for:
  - no existing conflict → the base name is used,
  - existing conflict → `(1)`, `(2)`, …,
  - the suffix is placed immediately before `.pdf`.
- Add adapter tests for the target copy, in particular for:
  - successful target creation via a temporary file plus a final move/rename,
  - the source file remains unchanged,
  - a technical copy error,
  - a technical rename/move error,
  - best-effort cleanup after a technical error.
- Add repository and schema tests against SQLite, in particular for:
  - the evolution of an M5 schema to M6,
  - persisting and reading the target path and target file name in the document master record,
  - persisting and reading the final target file name in the attempt history,
  - targeted loading of the newest `PROPOSAL_READY` attempt,
  - additional M6 history recording without overwriting the proposal attempt.
- Add integration tests for the M6 flow, in particular:
  - a valid M6 happy path ends in `SUCCESS`, **not** in `PROPOSAL_READY`,
  - an existing `PROPOSAL_READY` is finalized without a renewed AI call,
  - an existing `SUCCESS` is skipped, with history recorded, in a repeat run,
  - an existing `FAILED_FINAL` is skipped, with history recorded, in a repeat run,
  - a technical M6 target copy error leads to a retryable technical error and increments the transient error counter,
  - a successful target write path with failing persistence does **not** lead to `SUCCESS`,
  - no illegitimate M6 finalization takes place after prior M3 or M5 errors,
  - status `PROPOSAL_READY` with **no** readable leading proposal attempt leads to a document-level technical error,
  - a readable proposal attempt with an inconsistent title or date value leads to a document-level technical error,
  - still **no** M7 immediate retry arises.
- Add tests for the bootstrap and startup behavior, in particular:
  - an invalid M6 configuration leads to exit code 1,
  - an unusable target folder leads to exit code 1,
  - the M6 schema evolution takes effect at startup,
  - document-level M6 errors do **not** lead to exit code 1.
- Finally check the M6 state for consistency, architectural fidelity, and non-anticipation of M7+.

### Explicitly not included

- tests for the M7 immediate retry
- tests for the final logging polish
- tests for later operational optimizations

### Done when

- the test suite for the M6 scope is green,
- the most important M6 edge cases, including proposal inconsistencies, are safeguarded automatically,
- the defined M6 target state is fully reached,
- an error-free, hand-over-ready state exists.

---

## Final assessment

The work packages cover the complete M6 target scope from the binding specifications and cleanly close the bridge from **M5 `PROPOSAL_READY`** to the true productive final success **M6 `SUCCESS`**:

- technical file-name construction in the format `YYYY-MM-DD - Title.pdf`
- explicit separation between the domain title rule and technical Windows compatibility
- duplicate handling in the target folder with `(1)`, `(2)`, …
- target path planning and the physical target copy
- persistence extension with the target path and the final target file name
- use of the newest `PROPOSAL_READY` attempt as the leading source
- clean status semantics `PROPOSAL_READY` → `SUCCESS`
- additional M6 history recording without overwriting the M5 proposal history
- technical error tracking for proposal source, target copy, and persistence errors
- tests for file names, duplicates, the target copy, proposal consistency, schema evolution, and the end-to-end M6 flow

At the same time, the boundaries toward M1–M5 and toward M7+ are preserved. In particular, **no** immediate retry of the target copy, **no** final logging polish, and **no** further operational robustness of the final state are anticipated.

---

**New file:** `docs/workpackages/M7 - Arbeitspakete.md` (540 lines)

# M7 - Work Packages

## Scope

This document describes exclusively the work packages for the defined milestone **M7 – error handling, retry logic, logging, and operational robustness**.

Milestones **M1**, **M2**, **M3**, **M4**, **M5**, and **M6** are assumed to be fully implemented.

The work packages are deliberately cut so that:

- **AI 1** can derive a clear single prompt per work package from them,
- **AI 2** can implement exactly that one work package completely in **one pass**,
- after **every** work package, an **error-free, buildable state** exists again.

The order of the work packages is binding.

## Additional cutting rules for AI processing

- Per work package, change only the **minimally necessary cross-sections** through domain, application, adapters, and bootstrap.
- Make no assumptions that are not covered by this document or the binding specifications.
- No anticipation of **M8+**.
- No restructuring of existing M1–M6 structures without a direct M7 connection.
- Cut new types, decision rules, configuration values, repository extensions, and adapters so that they are **clearly nameable, testable, and reviewable** from within a single work package.
- M7 sharpens and completes the error and status semantics already present from M3–M6; it does not silently reinvent them.
- M7 must still be able to **read and correctly continue** existing M4–M6 data sets.
- Every positive M7 intermediate state must already deliver a **robust, repeatably executable Task Scheduler run**, even if the final retry, logging, and exit-code state is only fully reached with later work packages.
- A work package may only build on repository or persistence capabilities if these either already exist from M1–M6 or were explicitly established in the immediately preceding work package.

## Explicitly not part of M7

- new AI functionality or prompt evolution beyond the robust reuse of the M5 state
- new domain-level naming rules beyond M5/M6
- new file-system functionality beyond the M6 target copy path and the technical immediate retry concretely required in M7
- reporting, statistics, or monitoring functions
- a web UI, REST API, or user interaction
- OCR, content changes to PDFs, or manual post-processing
- the concluding overall polish, large-scale refactorings, or general quality campaigns from **M8**

## Binding M7 rules for **all** work packages

### 1. M7 closes the operational gap between M6 and the final target picture

M6 delivers the complete success path, but not yet the full operational robustness of the final state. From M7 on, the following is therefore binding:

- `SUCCESS` remains the true terminal final success.
- `FAILED_FINAL` remains the terminal final failure.
- `FAILED_RETRYABLE` may only persist as long as **at least one further scheduler run is permissible at the domain level**.
- `SKIPPED_ALREADY_PROCESSED` and `SKIPPED_FINAL_FAILURE` remain pure, history-recorded skip results and do not themselves change any error counters.
- Document-level errors must not needlessly abort the overall batch.

### 2. Complete retry rule for deterministic content errors

From M7 on, the complete domain rule applies across later runs:

- deterministic content errors receive **exactly one** later repeat attempt,
- the **first** recorded deterministic content error for a fingerprint leads to `FAILED_RETRYABLE`,
- the **second** recorded deterministic content error for the same fingerprint leads to `FAILED_FINAL`.

For M7, at least all deterministic content errors already concretely producible from M3–M6 must be placed within this rule framework, in particular:

- no usable text,
- page limit exceeded,
- a title that is unusable or generic at the domain level,
- an AI date that is present but unusable.

Ambiguity cases that already exist, or that the existing domain model produces in the future, run within the same deterministic content-error framework and produce **no** uncertain result.

### 3. Complete retry rule for transient technical errors

From M7 on, the following applies to document-level technical errors after successful fingerprint determination:

- they are tracked via the **transient error counter**,
- they remain retryable only up to the configured limit,
- after the permissible transient failed attempts are exhausted, the document status becomes `FAILED_FINAL`.

For the M7 implementation, `max.retries.transient` is bindingly interpreted as the **maximum permissible number of recorded transient failed attempts per fingerprint**. The failed attempt that reaches this limit finalizes the document status.

In addition:

- `max.retries.transient` is an **integer value >= 1**.
- The value `0` is an **invalid startup configuration**.
- Example: `1` means that already the **first** recorded transient failed attempt finalizes.
- Example: `2` means that the **first** recorded transient failed attempt remains retryable and the **second** finalizes.
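
This binding interpretation can be captured in a few lines; the method and parameter names are illustrative, and the status strings match the states defined above.

```java
// Sketch of the max.retries.transient interpretation: the first argument is
// the number of *recorded* transient failed attempts for a fingerprint,
// including the current one; reaching the configured limit finalizes.
class TransientRetryRule {

    static String statusAfterTransientFailure(int recordedTransientFailures,
                                              int maxRetriesTransient) {
        if (maxRetriesTransient < 1) {
            // value 0 (or below) is an invalid startup configuration
            throw new IllegalArgumentException("max.retries.transient must be >= 1");
        }
        return recordedTransientFailures >= maxRetriesTransient
                ? "FAILED_FINAL"       // limit reached: terminal failure
                : "FAILED_RETRYABLE";  // another scheduler run is still permitted
    }

    public static void main(String[] args) {
        // max.retries.transient = 1: the first recorded failure already finalizes
        System.out.println(statusAfterTransientFailure(1, 1)); // FAILED_FINAL
        // max.retries.transient = 2: first failure retryable, second finalizes
        System.out.println(statusAfterTransientFailure(1, 2)); // FAILED_RETRYABLE
        System.out.println(statusAfterTransientFailure(2, 2)); // FAILED_FINAL
    }
}
```

The deterministic content-error rule from section 2 is the fixed special case of the same shape: a hard-wired limit of two recorded failures per fingerprint.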

### 4. The technical immediate retry is strictly limited to the target copy path

The technical immediate retry provided for in the target architecture is implemented in M7 exactly as follows:

- **exactly one** additional technical write attempt within the same document run,
- exclusively for errors on the physical M6 target copy path,
- **no** renewed AI call,
- **no** renewed domain-level title/date derivation,
- **no** extension to prompt loading, AI HTTP, SQLite, or other adapters.

The immediate retry is a technical mechanism within the same run and is **not** an additional domain-level retry run in the sense of the cross-run retry rules.
|
||||
|
||||
### 5. Skip-Semantik des Endstands

Ab M7 gilt vollständig:

- `SUCCESS` wird in späteren Läufen **nicht erneut verarbeitet**, sondern mit `SKIPPED_ALREADY_PROCESSED` historisiert.
- `FAILED_FINAL` wird in späteren Läufen **nicht erneut verarbeitet**, sondern mit `SKIPPED_FINAL_FAILURE` historisiert.
- `FAILED_RETRYABLE`, `READY_FOR_AI` und `PROPOSAL_READY` bleiben verarbeitbar, soweit der jeweilige Dokumentzustand dies fachlich zulässt.
- Ein nach M6 noch offenes `PROPOSAL_READY` darf in M7 weiterhin sauber bis zum echten Enderfolg finalisiert werden.

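Die Zuordnung von Endstatus zu Skip-Verhalten lässt sich kompakt skizzieren; die Enum- und Methodennamen außerhalb der spezifizierten Statuswerte sind Annahmen:

```java
// Skizze unter Annahmen: die Statuswerte stammen aus der Spezifikation,
// "SkipRegel" und "skipGrund" sind frei gewählte Namen.
enum DokumentStatus { SUCCESS, FAILED_FINAL, FAILED_RETRYABLE, READY_FOR_AI, PROPOSAL_READY }

final class SkipRegel {

    /** Liefert den zu historisierenden Skip-Grund oder null, wenn das Dokument verarbeitbar bleibt. */
    static String skipGrund(DokumentStatus status) {
        switch (status) {
            case SUCCESS:      return "SKIPPED_ALREADY_PROCESSED";
            case FAILED_FINAL: return "SKIPPED_FINAL_FAILURE";
            default:           return null; // FAILED_RETRYABLE, READY_FOR_AI, PROPOSAL_READY
        }
    }
}
```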
### 6. Logging-Mindestumfang des Endstands

Das Logging muss ab M7 mindestens folgende Informationen nachvollziehbar liefern:

- Laufstart,
- Laufende,
- Lauf-ID,
- erkannte Quelldatei,
- Überspringen bereits erfolgreicher Dateien,
- Überspringen final fehlgeschlagener Dateien,
- erzeugter Zielname,
- Retry-Entscheidung,
- Fehler mit Klassifikation.

Die Logs müssen so geschnitten werden, dass dokumentbezogene Entscheidungen pro Fingerprint bzw. Kandidat nachvollziehbar bleiben, ohne zusätzliche Infrastrukturtypen in Domain oder Application zu ziehen.

Zusätzlich gilt für die Korrelation:

- sobald ein Fingerprint erfolgreich bestimmt wurde, müssen dokumentbezogene Logeinträge diesen Fingerprint oder eine daraus eindeutig ableitbare Referenz enthalten,
- solange noch kein Fingerprint vorliegt, erfolgt die Korrelation mindestens über Lauf-ID und erkannte Quelldatei bzw. Kandidatenbezug,
- M7 führt hierfür **keine** neue Persistenz-Wahrheit und **keine** zusätzliche Tracking-Ebene ein.

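Die Korrelationsregel lässt sich als Skizze festhalten; Klassenname, Methodenname und das konkrete Referenzformat sind reine Annahmen zur Illustration:

```java
// Skizze unter Annahmen: das Format "lauf=... fingerprint=..." ist frei
// gewählt und nicht Teil der Spezifikation.
final class LogKorrelation {

    /** Baut die Korrelationsreferenz für dokumentbezogene Logeinträge. */
    static String referenz(String laufId, String quelldatei, String fingerprintOderNull) {
        if (fingerprintOderNull != null) {
            // Nach erfolgreicher Fingerprint-Ermittlung ist der Fingerprint führend.
            return "lauf=" + laufId + " fingerprint=" + fingerprintOderNull;
        }
        // Vorher: Korrelation mindestens über Lauf-ID und Kandidatenbezug.
        return "lauf=" + laufId + " quelle=" + quelldatei;
    }
}
```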
### 7. Sensibilitätsregel für KI-Inhalte im Logging

Ab M7 gilt verbindlich:

- die vollständige KI-Rohantwort bleibt in **SQLite** speicherbar,
- die vollständige KI-Rohantwort wird **standardmäßig nicht** ins Log geschrieben,
- `reasoning` wird ebenfalls **standardmäßig nicht** vollständig ins Log geschrieben,
- die Ausgabe sensibler KI-Inhalte ist nur über eine **explizite Konfiguration** zulässig,
- M7 führt hierfür einen klar dokumentierten, booleschen Konfigurationswert ein,
- der Default muss auf **sicher/nicht loggen** stehen.

Als sensible KI-Inhalte gelten in M7 mindestens:

- vollständige KI-Rohantwort,
- vollständiges KI-`reasoning`.

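Der boolesche Schalter mit sicherem Default lässt sich so skizzieren; Klassen-, Feld- und Platzhalternamen sind Annahmen:

```java
// Skizze unter Annahmen: Namen und Platzhaltertext sind frei gewählt.
// SQLite bleibt von dieser Regel unberührt; sie betrifft nur das Log.
final class SensitiveLogRegel {

    private final boolean sensibleKiInhalteLoggen; // Default der Konfiguration: false

    SensitiveLogRegel(boolean sensibleKiInhalteLoggen) {
        this.sensibleKiInhalteLoggen = sensibleKiInhalteLoggen;
    }

    /**
     * Unterdrückt sensible KI-Inhalte (vollständige Rohantwort, vollständiges
     * reasoning) im Log, solange die explizite Freischaltung nicht gesetzt ist.
     */
    String fuerLog(String sensiblerInhalt) {
        return sensibleKiInhalteLoggen ? sensiblerInhalt : "<unterdrueckt>";
    }
}
```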
### 8. Exit-Code-Endsemantik

Ab M7 ist das Exit-Code-Verhalten final:

- `0`, wenn der Lauf technisch ordnungsgemäß durchgeführt wurde, auch wenn einzelne Dokumente fachlich oder transient fehlgeschlagen sind,
- `1` nur bei harten Start-, Bootstrap-, Verdrahtungs-, Konfigurations- oder Initialisierungsfehlern.

Dokumentbezogene Fehler dürfen **nicht** als harte Startfehler fehlmodelliert werden.

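Die Endsemantik lässt sich auf eine Zeile reduzieren; die Skizze (mit angenommenen Namen) macht explizit, dass dokumentbezogene Fehler den Exit-Code bewusst nicht beeinflussen:

```java
// Skizze unter Annahmen: "ExitCodeRegel" und "exitCode" sind frei gewählte Namen.
final class ExitCodeRegel {

    /**
     * 0 bei technisch ordnungsgemäßem Lauf (auch mit dokumentbezogenen Fehlern),
     * 1 nur bei harten Start-/Bootstrap-/Konfigurations-/Initialisierungsfehlern.
     */
    static int exitCode(boolean harterStartfehler, boolean dokumentbezogeneFehlerAufgetreten) {
        // Dokumentbezogene Fehler eskalieren bewusst NICHT zu Exit-Code 1.
        return harterStartfehler ? 1 : 0;
    }
}
```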
### 9. Konfigurationsvalidierung des Endstands

M7 vervollständigt die Startvalidierung insbesondere für:

- `max.retries.transient`,
- M7-relevante Logging-Konfiguration,
- bestehende M1–M6-Startparameter, soweit sie für einen robusten Batch-Lauf weiterhin zwingend sind.

Ungültige M7-Startkonfiguration verhindert den Laufbeginn und führt zu **Exit-Code 1**.

### 10. Keine zweite Wahrheitsquelle für Fehler- und Retry-Entscheidungen

M7 nutzt weiterhin die bestehende Kombination aus:

- Dokument-Stammsatz für Gesamtstatus und Zähler,
- Versuchshistorie für einzelne Versuchsdaten und Nachvollziehbarkeit.

M7 führt **keine** parallele, dritte Wahrheitsquelle für Retry-Zustände, Logging-Entscheidungen oder Fehlerhistorien ein.

---

## AP-001 M7-Kernobjekte, vollständige Fehlersemantik und Retry-/Logging-Verträge präzisieren

### Voraussetzung
Keine. Dieses Arbeitspaket ist der M7-Startpunkt.

### Ziel
Die M7-relevanten Typen, vollständigen Fehler- und Retry-Bedeutungen, Logging-bezogenen Entscheidungsobjekte und technischen Grenzen werden eindeutig eingeführt, damit spätere Arbeitspakete ohne Interpretationsspielraum implementiert werden können.

### Muss umgesetzt werden
- Neue M7-relevante Kernobjekte bzw. Application-nahe Typen anlegen, insbesondere für:
  - vollständige Retry-Entscheidung,
  - Ausschöpfungszustand eines Retry-Rahmens,
  - technische Sofort-Wiederholungsentscheidung für den Zielkopierpfad,
  - dokumentbezogene Fehlerklassifikation des Endstands,
  - Logging-Ereignis bzw. Logging-relevante Dokumententscheidung,
  - Sensitivitätsentscheidung für KI-Inhalte im Logging.
- Die bestehende Status- und Fehlersemantik in JavaDoc und ggf. `package-info` so schärfen, dass klar ist:
  - wann `FAILED_RETRYABLE` noch zulässig ist,
  - wann ein Dokumentstatus wegen ausgeschöpfter Retry-Regeln in `FAILED_FINAL` übergeht,
  - dass der technische Sofort-Wiederholversuch **nicht** zum laufübergreifenden Retry-Zähler gehört,
  - dass dokumentbezogene Fehler den Gesamtbatch nicht zu Exit-Code 1 eskalieren.
- Application-seitige Verträge definieren oder gezielt erweitern für:
  - Ableitung der Retry-Entscheidung aus Status, Fehlerart, Zählern und Konfiguration,
  - Ableitung einer protokollierbaren Dokumententscheidung,
  - Ableitung der Zielkopier-Sofort-Wiederholung,
  - Auflösung der Sensitivitätsregel für KI-Logausgaben,
  - Korrelation dokumentbezogener Logging-Ereignisse ohne Infrastrukturtypen im Kern.
- Port-Verträge so schneiden, dass weder Log4j2-, NIO-, JDBC- noch HTTP-Typen in Domain oder Application durchsickern.
- Rückgabemodelle so anlegen, dass spätere Arbeitspakete ohne Zusatzannahmen unterscheiden können zwischen:
  - retryablem Inhaltsfehler,
  - finalem Inhaltsfehler,
  - retryablem technischem Fehler,
  - finalisiertem technischem Fehler nach ausgeschöpftem Transient-Rahmen,
  - technischem Zielschreibfehler mit zulässigem Sofort-Wiederholversuch,
  - dokumentbezogener Entscheidung mit M7-logbarem Ergebnis.
- Explizit dokumentieren, dass M7 keine neue Persistenz-Wahrheit für Retry-Entscheidungen einführt.
- Explizit dokumentieren, dass `max.retries.transient` als historisierter Fehlversuchs-Grenzwert interpretiert wird und als gültiger Konfigurationswert nur **Integer >= 1** zulässig ist.
- Explizit dokumentieren, dass sensible KI-Logausgaben in M7 mindestens vollständige KI-Rohantwort und vollständiges KI-`reasoning` umfassen.

### Explizit nicht Teil
- konkrete Retry-Implementierung im Batch-Lauf
- konkrete Log4j2-Konfiguration
- konkrete Zielkopier-Wiederholung
- Bootstrap-Anpassungen
- Tests des Endstands

### Fertig wenn
- die M7-relevanten Typen und Verträge vorhanden sind,
- Retry-, Finalisierungs-, Sensitivitäts- und Logging-Korrelationssemantik eindeutig dokumentiert ist,
- Domain und Application frei von Infrastrukturtypen bleiben,
- der Build weiterhin fehlerfrei ist.

---

## AP-002 Vollständige Retry-Entscheidungslogik für deterministische Inhaltsfehler und transiente technische Fehler implementieren

### Voraussetzung
AP-001 ist abgeschlossen.

### Ziel
Die fachlich vollständige laufübergreifende Retry-Entscheidung des Endstands ist als klarer, testbarer Baustein im Kern implementiert und kann von Batch-Lauf, Logging und Persistenz konsistent verwendet werden.

### Muss umgesetzt werden
- Einen zentralen M7-Baustein implementieren, der aus vorhandener Fehlerart, bestehendem Dokumentstatus, Fehlerzählern und Konfiguration die verbindliche Retry-Entscheidung ableitet.
- Die vollständige deterministische Inhaltsfehlerregel explizit umsetzen:
  - erster historisierter deterministischer Inhaltsfehler → `FAILED_RETRYABLE`,
  - zweiter historisierter deterministischer Inhaltsfehler → `FAILED_FINAL`.
- Die vollständige transiente Fehlerregel explizit umsetzen:
  - dokumentbezogene technische Fehler bleiben nur bis `max.retries.transient` retryable,
  - der Fehlversuch, der den Grenzwert erreicht, finalisiert den Status zu `FAILED_FINAL`.
- Die Randfälle der Grenzwertinterpretation explizit abdecken, insbesondere:
  - `max.retries.transient = 1`,
  - Skip-Fälle ohne Zähleränderung,
  - bereits bestehende M4–M6-Datenbestände mit historischen Fehlerzählern.
- Die Entscheidungslogik so schneiden, dass sie konsistent für bereits bestehende M4–M6-Datenbestände nutzbar bleibt und keine Sonderbehandlung außerhalb des zentralen Regelwerks erzwingt.
- Explizit sicherstellen, dass Skip-Fälle keine Fehlerzähler verändern.
- Explizit sicherstellen, dass der technische Sofort-Wiederholversuch **nicht** in diese laufübergreifende Retry-Entscheidung einfließt.
- JavaDoc für Regelherkunft, Zählerbedeutung, Grenzwertinterpretation und Nicht-Ziele von M7 ergänzen.

### Explizit nicht Teil
- Batch-Use-Case-Integration
- Persistenzfortschreibung im konkreten Dokumentlauf
- Zielkopier-Wiederholung
- Logging-Konfiguration
- Exit-Code-Logik

### Fertig wenn
- die Retry-Entscheidung zentral und testbar implementiert ist,
- deterministische und transiente Fehler vollständig und widerspruchsfrei abgedeckt sind,
- bestehende M4–M6-Zähler- und Statusdaten ohne Sonderlogik anschlussfähig bleiben,
- der Stand fehlerfrei buildbar bleibt.

---

## AP-003 Technischen Sofort-Wiederholversuch für den Zielkopierpfad aus M6 implementieren

### Voraussetzung
AP-001 und AP-002 sind abgeschlossen.

### Ziel
Der in der Zielarchitektur vorgesehene einmalige technische Sofort-Wiederholversuch für Zielkopierfehler wird sauber umgesetzt, ohne KI, Persistenzlogik oder laufübergreifende Retry-Semantik zu vermischen.

### Muss umgesetzt werden
- Den bestehenden M6-Zielkopierpfad so erweitern, dass bei einem technischen Schreibfehler **genau ein** zusätzlicher technischer Sofort-Wiederholversuch innerhalb desselben Dokumentlaufs möglich ist.
- Sicherstellen, dass der Sofort-Wiederholversuch ausschließlich für den physischen Zielkopierpfad gilt, insbesondere für:
  - temporäre Zieldatei nicht anlegbar,
  - Kopieren scheitert,
  - finaler Move/Rename scheitert,
  - technisches Cleanup nach erstem Schreibfehler nur teilweise erfolgreich.
- Sicherstellen, dass dabei **kein** erneuter KI-Aufruf, **keine** erneute fachliche Proposal-Ableitung und **keine** neue Statusneubewertung außerhalb des M7-Regelrahmens stattfindet.
- Den Mechanismus so schneiden, dass der zweite technische Versuch mit demselben fachlichen Dokumentkontext läuft und der Batch-Lauf danach genau **ein** dokumentbezogenes Ergebnis für Persistenz und Statusfortschreibung ableiten kann.
- Technische Aufräumarbeiten zwischen erstem und zweitem Versuch kontrolliert kapseln.
- JavaDoc für Reichweite, Grenzen und Abgrenzung zu laufübergreifenden Retries ergänzen.

### Explizit nicht Teil
- endgültige Status- und Zählerfortschreibung im Batch-Lauf
- Logging-Endstand
- Bootstrap-Anpassungen
- Erweiterung auf andere Fehlerarten als Zielkopierschreibfehler

### Fertig wenn
- genau ein technischer Sofort-Wiederholversuch für Zielkopierfehler möglich ist,
- kein KI- oder Fachpfad unzulässig erneut ausgelöst wird,
- das Ergebnis kontrolliert an den späteren Batch-/Persistenzpfad übergeben werden kann,
- der Stand fehlerfrei buildbar bleibt.

---

## AP-004 Logging-Infrastruktur, Korrelation und Sensibilitätsregel für M7 vorbereiten

### Voraussetzung
AP-001 ist abgeschlossen.

### Ziel
Die Logging-Infrastruktur ist für den M7-Endstand vorbereitet, die Sensibilitätsregel für KI-Inhalte ist technisch korrekt verdrahtet und dokumentbezogene Ereignisse können später im Batch-Lauf konsistent und eindeutig korreliert geloggt werden.

### Muss umgesetzt werden
- Die bestehende Logging-Infrastruktur gezielt so erweitern, dass der in M7 geforderte Mindestumfang später ohne zusätzliche Architekturbrüche angebunden werden kann.
- Einen klar dokumentierten, booleschen Konfigurationswert für sensible KI-Logausgaben einführen und verdrahten.
- Sicherstellen, dass die vollständige KI-Rohantwort standardmäßig **nicht** geloggt wird.
- Sicherstellen, dass vollständiges KI-`reasoning` standardmäßig **nicht** vollständig geloggt wird.
- Sicherstellen, dass die vollständige KI-Rohantwort und das vollständige KI-`reasoning` weiterhin in SQLite verbleiben können und M7 hier keine Reduktion oder Löschung der Nachvollziehbarkeit einführt.
- Einen M7-tauglichen Mechanismus für dokumentbezogene Log-Korrelation vorbereiten, insbesondere:
  - Lauf-ID-basierte Korrelation vor erfolgreicher Fingerprint-Ermittlung,
  - Fingerprint- oder eindeutig ableitbare Dokumentreferenz nach erfolgreicher Fingerprint-Ermittlung.
- Die logbaren Ereignis- und Entscheidungsmodelle aus AP-001 an die Logging-Infrastruktur anbinden, ohne dass fachliche Entscheidungslogik in technische Logger-Aufrufe zerfällt.
- Bereits auf dieser Stufe die nicht dokumentgebundenen Pflicht-Logpunkte sauber verdrahten, insbesondere:
  - Laufstart,
  - Laufende,
  - harte Startfehler, soweit auf aktuellem Stand erreichbar.
- JavaDoc und ggf. `package-info` für Logging-Sensibilität, Korrelation, Mindestumfangsvorbereitung und Architekturgrenzen ergänzen.

### Explizit nicht Teil
- vollständige Batch-Integration aller dokumentbezogenen M7-Logpunkte
- Finalisierung der Retry- und Skip-Hooks im Dokumentlauf
- Startvalidierung des Endstands
- finale Exit-Code-Verdrahtung
- Tests des gesamten Endstands

### Fertig wenn
- die Logging-Infrastruktur den M7-Endstand ohne Zusatzannahmen tragen kann,
- die Sensibilitätsregel standardmäßig auf „nicht loggen“ steht,
- sensible KI-Inhalte nur über explizite Konfiguration logbar sind,
- dokumentbezogene Log-Korrelation technisch vorbereitet ist,
- der Stand fehlerfrei buildbar bleibt.

---

## AP-005 Repository-, Persistenz- und Nachvollziehbarkeitsanpassungen für den M7-Endstand ergänzen

### Voraussetzung
AP-001, AP-002 und AP-004 sind abgeschlossen.

### Ziel
Die bestehende Persistenz aus M4–M6 unterstützt die vollständige M7-Fehler-, Retry-, Skip- und Logging-Nachvollziehbarkeit ohne neue Wahrheitsquelle und ohne unnötige Schema-Neuerfindung.

### Muss umgesetzt werden
- Prüfen und gezielt ergänzen, welche Repository-Fähigkeiten für den M7-Endstand tatsächlich fehlen, ohne das bestehende Zwei-Ebenen-Modell neu zu entwerfen.
- Bestehende Repository-Operationen so erweitern oder schärfen, dass sie für M7 reproduzierbar unterstützen:
  - Finalisierung ausgeschöpfter Retry-Rahmen,
  - konsistente Fortschreibung von Inhalts- und Transientfehlerzählern,
  - historisierte Skip-Ereignisse,
  - dokumentbezogene Fehlerklassifikation und Retryable-Flag im Endstand,
  - lesende Auswertung der bestehenden Versuchshistorie, soweit für Retry- und Skip-Entscheidungen zwingend erforderlich,
  - konsistente Nachvollziehbarkeit zwischen Log-Entscheidung und SQLite-Historie.
- Falls für den M7-Endstand zusätzliche lesende Auswertungen der bestehenden Versuchshistorie nötig sind, diese gezielt ergänzen, ohne Reporting- oder Statistikfunktionalität vorwegzunehmen.
- Nur dann eine Schemaevolution vornehmen, wenn sie für den M7-Zielstand **zwingend** erforderlich ist; andernfalls ausdrücklich beim bestehenden M6-Schema bleiben.
- Sicherstellen, dass bestehende M4–M6-Datenbestände lesbar und korrekt fortschreibbar bleiben.
- Sicherstellen, dass der spätere Batch-Lauf aus AP-006 alle für M7 notwendigen Persistenzoperationen bereits vorfindet und **keine** impliziten Repository-Erweiterungen mehr nachschieben muss.
- JavaDoc für Nachvollziehbarkeit, bestehende Persistenz-Wahrheit und M7-Grenzen ergänzen.

### Explizit nicht Teil
- vollständige Batch-Use-Case-Integration der M7-Regeln
- neue dritte Persistenzebene
- Reporting/Analytics
- Bootstrap-Anpassungen
- Logging-Framework-Konfiguration
- M8-Gesamtreview

### Fertig wenn
- die Persistenz den vollständigen M7-Endstand konsistent unterstützt,
- keine unnötige Schema-Neuerfindung oder Parallelwahrheit eingeführt wurde,
- bestehende M4–M6-Datenbestände anschlussfähig bleiben,
- der Stand fehlerfrei buildbar bleibt.

---

## AP-006 M7-Batch-Integration für Skip-Logik, Finalisierung ausgeschöpfter Retries, Logging-Hooks und konsistente Fehlerfortschreibung umsetzen

### Voraussetzung
AP-001 bis AP-005 sind abgeschlossen.

### Ziel
Der bestehende M6-Lauf wird zum vollständigen M7-Lauf erweitert, der Retry-Entscheidungen, Finalisierung, Skip-Verhalten, Sofort-Wiederholversuch, dokumentbezogene Logging-Hooks und konsistente Status-/Persistenzfortschreibung zusammenführt.

### Muss umgesetzt werden
- Den bestehenden Batch-Use-Case so erweitern, dass pro Dokument die vollständigen M7-Regeln wirksam werden.
- Folgende Regeln explizit umsetzen:
  - `SUCCESS` → kein erneuter fachlicher Durchlauf, stattdessen `SKIPPED_ALREADY_PROCESSED` historisieren,
  - `FAILED_FINAL` → kein erneuter fachlicher Durchlauf, stattdessen `SKIPPED_FINAL_FAILURE` historisieren,
  - `FAILED_RETRYABLE`, `READY_FOR_AI` und `PROPOSAL_READY` bleiben verarbeitbar,
  - deterministische Inhaltsfehler werden nach dem zweiten historisierten Auftreten finalisiert,
  - transiente technische Fehler werden bei Erreichen des Grenzwerts `max.retries.transient` finalisiert.
- Sicherstellen, dass der technische Sofort-Wiederholversuch aus AP-003 ausschließlich im Zielkopierpfad wirkt und danach in **genau eine** dokumentbezogene Status- und Persistenzfortschreibung mündet.
- Sicherstellen, dass dokumentbezogene Fehler und Finalisierungen den Batch-Lauf für andere Dokumente nicht unnötig abbrechen.
- Sicherstellen, dass Historie und Stammsatz pro identifiziertem Dokument weiterhin konsistent fortgeschrieben werden und kein teilpersistierter M7-Zustand zurückbleibt.
- Vor-Fingerprint-Fehler weiterhin ausdrücklich **nicht** als SQLite-Versuch historisieren.
- Die vorbereitete Logging-Infrastruktur aus AP-004 an den fachlich relevanten Batch-Entscheidungspunkten anbinden, so dass der finale M7-Mindestumfang vollständig erreicht wird, insbesondere:
  - erkannte Quelldatei,
  - Überspringen bereits erfolgreicher Dateien,
  - Überspringen final fehlgeschlagener Dateien,
  - erzeugter Zielname,
  - Retry-Entscheidung,
  - Fehler mit Klassifikation.
- Sicherstellen, dass dokumentbezogene Logs nach erfolgreicher Fingerprint-Ermittlung den Fingerprint oder eine eindeutig ableitbare Referenz enthalten und vor erfolgreicher Fingerprint-Ermittlung mindestens über Lauf-ID und Kandidatenbezug korreliert werden können.
- JavaDoc für M7-Laufreihenfolge, Finalisierung ausgeschöpfter Retries, Skip-Regeln, Logging-Hooks und Fehlerfortschreibung ergänzen.

### Explizit nicht Teil
- Bootstrap- und Startvalidierungsanpassungen
- finale Exit-Code-Verdrahtung
- End-to-End-Tests
- M8-Feinschliff

### Fertig wenn
- der Batch-Lauf die vollständige M7-Retry- und Skip-Semantik umsetzt,
- ausgeschöpfte Retry-Rahmen zu `FAILED_FINAL` führen,
- der Sofort-Wiederholversuch korrekt in den Dokumentlauf integriert ist,
- der finale dokumentbezogene Logging-Mindestumfang des M7-Stands vollständig angebunden ist,
- dokumentbezogene Fehler den Gesamtbatch kontrolliert weiterlaufen lassen,
- der Stand fehlerfrei buildbar bleibt.

---

## AP-007 Bootstrap-, Startvalidierungs- und Exit-Code-Finalisierung für den M7-Endstand durchführen

### Voraussetzung
AP-001 bis AP-006 sind abgeschlossen.

### Ziel
Der Programmeinstieg ist sauber auf den M7-Endstand verdrahtet; die finale Startvalidierung greift, dokumentbezogene Fehler werden korrekt von Startfehlern getrennt und das endgültige Exit-Code-Verhalten ist vollständig umgesetzt.

### Muss umgesetzt werden
- Bootstrap-Verdrahtung auf die neuen M7-Bausteine erweitern.
- M7-relevante Konfiguration ergänzen bzw. validieren, insbesondere für:
  - `max.retries.transient` als **Integer >= 1**,
  - den booleschen Konfigurationswert für sensible KI-Logausgaben,
  - bestehende M1–M6-Parameter, soweit sie für den robusten Endstand zwingend benötigt werden.
- Startvalidierung so vervollständigen, dass ungültige M7-Konfiguration den Lauf **vor** dem Batch-Beginn stoppt.
- Sicherstellen, dass harte Start-, Verdrahtungs-, Konfigurations- oder Initialisierungsfehler weiterhin zu **Exit-Code 1** führen.
- Sicherstellen, dass dokumentbezogene Fehler aus M3–M7 **nicht** zu Exit-Code 1 eskalieren, solange der Batch-Lauf technisch ordnungsgemäß durchgeführt werden konnte.
- Die M7-Logging-Verdrahtung so in den Startpfad integrieren, dass Laufstart, Laufende und harte Startfehler nachvollziehbar protokolliert werden.
- JavaDoc und `package-info` für aktualisierte Verdrahtung, Konfigurationsvalidierung, Exit-Code-Endsemantik und Modulgrenzen ergänzen.

### Explizit nicht Teil
- komplette Test-Suite
- M8-Qualitätsmaßnahmen
- neue fachliche Verarbeitung jenseits des M7-Zielbilds

### Fertig wenn
- das Programm im M7-Stand vollständig startbar ist,
- die M7-Startvalidierung greift,
- das finale Exit-Code-Verhalten vollständig umgesetzt ist,
- dokumentbezogene Fehler nicht als Startfehler fehlmodelliert werden,
- der Build fehlerfrei bleibt.

---

## AP-008 Tests für Retry-Abläufe über mehrere Läufe, Sofort-Wiederholversuch, Logging-Sensibilität und Exit-Code-Endverhalten vervollständigen

### Voraussetzung
AP-001 bis AP-007 sind abgeschlossen.

### Ziel
Der vollständige M7-Zielzustand wird automatisiert abgesichert und als konsistenter Übergabestand nachgewiesen.

### Muss umgesetzt werden
- Tests für Retry-Abläufe über mehrere Läufe implementieren, insbesondere für:
  - erster deterministischer Inhaltsfehler → `FAILED_RETRYABLE`,
  - zweiter deterministischer Inhaltsfehler → `FAILED_FINAL`,
  - transiente technische Fehler bleiben bis zum konfigurierten Grenzwert retryable,
  - der transiente Fehlversuch am Grenzwert finalisiert zu `FAILED_FINAL`,
  - `max.retries.transient = 1` finalisiert beim ersten historisierten transienten Fehlversuch,
  - `max.retries.transient = 0` wird als ungültige Startkonfiguration abgewiesen.
- Tests für finale Fehlerzustände ergänzen, insbesondere:
  - `FAILED_FINAL` wird im Wiederholungslauf historisiert übersprungen,
  - `SUCCESS` wird im Wiederholungslauf historisiert übersprungen,
  - Skip-Ereignisse verändern keine Fehlerzähler.
- Tests für den technischen Sofort-Wiederholversuch im Zielkopierpfad ergänzen, insbesondere:
  - erster Schreibversuch scheitert, zweiter gelingt,
  - beide Schreibversuche scheitern,
  - kein erneuter KI-Aufruf,
  - kein zusätzlicher laufübergreifender Retry-Zähler durch den Sofort-Wiederholversuch.
- Tests für die Logging-Sensibilitätsregel ergänzen, soweit automatisierbar, insbesondere:
  - vollständige KI-Rohantwort wird standardmäßig nicht geloggt,
  - vollständiges KI-`reasoning` wird standardmäßig nicht vollständig geloggt,
  - vollständige KI-Rohantwort bleibt in SQLite verfügbar,
  - vollständiges KI-`reasoning` bleibt in SQLite verfügbar,
  - explizite Freischaltung sensibler KI-Logausgabe wirkt nur kontrolliert.
- Tests für Logging-Korrelation ergänzen, soweit automatisierbar, insbesondere:
  - vor erfolgreicher Fingerprint-Ermittlung ist Kandidatenbezug über Lauf-ID und Quelldatei nachvollziehbar,
  - nach erfolgreicher Fingerprint-Ermittlung tragen dokumentbezogene Logs den Fingerprint oder eine eindeutig ableitbare Referenz.
- Tests für finales Exit-Code-Verhalten ergänzen, insbesondere:
  - `0` bei technisch ordnungsgemäßem Lauf trotz dokumentbezogener Fehler,
  - `1` bei harter ungültiger Startkonfiguration,
  - `1` bei harten Bootstrap-/Initialisierungsfehlern,
  - dokumentbezogene Fehler aus M3–M7 führen nicht zu Exit-Code 1.
- Tests für Konfigurationsvalidierung ergänzen, insbesondere:
  - ungültiges `max.retries.transient`,
  - ungültige Logging-Sensitivitätskonfiguration,
  - M7-Startkonfiguration verhindert bei Ungültigkeit den Laufbeginn.
- Integrationstests für den vollständigen M7-Ablauf ergänzen, insbesondere:
  - robuster Happy-Path mit `SUCCESS`,
  - dokumentbezogene Teilfehler blockieren den Batch nicht,
  - ausgeschöpfte Retry-Rahmen führen stabil zu terminalen Skip-Folgeläufen,
  - bestehendes `PROPOSAL_READY` kann weiter bis zum Enderfolg finalisiert werden,
  - M4–M6-Altbestände bleiben anschlussfähig.
- Den M7-Stand abschließend auf Konsistenz, Architekturtreue und Nicht-Vorgriff auf M8+ prüfen.

### Explizit nicht Teil
- M8-Gesamtfreigabe
- zusätzliche Qualitätskampagnen außerhalb des M7-Zielumfangs

### Fertig wenn
- die Test-Suite für den M7-Umfang grün ist,
- die wichtigsten Retry-, Finalisierungs-, Logging-, Korrelations- und Exit-Code-Randfälle automatisiert abgesichert sind,
- der definierte M7-Zielzustand vollständig erreicht ist,
- ein fehlerfreier, übergabefähiger Stand vorliegt.

---

## Abschlussbewertung

Die Arbeitspakete decken den vollständigen M7-Zielumfang aus den verbindlichen Spezifikationen ab und schließen die betriebliche Lücke zwischen dem M6-Erfolgspfad und dem final robusten Endstand sauber:

- vollständige Retry-Logik über spätere Läufe
- saubere Finalisierung nach ausgeschöpften Retry-Rahmen
- technischer Sofort-Wiederholversuch ausschließlich für Zielkopierfehler
- vollständige Skip-Semantik für `SUCCESS` und `FAILED_FINAL`
- finaler Logging-Mindestumfang
- Sensibilitätsregel für KI-Inhalte im Logging
- präzise Korrelation zwischen Logs und dokumentbezogenen Entscheidungen
- finale Exit-Code-Semantik
- vervollständigte Startvalidierung
- konsistente Nachvollziehbarkeit in Logs und SQLite
- Tests für Mehrlauf-Retries, Sofort-Wiederholversuch, Logging-Sensibilität und Exit-Code-Endverhalten

Gleichzeitig bleiben die Grenzen zu M1–M6 sowie zu M8+ gewahrt. Insbesondere werden **keine** neuen Fachfunktionen, **kein** M8-Gesamtfeinschliff und **keine** unnötigen Parallelwahrheiten für Persistenz oder Retry-Zustände eingeführt.

583 docs/workpackages/M8 - Arbeitspakete.md Normal file

# M8 - Arbeitspakete
|
||||
|
||||
## Geltungsbereich
|
||||
|
||||
Dieses Dokument beschreibt ausschließlich die Arbeitspakete für den definierten Meilenstein **M8 – Abschlussmeilenstein: Qualitätssicherung, Feinschliff und vollständige Entwicklungsfreigabe**.
|
||||
|
||||
Die Meilensteine **M1**, **M2**, **M3**, **M4**, **M5**, **M6** und **M7** werden als vollständig umgesetzt vorausgesetzt.
|
||||
|
||||
Die Arbeitspakete sind bewusst so geschnitten, dass:
|
||||
|
||||
- **KI 1** daraus je Arbeitspaket einen klaren Einzel-Prompt ableiten kann,
|
||||
- **KI 2** genau dieses eine Arbeitspaket in **einem Durchgang** vollständig umsetzen kann,
|
||||
- nach **jedem** Arbeitspaket wieder ein **fehlerfreier, buildbarer Stand** vorliegt.
|
||||
|
||||
Die Reihenfolge der Arbeitspakete ist verbindlich.
|
||||
|
||||
## Zusätzliche Schnittregeln für die KI-Bearbeitung
|
||||
|
||||
- Pro Arbeitspaket nur die **minimal notwendigen Querschnitte** durch Domain, Application, Adapter, Bootstrap, Konfiguration, Dokumentation und Tests ändern.
|
||||
- Keine Annahmen treffen, die nicht durch die verbindlichen Spezifikationen oder den tatsächlich vorliegenden Code- und Teststand gedeckt sind.
|
||||
- Kein Vorgriff auf ein hypothetisches **M9** oder sonstige neue Produktfeatures.
|
||||
- Kein großflächiger Umbau bestehender M1–M7-Strukturen ohne nachweisbaren M8-Bezug.
|
||||
- M8 ist **review- und konsolidierungsgetrieben**: Es werden nur tatsächlich vorhandene Restlücken, Inkonsistenzen, Dokumentationsdefizite, Testlücken oder Qualitätsprobleme geschlossen.
|
||||
- M8 darf bestehende Implementierungen gezielt schärfen, vereinheitlichen oder bereinigen, aber nicht stillschweigend neue Fachregeln erfinden.
|
||||
- Jeder positive M8-Zwischenstand muss bereits einen **robusten, vollständig buildbaren und testbaren Endstand** liefern, auch wenn die vollständige Entwicklungsfreigabe erst mit späteren Arbeitspaketen nachgewiesen wird.
|
||||
- Ein Arbeitspaket darf nur dann auf neue Prüf-, Test- oder Repository-Fähigkeiten aufbauen, wenn diese bereits aus M1–M7 vorhanden sind oder im unmittelbar vorhergehenden Arbeitspaket explizit geschaffen wurden.
|
||||
- Ein M8-Arbeitspaket darf innerhalb seines benannten Themas zuerst **gezielt prüfen** und dann **nur die in genau diesem Thema nachweisbaren Befunde** beheben.
|
||||
- Unspezifische Sammelaufträge wie „alles prüfen und alles fixen“ sind **kein** zulässiger Zuschnitt für ein einzelnes Arbeitspaket.
|
||||
- Wo ein Arbeitspaket einen Prüfbericht oder Freigabenachweis verlangt, muss dieser **im Repository verbleiben** und auf den real ausgeführten Build-/Teststand bezogen sein.
|
||||
|
||||
## Explizit nicht Bestandteil von M8
|
||||
|
||||
- neue Fachfunktionalität jenseits des bereits definierten Zielbilds
|
||||
- neue Meilensteine, Folgeprodukte oder optionale Komfortfunktionen
|
||||
- Web-UI, REST-API, OCR, Benutzerinteraktion oder manuelle Nachbearbeitung
|
||||
- Reporting-, Monitoring- oder Statistikfunktionen ohne zwingenden M8-Bezug
|
||||
- großflächige Architektur-Neuerfindung statt gezielter Endstandskonsolidierung
|
||||
- kosmetische Änderungen ohne nachweisbaren Nutzen für Betrieb, Konsistenz, Verständlichkeit oder Qualität
|
||||
- Metrik-Tuning ohne fachlich oder technisch belastbare Begründung
|
||||
- pauschale „Aufräumarbeiten“, die nicht an einen konkreten, belegbaren M8-Befund gebunden sind
|
||||
|
||||
## Binding M8 rules for **all** work packages

### 1. M8 closes only real residual gaps of the final state

M8 does not add a new product vision; it brings the overall state that emerged from M1–M7 to a fully consistent, documented, and releasable final condition.

This implies:

- Only **demonstrable** residual gaps are closed.
- Speculative rework without a concrete defect, quality, or consistency link is not permitted.
- Changes must be traceable to the binding specifications and the real project state.

### 2. Architectural fidelity remains non-negotiable

M8 continues to require, unchanged:

- strict hexagonal architecture,
- dependencies point inwards,
- no infrastructure types in Domain or Application,
- no direct adapter-to-adapter coupling,
- no new parallel structure beside the established module and port model.

M8 may remove existing violations but must not introduce new ones.

### 3. No second source of truth for core functional or technical state

The already established basis of truth remains binding in M8:

- the document master record for overall status and counters,
- the attempt history for individual attempts and traceability,
- the leading `PROPOSAL_READY` attempt as the source of the M5 naming proposal,
- the target-artifact state as defined by M6/M7.

M8 introduces **no** additional parallel truth for status, retry, proposal, target name, logging decisions, or result evaluation.
### 4. Documentation and implementation must be free of contradictions

From M8 on, the final state only counts as correct if:

- JavaDoc,
- `package-info`,
- configuration examples,
- startup and operations documentation,
- logging and error-message semantics,
- audit and sign-off records,
- and tests

agree in what they state with the actual behavior of the code.

### 5. Test focus on core invariants rather than metric cosmetics

M8 completes quality assurance specifically for the functionally and technically load-bearing rules of the system, in particular for:

- status and retry semantics,
- persistence consistency,
- file-name formation,
- the target copy,
- startup validation,
- logging sensitivity,
- multi-run behavior,
- end-to-end flows.

Pure number optimization without a solid link to risk is not a goal of M8.
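The status and retry semantics named above distinguish deterministic content errors (which fail finally) from transient technical errors (which are retried up to a limit). A minimal sketch of that decision, assuming a hypothetical limit of 3 and simplified error categories — the binding values live in the M4 specification, not here:

```java
// Sketch of the retry decision: deterministic content errors fail finally,
// transient technical errors are retried up to a hypothetical limit.
enum ErrorKind { TRANSIENT_TECHNICAL, DETERMINISTIC_CONTENT }

final class RetryPolicy {
    static final int MAX_ATTEMPTS = 3; // illustrative value, not the binding M4 limit

    static boolean shouldRetry(ErrorKind kind, int failedAttempts) {
        if (kind == ErrorKind.DETERMINISTIC_CONTENT) {
            return false; // content errors become final immediately
        }
        return failedAttempts < MAX_ATTEMPTS;
    }
}
```

Regression tests for this rule would pin exactly these two branches rather than any implementation detail.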
### 6. Backward compatibility of existing data remains intact

M8 must keep existing M4–M7 data sets:

- readable,
- correctly updatable,
- and consistently interpretable,

as far as this is required within the already defined target picture.

### 7. Operator-relevant feedback must be clear, consistent, and reliable

M8 sharpens operator-facing feedback so that startup, configuration, document, and error states can be understood without unnecessary interpretation.

This implies:

- Error messages must be neither misleading nor contradictory.
- Logging and documentation must use the same core terms.
- Sensitive AI content remains protected by default.
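The last rule — sensitive AI content stays protected by default — can be sketched as a single guarded formatting step; the method name and the placeholder text are illustrative assumptions, not the project's real API:

```java
// Sketch of the default protection of sensitive AI content in logs.
// Method name and placeholder text are illustrative assumptions.
final class SensitiveLogging {
    static String forLog(String aiContent, boolean verboseOptIn) {
        if (verboseOptIn) {
            return aiContent; // the operator explicitly opted in
        }
        return "[AI content suppressed]"; // default: raw AI output never reaches the log
    }
}
```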
### 8. Complete development sign-off requires a demonstrable full run

The M8 final state only counts as complete once it is demonstrated that at least the following levels fit together:

- the Maven reactor build,
- the relevant test suites,
- smoke and startup behavior,
- the end-to-end overall flow,
- configuration examples,
- documentation,
- artifact generation.

### 9. M8 may clean up in a targeted way, but must not refactor uncontrolled

Only cleanups that directly serve one of these goals are permitted:

- architectural fidelity,
- consistency,
- comprehensibility,
- testability,
- stability,
- documentation clarity,
- releasability.

Large-scale structural rework without direct M8 benefit is excluded.

### 10. Overall audit, blocker fixing, and final sign-off are separate work steps

For the two-stage AI workflow, M8 additionally requires:

- the **integrated overall audit**, the **targeted release-blocker fixing**, and the **final sign-off confirmation** are separate work packages,
- a single work package must not simultaneously perform an unrestricted overall review **and** fix, without limit, all the topics found in it,
- release blockers may only be fixed in a later work package if they were **concretely demonstrated and scoped** in the immediately preceding audit work package.

---
## AP-001 Finalize architecture boundaries and code-level final-state documentation

### Prerequisite
None. This work package is the M8 starting point.

### Goal
The architecture boundaries of the overall state are sharpened conclusively and anchored in code-level documentation so that later M8 work packages can build on a consolidated final-state understanding without room for interpretation.

### Must be implemented
- Check existing module boundaries, responsibilities, and dependency directions against the real code base and clean up **only demonstrable** M8-relevant ambiguities or violations in a targeted way.
- Complete or sharpen JavaDoc and `package-info` wherever gaps or contradictions remain for the final state, in particular for:
  - domain responsibility,
  - application orchestration,
  - port purposes,
  - adapter responsibility,
  - bootstrap tasks,
  - final-state terms such as status, retry, proposal source, target success, and traceability.
- Ensure that the documentation describes architecture boundaries with the same terms and the same semantics as the logic implemented in M1–M7.
- Correct demonstrable, code-visible boundary violations only where they matter for M8 sign-off, maintainability, or specification fidelity.
- Limit production-code changes to **architecture-related** corrections; do not pre-empt the operator-facing message texts, the persistence cleanup, or the test campaign of later work packages.
- Document the architecture and terminology invariants binding for the final state so that AI 1 can derive a precise prompt for subsequent work packages.

### Explicitly not part
- new functional features
- new persistence models or new port landscapes without a defect link
- large-scale restructuring without a demonstrable architecture violation
- reworking operator-facing logging/error messages
- full test completion or documentation consolidation beyond the code-level architecture basis

### Done when
- the architecture boundaries of the final state are described clearly, consistently, and reliably in the code and in the code-level documentation,
- demonstrable M8-relevant architecture violations have been cleaned up in a targeted way,
- later M8 work packages can build on it without fundamental ambiguities,
- the build remains error-free.

---
## AP-002 Clean up status, persistence, proposal, and target-state consistency of the final state

### Prerequisite
AP-001 is complete.

### Goal
The last consistency gap between the document master record, the attempt history, the proposal source, the target-artifact state, and adapter behavior is closed without introducing new sources of truth or new functional rules.

### Must be implemented
- Check the real overall state from M4–M7 specifically for **demonstrable** inconsistencies, in particular between:
  - the overall status in the document master record,
  - error counters,
  - historized attempt data,
  - the leading `PROPOSAL_READY` source,
  - persisted target-artifact data,
  - adapter results in edge and error cases.
- Clean up inconsistencies actually present in the final state in a targeted way, especially where they can lead to contradictory multi-run behavior, inconsistent persistence updates, or error-prone M6/M7 finalization.
- Ensure that M4–M7 data sets remain readable and correctly updatable.
- Ensure that no redundant second persistence truth arises for proposal, retry, target, or error states.
- Close demonstrable semantic gaps between repository behavior and use-case decisions only as far as they are critical for the M8 final state.
- Update directly affected JavaDoc, mapping, and test locations along the way, but do not pre-empt the operator-facing text sharpening or the general test campaign of later work packages.

### Explicitly not part
- new functional rules beyond M1–M7
- reporting, statistics, or analysis features
- large-scale schema reinvention without a compelling M8 need
- logging fine-tuning or documentation consolidation beyond the consistency scope
- the integrated overall audit of the complete release candidate

### Done when
- demonstrable inconsistencies between the status model, persistence, proposal source, target-artifact state, and adapter behavior have been removed,
- multi-run behavior, the proposal source, and the target-artifact state interact consistently,
- no new parallel truth has been introduced,
- the state still builds without errors.

---
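One kind of invariant such a consistency check targets is that the counter in the document master record and the attempt history tell the same story. A minimal sketch, with record shapes and field names as hypothetical simplifications of the real persistence model:

```java
import java.util.List;

// Sketch of one consistency invariant between the document master record
// and the attempt history; record and field names are hypothetical
// simplifications of the real persistence model.
record Attempt(boolean failed) {}

record DocumentRecord(int failureCounter, List<Attempt> history) {
    /** The master-record counter must match the failed attempts in the history. */
    boolean isConsistent() {
        long failedInHistory = history.stream().filter(Attempt::failed).count();
        return failureCounter == failedInHistory;
    }
}
```

A cleanup pass would locate records where `isConsistent()` is false and reconcile them toward the single established source of truth.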
## AP-003 Sharpen operator-relevant logging, error-text, and startup-validation feedback of the final state

### Prerequisite
AP-001 and AP-002 are complete.

### Goal
The externally visible feedback of the system is sharpened in language and content so that operating, troubleshooting, and signing off the final state are possible without unnecessary ambiguity.

### Must be implemented
- Check existing logging and error messages for startup, configuration, document, and runtime states for **demonstrable** imprecision, contradictions, misleading wording, or inconsistent use of terms.
- Sharpen operator-relevant messages in a targeted way, in particular for:
  - hard startup and configuration errors,
  - document-level error classification,
  - retry decisions,
  - skip cases,
  - proposal and target-success states,
  - run start and run end.
- Ensure that the M7 sensitivity rule for AI content remains consistent in language and implementation and is not undermined by misleading logs or error messages.
- Structure startup-validation errors so that they give clear operator guidance without exposing technical internals or suggesting false causal chains.
- Unify terminology across logging, exception texts, configuration validation, and the documented semantics.
- If targeted technical wiring or formatting adjustments are required for this, implement them minimally and in line with the architecture.
- Add or sharpen only the tests directly needed for this feedback.

### Explicitly not part
- a new logging framework or new telemetry layers
- new operations features without an M8 link
- comprehensive documentation consolidation beyond operator-facing feedback
- full end-to-end test completion
- a coverage/PIT campaign

### Done when
- the logging and error feedback of the final state is clear, consistent, and reliable,
- operator-relevant states remain understandable without unnecessary interpretation,
- the sensitivity rule for AI content still applies cleanly,
- the state builds without errors.

---
## AP-004 Consolidate configuration examples, prompt reference, and startup and operations documentation

### Prerequisite
AP-001 through AP-003 are complete.

### Goal
The repository state contains a consolidated documentation and example basis, aligned with the real final behavior, that makes local starts, batch runs, and operational understanding possible without implicit assumptions.

### Must be implemented
- Check the existing configuration examples against the real final state and complete or clean them up in a targeted way, in particular for:
  - mandatory parameters,
  - optional parameters,
  - sensible example values,
  - M7-relevant logging and retry configuration,
  - the precedence of the environment variable over properties for the API key.
- Document the existing prompt reference consistently in the repository.
- If a prompt example or a traceable prompt skeleton required for a reproducible local start is missing from the repository, add it **minimally and in line with the final state**.
- Consolidate the startup, configuration, and operations documentation so that at least the following are described traceably:
  - the required inputs,
  - starting the executable artifact,
  - the source and target folder relationship,
  - SQLite usage,
  - basic retry and skip behavior,
  - basic logging behavior,
  - handling of sensitive AI content in logging,
  - the limits of the system.
- Clean up outdated, contradictory, or no-longer-matching documentation in a targeted way.
- Ensure that configuration names, file names, example paths, and documentation statements match the actual code.
- Touch production code only where documentation and code diverge in a **clearly demonstrable** naming or configuration conflict.

### Explicitly not part
- external web documentation or manuals outside the repository
- new functional features
- broad code refactorings without a documentation link
- final test-gap closing
- the integrated overall audit of the release candidate

### Done when
- the final state is described in a traceably startable and operable way through the examples and documents contained in the repository,
- configuration and prompt examples match the real code,
- outdated or contradictory documentation has been cleaned up,
- the state still builds without errors.

---
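The API-key precedence rule named above can be sketched as a single resolution step; the environment-variable name and the property key used here are illustrative assumptions, not the project's real identifiers:

```java
import java.util.Map;
import java.util.Properties;

// Sketch of the documented precedence: the environment variable wins over
// the properties file for the API key. Variable and property names are
// illustrative assumptions, not the project's real keys.
final class ApiKeyResolver {
    static String resolve(Map<String, String> env, Properties props) {
        String fromEnv = env.get("PDF_UMBENENNER_API_KEY"); // hypothetical name
        if (fromEnv != null && !fromEnv.isBlank()) {
            return fromEnv;
        }
        return props.getProperty("ai.api-key"); // hypothetical property key
    }
}
```

Passing the environment in as a map (instead of calling `System.getenv()` directly) keeps the precedence rule trivially testable.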
## AP-005 Provide a deterministic end-to-end test basis and reusable test data for the overall process

### Prerequisite
AP-001 through AP-004 are complete.

### Goal
For the final quality proof, a robust, deterministic, and reusable end-to-end test basis is available that can reproduce the complete batch process of the final state without external uncertainties.

### Must be implemented
- Provide a reusable end-to-end test basis for the overall process that encapsulates, in a controlled way, at least:
  - the source folder,
  - the target folder,
  - temporary artifacts,
  - the SQLite file,
  - the configuration,
  - the required test doubles for external dependencies.
- Provide deterministic test data and fixtures for the central final-state scenarios, in particular for:
  - the happy path up to `SUCCESS`,
  - a deterministic content error,
  - a transient technical error,
  - skip after `SUCCESS`,
  - skip after `FAILED_FINAL`,
  - an existing `PROPOSAL_READY` with later finalization,
  - a target-copy error with the M7 immediate retry.
- Ensure that the end-to-end test basis has no uncontrolled dependency on external AI services, unstable file-system states, or global runtime environments.
- Cut the test helpers and fixture structures so that later M8 test work packages can build on them without reinventing test infrastructure.
- Document which final-state invariants the end-to-end test basis makes specifically demonstrable.

### Explicitly not part
- closing all test and coverage gaps completely
- arbitrary test multiplication without a final-state link
- new functional features
- quality-metric tuning without a concrete test-case link
- the global release sign-off decision

### Done when
- a stable and deterministic end-to-end test basis exists,
- the relevant final-state scenarios can be prepared reproducibly,
- later M8 test work packages can attach without a new basic test structure,
- the state builds without errors.

---
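The controlled encapsulation of source folder, target folder, and SQLite file described above boils down to giving every test run its own isolated workspace. A minimal sketch, with class and file names as illustrative assumptions:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of a self-contained end-to-end workspace: source folder, target
// folder, and SQLite file live under one temporary directory, so no test
// touches global state. Class and file names are illustrative assumptions.
final class E2eWorkspace {
    Path root;
    Path source;
    Path target;
    Path database;

    E2eWorkspace() {
        try {
            root = Files.createTempDirectory("pdf-umbenenner-e2e");
            source = Files.createDirectories(root.resolve("source"));
            target = Files.createDirectories(root.resolve("target"));
            database = root.resolve("state.sqlite"); // created lazily by the adapter
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Each scenario fixture would then only populate `source` and pre-seed `database` for its specific case.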
## AP-006 Complete regression tests for core rules, edge cases, and consistency invariants of the final state

### Prerequisite
AP-001 through AP-005 are complete.

### Goal
The functionally and technically load-bearing rules of the overall state are secured by automation so that real regression risks of the productive final state are detected reliably.

### Must be implemented
- Add or complete targeted regression tests for the load-bearing rules from M1–M7, in particular for:
  - status and retry semantics,
  - multi-run behavior,
  - skip rules,
  - the proposal source,
  - file-name formation,
  - Windows compatibility,
  - duplicate resolution,
  - the target copy,
  - persistence consistency,
  - startup validation,
  - logging sensitivity,
  - final exit-code behavior.
- Secure edge cases that are regression-critical for the final state, in particular:
  - inconsistent historical data states within the permitted M4–M7 range,
  - boundary cases of the error counters,
  - failed persistence after a technically successful target copy,
  - renewed runs after terminal states,
  - proposal and finalization transitions.
- Ensure the tests check real final-state invariants rather than merely freezing implementation details.
- Close existing test gaps wherever a reliable development sign-off would not be possible without doing so.
- Reuse the end-to-end test basis from AP-005 in a targeted way instead of reinventing it in parallel.

### Explicitly not part
- purely cosmetic test additions without a risk link
- new product features
- broad quality-metric campaigns without a concrete critical gap
- the full sign-off audit of the overall project

### Done when
- the regression-critical core rules of the final state are secured by automation,
- edge cases highly relevant for stability, consistency, and multi-run behavior are tested reliably,
- the test basis remains coherent and reusable,
- the state builds and tests without errors.

---
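File-name formation and Windows compatibility, both named in the test focus above, essentially mean filtering the characters that Windows reserves in file names. A minimal sketch, assuming `_` as the replacement character — the real M5 rules may differ:

```java
// Sketch of Windows-safe file-name sanitising; the replacement character
// and the rule set are illustrative assumptions, not the binding M5 rules.
final class FileNameSanitizer {
    // Characters reserved by Windows file systems.
    private static final String FORBIDDEN = "\\/:*?\"<>|";

    static String sanitize(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        for (char c : name.toCharArray()) {
            sb.append(FORBIDDEN.indexOf(c) >= 0 ? '_' : c);
        }
        return sb.toString().strip(); // Windows also rejects trailing spaces
    }
}
```

Regression tests around this rule would assert the invariant (no forbidden character survives) rather than pin a specific replacement strategy.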
## AP-007 Close critical coverage and mutation gaps of the final state in a targeted way

### Prerequisite
AP-001 through AP-006 are complete.

### Goal
The quality assurance of the final state is hardened specifically where JaCoCo or PIT results still reveal real risks in the load-bearing decision and error paths.

### Must be implemented
- Check the project's existing quality reports specifically for **functionally and technically critical** gaps, in particular in areas such as:
  - the retry decision,
  - status updates,
  - persistence consistency,
  - file-name formation,
  - the target-copy path,
  - startup validation,
  - the logging sensitivity decision.
- Close meaningful gaps or surviving mutations specifically through:
  - additional tests,
  - small, demonstrably useful implementation sharpenings,
  - or narrowly justified test-case refinements.
- Get existing quality gates or quality reports stably green for the relevant project state, as far as they are already part of the build setup.
- Ensure that no metric cosmetics are applied, for example through arbitrary exclusions or unreliable test workarounds.
- Touch build or quality configuration only if this is strictly required for a correct, reliable M8 final state and can be justified factually.
- Limit changes to the **demonstrated** high-risk gaps; no blind hardening of areas that are already uncritical.

### Explicitly not part
- blindly driving up metrics without a risk link
- large-scale quality-gate reinvention without an existing project link
- new functional features
- the final sign-off of the overall project without a preceding overall audit record

### Done when
- the critical coverage and mutation gaps of the final state have been closed in a targeted way,
- the remaining quality reports show no obvious high-risk blind spots,
- the build and test setup stays reliably green,
- no metric cosmetics have been introduced.

---
## AP-008 Perform the integrated overall audit of the final state and produce a reliable findings list

### Prerequisite
AP-001 through AP-007 are complete.

### Goal
The final state reached at this point is audited holistically, and a reliable findings list remains in the repository from which AI 1 can derive, for a possible follow-up work package, exclusively the release blockers that actually remain.

### Must be implemented
- Audit the complete project state holistically against the binding specifications and the results of M1–M7.
- Actually execute and evaluate the complete Maven reactor build, the relevant test suites, the smoke tests of the executable artifact, and the decisive end-to-end checks of the final state.
- Check and record in writing where, in particular, the following fit together or still deviate:
  - architecture and module boundaries,
  - functional rules,
  - persistence and retry semantics,
  - file-name and target-copy behavior,
  - startup validation and exit code,
  - logging and the sensitivity rule,
  - configuration examples,
  - operations and startup documentation,
  - build and test artifacts.
- Add or update a concise findings file that remains in the repository and that:
  - names the checks actually executed,
  - separates green areas from open points,
  - classifies open points as **release blocker** vs. **non-blocking**,
  - scopes, per release blocker, the affected topic area unambiguously.
- Make only the minimal necessary changes to build/test helpers or check scripts required to execute this integrated overall audit reproducibly.

### Explicitly not part
- a blanket fix of all topics discovered in this overall review within the same work package
- new product features or new milestones
- late large refactorings without a clear audit link
- the final release declaration of the project

### Done when
- the complete final state has been audited holistically,
- the checks actually executed are documented reliably,
- a clearly scoped findings list exists in the repository,
- any release blockers are described precisely enough for a follow-up work package,
- the state still builds without errors.

---
## AP-009 Fix the targeted release blockers from the integrated overall audit

### Prerequisite
AP-008 is complete.

### Goal
Only the release blockers concretely demonstrated and scoped in AP-008 are removed in a targeted way, without reopening the scope of the closing milestone.

### Must be implemented
- Work exclusively on the points marked as **release blockers** in the findings list from AP-008.
- Limit the fix for each blocker to the topic area clearly named there.
- Ensure that no unsubstantiated side projects or new quality campaigns are pulled into this work package.
- Update directly affected tests, documentation locations, and configuration examples along the way, as far as needed for a consistent fix of the concrete blocker.
- If AP-008 demonstrated **no** release blockers, make no unnecessary production changes in this work package; only record the absence of blockers traceably in the repository.
- After fixing the blockers, re-run at least the build/test scope necessary for the affected blockers.

### Explicitly not part
- fixing merely non-blocking cosmetic flaws
- new product features or new milestones
- a renewed global audit of the complete final state
- broad sharpening of areas that AP-008 did not scope as blockers

### Done when
- all release blockers demonstrated in AP-008 have been removed in a targeted way or traceably confirmed as no longer present,
- no unnecessary scope expansion has taken place,
- the affected areas build and test without errors again,
- the state remains ready for handover.

---
## AP-010 Perform the final overall audit, sign-off documentation, and closure of the M8 final state

### Prerequisite
AP-001 through AP-009 are complete.

### Goal
The overall state is conclusively demonstrated as a fully releasable productive final state within the defined project scope, and the development sign-off is documented traceably.

### Must be implemented
- Once more actually execute and evaluate the complete Maven reactor build, the relevant test suites, the smoke tests of the executable artifact, and the decisive end-to-end checks of the final state.
- Check and confirm that, in particular, the following fit together:
  - architecture and module boundaries,
  - functional rules,
  - persistence and retry semantics,
  - file-name and target-copy behavior,
  - startup validation and exit code,
  - logging and the sensitivity rule,
  - configuration examples,
  - operations and startup documentation,
  - build and test artifacts.
- Add or update a concise closing or sign-off document that remains in the repository and records at least:
  - which checks were actually executed,
  - that no known, specification-relevant release blockers are open for the defined project scope,
  - which artifacts, tests, or documents this statement rests on.
- Ensure that after this work package no known, specification-relevant blocker remains open for the defined project scope.
- Make further changes to production code, tests, or documentation only if a **concretely demonstrable** sign-off blocker appears in the immediate closing run and can be fixed minimally without expanding the scope.

### Explicitly not part
- new product features or new milestones
- late large refactorings without a direct sign-off-blocker link
- arbitrary cosmetic corrections without sign-off relevance

### Done when
- the complete final state has been audited holistically and is releasable,
- build, tests, smoke behavior, and end-to-end flows are reliably green,
- no known, specification-relevant release blockers remain open,
- documentation, configuration, and artifact generation match the real final state,
- an error-free final state ready for handover exists.

---
## Closing assessment

The work packages cover the complete M8 target scope from the binding specifications and cut the closing milestone more precisely than before for the two-stage AI workflow:

- a concluding architecture and documentation reconciliation
- targeted cleanup of real residual inconsistencies in the final state
- sharpening of logging, error, and operator feedback
- consolidation of configuration, prompt, and operations documentation
- a deterministic end-to-end test basis
- targeted regression tests for core rules and edge cases
- reliable closing of critical coverage and mutation gaps
- an integrated overall audit with a documented findings list
- targeted, clearly scoped fixing of demonstrated release blockers
- a final overall audit with a traceable development sign-off

At the same time, the boundaries to M1–M7 are preserved:

- M8 invents no new product functionality,
- M8 introduces no second source of truth,
- M8 does not blanket-reopen M1/M2 topics, but only closes real residual gaps of the final state,
- M8 separates the overall audit, blocker fixing, and sign-off into independent work steps that AI 1 and AI 2 can use precisely.
---

**New file:** `docs/workpackages/V1.1 - Abschlussnachweis.md` (149 lines)
# V1.1 – Closing record

## Date and affected modules

**Date:** 2026-04-09

**Affected modules:**

| Module | Type of change |
|---|---|
| `pdf-umbenenner-application` | New configuration types (`MultiProviderConfiguration`, `ProviderConfiguration`, `AiProviderFamily`) |
| `pdf-umbenenner-adapter-out` | New Anthropic adapter (`AnthropicClaudeHttpAdapter`), new parser (`MultiProviderConfigurationParser`), new validator (`MultiProviderConfigurationValidator`), migrator (`LegacyConfigurationMigrator`), schema migration (`ai_provider` column), updated OpenAI adapter (`OpenAiHttpAdapter`), updated properties adapter (`PropertiesConfigurationPortAdapter`) |
| `pdf-umbenenner-bootstrap` | Provider selector (`AiProviderSelector`), updated `BootstrapRunner` (migration, provider selection, logging) |
| `pdf-umbenenner-adapter-in-cli` | No functional change |
| `pdf-umbenenner-domain` | No change |
| `config/` | Example properties files updated to the new schema |
| `docs/betrieb.md` | Sections on AI provider selection and migration added |

---
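The provider selection added in `pdf-umbenenner-bootstrap` can be sketched in miniature; the port interface and the returned stand-ins are deliberate simplifications — the real wiring constructs `OpenAiHttpAdapter` or `AnthropicClaudeHttpAdapter`:

```java
// Minimal sketch of the bootstrap provider selection. The port interface
// and the returned stand-ins are simplified assumptions; the real wiring
// constructs OpenAiHttpAdapter or AnthropicClaudeHttpAdapter.
enum AiProviderFamily { OPENAI_COMPATIBLE, CLAUDE }

interface AiPort { String describe(); }

final class AiProviderSelector {
    /** Fails hard for unsupported providers, mirroring the bootstrap behavior. */
    static AiPort select(AiProviderFamily active) {
        if (active == AiProviderFamily.OPENAI_COMPATIBLE) {
            return () -> "openai-compatible adapter";
        }
        if (active == AiProviderFamily.CLAUDE) {
            return () -> "anthropic claude adapter";
        }
        throw new IllegalStateException("no implementation for provider: " + active);
    }
}
```

Keeping the selection in one place means the fail-hard cases (`bootstrapFailsHardWhenActiveProviderUnknown`, `bootstrapFailsHardWhenSelectedProviderHasNoImplementation`) are testable without starting the whole application.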
## Mandatory Test Cases per Work Package

### AP-001 – Introduce the configuration schema

| Test case | Class | Status |
|---|---|---|
| `parsesNewSchemaWithOpenAiCompatibleActive` | `MultiProviderConfigurationTest` | green |
| `parsesNewSchemaWithClaudeActive` | `MultiProviderConfigurationTest` | green |
| `claudeBaseUrlDefaultsWhenMissing` | `MultiProviderConfigurationTest` | green |
| `rejectsMissingActiveProvider` | `MultiProviderConfigurationTest` | green |
| `rejectsUnknownActiveProvider` | `MultiProviderConfigurationTest` | green |
| `rejectsMissingMandatoryFieldForActiveProvider` | `MultiProviderConfigurationTest` | green |
| `acceptsMissingMandatoryFieldForInactiveProvider` | `MultiProviderConfigurationTest` | green |
| `envVarOverridesPropertiesApiKeyForActiveProvider` | `MultiProviderConfigurationTest` | green |
| `envVarOnlyResolvesForActiveProvider` | `MultiProviderConfigurationTest` | green |
| Existing tests remain green | `PropertiesConfigurationPortAdapterTest`, `StartConfigurationValidatorTest` | green |
### AP-002 – Legacy migration with `.bak`

| Test case | Class | Status |
|---|---|---|
| `migratesLegacyFileWithAllFlatKeys` | `LegacyConfigurationMigratorTest` | green |
| `createsBakBeforeOverwriting` | `LegacyConfigurationMigratorTest` | green |
| `bakSuffixIsIncrementedIfBakExists` | `LegacyConfigurationMigratorTest` | green |
| `noOpForAlreadyMigratedFile` | `LegacyConfigurationMigratorTest` | green |
| `reloadAfterMigrationSucceeds` | `LegacyConfigurationMigratorTest` | green |
| `migrationFailureKeepsBak` | `LegacyConfigurationMigratorTest` | green |
| `legacyDetectionRequiresAtLeastOneFlatKey` | `LegacyConfigurationMigratorTest` | green |
| `legacyValuesEndUpInOpenAiCompatibleNamespace` | `LegacyConfigurationMigratorTest` | green |
| `unrelatedKeysSurviveUnchanged` | `LegacyConfigurationMigratorTest` | green |
| `inPlaceWriteIsAtomic` | `LegacyConfigurationMigratorTest` | green |
### AP-003 – Bootstrap provider selection and switchover of the existing OpenAI adapter

| Test case | Class | Status |
|---|---|---|
| `bootstrapWiresOpenAiCompatibleAdapterWhenActive` | `AiProviderSelectorTest` | green |
| `bootstrapFailsHardWhenActiveProviderUnknown` | `AiProviderSelectorTest` | green |
| `bootstrapFailsHardWhenSelectedProviderHasNoImplementation` | `AiProviderSelectorTest` | green |
| `openAiAdapterReadsValuesFromNewNamespace` | `OpenAiHttpAdapterTest` | green |
| `openAiAdapterBehaviorIsUnchanged` | `OpenAiHttpAdapterTest` | green |
| `activeProviderIsLoggedAtRunStart` | `BootstrapRunnerTest` | green |
| `existingDocumentProcessingTestsRemainGreen` | `BatchRunEndToEndTest` | green |
| `legacyFileEndToEndStillRuns` | `BootstrapRunnerTest` | green |
### AP-004 – Persistence: additive provider identifier

| Test case | Class | Status |
|---|---|---|
| `addsProviderColumnOnFreshDb` | `SqliteAttemptProviderPersistenceTest` | green |
| `addsProviderColumnOnExistingDbWithoutColumn` | `SqliteAttemptProviderPersistenceTest` | green |
| `migrationIsIdempotent` | `SqliteAttemptProviderPersistenceTest` | green |
| `existingRowsKeepNullProvider` | `SqliteAttemptProviderPersistenceTest` | green |
| `newAttemptsWriteOpenAiCompatibleProvider` | `SqliteAttemptProviderPersistenceTest` | green |
| `newAttemptsWriteClaudeProvider` | `SqliteAttemptProviderPersistenceTest` | green |
| `repositoryReadsProviderColumn` | `SqliteAttemptProviderPersistenceTest` | green |
| `legacyDataReadingDoesNotFail` | `SqliteAttemptProviderPersistenceTest` | green |
| `existingHistoryTestsRemainGreen` | `SqliteAttemptProviderPersistenceTest` | green |
### AP-005 – Implement and wire the native Anthropic adapter

| Test case | Class | Status |
|---|---|---|
| `claudeAdapterBuildsCorrectRequest` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterUsesEnvVarApiKey` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterFallsBackToPropertiesApiKey` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterFailsValidationWhenBothKeysMissing` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterParsesSingleTextBlock` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterConcatenatesMultipleTextBlocks` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterIgnoresNonTextBlocks` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterFailsOnEmptyTextContent` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterMapsHttp401AsTechnical` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterMapsHttp429AsTechnical` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterMapsHttp500AsTechnical` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterMapsTimeoutAsTechnical` | `AnthropicClaudeHttpAdapterTest` | green |
| `claudeAdapterMapsUnparseableJsonAsTechnical` | `AnthropicClaudeHttpAdapterTest` | green |
| `bootstrapSelectsClaudeWhenActive` | `AiProviderSelectorTest` | green |
| `claudeProviderIdentifierLandsInAttemptHistory` | `AnthropicClaudeAdapterIntegrationTest` | green |
| `existingOpenAiPathRemainsGreen` | all `OpenAiHttpAdapterTest` tests | green |
### AP-006 – Regression, smoke, docs, completion report

| Test case | Class | Status |
|---|---|---|
| `smokeBootstrapWithOpenAiCompatibleActive` | `BootstrapSmokeTest` | green |
| `smokeBootstrapWithClaudeActive` | `BootstrapSmokeTest` | green |
| `e2eMigrationFromLegacyDemoConfig` | `ProviderIdentifierE2ETest` | green |
| `regressionExistingOpenAiSuiteGreen` | `ProviderIdentifierE2ETest` | green |
| `e2eClaudeRunWritesProviderIdentifierToHistory` | `ProviderIdentifierE2ETest` | green |
| `e2eOpenAiRunWritesProviderIdentifierToHistory` | `ProviderIdentifierE2ETest` | green |
| `legacyDataFromBeforeV11RemainsReadable` | `ProviderIdentifierE2ETest` | green |

---
## Verified Properties

| Property | Evidence |
|---|---|
| Two provider families supported | `AiProviderSelectorTest`, `BootstrapSmokeTest` |
| Exactly one active per run | `MultiProviderConfigurationTest`, `BootstrapSmokeTest` |
| No automatic fallback | no fallback logic in `AiProviderSelector` or the application layer |
| Business contract (`NamingProposal`) unchanged | `AiResponseParser`, `AiNamingService` unchanged; both adapters return the same domain type |
| Persistence backward compatible | `SqliteAttemptProviderPersistenceTest`, `legacyDataFromBeforeV11RemainsReadable` |
| Migration proven | `LegacyConfigurationMigratorTest`, `e2eMigrationFromLegacyDemoConfig` |
| `.bak` backup proven | `LegacyConfigurationMigratorTest.createsBakBeforeOverwriting`, `e2eMigrationFromLegacyDemoConfig` |
| Active provider is logged | `BootstrapRunnerTest.activeProviderIsLoggedAtRunStart` |
| No architecture violations | no `Application` or `Domain` code knows OpenAI- or Claude-specific types |
| No new libraries | the Anthropic adapter uses the Java HTTP Client and `org.json` (both already established in the repo) |

---
## Operator Task

Anyone who has so far used the environment variable `PDF_UMBENENNER_API_KEY` or another custom variable for the OpenAI-compatible API key must switch it to **`OPENAI_COMPATIBLE_API_KEY`**. The application accepts only this canonical environment variable; older proprietary names are not evaluated automatically.

---
## Build Result

Build command:

```
.\mvnw.cmd clean verify -pl pdf-umbenenner-domain,pdf-umbenenner-application,pdf-umbenenner-adapter-out,pdf-umbenenner-adapter-in-cli,pdf-umbenenner-bootstrap --also-make
```

Build status: **SUCCESSFUL**; all tests green, mutation tests executed in all modules.
596  docs/workpackages/V1.1 - Arbeitspakete.md  (Normal file)
@@ -0,0 +1,596 @@
# V1.1 – Work Packages

> **Active extension:** An additional AI provider family, **Anthropic Claude**, via the native Messages API, alongside the existing OpenAI-compatible integration. A deliberately minimal extension of the approved baseline.

> **Location in the repository:** `docs/workpackages/V1.1 - Arbeitspakete.md`

---

## 0. Reading Order for Every Work Package

Read **in full** before each AP:
1. `CLAUDE.md`
2. `docs/specs/technik-und-architektur.md`
3. `docs/specs/fachliche-anforderungen.md`
4. this document: sections 1 through 6
5. **only** the active work package from section 7

Do not jump ahead. Do not guess. In case of genuine ambiguity, state it briefly instead of inventing.

---
## 1. Working Method (Binding)

These rules replace the non-existent `WORKFLOW.md` and apply to all APs in this document.

### 1.1 Scope discipline
- Implement **only** the active work package.
- Do not anticipate content of later work packages.
- No cosmetic refactorings without a direct relation to the AP.
- No renames outside the AP scope.
- Before making changes, locate the affected classes via type search in the repo, **not** via assumed paths.
### 1.2 Build and test obligations
Build command from the project root, identical for all APs:

```
.\mvnw.cmd clean verify -pl pdf-umbenenner-domain,pdf-umbenenner-application,pdf-umbenenner-adapter-out,pdf-umbenenner-adapter-in-cli,pdf-umbenenner-bootstrap --also-make
```

- After every substantial change: run the build.
- Before completing an AP: the build must pass **without errors**, all tests green.
- If the build fails: fix the cause properly, do not paper over it.
- Existing tests must not be silently deleted or disabled. They are **adapted** when necessary, and the reason is documented in the AP output.
### 1.3 Mandatory tests per AP
- Every new class with functionally or technically relevant logic gets at least one unit test.
- Every class changed in an AP that previously had tests keeps tests; affected tests are adapted.
- Each AP has a list of **critical mandatory test cases** (see the respective AP). These must be implemented by name.
- Beyond that, the usual repo practice applies (coverage, PIT mutation tests in the directly affected modules, where already established).

### 1.4 Documentation
Maintained per AP, where relevant:
- JavaDoc and `package-info` of the touched classes
- configuration examples
- directly affected repository documents

### 1.5 Naming rule
Version or AP identifiers must **not** appear in code, comments, or JavaDoc:
- Forbidden: `V1.0`, `V1.1`, `M1`–`M8`, `AP-001` … `AP-006`
- Instead: timeless technical names.
### 1.6 Mandatory output format at the end of every AP
At the end of working on an AP, Sonnet outputs **exactly** this block:

```
- Scope fulfilled: yes/no
- Changed files:
  - <path>
  - ...
- New files:
  - <path>
  - ...
- Build command: <command used>
- Build status: SUCCESSFUL / FAILED
- Mandatory tests implemented: <list of the test cases required by name>
- Open points: none / <description>
- Risks: none / <description>
```

---
## 2. Extension Goal and Non-Goals

### 2.1 Goal
- The existing OpenAI-compatible AI path remains usable unchanged.
- In addition, the **native Anthropic Messages API** is integrated as a second, equally supported provider family.
- Exactly **one** provider is active per run, controlled solely via configuration.
- No automatic fallback, no parallel use, no profile management.
- The business AI contract (`NamingProposal`) remains unchanged.
- Existing properties files are migrated to the new schema in a controlled way on first start; a `.bak` backup is created automatically beforehand.

### 2.2 Explicitly out of scope
- provider families beyond the two explicitly supported ones
- profile management with multiple configurations per provider family
- automatic fallback switching
- parallel use of multiple providers in one run
- changes to the business result contract
- changes to the file naming rules, retry rules, or batch operating model
- persistence or schema changes beyond the single additive provider identifier column

### 2.3 Architectural fidelity (non-negotiable)
- strict hexagonal architecture, dependencies point inward
- `AiNamingPort` remains provider-neutral
- provider-specific endpoints, headers, auth, and request/response formats live **exclusively** in the respective outbound adapter
- no direct adapter-to-adapter coupling, no shared "abstract AI adapter" intermediate layer
- provider selection is a **bootstrap wiring decision**

---
## 3. Target Configuration State (Binding)

### 3.1 Properties schema
```properties
# existing, unchanged parameters
source.folder=...
target.folder=...
sqlite.file=...
max.retries.transient=...
max.pages=...
max.text.characters=...
prompt.template.file=...
runtime.lock.file=...
log.directory=...
log.level=...
log.ai.sensitive=...

# new provider selection (mandatory)
ai.provider.active=openai-compatible

# OpenAI-compatible provider family
ai.provider.openai-compatible.baseUrl=...
ai.provider.openai-compatible.model=...
ai.provider.openai-compatible.timeoutSeconds=...
ai.provider.openai-compatible.apiKey=...

# Anthropic provider family (Claude)
ai.provider.claude.baseUrl=https://api.anthropic.com
ai.provider.claude.model=...
ai.provider.claude.timeoutSeconds=...
ai.provider.claude.apiKey=...
```
### 3.2 Permitted values for `ai.provider.active`
- `openai-compatible`
- `claude`

Any other value is an invalid start configuration and leads to exit code `1`.

### 3.3 Mandatory values per active provider
| Provider | Mandatory | Optional / with default |
|---|---|---|
| `openai-compatible` | `baseUrl`, `model`, `timeoutSeconds`, `apiKey` (env takes precedence) | – |
| `claude` | `model`, `timeoutSeconds`, `apiKey` (env takes precedence) | `baseUrl` (default `https://api.anthropic.com`) |

No mandatory values are enforced for the **inactive** provider.
### 3.4 Environment variables for API keys
| Provider | Environment variable |
|---|---|
| `openai-compatible` | `OPENAI_COMPATIBLE_API_KEY` |
| `claude` | `ANTHROPIC_API_KEY` |

- Per provider: the environment variable takes **precedence** over the properties value of the same provider family.
- Keys of different providers are **never** mixed.
- If operations has so far used a different environment variable for the OpenAI-compatible key, the operator must switch it to `OPENAI_COMPATIBLE_API_KEY`. This must be documented in the completion report (AP-006).
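The precedence rule of section 3.4 can be sketched as a small pure function. This is a minimal illustration, not code from the repository; class and method names are invented:

```java
/** Illustrative sketch of the API-key precedence rule from section 3.4. */
public final class ApiKeyResolution {

    /**
     * Resolves the effective API key for the ACTIVE provider only:
     * a non-blank environment variable wins over the properties value;
     * if both are blank, the start configuration is invalid.
     */
    public static String resolveEffectiveKey(String envValue, String propertiesValue) {
        if (envValue != null && !envValue.isBlank()) {
            return envValue;
        }
        if (propertiesValue != null && !propertiesValue.isBlank()) {
            return propertiesValue;
        }
        throw new IllegalStateException("invalid start configuration: no API key for active provider");
    }

    public static void main(String[] args) {
        System.out.println(resolveEffectiveKey("env-key", "props-key")); // env var wins
        System.out.println(resolveEffectiveKey(null, "props-key"));      // falls back to properties
    }
}
```

Note that the function is only ever called for the active provider, which is exactly why the inactive provider's env var can never leak in.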
### 3.5 Legacy form (before V1.1)
Unambiguously identified by at least one of the flat keys:
```
api.baseUrl
api.model
api.timeoutSeconds
api.key
```
without `ai.provider.active` being present.
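The detection rule above reduces to a single predicate over the key set of the loaded file. A minimal sketch with invented names:

```java
import java.util.Set;

/** Illustrative sketch of the legacy-form detection from section 3.5. */
public final class LegacyFormDetection {

    private static final Set<String> LEGACY_FLAT_KEYS =
            Set.of("api.baseUrl", "api.model", "api.timeoutSeconds", "api.key");

    /** Legacy iff at least one flat key is present and ai.provider.active is absent. */
    public static boolean isLegacy(Set<String> propertyKeys) {
        if (propertyKeys.contains("ai.provider.active")) {
            return false;
        }
        return propertyKeys.stream().anyMatch(LEGACY_FLAT_KEYS::contains);
    }

    public static void main(String[] args) {
        System.out.println(isLegacy(Set.of("api.model", "source.folder")));          // true
        System.out.println(isLegacy(Set.of("ai.provider.active", "source.folder"))); // false
    }
}
```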
---
## 4. Anthropic Messages API – Binding Technical Fact Sheet

> Source: official Claude API documentation. These values are binding and are not to be invented, derived, or "improved".

### 4.1 Endpoint and method
- Method: `POST`
- URL: `{baseUrl}/v1/messages`
- Default `baseUrl`: `https://api.anthropic.com`

### 4.2 Mandatory headers
| Header | Value |
|---|---|
| `x-api-key` | API key from `ANTHROPIC_API_KEY` (env) or `ai.provider.claude.apiKey` (properties) |
| `anthropic-version` | `2023-06-01` |
| `content-type` | `application/json` |

Do not use `Authorization: Bearer …`. Anthropic uses `x-api-key`.
### 4.3 Request body (relevant fields)
```json
{
  "model": "<model name from ai.provider.claude.model>",
  "max_tokens": <integer, > 0, mandatory>,
  "system": "<optional, top-level field - NOT a message with role=system>",
  "messages": [
    { "role": "user", "content": "<prompt text>" }
  ]
}
```
- `max_tokens` is **mandatory** (a difference from OpenAI). Concrete value: sensibly hard-wired in the adapter, large enough for the application's JSON response. No new properties key.
- `system` is **not** modeled as a message with `role=system`. Anthropic accepts only `user` and `assistant` in the `messages` array; a system prompt goes exclusively into the top-level `system` field.
- The application's existing prompt is passed **unchanged** as the content of the single `user` message. If the existing prompt mechanism has a system component, it moves into the `system` field; otherwise `system` is omitted.
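The three rules above can be sketched as a small request builder. The real adapter would use `org.json` (already in the repo); this sketch concatenates strings so it stays dependency-free, and the `max_tokens` value of 1024 is an assumption for illustration only:

```java
/**
 * Illustrative sketch of assembling the request body from section 4.3.
 * Not the adapter's real code; the max_tokens value is an assumed example.
 */
public final class ClaudeRequestSketch {

    private static final int MAX_TOKENS = 1024; // hard-wired in the adapter, no properties key

    /** Builds the JSON body; systemPrompt may be null, in which case "system" is omitted. */
    public static String buildBody(String model, String systemPrompt, String userPrompt) {
        StringBuilder json = new StringBuilder("{");
        json.append("\"model\":\"").append(escape(model)).append("\",");
        json.append("\"max_tokens\":").append(MAX_TOKENS).append(",");
        if (systemPrompt != null && !systemPrompt.isBlank()) {
            // top-level field, NOT a message with role=system
            json.append("\"system\":\"").append(escape(systemPrompt)).append("\",");
        }
        json.append("\"messages\":[{\"role\":\"user\",\"content\":\"")
            .append(escape(userPrompt)).append("\"}]}");
        return json.toString();
    }

    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        System.out.println(buildBody("claude-model-placeholder", null, "Name this PDF."));
    }
}
```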
### 4.4 Response body (relevant fields)
```json
{
  "id": "...",
  "type": "message",
  "role": "assistant",
  "content": [
    { "type": "text", "text": "<the actual answer>" }
  ],
  "stop_reason": "...",
  "usage": { "input_tokens": 0, "output_tokens": 0 }
}
```
- The text relevant to the application is obtained by **concatenating all blocks in `content` with `type == "text"`**, in order.
- Other block types are ignored.
- If the API returns no `text` block at all, that is a technical error in the adapter (classified like an empty/unusable response body).
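The extraction rule can be sketched over already-parsed content blocks (the real adapter would first parse the JSON, e.g. with `org.json`; the `Block` record here is an invented simplification):

```java
import java.util.List;

/** Illustrative sketch of the text-extraction rule from section 4.4. */
public final class ClaudeTextExtraction {

    /** A content block as (type, text); text may be null for non-text blocks. */
    public record Block(String type, String text) {}

    /** Concatenates all "text" blocks in order; no usable text at all is a technical error. */
    public static String extractText(List<Block> content) {
        StringBuilder out = new StringBuilder();
        boolean found = false;
        for (Block block : content) {
            if ("text".equals(block.type()) && block.text() != null) {
                out.append(block.text());
                found = true;
            } // other block types are ignored
        }
        if (!found || out.toString().isBlank()) {
            throw new IllegalStateException("technical error: response contains no usable text block");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(extractText(List.of(
                new Block("text", "{\"name\":"),
                new Block("tool_use", null),
                new Block("text", "\"x\"}"))));
    }
}
```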
### 4.5 Error classification in the Claude adapter
| Symptom | Classification | Note |
|---|---|---|
| HTTP 4xx (except 429) | technical error | auth errors (401/403) fall in here |
| HTTP 429 | technical error | rate limit |
| HTTP 5xx | technical error | |
| Timeout | technical error | |
| Connection failed | technical error | |
| JSON not parseable | technical error | |
| No `content[*].text` block | technical error | |
| Response text not parseable into `NamingProposal` | handled by the application's existing response validation | not handled in the adapter |

All technical adapter errors are mapped onto the application's **existing** transient error semantics. **No** new error category is introduced.
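For the HTTP rows of the table, the classification collapses to "every non-2xx status is a technical error" (timeouts, connection failures, and parse failures are classified the same way via exceptions). A minimal sketch with invented names:

```java
/** Illustrative sketch of the HTTP-status part of the section 4.5 classification. */
public final class ClaudeHttpClassification {

    /** True for every status the section 4.5 table classifies as a technical error. */
    public static boolean isTechnicalError(int httpStatus) {
        // 4xx incl. 401/403/429, 5xx, and everything else outside the 2xx range
        return httpStatus < 200 || httpStatus >= 300;
    }

    public static void main(String[] args) {
        System.out.println(isTechnicalError(200)); // false
        System.out.println(isTechnicalError(429)); // true
        System.out.println(isTechnicalError(500)); // true
    }
}
```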
---
## 5. Binding Rules for Every AP

1. **Minimal extension.** Change nothing that is not strictly required for the extension.
2. **One uniform business AI contract.** `NamingProposal` remains unchanged. No provider-specific branching in application/domain.
3. **Exactly one active provider.** No fallback, no profile management.
4. **The properties file remains authoritative.** No alternative configuration source.
5. **The existing OpenAI path remains functionally unchanged.**
6. **Architecture boundaries** (see 2.3) are never broken.
7. **Backward compatibility of the SQLite data** is preserved.
8. **The build must pass without errors at the end of every AP.**
9. **All mandatory test cases of the AP** are implemented.

---
## 6. Granularity and Sequence

Six work packages in this mandatory order:

| AP | Topic | Risk | Character |
|---|---|---|---|
| AP-001 | Introduce the configuration schema (additive) | low | pure extension |
| AP-002 | Legacy migration with `.bak` | medium | file rewrite, protected by a backup |
| AP-003 | Bootstrap provider selection + switch over the existing adapter | high | behavioral change in the wiring |
| AP-004 | Persistence: additive provider identifier | medium | additive DB schema migration |
| AP-005 | Implement and wire the native Anthropic adapter | medium | new adapter class |
| AP-006 | Regression, smoke, docs, completion report | low | safeguarding |

---

# 7. Work Packages

---
## AP-001 – Introduce the Configuration Schema (Additive)

### Prerequisite
None.

### Goal
The new, nested properties schema (section 3.1) is introduced in the code as a parseable and validatable structure. The existing read and validation path remains **untouched**; the new schema is introduced additively in parallel. There is **no** switch in the bootstrap and **no** migration in this AP.

### Concrete steps
1. In the `pdf-umbenenner-application` module (or the module where today's configuration classes live; locate them via type search), introduce **new** configuration types, at least:
   - a representation of a single provider configuration (fields: `model`, `timeoutSeconds`, `baseUrl`, `apiKey`)
   - a representation of the provider selection (`activeProviderId`) plus a map or two fields for the two provider families
   - a clearly named enum type or constant string set for the permitted values `openai-compatible` and `claude`
2. In the outbound adapter module, extend the **properties parser** so that it recognizes the new keys from section 3.1 and reads them into the new types from step 1. The existing parser for the old flat keys remains **unchanged** and operational (parallel recognition).
3. Introduce a **validation** for the new types. It checks:
   - `ai.provider.active` is set and is a permitted value
   - all mandatory values of the active provider are present (table 3.3)
   - `timeoutSeconds` is a positive integer
   - for Claude: the default `baseUrl` is set when the value is missing
   - no mandatory values are enforced for the **inactive** provider
   - **API key resolution:** the environment variable of the active provider (table 3.4) takes precedence over the properties value; if both are empty, the configuration is invalid
4. **The bootstrap and the existing adapters are not switched over in this AP.** The new types are reachable exclusively through new tests. The default run of the application continues to use the old classes.
5. Add JavaDoc for all new classes and methods.
6. Do **not** change the configuration example (`*.example.properties` or similar) in this AP. That follows in AP-002 together with the migration.
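The validation rules of step 3 can be sketched as one pure method. The report mentions a `MultiProviderConfigurationValidator`; this sketch is not that class's real code, and all names are illustrative:

```java
import java.util.Map;
import java.util.Set;

/** Minimal sketch of the step-3 validation rules for the ACTIVE provider. */
public final class ProviderConfigValidationSketch {

    private static final Set<String> PERMITTED = Set.of("openai-compatible", "claude");
    private static final String CLAUDE_DEFAULT_BASE_URL = "https://api.anthropic.com";

    /**
     * Validates only the active provider's values and returns its effective baseUrl
     * (applying the Claude default). Throws on any invalid start configuration.
     */
    public static String validate(String activeProvider, Map<String, String> activeValues) {
        if (activeProvider == null || !PERMITTED.contains(activeProvider)) {
            throw new IllegalStateException("ai.provider.active missing or not permitted: " + activeProvider);
        }
        require(activeValues, "model");
        require(activeValues, "apiKey"); // checked after env-var precedence has been applied
        int timeout = Integer.parseInt(require(activeValues, "timeoutSeconds"));
        if (timeout <= 0) {
            throw new IllegalStateException("timeoutSeconds must be a positive integer");
        }
        String baseUrl = activeValues.get("baseUrl");
        if (baseUrl == null || baseUrl.isBlank()) {
            if ("claude".equals(activeProvider)) {
                return CLAUDE_DEFAULT_BASE_URL; // default applies only to Claude
            }
            throw new IllegalStateException("baseUrl is mandatory for openai-compatible");
        }
        return baseUrl;
    }

    private static String require(Map<String, String> values, String key) {
        String value = values.get(key);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("mandatory value missing for active provider: " + key);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(validate("claude",
                Map.of("model", "m", "timeoutSeconds", "30", "apiKey", "k")));
    }
}
```

The inactive provider never reaches this method, which is how rule "no mandatory values for the inactive provider" falls out for free.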
### Mandatory test cases (critical, to be implemented by name)
1. `parsesNewSchemaWithOpenAiCompatibleActive` – complete new schema, OpenAI active, all mandatory values set → parsed successfully, validation green.
2. `parsesNewSchemaWithClaudeActive` – complete new schema, Claude active, all mandatory values set → parsed successfully, validation green.
3. `claudeBaseUrlDefaultsWhenMissing` – Claude active, `ai.provider.claude.baseUrl` missing → default `https://api.anthropic.com` is set, validation green.
4. `rejectsMissingActiveProvider` – `ai.provider.active` missing → validation fails with a clear message.
5. `rejectsUnknownActiveProvider` – `ai.provider.active=foo` → validation fails.
6. `rejectsMissingMandatoryFieldForActiveProvider` – the active provider has an empty mandatory field → validation fails.
7. `acceptsMissingMandatoryFieldForInactiveProvider` – the inactive provider is incomplete → validation green.
8. `envVarOverridesPropertiesApiKeyForActiveProvider` – `OPENAI_COMPATIBLE_API_KEY` set, properties key also set → the effective key is the one from the env var. Analogous for `ANTHROPIC_API_KEY`.
9. `envVarOnlyResolvesForActiveProvider` – env var set only for the inactive provider, the active provider has a properties key → the effective key is the active provider's properties key; the inactive provider's env var is ignored.
10. Existing tests remain green – all previous configuration tests continue to pass.

Additional test categories: unit tests for the new types (equality, defaults), parser tests, validator tests.
### Explicitly NOT part of this AP
- migration of the legacy file
- `.bak` backup
- bootstrap switchover
- changes to the existing OpenAI adapter
- native Claude adapter
- persistence changes
- logging changes

### Definition of Done
- build passes without errors
- all mandatory test cases implemented and green
- existing tests green
- JavaDoc complete for new classes
- mandatory output block emitted

---
## AP-002 – Legacy Migration with `.bak`

### Prerequisite
AP-001 completed.

### Goal
On the first start with a detected legacy form, the properties file is migrated to the new schema in a controlled way. A `.bak` backup is created before every migration. After a successful migration, the application **still** runs on the old bootstrap path (the switchover follows in AP-003); but the file on disk is already in the new format and immediately readable via the new schema on the next start.
### Concrete steps
1. Create a new component in the outbound adapter module that works purely at the properties-file level (no HTTP, no DB access). Responsibilities:
   - detect the legacy form (section 3.5)
   - create the `.bak` backup: `<filename>.bak`. If `.bak` already exists, back up with an ascending numeric suffix (`<filename>.bak`, `<filename>.bak.1`, …); **never** overwrite.
   - rewrite values according to this table:

     | Legacy | Target |
     |---|---|
     | `api.baseUrl` | `ai.provider.openai-compatible.baseUrl` |
     | `api.model` | `ai.provider.openai-compatible.model` |
     | `api.timeoutSeconds` | `ai.provider.openai-compatible.timeoutSeconds` |
     | `api.key` | `ai.provider.openai-compatible.apiKey` |

   - add `ai.provider.active=openai-compatible`.
   - insert empty/commented-out placeholders for the Claude section with a short explanatory comment (one block, max. 6 lines).
   - carry over all remaining keys (`source.folder`, `target.folder`, `sqlite.file`, `max.*`, `prompt.template.file`, `runtime.lock.file`, `log.*`) **unchanged** and in **stable order**.
   - write the migrated file in place (`.tmp` + atomic move/rename, no truncate-and-write on the original).
   - afterwards, load the file again via the **new** parser from AP-001 and validate it via the new validator. If that fails, it is a hard startup failure (exit code 1, clear message, the `.bak` is preserved).
2. The migration is invoked at program start **before** the existing configuration loading, as soon as the file is known. This call happens in exactly one place in the bootstrap and is clearly nameable as its own method.
3. If **no** legacy form is detected (i.e., the schema is already new), nothing happens: no `.bak`, no writes.
4. Do **not** switch over the existing ConfigurationPort implementation; that happens in AP-003. After AP-002 the application keeps running functionally as before; only its input file is now readable in both forms.
5. Update the configuration example in the repo (e.g. `*.example.properties`) to the **new** schema. The file shows both provider sections with descriptive placeholder values.
6. Add JavaDoc and a short section in the repo docs on the migration (what happens, when, how it is backed up, what happens on failure).
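The backup-naming rule from step 1 is a small pure function and worth pinning down precisely; the class name here is invented, not the repository's `LegacyConfigurationMigrator`:

```java
import java.util.Set;

/**
 * Illustrative sketch of the backup-naming rule: the first backup is
 * <filename>.bak, further ones get an ascending numeric suffix, and no
 * existing backup is ever overwritten.
 */
public final class BackupNaming {

    /** Picks the next free backup name, given the file names already present in the folder. */
    public static String nextBackupName(Set<String> existingNames, String fileName) {
        String candidate = fileName + ".bak";
        int suffix = 1;
        while (existingNames.contains(candidate)) {
            candidate = fileName + ".bak." + suffix++;
        }
        return candidate;
    }

    public static void main(String[] args) {
        System.out.println(nextBackupName(Set.of(), "app.properties"));                     // app.properties.bak
        System.out.println(nextBackupName(Set.of("app.properties.bak"), "app.properties")); // app.properties.bak.1
    }
}
```

For the in-place write itself, the standard JDK pattern fits the step's requirement: write the new content to a `.tmp` sibling first, then `java.nio.file.Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE)`, so the original is never observable in a partially written state.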
### Mandatory test cases
1. `migratesLegacyFileWithAllFlatKeys` – a legacy file with all four `api.*` keys is correctly migrated to the new schema; values stay identical in content; remaining keys stay unchanged.
2. `createsBakBeforeOverwriting` – no `.bak` exists before the migration; afterwards it exists with the **original content**.
3. `bakSuffixIsIncrementedIfBakExists` – `.bak` already exists → new backup as `.bak.1`. No backup is overwritten.
4. `noOpForAlreadyMigratedFile` – the file is already in the new schema → no write, no `.bak`.
5. `reloadAfterMigrationSucceeds` – after the migration, the new parser/validator from AP-001 loads the file without errors.
6. `migrationFailureKeepsBak` – the migration writes a faulty file (a test mock forces a validation error after writing) → the bootstrap reports a hard startup failure, the `.bak` is untouched.
7. `legacyDetectionRequiresAtLeastOneFlatKey` – a file with `ai.provider.active=...` and without `api.*` → not legacy, no migration.
8. `legacyValuesEndUpInOpenAiCompatibleNamespace` – the values `api.baseUrl`, `api.model`, `api.timeoutSeconds`, `api.key` land exactly in the four target keys; `ai.provider.active=openai-compatible` is set.
9. `unrelatedKeysSurviveUnchanged` – keys such as `source.folder`, `max.pages`, `log.level` are preserved with identical values.
10. `inPlaceWriteIsAtomic` – a test double for the file system proves: write `.tmp` first, then atomic move; there is no point at which the original is partially written.

Additional test categories: temporary files in `@TempDir`, repository/integration tests for the migration component.
### Explicitly NOT part
- switching the active configuration path in the bootstrap
- changes to the existing OpenAI adapter
- Claude adapter
- persistence
- logging changes beyond the migration messages

### Definition of Done
- build passes without errors, all mandatory test cases green
- example properties file in the new schema
- short migration docs in the repo
- mandatory output block emitted

---
## AP-003 – Bootstrap Provider Selection and Switchover of the Existing OpenAI Adapter

### Prerequisite
AP-001 and AP-002 completed.

### Goal
Based on `ai.provider.active`, the bootstrap module selects exactly one `AiNamingPort` implementation as the active implementation and wires it. From now on, the existing OpenAI-compatible adapter consumes its values from the `ai.provider.openai-compatible.*` namespace. Its functional behavior remains **unchanged**. The active provider is logged at run start.
### Konkret zu erledigende Schritte
|
||||
1. Im Bootstrap-Modul eine **Provider-Selektor-Komponente** einführen, die als Eingabe den Wert von `ai.provider.active` und alle bekannten `AiNamingPort`-Implementierungen erhält und genau eine zurückgibt. Initial kennt sie nur die OpenAI-Implementierung; die Erweiterung um Claude erfolgt in AP-005 an genau dieser Stelle.
|
||||
2. Bestehende `AiNamingPort`-Implementierung für die OpenAI-kompatible Schnittstelle so anpassen, dass sie die Werte aus `ai.provider.openai-compatible.*` konsumiert. Der bisherige fachliche Vertrag, das Request-/Response-Mapping und das Fehlerverhalten bleiben **identisch**.
|
||||
3. Bestehenden ConfigurationPort/`Configuration`-Lesepfad so umstellen, dass intern **nur noch** das neue Schema verwendet wird. Die alten flachen Klassen/Methoden, die nur zum Lesen von `api.*` dienten, werden entfernt – aber **nur**, wenn sie nirgends sonst benötigt werden (per Suche prüfen). Falls noch Verweise existieren, wird der entsprechende Konsument im selben AP auf das neue Schema umgestellt.
|
||||
4. Bestehende Konfigurations-Tests des Repos auf das neue Schema umstellen. Tests, die explizit das alte flache Schema geprüft haben, werden zu Migrations-Tests verschoben (gehört bereits zu AP-002) **oder** auf das neue Schema umgeschrieben. Kein Test wird stillschweigend gelöscht.
|
||||
5. Logging-Anbindung erweitern: beim Laufstart wird der **aktive Provider-Identifikator** geloggt (Standard-Loglevel `INFO`). Alle übrigen geforderten Log-Inhalte (siehe `CLAUDE.md`, Logging-Mindestumfang) bleiben unverändert.
|
||||
6. Sicherstellen, dass die Sensibilitätsregel für KI-Inhalte unverändert greift und provider-unabhängig gilt.
|
||||
7. Adapter-zu-Adapter-Kopplung aktiv vermeiden: Der Provider-Selektor lebt im Bootstrap, **nicht** im Adapter-Out-Modul.
|
||||
8. JavaDoc für Selektor und betroffene Klassen ergänzen.
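The selector from step 1 can be sketched roughly as follows. This is an illustrative sketch only: the real `AiNamingPort` has a richer contract, and the assumption that each implementation can report its own provider identifier is made up here for brevity.

```java
import java.util.List;

final class ProviderSelectorSketch {

    /** Stand-in for the real AiNamingPort; assumes implementations know their identifier. */
    interface NamingPort {
        String providerId();
    }

    /** Returns exactly one implementation or fails hard, matching the AP-003 negative tests. */
    static NamingPort select(String activeProvider, List<NamingPort> known) {
        return known.stream()
                .filter(p -> p.providerId().equals(activeProvider))
                .findFirst()
                .orElseThrow(() -> new IllegalStateException(
                        "No AiNamingPort implementation registered for provider: " + activeProvider));
    }
}
```

Registering a second implementation in AP-005 then means nothing more than adding it to the `known` list; the selection logic itself stays untouched.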

### Mandatory test cases

1. `bootstrapWiresOpenAiCompatibleAdapterWhenActive` – `ai.provider.active=openai-compatible` → selector returns the OpenAI implementation.

2. `bootstrapFailsHardWhenActiveProviderUnknown` – the value is syntactically set but is not a valid provider → hard startup failure, exit code 1.

3. `bootstrapFailsHardWhenSelectedProviderHasNoImplementation` – the value is `claude`, but no implementation is registered yet (state after AP-003) → hard startup failure with a clear message. This test is adjusted in AP-005 once Claude is registered.

4. `openAiAdapterReadsValuesFromNewNamespace` – adapter test: given `ai.provider.openai-compatible.*` values end up 1:1 in the HTTP request to the existing endpoint URL.

5. `openAiAdapterBehaviorIsUnchanged` – the existing adapter behavior test (request shape, response mapping, error classification) is migrated to the new configuration source and stays green.

6. `activeProviderIsLoggedAtRunStart` – a smoke or bootstrap test proves that the active provider appears in a defined log entry at run start.

7. `existingDocumentProcessingTestsRemainGreen` – all existing end-to-end/integration tests of the existing OpenAI path stay green, with adjusted configuration where necessary.

8. `legacyFileEndToEndStillRuns` – test double: the application starts with a legacy file → the AP-002 migration runs → the AP-003 bootstrap selects OpenAI → the run completes functionally as before.

Additional test categories: bootstrap/wiring tests, and a smoke test without a real external call where applicable.

### Explicitly NOT in scope

- Claude adapter
- extending persistence with the provider identifier
- new error semantics
- refactoring beyond the adapter wiring

### Definition of Done

- build passes, mandatory test cases green
- existing OpenAI path functionally unchanged
- active provider is logged at run start
- no remaining references to the old flat schema in the production path
- mandatory output block emitted

---

## AP-004 – Persistence: additive provider identifier

### Prerequisite

AP-003 completed.

### Goal

The SQLite schema is extended **additively** with a column for the provider identifier per attempt. Existing records remain readable and correctly interpretable (default value for legacy data). New attempts write the identifier of the provider that was active for that attempt.

### Concrete steps

1. Add a new column to the SQLite schema of the attempt history, e.g. `ai_provider TEXT NULL` (choose the column name per the existing repo convention, otherwise as suggested here). The column is nullable.

2. Implement the schema migration:

   - At program start, check whether the column exists; if not, add it via `ALTER TABLE`.
   - Existing rows keep the value `NULL`.
   - The migration must be idempotent (multiple starts without errors).

3. Extend the attempt-history write logic so that when a new attempt is created, the **identifier of the actively selected provider** is written along with it (`openai-compatible` or `claude`). The value comes from the provider selection already available since AP-003.

4. The document master record is **not** changed.

5. Adapt the read path so that the new value is read as well; existing mappers/domain types are minimally extended with an optional field. Application and domain gain no provider-specific code from this – the field remains an opaque string.

6. Add JavaDoc and a short section on the schema extension to the repo docs.
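The idempotency check from step 2 can be sketched with two small helpers. This is an assumption-laden sketch: the table name `attempt_history` and the class name are hypothetical, and the column list would in practice come from `PRAGMA table_info(<table>)` on the real connection.

```java
import java.util.Collection;

final class AttemptHistoryMigrationSketch {

    static final String COLUMN = "ai_provider";

    /** True if the column is missing; the column names would come from PRAGMA table_info(...). */
    static boolean needsMigration(Collection<String> existingColumns) {
        return existingColumns.stream().noneMatch(c -> c.equalsIgnoreCase(COLUMN));
    }

    /** Additive, nullable column; executing this only when needed makes the migration idempotent. */
    static String migrationSql(String table) {
        return "ALTER TABLE " + table + " ADD COLUMN " + COLUMN + " TEXT NULL";
    }
}
```

On startup the persistence adapter would read the column names, run `migrationSql(...)` only when `needsMigration(...)` is true, and therefore turn every subsequent start into a no-op, which is exactly what `migrationIsIdempotent` asserts.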

### Mandatory test cases

1. `addsProviderColumnOnFreshDb` – fresh DB → schema contains the new column.

2. `addsProviderColumnOnExistingDbWithoutColumn` – DB without the column (simulated legacy state) → migration adds the column as nullable.

3. `migrationIsIdempotent` – multiple starts change nothing and throw no errors.

4. `existingRowsKeepNullProvider` – legacy rows keep `NULL`.

5. `newAttemptsWriteOpenAiCompatibleProvider` – active provider OpenAI → a new attempt has `ai_provider='openai-compatible'`.

6. `newAttemptsWriteClaudeProvider` – active provider Claude (for this test the provider selection is mocked; in AP-005 the same test is repeated with the real Claude adapter) → `ai_provider='claude'`.

7. `repositoryReadsProviderColumn` – repository test: the stored value is read back correctly.

8. `legacyDataReadingDoesNotFail` – test with a DB file from the pre-V1.1 state: reading succeeds without errors, the new value is optional/empty.

9. `existingHistoryTestsRemainGreen` – all existing tests around the attempt history stay green, with minimal adjustments where necessary.

Additional test categories: repository tests against a real SQLite instance (in-memory or temporary), schema migration tests.

### Explicitly NOT in scope

- Claude adapter (follows in AP-005)
- changes to the document master record
- new sources of truth
- reporting/statistics

### Definition of Done

- build passes, mandatory test cases green
- existing data remains readable
- provider identifier is written for new attempts
- mandatory output block emitted

---

## AP-005 – Implement and wire the native Anthropic adapter

### Prerequisite

AP-001 through AP-004 completed.

### Goal

A second `AiNamingPort` implementation is created in the adapter-out module that talks to the **native Anthropic Messages API** (see the fact block in section 4). It is registered as the second option in the provider selector from AP-003. The adapter maps the Anthropic response onto the **existing** business contract; no special path is created in application or domain.

### Concrete steps

1. Create a new class in the adapter-out module that implements `AiNamingPort`. Name it per the existing repo convention; use a type search to see how the OpenAI implementation is named and proceed analogously.

2. Implement the HTTP call per fact block 4:

   - URL from `ai.provider.claude.baseUrl` (default `https://api.anthropic.com`) plus the path `/v1/messages`
   - method `POST`
   - headers `x-api-key`, `anthropic-version: 2023-06-01`, `content-type: application/json`
   - request body with `model`, `max_tokens`, `messages` (one `user` message with the existing prompt text), optionally `system` if the existing prompt mechanism has a system segment
   - timeout from `ai.provider.claude.timeoutSeconds`

3. Resolve the API key exactly per table 3.4: first `ANTHROPIC_API_KEY`, then `ai.provider.claude.apiKey`.

4. Process the response per 4.4: concatenate all `content[*].text` blocks in order. If no `text` block exists or the response cannot be parsed → technical adapter error per table 4.5.

5. Pass the resulting response text **unchanged** to the application's existing response handling (`NamingProposal` validation happens in application/domain as before).

6. Classify errors strictly per table 4.5. No new error classes.

7. Extend the provider selector from AP-003 with the new implementation. **No** shared base class between the two adapters, **no** helper class sharing HTTP logic. Whatever both adapters need comes from the HTTP/JSON standard already established in the repo, not from a new adapter intermediate layer.

8. Adjust the test `bootstrapFailsHardWhenSelectedProviderHasNoImplementation` created in AP-003 so that from now on it tests a new, still **unknown** provider value (the negative case is kept, but `claude` is now registered).

9. Extend the repo's configuration example with meaningful Claude example values.

10. JavaDoc for the new class and any new helper types.
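Steps 2–4 can be sketched with the JDK's own HTTP client. This is a sketch under assumptions: the class name is hypothetical, the JSON body is passed in pre-built, and the text-block concatenation operates on already-parsed blocks (the real adapter would parse the response with the JSON library established in the repo).

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class ClaudeAdapterSketch {

    /** Env var wins over the properties value, per table 3.4. */
    static String resolveApiKey(String envValue, String propertyValue) {
        if (envValue != null && !envValue.isBlank()) return envValue;
        if (propertyValue != null && !propertyValue.isBlank()) return propertyValue;
        throw new IllegalStateException("No Anthropic API key configured");
    }

    /** Builds the POST to <baseUrl>/v1/messages with the three mandatory headers. */
    static HttpRequest buildRequest(String baseUrl, String apiKey,
                                    long timeoutSeconds, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/v1/messages"))
                .header("x-api-key", apiKey)
                .header("anthropic-version", "2023-06-01")
                .header("content-type", "application/json")
                .timeout(Duration.ofSeconds(timeoutSeconds))
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    /** Concatenates all text blocks in order; non-text blocks are ignored, an empty result is an error. */
    static String joinTextBlocks(List<Map<String, String>> contentBlocks) {
        String joined = contentBlocks.stream()
                .filter(b -> "text".equals(b.get("type")))
                .map(b -> b.get("text"))
                .collect(Collectors.joining());
        if (joined.isEmpty()) throw new IllegalStateException("Response contains no text block");
        return joined;
    }
}
```

Because `HttpRequest` exposes its URI, method, and headers, the request-shape tests (`claudeAdapterBuildsCorrectRequest` and the key-resolution tests) can assert against the built request without any network access.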

### Mandatory test cases

1. `claudeAdapterBuildsCorrectRequest` – given prompt → HTTP request with the correct URL (`<baseUrl>/v1/messages`), method POST, all three mandatory headers, and a body containing `model`, `max_tokens > 0`, and `messages` with exactly one `user` message carrying the correct prompt.

2. `claudeAdapterUsesEnvVarApiKey` – `ANTHROPIC_API_KEY` set, properties value also set → the `x-api-key` header carries the env value.

3. `claudeAdapterFallsBackToPropertiesApiKey` – env var empty, properties value set → the `x-api-key` header carries the properties value.

4. `claudeAdapterFailsValidationWhenBothKeysMissing` – both empty → configuration error at startup (handled by the AP-001 validation).

5. `claudeAdapterParsesSingleTextBlock` – mock response with one block `{type:"text", text:"..."}` → response text equals the block text.

6. `claudeAdapterConcatenatesMultipleTextBlocks` – multiple `text` blocks → response text equals their concatenation in order.

7. `claudeAdapterIgnoresNonTextBlocks` – mix of `text` and non-`text` blocks → only the `text` contents end up in the response text.

8. `claudeAdapterFailsOnEmptyTextContent` – response without any `text` block → technical adapter error.

9. `claudeAdapterMapsHttp401AsTechnical` – mock response 401 → technical error per table 4.5.

10. `claudeAdapterMapsHttp429AsTechnical` – mock response 429 → technical error.

11. `claudeAdapterMapsHttp500AsTechnical` – mock response 500 → technical error.

12. `claudeAdapterMapsTimeoutAsTechnical` – simulated timeout → technical error.

13. `claudeAdapterMapsUnparseableJsonAsTechnical` – response body is not valid JSON → technical error.

14. `bootstrapSelectsClaudeWhenActive` – `ai.provider.active=claude` → selector returns the Claude implementation.

15. `claudeProviderIdentifierLandsInAttemptHistory` – end to end with a mocked HTTP layer: after a successful run, the new attempt has `ai_provider='claude'` (ties in with AP-004).

16. `existingOpenAiPathRemainsGreen` – all existing tests of the OpenAI path stay green unchanged.

Additional test categories: adapter tests with a mocked HTTP client (no real network access), bootstrap wiring tests.

### Explicitly NOT in scope

- automatic fallback logic between providers
- a shared adapter base class
- extending the persistence schema beyond AP-004
- changing the prompt (an existing system/user split of the prompt file may be used, but no change to the prompt's content)

### Definition of Done

- build passes, all mandatory test cases green
- the native Anthropic adapter is selectable via configuration and delivers correct results against mocks
- existing OpenAI path stays green unchanged
- mandatory output block emitted

---

## AP-006 – Regression, smoke tests, docs consolidation, final evidence

### Prerequisite

AP-001 through AP-005 completed.

### Goal

The complete extension state is secured by automated tests, consolidated in the documentation, and reliably demonstrated to be a minimal, architecture-true extension of the baseline.

### Concrete steps

1. **Smoke test per provider:** Set up two smoke tests that, for one provider configuration each, run the bootstrap path up to the successful wiring of the `AiNamingPort`, **without** a real external HTTP call (mocked HTTP layer). Both must be green.

2. **OpenAI regression:** All existing end-to-end/integration tests of the OpenAI path run green. If adjustments in earlier work packages touched tests, this is the final consistency check.

3. **Migration smoke:** An end-to-end test that starts with a legacy file (content from the known demo config) and, after a first run, proves the following:

   - the `.bak` exists with the original content
   - the properties file is in the new schema
   - `ai.provider.active=openai-compatible`
   - the run behaved functionally the same as with the new schema

4. Run **PIT/mutation tests** in the directly affected modules, as far as already established. Close gaps in the new code that fall clearly below the existing level, in a targeted way. No arbitrary coverage cosmetics.

5. **Docs consolidation:**

   - The sample properties file shows the complete new schema for **both** providers with meaningful placeholders.
   - The repo docs contain a short section "Selecting the AI provider" with the allowed values and the env-var convention (`OPENAI_COMPATIBLE_API_KEY`, `ANTHROPIC_API_KEY`).
   - The repo docs contain a short section "Migrating from the previous version" with a note about `.bak`.
   - JavaDoc exists for all classes newly introduced or substantially changed in this extension.

6. **Final evidence:** Create a short Markdown file that stays in the repository under `docs/workpackages/V1.1 - Abschlussnachweis.md` and contains at least:

   - date, affected modules
   - the list of executed mandatory test cases per work package (may be tabular)
   - the proven properties: two providers supported, exactly one active, no fallback, business contract unchanged, persistence backward compatible, migration proven, `.bak` proven, active provider logged
   - an explicit confirmation: no architecture violations, no new libraries beyond those already established in the repo for HTTP/JSON
   - a note on the operator task of switching the OpenAI key's environment variable to `OPENAI_COMPATIBLE_API_KEY` where applicable

7. Run the full reactor build and record the result in the work-package output.
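The sample properties file from step 5 could look like the sketch below. The `ai.provider.active`, `baseUrl`, `timeoutSeconds`, and `apiKey` keys follow this spec; the `model` key names and all placeholder values are illustrative assumptions.

```properties
# Exactly one provider is active; valid values: openai-compatible | claude
ai.provider.active=openai-compatible

# OpenAI-compatible provider (key may also come from OPENAI_COMPATIBLE_API_KEY)
ai.provider.openai-compatible.baseUrl=https://api.example.invalid/v1
ai.provider.openai-compatible.model=example-model
ai.provider.openai-compatible.timeoutSeconds=30
ai.provider.openai-compatible.apiKey=CHANGE_ME

# Claude provider (key may also come from ANTHROPIC_API_KEY)
ai.provider.claude.baseUrl=https://api.anthropic.com
ai.provider.claude.model=example-claude-model
ai.provider.claude.timeoutSeconds=30
ai.provider.claude.apiKey=CHANGE_ME
```

The doc consistency check mentioned under the test categories ("the sample properties file loads in the parser without errors") would load exactly this file through the real configuration parser.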

### Mandatory test cases

1. `smokeBootstrapWithOpenAiCompatibleActive`

2. `smokeBootstrapWithClaudeActive`

3. `e2eMigrationFromLegacyDemoConfig`

4. `regressionExistingOpenAiSuiteGreen` (aggregate evidence, not a single test)

5. `e2eClaudeRunWritesProviderIdentifierToHistory`

6. `e2eOpenAiRunWritesProviderIdentifierToHistory`

7. `legacyDataFromBeforeV11RemainsReadable`

Additional test categories: mutation tests in affected modules, consistency checks of the doc examples against the real parser (e.g. "the sample properties file loads in the parser without errors").

### Explicitly NOT in scope

- additional providers
- convenience features
- large-scale refactoring

### Definition of Done

- full reactor build passes
- all mandatory test cases green
- smoke tests per provider green
- docs consolidated
- final evidence file in the repo
- mandatory output block emitted
@@ -1,68 +0,0 @@
pdf-umbenenner-adapter-in-cli/src/main/java/de/gecheckt/pdf/umbenenner/adapter/in/cli/package-info.java | de.gecheckt.pdf.umbenenner.adapter.in.cli | |
pdf-umbenenner-adapter-in-cli/src/main/java/de/gecheckt/pdf/umbenenner/adapter/in/cli/SchedulerBatchCommand.java | de.gecheckt.pdf.umbenenner.adapter.in.cli | class | SchedulerBatchCommand
pdf-umbenenner-adapter-in-cli/src/test/java/de/gecheckt/pdf/umbenenner/adapter/in/cli/SchedulerBatchCommandTest.java | de.gecheckt.pdf.umbenenner.adapter.in.cli | class | SchedulerBatchCommandTest
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/configuration/package-info.java | de.gecheckt.pdf.umbenenner.adapter.out.configuration | |
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/configuration/PropertiesConfigurationPortAdapter.java | de.gecheckt.pdf.umbenenner.adapter.out.configuration | class | PropertiesConfigurationPortAdapter
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/lock/FilesystemRunLockPortAdapter.java | de.gecheckt.pdf.umbenenner.adapter.out.lock | class | FilesystemRunLockPortAdapter
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/lock/package-info.java | de.gecheckt.pdf.umbenenner.adapter.out.lock | |
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/package-info.java | de.gecheckt.pdf.umbenenner.adapter.out | |
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/pdfextraction/package-info.java | de.gecheckt.pdf.umbenenner.adapter.out.pdfextraction | |
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/pdfextraction/PdfTextExtractionPortAdapter.java | de.gecheckt.pdf.umbenenner.adapter.out.pdfextraction | class | PdfTextExtractionPortAdapter
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/sourcedocument/package-info.java | de.gecheckt.pdf.umbenenner.adapter.out.sourcedocument | |
pdf-umbenenner-adapter-out/src/main/java/de/gecheckt/pdf/umbenenner/adapter/out/sourcedocument/SourceDocumentCandidatesPortAdapter.java | de.gecheckt.pdf.umbenenner.adapter.out.sourcedocument | class | SourceDocumentCandidatesPortAdapter
pdf-umbenenner-adapter-out/src/test/java/de/gecheckt/pdf/umbenenner/adapter/out/configuration/PropertiesConfigurationPortAdapterTest.java | de.gecheckt.pdf.umbenenner.adapter.out.configuration | class | PropertiesConfigurationPortAdapterTest
pdf-umbenenner-adapter-out/src/test/java/de/gecheckt/pdf/umbenenner/adapter/out/lock/FilesystemRunLockPortAdapterTest.java | de.gecheckt.pdf.umbenenner.adapter.out.lock | class | FilesystemRunLockPortAdapterTest
pdf-umbenenner-adapter-out/src/test/java/de/gecheckt/pdf/umbenenner/adapter/out/pdfextraction/PdfTextExtractionPortAdapterTest.java | de.gecheckt.pdf.umbenenner.adapter.out.pdfextraction | class | PdfTextExtractionPortAdapterTest
pdf-umbenenner-adapter-out/src/test/java/de/gecheckt/pdf/umbenenner/adapter/out/sourcedocument/SourceDocumentCandidatesPortAdapterTest.java | de.gecheckt.pdf.umbenenner.adapter.out.sourcedocument | class | SourceDocumentCandidatesPortAdapterTest
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/config/InvalidStartConfigurationException.java | de.gecheckt.pdf.umbenenner.application.config | class | InvalidStartConfigurationException
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/config/package-info.java | de.gecheckt.pdf.umbenenner.application.config | |
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/config/StartConfiguration.java | de.gecheckt.pdf.umbenenner.application.config | record | StartConfiguration
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/config/StartConfigurationValidator.java | de.gecheckt.pdf.umbenenner.application.config | class | StartConfigurationValidator
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/package-info.java | de.gecheckt.pdf.umbenenner.application | |
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/in/BatchRunOutcome.java | de.gecheckt.pdf.umbenenner.application.port.in | enum | BatchRunOutcome
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/in/package-info.java | de.gecheckt.pdf.umbenenner.application.port.in | |
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/in/RunBatchProcessingUseCase.java | de.gecheckt.pdf.umbenenner.application.port.in | interface | RunBatchProcessingUseCase
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/out/ClockPort.java | de.gecheckt.pdf.umbenenner.application.port.out | interface | ClockPort
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/out/ConfigurationPort.java | de.gecheckt.pdf.umbenenner.application.port.out | interface | ConfigurationPort
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/out/package-info.java | de.gecheckt.pdf.umbenenner.application.port.out | |
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/out/PdfTextExtractionPort.java | de.gecheckt.pdf.umbenenner.application.port.out | interface | PdfTextExtractionPort
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/out/RunLockPort.java | de.gecheckt.pdf.umbenenner.application.port.out | interface | RunLockPort
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/out/RunLockUnavailableException.java | de.gecheckt.pdf.umbenenner.application.port.out | class | RunLockUnavailableException
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/out/SourceDocumentAccessException.java | de.gecheckt.pdf.umbenenner.application.port.out | |
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/port/out/SourceDocumentCandidatesPort.java | de.gecheckt.pdf.umbenenner.application.port.out | interface | SourceDocumentCandidatesPort
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/service/DocumentProcessingService.java | de.gecheckt.pdf.umbenenner.application.service | class | DocumentProcessingService
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/service/package-info.java | de.gecheckt.pdf.umbenenner.application.service | |
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/service/PreCheckEvaluator.java | de.gecheckt.pdf.umbenenner.application.service | class | PreCheckEvaluator
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/usecase/BatchRunProcessingUseCase.java | de.gecheckt.pdf.umbenenner.application.usecase | class | BatchRunProcessingUseCase
pdf-umbenenner-application/src/main/java/de/gecheckt/pdf/umbenenner/application/usecase/package-info.java | de.gecheckt.pdf.umbenenner.application.usecase | |
pdf-umbenenner-application/src/test/java/de/gecheckt/pdf/umbenenner/application/config/StartConfigurationValidatorTest.java | de.gecheckt.pdf.umbenenner.application.config | class | StartConfigurationValidatorTest
pdf-umbenenner-application/src/test/java/de/gecheckt/pdf/umbenenner/application/service/DocumentProcessingServiceTest.java | de.gecheckt.pdf.umbenenner.application.service | class | DocumentProcessingServiceTest
pdf-umbenenner-application/src/test/java/de/gecheckt/pdf/umbenenner/application/service/PreCheckEvaluatorTest.java | de.gecheckt.pdf.umbenenner.application.service | class | PreCheckEvaluatorTest
pdf-umbenenner-application/src/test/java/de/gecheckt/pdf/umbenenner/application/usecase/BatchRunProcessingUseCaseTest.java | de.gecheckt.pdf.umbenenner.application.usecase | class | BatchRunProcessingUseCaseTest
pdf-umbenenner-bootstrap/src/main/java/de/gecheckt/pdf/umbenenner/bootstrap/BootstrapRunner.java | de.gecheckt.pdf.umbenenner.bootstrap | class | BootstrapRunner
pdf-umbenenner-bootstrap/src/main/java/de/gecheckt/pdf/umbenenner/bootstrap/package-info.java | de.gecheckt.pdf.umbenenner.bootstrap | |
pdf-umbenenner-bootstrap/src/main/java/de/gecheckt/pdf/umbenenner/bootstrap/PdfUmbenennerApplication.java | de.gecheckt.pdf.umbenenner.bootstrap | class | PdfUmbenennerApplication
pdf-umbenenner-bootstrap/src/test/java/de/gecheckt/pdf/umbenenner/bootstrap/BootstrapRunnerTest.java | de.gecheckt.pdf.umbenenner.bootstrap | class | BootstrapRunnerTest
pdf-umbenenner-bootstrap/src/test/java/de/gecheckt/pdf/umbenenner/bootstrap/ExecutableJarSmokeTestIT.java | de.gecheckt.pdf.umbenenner.bootstrap | class | ExecutableJarSmokeTestIT
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/BatchRunContext.java | de.gecheckt.pdf.umbenenner.domain.model | |
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/DocumentProcessingOutcome.java | de.gecheckt.pdf.umbenenner.domain.model | |
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/package-info.java | de.gecheckt.pdf.umbenenner.domain.model | |
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/PdfExtractionContentError.java | de.gecheckt.pdf.umbenenner.domain.model | record | PdfExtractionContentError
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/PdfExtractionResult.java | de.gecheckt.pdf.umbenenner.domain.model | |
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/PdfExtractionSuccess.java | de.gecheckt.pdf.umbenenner.domain.model | record | PdfExtractionSuccess
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/PdfExtractionTechnicalError.java | de.gecheckt.pdf.umbenenner.domain.model | record | PdfExtractionTechnicalError
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/PdfPageCount.java | de.gecheckt.pdf.umbenenner.domain.model | record | PdfPageCount
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/PreCheckFailed.java | de.gecheckt.pdf.umbenenner.domain.model | record | PreCheckFailed
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/PreCheckFailureReason.java | de.gecheckt.pdf.umbenenner.domain.model | enum | PreCheckFailureReason
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/PreCheckPassed.java | de.gecheckt.pdf.umbenenner.domain.model | record | PreCheckPassed
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/ProcessingDecision.java | de.gecheckt.pdf.umbenenner.domain.model | |
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/ProcessingStatus.java | de.gecheckt.pdf.umbenenner.domain.model | enum | ProcessingStatus
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/RunId.java | de.gecheckt.pdf.umbenenner.domain.model | |
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/SourceDocumentCandidate.java | de.gecheckt.pdf.umbenenner.domain.model | record | SourceDocumentCandidate
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/SourceDocumentLocator.java | de.gecheckt.pdf.umbenenner.domain.model | record | SourceDocumentLocator
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/model/TechnicalDocumentError.java | de.gecheckt.pdf.umbenenner.domain.model | record | TechnicalDocumentError
pdf-umbenenner-domain/src/main/java/de/gecheckt/pdf/umbenenner/domain/package-info.java | de.gecheckt.pdf.umbenenner.domain | |
pdf-umbenenner-domain/src/test/java/de/gecheckt/pdf/umbenenner/domain/model/BatchRunContextTest.java | de.gecheckt.pdf.umbenenner.domain.model | class | BatchRunContextTest
pdf-umbenenner-domain/src/test/java/de/gecheckt/pdf/umbenenner/domain/model/DocumentProcessingOutcomeTest.java | de.gecheckt.pdf.umbenenner.domain.model | class | DocumentProcessingOutcomeTest
pdf-umbenenner-domain/src/test/java/de/gecheckt/pdf/umbenenner/domain/model/ProcessingStatusTest.java | de.gecheckt.pdf.umbenenner.domain.model | class | ProcessingStatusTest
pdf-umbenenner-domain/src/test/java/de/gecheckt/pdf/umbenenner/domain/model/RunIdTest.java | de.gecheckt.pdf.umbenenner.domain.model | class | RunIdTest
@@ -39,4 +39,42 @@
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.jacoco</groupId>
                <artifactId>jacoco-maven-plugin</artifactId>
                <executions>
                    <execution>
                        <id>jacoco-check</id>
                        <phase>verify</phase>
                        <goals>
                            <goal>check</goal>
                        </goals>
                        <configuration>
                            <!-- Adapter-In is minimal wrapper, lower threshold acceptable -->
                            <rules>
                                <rule>
                                    <element>BUNDLE</element>
                                    <limits>
                                        <limit>
                                            <counter>LINE</counter>
                                            <value>COVEREDRATIO</value>
                                            <minimum>0.60</minimum>
                                        </limit>
                                        <limit>
                                            <counter>BRANCH</counter>
                                            <value>COVEREDRATIO</value>
                                            <minimum>0.55</minimum>
                                        </limit>
                                    </limits>
                                </rule>
                            </rules>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>
@@ -12,15 +12,11 @@ import de.gecheckt.pdf.umbenenner.domain.model.BatchRunContext;
 * interface. It receives the batch run outcome and makes it available to the Bootstrap layer
 * for exit code determination and logging.
 * <p>
 * AP-003 Implementation: Minimal no-op command to validate the call chain from CLI to Application.
 * <p>
 * M2-AP-002 Update: Returns {@link BatchRunOutcome} instead of boolean,
 * allowing Bootstrap to systematically derive exit codes (AP-007).
 * <p>
 * M2-AP-003 Update: Accepts {@link BatchRunContext} and passes it to the use case,
 * Returns {@link BatchRunOutcome} to allow Bootstrap to systematically derive exit codes.
 * Accepts {@link BatchRunContext} and passes it to the use case,
 * enabling run ID and timing tracking throughout the batch cycle.
 * <p>
 * M2-AP-005 Update: Dependency inversion achieved - this adapter depends only on the
 * Dependency inversion achieved - this adapter depends only on the
 * BatchRunProcessingUseCase interface, not on any concrete implementation. Bootstrap
 * is responsible for injecting the appropriate use case implementation.
 */
@@ -8,10 +8,10 @@
 * Components:
 * <ul>
 * <li>{@link de.gecheckt.pdf.umbenenner.adapter.in.cli.SchedulerBatchCommand}
 * — CLI entry point that delegates to BatchRunProcessingUseCase interface (AP-005)</li>
 * — CLI entry point that delegates to BatchRunProcessingUseCase interface</li>
 * </ul>
 * <p>
 * M2-AP-005 Architecture:
 * Adapter Architecture:
 * <ul>
 * <li>Adapter depends on: {@link de.gecheckt.pdf.umbenenner.application.port.in.BatchRunProcessingUseCase} (interface)</li>
 * <li>Adapter does not depend on: any concrete use case implementation</li>
@@ -1,17 +1,19 @@
 package de.gecheckt.pdf.umbenenner.adapter.in.cli;
 
-import de.gecheckt.pdf.umbenenner.adapter.in.cli.SchedulerBatchCommand;
+import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertFalse;
+import static org.junit.jupiter.api.Assertions.assertNotNull;
+import static org.junit.jupiter.api.Assertions.assertTrue;
+
+import java.time.Instant;
+
+import org.junit.jupiter.api.Test;
+
 import de.gecheckt.pdf.umbenenner.application.port.in.BatchRunOutcome;
 import de.gecheckt.pdf.umbenenner.application.port.in.BatchRunProcessingUseCase;
 import de.gecheckt.pdf.umbenenner.domain.model.BatchRunContext;
 import de.gecheckt.pdf.umbenenner.domain.model.RunId;
 
-import org.junit.jupiter.api.Test;
-
-import java.time.Instant;
-
-import static org.junit.jupiter.api.Assertions.*;
-
 /**
  * Unit tests for {@link SchedulerBatchCommand}.
  * <p>

@@ -35,8 +35,23 @@
            <groupId>org.json</groupId>
            <artifactId>json</artifactId>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
        </dependency>

        <!-- Test dependencies -->
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-slf4j-impl</artifactId>
            <version>${log4j.version}</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter</artifactId>
@@ -52,5 +67,60 @@
            <artifactId>mockito-junit-jupiter</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.assertj</groupId>
            <artifactId>assertj-core</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.pitest</groupId>
                <artifactId>pitest-maven</artifactId>
                <configuration>
                    <!-- Exclude heavy pipeline integration tests from mutation analysis.
                         These tests run the full batch pipeline (SQLite, PDFBox, filesystem)
                         and exceed PIT minion timeouts. They remain in the normal surefire run. -->
                    <excludedTestClasses>
                        <param>de.gecheckt.pdf.umbenenner.adapter.out.ai.AnthropicClaudeAdapterIntegrationTest</param>
                    </excludedTestClasses>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.jacoco</groupId>
                <artifactId>jacoco-maven-plugin</artifactId>
                <executions>
                    <execution>
                        <id>jacoco-check</id>
                        <phase>verify</phase>
                        <goals>
                            <goal>check</goal>
                        </goals>
                        <configuration>
                            <!-- Adapter-Out contains complex infrastructure; moderate threshold -->
                            <rules>
                                <rule>
                                    <element>BUNDLE</element>
                                    <limits>
                                        <limit>
                                            <counter>LINE</counter>
                                            <value>COVEREDRATIO</value>
                                            <minimum>0.65</minimum>
                                        </limit>
                                        <limit>
                                            <counter>BRANCH</counter>
                                            <value>COVEREDRATIO</value>
                                            <minimum>0.60</minimum>
                                        </limit>
                                    </limits>
                                </rule>
                            </rules>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

@@ -0,0 +1,394 @@
package de.gecheckt.pdf.umbenenner.adapter.out.ai;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Objects;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.json.JSONArray;
import org.json.JSONException;
import org.json.JSONObject;

import de.gecheckt.pdf.umbenenner.application.config.provider.ProviderConfiguration;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationPort;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationResult;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationSuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationTechnicalFailure;
import de.gecheckt.pdf.umbenenner.domain.model.AiRawResponse;
import de.gecheckt.pdf.umbenenner.domain.model.AiRequestRepresentation;

/**
 * Adapter implementing the native Anthropic Messages API for AI service invocation.
 * <p>
 * This adapter:
 * <ul>
 * <li>Translates an abstract {@link AiRequestRepresentation} into an Anthropic
 *     Messages API request (POST {@code /v1/messages})</li>
 * <li>Configures HTTP connection, timeout, and authentication from the provider
 *     configuration using the Anthropic-specific authentication scheme
 *     ({@code x-api-key} header, not {@code Authorization: Bearer})</li>
 * <li>Extracts the response text by concatenating all {@code text}-type content
 *     blocks from the Anthropic response, returning the result as a raw response
 *     for Application-layer parsing and validation</li>
 * <li>Classifies technical failures (HTTP errors, timeouts, missing content blocks,
 *     unparseable JSON) according to the existing transient error semantics</li>
 * </ul>
 *
 * <h2>Configuration</h2>
 * <ul>
 * <li>{@code baseUrl} — the HTTP(S) base URL; defaults to {@code https://api.anthropic.com}
 *     when absent or blank</li>
 * <li>{@code model} — the Claude model identifier (e.g., {@code claude-3-5-sonnet-20241022})</li>
 * <li>{@code timeoutSeconds} — connection and read timeout in seconds</li>
 * <li>{@code apiKey} — the authentication token, resolved from environment variable
 *     {@code ANTHROPIC_API_KEY} or property {@code ai.provider.claude.apiKey};
 *     environment variable takes precedence (resolved by the configuration layer
 *     before this adapter is constructed)</li>
 * </ul>
 *
 * <h2>HTTP request structure</h2>
 * <p>
 * The adapter sends a POST request to {@code {baseUrl}/v1/messages} with:
 * <ul>
 * <li>Header {@code x-api-key} containing the resolved API key</li>
 * <li>Header {@code anthropic-version: 2023-06-01}</li>
 * <li>Header {@code content-type: application/json}</li>
 * <li>JSON body containing:
 *     <ul>
 *     <li>{@code model} — the configured model name</li>
 *     <li>{@code max_tokens} — fixed at 1024; sufficient for the expected JSON response
 *         without requiring a separate configuration property</li>
 *     <li>{@code system} — the prompt content (if non-blank); Anthropic uses a
 *         top-level field instead of a {@code role=system} message</li>
 *     <li>{@code messages} — an array with exactly one {@code user} message containing
 *         the document text</li>
 *     </ul>
 * </li>
 * </ul>
 *
 * <h2>Response handling</h2>
 * <ul>
 * <li><strong>HTTP 200:</strong> All {@code content} blocks with {@code type=="text"}
 *     are concatenated in order; the result is returned as {@link AiInvocationSuccess}
 *     with an {@link AiRawResponse} containing the concatenated text. The Application
 *     layer then parses and validates this text as a NamingProposal JSON object.</li>
 * <li><strong>No text blocks in HTTP 200 response:</strong> Classified as a technical
 *     failure; the Application layer cannot derive a naming proposal without text.</li>
 * <li><strong>Unparseable response JSON:</strong> Classified as a technical failure.</li>
 * <li><strong>HTTP non-200:</strong> Classified as a technical failure.</li>
 * </ul>
 *
 * <h2>Technical error classification</h2>
 * <p>
 * All errors are mapped to {@link AiInvocationTechnicalFailure} and follow the existing
 * transient error semantics. No new error categories are introduced:
 * <ul>
 * <li>HTTP 4xx (including 401, 403, 429) and 5xx — technical failure</li>
 * <li>Connection timeout, read timeout — {@code TIMEOUT}</li>
 * <li>Connection failure — {@code CONNECTION_ERROR}</li>
 * <li>DNS failure — {@code DNS_ERROR}</li>
 * <li>IO errors — {@code IO_ERROR}</li>
 * <li>Interrupted operation — {@code INTERRUPTED}</li>
 * <li>JSON not parseable — {@code UNPARSEABLE_JSON}</li>
 * <li>No {@code text}-type content block in response — {@code NO_TEXT_CONTENT}</li>
 * </ul>
 *
 * <h2>Non-goals</h2>
 * <ul>
 * <li>NamingProposal JSON parsing or validation — the Application layer owns this</li>
 * <li>Retry logic — this adapter executes a single request only</li>
 * <li>Shared implementation with the OpenAI-compatible adapter — no common base class</li>
 * </ul>
 */
public class AnthropicClaudeHttpAdapter implements AiInvocationPort {

    private static final Logger LOG = LogManager.getLogger(AnthropicClaudeHttpAdapter.class);

    private static final String MESSAGES_ENDPOINT = "/v1/messages";
    private static final String ANTHROPIC_VERSION_HEADER = "anthropic-version";
    private static final String ANTHROPIC_VERSION_VALUE = "2023-06-01";
    private static final String API_KEY_HEADER = "x-api-key";
    private static final String CONTENT_TYPE = "application/json";
    private static final String DEFAULT_BASE_URL = "https://api.anthropic.com";

    /**
     * Fixed max_tokens value for the Anthropic request.
     * <p>
     * This value is sufficient for the expected NamingProposal JSON response
     * ({@code date}, {@code title}, {@code reasoning}) without requiring a separate
     * configuration property. Anthropic's API requires this field to be present.
     */
    private static final int MAX_TOKENS = 1024;

    private final HttpClient httpClient;
    private final URI apiBaseUrl;
    private final String apiModel;
    private final String apiKey;
    private final int apiTimeoutSeconds;

    // Test-only field to capture the last built JSON body for assertion
    private volatile String lastBuiltJsonBody;

    /**
     * Creates an adapter from the Claude provider configuration.
     * <p>
     * If {@code config.baseUrl()} is absent or blank, the default Anthropic endpoint
     * {@code https://api.anthropic.com} is used. The HTTP client is initialized with
     * the configured timeout.
     *
     * @param config the provider configuration for the Claude family; must not be null
     * @throws NullPointerException if config is null
     * @throws IllegalArgumentException if the model is missing or blank
     */
    public AnthropicClaudeHttpAdapter(ProviderConfiguration config) {
        this(config, HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(config.timeoutSeconds()))
                .build());
    }

    /**
     * Creates an adapter with a custom HTTP client (primarily for testing).
     * <p>
     * This constructor allows tests to inject a mock or configurable HTTP client
     * while keeping configuration validation consistent with the production constructor.
     * <p>
     * <strong>For testing only:</strong> This is package-private to remain internal to the adapter.
     *
     * @param config the provider configuration; must not be null
     * @param httpClient the HTTP client to use; must not be null
     * @throws NullPointerException if config or httpClient is null
     * @throws IllegalArgumentException if the model is missing or blank
     */
    AnthropicClaudeHttpAdapter(ProviderConfiguration config, HttpClient httpClient) {
        Objects.requireNonNull(config, "config must not be null");
        Objects.requireNonNull(httpClient, "httpClient must not be null");
        if (config.model() == null || config.model().isBlank()) {
            throw new IllegalArgumentException("API model must not be null or empty");
        }

        String baseUrlStr = (config.baseUrl() != null && !config.baseUrl().isBlank())
                ? config.baseUrl()
                : DEFAULT_BASE_URL;

        this.apiBaseUrl = URI.create(baseUrlStr);
        this.apiModel = config.model();
        this.apiKey = config.apiKey() != null ? config.apiKey() : "";
        this.apiTimeoutSeconds = config.timeoutSeconds();
        this.httpClient = httpClient;

        LOG.debug("AnthropicClaudeHttpAdapter initialized with base URL: {}, model: {}, timeout: {}s",
                apiBaseUrl, apiModel, apiTimeoutSeconds);
    }

    /**
     * Invokes the Anthropic Claude AI service with the given request.
     * <p>
     * Constructs an Anthropic Messages API request from the request representation,
     * executes it, extracts the text content from the response, and returns either
     * a successful response or a classified technical failure.
     *
     * @param request the AI request with prompt and document text; must not be null
     * @return an {@link AiInvocationResult} encoding either success (with extracted text)
     *         or a technical failure with classified reason
     * @throws NullPointerException if request is null
     */
    @Override
    public AiInvocationResult invoke(AiRequestRepresentation request) {
        Objects.requireNonNull(request, "request must not be null");

        try {
            HttpRequest httpRequest = buildRequest(request);
            HttpResponse<String> response = executeRequest(httpRequest);

            if (response.statusCode() == 200) {
                return extractTextFromResponse(request, response.body());
            } else {
                String reason = "HTTP_" + response.statusCode();
                String message = "Anthropic AI service returned status " + response.statusCode();
                LOG.warn("Claude AI invocation returned non-200 status: {}", response.statusCode());
                return new AiInvocationTechnicalFailure(request, reason, message);
            }
        } catch (java.net.http.HttpTimeoutException e) {
            String message = "HTTP timeout: " + e.getClass().getSimpleName();
            LOG.warn("Claude AI invocation timeout: {}", message);
            return new AiInvocationTechnicalFailure(request, "TIMEOUT", message);
        } catch (java.net.ConnectException e) {
            String message = "Failed to connect to endpoint: " + e.getMessage();
            LOG.warn("Claude AI invocation connection error: {}", message);
            return new AiInvocationTechnicalFailure(request, "CONNECTION_ERROR", message);
        } catch (java.net.UnknownHostException e) {
            String message = "Endpoint hostname not resolvable: " + e.getMessage();
            LOG.warn("Claude AI invocation DNS error: {}", message);
            return new AiInvocationTechnicalFailure(request, "DNS_ERROR", message);
        } catch (java.io.IOException e) {
            String message = "IO error during AI invocation: " + e.getMessage();
            LOG.warn("Claude AI invocation IO error: {}", message);
            return new AiInvocationTechnicalFailure(request, "IO_ERROR", message);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            String message = "AI invocation interrupted: " + e.getMessage();
            LOG.warn("Claude AI invocation interrupted: {}", message);
            return new AiInvocationTechnicalFailure(request, "INTERRUPTED", message);
        } catch (Exception e) {
            String message = "Unexpected error during AI invocation: " + e.getClass().getSimpleName()
                    + " - " + e.getMessage();
            LOG.error("Unexpected error in Claude AI invocation", e);
            return new AiInvocationTechnicalFailure(request, "UNEXPECTED_ERROR", message);
        }
    }

    /**
     * Builds an Anthropic Messages API request from the request representation.
     * <p>
     * Constructs:
     * <ul>
     * <li>Endpoint URL: {@code {apiBaseUrl}/v1/messages}</li>
     * <li>Headers: {@code x-api-key}, {@code anthropic-version: 2023-06-01},
     *     {@code content-type: application/json}</li>
     * <li>Body: JSON with {@code model}, {@code max_tokens}, optional {@code system}
     *     (prompt content), and {@code messages} with a single user message
     *     (document text)</li>
     * <li>Timeout: configured timeout from provider configuration</li>
     * </ul>
     *
     * @param request the request representation with prompt and document text
     * @return an {@link HttpRequest} ready to send
     */
    private HttpRequest buildRequest(AiRequestRepresentation request) {
        URI endpoint = buildEndpointUri();
        String requestBody = buildJsonRequestBody(request);
        // Capture for test inspection (test-only field)
        this.lastBuiltJsonBody = requestBody;

        return HttpRequest.newBuilder(endpoint)
                .header("content-type", CONTENT_TYPE)
                .header(API_KEY_HEADER, apiKey)
                .header(ANTHROPIC_VERSION_HEADER, ANTHROPIC_VERSION_VALUE)
                .POST(HttpRequest.BodyPublishers.ofString(requestBody))
                .timeout(Duration.ofSeconds(apiTimeoutSeconds))
                .build();
    }

    /**
     * Composes the endpoint URI from the configured base URL.
     * <p>
     * Resolves {@code {apiBaseUrl}/v1/messages}.
     *
     * @return the complete endpoint URI
     */
    private URI buildEndpointUri() {
        String endpointPath = apiBaseUrl.getPath().replaceAll("/$", "") + MESSAGES_ENDPOINT;
        return URI.create(apiBaseUrl.getScheme() + "://" +
                apiBaseUrl.getHost() +
                (apiBaseUrl.getPort() > 0 ? ":" + apiBaseUrl.getPort() : "") +
                endpointPath);
    }

    /**
     * Builds the JSON request body for the Anthropic Messages API.
     * <p>
     * The body contains:
     * <ul>
     * <li>{@code model} — the configured model name</li>
     * <li>{@code max_tokens} — fixed value sufficient for the expected response</li>
     * <li>{@code system} — the prompt content as a top-level field (only when non-blank;
     *     Anthropic does not accept {@code role=system} inside the {@code messages} array)</li>
     * <li>{@code messages} — an array with exactly one user message containing the
     *     document text</li>
     * </ul>
     * <p>
     * <strong>Package-private for testing:</strong> This method is accessible to tests
     * in the same package to verify the actual JSON body structure and content.
     *
     * @param request the request with prompt and document text
     * @return JSON string ready to send in HTTP body
     */
    String buildJsonRequestBody(AiRequestRepresentation request) {
        JSONObject body = new JSONObject();
        body.put("model", apiModel);
        body.put("max_tokens", MAX_TOKENS);

        // Prompt content goes to the top-level system field (not a role=system message)
        if (request.promptContent() != null && !request.promptContent().isBlank()) {
            body.put("system", request.promptContent());
        }

        JSONObject userMessage = new JSONObject();
        userMessage.put("role", "user");
        userMessage.put("content", request.documentText());
        body.put("messages", new JSONArray().put(userMessage));

        return body.toString();
    }
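
A request body produced by `buildJsonRequestBody` would look like the following sketch (model name, prompt, and document text are invented for illustration, not taken from the project):

```json
{
  "model": "claude-3-5-sonnet-20241022",
  "max_tokens": 1024,
  "system": "Extract a naming proposal and answer as JSON with date, title, reasoning.",
  "messages": [
    {
      "role": "user",
      "content": "Rechnung Nr. 4711 vom 12.03.2024 ..."
    }
  ]
}
```

Note that the `system` field is omitted entirely when the prompt content is blank, matching the conditional in the method above.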

    /**
     * Extracts the text content from a successful (HTTP 200) Anthropic response.
     * <p>
     * Concatenates all {@code content} blocks with {@code type=="text"} in order.
     * Blocks of other types (e.g., tool use) are ignored.
     * If no {@code text} blocks are present, a technical failure is returned.
     *
     * @param request the original request (carried through to the result)
     * @param responseBody the raw HTTP response body
     * @return success with the concatenated text, or a technical failure
     */
    private AiInvocationResult extractTextFromResponse(AiRequestRepresentation request, String responseBody) {
        try {
            JSONObject json = new JSONObject(responseBody);
            JSONArray contentArray = json.getJSONArray("content");

            StringBuilder textBuilder = new StringBuilder();
            for (int i = 0; i < contentArray.length(); i++) {
                JSONObject block = contentArray.getJSONObject(i);
                if ("text".equals(block.optString("type"))) {
                    textBuilder.append(block.getString("text"));
                }
            }

            String extractedText = textBuilder.toString();
            if (extractedText.isEmpty()) {
                LOG.warn("Claude AI response contained no text-type content blocks");
                return new AiInvocationTechnicalFailure(request, "NO_TEXT_CONTENT",
                        "Anthropic response contained no text-type content blocks");
            }

            return new AiInvocationSuccess(request, new AiRawResponse(extractedText));
        } catch (JSONException e) {
            LOG.warn("Claude AI response could not be parsed as JSON: {}", e.getMessage());
            return new AiInvocationTechnicalFailure(request, "UNPARSEABLE_JSON",
                    "Anthropic response body is not valid JSON: " + e.getMessage());
        }
    }
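
As a sketch, a successful Anthropic response body consumed by this method has roughly the following shape (values invented for illustration): only the `type == "text"` blocks contribute to the result, concatenated in order, while other block types are skipped.

```json
{
  "id": "msg_...",
  "role": "assistant",
  "content": [
    { "type": "text", "text": "{\"date\": \"2024-03-12\", " },
    { "type": "text", "text": "\"title\": \"Rechnung-4711\"}" }
  ],
  "stop_reason": "end_turn"
}
```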

    /**
     * Package-private accessor for the last constructed JSON body.
     * <p>
     * <strong>For testing only:</strong> Allows tests to verify the actual
     * JSON body sent in HTTP requests without exposing the BodyPublisher internals.
     *
     * @return the last JSON body string constructed by {@link #buildRequest(AiRequestRepresentation)},
     *         or null if no request has been built yet
     */
    String getLastBuiltJsonBodyForTesting() {
        return lastBuiltJsonBody;
    }

    /**
     * Executes the HTTP request and returns the response.
     *
     * @param httpRequest the HTTP request to execute
     * @return the HTTP response with status code and body
     * @throws java.net.http.HttpTimeoutException if the request times out
     * @throws java.net.ConnectException if connection fails
     * @throws java.io.IOException on other IO errors
     * @throws InterruptedException if the request is interrupted
     */
    private HttpResponse<String> executeRequest(HttpRequest httpRequest)
            throws java.io.IOException, InterruptedException {
        return httpClient.send(httpRequest, HttpResponse.BodyHandlers.ofString());
    }
}

@@ -0,0 +1,343 @@
package de.gecheckt.pdf.umbenenner.adapter.out.ai;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Objects;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.json.JSONObject;

import de.gecheckt.pdf.umbenenner.application.config.provider.ProviderConfiguration;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationPort;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationResult;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationSuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationTechnicalFailure;
import de.gecheckt.pdf.umbenenner.domain.model.AiRawResponse;
import de.gecheckt.pdf.umbenenner.domain.model.AiRequestRepresentation;

/**
 * Adapter implementing OpenAI-compatible HTTP communication for AI service invocation.
 * <p>
 * This adapter:
 * <ul>
 * <li>Translates an abstract {@link AiRequestRepresentation} into an OpenAI Chat
 *     Completions API request</li>
 * <li>Configures HTTP connection, timeout, and authentication from the provider configuration</li>
 * <li>Executes the HTTP request against the configured AI endpoint</li>
 * <li>Distinguishes between successful HTTP responses (200) and technical failures
 *     (timeout, unreachable, connection error, etc.)</li>
 * <li>Returns the raw response body as-is for Application-layer parsing and validation</li>
 * <li>Classifies and encodes technical failures for retry decision-making</li>
 * </ul>
 * <p>
 * <strong>Configuration:</strong>
 * <ul>
 * <li>{@code baseUrl} — the HTTP(S) base URL of the AI service endpoint</li>
 * <li>{@code model} — the model identifier requested from the AI service</li>
 * <li>{@code timeoutSeconds} — connection and read timeout in seconds</li>
 * <li>{@code apiKey} — the authentication token (resolved from environment variable
 *     {@code OPENAI_COMPATIBLE_API_KEY} or property {@code ai.provider.openai-compatible.apiKey},
 *     environment variable takes precedence)</li>
 * </ul>
 * <p>
 * <strong>HTTP request structure:</strong>
 * The adapter sends a POST request to the endpoint {@code {baseUrl}/v1/chat/completions}
 * with:
 * <ul>
 * <li>Authorization header containing the API key</li>
 * <li>Content-Type application/json</li>
 * <li>JSON body containing:
 *     <ul>
 *     <li>{@code model} — the configured model name</li>
 *     <li>{@code messages} — array with system role (prompt) and user role (document text)</li>
 *     <li>Optional fields like {@code temperature} for determinism (if desired)</li>
 *     </ul>
 * </li>
 * </ul>
 * <p>
 * <strong>Response handling:</strong>
 * <ul>
 * <li><strong>HTTP 200:</strong> Returns {@link AiInvocationSuccess} with the raw response body,
 *     even if the body is invalid JSON or semantically problematic. The Application layer
 *     is responsible for parsing and validating content.</li>
 * <li><strong>HTTP non-200:</strong> Treated as a technical failure. The response body may
 *     contain an error message, but this is logged for debugging; the client treats it as
 *     a transient communication failure.</li>
 * </ul>
 * <p>
 * <strong>Technical error classification:</strong>
 * The following are classified as {@link AiInvocationTechnicalFailure}:
 * <ul>
 * <li>Connection timeout</li>
 * <li>Read timeout</li>
 * <li>Endpoint unreachable (connection refused, DNS failure, etc.)</li>
 * <li>Interrupted IO during HTTP communication</li>
 * <li>HTTP response with non-2xx status code</li>
 * <li>Any other transport-level exception</li>
 * </ul>
 * <p>
 * <strong>Non-goals:</strong>
 * <ul>
 * <li>Response body parsing or validation — the Application layer owns this</li>
 * <li>Retry logic — this adapter executes a single request only</li>
 * <li>Functional error classification (invalid title, unparseable date) — above adapter scope</li>
 * </ul>
 */
public class OpenAiHttpAdapter implements AiInvocationPort {

    private static final Logger LOG = LogManager.getLogger(OpenAiHttpAdapter.class);

    private static final String CHAT_COMPLETIONS_ENDPOINT = "/v1/chat/completions";
    private static final String CONTENT_TYPE = "application/json";
    private static final String AUTHORIZATION_HEADER = "Authorization";
    private static final String BEARER_PREFIX = "Bearer ";

    private final HttpClient httpClient;
    private final URI apiBaseUrl;
    private final String apiModel;
    private final String apiKey;
    private final int apiTimeoutSeconds;

    // Test-only field to capture the last built JSON body for assertion
    private volatile String lastBuiltJsonBody;

    /**
     * Creates an adapter from the OpenAI-compatible provider configuration.
     * <p>
     * The adapter initializes an HTTP client with the configured timeout and parses
     * the endpoint URI from the configured base URL string.
     *
     * @param config the provider configuration for the OpenAI-compatible family; must not be null
     * @throws NullPointerException if config is null
     * @throws IllegalArgumentException if the base URL or model is missing/blank
     */
    public OpenAiHttpAdapter(ProviderConfiguration config) {
        this(config, HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(config.timeoutSeconds()))
                .build());
    }

    /**
     * Creates an adapter with a custom HTTP client (primarily for testing).
     * <p>
     * This constructor allows tests to inject a mock or configurable HTTP client
     * while keeping configuration validation consistent with the production constructor.
     * <p>
     * <strong>For testing only:</strong> This is package-private to remain internal to the adapter.
     *
     * @param config the provider configuration; must not be null
     * @param httpClient the HTTP client to use; must not be null
     * @throws NullPointerException if config or httpClient is null
     * @throws IllegalArgumentException if the base URL or model is missing/blank
     */
    OpenAiHttpAdapter(ProviderConfiguration config, HttpClient httpClient) {
        Objects.requireNonNull(config, "config must not be null");
        Objects.requireNonNull(httpClient, "httpClient must not be null");
        if (config.baseUrl() == null || config.baseUrl().isBlank()) {
            throw new IllegalArgumentException("API base URL must not be null");
        }
        if (config.model() == null || config.model().isBlank()) {
            throw new IllegalArgumentException("API model must not be null or empty");
        }

        this.apiBaseUrl = URI.create(config.baseUrl());
        this.apiModel = config.model();
        this.apiKey = config.apiKey() != null ? config.apiKey() : "";
        this.apiTimeoutSeconds = config.timeoutSeconds();
        this.httpClient = httpClient;

        LOG.debug("OpenAiHttpAdapter initialized with base URL: {}, model: {}, timeout: {}s",
                apiBaseUrl, apiModel, apiTimeoutSeconds);
    }
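
A request body matching the Chat Completions structure described in the class javadoc would look roughly like this sketch (model name and message contents invented for illustration):

```json
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "system", "content": "Extract a naming proposal as JSON with date, title, reasoning." },
    { "role": "user", "content": "Rechnung Nr. 4711 vom 12.03.2024 ..." }
  ]
}
```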
|
||||
|
||||
    /**
     * Invokes the AI service with the given request.
     * <p>
     * Constructs an OpenAI Chat Completions API request from the request representation,
     * executes it against the configured endpoint, and returns either a successful
     * response or a classified technical failure.
     * <p>
     * The request representation contains:
     * <ul>
     * <li>Prompt content and identifier (for audit)</li>
     * <li>Document text prepared by the Application layer</li>
     * <li>Character count metadata (for audit, not used to truncate content)</li>
     * </ul>
     * <p>
     * These are formatted as system and user messages for the Chat Completions API.
     *
     * @param request the AI request with prompt and document text; must not be null
     * @return an {@link AiInvocationResult} encoding either success (with raw response body)
     *         or a technical failure with classified reason
     * @throws NullPointerException if request is null
     */
    @Override
    public AiInvocationResult invoke(AiRequestRepresentation request) {
        Objects.requireNonNull(request, "request must not be null");

        try {
            HttpRequest httpRequest = buildRequest(request);
            HttpResponse<String> response = executeRequest(httpRequest);

            if (response.statusCode() == 200) {
                return new AiInvocationSuccess(request, new AiRawResponse(response.body()));
            } else {
                String reason = "HTTP_" + response.statusCode();
                String message = "AI service returned status " + response.statusCode();
                LOG.warn("AI invocation returned non-200 status: {}", response.statusCode());
                return new AiInvocationTechnicalFailure(request, reason, message);
            }
        } catch (java.net.http.HttpTimeoutException e) {
            String message = "HTTP timeout after " + apiTimeoutSeconds + " seconds";
            LOG.warn("AI invocation timeout: {}", message);
            return new AiInvocationTechnicalFailure(request, "TIMEOUT", message);
        } catch (java.net.ConnectException e) {
            String message = "Failed to connect to endpoint: " + e.getMessage();
            LOG.warn("AI invocation connection error: {}", message);
            return new AiInvocationTechnicalFailure(request, "CONNECTION_ERROR", message);
        } catch (java.net.UnknownHostException e) {
            String message = "Endpoint hostname not resolvable: " + e.getMessage();
            LOG.warn("AI invocation DNS error: {}", message);
            return new AiInvocationTechnicalFailure(request, "DNS_ERROR", message);
        } catch (java.io.IOException e) {
            String message = "IO error during AI invocation: " + e.getMessage();
            LOG.warn("AI invocation IO error: {}", message);
            return new AiInvocationTechnicalFailure(request, "IO_ERROR", message);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            String message = "AI invocation interrupted: " + e.getMessage();
            LOG.warn("AI invocation interrupted: {}", message);
            return new AiInvocationTechnicalFailure(request, "INTERRUPTED", message);
        } catch (Exception e) {
            String message = "Unexpected error during AI invocation: " + e.getClass().getSimpleName()
                    + " - " + e.getMessage();
            LOG.error("Unexpected error in AI invocation", e);
            return new AiInvocationTechnicalFailure(request, "UNEXPECTED_ERROR", message);
        }
    }
    /**
     * Builds an OpenAI Chat Completions API request from the request representation.
     * <p>
     * Constructs:
     * <ul>
     * <li>Endpoint URL: {@code {apiBaseUrl}/v1/chat/completions}</li>
     * <li>Headers: Authorization with Bearer token, Content-Type application/json</li>
     * <li>Body: JSON with model, messages (system = prompt, user = document text)</li>
     * <li>Timeout: configured timeout from provider configuration</li>
     * </ul>
     *
     * @param request the request representation with prompt and document text
     * @return an {@link HttpRequest} ready to send
     */
    private HttpRequest buildRequest(AiRequestRepresentation request) {
        URI endpoint = buildEndpointUri();

        String requestBody = buildJsonRequestBody(request);
        // Capture for test inspection (test-only field)
        this.lastBuiltJsonBody = requestBody;

        return HttpRequest.newBuilder(endpoint)
                .header("Content-Type", CONTENT_TYPE)
                .header(AUTHORIZATION_HEADER, BEARER_PREFIX + apiKey)
                .POST(HttpRequest.BodyPublishers.ofString(requestBody))
                .timeout(Duration.ofSeconds(apiTimeoutSeconds))
                .build();
    }
    /**
     * Composes the endpoint URI from the configured base URL.
     * <p>
     * Resolves {@code {apiBaseUrl}/v1/chat/completions}.
     *
     * @return the complete endpoint URI
     */
    private URI buildEndpointUri() {
        String endpointPath = apiBaseUrl.getPath().replaceAll("/$", "") + CHAT_COMPLETIONS_ENDPOINT;
        return URI.create(apiBaseUrl.getScheme() + "://" +
                apiBaseUrl.getHost() +
                (apiBaseUrl.getPort() > 0 ? ":" + apiBaseUrl.getPort() : "") +
                endpointPath);
    }
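The trailing-slash handling above can be exercised in isolation. The following sketch re-implements the same composition outside the adapter; the class name and the endpoint constant are stand-ins for illustration, not part of the diff:

```java
import java.net.URI;

public class EndpointUriSketch {
    // Stand-in for the adapter's CHAT_COMPLETIONS_ENDPOINT constant
    private static final String CHAT_COMPLETIONS_ENDPOINT = "/v1/chat/completions";

    // Mirrors buildEndpointUri(): strip a trailing slash from the base path,
    // append the endpoint path, and preserve scheme/host/port
    static URI buildEndpointUri(URI apiBaseUrl) {
        String endpointPath = apiBaseUrl.getPath().replaceAll("/$", "") + CHAT_COMPLETIONS_ENDPOINT;
        return URI.create(apiBaseUrl.getScheme() + "://"
                + apiBaseUrl.getHost()
                + (apiBaseUrl.getPort() > 0 ? ":" + apiBaseUrl.getPort() : "")
                + endpointPath);
    }

    public static void main(String[] args) {
        // Base URLs with and without a trailing slash resolve to the same endpoint
        System.out.println(buildEndpointUri(URI.create("https://api.openai.com")));
        System.out.println(buildEndpointUri(URI.create("https://api.openai.com/")));
        System.out.println(buildEndpointUri(URI.create("http://localhost:8080")));
    }
}
```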
    /**
     * Builds the JSON request body for the OpenAI Chat Completions API.
     * <p>
     * The body contains:
     * <ul>
     * <li>{@code model} — the configured model name</li>
     * <li>{@code messages} — array with system and user roles</li>
     * <li>{@code temperature} — 0.0 for deterministic output</li>
     * </ul>
     * <p>
     * The prompt is sent as system message; the document text as user message.
     * <p>
     * <strong>Text limiting:</strong> The document text is already limited to the maximum
     * characters by the Application layer before creating the request representation.
     * The {@link AiRequestRepresentation#sentCharacterCount()} is recorded as audit metadata
     * but is <strong>not</strong> used to truncate content in the adapter. The full
     * document text is sent to the AI service.
     * <p>
     * <strong>Package-private for testing:</strong> This method is accessible to tests
     * in the same package to verify the actual JSON body structure and content.
     *
     * @param request the request with prompt and document text
     * @return JSON string ready to send in HTTP body
     */
    String buildJsonRequestBody(AiRequestRepresentation request) {
        JSONObject body = new JSONObject();
        body.put("model", apiModel);
        body.put("temperature", 0.0);

        JSONObject systemMessage = new JSONObject();
        systemMessage.put("role", "system");
        systemMessage.put("content", request.promptContent());

        JSONObject userMessage = new JSONObject();
        userMessage.put("role", "user");
        userMessage.put("content", request.documentText());

        body.put("messages", new org.json.JSONArray()
                .put(systemMessage)
                .put(userMessage));

        return body.toString();
    }
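For a configured model of, say, {@code gpt-4}, a prompt "Extract title and date." and document text "Invoice 2024...", the serialized body has this shape (the model name and texts here are illustrative, and org.json does not guarantee field order):

```json
{
  "model": "gpt-4",
  "temperature": 0.0,
  "messages": [
    {"role": "system", "content": "Extract title and date."},
    {"role": "user", "content": "Invoice 2024..."}
  ]
}
```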
    /**
     * Package-private accessor for the last constructed JSON body.
     * <p>
     * <strong>For testing only:</strong> Allows tests to verify the actual
     * JSON body sent in HTTP requests without exposing the BodyPublisher internals.
     * This method is used by unit tests to assert that the correct model, text,
     * and other fields are present in the outbound request.
     *
     * @return the last JSON body string constructed by {@link #buildRequest(AiRequestRepresentation)},
     *         or null if no request has been built yet
     */
    String getLastBuiltJsonBodyForTesting() {
        return lastBuiltJsonBody;
    }
    /**
     * Executes the HTTP request and returns the response.
     * <p>
     * Uses the HTTP client configured with the startup timeout to send the request
     * and receive the full response body.
     *
     * @param httpRequest the HTTP request to execute
     * @return the HTTP response with status code and body
     * @throws java.net.http.HttpTimeoutException if the request times out
     * @throws java.net.ConnectException if connection fails
     * @throws java.io.IOException on other IO errors
     * @throws InterruptedException if the request is interrupted
     */
    private HttpResponse<String> executeRequest(HttpRequest httpRequest)
            throws java.io.IOException, InterruptedException {
        return httpClient.send(httpRequest, HttpResponse.BodyHandlers.ofString());
    }
}
@@ -0,0 +1,62 @@
/**
 * Outbound adapter for AI service invocation over OpenAI-compatible HTTP.
 * <p>
 * <strong>Responsibility:</strong>
 * This package encapsulates all HTTP communication, authentication, and transport-level
 * configuration for invoking an AI service. It translates between the abstract
 * {@link de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationPort} and the
 * concrete OpenAI Chat Completions API (or compatible endpoints).
 * <p>
 * <strong>Architectural boundary:</strong>
 * <ul>
 * <li>Input: {@link de.gecheckt.pdf.umbenenner.domain.model.AiRequestRepresentation}
 * containing prompt, document text, and metadata</li>
 * <li>Output: {@link de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationResult}
 * encoding either a successful HTTP response or a classified technical failure</li>
 * </ul>
 * <p>
 * <strong>What is encapsulated here (NOT exposed to Application/Domain):</strong>
 * <ul>
 * <li>HTTP client library details</li>
 * <li>Authentication header construction</li>
 * <li>JSON serialization/deserialization of request/response</li>
 * <li>Endpoint URL composition</li>
 * <li>Timeout and connection configuration</li>
 * <li>Error classification for technical failures (timeout, network, etc.)</li>
 * <li>OpenAI Chat Completions API structure</li>
 * </ul>
 * <p>
 * <strong>What is NOT handled here (delegated to Application layer):</strong>
 * <ul>
 * <li>JSON parsing of the response body into domain objects</li>
 * <li>Validation of title, date, or reasoning fields</li>
 * <li>Retry logic or retry classification</li>
 * <li>Persistence of responses or request/response history</li>
 * </ul>
 * <p>
 * <strong>Technical error classification:</strong>
 * The adapter recognizes these as technical failures (retryable):
 * <ul>
 * <li>Connection timeout</li>
 * <li>Read timeout</li>
 * <li>Endpoint unreachable (connection refused, DNS failure, etc.)</li>
 * <li>Interrupted IO during request/response</li>
 * <li>Other transport-level exceptions</li>
 * </ul>
 * <p>
 * A successful HTTP 200 response with unparseable or semantically invalid JSON is
 * <strong>not</strong> a technical failure; it is invocation success with a
 * problematic response body, which the Application layer will detect and classify.
 * <p>
 * <strong>Configuration usage:</strong>
 * The adapter receives startup configuration containing:
 * <ul>
 * <li>{@code apiBaseUrl} — the base URL of the AI endpoint (e.g., https://api.openai.com)</li>
 * <li>{@code apiModel} — the model name to request (e.g., gpt-4, local-llm)</li>
 * <li>{@code apiTimeoutSeconds} — connection and read timeout in seconds</li>
 * <li>{@code apiKey} — the authentication token, already resolved from env var or properties</li>
 * </ul>
 * <p>
 * These are not merely documented; they are <strong>actively used</strong> in the HTTP request.
 */
package de.gecheckt.pdf.umbenenner.adapter.out.ai;
@@ -1,4 +1,4 @@
-package de.gecheckt.pdf.umbenenner.application.config;
+package de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation;

/**
 * Exception thrown when startup configuration validation fails.
@@ -0,0 +1,395 @@
package de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import de.gecheckt.pdf.umbenenner.application.config.startup.StartConfiguration;


/**
 * Validates {@link StartConfiguration} before processing can begin.
 * <p>
 * Performs mandatory field checks, numeric range validation, URI scheme validation,
 * and basic path existence checks. Throws {@link InvalidStartConfigurationException}
 * if any validation rule fails.
 * <p>
 * Supports injected source and target folder validation for testability
 * (allows mocking of platform-dependent filesystem checks).
 *
 * <h2>Target folder validation</h2>
 * <p>
 * The target folder is validated as "present or technically creatable":
 * <ul>
 * <li>If it already exists: must be a directory and writable.</li>
 * <li>If it does not yet exist: the {@link TargetFolderChecker} attempts to create it
 * via {@code Files.createDirectories}. Creation failure is a hard validation error.</li>
 * </ul>
 * <p>
 * This behaviour ensures the target write path is technically usable before any
 * document processing begins, without requiring the operator to create the folder manually.
 */
public class StartConfigurationValidator {

    private static final Logger LOG = LogManager.getLogger(StartConfigurationValidator.class);
    /**
     * Abstraction for source folder existence, type, and readability checks.
     * <p>
     * Separates filesystem operations from validation logic to enable
     * platform-independent unit testing (mocking) of readability edge cases.
     * <p>
     * Implementation note: The default implementation uses {@code java.nio.file.Files}
     * static methods directly; tests can substitute alternative implementations.
     */
    @FunctionalInterface
    public interface SourceFolderChecker {
        /**
         * Checks source folder and returns validation error message, or null if valid.
         * <p>
         * Checks (in order):
         * <ol>
         * <li>Folder exists</li>
         * <li>Is a directory</li>
         * <li>Is readable</li>
         * </ol>
         *
         * @param path the source folder path
         * @return error message string, or null if all checks pass
         */
        String checkSourceFolder(Path path);
    }

    /**
     * Abstraction for target folder existence, creatability, and write-access checks.
     * <p>
     * Separates filesystem operations from validation logic to enable
     * platform-independent unit testing (mocking) of write-access and creation edge cases.
     * <p>
     * The default implementation attempts to create the folder via
     * {@code Files.createDirectories} if it does not yet exist, then verifies it is a
     * directory and writable. Tests can substitute alternative implementations.
     */
    @FunctionalInterface
    public interface TargetFolderChecker {
        /**
         * Checks target folder usability and returns a validation error message, or null if valid.
         * <p>
         * Checks (in order):
         * <ol>
         * <li>If folder does not exist: attempt to create it via {@code createDirectories}.</li>
         * <li>Is a directory.</li>
         * <li>Is writable (required for the file-copy write path).</li>
         * </ol>
         *
         * @param path the target folder path
         * @return error message string, or null if all checks pass
         */
        String checkTargetFolder(Path path);
    }

    private final SourceFolderChecker sourceFolderChecker;
    private final TargetFolderChecker targetFolderChecker;

    /**
     * Creates a validator with default NIO-based source and target folder checkers.
     */
    public StartConfigurationValidator() {
        this(new DefaultSourceFolderChecker(), new DefaultTargetFolderChecker());
    }

    /**
     * Creates a validator with a custom source folder checker (primarily for testing).
     * Uses the default NIO-based target folder checker.
     *
     * @param sourceFolderChecker the source folder checker to use (must not be null)
     */
    public StartConfigurationValidator(SourceFolderChecker sourceFolderChecker) {
        this(sourceFolderChecker, new DefaultTargetFolderChecker());
    }

    /**
     * Creates a validator with custom source and target folder checkers (primarily for testing).
     *
     * @param sourceFolderChecker the source folder checker to use (must not be null)
     * @param targetFolderChecker the target folder checker to use (must not be null)
     */
    public StartConfigurationValidator(SourceFolderChecker sourceFolderChecker,
            TargetFolderChecker targetFolderChecker) {
        this.sourceFolderChecker = sourceFolderChecker;
        this.targetFolderChecker = targetFolderChecker;
    }
    /**
     * Validates the given configuration.
     * <p>
     * Checks all mandatory fields, numeric constraints, URI validity, and path existence.
     * If validation fails, throws {@link InvalidStartConfigurationException} with an
     * aggregated error message listing all problems.
     *
     * @param config the configuration to validate, must not be null
     * @throws InvalidStartConfigurationException if any validation rule fails
     */
    public void validate(StartConfiguration config) {
        List<String> errors = new ArrayList<>();

        // Mandatory fields and required paths
        validateMandatoryFields(config, errors);

        // Numeric constraints
        validateNumericConstraints(config, errors);

        // Path relationships and optional paths
        validateSourceAndTargetNotSame(config.sourceFolder(), config.targetFolder(), errors);
        validateOptionalPaths(config, errors);

        if (!errors.isEmpty()) {
            String errorMessage = "Invalid startup configuration:\n" + String.join("\n", errors);
            throw new InvalidStartConfigurationException(errorMessage);
        }

        LOG.info("Configuration validation successful.");
    }
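The collect-then-throw pattern above reports every configuration problem at once instead of failing on the first one. A self-contained distillation of that flow (names are illustrative, and IllegalArgumentException stands in for InvalidStartConfigurationException):

```java
import java.util.ArrayList;
import java.util.List;

public class AggregatedValidationSketch {
    // Distilled validate() flow: collect all problems, then fail once with the full list
    static void validate(int maxRetriesTransient, int maxPages) {
        List<String> errors = new ArrayList<>();
        if (maxRetriesTransient < 1) {
            errors.add("- max.retries.transient: must be >= 1, got: " + maxRetriesTransient);
        }
        if (maxPages <= 0) {
            errors.add("- max.pages: must be > 0, got: " + maxPages);
        }
        if (!errors.isEmpty()) {
            throw new IllegalArgumentException(
                    "Invalid startup configuration:\n" + String.join("\n", errors));
        }
    }

    public static void main(String[] args) {
        try {
            validate(0, -5);
        } catch (IllegalArgumentException e) {
            // Both problems appear in a single exception message
            System.out.println(e.getMessage());
        }
    }
}
```

An operator sees every misconfigured key in one run rather than fixing them one restart at a time.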
    private void validateMandatoryFields(StartConfiguration config, List<String> errors) {
        validateSourceFolder(config.sourceFolder(), errors);
        validateTargetFolder(config.targetFolder(), errors);
        validateSqliteFile(config.sqliteFile(), errors);
        validatePromptTemplateFile(config.promptTemplateFile(), errors);
        if (config.multiProviderConfiguration() == null) {
            errors.add("- ai provider configuration: must not be null");
        }
    }

    private void validateNumericConstraints(StartConfiguration config, List<String> errors) {
        validateMaxRetriesTransient(config.maxRetriesTransient(), errors);
        validateMaxPages(config.maxPages(), errors);
        validateMaxTextCharacters(config.maxTextCharacters(), errors);
    }

    private void validateOptionalPaths(StartConfiguration config, List<String> errors) {
        validateRuntimeLockFile(config.runtimeLockFile(), errors);
        validateLogDirectory(config.logDirectory(), errors);
    }

    private void validateSourceFolder(Path sourceFolder, List<String> errors) {
        if (sourceFolder == null) {
            errors.add("- source.folder: must not be null");
            return;
        }
        String checkError = sourceFolderChecker.checkSourceFolder(sourceFolder);
        if (checkError != null) {
            errors.add(checkError);
        }
    }

    private void validateTargetFolder(Path targetFolder, List<String> errors) {
        if (targetFolder == null) {
            errors.add("- target.folder: must not be null");
            return;
        }
        String checkError = targetFolderChecker.checkTargetFolder(targetFolder);
        if (checkError != null) {
            errors.add(checkError);
        }
    }

    private void validateSqliteFile(Path sqliteFile, List<String> errors) {
        validateRequiredFileParentDirectory(sqliteFile, "sqlite.file", errors);
    }

    private void validateMaxRetriesTransient(int maxRetriesTransient, List<String> errors) {
        if (maxRetriesTransient < 1) {
            errors.add("- max.retries.transient: must be >= 1, got: " + maxRetriesTransient);
        }
    }

    private void validateMaxPages(int maxPages, List<String> errors) {
        if (maxPages <= 0) {
            errors.add("- max.pages: must be > 0, got: " + maxPages);
        }
    }

    private void validateMaxTextCharacters(int maxTextCharacters, List<String> errors) {
        if (maxTextCharacters <= 0) {
            errors.add("- max.text.characters: must be > 0, got: " + maxTextCharacters);
        }
    }

    private void validatePromptTemplateFile(Path promptTemplateFile, List<String> errors) {
        validateRequiredRegularFile(promptTemplateFile, "prompt.template.file", errors);
    }

    private void validateSourceAndTargetNotSame(Path sourceFolder, Path targetFolder, List<String> errors) {
        if (sourceFolder != null && targetFolder != null) {
            try {
                Path normalizedSource = sourceFolder.toRealPath();
                Path normalizedTarget = targetFolder.toRealPath();
                if (normalizedSource.equals(normalizedTarget)) {
                    errors.add("- source.folder and target.folder must not resolve to the same path: " + normalizedSource);
                }
            } catch (Exception e) {
                // If toRealPath fails (e.g., path doesn't exist), skip this check;
                // the individual existence checks will catch missing paths.
            }
        }
    }

    private void validateRuntimeLockFile(Path runtimeLockFile, List<String> errors) {
        validateOptionalFileParentDirectory(runtimeLockFile, "runtime.lock.file", errors);
    }

    private void validateLogDirectory(Path logDirectory, List<String> errors) {
        validateOptionalExistingDirectory(logDirectory, "log.directory", errors);
    }

    // === Helper methods for common validation patterns ===
    /**
     * Validates that a required directory path is not null, exists, and is a directory.
     * <p>
     * Used for paths like source and target folders that must already exist before processing can begin.
     */
    private void validateRequiredExistingDirectory(Path path, String fieldName, List<String> errors) {
        if (path == null) {
            errors.add("- " + fieldName + ": must not be null");
            return;
        }
        if (!Files.exists(path)) {
            errors.add("- " + fieldName + ": path does not exist: " + path);
        } else if (!Files.isDirectory(path)) {
            errors.add("- " + fieldName + ": path is not a directory: " + path);
        }
    }

    /**
     * Validates that a required file path is not null and its parent directory exists and is a directory.
     * <p>
     * The file itself may not exist yet (e.g., SQLite will create it on first use), but the parent
     * directory must be present and writable. Used for files like sqlite.file where the application
     * will create the file if needed.
     */
    private void validateRequiredFileParentDirectory(Path filePath, String fieldName, List<String> errors) {
        if (filePath == null) {
            errors.add("- " + fieldName + ": must not be null");
            return;
        }
        Path parent = filePath.getParent();
        if (parent == null) {
            errors.add("- " + fieldName + ": has no parent directory: " + filePath);
        } else if (!Files.exists(parent)) {
            errors.add("- " + fieldName + ": parent directory does not exist: " + parent);
        } else if (!Files.isDirectory(parent)) {
            errors.add("- " + fieldName + ": parent is not a directory: " + parent);
        }
    }

    /**
     * Validates that a required file path is not null, exists, and is a regular file.
     */
    private void validateRequiredRegularFile(Path filePath, String fieldName, List<String> errors) {
        if (filePath == null) {
            errors.add("- " + fieldName + ": must not be null");
            return;
        }
        if (!Files.exists(filePath)) {
            errors.add("- " + fieldName + ": path does not exist: " + filePath);
        } else if (!Files.isRegularFile(filePath)) {
            errors.add("- " + fieldName + ": path is not a regular file: " + filePath);
        }
    }

    /**
     * Validates that an optional file path, if present and non-blank, has a parent directory
     * that exists and is a directory.
     */
    private void validateOptionalFileParentDirectory(Path filePath, String fieldName, List<String> errors) {
        if (filePath != null && !filePath.toString().isBlank()) {
            Path parent = filePath.getParent();
            if (parent != null) {
                if (!Files.exists(parent)) {
                    errors.add("- " + fieldName + ": parent directory does not exist: " + parent);
                } else if (!Files.isDirectory(parent)) {
                    errors.add("- " + fieldName + ": parent is not a directory: " + parent);
                }
            }
        }
    }

    /**
     * Validates that an optional directory path, if present and non-blank, either does not exist
     * or exists and is a directory.
     */
    private void validateOptionalExistingDirectory(Path directoryPath, String fieldName, List<String> errors) {
        if (directoryPath != null && !directoryPath.toString().isBlank()) {
            if (Files.exists(directoryPath) && !Files.isDirectory(directoryPath)) {
                errors.add("- " + fieldName + ": exists but is not a directory: " + directoryPath);
            }
            // If it doesn't exist yet, that's acceptable - we don't auto-create
        }
    }
    /**
     * Default NIO-based implementation of {@link SourceFolderChecker}.
     * <p>
     * Uses {@code java.nio.file.Files} static methods to check existence, type, and readability.
     * <p>
     * This separation allows unit tests to inject alternative implementations
     * that control the outcome of readability checks without relying on actual filesystem
     * permissions (which are platform-dependent).
     */
    private static class DefaultSourceFolderChecker implements SourceFolderChecker {
        @Override
        public String checkSourceFolder(Path path) {
            if (!Files.exists(path)) {
                return "- source.folder: path does not exist: " + path;
            }
            if (!Files.isDirectory(path)) {
                return "- source.folder: path is not a directory: " + path;
            }
            if (!Files.isReadable(path)) {
                return "- source.folder: directory is not readable: " + path;
            }
            return null; // All checks passed
        }
    }

    /**
     * Default NIO-based implementation of {@link TargetFolderChecker}.
     * <p>
     * Validates that the target folder is present and writable for the file-copy write path.
     * If the folder does not yet exist, creation is attempted via {@code Files.createDirectories}.
     * <p>
     * This satisfies the "present or technically creatable" requirement: the folder need not
     * exist before the application starts, but must be reachable at startup time.
     * <p>
     * This separation allows unit tests to inject alternative implementations
     * that control the outcome of write-access or creation checks without relying on actual
     * filesystem permissions (which are platform-dependent).
     */
    private static class DefaultTargetFolderChecker implements TargetFolderChecker {
        @Override
        public String checkTargetFolder(Path path) {
            if (!Files.exists(path)) {
                try {
                    Files.createDirectories(path);
                } catch (IOException e) {
                    return "- target.folder: path does not exist and could not be created: "
                            + path + " (" + e.getMessage() + ")";
                }
            }
            if (!Files.isDirectory(path)) {
                return "- target.folder: path is not a directory: " + path;
            }
            if (!Files.isWritable(path)) {
                return "- target.folder: directory is not writable: " + path;
            }
            return null; // All checks passed
        }
    }
}
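The checker interfaces return an error message or null, which makes test doubles one-line lambdas. A minimal sketch of that contract (the interface here is a local stand-in mirroring SourceFolderChecker, since the real one lives inside the validator class):

```java
import java.nio.file.Path;

public class CheckerInjectionSketch {
    // Local stand-in mirroring the SourceFolderChecker contract:
    // return an error message, or null if the folder is valid
    @FunctionalInterface
    interface SourceFolderChecker {
        String checkSourceFolder(Path path);
    }

    // Test double simulating an unreadable source folder without touching
    // real filesystem permissions (which are platform-dependent)
    static SourceFolderChecker alwaysUnreadable() {
        return path -> "- source.folder: directory is not readable: " + path;
    }

    public static void main(String[] args) {
        String error = alwaysUnreadable().checkSourceFolder(Path.of("/data/in"));
        System.out.println(error);
    }
}
```

Passing such a lambda to the two-argument constructor lets a unit test drive the "unreadable folder" branch deterministically on any platform.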
@@ -0,0 +1,21 @@
/**
 * Bootstrap-phase technical configuration validation.
 * <p>
 * Handles startup configuration validation as a separate step after configuration loading.
 * Validates mandatory fields, numeric ranges, URI schemes, and path existence before
 * the batch application begins. If validation fails, the application exits with code 1.
 * <p>
 * Validation concerns include:
 * <ul>
 * <li>Mandatory field presence and non-nullness</li>
 * <li>Numeric constraints (timeout, retry limits, page counts, character limits)</li>
 * <li>URI validity (API base URL must be absolute with http or https scheme)</li>
 * <li>Path existence and type (source/target folders exist and are readable, etc.)</li>
 * <li>Path relationships (source and target folders are not the same)</li>
 * </ul>
 * <p>
 * This validation is a technical responsibility that does not belong to the application layer
 * and is distinct from configuration loading. The validator is created and invoked by the
 * bootstrap phase after configuration is loaded.
 */
package de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation;
@@ -0,0 +1,24 @@
package de.gecheckt.pdf.umbenenner.adapter.out.clock;

import java.time.Instant;

import de.gecheckt.pdf.umbenenner.application.port.out.ClockPort;

/**
 * System clock implementation of {@link ClockPort}.
 * <p>
 * Returns the current wall-clock time from the JVM system clock.
 * Intended for production use; tests should inject a controlled clock implementation.
 */
public class SystemClockAdapter implements ClockPort {

    /**
     * Returns the current system time as an {@link Instant}.
     *
     * @return the current UTC instant; never null
     */
    @Override
    public Instant now() {
        return Instant.now();
    }
}
@@ -0,0 +1,18 @@
/**
 * Outbound adapter for system time access.
 * <p>
 * Components:
 * <ul>
 * <li>{@link de.gecheckt.pdf.umbenenner.adapter.out.clock.SystemClockAdapter}
 * — Production implementation of {@link de.gecheckt.pdf.umbenenner.application.port.out.ClockPort}
 * that delegates to the JVM system clock ({@code Instant.now()}).</li>
 * </ul>
 * <p>
 * The {@link de.gecheckt.pdf.umbenenner.application.port.out.ClockPort} abstraction ensures that
 * all application-layer and domain-layer code obtains the current instant through the port,
 * enabling deterministic time injection in tests without coupling to wall-clock time.
 * <p>
 * No date/time logic or formatting is performed in this package; that responsibility
 * belongs to the application layer.
 */
package de.gecheckt.pdf.umbenenner.adapter.out.clock;
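The deterministic time injection the package doc describes amounts to a one-lambda test double. A self-contained sketch (the interface here is a local stand-in for the real ClockPort, which is not shown in this diff):

```java
import java.time.Instant;

public class FixedClockSketch {
    // Local stand-in mirroring the ClockPort abstraction
    interface ClockPort {
        Instant now();
    }

    // Test double pinning the clock to one instant, so anything that
    // timestamps via the port becomes deterministic in tests
    static ClockPort fixed(Instant instant) {
        return () -> instant;
    }

    public static void main(String[] args) {
        ClockPort clock = fixed(Instant.parse("2024-01-01T00:00:00Z"));
        System.out.println(clock.now());
    }
}
```

In production the SystemClockAdapter is wired in instead; the calling code cannot tell the difference.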
@@ -0,0 +1,38 @@
package de.gecheckt.pdf.umbenenner.adapter.out.configuration;

/**
 * Exception thrown when configuration loading or parsing fails.
 * <p>
 * This exception covers all failures related to loading, reading, or parsing the configuration,
 * including:
 * <ul>
 * <li>I/O failures when reading the configuration file</li>
 * <li>Missing required properties</li>
 * <li>Invalid property values (e.g., unparseable integers, invalid URIs)</li>
 * </ul>
 * <p>
 * This is a controlled failure mode that prevents processing from starting.
 */
public class ConfigurationLoadingException extends RuntimeException {

    private static final long serialVersionUID = 1L;

    /**
     * Creates the exception with an error message.
     *
     * @param message the error message describing what failed during configuration loading
     */
    public ConfigurationLoadingException(String message) {
        super(message);
    }

    /**
     * Creates the exception with an error message and a cause.
     *
     * @param message the error message describing what failed during configuration loading
     * @param cause the underlying exception that caused the configuration failure
     */
    public ConfigurationLoadingException(String message, Throwable cause) {
        super(message, cause);
    }
}
@@ -0,0 +1,306 @@
package de.gecheckt.pdf.umbenenner.adapter.out.configuration;

import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException;
import de.gecheckt.pdf.umbenenner.application.config.provider.MultiProviderConfiguration;

/**
 * Detects and migrates a legacy flat-key configuration file to the multi-provider schema.
 *
 * <h2>Legacy form</h2>
 * A configuration file is considered legacy if it contains at least one of the flat property
 * keys ({@code api.baseUrl}, {@code api.model}, {@code api.timeoutSeconds}, {@code api.key})
 * and does <em>not</em> already contain {@code ai.provider.active}.
 *
 * <h2>Migration procedure</h2>
 * <ol>
 * <li>Detect legacy form; if absent, return immediately without any I/O side effect.</li>
 * <li>Create a {@code .bak} backup of the original file before any changes. If a {@code .bak}
 * file already exists, a numbered suffix is appended ({@code .bak.1}, {@code .bak.2}, …).
 * Existing backups are never overwritten.</li>
 * <li>Rewrite the file:
 * <ul>
 * <li>{@code api.baseUrl} → {@code ai.provider.openai-compatible.baseUrl}</li>
 * <li>{@code api.model} → {@code ai.provider.openai-compatible.model}</li>
 * <li>{@code api.timeoutSeconds} → {@code ai.provider.openai-compatible.timeoutSeconds}</li>
 * <li>{@code api.key} → {@code ai.provider.openai-compatible.apiKey}</li>
 * <li>{@code ai.provider.active=openai-compatible} is appended.</li>
 * <li>A commented placeholder section for the Claude provider is appended.</li>
 * <li>All other keys are carried over unchanged in stable order.</li>
 * </ul>
 * </li>
 * <li>Write the migrated content via a temporary file ({@code <file>.tmp}) followed by an
 * atomic move/rename. The original file is never partially overwritten.</li>
 * <li>Reload the migrated file and validate it with {@link MultiProviderConfigurationParser}
 * and {@link MultiProviderConfigurationValidator}. If validation fails, a
 * {@link ConfigurationLoadingException} is thrown; the {@code .bak} is preserved.</li>
 * </ol>
 */
public class LegacyConfigurationMigrator {

    private static final Logger LOG = LogManager.getLogger(LegacyConfigurationMigrator.class);

    /** Legacy flat key for base URL, replaced during migration. */
    static final String LEGACY_BASE_URL = "api.baseUrl";

    /** Legacy flat key for model name, replaced during migration. */
    static final String LEGACY_MODEL = "api.model";

    /** Legacy flat key for timeout, replaced during migration. */
    static final String LEGACY_TIMEOUT = "api.timeoutSeconds";

    /** Legacy flat key for API key, replaced during migration. */
    static final String LEGACY_API_KEY = "api.key";

    private static final String[][] LEGACY_KEY_MAPPINGS = {
        {LEGACY_BASE_URL, "ai.provider.openai-compatible.baseUrl"},
        {LEGACY_MODEL, "ai.provider.openai-compatible.model"},
        {LEGACY_TIMEOUT, "ai.provider.openai-compatible.timeoutSeconds"},
        {LEGACY_API_KEY, "ai.provider.openai-compatible.apiKey"},
    };

    private final MultiProviderConfigurationParser parser;
    private final MultiProviderConfigurationValidator validator;

    /**
     * Creates a migrator backed by default parser and validator instances.
     */
    public LegacyConfigurationMigrator() {
        this(new MultiProviderConfigurationParser(), new MultiProviderConfigurationValidator());
    }

    /**
     * Creates a migrator with injected parser and validator.
     * <p>
     * Intended for testing, where a controlled (e.g. always-failing) validator can be supplied
     * to verify that the {@code .bak} backup is preserved when post-migration validation fails.
     *
     * @param parser parser used to re-read the migrated file; must not be {@code null}
     * @param validator validator used to verify the migrated file; must not be {@code null}
     */
    public LegacyConfigurationMigrator(MultiProviderConfigurationParser parser,
            MultiProviderConfigurationValidator validator) {
        this.parser = parser;
        this.validator = validator;
    }

    /**
     * Migrates the configuration file at {@code configFilePath} if it is in legacy form.
     * <p>
     * If the file does not contain legacy flat keys or already contains
     * {@code ai.provider.active}, this method returns immediately without any I/O side effect.
     *
     * @param configFilePath path to the configuration file; must exist and be readable
     * @throws ConfigurationLoadingException if the file cannot be read, the backup cannot be
     *     created, the migrated file cannot be written, or post-migration validation fails
     */
    public void migrateIfLegacy(Path configFilePath) {
        String originalContent = readFile(configFilePath);
        Properties props = parsePropertiesFromContent(originalContent);

        if (!isLegacyForm(props)) {
            return;
        }

        LOG.info("Legacy configuration format detected. Migrating: {}", configFilePath);

        createBakBackup(configFilePath, originalContent);

        String migratedContent = generateMigratedContent(originalContent);
        writeAtomically(configFilePath, migratedContent);

        LOG.info("Configuration file migrated to multi-provider schema: {}", configFilePath);

        validateMigratedFile(configFilePath);
    }

    /**
     * Returns {@code true} if the given properties are in legacy form.
     * <p>
     * A properties set is considered legacy when it contains at least one of the four
     * flat legacy keys and does not already contain {@code ai.provider.active}.
     *
     * @param props the parsed properties to inspect; must not be {@code null}
     * @return {@code true} if migration is required, {@code false} otherwise
     */
    boolean isLegacyForm(Properties props) {
        boolean hasLegacyKey = props.containsKey(LEGACY_BASE_URL)
                || props.containsKey(LEGACY_MODEL)
                || props.containsKey(LEGACY_TIMEOUT)
                || props.containsKey(LEGACY_API_KEY);
        boolean hasNewKey = props.containsKey(MultiProviderConfigurationParser.PROP_ACTIVE_PROVIDER);
        return hasLegacyKey && !hasNewKey;
    }

    /**
     * Creates a backup of the original file before overwriting it.
     * <p>
     * If {@code <file>.bak} does not yet exist, it is written directly. Otherwise,
     * numbered suffixes ({@code .bak.1}, {@code .bak.2}, …) are tried in ascending order
     * until a free slot is found. Existing backups are never overwritten.
     */
    private void createBakBackup(Path configFilePath, String content) {
        Path bakPath = configFilePath.resolveSibling(configFilePath.getFileName() + ".bak");
        if (!Files.exists(bakPath)) {
            writeFile(bakPath, content);
            LOG.info("Backup created: {}", bakPath);
            return;
        }
        for (int i = 1; ; i++) {
            Path numbered = configFilePath.resolveSibling(configFilePath.getFileName() + ".bak." + i);
            if (!Files.exists(numbered)) {
                writeFile(numbered, content);
                LOG.info("Backup created: {}", numbered);
                return;
            }
        }
    }

    /**
     * Produces the migrated file content from the given original content string.
     * <p>
     * Each line is inspected: lines that define a legacy key are rewritten with the
     * corresponding new namespaced key; all other lines (comments, blank lines, other keys)
     * pass through unchanged. After all original lines, an {@code ai.provider.active} entry
     * and a commented Claude-provider placeholder block are appended.
     *
     * @param originalContent the raw original file content; must not be {@code null}
     * @return the migrated content ready to be written to disk
     */
    String generateMigratedContent(String originalContent) {
        String[] lines = originalContent.split("\\r?\\n", -1);
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(transformLine(line)).append("\n");
        }
        sb.append("\n");
        sb.append("# Aktiver KI-Provider: openai-compatible oder claude\n");
        sb.append("ai.provider.active=openai-compatible\n");
        sb.append("\n");
        sb.append("# Anthropic Claude-Provider (nur benoetigt wenn ai.provider.active=claude)\n");
        sb.append("# ai.provider.claude.model=\n");
        sb.append("# ai.provider.claude.timeoutSeconds=\n");
        sb.append("# ai.provider.claude.apiKey=\n");
        return sb.toString();
    }

    /**
     * Transforms a single properties-file line, replacing a legacy key with its new equivalent.
     * <p>
     * Comment lines, blank lines, and lines defining keys other than the four legacy keys
     * are returned unchanged.
     */
    private String transformLine(String line) {
        for (String[] mapping : LEGACY_KEY_MAPPINGS) {
            String legacyKey = mapping[0];
            String newKey = mapping[1];
            if (lineDefinesKey(line, legacyKey)) {
                int keyStart = line.indexOf(legacyKey);
                return line.substring(0, keyStart) + newKey + line.substring(keyStart + legacyKey.length());
            }
        }
        return line;
    }

    /**
     * Returns {@code true} when {@code line} defines the given {@code key}.
     * <p>
     * A line defines a key if — after stripping any leading whitespace — it starts with
     * the exact key string followed by {@code =}, {@code :}, whitespace, or end-of-string.
     * Comment-introducing characters ({@code #} or {@code !}) cause an immediate {@code false}.
     */
    private boolean lineDefinesKey(String line, String key) {
        String trimmed = line.stripLeading();
        if (trimmed.isEmpty() || trimmed.startsWith("#") || trimmed.startsWith("!")) {
            return false;
        }
        if (!trimmed.startsWith(key)) {
            return false;
        }
        if (trimmed.length() == key.length()) {
            return true;
        }
        char next = trimmed.charAt(key.length());
        return next == '=' || next == ':' || Character.isWhitespace(next);
    }

    /**
     * Writes {@code content} to {@code target} via a temporary file and an atomic rename.
     * <p>
     * The temporary file is created as {@code <target>.tmp} in the same directory.
     * After the content is fully written, the temporary file is moved to {@code target},
     * replacing it. The original file is therefore never partially overwritten.
     */
    private void writeAtomically(Path target, String content) {
        Path tmpPath = target.resolveSibling(target.getFileName() + ".tmp");
        try {
            Files.writeString(tmpPath, content, StandardCharsets.UTF_8);
            Files.move(tmpPath, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new ConfigurationLoadingException(
                "Failed to write migrated configuration to " + target, e);
        }
    }

    /**
     * Re-reads the migrated file and validates it using the injected parser and validator.
     * <p>
     * A parse or validation failure is treated as a hard startup error. The {@code .bak} backup
     * created before migration is preserved in this case.
     */
    private void validateMigratedFile(Path configFilePath) {
        String content = readFile(configFilePath);
        Properties props = parsePropertiesFromContent(content);

        MultiProviderConfiguration config;
        try {
            config = parser.parse(props);
        } catch (ConfigurationLoadingException e) {
            throw new ConfigurationLoadingException(
                "Migrated configuration failed to parse: " + e.getMessage(), e);
        }

        try {
            validator.validate(config);
        } catch (InvalidStartConfigurationException e) {
            throw new ConfigurationLoadingException(
                "Migrated configuration failed validation (backup preserved): " + e.getMessage(), e);
        }
    }

    private String readFile(Path path) {
        try {
            return Files.readString(path, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new ConfigurationLoadingException("Failed to read file: " + path, e);
        }
    }

    private void writeFile(Path path, String content) {
        try {
            Files.writeString(path, content, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new ConfigurationLoadingException("Failed to write file: " + path, e);
        }
    }

    private Properties parsePropertiesFromContent(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            throw new ConfigurationLoadingException("Failed to parse properties content", e);
        }
        return props;
    }
}
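The temp-file-plus-rename pattern used by `writeAtomically` can be demonstrated standalone with `java.nio`; this is a minimal sketch under the same approach (file names are illustrative):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicWriteSketch {

    // Write content to a sibling .tmp file first, then move it over the target,
    // so the target is never observed in a partially written state.
    static void writeAtomically(Path target, String content) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.writeString(tmp, content, StandardCharsets.UTF_8);
        Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("atomic-demo");
        Path config = dir.resolve("application.properties");
        writeAtomically(config, "ai.provider.active=openai-compatible\n");
        System.out.println(Files.readString(config, StandardCharsets.UTF_8));
    }
}
```

Note that `Files.move` with `REPLACE_EXISTING` is usually a plain rename within the same directory; where stronger guarantees are needed and the filesystem supports it, `StandardCopyOption.ATOMIC_MOVE` can be added.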
@@ -0,0 +1,239 @@
package de.gecheckt.pdf.umbenenner.adapter.out.configuration;

import java.util.Properties;
import java.util.function.Function;

import de.gecheckt.pdf.umbenenner.application.config.provider.AiProviderFamily;
import de.gecheckt.pdf.umbenenner.application.config.provider.MultiProviderConfiguration;
import de.gecheckt.pdf.umbenenner.application.config.provider.ProviderConfiguration;

/**
 * Parses the multi-provider configuration schema from a {@link Properties} object.
 * <p>
 * Recognises the following property keys:
 * <pre>
 * ai.provider.active                             – required; must be "openai-compatible" or "claude"
 * ai.provider.openai-compatible.baseUrl          – required for active OpenAI-compatible provider
 * ai.provider.openai-compatible.model            – required for active OpenAI-compatible provider
 * ai.provider.openai-compatible.timeoutSeconds
 * ai.provider.openai-compatible.apiKey
 * ai.provider.claude.baseUrl                     – optional; defaults to https://api.anthropic.com
 * ai.provider.claude.model                       – required for active Claude provider
 * ai.provider.claude.timeoutSeconds
 * ai.provider.claude.apiKey
 * </pre>
 *
 * <h2>Environment-variable precedence for API keys</h2>
 * <ul>
 * <li>{@code OPENAI_COMPATIBLE_API_KEY} overrides {@code ai.provider.openai-compatible.apiKey}</li>
 * <li>{@code ANTHROPIC_API_KEY} overrides {@code ai.provider.claude.apiKey}</li>
 * </ul>
 * Each environment variable is applied only to its own provider family; the variables
 * of different families are never mixed.
 *
 * <h2>Error handling</h2>
 * <ul>
 * <li>If {@code ai.provider.active} is absent or blank, a {@link ConfigurationLoadingException}
 * is thrown.</li>
 * <li>If {@code ai.provider.active} holds an unrecognised value, a
 * {@link ConfigurationLoadingException} is thrown.</li>
 * <li>If a {@code timeoutSeconds} property is present but not a valid integer, a
 * {@link ConfigurationLoadingException} is thrown.</li>
 * <li>Missing optional fields result in {@code null} (String) or {@code 0} (int) stored in
 * the returned record; the validator enforces required fields for the active provider.</li>
 * </ul>
 *
 * <p>The returned {@link MultiProviderConfiguration} is not yet validated. Use
 * {@link MultiProviderConfigurationValidator} after parsing.
 */
public class MultiProviderConfigurationParser {

    /** Property key selecting the active provider family. */
    static final String PROP_ACTIVE_PROVIDER = "ai.provider.active";

    static final String PROP_OPENAI_BASE_URL = "ai.provider.openai-compatible.baseUrl";
    static final String PROP_OPENAI_MODEL = "ai.provider.openai-compatible.model";
    static final String PROP_OPENAI_TIMEOUT = "ai.provider.openai-compatible.timeoutSeconds";
    static final String PROP_OPENAI_API_KEY = "ai.provider.openai-compatible.apiKey";

    static final String PROP_CLAUDE_BASE_URL = "ai.provider.claude.baseUrl";
    static final String PROP_CLAUDE_MODEL = "ai.provider.claude.model";
    static final String PROP_CLAUDE_TIMEOUT = "ai.provider.claude.timeoutSeconds";
    static final String PROP_CLAUDE_API_KEY = "ai.provider.claude.apiKey";

    /** Environment variable for the OpenAI-compatible provider API key. */
    static final String ENV_OPENAI_API_KEY = "OPENAI_COMPATIBLE_API_KEY";

    /**
     * Legacy environment variable for the OpenAI-compatible provider API key.
     * <p>
     * Accepted as a fallback when {@code OPENAI_COMPATIBLE_API_KEY} is not set.
     * Existing installations that set this variable continue to work without change.
     * New installations should prefer {@code OPENAI_COMPATIBLE_API_KEY}.
     */
    static final String ENV_LEGACY_OPENAI_API_KEY = "PDF_UMBENENNER_API_KEY";

    /** Environment variable for the Anthropic Claude provider API key. */
    static final String ENV_CLAUDE_API_KEY = "ANTHROPIC_API_KEY";

    /** Default base URL for the Anthropic Claude provider when not explicitly configured. */
    static final String CLAUDE_DEFAULT_BASE_URL = "https://api.anthropic.com";

    private final Function<String, String> environmentLookup;

    /**
     * Creates a parser that uses the real system environment for API key resolution.
     */
    public MultiProviderConfigurationParser() {
        this(System::getenv);
    }

    /**
     * Creates a parser with a custom environment lookup function.
     * <p>
     * This constructor is intended for testing to allow deterministic control over
     * environment variable values without modifying the real process environment.
     *
     * @param environmentLookup a function that maps environment variable names to their values;
     *     must not be {@code null}
     */
    public MultiProviderConfigurationParser(Function<String, String> environmentLookup) {
        this.environmentLookup = environmentLookup;
    }

    /**
     * Parses the multi-provider configuration from the given properties.
     * <p>
     * The Claude default base URL ({@code https://api.anthropic.com}) is applied when
     * {@code ai.provider.claude.baseUrl} is absent. API keys are resolved with environment
     * variable precedence. The resulting configuration is not yet validated; call
     * {@link MultiProviderConfigurationValidator#validate(MultiProviderConfiguration)} afterward.
     *
     * @param props the properties to parse; must not be {@code null}
     * @return the parsed (but not yet validated) multi-provider configuration
     * @throws ConfigurationLoadingException if {@code ai.provider.active} is absent, blank,
     *     or holds an unrecognised value, or if any present timeout property is not a
     *     valid integer
     */
    public MultiProviderConfiguration parse(Properties props) {
        AiProviderFamily activeFamily = parseActiveProvider(props);
        ProviderConfiguration openAiConfig = parseOpenAiCompatibleConfig(props);
        ProviderConfiguration claudeConfig = parseClaudeConfig(props);
        return new MultiProviderConfiguration(activeFamily, openAiConfig, claudeConfig);
    }

    private AiProviderFamily parseActiveProvider(Properties props) {
        String raw = props.getProperty(PROP_ACTIVE_PROVIDER);
        if (raw == null || raw.isBlank()) {
            throw new ConfigurationLoadingException(
                "Required property missing or blank: " + PROP_ACTIVE_PROVIDER
                + ". Valid values: openai-compatible, claude");
        }
        String trimmed = raw.trim();
        return AiProviderFamily.fromIdentifier(trimmed).orElseThrow(() ->
            new ConfigurationLoadingException(
                "Unknown provider identifier for " + PROP_ACTIVE_PROVIDER + ": '" + trimmed
                + "'. Valid values: openai-compatible, claude"));
    }

    private ProviderConfiguration parseOpenAiCompatibleConfig(Properties props) {
        String model = getOptionalString(props, PROP_OPENAI_MODEL);
        int timeout = parseTimeoutSeconds(props, PROP_OPENAI_TIMEOUT);
        String baseUrl = getOptionalString(props, PROP_OPENAI_BASE_URL);
        String apiKey = resolveOpenAiApiKey(props);
        return new ProviderConfiguration(model, timeout, baseUrl, apiKey);
    }

    private ProviderConfiguration parseClaudeConfig(Properties props) {
        String model = getOptionalString(props, PROP_CLAUDE_MODEL);
        int timeout = parseTimeoutSeconds(props, PROP_CLAUDE_TIMEOUT);
        String baseUrl = getStringOrDefault(props, PROP_CLAUDE_BASE_URL, CLAUDE_DEFAULT_BASE_URL);
        String apiKey = resolveApiKey(props, PROP_CLAUDE_API_KEY, ENV_CLAUDE_API_KEY);
        return new ProviderConfiguration(model, timeout, baseUrl, apiKey);
    }

    /**
     * Returns the trimmed property value, or {@code null} if absent or blank.
     */
    private String getOptionalString(Properties props, String key) {
        String value = props.getProperty(key);
        return (value == null || value.isBlank()) ? null : value.trim();
    }

    /**
     * Returns the trimmed property value, or the {@code defaultValue} if absent or blank.
     */
    private String getStringOrDefault(Properties props, String key, String defaultValue) {
        String value = props.getProperty(key);
        return (value == null || value.isBlank()) ? defaultValue : value.trim();
    }

    /**
     * Parses a timeout property as a positive integer.
     * <p>
     * Returns {@code 0} when the property is absent or blank (indicating "not configured").
     * Throws {@link ConfigurationLoadingException} when the property is present but not
     * parseable as an integer.
     */
    private int parseTimeoutSeconds(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.isBlank()) {
            return 0;
        }
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            throw new ConfigurationLoadingException(
                "Invalid integer value for property " + key + ": '" + value.trim() + "'", e);
        }
    }

    /**
     * Resolves the effective API key for the OpenAI-compatible provider.
     * <p>
     * Resolution order:
     * <ol>
     * <li>{@code OPENAI_COMPATIBLE_API_KEY} environment variable</li>
     * <li>{@code PDF_UMBENENNER_API_KEY} environment variable (legacy fallback;
     * accepted for backward compatibility with existing installations)</li>
     * <li>{@code ai.provider.openai-compatible.apiKey} property</li>
     * </ol>
     *
     * @param props the configuration properties
     * @return the resolved API key; never {@code null}, but may be blank
     */
    private String resolveOpenAiApiKey(Properties props) {
        String primary = environmentLookup.apply(ENV_OPENAI_API_KEY);
        if (primary != null && !primary.isBlank()) {
            return primary.trim();
        }
        String legacy = environmentLookup.apply(ENV_LEGACY_OPENAI_API_KEY);
        if (legacy != null && !legacy.isBlank()) {
            return legacy.trim();
        }
        String propsValue = props.getProperty(PROP_OPENAI_API_KEY);
        return (propsValue != null) ? propsValue.trim() : "";
    }

    /**
     * Resolves the effective API key for a provider family.
     * <p>
     * The environment variable value takes precedence over the properties value.
     * If the environment variable is absent or blank, the properties value is used.
     * If both are absent or blank, an empty string is returned (the validator will
     * reject this for the active provider).
     *
     * @param props the configuration properties
     * @param propertyKey the property key for the API key of this provider family
     * @param envVarName the environment variable name for this provider family
     * @return the resolved API key; never {@code null}, but may be blank
     */
    private String resolveApiKey(Properties props, String propertyKey, String envVarName) {
        String envValue = environmentLookup.apply(envVarName);
        if (envValue != null && !envValue.isBlank()) {
            return envValue.trim();
        }
        String propsValue = props.getProperty(propertyKey);
        return (propsValue != null) ? propsValue.trim() : "";
    }
}
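The precedence chain in `resolveOpenAiApiKey` (primary environment variable, then legacy environment variable, then property) can be sketched with an injected lookup, mirroring how the parser's testing constructor avoids touching the real process environment. The class below is a standalone illustration, not the parser itself:

```java
import java.util.Map;
import java.util.Properties;
import java.util.function.Function;

public class ApiKeyPrecedenceSketch {

    // First non-blank value wins: primary env var, then legacy env var, then the property.
    static String resolve(Function<String, String> env, Properties props) {
        String primary = env.apply("OPENAI_COMPATIBLE_API_KEY");
        if (primary != null && !primary.isBlank()) {
            return primary.trim();
        }
        String legacy = env.apply("PDF_UMBENENNER_API_KEY");
        if (legacy != null && !legacy.isBlank()) {
            return legacy.trim();
        }
        String fromProps = props.getProperty("ai.provider.openai-compatible.apiKey");
        return (fromProps != null) ? fromProps.trim() : "";
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("ai.provider.openai-compatible.apiKey", "from-props");

        // Only the legacy variable is set: it beats the property but not the primary.
        Map<String, String> fakeEnv = Map.of("PDF_UMBENENNER_API_KEY", "from-legacy-env");
        System.out.println(resolve(fakeEnv::get, props)); // prints "from-legacy-env"
    }
}
```

Passing a `Map::get` method reference as the `Function<String, String>` lookup is the same trick the parser's test constructor enables with `System::getenv` swapped out.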
@@ -0,0 +1,132 @@
package de.gecheckt.pdf.umbenenner.adapter.out.configuration;

import java.net.URI;
import java.util.ArrayList;
import java.util.List;

import de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException;
import de.gecheckt.pdf.umbenenner.application.config.provider.AiProviderFamily;
import de.gecheckt.pdf.umbenenner.application.config.provider.MultiProviderConfiguration;
import de.gecheckt.pdf.umbenenner.application.config.provider.ProviderConfiguration;

/**
 * Validates a {@link MultiProviderConfiguration} before the application run begins.
 * <p>
 * Enforces all requirements for the active provider:
 * <ul>
 * <li>{@code ai.provider.active} refers to a recognised provider family.</li>
 * <li>{@code model} is non-blank.</li>
 * <li>{@code timeoutSeconds} is a positive integer.</li>
 * <li>{@code baseUrl} is a syntactically valid absolute URI with scheme {@code http} or
 * {@code https} (required for the OpenAI-compatible family; the Claude family always
 * has a default, but it is validated with the same rules).</li>
 * <li>{@code apiKey} is non-blank after environment-variable precedence has been applied
 * by {@link MultiProviderConfigurationParser}.</li>
 * </ul>
 * Required fields of the <em>inactive</em> provider are intentionally not enforced.
 * <p>
 * Validation errors are aggregated and reported together in a single
 * {@link InvalidStartConfigurationException}.
 */
public class MultiProviderConfigurationValidator {

    /**
     * Validates the given multi-provider configuration.
     * <p>
     * Only the active provider's required fields are validated. The inactive provider's
     * configuration may be incomplete.
     *
     * @param config the configuration to validate; must not be {@code null}
     * @throws InvalidStartConfigurationException if any validation rule fails, with an aggregated
     *     message listing all problems found
     */
    public void validate(MultiProviderConfiguration config) {
        List<String> errors = new ArrayList<>();

        validateActiveProvider(config, errors);

        if (!errors.isEmpty()) {
            throw new InvalidStartConfigurationException(
                "Invalid AI provider configuration:\n" + String.join("\n", errors));
        }
    }

    private void validateActiveProvider(MultiProviderConfiguration config, List<String> errors) {
        AiProviderFamily activeFamily = config.activeProviderFamily();
        if (activeFamily == null) {
            // Parser already throws for missing/unknown ai.provider.active,
            // but guard defensively in case the record is constructed directly in tests.
            errors.add("- ai.provider.active: must be set to a supported provider "
                + "(openai-compatible, claude)");
            return;
        }

        ProviderConfiguration providerConfig = config.activeProviderConfiguration();
        String providerLabel = "ai.provider." + activeFamily.getIdentifier();

        validateModel(providerConfig, providerLabel, errors);
        validateTimeoutSeconds(providerConfig, providerLabel, errors);
        validateBaseUrl(activeFamily, providerConfig, providerLabel, errors);
        validateApiKey(providerConfig, providerLabel, errors);
    }

    private void validateModel(ProviderConfiguration config, String providerLabel, List<String> errors) {
        if (config.model() == null || config.model().isBlank()) {
            errors.add("- " + providerLabel + ".model: must not be blank");
        }
    }

    private void validateTimeoutSeconds(ProviderConfiguration config, String providerLabel,
            List<String> errors) {
        if (config.timeoutSeconds() <= 0) {
            errors.add("- " + providerLabel + ".timeoutSeconds: must be a positive integer, got: "
                + config.timeoutSeconds());
        }
    }

    /**
     * Validates the base URL of the active provider.
     * <p>
     * The URL must be:
     * <ul>
     * <li>non-blank</li>
     * <li>a syntactically valid URI</li>
     * <li>an absolute URI (has a scheme component)</li>
     * <li>using scheme {@code http} or {@code https}</li>
     * </ul>
     * The OpenAI-compatible family requires an explicit base URL.
     * The Claude family always has a default ({@code https://api.anthropic.com}) applied by the
     * parser, so this check serves both as a primary and safety-net enforcement.
     */
    private void validateBaseUrl(AiProviderFamily family, ProviderConfiguration config,
            String providerLabel, List<String> errors) {
        String baseUrl = config.baseUrl();
        if (baseUrl == null || baseUrl.isBlank()) {
            errors.add("- " + providerLabel + ".baseUrl: must not be blank");
            return;
        }
        try {
            URI uri = URI.create(baseUrl);
            if (!uri.isAbsolute()) {
                errors.add("- " + providerLabel + ".baseUrl: must be an absolute URI with http or https scheme, got: '"
                    + baseUrl + "'");
                return;
            }
            String scheme = uri.getScheme();
            if (!"http".equalsIgnoreCase(scheme) && !"https".equalsIgnoreCase(scheme)) {
                errors.add("- " + providerLabel + ".baseUrl: scheme must be http or https, got: '"
                    + scheme + "' in '" + baseUrl + "'");
            }
        } catch (IllegalArgumentException e) {
            errors.add("- " + providerLabel + ".baseUrl: not a valid URI: '" + baseUrl + "' (" + e.getMessage() + ")");
        }
    }

    private void validateApiKey(ProviderConfiguration config, String providerLabel,
            List<String> errors) {
        if (config.apiKey() == null || config.apiKey().isBlank()) {
            errors.add("- " + providerLabel + ".apiKey: must not be blank "
                + "(set via environment variable or properties)");
        }
    }
}
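The aggregate-errors-then-throw style of the validator, and in particular the base-URL check with `java.net.URI`, can be exercised in a self-contained sketch. The class and method names below are illustrative, not the validator's own API:

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class BaseUrlCheckSketch {

    // Returns the list of validation errors; an empty list means the URL is acceptable.
    static List<String> checkBaseUrl(String baseUrl) {
        List<String> errors = new ArrayList<>();
        if (baseUrl == null || baseUrl.isBlank()) {
            errors.add("baseUrl: must not be blank");
            return errors;
        }
        try {
            URI uri = URI.create(baseUrl);
            if (!uri.isAbsolute()) {
                // Relative references like "api/v1" have no scheme component.
                errors.add("baseUrl: must be an absolute URI");
            } else {
                String scheme = uri.getScheme();
                if (!"http".equalsIgnoreCase(scheme) && !"https".equalsIgnoreCase(scheme)) {
                    errors.add("baseUrl: scheme must be http or https");
                }
            }
        } catch (IllegalArgumentException e) {
            // URI.create wraps URISyntaxException in IllegalArgumentException.
            errors.add("baseUrl: not a valid URI: " + e.getMessage());
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(checkBaseUrl("https://api.anthropic.com")); // prints []
        System.out.println(checkBaseUrl("ftp://example.org"));
    }
}
```

Collecting errors into a list instead of throwing on the first failure lets the caller report every configuration problem in one message, which is exactly what `validate` does with `InvalidStartConfigurationException`.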
@@ -1,15 +1,7 @@
package de.gecheckt.pdf.umbenenner.adapter.out.configuration;

import de.gecheckt.pdf.umbenenner.application.config.StartConfiguration;
import de.gecheckt.pdf.umbenenner.application.port.out.ConfigurationPort;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import java.io.IOException;
import java.io.StringReader;
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
@@ -17,15 +9,27 @@ import java.nio.file.Paths;
import java.util.Properties;
import java.util.function.Function;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import de.gecheckt.pdf.umbenenner.application.config.provider.MultiProviderConfiguration;
import de.gecheckt.pdf.umbenenner.application.config.startup.StartConfiguration;
import de.gecheckt.pdf.umbenenner.application.port.out.ConfigurationPort;

/**
 * Properties-based implementation of {@link ConfigurationPort}.
 * AP-005: Loads configuration from config/application.properties with environment variable precedence.
 * <p>
 * Loads configuration from {@code config/application.properties} as the primary source.
 * The multi-provider AI configuration is parsed via {@link MultiProviderConfigurationParser}
 * and validated via {@link MultiProviderConfigurationValidator}. Environment variables
 * for API keys are resolved by the parser with provider-specific precedence rules:
 * {@code OPENAI_COMPATIBLE_API_KEY} for the OpenAI-compatible family and
 * {@code ANTHROPIC_API_KEY} for the Anthropic Claude family.
 */
public class PropertiesConfigurationPortAdapter implements ConfigurationPort {

    private static final Logger LOG = LogManager.getLogger(PropertiesConfigurationPortAdapter.class);
    private static final String DEFAULT_CONFIG_FILE_PATH = "config/application.properties";
    private static final String API_KEY_ENV_VAR = "PDF_UMBENENNER_API_KEY";

    private final Function<String, String> environmentLookup;
    private final Path configFilePath;
@@ -76,33 +80,49 @@ public class PropertiesConfigurationPortAdapter implements ConfigurationPort {

    @Override
    public StartConfiguration loadConfiguration() {
        Properties props = loadPropertiesFile();
        MultiProviderConfiguration multiProviderConfig = parseAndValidateProviders(props);
        boolean logAiSensitive = parseAiContentSensitivity(props);
        return buildStartConfiguration(props, multiProviderConfig, logAiSensitive);
    }

    private Properties loadPropertiesFile() {
        Properties props = new Properties();
        try {
            // Check if file exists first to preserve FileNotFoundException behavior for tests
            if (!Files.exists(configFilePath)) {
                throw new java.io.FileNotFoundException("Config file not found: " + configFilePath);
            }
            // Read file content as string to avoid escape sequence interpretation issues
            String content = Files.readString(configFilePath, StandardCharsets.UTF_8);
            // Escape backslashes to prevent Java Properties from interpreting them as escape sequences
            // This is needed because Windows paths use backslashes (e.g., C:\temp\...)
            // and Java Properties interprets \t as tab, \n as newline, etc.
            String escapedContent = content.replace("\\", "\\\\");
            String escapedContent = escapeBackslashes(content);
            props.load(new StringReader(escapedContent));
        } catch (IOException e) {
            throw new RuntimeException("Failed to load configuration from " + configFilePath, e);
            throw new ConfigurationLoadingException("Failed to load configuration from " + configFilePath, e);
        }
        return props;
    }

        // Apply environment variable precedence for api.key
        String apiKey = getApiKey(props);
    /**
     * Parses and validates the multi-provider AI configuration from the given properties.
     * <p>
     * Uses {@link MultiProviderConfigurationParser} for parsing and
     * {@link MultiProviderConfigurationValidator} for validation. Throws on any
     * configuration error before returning.
     */
    private MultiProviderConfiguration parseAndValidateProviders(Properties props) {
        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(environmentLookup);
        MultiProviderConfiguration config = parser.parse(props);
        new MultiProviderConfigurationValidator().validate(config);
        return config;
    }

    private StartConfiguration buildStartConfiguration(Properties props,
            MultiProviderConfiguration multiProviderConfig,
            boolean logAiSensitive) {
        return new StartConfiguration(
                Paths.get(getRequiredProperty(props, "source.folder")),
                Paths.get(getRequiredProperty(props, "target.folder")),
                Paths.get(getRequiredProperty(props, "sqlite.file")),
                parseUri(getRequiredProperty(props, "api.baseUrl")),
                getRequiredProperty(props, "api.model"),
                parseInt(getRequiredProperty(props, "api.timeoutSeconds")),
                multiProviderConfig,
                parseInt(getRequiredProperty(props, "max.retries.transient")),
                parseInt(getRequiredProperty(props, "max.pages")),
                parseInt(getRequiredProperty(props, "max.text.characters")),
@@ -110,24 +130,21 @@ public class PropertiesConfigurationPortAdapter implements ConfigurationPort {
                Paths.get(getOptionalProperty(props, "runtime.lock.file", "")),
                Paths.get(getOptionalProperty(props, "log.directory", "")),
                getOptionalProperty(props, "log.level", "INFO"),
                apiKey
                logAiSensitive
        );
    }

    private String getApiKey(Properties props) {
        String envApiKey = environmentLookup.apply(API_KEY_ENV_VAR);
        if (envApiKey != null && !envApiKey.isBlank()) {
            LOG.info("Using API key from environment variable {}", API_KEY_ENV_VAR);
            return envApiKey;
        }
        String propsApiKey = props.getProperty("api.key");
        return propsApiKey != null ? propsApiKey : "";
    private String escapeBackslashes(String content) {
        // Escape backslashes to prevent Java Properties from interpreting them as escape sequences.
        // This is needed because Windows paths use backslashes (e.g., C:\temp\...)
        // and Java Properties interprets \t as tab, \n as newline, etc.
        return content.replace("\\", "\\\\");
    }

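The backslash-escaping workaround in `escapeBackslashes` exists because `Properties.load` interprets backslash sequences in values. The underlying behavior can be demonstrated in isolation (a minimal sketch; the `loadProperty` helper is illustrative, not part of the codebase):

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class PropertiesEscapeDemo {

    // Loads a single property value from raw properties-file content.
    static String loadProperty(String fileContent, String key) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(fileContent));
        return props.getProperty(key);
    }

    public static void main(String[] args) throws IOException {
        // The file on disk contains the line: source.folder=C:\temp\docs
        String raw = "source.folder=C:\\temp\\docs";

        // Naive load: Properties turns \t into a TAB character and silently
        // drops the backslash before 'd', corrupting the Windows path.
        System.out.println(loadProperty(raw, "source.folder"));

        // Doubling every backslash first preserves the path verbatim.
        System.out.println(loadProperty(raw.replace("\\", "\\\\"), "source.folder"));
    }
}
```

This is also why the adapter reads the file as a string and escapes it before calling `props.load`, rather than loading the file directly.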
    private String getRequiredProperty(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Required property missing: " + key);
            throw new ConfigurationLoadingException("Required property missing: " + key);
        }
        return normalizePath(value.trim());
    }
@@ -151,15 +168,43 @@ public class PropertiesConfigurationPortAdapter implements ConfigurationPort {
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            throw new IllegalStateException("Invalid integer value for property: " + value, e);
            throw new ConfigurationLoadingException("Invalid integer value for property: " + value, e);
        }
    }

    private URI parseUri(String value) {
        try {
            return new URI(value.trim());
        } catch (URISyntaxException e) {
            throw new IllegalStateException("Invalid URI value for property: " + value, e);
    /**
     * Parses the {@code log.ai.sensitive} configuration property with strict validation.
     * <p>
     * This property controls whether sensitive AI-generated content (raw response, reasoning)
     * may be written to log files. It must be either the literal string "true" or "false"
     * (case-insensitive). Any other value is rejected as an invalid startup configuration.
     * <p>
     * The default value (when the property is absent) is {@code false}, which is the safe default.
     *
     * @return {@code true} if the property is explicitly set to "true", {@code false} otherwise
     * @throws ConfigurationLoadingException if the property is present but contains an invalid value
     */
    private boolean parseAiContentSensitivity(Properties props) {
        String value = props.getProperty("log.ai.sensitive");

        // If absent, return safe default
        if (value == null) {
            return false;
        }

        String trimmedValue = value.trim().toLowerCase();

        // Only accept literal "true" or "false"
        if ("true".equals(trimmedValue)) {
            return true;
        } else if ("false".equals(trimmedValue)) {
            return false;
        } else {
            // Reject any other value as invalid configuration
            throw new ConfigurationLoadingException(
                    "Invalid value for log.ai.sensitive: '" + value + "'. "
                            + "Must be either 'true' or 'false' (case-insensitive). "
                            + "Default is 'false' (sensitive content not logged).");
        }
    }
}
}

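The strict parsing in `parseAiContentSensitivity` deliberately differs from `Boolean.parseBoolean`, which silently maps any non-"true" input (e.g. `"yes"`, a typo'd `"ture"`) to `false`. A minimal sketch of the same strict-parsing idea, using `IllegalArgumentException` in place of the project's `ConfigurationLoadingException` (names here are illustrative):

```java
public class StrictBooleanDemo {

    // Accepts only the literal strings "true" or "false" (case-insensitive, trimmed);
    // absent (null) values fall back to the given default; anything else is rejected.
    static boolean parseStrictBoolean(String value, boolean defaultValue) {
        if (value == null) {
            return defaultValue;
        }
        String v = value.trim().toLowerCase();
        if (v.equals("true")) {
            return true;
        }
        if (v.equals("false")) {
            return false;
        }
        throw new IllegalArgumentException(
                "Expected 'true' or 'false', got: '" + value + "'");
    }

    public static void main(String[] args) {
        System.out.println(parseStrictBoolean(null, false));     // false (safe default)
        System.out.println(parseStrictBoolean(" TRUE ", false)); // true
        // Boolean.parseBoolean("yes") would silently return false;
        // strict parsing rejects the value instead:
        try {
            parseStrictBoolean("yes", false);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast at startup on a malformed flag is preferable here because the flag guards whether sensitive content reaches the logs.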
@@ -1,5 +1,21 @@
/**
 * Configuration adapters for outbound infrastructure access.
 * Contains implementations of configuration loading from external sources.
 * Configuration loading adapters for the bootstrap phase.
 * <p>
 * Contains implementations of the {@link de.gecheckt.pdf.umbenenner.application.port.out.ConfigurationPort}
 * that load the complete {@link de.gecheckt.pdf.umbenenner.application.config.startup.StartConfiguration}
 * from external sources (e.g., properties files, environment variables).
 * <p>
 * Responsibilities:
 * <ul>
 * <li>Load configuration from the properties file (default: config/application.properties)</li>
 * <li>Apply environment variable precedence for sensitive values (e.g., API key)</li>
 * <li>Construct the typed StartConfiguration object with all technical infrastructure parameters</li>
 * </ul>
 * <p>
 * These adapters bridge the outbound port contract with concrete infrastructure
 * (property file parsing, environment variable lookup) without leaking infrastructure details
 * into the application or bootstrap layers. Validation of the loaded configuration is performed
 * separately by the {@link de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.StartConfigurationValidator}
 * in the bootstrap phase.
 */
package de.gecheckt.pdf.umbenenner.adapter.out.configuration;
@@ -0,0 +1,150 @@
package de.gecheckt.pdf.umbenenner.adapter.out.fingerprint;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import de.gecheckt.pdf.umbenenner.application.port.out.FingerprintPort;
import de.gecheckt.pdf.umbenenner.application.port.out.FingerprintResult;
import de.gecheckt.pdf.umbenenner.application.port.out.FingerprintSuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.FingerprintTechnicalError;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

/**
 * SHA-256-based implementation of {@link FingerprintPort}.
 * <p>
 * Computes deterministic, content-based fingerprints for PDF documents by applying
 * SHA-256 to the raw file content. The implementation ensures that:
 * <ul>
 * <li>Fingerprints are derived exclusively from file content, not metadata</li>
 * <li>Identical content always produces the same fingerprint</li>
 * <li>Different content always produces different fingerprints</li>
 * <li>All file I/O and cryptographic operations remain in the adapter layer</li>
 * </ul>
 * <p>
 * <strong>Technical failure handling:</strong> Any I/O errors, path resolution issues,
 * or cryptographic problems are converted to {@link FingerprintTechnicalError} results
 * without throwing exceptions. Pre-fingerprint failures are not historized in SQLite.
 */
public class Sha256FingerprintAdapter implements FingerprintPort {

    private static final Logger logger = LogManager.getLogger(Sha256FingerprintAdapter.class);

    /**
     * Computes the SHA-256 fingerprint for the given candidate.
     * <p>
     * The implementation:
     * <ol>
     * <li>Resolves the opaque locator to a filesystem path</li>
     * <li>Reads the entire file content</li>
     * <li>Applies SHA-256 hashing</li>
     * <li>Returns the hex-encoded result wrapped in a {@link FingerprintSuccess}</li>
     * </ol>
     * <p>
     * Any technical failures during these steps are caught and returned as
     * {@link FingerprintTechnicalError} without throwing exceptions.
     *
     * @param candidate the candidate whose file content is to be hashed; must not be null
     * @return {@link FingerprintSuccess} on success, or {@link FingerprintTechnicalError}
     *         on any infrastructure failure
     * @throws NullPointerException if {@code candidate} is null
     */
    @Override
    public FingerprintResult computeFingerprint(SourceDocumentCandidate candidate) {
        if (candidate == null) {
            throw new NullPointerException("candidate must not be null");
        }

        try {
            // Resolve the opaque locator to a filesystem path
            Path filePath = resolveFilePath(candidate.locator());

            // Compute the SHA-256 hash of the file content
            String sha256Hex = computeSha256Hash(filePath);

            // Create and return the successful result
            DocumentFingerprint fingerprint = new DocumentFingerprint(sha256Hex);
            logger.debug("Successfully computed fingerprint for '{}': {}",
                    candidate.uniqueIdentifier(), sha256Hex);
            return new FingerprintSuccess(fingerprint);

        } catch (IOException e) {
            String errorMsg = String.format("Failed to read file for '%s': %s",
                    candidate.uniqueIdentifier(), e.getMessage());
            logger.warn(errorMsg, e);
            return new FingerprintTechnicalError(errorMsg, e);
        } catch (InvalidPathException e) {
            String errorMsg = String.format("Invalid file path for '%s': %s",
                    candidate.uniqueIdentifier(), e.getMessage());
            logger.warn(errorMsg, e);
            return new FingerprintTechnicalError(errorMsg, e);
        } catch (NoSuchAlgorithmException e) {
            String errorMsg = String.format("SHA-256 algorithm not available for '%s'",
                    candidate.uniqueIdentifier());
            logger.error(errorMsg, e);
            return new FingerprintTechnicalError(errorMsg, e);
        }
    }

    /**
     * Resolves the opaque locator value to a filesystem path.
     * <p>
     * The locator's value is expected to contain an absolute file path as a string.
     * This is the intra-adapter convention between the source document scanner and
     * this fingerprint adapter.
     *
     * @param locator the opaque locator containing the file path; must not be null
     * @return the resolved filesystem path
     * @throws InvalidPathException if the locator value is not a valid path
     */
    private Path resolveFilePath(SourceDocumentLocator locator) throws InvalidPathException {
        return Paths.get(locator.value());
    }

    /**
     * Computes the SHA-256 hash of the file content at the given path.
     * <p>
     * Reads the entire file content and applies SHA-256 hashing to produce
     * a lowercase hexadecimal representation of the digest.
     *
     * @param filePath the path to the file to hash; must not be null
     * @return the lowercase hexadecimal representation of the SHA-256 digest (64 characters)
     * @throws IOException if reading the file fails
     * @throws NoSuchAlgorithmException if the SHA-256 algorithm is not available
     */
    private String computeSha256Hash(Path filePath) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] fileBytes = Files.readAllBytes(filePath);
        byte[] hashBytes = digest.digest(fileBytes);
        return bytesToHex(hashBytes);
    }

    /**
     * Converts a byte array to a lowercase hexadecimal string.
     * <p>
     * Each byte is represented by exactly two hexadecimal characters.
     *
     * @param bytes the byte array to convert; must not be null
     * @return the lowercase hexadecimal representation
     */
    private String bytesToHex(byte[] bytes) {
        StringBuilder hexString = new StringBuilder();
        for (byte b : bytes) {
            String hex = Integer.toHexString(0xff & b);
            if (hex.length() == 1) {
                hexString.append('0');
            }
            hexString.append(hex);
        }
        return hexString.toString();
    }
}
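The hashing-plus-hex-encoding pipeline in `computeSha256Hash`/`bytesToHex` can be reproduced in a few lines. On Java 17+, `java.util.HexFormat` replaces the hand-rolled loop and also produces lowercase hex (a minimal sketch; the class and method names are illustrative):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

public class Sha256HexDemo {

    // SHA-256 digest of the given bytes, as 64 lowercase hex characters.
    static String sha256Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        return HexFormat.of().formatHex(digest.digest(data)); // lowercase, Java 17+
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        String hex = sha256Hex("abc".getBytes(StandardCharsets.UTF_8));
        // "abc" is a standard SHA-256 test vector:
        // ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
        System.out.println(hex);
        System.out.println(hex.length()); // 64
    }
}
```

The determinism property the adapter relies on (same bytes in, same fingerprint out) follows directly from SHA-256 being a pure function of its input.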
@@ -0,0 +1,11 @@
/**
 * SHA-256 fingerprint adapter for computing content-based document fingerprints.
 *
 * <p>This package contains the concrete implementation of the {@link de.gecheckt.pdf.umbenenner.application.port.out.FingerprintPort}
 * that computes SHA-256 hashes of PDF document content to create stable, deterministic fingerprints.
 *
 * <p>All file I/O and cryptographic operations are strictly confined to this adapter layer,
 * maintaining the hexagonal architecture boundary.
 *
 */
package de.gecheckt.pdf.umbenenner.adapter.out.fingerprint;
@@ -1,20 +1,20 @@
package de.gecheckt.pdf.umbenenner.adapter.out.lock;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import de.gecheckt.pdf.umbenenner.application.port.out.RunLockPort;
import de.gecheckt.pdf.umbenenner.application.port.out.RunLockUnavailableException;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/**
 * File-based implementation of {@link RunLockPort} that uses a lock file to prevent concurrent runs.
 * <p>
 * AP-006 Implementation: Creates an exclusive lock file on acquire and deletes it on release.
 * Creates an exclusive lock file on acquire and deletes it on release.
 * If the lock file already exists, {@link #acquire()} throws {@link RunLockUnavailableException}
 * to signal that another instance is already running.
 * <p>

@@ -4,10 +4,10 @@
 * Components:
 * <ul>
 * <li>{@link de.gecheckt.pdf.umbenenner.adapter.out.lock.FilesystemRunLockPortAdapter}
 * — File-based run lock that prevents concurrent instances (AP-006)</li>
 * — File-based run lock that prevents concurrent instances</li>
 * </ul>
 * <p>
 * AP-006: Uses atomic file creation ({@code CREATE_NEW}) to establish an exclusive lock.
 * Implementation details: Uses atomic file creation ({@code CREATE_NEW}) to establish an exclusive lock.
 * Stores the acquiring process PID in the lock file for diagnostics.
 * Release is best-effort and logs a warning on failure without throwing.
 */

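The atomic-create technique described above can be sketched independently of the adapter. `CREATE_NEW` makes creation fail atomically with `FileAlreadyExistsException` when the file exists, which is what makes the lock race-free; the PID is written for diagnostics. (A minimal sketch; the `tryAcquire` helper and temp-directory path are illustrative, not the project's API.)

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class LockFileDemo {

    // Atomically creates the lock file, storing this process's PID for diagnostics.
    // Returns false when the lock file already exists (another instance holds it).
    static boolean tryAcquire(Path lockFile) throws IOException {
        try {
            byte[] pid = String.valueOf(ProcessHandle.current().pid()).getBytes();
            Files.write(lockFile, pid,
                    StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // lock held by another (or a crashed) instance
        }
    }

    public static void main(String[] args) throws IOException {
        Path lock = Files.createTempDirectory("lockdemo").resolve("run.lock");
        System.out.println(tryAcquire(lock)); // true: first acquire succeeds
        System.out.println(tryAcquire(lock)); // false: lock already held
        Files.delete(lock);                   // release (best-effort in the adapter)
    }
}
```

A check-then-create sequence (`Files.exists` followed by a plain write) would be racy between two starting instances; `CREATE_NEW` collapses the check and the create into one filesystem operation.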
@@ -1,24 +1,25 @@
package de.gecheckt.pdf.umbenenner.adapter.out.pdfextraction;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Objects;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

import de.gecheckt.pdf.umbenenner.application.port.out.PdfTextExtractionPort;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionResult;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionSuccess;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError;
import de.gecheckt.pdf.umbenenner.domain.model.PdfPageCount;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Objects;

/**
 * PDFBox-based implementation of {@link PdfTextExtractionPort}.
 * <p>
 * AP-003 Implementation: Extracts text content and page count from a single PDF document
 * Extracts text content and page count from a single PDF document
 * using Apache PDFBox. All technical problems during extraction are reported as
 * {@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError}.
 * <p>
@@ -28,7 +29,7 @@ import java.util.Objects;
 * <li>Extracts complete text from all pages (may be empty)</li>
 * <li>Counts total page count</li>
 * <li>Returns results as typed {@link PdfExtractionResult} (no exceptions thrown)</li>
 * <li>All extraction failures are treated as technical errors (AP-003 scope)</li>
 * <li>All extraction failures are treated as technical errors</li>
 * <li>PDFBox is encapsulated and never exposed beyond this adapter</li>
 * </ul>
 * <p>
@@ -40,7 +41,7 @@ import java.util.Objects;
 * <li>All three values are combined into {@link PdfExtractionSuccess}</li>
 * </ul>
 * <p>
 * Technical error cases (AP-003):
 * Technical error cases:
 * <ul>
 * <li>File not found or unreadable</li>
 * <li>PDF cannot be loaded by PDFBox (any load error)</li>
@@ -48,14 +49,12 @@ import java.util.Objects;
 * <li>Text extraction fails or throws exception</li>
 * </ul>
 * <p>
 * Non-goals (handled in later APs):
 * Out of scope (handled elsewhere):
 * <ul>
 * <li>Domain-level assessment of the extracted text (AP-004)</li>
 * <li>Page limit checking (AP-004)</li>
 * <li>Domain-level assessment of the extracted text</li>
 * <li>Page limit checking</li>
 * <li>Text normalization or preprocessing</li>
 * </ul>
 *
 * @since M3-AP-003
 */
public class PdfTextExtractionPortAdapter implements PdfTextExtractionPort {

@@ -67,8 +66,8 @@ public class PdfTextExtractionPortAdapter implements PdfTextExtractionPort {
     * <p>
     * The locator is expected to contain an absolute file path as a String (adapter-internal convention).
     * <p>
     * In M3-AP-003, all technical problems are reported as {@link PdfExtractionTechnicalError}.
     * Domain-level assessments like "text is not usable" are deferred to AP-004.
     * All technical problems are reported as {@link PdfExtractionTechnicalError}.
     * Domain-level assessments like "text is not usable" are handled elsewhere.
     *
     * @param candidate the document to extract; must be non-null
     * @return a {@link PdfExtractionResult} encoding the outcome:
@@ -103,7 +102,7 @@ public class PdfTextExtractionPortAdapter implements PdfTextExtractionPort {
        try {
            int pageCount = document.getNumberOfPages();

            // AP-003: Handle case of zero pages as technical error
            // Handle case of zero pages as technical error
            // (PdfPageCount requires >= 1, so this is a constraint violation)
            if (pageCount < 1) {
                return new PdfExtractionTechnicalError(
@@ -112,12 +111,12 @@ public class PdfTextExtractionPortAdapter implements PdfTextExtractionPort {
            }

            // Extract text from all pages
            // Note: extractedText may be empty string, which is valid in M3 (no domain-level validation here)
            // Note: extractedText may be empty string, which is valid (no domain-level validation here)
            PDFTextStripper textStripper = new PDFTextStripper();
            String extractedText = textStripper.getText(document);

            // Success: return extracted text and page count
            // (Empty text is not an error in AP-003; domain-level validation is AP-004)
            // (Empty text is not an error; domain-level validation is handled elsewhere)
            PdfPageCount pageCountTyped = new PdfPageCount(pageCount);
            return new PdfExtractionSuccess(extractedText, pageCountTyped);
        } finally {
@@ -125,7 +124,7 @@ public class PdfTextExtractionPortAdapter implements PdfTextExtractionPort {
        }

        } catch (IOException e) {
            // All I/O and PDFBox loading/parsing errors are technical errors
            // All I/O and PDFBox loading/parsing errors were treated as technical errors in AP-003
            String errorMessage = e.getMessage() != null ? e.getMessage() : e.toString();
            return new PdfExtractionTechnicalError(
                    "Failed to load or parse PDF: " + errorMessage,

@@ -1,11 +1,11 @@
/**
 * PDFBox-based adapter for PDF text extraction.
 * <p>
 * <strong>M3-AP-003:</strong> This package contains the sole implementation
 * This package contains the sole implementation
 * of {@link de.gecheckt.pdf.umbenenner.application.port.out.PdfTextExtractionPort},
 * using Apache PDFBox to extract text and page count from PDF documents.
 * <p>
 * <strong>Scope (AP-003):</strong>
 * <strong>Scope:</strong>
 * <ul>
 * <li>Pure technical extraction: read PDF, extract text, count pages</li>
 * <li>All extraction problems (file not found, PDF unreadable, PDFBox errors) → {@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError}</li>
@@ -14,21 +14,18 @@
 * <li>Results always typed as {@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionResult}, never exceptions</li>
 * </ul>
 * <p>
 * <strong>Restriction:</strong>
 * <strong>Result types used:</strong>
 * <ul>
 * <li>{@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionContentError} is reserved for later APs</li>
 * <li>AP-003 adapter uses only {@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionSuccess} and
 *     {@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError}</li>
 * <li>{@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionSuccess} for successful text extraction</li>
 * <li>{@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError} for technical problems</li>
 * </ul>
 * <p>
 * <strong>Out of scope (handled in later APs):</strong>
 * <strong>Out of scope:</strong>
 * <ul>
 * <li>Text validation or quality assessment (AP-004)</li>
 * <li>Page limit checking (AP-004)</li>
 * <li>Text validation or quality assessment</li>
 * <li>Page limit checking</li>
 * <li>Text normalization or preprocessing</li>
 * <li>Domain-level assessment of extracted content</li>
 * </ul>
 *
 * @since M3-AP-003
 */
package de.gecheckt.pdf.umbenenner.adapter.out.pdfextraction;

@@ -0,0 +1,97 @@
|
||||
package de.gecheckt.pdf.umbenenner.adapter.out.prompt;
|
||||
|
||||
import java.io.IOException;
|
||||
import java.nio.charset.StandardCharsets;
|
||||
import java.nio.file.Files;
|
||||
import java.nio.file.Path;
|
||||
import java.util.Objects;
|
||||
|
||||
import org.apache.logging.log4j.LogManager;
|
||||
import org.apache.logging.log4j.Logger;
|
||||
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.PromptLoadingFailure;
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.PromptLoadingResult;
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.PromptLoadingSuccess;
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.PromptPort;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.PromptIdentifier;
|
||||
|
||||
/**
|
||||
* Filesystem-based implementation of {@link PromptPort}.
|
||||
* <p>
|
||||
* Loads prompt templates from an external file on disk and derives a stable identifier
|
||||
* from the filename. Ensures that empty or technically unusable prompts are rejected.
|
||||
* <p>
|
||||
* <strong>Identifier derivation:</strong>
|
||||
* The stable prompt identifier is derived from the filename of the prompt file.
|
||||
* This ensures deterministic, reproducible identification across batch runs.
|
||||
* For example, a prompt file named {@code "prompt_de_v2.txt"} receives the identifier
|
||||
* {@code "prompt_de_v2.txt"}.
|
||||
* <p>
|
||||
* <strong>Content validation:</strong>
|
||||
* After loading, the prompt content is trimmed and validated to ensure it is not empty.
|
||||
* An empty prompt (or one containing only whitespace) is considered technically unusable
|
||||
* and results in a {@link PromptLoadingFailure}.
|
||||
* <p>
|
||||
* <strong>Error handling:</strong>
|
||||
* All technical failures (file not found, I/O errors, permission issues) are caught
|
||||
* and returned as {@link PromptLoadingFailure} rather than thrown as exceptions.
|
||||
*/
|
||||
public class FilesystemPromptPortAdapter implements PromptPort {

    private static final Logger LOG = LogManager.getLogger(FilesystemPromptPortAdapter.class);

    private final Path promptFilePath;

    /**
     * Creates the adapter with the configured prompt file path.
     *
     * @param promptFilePath the path to the prompt template file; must not be null
     * @throws NullPointerException if promptFilePath is null
     */
    public FilesystemPromptPortAdapter(Path promptFilePath) {
        this.promptFilePath = Objects.requireNonNull(promptFilePath, "promptFilePath must not be null");
    }

    @Override
    public PromptLoadingResult loadPrompt() {
        try {
            if (!Files.exists(promptFilePath)) {
                return new PromptLoadingFailure(
                        "FILE_NOT_FOUND",
                        "Prompt file not found at: " + promptFilePath);
            }

            String content = Files.readString(promptFilePath, StandardCharsets.UTF_8);
            String trimmedContent = content.trim();

            if (trimmedContent.isEmpty()) {
                return new PromptLoadingFailure(
                        "EMPTY_CONTENT",
                        "Prompt file is empty or contains only whitespace: " + promptFilePath);
            }

            PromptIdentifier identifier = deriveIdentifier();
            LOG.debug("Prompt loaded successfully from {}", promptFilePath);
            return new PromptLoadingSuccess(identifier, trimmedContent);

        } catch (IOException e) {
            LOG.error("Failed to load prompt file: {}", promptFilePath, e);
            return new PromptLoadingFailure(
                    "IO_ERROR",
                    "Failed to read prompt file: " + e.getMessage());
        }
    }

    /**
     * Derives a stable prompt identifier from the filename.
     * <p>
     * The identifier is simply the filename (without the directory path).
     * This ensures that the same prompt file always receives the same identifier.
     *
     * @return a stable PromptIdentifier based on the filename
     */
    private PromptIdentifier deriveIdentifier() {
        String filename = promptFilePath.getFileName().toString();
        return new PromptIdentifier(filename);
    }
}
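The adapter above reduces prompt loading to three outcomes: missing file, whitespace-only file, or trimmed content. A minimal standalone sketch of that rule, with the project's result types (`PromptLoadingFailure`/`PromptLoadingSuccess`) collapsed to a plain nullable `String` for illustration — the class and method names here are hypothetical, not part of the repository:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class PromptLoadingSketch {

    // Mirrors the adapter's validation: a prompt file must exist and contain
    // non-whitespace content; the usable prompt is the trimmed text.
    // null stands in for both FILE_NOT_FOUND and EMPTY_CONTENT failures.
    static String loadTrimmedPrompt(Path promptFilePath) throws IOException {
        if (!Files.exists(promptFilePath)) {
            return null; // the real adapter returns PromptLoadingFailure("FILE_NOT_FOUND", ...)
        }
        String trimmed = Files.readString(promptFilePath, StandardCharsets.UTF_8).trim();
        return trimmed.isEmpty() ? null : trimmed; // "EMPTY_CONTENT" collapses to null here
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("prompt", ".txt");
        Files.writeString(tmp, "  Rename this PDF.  \n");
        System.out.println(loadTrimmedPrompt(tmp)); // Rename this PDF.
        Files.delete(tmp);
    }
}
```

Note that trimming happens once at load time, so downstream consumers never have to re-validate whitespace.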
@@ -0,0 +1,11 @@
/**
 * Adapters for external prompt template loading.
 * <p>
 * This package provides concrete implementations of the {@link de.gecheckt.pdf.umbenenner.application.port.out.PromptPort}
 * interface. These adapters handle all technical details of locating, loading, and validating prompt templates
 * from external sources (typically filesystem files).
 * <p>
 * Prompt files are never embedded in code. They are loaded at runtime, assigned stable identifiers for
 * traceability, and validated to ensure they are not empty or technically unusable.
 */
package de.gecheckt.pdf.umbenenner.adapter.out.prompt;
@@ -1,20 +1,20 @@
package de.gecheckt.pdf.umbenenner.adapter.out.sourcedocument;

import de.gecheckt.pdf.umbenenner.application.port.out.SourceDocumentAccessException;
import de.gecheckt.pdf.umbenenner.application.port.out.SourceDocumentCandidatesPort;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

import de.gecheckt.pdf.umbenenner.application.port.out.SourceDocumentAccessException;
import de.gecheckt.pdf.umbenenner.application.port.out.SourceDocumentCandidatesPort;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

/**
 * File-system based implementation of {@link SourceDocumentCandidatesPort}.
 * <p>
 * AP-002 Implementation: Scans a configured source folder and returns only PDF files
 * Scans a configured source folder and returns only PDF files
 * (by extension) as {@link SourceDocumentCandidate} objects.
 * <p>
 * Design:
@@ -29,13 +29,11 @@ import java.util.stream.Stream;
 * <p>
 * Non-goals:
 * <ul>
 * <li>No PDF validation (that is AP-003)</li>
 * <li>No PDF structure validation</li>
 * <li>No recursion into subdirectories</li>
 * <li>No content evaluation (that happens in AP-004: brauchbarer Text (usable text) assessment)</li>
 * <li>No content evaluation (text usability is assessed during document processing)</li>
 * <li>No fachlich (domain-level) evaluation of candidates</li>
 * </ul>
 *
 * @since M3-AP-002
 */
public class SourceDocumentCandidatesPortAdapter implements SourceDocumentCandidatesPort {

@@ -1,12 +1,10 @@
/**
 * Source document adapters for discovering and accessing PDF candidates.
 * <p>
 * M3-AP-002 implementations:
 * Implementations:
 * <ul>
 * <li>{@link de.gecheckt.pdf.umbenenner.adapter.out.sourcedocument.SourceDocumentCandidatesPortAdapter}
 * — File-system based discovery of PDF candidates from the source folder</li>
 * </ul>
 *
 * @since M3-AP-002
 */
package de.gecheckt.pdf.umbenenner.adapter.out.sourcedocument;

@@ -0,0 +1,321 @@
package de.gecheckt.pdf.umbenenner.adapter.out.sqlite;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.time.Instant;
import java.util.Objects;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import de.gecheckt.pdf.umbenenner.application.port.out.DocumentKnownProcessable;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentPersistenceException;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentRecord;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentRecordLookupResult;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentRecordRepository;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentTerminalFinalFailure;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentTerminalSuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentUnknown;
import de.gecheckt.pdf.umbenenner.application.port.out.FailureCounters;
import de.gecheckt.pdf.umbenenner.application.port.out.PersistenceLookupTechnicalFailure;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

/**
 * SQLite implementation of {@link DocumentRecordRepository}.
 * <p>
 * Provides CRUD operations for the document master record (Dokument-Stammsatz)
 * with explicit mapping between application types and the SQLite schema.
 * <p>
 * <strong>Architecture boundary:</strong> All JDBC and SQLite details are strictly
 * confined to this class. No JDBC types appear in the port interface or in any
 * application/domain type.
 */
public class SqliteDocumentRecordRepositoryAdapter implements DocumentRecordRepository {

    private static final Logger logger = LogManager.getLogger(SqliteDocumentRecordRepositoryAdapter.class);

    private final String jdbcUrl;

    /**
     * Constructs the adapter with the JDBC URL of the SQLite database file.
     *
     * @param jdbcUrl the JDBC URL of the SQLite database; must not be null or blank
     * @throws NullPointerException if {@code jdbcUrl} is null
     * @throws IllegalArgumentException if {@code jdbcUrl} is blank
     */
    public SqliteDocumentRecordRepositoryAdapter(String jdbcUrl) {
        Objects.requireNonNull(jdbcUrl, "jdbcUrl must not be null");
        if (jdbcUrl.isBlank()) {
            throw new IllegalArgumentException("jdbcUrl must not be blank");
        }
        this.jdbcUrl = jdbcUrl;
    }

    /**
     * Looks up the master record for the given fingerprint.
     * <p>
     * Returns a {@link DocumentRecordLookupResult} that encodes all possible outcomes
     * including technical failures; apart from argument validation, this method never throws.
     *
     * @param fingerprint the content-based document identity to look up; must not be null
     * @return {@link DocumentUnknown} if no record exists,
     *         {@link DocumentKnownProcessable} if the document is known but not terminal,
     *         {@link DocumentTerminalSuccess} if the document succeeded,
     *         {@link DocumentTerminalFinalFailure} if the document finally failed, or
     *         {@link PersistenceLookupTechnicalFailure} if the lookup itself failed
     */
    @Override
    public DocumentRecordLookupResult findByFingerprint(DocumentFingerprint fingerprint) {
        if (fingerprint == null) {
            throw new NullPointerException("fingerprint must not be null");
        }

        String sql = """
                SELECT
                    last_known_source_locator,
                    last_known_source_file_name,
                    overall_status,
                    content_error_count,
                    transient_error_count,
                    last_failure_instant,
                    last_success_instant,
                    created_at,
                    updated_at,
                    last_target_path,
                    last_target_file_name
                FROM document_record
                WHERE fingerprint = ?
                """;

        try (Connection connection = getConnection();
             PreparedStatement statement = connection.prepareStatement(sql)) {

            statement.setString(1, fingerprint.sha256Hex());

            try (ResultSet rs = statement.executeQuery()) {
                if (rs.next()) {
                    // Document exists - map to appropriate result type based on status
                    DocumentRecord record = mapResultSetToDocumentRecord(rs, fingerprint);

                    return switch (record.overallStatus()) {
                        case SUCCESS -> new DocumentTerminalSuccess(record);
                        case FAILED_FINAL -> new DocumentTerminalFinalFailure(record);
                        case READY_FOR_AI, PROPOSAL_READY, PROCESSING, FAILED_RETRYABLE,
                             SKIPPED_ALREADY_PROCESSED, SKIPPED_FINAL_FAILURE ->
                                new DocumentKnownProcessable(record);
                    };
                } else {
                    // Document not found
                    return new DocumentUnknown();
                }
            }

        } catch (SQLException e) {
            String message = "Failed to lookup document record for fingerprint '" +
                    fingerprint.sha256Hex() + "': " + e.getMessage();
            logger.error(message, e);
            return new PersistenceLookupTechnicalFailure(message, e);
        }
    }

    /**
     * Persists a new master record for a previously unknown document.
     * <p>
     * The fingerprint within {@code record} must not yet exist in the persistence store.
     *
     * @param record the new master record to persist; must not be null
     * @throws DocumentPersistenceException if the insert fails due to a technical error
     */
    @Override
    public void create(DocumentRecord record) {
        if (record == null) {
            throw new NullPointerException("record must not be null");
        }

        String sql = """
                INSERT INTO document_record (
                    fingerprint,
                    last_known_source_locator,
                    last_known_source_file_name,
                    overall_status,
                    content_error_count,
                    transient_error_count,
                    last_failure_instant,
                    last_success_instant,
                    created_at,
                    updated_at,
                    last_target_path,
                    last_target_file_name
                ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
                """;

        try (Connection connection = getConnection();
             PreparedStatement statement = connection.prepareStatement(sql)) {

            statement.setString(1, record.fingerprint().sha256Hex());
            statement.setString(2, record.lastKnownSourceLocator().value());
            statement.setString(3, record.lastKnownSourceFileName());
            statement.setString(4, record.overallStatus().name());
            statement.setInt(5, record.failureCounters().contentErrorCount());
            statement.setInt(6, record.failureCounters().transientErrorCount());
            statement.setString(7, instantToString(record.lastFailureInstant()));
            statement.setString(8, instantToString(record.lastSuccessInstant()));
            statement.setString(9, instantToString(record.createdAt()));
            statement.setString(10, instantToString(record.updatedAt()));
            statement.setString(11, record.lastTargetPath());
            statement.setString(12, record.lastTargetFileName());

            int rowsAffected = statement.executeUpdate();
            if (rowsAffected != 1) {
                throw new DocumentPersistenceException(
                        "Expected to insert 1 row but affected " + rowsAffected + " rows");
            }

            logger.debug("Created document record for fingerprint: {}", record.fingerprint().sha256Hex());

        } catch (SQLException e) {
            String message = "Failed to create document record for fingerprint '" +
                    record.fingerprint().sha256Hex() + "': " + e.getMessage();
            logger.error(message, e);
            throw new DocumentPersistenceException(message, e);
        }
    }

    /**
     * Updates the mutable fields of an existing master record.
     * <p>
     * The record is identified by its {@link DocumentFingerprint}; the fingerprint
     * itself is never changed. Mutable fields include the overall status, failure
     * counters, last known source location, and all timestamp fields.
     *
     * @param record the updated master record; must not be null; fingerprint must exist
     * @throws DocumentPersistenceException if the update fails due to a technical error
     */
    @Override
    public void update(DocumentRecord record) {
        if (record == null) {
            throw new NullPointerException("record must not be null");
        }

        String sql = """
                UPDATE document_record SET
                    last_known_source_locator = ?,
                    last_known_source_file_name = ?,
                    overall_status = ?,
                    content_error_count = ?,
                    transient_error_count = ?,
                    last_failure_instant = ?,
                    last_success_instant = ?,
                    updated_at = ?,
                    last_target_path = ?,
                    last_target_file_name = ?
                WHERE fingerprint = ?
                """;

        try (Connection connection = getConnection();
             PreparedStatement statement = connection.prepareStatement(sql)) {

            statement.setString(1, record.lastKnownSourceLocator().value());
            statement.setString(2, record.lastKnownSourceFileName());
            statement.setString(3, record.overallStatus().name());
            statement.setInt(4, record.failureCounters().contentErrorCount());
            statement.setInt(5, record.failureCounters().transientErrorCount());
            statement.setString(6, instantToString(record.lastFailureInstant()));
            statement.setString(7, instantToString(record.lastSuccessInstant()));
            statement.setString(8, instantToString(record.updatedAt()));
            statement.setString(9, record.lastTargetPath());
            statement.setString(10, record.lastTargetFileName());
            statement.setString(11, record.fingerprint().sha256Hex());

            int rowsAffected = statement.executeUpdate();
            if (rowsAffected != 1) {
                throw new DocumentPersistenceException(
                        "Expected to update 1 row but affected " + rowsAffected + " rows. " +
                        "Fingerprint may not exist: " + record.fingerprint().sha256Hex());
            }

            logger.debug("Updated document record for fingerprint: {}", record.fingerprint().sha256Hex());

        } catch (SQLException e) {
            String message = "Failed to update document record for fingerprint '" +
                    record.fingerprint().sha256Hex() + "': " + e.getMessage();
            logger.error(message, e);
            throw new DocumentPersistenceException(message, e);
        }
    }

    /**
     * Maps a ResultSet row to a DocumentRecord.
     *
     * @param rs the ResultSet positioned at the current row
     * @param fingerprint the fingerprint for this record
     * @return the mapped DocumentRecord
     * @throws SQLException if reading from the ResultSet fails
     */
    private DocumentRecord mapResultSetToDocumentRecord(ResultSet rs, DocumentFingerprint fingerprint) throws SQLException {
        return new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator(rs.getString("last_known_source_locator")),
                rs.getString("last_known_source_file_name"),
                ProcessingStatus.valueOf(rs.getString("overall_status")),
                new FailureCounters(
                        rs.getInt("content_error_count"),
                        rs.getInt("transient_error_count")
                ),
                stringToInstant(rs.getString("last_failure_instant")),
                stringToInstant(rs.getString("last_success_instant")),
                stringToInstant(rs.getString("created_at")),
                stringToInstant(rs.getString("updated_at")),
                rs.getString("last_target_path"),
                rs.getString("last_target_file_name")
        );
    }

    /**
     * Converts an Instant to a string representation for storage.
     *
     * @param instant the instant to convert, may be null
     * @return the ISO-8601 string representation, or null if instant is null
     */
    private String instantToString(Instant instant) {
        return instant != null ? instant.toString() : null;
    }

    /**
     * Converts a string representation back to an Instant.
     *
     * @param stringValue the ISO-8601 string representation, may be null
     * @return the parsed Instant, or null if stringValue is null or blank
     */
    private Instant stringToInstant(String stringValue) {
        return stringValue != null && !stringValue.isBlank() ? Instant.parse(stringValue) : null;
    }

    /**
     * Returns the JDBC URL this adapter uses to connect to the SQLite database.
     * <p>
     * Intended for logging and diagnostics only.
     *
     * @return the JDBC URL; never null or blank
     */
    public String getJdbcUrl() {
        return jdbcUrl;
    }

    /**
     * Gets a connection to the database.
     * <p>
     * This method can be overridden by subclasses to provide a shared connection.
     *
     * @return a new database connection
     * @throws SQLException if the connection cannot be established
     */
    protected Connection getConnection() throws SQLException {
        return DriverManager.getConnection(jdbcUrl);
    }
}
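The repository above stores timestamps as ISO-8601 text columns and relies on `instantToString`/`stringToInstant` being exact inverses, with `null` surviving the round trip. A self-contained sketch of that mapping (class name here is illustrative, not from the repository):

```java
import java.time.Instant;

public class InstantMapping {

    // Mirrors the adapter's helpers: Instants are stored as ISO-8601 text,
    // null-safe in both directions.
    static String instantToString(Instant instant) {
        return instant != null ? instant.toString() : null;
    }

    static Instant stringToInstant(String stringValue) {
        return stringValue != null && !stringValue.isBlank() ? Instant.parse(stringValue) : null;
    }

    public static void main(String[] args) {
        Instant instant = Instant.parse("2024-05-01T12:00:00Z");
        String stored = instantToString(instant);
        System.out.println(stored);                               // 2024-05-01T12:00:00Z
        System.out.println(instant.equals(stringToInstant(stored))); // true
        System.out.println(instantToString(null));                // null
    }
}
```

Because `Instant.toString()` always emits UTC with a `Z` suffix, lexicographic ordering of the stored strings matches chronological ordering, which keeps `ORDER BY` on these columns meaningful in SQLite.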
@@ -0,0 +1,373 @@
package de.gecheckt.pdf.umbenenner.adapter.out.sqlite;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.Types;
import java.time.Instant;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import de.gecheckt.pdf.umbenenner.application.port.out.DocumentPersistenceException;
import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingAttempt;
import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingAttemptRepository;
import de.gecheckt.pdf.umbenenner.domain.model.DateSource;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
import de.gecheckt.pdf.umbenenner.domain.model.RunId;

/**
 * SQLite implementation of {@link ProcessingAttemptRepository}.
 * <p>
 * Provides CRUD operations for the processing attempt history (Versuchshistorie)
 * including all AI traceability fields added during schema evolution.
 * <p>
 * <strong>Schema compatibility:</strong> This adapter writes all columns including
 * the AI traceability columns and the provider-identifier column ({@code ai_provider}).
 * When reading rows that were written before schema evolution, those columns contain
 * {@code NULL} and are mapped to {@code null} in the Java record.
 * <p>
 * <strong>Architecture boundary:</strong> All JDBC and SQLite details are strictly
 * confined to this class. No JDBC types appear in the port interface or in any
 * application/domain type.
 */
public class SqliteProcessingAttemptRepositoryAdapter implements ProcessingAttemptRepository {

    private static final Logger logger = LogManager.getLogger(SqliteProcessingAttemptRepositoryAdapter.class);

    private final String jdbcUrl;

    private static final String PRAGMA_FOREIGN_KEYS_ON = "PRAGMA foreign_keys = ON";

    /**
     * Constructs the adapter with the JDBC URL of the SQLite database file.
     *
     * @param jdbcUrl the JDBC URL of the SQLite database; must not be null or blank
     * @throws NullPointerException if {@code jdbcUrl} is null
     * @throws IllegalArgumentException if {@code jdbcUrl} is blank
     */
    public SqliteProcessingAttemptRepositoryAdapter(String jdbcUrl) {
        Objects.requireNonNull(jdbcUrl, "jdbcUrl must not be null");
        if (jdbcUrl.isBlank()) {
            throw new IllegalArgumentException("jdbcUrl must not be blank");
        }
        this.jdbcUrl = jdbcUrl;
    }

    /**
     * Returns the attempt number to assign to the <em>next</em> attempt for the given
     * fingerprint.
     * <p>
     * If no prior attempts exist for the fingerprint, returns 1.
     * Otherwise returns the current maximum attempt number plus 1.
     *
     * @param fingerprint the document identity; must not be null
     * @return the next monotonic attempt number; always >= 1
     * @throws DocumentPersistenceException if the query fails due to a technical error
     */
    @Override
    public int loadNextAttemptNumber(DocumentFingerprint fingerprint) {
        Objects.requireNonNull(fingerprint, "fingerprint must not be null");

        String sql = """
                SELECT COALESCE(MAX(attempt_number), 0) + 1 AS next_attempt_number
                FROM processing_attempt
                WHERE fingerprint = ?
                """;

        try (Connection connection = getConnection();
             PreparedStatement statement = connection.prepareStatement(sql)) {

            try (Statement pragmaStmt = connection.createStatement()) {
                pragmaStmt.execute(PRAGMA_FOREIGN_KEYS_ON);
            }

            statement.setString(1, fingerprint.sha256Hex());

            try (ResultSet rs = statement.executeQuery()) {
                if (rs.next()) {
                    return rs.getInt("next_attempt_number");
                } else {
                    return 1;
                }
            }

        } catch (SQLException e) {
            String message = "Failed to load next attempt number for fingerprint '"
                    + fingerprint.sha256Hex() + "': " + e.getMessage();
            logger.error(message, e);
            throw new DocumentPersistenceException(message, e);
        }
    }

    /**
     * Persists exactly one processing attempt record including all AI traceability fields.
     *
     * @param attempt the attempt to persist; must not be null
     * @throws DocumentPersistenceException if the insert fails due to a technical error
     */
    @Override
    public void save(ProcessingAttempt attempt) {
        Objects.requireNonNull(attempt, "attempt must not be null");

        String sql = """
                INSERT INTO processing_attempt (
                    fingerprint,
                    run_id,
                    attempt_number,
                    started_at,
                    ended_at,
                    status,
                    failure_class,
                    failure_message,
                    retryable,
                    ai_provider,
                    model_name,
                    prompt_identifier,
                    processed_page_count,
                    sent_character_count,
                    ai_raw_response,
                    ai_reasoning,
                    resolved_date,
                    date_source,
                    validated_title,
                    final_target_file_name
                ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
                """;

        try (Connection connection = getConnection();
             Statement pragmaStmt = connection.createStatement();
             PreparedStatement statement = connection.prepareStatement(sql)) {

            pragmaStmt.execute(PRAGMA_FOREIGN_KEYS_ON);

            statement.setString(1, attempt.fingerprint().sha256Hex());
            statement.setString(2, attempt.runId().value());
            statement.setInt(3, attempt.attemptNumber());
            statement.setString(4, attempt.startedAt().toString());
            statement.setString(5, attempt.endedAt().toString());
            statement.setString(6, attempt.status().name());
            setNullableString(statement, 7, attempt.failureClass());
            setNullableString(statement, 8, attempt.failureMessage());
            statement.setBoolean(9, attempt.retryable());
            // AI provider identifier and AI traceability fields
            setNullableString(statement, 10, attempt.aiProvider());
            setNullableString(statement, 11, attempt.modelName());
            setNullableString(statement, 12, attempt.promptIdentifier());
            setNullableInteger(statement, 13, attempt.processedPageCount());
            setNullableInteger(statement, 14, attempt.sentCharacterCount());
            setNullableString(statement, 15, attempt.aiRawResponse());
            setNullableString(statement, 16, attempt.aiReasoning());
            setNullableString(statement, 17,
                    attempt.resolvedDate() != null ? attempt.resolvedDate().toString() : null);
            setNullableString(statement, 18,
                    attempt.dateSource() != null ? attempt.dateSource().name() : null);
            setNullableString(statement, 19, attempt.validatedTitle());
            setNullableString(statement, 20, attempt.finalTargetFileName());

            int rowsAffected = statement.executeUpdate();
            if (rowsAffected != 1) {
                throw new DocumentPersistenceException(
                        "Expected to insert 1 row but affected " + rowsAffected + " rows");
            }

            logger.debug("Saved processing attempt #{} for fingerprint: {}",
                    attempt.attemptNumber(), attempt.fingerprint().sha256Hex());

        } catch (SQLException e) {
            String message = "Failed to save processing attempt #" + attempt.attemptNumber()
                    + " for fingerprint '" + attempt.fingerprint().sha256Hex() + "': " + e.getMessage();
            logger.error(message, e);
            throw new DocumentPersistenceException(message, e);
        }
    }

    /**
     * Returns all historised attempts for the given fingerprint, ordered by
     * {@link ProcessingAttempt#attemptNumber()} ascending.
     *
     * @param fingerprint the document identity; must not be null
     * @return immutable list of attempts; never null
     * @throws DocumentPersistenceException if the query fails
     */
    @Override
    public List<ProcessingAttempt> findAllByFingerprint(DocumentFingerprint fingerprint) {
        Objects.requireNonNull(fingerprint, "fingerprint must not be null");

        String sql = """
                SELECT
                    fingerprint, run_id, attempt_number, started_at, ended_at,
                    status, failure_class, failure_message, retryable,
                    ai_provider, model_name, prompt_identifier, processed_page_count, sent_character_count,
                    ai_raw_response, ai_reasoning, resolved_date, date_source, validated_title,
                    final_target_file_name
                FROM processing_attempt
                WHERE fingerprint = ?
                ORDER BY attempt_number ASC
                """;

        try (Connection connection = getConnection();
             Statement pragmaStmt = connection.createStatement();
             PreparedStatement statement = connection.prepareStatement(sql)) {

            pragmaStmt.execute(PRAGMA_FOREIGN_KEYS_ON);
            statement.setString(1, fingerprint.sha256Hex());

            try (ResultSet rs = statement.executeQuery()) {
                List<ProcessingAttempt> attempts = new ArrayList<>();
                while (rs.next()) {
                    attempts.add(mapResultSetToProcessingAttempt(rs));
                }
                return List.copyOf(attempts);
            }

        } catch (SQLException e) {
            String message = "Failed to find processing attempts for fingerprint '"
                    + fingerprint.sha256Hex() + "': " + e.getMessage();
            logger.error(message, e);
            throw new DocumentPersistenceException(message, e);
        }
    }

    /**
     * Returns the most recent attempt with status {@code PROPOSAL_READY} for the given
     * fingerprint, or {@code null} if no such attempt exists.
     * <p>
     * This is the <em>leading source</em> for the naming proposal: the most recent
     * {@code PROPOSAL_READY} attempt carries the validated date, title, and reasoning
     * that subsequent processing steps consume.
     *
     * @param fingerprint the document identity; must not be null
     * @return the most recent {@code PROPOSAL_READY} attempt, or {@code null}
     * @throws DocumentPersistenceException if the query fails
     */
    @Override
    public ProcessingAttempt findLatestProposalReadyAttempt(DocumentFingerprint fingerprint) {
        Objects.requireNonNull(fingerprint, "fingerprint must not be null");

        String sql = """
                SELECT
                    fingerprint, run_id, attempt_number, started_at, ended_at,
                    status, failure_class, failure_message, retryable,
                    ai_provider, model_name, prompt_identifier, processed_page_count, sent_character_count,
                    ai_raw_response, ai_reasoning, resolved_date, date_source, validated_title,
                    final_target_file_name
                FROM processing_attempt
                WHERE fingerprint = ?
                  AND status = ?
                ORDER BY attempt_number DESC
                LIMIT 1
                """;

        try (Connection connection = getConnection();
             Statement pragmaStmt = connection.createStatement();
             PreparedStatement statement = connection.prepareStatement(sql)) {

            pragmaStmt.execute(PRAGMA_FOREIGN_KEYS_ON);
            statement.setString(1, fingerprint.sha256Hex());
            statement.setString(2, ProcessingStatus.PROPOSAL_READY.name());

            try (ResultSet rs = statement.executeQuery()) {
                if (rs.next()) {
                    return mapResultSetToProcessingAttempt(rs);
                }
                return null;
            }

        } catch (SQLException e) {
            String message = "Failed to find latest PROPOSAL_READY attempt for fingerprint '"
                    + fingerprint.sha256Hex() + "': " + e.getMessage();
            logger.error(message, e);
            throw new DocumentPersistenceException(message, e);
        }
    }

    // -------------------------------------------------------------------------
    // Mapping helpers
    // -------------------------------------------------------------------------

    private ProcessingAttempt mapResultSetToProcessingAttempt(ResultSet rs) throws SQLException {
        String resolvedDateStr = rs.getString("resolved_date");
        LocalDate resolvedDate = resolvedDateStr != null ? LocalDate.parse(resolvedDateStr) : null;

        String dateSourceStr = rs.getString("date_source");
        DateSource dateSource = dateSourceStr != null ? DateSource.valueOf(dateSourceStr) : null;

        Integer processedPageCount = getNullableInt(rs, "processed_page_count");
        Integer sentCharacterCount = getNullableInt(rs, "sent_character_count");

        return new ProcessingAttempt(
                new DocumentFingerprint(rs.getString("fingerprint")),
                new RunId(rs.getString("run_id")),
                rs.getInt("attempt_number"),
                Instant.parse(rs.getString("started_at")),
                Instant.parse(rs.getString("ended_at")),
                ProcessingStatus.valueOf(rs.getString("status")),
                rs.getString("failure_class"),
                rs.getString("failure_message"),
                rs.getBoolean("retryable"),
                rs.getString("ai_provider"),
                rs.getString("model_name"),
                rs.getString("prompt_identifier"),
                processedPageCount,
                sentCharacterCount,
                rs.getString("ai_raw_response"),
                rs.getString("ai_reasoning"),
                resolvedDate,
                dateSource,
                rs.getString("validated_title"),
                rs.getString("final_target_file_name")
        );
    }

    // -------------------------------------------------------------------------
    // JDBC nullable helpers
    // -------------------------------------------------------------------------

    private static void setNullableString(PreparedStatement stmt, int index, String value)
            throws SQLException {
        if (value == null) {
            stmt.setNull(index, Types.VARCHAR);
        } else {
            stmt.setString(index, value);
        }
    }

    private static void setNullableInteger(PreparedStatement stmt, int index, Integer value)
            throws SQLException {
        if (value == null) {
            stmt.setNull(index, Types.INTEGER);
        } else {
            stmt.setInt(index, value);
        }
    }

    private static Integer getNullableInt(ResultSet rs, String column) throws SQLException {
        int value = rs.getInt(column);
        return rs.wasNull() ? null : value;
    }

    /**
     * Returns the JDBC URL this adapter uses.
     *
     * @return the JDBC URL; never null or blank
     */
    public String getJdbcUrl() {
        return jdbcUrl;
    }

    /**
     * Returns a JDBC connection. May be overridden in tests to provide shared connections.
     */
    protected Connection getConnection() throws SQLException {
        return DriverManager.getConnection(jdbcUrl);
    }
}
@@ -0,0 +1,337 @@
package de.gecheckt.pdf.umbenenner.adapter.out.sqlite;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Objects;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import de.gecheckt.pdf.umbenenner.application.port.out.DocumentPersistenceException;
import de.gecheckt.pdf.umbenenner.application.port.out.PersistenceSchemaInitializationPort;

/**
 * SQLite implementation of {@link PersistenceSchemaInitializationPort}.
 * <p>
 * Creates or verifies the two-level persistence schema in the configured SQLite
 * database file, and performs a controlled schema evolution from an earlier schema
 * version to the current one.
 *
 * <h2>Two-level schema</h2>
 * <p>The schema consists of exactly two tables:
 * <ol>
 *   <li><strong>{@code document_record}</strong> — the document master record
 *       (Dokument-Stammsatz). One row per unique SHA-256 fingerprint.</li>
 *   <li><strong>{@code processing_attempt}</strong> — the processing attempt history
 *       (Versuchshistorie). One row per historised processing attempt, referencing
 *       the master record via fingerprint.</li>
 * </ol>
 *
 * <h2>Schema evolution</h2>
 * <p>
 * When upgrading from an earlier schema, this adapter uses idempotent
 * {@code ALTER TABLE ... ADD COLUMN} statements for both tables. Columns that already
 * exist are silently skipped, making the evolution safe to run on both fresh and existing
 * databases. The current evolution adds:
 * <ul>
 *   <li>AI-traceability columns to {@code processing_attempt}</li>
 *   <li>Target-copy columns ({@code last_target_path}, {@code last_target_file_name}) to
 *       {@code document_record}</li>
 *   <li>Target-copy column ({@code final_target_file_name}) to {@code processing_attempt}</li>
 *   <li>Provider-identifier column ({@code ai_provider}) to {@code processing_attempt};
 *       existing rows receive {@code NULL} as the default, which is the correct value for
 *       attempts recorded before provider tracking was introduced.</li>
 * </ul>
 *
 * <h2>Legacy-state migration</h2>
 * <p>
 * Documents in an earlier positive intermediate state ({@code SUCCESS} recorded without
 * a validated naming proposal) are idempotently migrated to {@code READY_FOR_AI} so that
 * the AI naming pipeline processes them in the next run. Terminal negative states
 * ({@code FAILED_RETRYABLE}, {@code FAILED_FINAL}, skip states) are left unchanged.
 *
 * <h2>Initialisation timing</h2>
 * <p>This adapter must be invoked <em>once</em> at program startup, before the batch
 * document processing loop begins.
 *
 * <h2>Architecture boundary</h2>
 * <p>All JDBC connections, SQL DDL, and SQLite-specific behaviour are strictly confined
 * to this class. No JDBC or SQLite types appear in the port interface or in any
 * application/domain type.
 */
public class SqliteSchemaInitializationAdapter implements PersistenceSchemaInitializationPort {

    private static final Logger logger = LogManager.getLogger(SqliteSchemaInitializationAdapter.class);

    // -------------------------------------------------------------------------
    // DDL — document_record table
    // -------------------------------------------------------------------------

    /**
     * DDL for the document master record table.
     * <p>
     * Columns: id (PK), fingerprint (unique), last_known_source_locator,
     * last_known_source_file_name, overall_status, content_error_count,
     * transient_error_count, last_failure_instant, last_success_instant,
     * created_at, updated_at.
     */
    private static final String DDL_CREATE_DOCUMENT_RECORD = """
            CREATE TABLE IF NOT EXISTS document_record (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                fingerprint TEXT NOT NULL,
                last_known_source_locator TEXT NOT NULL,
                last_known_source_file_name TEXT NOT NULL,
                overall_status TEXT NOT NULL,
                content_error_count INTEGER NOT NULL DEFAULT 0,
                transient_error_count INTEGER NOT NULL DEFAULT 0,
                last_failure_instant TEXT,
                last_success_instant TEXT,
                created_at TEXT NOT NULL,
                updated_at TEXT NOT NULL,
                CONSTRAINT uq_document_record_fingerprint UNIQUE (fingerprint)
            )
            """;

    // -------------------------------------------------------------------------
    // DDL — processing_attempt table (base schema, without AI traceability cols)
    // -------------------------------------------------------------------------

    /**
     * DDL for the base processing attempt history table.
     * <p>
     * Base columns (present in all schema versions): id, fingerprint, run_id,
     * attempt_number, started_at, ended_at, status, failure_class, failure_message, retryable.
     * <p>
     * AI traceability columns are added separately via {@code ALTER TABLE} to support
     * idempotent evolution from earlier schemas.
     */
    private static final String DDL_CREATE_PROCESSING_ATTEMPT = """
            CREATE TABLE IF NOT EXISTS processing_attempt (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                fingerprint TEXT NOT NULL,
                run_id TEXT NOT NULL,
                attempt_number INTEGER NOT NULL,
                started_at TEXT NOT NULL,
                ended_at TEXT NOT NULL,
                status TEXT NOT NULL,
                failure_class TEXT,
                failure_message TEXT,
                retryable INTEGER NOT NULL DEFAULT 0,
                CONSTRAINT fk_processing_attempt_fingerprint
                    FOREIGN KEY (fingerprint) REFERENCES document_record (fingerprint),
                CONSTRAINT uq_processing_attempt_fingerprint_number
                    UNIQUE (fingerprint, attempt_number)
            )
            """;

    // -------------------------------------------------------------------------
    // DDL — indexes
    // -------------------------------------------------------------------------

    /** Index on {@code processing_attempt.fingerprint} for fast per-document lookups. */
    private static final String DDL_IDX_ATTEMPT_FINGERPRINT =
            "CREATE INDEX IF NOT EXISTS idx_processing_attempt_fingerprint "
                    + "ON processing_attempt (fingerprint)";

    /** Index on {@code processing_attempt.run_id} for fast per-run lookups. */
    private static final String DDL_IDX_ATTEMPT_RUN_ID =
            "CREATE INDEX IF NOT EXISTS idx_processing_attempt_run_id "
                    + "ON processing_attempt (run_id)";

    /** Index on {@code document_record.overall_status} for fast status-based filtering. */
    private static final String DDL_IDX_RECORD_STATUS =
            "CREATE INDEX IF NOT EXISTS idx_document_record_overall_status "
                    + "ON document_record (overall_status)";

    // -------------------------------------------------------------------------
    // DDL — columns added to processing_attempt via schema evolution
    // -------------------------------------------------------------------------

    /**
     * Columns to add idempotently to {@code processing_attempt}.
     * Each entry is {@code [column_name, column_type]}.
     * <p>
     * {@code ai_provider} is nullable; existing rows receive {@code NULL}, which is the
     * correct sentinel for attempts recorded before provider tracking was introduced.
     */
    private static final String[][] EVOLUTION_ATTEMPT_COLUMNS = {
            {"model_name", "TEXT"},
            {"prompt_identifier", "TEXT"},
            {"processed_page_count", "INTEGER"},
            {"sent_character_count", "INTEGER"},
            {"ai_raw_response", "TEXT"},
            {"ai_reasoning", "TEXT"},
            {"resolved_date", "TEXT"},
            {"date_source", "TEXT"},
            {"validated_title", "TEXT"},
            {"final_target_file_name", "TEXT"},
            {"ai_provider", "TEXT"},
    };

    // -------------------------------------------------------------------------
    // DDL — columns added to document_record via schema evolution
    // -------------------------------------------------------------------------

    /**
     * Columns to add idempotently to {@code document_record}.
     * Each entry is {@code [column_name, column_type]}.
     */
    private static final String[][] EVOLUTION_RECORD_COLUMNS = {
            {"last_target_path", "TEXT"},
            {"last_target_file_name", "TEXT"},
    };

    // -------------------------------------------------------------------------
    // Legacy-state status migration
    // -------------------------------------------------------------------------

    /**
     * Migrates earlier positive intermediate states in {@code document_record} that were
     * recorded as {@code SUCCESS} without a validated naming proposal to {@code READY_FOR_AI},
     * so the AI naming pipeline processes them in the next run.
     * <p>
     * Only rows with {@code overall_status = 'SUCCESS'} that have no corresponding
     * {@code processing_attempt} with {@code status = 'PROPOSAL_READY'} are updated.
     * This migration is idempotent.
     */
    private static final String SQL_MIGRATE_LEGACY_SUCCESS_TO_READY_FOR_AI = """
            UPDATE document_record
            SET overall_status = 'READY_FOR_AI',
                updated_at = datetime('now')
            WHERE overall_status = 'SUCCESS'
              AND NOT EXISTS (
                  SELECT 1 FROM processing_attempt pa
                  WHERE pa.fingerprint = document_record.fingerprint
                    AND pa.status = 'PROPOSAL_READY'
              )
            """;

    private final String jdbcUrl;

    /**
     * Constructs the adapter with the JDBC URL of the SQLite database file.
     *
     * @param jdbcUrl the JDBC URL of the SQLite database; must not be null or blank
     * @throws NullPointerException if {@code jdbcUrl} is null
     * @throws IllegalArgumentException if {@code jdbcUrl} is blank
     */
    public SqliteSchemaInitializationAdapter(String jdbcUrl) {
        Objects.requireNonNull(jdbcUrl, "jdbcUrl must not be null");
        if (jdbcUrl.isBlank()) {
            throw new IllegalArgumentException("jdbcUrl must not be blank");
        }
        this.jdbcUrl = jdbcUrl;
    }

    /**
     * Creates or verifies the persistence schema and performs schema evolution and
     * status migration.
     * <p>
     * Execution order:
     * <ol>
     *   <li>Enable foreign key enforcement.</li>
     *   <li>Create {@code document_record} table (if not exists).</li>
     *   <li>Create {@code processing_attempt} table (if not exists).</li>
     *   <li>Create all indexes (if not exist).</li>
     *   <li>Add AI-traceability and provider-identifier columns to {@code processing_attempt}
     *       (idempotent evolution).</li>
     *   <li>Migrate earlier positive intermediate state to {@code READY_FOR_AI} (idempotent).</li>
     * </ol>
     * <p>
     * All steps are safe to run on both fresh and existing databases.
     *
     * @throws DocumentPersistenceException if any DDL or migration step fails
     */
    @Override
    public void initializeSchema() {
        logger.info("Initialising SQLite persistence schema at: {}", jdbcUrl);
        try (Connection connection = DriverManager.getConnection(jdbcUrl);
                Statement statement = connection.createStatement()) {

            // Enable foreign key enforcement (SQLite disables it by default)
            statement.execute("PRAGMA foreign_keys = ON");

            // Level 1: document master record
            statement.execute(DDL_CREATE_DOCUMENT_RECORD);
            logger.debug("Table 'document_record' created or already present.");

            // Level 2: processing attempt history (base columns only)
            statement.execute(DDL_CREATE_PROCESSING_ATTEMPT);
            logger.debug("Table 'processing_attempt' created or already present.");

            // Indexes for efficient per-document, per-run, and per-status access
            statement.execute(DDL_IDX_ATTEMPT_FINGERPRINT);
            statement.execute(DDL_IDX_ATTEMPT_RUN_ID);
            statement.execute(DDL_IDX_RECORD_STATUS);
            logger.debug("Indexes created or already present.");

            // Schema evolution: add AI-traceability + target-copy columns (idempotent)
            evolveTableColumns(connection, "processing_attempt", EVOLUTION_ATTEMPT_COLUMNS);
            evolveTableColumns(connection, "document_record", EVOLUTION_RECORD_COLUMNS);

            // Status migration: earlier positive intermediate state → READY_FOR_AI
            int migrated = statement.executeUpdate(SQL_MIGRATE_LEGACY_SUCCESS_TO_READY_FOR_AI);
            if (migrated > 0) {
                logger.info("Status migration: {} document(s) migrated from legacy SUCCESS state to READY_FOR_AI.",
                        migrated);
            } else {
                logger.debug("Status migration: no documents required migration.");
            }

            logger.info("SQLite schema initialisation and migration completed successfully.");

        } catch (SQLException e) {
            String message = "Failed to initialise SQLite persistence schema at '" + jdbcUrl + "': " + e.getMessage();
            logger.error(message, e);
            throw new DocumentPersistenceException(message, e);
        }
    }

    /**
     * Idempotently adds the given columns to the specified table.
     * <p>
     * For each column that does not yet exist, an {@code ALTER TABLE ... ADD COLUMN}
     * statement is executed. Columns that already exist are silently skipped.
     *
     * @param connection an open JDBC connection to the database
     * @param tableName the name of the table to evolve
     * @param columns array of {@code [column_name, column_type]} pairs to add
     * @throws SQLException if a column addition fails for a reason other than duplicate column
     */
    private void evolveTableColumns(Connection connection, String tableName, String[][] columns)
            throws SQLException {
        java.util.Set<String> existingColumns = new java.util.HashSet<>();
        try (ResultSet rs = connection.getMetaData().getColumns(null, null, tableName, null)) {
            while (rs.next()) {
                existingColumns.add(rs.getString("COLUMN_NAME").toLowerCase());
            }
        }

        for (String[] col : columns) {
            String columnName = col[0];
            String columnType = col[1];
            if (!existingColumns.contains(columnName.toLowerCase())) {
                String alterSql = "ALTER TABLE " + tableName + " ADD COLUMN " + columnName + " " + columnType;
                try (Statement stmt = connection.createStatement()) {
                    stmt.execute(alterSql);
                }
                logger.debug("Schema evolution: added column '{}' to '{}'.", columnName, tableName);
            } else {
                logger.debug("Schema evolution: column '{}' in '{}' already present, skipped.",
                        columnName, tableName);
            }
        }
    }

    /**
     * Returns the JDBC URL this adapter uses to connect to the SQLite database.
     *
     * @return the JDBC URL; never null or blank
     */
    public String getJdbcUrl() {
        return jdbcUrl;
    }
}
@@ -0,0 +1,167 @@
package de.gecheckt.pdf.umbenenner.adapter.out.sqlite;

import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Objects;
import java.util.function.Consumer;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import de.gecheckt.pdf.umbenenner.application.port.out.DocumentPersistenceException;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentRecord;
import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingAttempt;
import de.gecheckt.pdf.umbenenner.application.port.out.UnitOfWorkPort;

/**
 * SQLite implementation of {@link UnitOfWorkPort}.
 * <p>
 * Provides transactional semantics for coordinated writes to both the document record
 * and processing attempt repositories.
 */
public class SqliteUnitOfWorkAdapter implements UnitOfWorkPort {

    private static final Logger logger = LogManager.getLogger(SqliteUnitOfWorkAdapter.class);

    private final String jdbcUrl;

    public SqliteUnitOfWorkAdapter(String jdbcUrl) {
        Objects.requireNonNull(jdbcUrl, "jdbcUrl must not be null");
        if (jdbcUrl.isBlank()) {
            throw new IllegalArgumentException("jdbcUrl must not be blank");
        }
        this.jdbcUrl = jdbcUrl;
    }

    @Override
    public void executeInTransaction(Consumer<TransactionOperations> operations) {
        Objects.requireNonNull(operations, "operations must not be null");

        Connection connection = null;
        try {
            connection = DriverManager.getConnection(jdbcUrl);
            connection.setAutoCommit(false);

            TransactionOperationsImpl txOps = new TransactionOperationsImpl(connection);
            operations.accept(txOps);

            connection.commit();
            logger.debug("Transaction committed successfully");

        } catch (DocumentPersistenceException e) {
            // Re-throw document-level persistence errors as-is, but still rollback
            if (connection != null) {
                try {
                    connection.rollback();
                    logger.debug("Transaction rolled back due to document error: {}", e.getMessage());
                } catch (SQLException rollbackEx) {
                    logger.error("Failed to rollback transaction: {}", rollbackEx.getMessage(), rollbackEx);
                }
            }
            throw e;
        } catch (RuntimeException e) {
            // Rollback on any RuntimeException and wrap in DocumentPersistenceException
            if (connection != null) {
                try {
                    connection.rollback();
                    logger.debug("Transaction rolled back due to error: {}", e.getMessage());
                } catch (SQLException rollbackEx) {
                    logger.error("Failed to rollback transaction: {}", rollbackEx.getMessage(), rollbackEx);
                }
            }
            throw new DocumentPersistenceException("Transaction failed: " + e.getMessage(), e);
        } catch (SQLException e) {
            // Rollback for any SQL error
            if (connection != null) {
                try {
                    connection.rollback();
                    logger.debug("Transaction rolled back due to error: {}", e.getMessage());
                } catch (SQLException rollbackEx) {
                    logger.error("Failed to rollback transaction: {}", rollbackEx.getMessage(), rollbackEx);
                }
            }
            throw new DocumentPersistenceException("Transaction failed: " + e.getMessage(), e);
        } finally {
            if (connection != null) {
                try {
                    connection.close();
                } catch (SQLException e) {
                    logger.warn("Failed to close connection: {}", e.getMessage(), e);
                }
            }
        }
    }

    /**
     * Wraps a shared transaction connection so that {@code close()} becomes a no-op.
     * <p>
     * Repository adapters manage their own connection lifecycle via try-with-resources,
     * which would close the shared transaction connection prematurely if not wrapped.
     * All other {@link Connection} methods are delegated unchanged to the underlying connection.
     *
     * @param underlying the real shared connection; must not be null
     * @return a proxy connection that ignores {@code close()} calls
     */
    private static Connection nonClosingWrapper(Connection underlying) {
        return (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class<?>[] { Connection.class },
                (proxy, method, args) -> {
                    if ("close".equals(method.getName())) {
                        return null;
                    }
                    try {
                        return method.invoke(underlying, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause();
                    }
                });
    }

    private class TransactionOperationsImpl implements TransactionOperations {
        private final Connection connection;

        TransactionOperationsImpl(Connection connection) {
            this.connection = connection;
        }

        @Override
        public void saveProcessingAttempt(ProcessingAttempt attempt) {
            SqliteProcessingAttemptRepositoryAdapter repo =
                    new SqliteProcessingAttemptRepositoryAdapter(jdbcUrl) {
                        @Override
                        protected Connection getConnection() throws SQLException {
                            return nonClosingWrapper(connection);
                        }
                    };
            repo.save(attempt);
        }

        @Override
        public void createDocumentRecord(DocumentRecord record) {
            SqliteDocumentRecordRepositoryAdapter repo =
                    new SqliteDocumentRecordRepositoryAdapter(jdbcUrl) {
                        @Override
                        protected Connection getConnection() throws SQLException {
                            return nonClosingWrapper(connection);
                        }
                    };
            repo.create(record);
        }

        @Override
        public void updateDocumentRecord(DocumentRecord record) {
            SqliteDocumentRecordRepositoryAdapter repo =
                    new SqliteDocumentRecordRepositoryAdapter(jdbcUrl) {
                        @Override
                        protected Connection getConnection() throws SQLException {
                            return nonClosingWrapper(connection);
                        }
                    };
            repo.update(record);
        }
    }
}
@@ -0,0 +1,35 @@
/**
 * SQLite persistence adapter for the two-level persistence model.
 *
 * <h2>Purpose</h2>
 * <p>This package contains the technical SQLite infrastructure for the persistence
 * layer. It is the only place in the entire application where JDBC connections, SQL DDL,
 * and SQLite-specific types are used. No JDBC or SQLite types leak into the
 * {@code application} or {@code domain} modules.
 *
 * <h2>Two-level persistence model</h2>
 * <p>Persistence is structured in exactly two levels:
 * <ol>
 *   <li><strong>Document master record</strong> ({@code document_record} table) —
 *       one row per unique SHA-256 fingerprint; carries the current overall status,
 *       failure counters, and the most recently known source location.</li>
 *   <li><strong>Processing attempt history</strong> ({@code processing_attempt} table) —
 *       one row per historised processing attempt; references the master record via
 *       fingerprint; attempt numbers are monotonically increasing per fingerprint.</li>
 * </ol>
 *
 * <h2>Schema initialisation timing</h2>
 * <p>The {@link de.gecheckt.pdf.umbenenner.adapter.out.sqlite.SqliteSchemaInitializationAdapter}
 * implements the
 * {@link de.gecheckt.pdf.umbenenner.application.port.out.PersistenceSchemaInitializationPort}
 * and must be called <em>once</em> at program startup, before the batch document
 * processing loop begins. There is no lazy or hidden initialisation during document
 * processing.
 *
 * <h2>Architecture boundary</h2>
 * <p>All JDBC connections, SQL statements, and SQLite-specific behaviour are strictly
 * confined to this package. The application layer interacts exclusively through the
 * port interfaces defined in
 * {@code de.gecheckt.pdf.umbenenner.application.port.out}.
 */
package de.gecheckt.pdf.umbenenner.adapter.out.sqlite;
@@ -0,0 +1,142 @@
package de.gecheckt.pdf.umbenenner.adapter.out.targetcopy;

import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Objects;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyPort;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyResult;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopySuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyTechnicalFailure;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

/**
 * Filesystem-based implementation of {@link TargetFileCopyPort}.
 * <p>
 * Copies a source PDF to the configured target folder using a two-step approach:
 * <ol>
 *   <li>Write the source content to a temporary file in the target folder.</li>
 *   <li>Rename/move the temporary file to the final resolved filename.</li>
 * </ol>
 * The atomic-move option is attempted first. If the filesystem does not support atomic
 * moves (e.g., across different volumes), a standard move is used as a fallback.
 *
 * <h2>Source integrity</h2>
 * <p>
 * The source file is never modified, moved, or deleted. Only a copy is created.
 *
 * <h2>Temporary file naming</h2>
 * <p>
 * The temporary file uses the suffix {@code .tmp} appended to the resolved filename
 * and is placed in the same target folder. This ensures the final rename is typically
 * an intra-filesystem operation, maximising atomicity.
 *
 * <h2>Architecture boundary</h2>
 * <p>
 * All NIO operations are confined to this adapter. No {@code Path} or {@code File}
 * types appear in the port interface.
 */
public class FilesystemTargetFileCopyAdapter implements TargetFileCopyPort {

    private static final Logger logger = LogManager.getLogger(FilesystemTargetFileCopyAdapter.class);

    private final Path targetFolderPath;

    /**
     * Creates the adapter for the given target folder.
     *
     * @param targetFolderPath the target folder path; must not be null
     * @throws NullPointerException if {@code targetFolderPath} is null
     */
    public FilesystemTargetFileCopyAdapter(Path targetFolderPath) {
        this.targetFolderPath = Objects.requireNonNull(targetFolderPath, "targetFolderPath must not be null");
    }

    /**
     * Copies the source document to the target folder under the given resolved filename.
     * <p>
     * The copy is performed via a temporary file ({@code resolvedFilename + ".tmp"}) in
     * the target folder followed by a move/rename to the final name.
     * <p>
     * If any step fails, a best-effort cleanup of the temporary file is attempted
     * before returning the failure result.
     *
     * @param sourceLocator opaque locator identifying the source file; must not be null
     * @param resolvedFilename the final filename in the target folder; must not be null or blank
     * @return {@link TargetFileCopySuccess} on success, or
     *         {@link TargetFileCopyTechnicalFailure} on any failure
     */
    @Override
    public TargetFileCopyResult copyToTarget(SourceDocumentLocator sourceLocator, String resolvedFilename) {
        Objects.requireNonNull(sourceLocator, "sourceLocator must not be null");
        Objects.requireNonNull(resolvedFilename, "resolvedFilename must not be null");

        Path sourcePath = Paths.get(sourceLocator.value());
        Path finalTargetPath = targetFolderPath.resolve(resolvedFilename);
        Path tempTargetPath = targetFolderPath.resolve(resolvedFilename + ".tmp");

        boolean tempCreated = false;

        try {
            // Step 1: Copy source to temporary file in target folder
            Files.copy(sourcePath, tempTargetPath, StandardCopyOption.REPLACE_EXISTING);
            tempCreated = true;
            logger.debug("Copied source '{}' to temporary file '{}'.",
                    sourceLocator.value(), tempTargetPath.getFileName());

            // Step 2: Atomic move/rename to final target filename
            moveToFinalTarget(tempTargetPath, finalTargetPath);

            logger.debug("Target copy completed: '{}'.", resolvedFilename);
            return new TargetFileCopySuccess();

        } catch (Exception e) {
            String message = "Failed to copy source '" + sourceLocator.value()
                    + "' to target '" + resolvedFilename + "': " + e.getMessage();
            logger.error(message, e);

            boolean cleaned = tempCreated && tryDeletePath(tempTargetPath);
            return new TargetFileCopyTechnicalFailure(message, cleaned);
        }
    }

    // -------------------------------------------------------------------------
    // Helpers
    // -------------------------------------------------------------------------

    /**
     * Moves the temporary file to the final target path.
     * Attempts an atomic move first; falls back to a standard move if the filesystem
     * does not support atomic moves.
     */
    private void moveToFinalTarget(Path tempPath, Path finalPath) throws IOException {
        try {
            Files.move(tempPath, finalPath, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            logger.debug("Atomic move not supported, falling back to standard move.");
            Files.move(tempPath, finalPath, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    /**
     * Best-effort deletion of a path. Returns {@code true} if deletion succeeded
     * or the file did not exist; {@code false} if an exception occurred.
     */
    private boolean tryDeletePath(Path path) {
        try {
            Files.deleteIfExists(path);
            return true;
        } catch (IOException e) {
            logger.warn("Best-effort cleanup: could not delete temporary file '{}': {}",
                    path, e.getMessage());
            return false;
        }
    }
}
@@ -0,0 +1,24 @@
/**
 * Outbound adapter for writing the target file copy.
 * <p>
 * Components:
 * <ul>
 *   <li>{@link de.gecheckt.pdf.umbenenner.adapter.out.targetcopy.FilesystemTargetFileCopyAdapter}
 *       — Filesystem-based implementation of
 *       {@link de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyPort}.</li>
 * </ul>
 * <p>
 * The adapter uses a two-step write pattern: the source is first copied to a temporary
 * file ({@code resolvedFilename + ".tmp"}) in the target folder, then renamed/moved to
 * the final filename. An atomic move is attempted first; a standard move is used as a
 * fallback when the filesystem does not support atomic cross-directory moves.
 * <p>
 * <strong>Source integrity:</strong> The source file is never modified, moved, or deleted.
 * Only a copy is created in the target folder.
 * <p>
 * <strong>Architecture boundary:</strong> All NIO ({@code Path}, {@code Files}) operations
 * are strictly confined to this package. The port interface
 * {@link de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyPort} contains no
 * filesystem types, preserving the hexagonal architecture boundary.
 */
package de.gecheckt.pdf.umbenenner.adapter.out.targetcopy;
@@ -0,0 +1,141 @@
package de.gecheckt.pdf.umbenenner.adapter.out.targetfolder;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Objects;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import de.gecheckt.pdf.umbenenner.application.port.out.ResolvedTargetFilename;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFilenameResolutionResult;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFolderPort;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFolderTechnicalFailure;

/**
 * Filesystem-based implementation of {@link TargetFolderPort}.
 * <p>
 * Resolves unique filenames for the configured target folder by checking for existing
 * files and appending a numeric collision-avoidance suffix when necessary.
 *
 * <h2>Duplicate resolution algorithm</h2>
 * <p>
 * Given a base name such as {@code 2024-01-15 - Rechnung.pdf}, the adapter checks:
 * <ol>
 * <li>{@code 2024-01-15 - Rechnung.pdf} — if free, return it.</li>
 * <li>{@code 2024-01-15 - Rechnung(1).pdf} — if free, return it.</li>
 * <li>{@code 2024-01-15 - Rechnung(2).pdf} — and so on.</li>
 * </ol>
 * The suffix is inserted immediately before {@code .pdf}.
 * The 20-character base-title limit does not apply to the suffix.
 *
 * <h2>Architecture boundary</h2>
 * <p>
 * All NIO operations are confined to this adapter. No {@code Path} or {@code File} types
 * appear in the port interface.
 */
public class FilesystemTargetFolderAdapter implements TargetFolderPort {

    private static final Logger logger = LogManager.getLogger(FilesystemTargetFolderAdapter.class);

    /** Maximum number of duplicate suffixes attempted before giving up. */
    private static final int MAX_SUFFIX_ATTEMPTS = 9999;

    private final Path targetFolderPath;

    /**
     * Creates the adapter for the given target folder.
     *
     * @param targetFolderPath the target folder path; must not be null
     * @throws NullPointerException if {@code targetFolderPath} is null
     */
    public FilesystemTargetFolderAdapter(Path targetFolderPath) {
        this.targetFolderPath = Objects.requireNonNull(targetFolderPath, "targetFolderPath must not be null");
    }

    /**
     * Returns the absolute string representation of the target folder path.
     * <p>
     * Used by the application layer as an opaque target-folder locator for persistence.
     *
     * @return absolute path string of the target folder; never null or blank
     */
    @Override
    public String getTargetFolderLocator() {
        return targetFolderPath.toAbsolutePath().toString();
    }

    /**
     * Resolves the first available unique filename in the target folder for the given base name.
     * <p>
     * Checks for {@code baseName} first; if taken, appends {@code (1)}, {@code (2)}, etc.
     * directly before {@code .pdf} until a free name is found.
     *
     * @param baseName the desired filename including {@code .pdf} extension;
     *                 must not be null or blank
     * @return a {@link ResolvedTargetFilename} with the first available name, or a
     *         {@link TargetFolderTechnicalFailure} if folder access fails
     */
    @Override
    public TargetFilenameResolutionResult resolveUniqueFilename(String baseName) {
        Objects.requireNonNull(baseName, "baseName must not be null");

        try {
            // Try without suffix first
            if (!Files.exists(targetFolderPath.resolve(baseName))) {
                logger.debug("Resolved target filename without suffix: '{}'", baseName);
                return new ResolvedTargetFilename(baseName);
            }

            // Determine split point: everything before the final ".pdf"
            if (!baseName.toLowerCase().endsWith(".pdf")) {
                return new TargetFolderTechnicalFailure(
                        "Base name does not end with .pdf: '" + baseName + "'");
            }
            String nameWithoutExt = baseName.substring(0, baseName.length() - 4);

            // Try (1), (2), ...
            for (int i = 1; i <= MAX_SUFFIX_ATTEMPTS; i++) {
                String candidate = nameWithoutExt + "(" + i + ").pdf";
                if (!Files.exists(targetFolderPath.resolve(candidate))) {
                    logger.debug("Resolved target filename with suffix ({}): '{}'", i, candidate);
                    return new ResolvedTargetFilename(candidate);
                }
            }

            return new TargetFolderTechnicalFailure(
                    "Too many duplicate files for base name '" + baseName
                            + "': checked up to suffix (" + MAX_SUFFIX_ATTEMPTS + ")");

        } catch (Exception e) {
            String message = "Failed to check target folder for duplicate resolution: " + e.getMessage();
            logger.error(message, e);
            return new TargetFolderTechnicalFailure(message);
        }
    }

    /**
     * Best-effort deletion of a file in the target folder.
     * <p>
     * Used for rollback after a successful copy when subsequent persistence fails.
     * Never throws; all exceptions are caught and logged at warn level.
     *
     * @param resolvedFilename the filename (not full path) to delete; must not be null
     */
    @Override
    public void tryDeleteTargetFile(String resolvedFilename) {
        Objects.requireNonNull(resolvedFilename, "resolvedFilename must not be null");
        try {
            boolean deleted = Files.deleteIfExists(targetFolderPath.resolve(resolvedFilename));
            if (deleted) {
                logger.debug("Best-effort rollback: deleted target file '{}'.", resolvedFilename);
            } else {
                logger.debug("Best-effort rollback: target file '{}' did not exist.", resolvedFilename);
            }
        } catch (IOException e) {
            logger.warn("Best-effort rollback: could not delete target file '{}': {}",
                    resolvedFilename, e.getMessage());
        }
    }
}
@@ -0,0 +1,26 @@
/**
 * Outbound adapter for target folder management and unique filename resolution.
 * <p>
 * Components:
 * <ul>
 * <li>{@link de.gecheckt.pdf.umbenenner.adapter.out.targetfolder.FilesystemTargetFolderAdapter}
 * — Filesystem-based implementation of
 * {@link de.gecheckt.pdf.umbenenner.application.port.out.TargetFolderPort}.</li>
 * </ul>
 * <p>
 * <strong>Duplicate resolution:</strong> Given a base name such as
 * {@code 2024-01-15 - Rechnung.pdf}, the adapter checks whether the file exists in the
 * target folder and appends a numeric suffix ({@code (1)}, {@code (2)}, …) directly
 * before {@code .pdf} until a free name is found. The 20-character base-title limit
 * does not apply to the suffix.
 * <p>
 * <strong>Rollback support:</strong> The adapter provides a best-effort deletion method
 * used by the application layer to remove a successfully written target copy when
 * subsequent persistence fails, preventing orphaned target files.
 * <p>
 * <strong>Architecture boundary:</strong> All NIO ({@code Path}, {@code Files}) operations
 * are strictly confined to this package. The port interface
 * {@link de.gecheckt.pdf.umbenenner.application.port.out.TargetFolderPort} contains no
 * filesystem types, preserving the hexagonal architecture boundary.
 */
package de.gecheckt.pdf.umbenenner.adapter.out.targetfolder;
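The duplicate-resolution suffixing described in this package-info can be sketched filesystem-free as follows. This is a hypothetical illustration with invented names (`DuplicateResolutionSketch`, `resolveUnique`); the real adapter probes the target folder with `Files.exists` instead of consulting an in-memory set, and returns port result types rather than throwing.

```java
import java.util.Set;

// Hypothetical, filesystem-free sketch of the documented duplicate-resolution algorithm.
public class DuplicateResolutionSketch {

    /** Returns baseName if free, otherwise the first free "name(i).pdf" candidate. */
    static String resolveUnique(String baseName, Set<String> existing) {
        if (!existing.contains(baseName)) {
            return baseName;
        }
        // The numeric suffix goes directly before ".pdf".
        String stem = baseName.substring(0, baseName.length() - ".pdf".length());
        for (int i = 1; i <= 9999; i++) {
            String candidate = stem + "(" + i + ").pdf";
            if (!existing.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("Too many duplicates for " + baseName);
    }

    public static void main(String[] args) {
        Set<String> taken = Set.of(
                "2024-01-15 - Rechnung.pdf",
                "2024-01-15 - Rechnung(1).pdf");
        // Base name and (1) are taken, so (2) is the first free candidate.
        System.out.println(resolveUnique("2024-01-15 - Rechnung.pdf", taken));
        // → 2024-01-15 - Rechnung(2).pdf
    }
}
```

Because the suffix is appended outside the 20-character base-title limit, long titles never collide with their own suffixed variants.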
@@ -0,0 +1,221 @@
package de.gecheckt.pdf.umbenenner.adapter.out.ai;

import static org.assertj.core.api.Assertions.assertThat;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.doReturn;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.util.List;
import java.util.UUID;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.font.Standard14Fonts;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.junit.jupiter.api.io.TempDir;
import org.mockito.junit.jupiter.MockitoExtension;

import de.gecheckt.pdf.umbenenner.adapter.out.clock.SystemClockAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.fingerprint.Sha256FingerprintAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.lock.FilesystemRunLockPortAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.pdfextraction.PdfTextExtractionPortAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.prompt.FilesystemPromptPortAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.sourcedocument.SourceDocumentCandidatesPortAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.sqlite.SqliteDocumentRecordRepositoryAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.sqlite.SqliteProcessingAttemptRepositoryAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.sqlite.SqliteSchemaInitializationAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.sqlite.SqliteUnitOfWorkAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.targetcopy.FilesystemTargetFileCopyAdapter;
import de.gecheckt.pdf.umbenenner.adapter.out.targetfolder.FilesystemTargetFolderAdapter;
import de.gecheckt.pdf.umbenenner.application.config.RuntimeConfiguration;
import de.gecheckt.pdf.umbenenner.application.config.provider.ProviderConfiguration;
import de.gecheckt.pdf.umbenenner.application.port.out.AiContentSensitivity;
import de.gecheckt.pdf.umbenenner.application.port.out.FingerprintSuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingAttempt;
import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingLogger;
import de.gecheckt.pdf.umbenenner.application.service.AiNamingService;
import de.gecheckt.pdf.umbenenner.application.service.AiResponseValidator;
import de.gecheckt.pdf.umbenenner.application.service.DocumentProcessingCoordinator;
import de.gecheckt.pdf.umbenenner.application.usecase.DefaultBatchRunProcessingUseCase;
import de.gecheckt.pdf.umbenenner.domain.model.BatchRunContext;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.RunId;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

/**
 * Integration test verifying that the Anthropic Claude adapter integrates correctly
 * with the full batch processing pipeline and that the provider identifier
 * {@code "claude"} is persisted in the processing attempt history.
 * <p>
 * Uses a mocked HTTP client to simulate the Anthropic API without real network calls.
 * All other adapters (SQLite, filesystem, PDF extraction, fingerprinting) are real
 * production implementations.
 */
@ExtendWith(MockitoExtension.class)
@DisplayName("AnthropicClaudeAdapter integration")
class AnthropicClaudeAdapterIntegrationTest {

    /**
     * Pflicht-Testfall 15: claudeProviderIdentifierLandsInAttemptHistory
     * <p>
     * Verifies the end-to-end integration: the Claude adapter with a mocked HTTP layer
     * is wired into the batch pipeline, and after a successful run, the processing attempt
     * record contains {@code ai_provider='claude'}.
     */
    @Test
    @DisplayName("claudeProviderIdentifierLandsInAttemptHistory: ai_provider=claude in attempt history after successful run")
    void claudeProviderIdentifierLandsInAttemptHistory(@TempDir Path tempDir) throws Exception {
        // --- Infrastructure setup ---
        Path sourceFolder = Files.createDirectories(tempDir.resolve("source"));
        Path targetFolder = Files.createDirectories(tempDir.resolve("target"));
        Path promptFile = tempDir.resolve("prompt.txt");
        Files.writeString(promptFile, "Analysiere das Dokument und liefere JSON.");

        String jdbcUrl = "jdbc:sqlite:" + tempDir.resolve("test.db")
                .toAbsolutePath().toString().replace('\\', '/');
        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();

        // --- Create a searchable PDF in the source folder ---
        Path pdfPath = sourceFolder.resolve("testdokument.pdf");
        createSearchablePdf(pdfPath, "Testinhalt Rechnung Datum 15.01.2024 Betrag 99 EUR");

        // --- Compute fingerprint for later verification ---
        Sha256FingerprintAdapter fingerprintAdapter = new Sha256FingerprintAdapter();
        SourceDocumentCandidate candidate = new SourceDocumentCandidate(
                pdfPath.getFileName().toString(), 0L,
                new SourceDocumentLocator(pdfPath.toAbsolutePath().toString()));
        DocumentFingerprint fingerprint = switch (fingerprintAdapter.computeFingerprint(candidate)) {
            case FingerprintSuccess s -> s.fingerprint();
            default -> throw new IllegalStateException("Fingerprint computation failed");
        };

        // --- Mock the HTTP client for the Claude adapter ---
        HttpClient mockHttpClient = mock(HttpClient.class);
        // Build a valid Anthropic response with the NamingProposal JSON as text content
        String namingProposalJson =
                "{\\\"date\\\":\\\"2024-01-15\\\",\\\"title\\\":\\\"Testrechnung\\\","
                        + "\\\"reasoning\\\":\\\"Rechnung vom 15.01.2024\\\"}";
        String anthropicResponseBody = "{"
                + "\"id\":\"msg_integration_test\","
                + "\"type\":\"message\","
                + "\"role\":\"assistant\","
                + "\"content\":[{\"type\":\"text\",\"text\":\"" + namingProposalJson + "\"}],"
                + "\"stop_reason\":\"end_turn\""
                + "}";

        HttpResponse<String> mockHttpResponse = mockStringResponse(200, anthropicResponseBody);
        doReturn(mockHttpResponse).when(mockHttpClient).send(any(HttpRequest.class), any());

        // --- Create the Claude adapter with the mocked HTTP client ---
        ProviderConfiguration claudeConfig = new ProviderConfiguration(
                "claude-3-5-sonnet-20241022", 60, "https://api.anthropic.com", "sk-ant-test");
        AnthropicClaudeHttpAdapter claudeAdapter =
                new AnthropicClaudeHttpAdapter(claudeConfig, mockHttpClient);

        // --- Wire the full pipeline with provider identifier "claude" ---
        SqliteDocumentRecordRepositoryAdapter documentRepo =
                new SqliteDocumentRecordRepositoryAdapter(jdbcUrl);
        SqliteProcessingAttemptRepositoryAdapter attemptRepo =
                new SqliteProcessingAttemptRepositoryAdapter(jdbcUrl);
        SqliteUnitOfWorkAdapter unitOfWork = new SqliteUnitOfWorkAdapter(jdbcUrl);

        ProcessingLogger noOpLogger = new NoOpProcessingLogger();
        DocumentProcessingCoordinator coordinator = new DocumentProcessingCoordinator(
                documentRepo, attemptRepo, unitOfWork,
                new FilesystemTargetFolderAdapter(targetFolder),
                new FilesystemTargetFileCopyAdapter(targetFolder),
                noOpLogger,
                3,
                "claude"); // provider identifier for Claude

        AiNamingService aiNamingService = new AiNamingService(
                claudeAdapter,
                new FilesystemPromptPortAdapter(promptFile),
                new AiResponseValidator(new SystemClockAdapter()),
                "claude-3-5-sonnet-20241022",
                10_000);

        DefaultBatchRunProcessingUseCase useCase = new DefaultBatchRunProcessingUseCase(
                new RuntimeConfiguration(50, 3, AiContentSensitivity.PROTECT_SENSITIVE_CONTENT),
                new FilesystemRunLockPortAdapter(tempDir.resolve("run.lock")),
                new SourceDocumentCandidatesPortAdapter(sourceFolder),
                new PdfTextExtractionPortAdapter(),
                fingerprintAdapter,
                coordinator,
                aiNamingService,
                noOpLogger);

        // --- Run the batch ---
        BatchRunContext context = new BatchRunContext(
                new RunId(UUID.randomUUID().toString()), Instant.now());
        useCase.execute(context);

        // --- Verify: ai_provider='claude' is stored in the attempt history ---
        List<ProcessingAttempt> attempts = attemptRepo.findAllByFingerprint(fingerprint);
        assertThat(attempts)
                .as("At least one attempt must be recorded")
                .isNotEmpty();
        assertThat(attempts.get(0).aiProvider())
                .as("Provider identifier must be 'claude' in the attempt history")
                .isEqualTo("claude");
    }

    // =========================================================================
    // Helpers
    // =========================================================================

    /**
     * Creates a typed mock {@link HttpResponse} to avoid unchecked-cast warnings at call sites.
     * The suppression is confined to this helper because the raw-type cast is technically
     * unavoidable due to type erasure when mocking generic interfaces.
     */
    @SuppressWarnings("unchecked")
    private static HttpResponse<String> mockStringResponse(int statusCode, String body) {
        HttpResponse<String> response = (HttpResponse<String>) mock(HttpResponse.class);
        when(response.statusCode()).thenReturn(statusCode);
        when(response.body()).thenReturn(body);
        return response;
    }

    /**
     * Creates a single-page searchable PDF with embedded text using PDFBox.
     */
    private static void createSearchablePdf(Path pdfPath, String text) throws Exception {
        try (PDDocument doc = new PDDocument()) {
            PDPage page = new PDPage();
            doc.addPage(page);
            try (PDPageContentStream cs = new PDPageContentStream(doc, page)) {
                cs.beginText();
                cs.setFont(new PDType1Font(Standard14Fonts.FontName.HELVETICA), 12);
                cs.newLineAtOffset(50, 700);
                cs.showText(text);
                cs.endText();
            }
            doc.save(pdfPath.toFile());
        }
    }

    /**
     * No-op implementation of {@link ProcessingLogger} for use in integration tests
     * where log output is not relevant to the assertion.
     */
    private static class NoOpProcessingLogger implements ProcessingLogger {
        @Override public void info(String message, Object... args) {}
        @Override public void debug(String message, Object... args) {}
        @Override public void warn(String message, Object... args) {}
        @Override public void error(String message, Object... args) {}
        @Override public void debugSensitiveAiContent(String message, Object... args) {}
    }
}
@@ -0,0 +1,702 @@
|
||||
package de.gecheckt.pdf.umbenenner.adapter.out.ai;
|
||||
|
||||
import static org.assertj.core.api.Assertions.assertThat;
|
||||
import static org.assertj.core.api.Assertions.assertThatThrownBy;
|
||||
import static org.mockito.ArgumentMatchers.any;
|
||||
import static org.mockito.Mockito.doReturn;
|
||||
import static org.mockito.Mockito.mock;
|
||||
import static org.mockito.Mockito.verify;
|
||||
import static org.mockito.Mockito.when;
|
||||
|
||||
import java.net.ConnectException;
|
||||
import java.net.UnknownHostException;
|
||||
import java.net.http.HttpClient;
|
||||
import java.net.http.HttpRequest;
|
||||
import java.net.http.HttpResponse;
|
||||
import java.net.http.HttpTimeoutException;
|
||||
import java.time.Duration;
|
||||
|
||||
import org.json.JSONArray;
|
||||
import org.json.JSONObject;
|
||||
import org.junit.jupiter.api.BeforeEach;
|
||||
import org.junit.jupiter.api.DisplayName;
|
||||
import org.junit.jupiter.api.Test;
|
||||
import org.junit.jupiter.api.extension.ExtendWith;
|
||||
import org.mockito.ArgumentCaptor;
|
||||
import org.mockito.Mock;
|
||||
import org.mockito.junit.jupiter.MockitoExtension;
|
||||
|
||||
import de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException;
|
||||
import de.gecheckt.pdf.umbenenner.adapter.out.configuration.MultiProviderConfigurationValidator;
|
||||
import de.gecheckt.pdf.umbenenner.application.config.provider.AiProviderFamily;
|
||||
import de.gecheckt.pdf.umbenenner.application.config.provider.MultiProviderConfiguration;
|
||||
import de.gecheckt.pdf.umbenenner.application.config.provider.ProviderConfiguration;
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationResult;
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationSuccess;
|
||||
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationTechnicalFailure;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.AiRequestRepresentation;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.PromptIdentifier;
|
||||
|
||||
/**
|
||||
* Unit tests for {@link AnthropicClaudeHttpAdapter}.
|
||||
* <p>
|
||||
* Tests inject a mock {@link HttpClient} via the package-private constructor
|
||||
* to exercise the adapter path without requiring network access.
|
||||
* Configuration is supplied via {@link ProviderConfiguration}.
|
||||
* <p>
|
||||
* Covered scenarios:
|
||||
* <ul>
|
||||
* <li>Correct HTTP request structure (URL, method, headers, body)</li>
|
||||
* <li>API key resolution (env var vs. properties value)</li>
|
||||
* <li>Configuration validation for missing API key</li>
|
||||
* <li>Single and multiple text-block extraction from Anthropic response</li>
|
||||
* <li>Ignoring non-text content blocks</li>
|
||||
* <li>Technical failure when no text blocks are present</li>
|
||||
* <li>HTTP 4xx (401, 429) and 5xx (500) mapped to technical failure</li>
|
||||
* <li>Timeout mapped to technical failure</li>
|
||||
* <li>Unparseable JSON response mapped to technical failure</li>
|
||||
* </ul>
|
||||
*/
|
||||
@ExtendWith(MockitoExtension.class)
|
||||
@DisplayName("AnthropicClaudeHttpAdapter")
|
||||
class AnthropicClaudeHttpAdapterTest {
|
||||
|
||||
private static final String API_BASE_URL = "https://api.anthropic.com";
|
||||
private static final String API_MODEL = "claude-3-5-sonnet-20241022";
|
||||
private static final String API_KEY = "sk-ant-test-key-12345";
|
||||
private static final int TIMEOUT_SECONDS = 60;
|
||||
|
||||
@Mock
|
||||
private HttpClient httpClient;
|
||||
|
||||
private ProviderConfiguration testConfiguration;
|
||||
private AnthropicClaudeHttpAdapter adapter;
|
||||
|
||||
@BeforeEach
|
||||
void setUp() {
|
||||
testConfiguration = new ProviderConfiguration(API_MODEL, TIMEOUT_SECONDS, API_BASE_URL, API_KEY);
|
||||
adapter = new AnthropicClaudeHttpAdapter(testConfiguration, httpClient);
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
// Pflicht-Testfall 1: claudeAdapterBuildsCorrectRequest
|
||||
// =========================================================================
|
||||
|
||||
/**
|
||||
* Verifies that the adapter constructs the correct HTTP request:
|
||||
* URL with {@code /v1/messages} path, method POST, all three required headers
|
||||
* ({@code x-api-key}, {@code anthropic-version}, {@code content-type}), and
|
||||
* a body with {@code model}, {@code max_tokens > 0}, and {@code messages} containing
|
||||
* exactly one user message with the document text.
|
||||
*/
|
||||
@Test
|
||||
@DisplayName("claudeAdapterBuildsCorrectRequest: correct URL, method, headers, and body")
|
||||
void claudeAdapterBuildsCorrectRequest() throws Exception {
|
||||
HttpResponse<String> httpResponse = mockHttpResponse(200, buildAnthropicSuccessResponse(
|
||||
"{\"date\":\"2024-01-15\",\"title\":\"Testititel\",\"reasoning\":\"Test\"}"));
|
||||
doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());
|
||||
|
||||
AiRequestRepresentation request = createTestRequest("System-Prompt", "Dokumenttext");
|
||||
adapter.invoke(request);
|
||||
|
||||
ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
|
||||
verify(httpClient).send(requestCaptor.capture(), any());
|
||||
HttpRequest capturedRequest = requestCaptor.getValue();
|
||||
|
||||
// URL must point to /v1/messages
|
||||
assertThat(capturedRequest.uri().toString())
|
||||
.as("URL must be based on configured baseUrl")
|
||||
.startsWith(API_BASE_URL)
|
||||
.endsWith("/v1/messages");
|
||||
|
||||
// Method must be POST
|
||||
assertThat(capturedRequest.method()).isEqualTo("POST");
|
||||
|
||||
// All three required headers must be present
|
||||
assertThat(capturedRequest.headers().firstValue("x-api-key"))
|
||||
.as("x-api-key header must be present")
|
||||
.isPresent();
|
||||
assertThat(capturedRequest.headers().firstValue("anthropic-version"))
|
||||
.as("anthropic-version header must be present")
|
||||
.isPresent()
|
||||
.hasValue("2023-06-01");
|
||||
assertThat(capturedRequest.headers().firstValue("content-type"))
|
||||
.as("content-type header must be present")
|
||||
.isPresent();
|
||||
|
||||
// Body must contain model, max_tokens > 0, and messages with one user message
|
||||
String sentBody = adapter.getLastBuiltJsonBodyForTesting();
|
||||
JSONObject body = new JSONObject(sentBody);
|
||||
assertThat(body.getString("model"))
|
||||
.as("model must match configuration")
|
||||
.isEqualTo(API_MODEL);
|
||||
assertThat(body.getInt("max_tokens"))
|
||||
.as("max_tokens must be positive")
|
||||
.isGreaterThan(0);
|
||||
assertThat(body.getJSONArray("messages").length())
|
||||
.as("messages must contain exactly one entry")
|
||||
.isEqualTo(1);
|
||||
assertThat(body.getJSONArray("messages").getJSONObject(0).getString("role"))
|
||||
.as("the single message must be a user message")
|
||||
.isEqualTo("user");
|
||||
assertThat(body.getJSONArray("messages").getJSONObject(0).getString("content"))
|
||||
.as("user message content must be the document text")
|
||||
.isEqualTo("Dokumenttext");
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
// Pflicht-Testfall 2: claudeAdapterUsesEnvVarApiKey
|
||||
// =========================================================================
|
||||
|
||||
/**
|
||||
* Verifies that when the {@code ANTHROPIC_API_KEY} environment variable is the source
|
||||
* of the resolved API key (represented in ProviderConfiguration after env-var precedence
|
||||
* was applied by the configuration layer), the adapter uses that key in the
|
||||
* {@code x-api-key} header.
|
||||
*/
|
||||
@Test
|
||||
@DisplayName("claudeAdapterUsesEnvVarApiKey: env var value reaches x-api-key header")
|
||||
void claudeAdapterUsesEnvVarApiKey() throws Exception {
|
||||
String envVarValue = "sk-ant-from-env-variable";
|
||||
// Env var takes precedence: the configuration layer resolves this into apiKey
|
||||
ProviderConfiguration configWithEnvKey = new ProviderConfiguration(
|
||||
API_MODEL, TIMEOUT_SECONDS, API_BASE_URL, envVarValue);
|
||||
AnthropicClaudeHttpAdapter adapterWithEnvKey =
|
||||
new AnthropicClaudeHttpAdapter(configWithEnvKey, httpClient);
|
||||
|
||||
HttpResponse<String> httpResponse = mockHttpResponse(200,
|
||||
buildAnthropicSuccessResponse("{\"title\":\"T\",\"reasoning\":\"R\"}"));
|
||||
doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());
|
||||
|
||||
adapterWithEnvKey.invoke(createTestRequest("prompt", "doc"));
|
||||
|
||||
ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
|
||||
verify(httpClient).send(requestCaptor.capture(), any());
|
||||
assertThat(requestCaptor.getValue().headers().firstValue("x-api-key"))
|
||||
.as("x-api-key header must contain the env var value")
|
||||
.hasValue(envVarValue);
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
// Pflicht-Testfall 3: claudeAdapterFallsBackToPropertiesApiKey
|
||||
// =========================================================================
|
||||
|
||||
/**
|
||||
* Verifies that when no environment variable is set, the API key from the
|
||||
* properties configuration is used in the {@code x-api-key} header.
|
||||
*/
|
||||
@Test
|
||||
@DisplayName("claudeAdapterFallsBackToPropertiesApiKey: properties key reaches x-api-key header")
|
||||
void claudeAdapterFallsBackToPropertiesApiKey() throws Exception {
|
||||
String propertiesKey = "sk-ant-from-properties";
|
||||
ProviderConfiguration configWithPropertiesKey = new ProviderConfiguration(
|
||||
API_MODEL, TIMEOUT_SECONDS, API_BASE_URL, propertiesKey);
|
||||
AnthropicClaudeHttpAdapter adapterWithPropertiesKey =
|
||||
new AnthropicClaudeHttpAdapter(configWithPropertiesKey, httpClient);
|
||||
|
||||
HttpResponse<String> httpResponse = mockHttpResponse(200,
|
||||
buildAnthropicSuccessResponse("{\"title\":\"T\",\"reasoning\":\"R\"}"));
|
||||
doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());
|
||||
|
||||
adapterWithPropertiesKey.invoke(createTestRequest("prompt", "doc"));
|
||||
|
||||
ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
|
||||
verify(httpClient).send(requestCaptor.capture(), any());
|
||||
assertThat(requestCaptor.getValue().headers().firstValue("x-api-key"))
|
||||
.as("x-api-key header must contain the properties value")
|
||||
.hasValue(propertiesKey);
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
// Pflicht-Testfall 4: claudeAdapterFailsValidationWhenBothKeysMissing
|
||||
// =========================================================================
|
||||
|
||||
/**
|
||||
* Verifies that when both the environment variable and the properties API key for the
|
||||
* Claude provider are empty, the {@link MultiProviderConfigurationValidator} rejects the
|
||||
* configuration with an {@link InvalidStartConfigurationException}.
|
||||
* <p>
|
||||
* This confirms that the adapter is protected by startup validation (from AP-001)
|
||||
* and will never be constructed with a truly missing API key in production.
|
||||
*/
|
||||
@Test
|
||||
@DisplayName("claudeAdapterFailsValidationWhenBothKeysMissing: validator rejects empty API key for Claude")
|
||||
void claudeAdapterFailsValidationWhenBothKeysMissing() {
|
||||
// Simulate both env var and properties key being absent (empty resolved key)
|
||||
ProviderConfiguration claudeConfigWithoutKey = new ProviderConfiguration(
|
||||
API_MODEL, TIMEOUT_SECONDS, API_BASE_URL, "");
|
||||
ProviderConfiguration inactiveOpenAiConfig = new ProviderConfiguration(
|
||||
"unused-model", 0, null, null);
|
||||
MultiProviderConfiguration config = new MultiProviderConfiguration(
|
||||
AiProviderFamily.CLAUDE, inactiveOpenAiConfig, claudeConfigWithoutKey);
|
||||
|
||||
MultiProviderConfigurationValidator validator = new MultiProviderConfigurationValidator();
|
||||
|
||||
assertThatThrownBy(() -> validator.validate(config))
|
||||
.as("Validator must reject Claude configuration with empty API key")
|
||||
.isInstanceOf(InvalidStartConfigurationException.class);
|
||||
}
|
||||
|
||||
    // =========================================================================
    // Mandatory test case 5: claudeAdapterParsesSingleTextBlock
    // =========================================================================

    /**
     * Verifies that a response with a single text block is correctly extracted.
     */
    @Test
    @DisplayName("claudeAdapterParsesSingleTextBlock: single text block becomes raw response")
    void claudeAdapterParsesSingleTextBlock() throws Exception {
        String blockText = "{\"date\":\"2024-01-15\",\"title\":\"Rechnung\",\"reasoning\":\"Test\"}";
        String responseBody = buildAnthropicSuccessResponse(blockText);
        HttpResponse<String> httpResponse = mockHttpResponse(200, responseBody);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationSuccess.class);
        AiInvocationSuccess success = (AiInvocationSuccess) result;
        assertThat(success.rawResponse().content())
                .as("Raw response must equal the text block content")
                .isEqualTo(blockText);
    }

    // =========================================================================
    // Mandatory test case 6: claudeAdapterConcatenatesMultipleTextBlocks
    // =========================================================================

    /**
     * Verifies that multiple text blocks are concatenated in order.
     */
    @Test
    @DisplayName("claudeAdapterConcatenatesMultipleTextBlocks: text blocks are concatenated in order")
    void claudeAdapterConcatenatesMultipleTextBlocks() throws Exception {
        String part1 = "Erster Teil der Antwort. ";
        String part2 = "Zweiter Teil der Antwort.";

        // Build the response using JSONObject to ensure correct escaping
        JSONObject block1 = new JSONObject();
        block1.put("type", "text");
        block1.put("text", part1);
        JSONObject block2 = new JSONObject();
        block2.put("type", "text");
        block2.put("text", part2);
        JSONObject responseJson = new JSONObject();
        responseJson.put("id", "msg_test");
        responseJson.put("type", "message");
        responseJson.put("role", "assistant");
        responseJson.put("content", new JSONArray().put(block1).put(block2));
        responseJson.put("stop_reason", "end_turn");

        HttpResponse<String> httpResponse = mockHttpResponse(200, responseJson.toString());
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationSuccess.class);
        assertThat(((AiInvocationSuccess) result).rawResponse().content())
                .as("Multiple text blocks must be concatenated in order")
                .isEqualTo(part1 + part2);
    }

    // =========================================================================
    // Mandatory test case 7: claudeAdapterIgnoresNonTextBlocks
    // =========================================================================

    /**
     * Verifies that non-text content blocks (e.g., tool_use) are ignored and only
     * the text blocks contribute to the raw response.
     */
    @Test
    @DisplayName("claudeAdapterIgnoresNonTextBlocks: only text-type blocks contribute to response")
    void claudeAdapterIgnoresNonTextBlocks() throws Exception {
        String textContent = "Nur dieser Text zaehlt als Antwort.";

        // Build a response with a tool_use block before and a tool_result-like block after the text block
        JSONObject toolUseBlock = new JSONObject();
        toolUseBlock.put("type", "tool_use");
        toolUseBlock.put("id", "tool_1");
        toolUseBlock.put("name", "get_weather");
        toolUseBlock.put("input", new JSONObject());

        JSONObject textBlock = new JSONObject();
        textBlock.put("type", "text");
        textBlock.put("text", textContent);

        JSONObject ignoredBlock = new JSONObject();
        ignoredBlock.put("type", "tool_result");
        ignoredBlock.put("content", "irrelevant");

        JSONObject responseJson = new JSONObject();
        responseJson.put("id", "msg_test");
        responseJson.put("type", "message");
        responseJson.put("role", "assistant");
        responseJson.put("content", new JSONArray().put(toolUseBlock).put(textBlock).put(ignoredBlock));
        responseJson.put("stop_reason", "end_turn");

        HttpResponse<String> httpResponse = mockHttpResponse(200, responseJson.toString());
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationSuccess.class);
        assertThat(((AiInvocationSuccess) result).rawResponse().content())
                .as("Only text-type blocks must contribute to the raw response")
                .isEqualTo(textContent);
    }

    // =========================================================================
    // Mandatory test case 8: claudeAdapterFailsOnEmptyTextContent
    // =========================================================================

    /**
     * Verifies that a response with no text-type content blocks results in a
     * technical failure.
     */
    @Test
    @DisplayName("claudeAdapterFailsOnEmptyTextContent: no text blocks yields technical failure")
    void claudeAdapterFailsOnEmptyTextContent() throws Exception {
        String noTextBlockResponse = "{"
                + "\"id\":\"msg_test\","
                + "\"type\":\"message\","
                + "\"role\":\"assistant\","
                + "\"content\":["
                + "{\"type\":\"tool_use\",\"id\":\"tool_1\",\"name\":\"unused\",\"input\":{}}"
                + "],"
                + "\"stop_reason\":\"tool_use\""
                + "}";

        HttpResponse<String> httpResponse = mockHttpResponse(200, noTextBlockResponse);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason())
                .isEqualTo("NO_TEXT_CONTENT");
    }

    // =========================================================================
    // Mandatory test case 9: claudeAdapterMapsHttp401AsTechnical
    // =========================================================================

    /**
     * Verifies that HTTP 401 (Unauthorized) is classified as a technical failure.
     */
    @Test
    @DisplayName("claudeAdapterMapsHttp401AsTechnical: HTTP 401 yields technical failure")
    void claudeAdapterMapsHttp401AsTechnical() throws Exception {
        HttpResponse<String> httpResponse = mockHttpResponse(401, null);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("HTTP_401");
    }

    // =========================================================================
    // Mandatory test case 10: claudeAdapterMapsHttp429AsTechnical
    // =========================================================================

    /**
     * Verifies that HTTP 429 (Too Many Requests) is classified as a technical failure.
     */
    @Test
    @DisplayName("claudeAdapterMapsHttp429AsTechnical: HTTP 429 yields technical failure")
    void claudeAdapterMapsHttp429AsTechnical() throws Exception {
        HttpResponse<String> httpResponse = mockHttpResponse(429, null);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("HTTP_429");
    }

    // =========================================================================
    // Mandatory test case 11: claudeAdapterMapsHttp500AsTechnical
    // =========================================================================

    /**
     * Verifies that HTTP 500 (Internal Server Error) is classified as a technical failure.
     */
    @Test
    @DisplayName("claudeAdapterMapsHttp500AsTechnical: HTTP 500 yields technical failure")
    void claudeAdapterMapsHttp500AsTechnical() throws Exception {
        HttpResponse<String> httpResponse = mockHttpResponse(500, null);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("HTTP_500");
    }

    // =========================================================================
    // Mandatory test case 12: claudeAdapterMapsTimeoutAsTechnical
    // =========================================================================

    /**
     * Verifies that a simulated HTTP timeout results in a technical failure with
     * reason {@code TIMEOUT}.
     */
    @Test
    @DisplayName("claudeAdapterMapsTimeoutAsTechnical: timeout yields TIMEOUT technical failure")
    void claudeAdapterMapsTimeoutAsTechnical() throws Exception {
        when(httpClient.send(any(HttpRequest.class), any()))
                .thenThrow(new HttpTimeoutException("Connection timed out"));

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("TIMEOUT");
    }

    // =========================================================================
    // Mandatory test case 13: claudeAdapterMapsUnparseableJsonAsTechnical
    // =========================================================================

    /**
     * Verifies that a non-JSON response body (e.g., an HTML error page or plain text)
     * returned with HTTP 200 results in a technical failure.
     */
    @Test
    @DisplayName("claudeAdapterMapsUnparseableJsonAsTechnical: non-JSON body yields technical failure")
    void claudeAdapterMapsUnparseableJsonAsTechnical() throws Exception {
        HttpResponse<String> httpResponse = mockHttpResponse(200,
                "<html><body>Service unavailable</body></html>");
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiInvocationResult result = adapter.invoke(createTestRequest("prompt", "doc"));

        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("UNPARSEABLE_JSON");
    }

    // =========================================================================
    // Additional behavioral tests
    // =========================================================================

    @Test
    @DisplayName("should use configured model in request body")
    void testConfiguredModelIsUsedInRequestBody() throws Exception {
        HttpResponse<String> httpResponse = mockHttpResponse(200,
                buildAnthropicSuccessResponse("{\"title\":\"T\",\"reasoning\":\"R\"}"));
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        adapter.invoke(createTestRequest("prompt", "doc"));

        String sentBody = adapter.getLastBuiltJsonBodyForTesting();
        assertThat(new JSONObject(sentBody).getString("model")).isEqualTo(API_MODEL);
    }

    @Test
    @DisplayName("should use configured timeout in request")
    void testConfiguredTimeoutIsUsedInRequest() throws Exception {
        HttpResponse<String> httpResponse = mockHttpResponse(200,
                buildAnthropicSuccessResponse("{\"title\":\"T\",\"reasoning\":\"R\"}"));
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        adapter.invoke(createTestRequest("prompt", "doc"));

        ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
        verify(httpClient).send(requestCaptor.capture(), any());
        assertThat(requestCaptor.getValue().timeout())
                .isPresent()
                .get()
                .isEqualTo(Duration.ofSeconds(TIMEOUT_SECONDS));
    }

    @Test
    @DisplayName("should place prompt content in system field and document text in user message")
    void testPromptContentGoesToSystemFieldDocumentTextToUserMessage() throws Exception {
        HttpResponse<String> httpResponse = mockHttpResponse(200,
                buildAnthropicSuccessResponse("{\"title\":\"T\",\"reasoning\":\"R\"}"));
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        String promptContent = "Du bist ein Assistent zur Dokumentenbenennung.";
        String documentText = "Rechnungstext des Dokuments.";
        adapter.invoke(createTestRequest(promptContent, documentText));

        String sentBody = adapter.getLastBuiltJsonBodyForTesting();
        JSONObject body = new JSONObject(sentBody);

        assertThat(body.getString("system"))
                .as("Prompt content must be placed in the top-level system field")
                .isEqualTo(promptContent);
        assertThat(body.getJSONArray("messages").getJSONObject(0).getString("content"))
                .as("Document text must be placed in the user message content")
                .isEqualTo(documentText);
    }

    @Test
    @DisplayName("should map CONNECTION_ERROR when ConnectException is thrown")
    void testConnectionExceptionIsMappedToConnectionError() throws Exception {
        when(httpClient.send(any(HttpRequest.class), any()))
                .thenThrow(new ConnectException("Connection refused"));

        AiInvocationResult result = adapter.invoke(createTestRequest("p", "d"));

        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("CONNECTION_ERROR");
    }

    @Test
    @DisplayName("should map DNS_ERROR when UnknownHostException is thrown")
    void testUnknownHostExceptionIsMappedToDnsError() throws Exception {
        when(httpClient.send(any(HttpRequest.class), any()))
                .thenThrow(new UnknownHostException("api.anthropic.com"));

        AiInvocationResult result = adapter.invoke(createTestRequest("p", "d"));

        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("DNS_ERROR");
    }

    @Test
    @DisplayName("should throw NullPointerException when request is null")
    void testNullRequestThrowsException() {
        assertThatThrownBy(() -> adapter.invoke(null))
                .isInstanceOf(NullPointerException.class)
                .hasMessageContaining("request must not be null");
    }

    @Test
    @DisplayName("should throw NullPointerException when configuration is null")
    void testNullConfigurationThrowsException() {
        assertThatThrownBy(() -> new AnthropicClaudeHttpAdapter(null, httpClient))
                .isInstanceOf(NullPointerException.class)
                .hasMessageContaining("config must not be null");
    }

    @Test
    @DisplayName("should throw IllegalArgumentException when API model is blank")
    void testBlankApiModelThrowsException() {
        ProviderConfiguration invalidConfig = new ProviderConfiguration(
                " ", TIMEOUT_SECONDS, API_BASE_URL, API_KEY);

        assertThatThrownBy(() -> new AnthropicClaudeHttpAdapter(invalidConfig, httpClient))
                .isInstanceOf(IllegalArgumentException.class)
                .hasMessageContaining("API model must not be null or empty");
    }

    @Test
    @DisplayName("should use default base URL when baseUrl is null")
    void testDefaultBaseUrlUsedWhenNull() throws Exception {
        ProviderConfiguration configWithoutBaseUrl = new ProviderConfiguration(
                API_MODEL, TIMEOUT_SECONDS, null, API_KEY);
        AnthropicClaudeHttpAdapter adapterWithDefault =
                new AnthropicClaudeHttpAdapter(configWithoutBaseUrl, httpClient);

        HttpResponse<String> httpResponse = mockHttpResponse(200,
                buildAnthropicSuccessResponse("{\"title\":\"T\",\"reasoning\":\"R\"}"));
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        adapterWithDefault.invoke(createTestRequest("p", "d"));

        ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
        verify(httpClient).send(requestCaptor.capture(), any());
        assertThat(requestCaptor.getValue().uri().toString())
                .as("Default base URL https://api.anthropic.com must be used when baseUrl is null")
                .startsWith("https://api.anthropic.com");
    }

    /**
     * Verifies that a custom, non-default base URL is used in the request.
     * <p>
     * This test uses a URL that differs from the default {@code https://api.anthropic.com},
     * ensuring the conditional that selects between the configured URL and the default
     * is correctly evaluated. If the conditional were negated, the request would be sent
     * to the default URL instead of the custom one.
     */
    @Test
    @DisplayName("should use custom non-default base URL when provided")
    void customNonDefaultBaseUrlIsUsedInRequest() throws Exception {
        String customBaseUrl = "http://internal.proxy.example.com:8080";
        ProviderConfiguration configWithCustomUrl = new ProviderConfiguration(
                API_MODEL, TIMEOUT_SECONDS, customBaseUrl, API_KEY);
        AnthropicClaudeHttpAdapter adapterWithCustomUrl =
                new AnthropicClaudeHttpAdapter(configWithCustomUrl, httpClient);

        HttpResponse<String> httpResponse = mockHttpResponse(200,
                buildAnthropicSuccessResponse("{\"title\":\"T\",\"reasoning\":\"R\"}"));
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        adapterWithCustomUrl.invoke(createTestRequest("p", "d"));

        ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
        verify(httpClient).send(requestCaptor.capture(), any());
        assertThat(requestCaptor.getValue().uri().toString())
                .as("Custom non-default base URL must be used, not the default api.anthropic.com")
                .startsWith("http://internal.proxy.example.com:8080");
    }

    /**
     * Verifies that a port value of 0 in the base URL is not included in the endpoint URI.
     * <p>
     * {@link java.net.URI#getPort()} returns {@code 0} when the URL explicitly specifies
     * port 0. The endpoint builder must only include the port when it is greater than 0,
     * not when it is equal to 0 or negative.
     */
    @Test
    @DisplayName("should not include port 0 in the endpoint URI")
    void buildEndpointUri_doesNotIncludePortZero() throws Exception {
        ProviderConfiguration configWithPortZero = new ProviderConfiguration(
                API_MODEL, TIMEOUT_SECONDS, "http://example.com:0", API_KEY);
        AnthropicClaudeHttpAdapter adapterWithPortZero =
                new AnthropicClaudeHttpAdapter(configWithPortZero, httpClient);

        HttpResponse<String> httpResponse = mockHttpResponse(200,
                buildAnthropicSuccessResponse("{\"title\":\"T\",\"reasoning\":\"R\"}"));
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        adapterWithPortZero.invoke(createTestRequest("p", "d"));

        ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
        verify(httpClient).send(requestCaptor.capture(), any());
        assertThat(requestCaptor.getValue().uri().toString())
                .as("Port 0 must not appear in the endpoint URI")
                .doesNotContain(":0");
    }

    // =========================================================================
    // Helper methods
    // =========================================================================

    /**
     * Builds a minimal valid Anthropic Messages API response body with a single text block.
     */
    private static String buildAnthropicSuccessResponse(String textContent) {
        // Escape the textContent for embedding in JSON string
        String escaped = textContent
                .replace("\\", "\\\\")
                .replace("\"", "\\\"");
        return "{"
                + "\"id\":\"msg_test\","
                + "\"type\":\"message\","
                + "\"role\":\"assistant\","
                + "\"content\":[{\"type\":\"text\",\"text\":\"" + escaped + "\"}],"
                + "\"stop_reason\":\"end_turn\""
                + "}";
    }

    @SuppressWarnings("unchecked")
    private HttpResponse<String> mockHttpResponse(int statusCode, String body) {
        HttpResponse<String> response = (HttpResponse<String>) mock(HttpResponse.class);
        when(response.statusCode()).thenReturn(statusCode);
        if (body != null) {
            when(response.body()).thenReturn(body);
        }
        return response;
    }

    private AiRequestRepresentation createTestRequest(String promptContent, String documentText) {
        return new AiRequestRepresentation(
                new PromptIdentifier("test-v1"),
                promptContent,
                documentText,
                documentText.length()
        );
    }
}
@@ -0,0 +1,608 @@
package de.gecheckt.pdf.umbenenner.adapter.out.ai;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.doReturn;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import java.net.ConnectException;
import java.net.UnknownHostException;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;

import org.json.JSONArray;
import org.json.JSONObject;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.ArgumentCaptor;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;

import de.gecheckt.pdf.umbenenner.application.config.provider.ProviderConfiguration;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationResult;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationSuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.AiInvocationTechnicalFailure;
import de.gecheckt.pdf.umbenenner.domain.model.AiRequestRepresentation;
import de.gecheckt.pdf.umbenenner.domain.model.PromptIdentifier;

/**
 * Unit tests for {@link OpenAiHttpAdapter}.
 * <p>
 * <strong>Test strategy:</strong>
 * Tests inject a mock {@link HttpClient} via the package-private constructor
 * to exercise the real HTTP adapter path without requiring network access.
 * Configuration is supplied via {@link ProviderConfiguration}.
 * <p>
 * <strong>Coverage goals:</strong>
 * <ul>
 * <li>Successful HTTP 200 responses are mapped to {@link AiInvocationSuccess}</li>
 * <li>Raw response body is preserved exactly</li>
 * <li>HTTP non-2xx responses are mapped to technical failure</li>
 * <li>HTTP timeout exceptions are classified as TIMEOUT</li>
 * <li>Connection failures are classified as CONNECTION_ERROR</li>
 * <li>DNS errors are classified as DNS_ERROR</li>
 * <li>IO errors are classified as IO_ERROR</li>
 * <li>Interrupted operations are classified as INTERRUPTED</li>
 * <li>Configured timeout is actually used in the request</li>
 * <li>Configured base URL is actually used in the endpoint</li>
 * <li>Configured model name is actually used in the request body</li>
 * <li>Effective API key is actually used in the Authorization header</li>
 * <li>Full document text is sent (not truncated)</li>
 * <li>Null request raises NullPointerException</li>
 * <li>Adapter reads all values from ProviderConfiguration (AP-003)</li>
 * <li>Behavioral contracts are unchanged after constructor change (AP-003)</li>
 * </ul>
 */
@ExtendWith(MockitoExtension.class)
@DisplayName("OpenAiHttpAdapter")
class OpenAiHttpAdapterTest {

    private static final String API_BASE_URL = "https://api.example.com";
    private static final String API_MODEL = "test-model-v1";
    private static final String API_KEY = "test-key-12345";
    private static final int TIMEOUT_SECONDS = 45;

    @Mock
    private HttpClient httpClient;

    private ProviderConfiguration testConfiguration;
    private OpenAiHttpAdapter adapter;

    @BeforeEach
    void setUp() {
        testConfiguration = new ProviderConfiguration(API_MODEL, TIMEOUT_SECONDS, API_BASE_URL, API_KEY);
        // Use the package-private constructor with injected mock HttpClient
        adapter = new OpenAiHttpAdapter(testConfiguration, httpClient);
    }

    @Test
    @DisplayName("should return AiInvocationSuccess when HTTP 200 is received with raw response")
    void testSuccessfulInvocationWith200Response() throws Exception {
        // Arrange
        String responseBody = "{\"choices\":[{\"message\":{\"content\":\"test response\"}}]}";
        HttpResponse<String> httpResponse = mockHttpResponse(200, responseBody);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");

        // Act
        AiInvocationResult result = adapter.invoke(request);

        // Assert
        assertThat(result).isInstanceOf(AiInvocationSuccess.class);
        AiInvocationSuccess success = (AiInvocationSuccess) result;
        assertThat(success.request()).isEqualTo(request);
        assertThat(success.rawResponse().content()).isEqualTo(responseBody);
    }

    @Test
    @DisplayName("should return technical failure when HTTP 500 is received")
    void testNon200HttpStatusReturnsTechnicalFailure() throws Exception {
        // Arrange
        HttpResponse<String> httpResponse = mockHttpResponse(500, null);
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");

        // Act
        AiInvocationResult result = adapter.invoke(request);

        // Assert
        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        AiInvocationTechnicalFailure failure = (AiInvocationTechnicalFailure) result;
        assertThat(failure.failureReason()).isEqualTo("HTTP_500");
        assertThat(failure.failureMessage()).contains("500");
    }

    @Test
    @DisplayName("should return TIMEOUT failure when HttpTimeoutException is thrown")
    void testTimeoutExceptionIsMappedToTimeout() throws Exception {
        // Arrange
        when(httpClient.send(any(HttpRequest.class), any()))
                .thenThrow(new HttpTimeoutException("Request timed out"));

        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");

        // Act
        AiInvocationResult result = adapter.invoke(request);

        // Assert
        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        AiInvocationTechnicalFailure failure = (AiInvocationTechnicalFailure) result;
        assertThat(failure.failureReason()).isEqualTo("TIMEOUT");
    }

    @Test
    @DisplayName("should return CONNECTION_ERROR when ConnectException is thrown")
    void testConnectionExceptionIsMappedToConnectionError() throws Exception {
        // Arrange
        when(httpClient.send(any(HttpRequest.class), any()))
                .thenThrow(new ConnectException("Connection refused"));

        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");

        // Act
        AiInvocationResult result = adapter.invoke(request);

        // Assert
        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        AiInvocationTechnicalFailure failure = (AiInvocationTechnicalFailure) result;
        assertThat(failure.failureReason()).isEqualTo("CONNECTION_ERROR");
    }

    @Test
    @DisplayName("should return DNS_ERROR when UnknownHostException is thrown")
    void testDnsExceptionIsMappedToDnsError() throws Exception {
        // Arrange
        when(httpClient.send(any(HttpRequest.class), any()))
                .thenThrow(new UnknownHostException("api.example.com"));

        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");

        // Act
        AiInvocationResult result = adapter.invoke(request);

        // Assert
        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        AiInvocationTechnicalFailure failure = (AiInvocationTechnicalFailure) result;
        assertThat(failure.failureReason()).isEqualTo("DNS_ERROR");
    }

    @Test
    @DisplayName("should return IO_ERROR when IOException is thrown")
    void testIoExceptionIsMappedToIoError() throws Exception {
        // Arrange
        when(httpClient.send(any(HttpRequest.class), any()))
                .thenThrow(new java.io.IOException("Network unreachable"));

        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");

        // Act
        AiInvocationResult result = adapter.invoke(request);

        // Assert
        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        AiInvocationTechnicalFailure failure = (AiInvocationTechnicalFailure) result;
        assertThat(failure.failureReason()).isEqualTo("IO_ERROR");
    }

    @Test
    @DisplayName("should return INTERRUPTED when InterruptedException is thrown")
    void testInterruptedExceptionIsMappedToInterrupted() throws Exception {
        // Arrange
        when(httpClient.send(any(HttpRequest.class), any()))
                .thenThrow(new InterruptedException("Thread interrupted"));

        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");

        // Act
        AiInvocationResult result = adapter.invoke(request);

        // Assert
        assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
        AiInvocationTechnicalFailure failure = (AiInvocationTechnicalFailure) result;
        assertThat(failure.failureReason()).isEqualTo("INTERRUPTED");
    }

    @Test
    @DisplayName("should use configured timeout value in the actual HTTP request")
    void testConfiguredTimeoutIsUsedInRequest() throws Exception {
        // Arrange
        HttpResponse<String> httpResponse = mockHttpResponse(200, "{}");
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");

        // Act
        adapter.invoke(request);

        // Assert - verify the actual timeout is set on the HttpRequest itself
        ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
        verify(httpClient).send(requestCaptor.capture(), any());

        HttpRequest capturedRequest = requestCaptor.getValue();
        assertThat(capturedRequest.timeout())
                .as("HttpRequest timeout should be present")
                .isPresent()
                .get()
                .isEqualTo(Duration.ofSeconds(TIMEOUT_SECONDS));
    }

    @Test
    @DisplayName("should use configured base URL in the endpoint")
    void testConfiguredBaseUrlIsUsedInEndpoint() throws Exception {
        // Arrange
        HttpResponse<String> httpResponse = mockHttpResponse(200, "{}");
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());

        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");

        // Act
        adapter.invoke(request);

        // Assert - capture the request and verify URI contains base URL
        ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
        verify(httpClient).send(requestCaptor.capture(), any());

        HttpRequest capturedRequest = requestCaptor.getValue();
        assertThat(capturedRequest.uri().toString())
                .startsWith(API_BASE_URL)
                .contains("/v1/chat/completions");
    }

    @Test
    @DisplayName("should use configured model name in the request body")
    void testConfiguredModelIsUsedInRequestBody() throws Exception {
        // Arrange
        AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");
        HttpResponse<String> httpResponse = mockHttpResponse(200, "{}");
        doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());
|
||||
|
||||
// Act - invoke to trigger actual request building
|
||||
adapter.invoke(request);
|
||||
|
||||
// Assert - verify model is in the actual request body that was sent
|
||||
ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
|
||||
verify(httpClient).send(requestCaptor.capture(), any());
|
||||
|
||||
// Get the actual body that was sent in the request via test accessor
|
||||
String sentBody = adapter.getLastBuiltJsonBodyForTesting();
|
||||
assertThat(sentBody)
|
||||
.as("The actual HTTP request body should contain the configured model")
|
||||
.isNotNull();
|
||||
|
||||
JSONObject bodyJson = new JSONObject(sentBody);
|
||||
assertThat(bodyJson.getString("model"))
|
||||
.as("Model in actual request body must match configuration")
|
||||
.isEqualTo(API_MODEL);
|
||||
}
|
||||
|
||||
@Test
|
||||
@DisplayName("should use effective API key in Authorization header")
|
||||
void testEffectiveApiKeyIsUsedInAuthorizationHeader() throws Exception {
|
||||
// Arrange
|
||||
HttpResponse<String> httpResponse = mockHttpResponse(200, "{}");
|
||||
doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());
|
||||
|
||||
AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");
|
||||
|
||||
// Act
|
||||
adapter.invoke(request);
|
||||
|
||||
// Assert - verify the Authorization header was set
|
||||
ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
|
||||
verify(httpClient).send(requestCaptor.capture(), any());
|
||||
|
||||
HttpRequest capturedRequest = requestCaptor.getValue();
|
||||
assertThat(capturedRequest.headers().map())
|
||||
.containsKey("Authorization")
|
||||
.doesNotContainValue(null);
|
||||
|
||||
// Verify header contains the API key
|
||||
var authHeaders = capturedRequest.headers().allValues("Authorization");
|
||||
assertThat(authHeaders).isNotEmpty();
|
||||
assertThat(authHeaders.get(0)).startsWith("Bearer ").contains(API_KEY);
|
||||
}
|
||||
|
||||
@Test
|
||||
@DisplayName("should send full document text without truncation")
|
||||
void testFullDocumentTextIsSentWithoutTruncation() throws Exception {
|
||||
// Arrange
|
||||
String fullDocumentText = "This is a long document text that should be sent in full.";
|
||||
int sentCharacterCount = 20; // Less than full length
|
||||
PromptIdentifier promptId = new PromptIdentifier("v1");
|
||||
AiRequestRepresentation request = new AiRequestRepresentation(
|
||||
promptId,
|
||||
"Test prompt",
|
||||
fullDocumentText,
|
||||
sentCharacterCount
|
||||
);
|
||||
|
||||
HttpResponse<String> httpResponse = mockHttpResponse(200, "{}");
|
||||
doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());
|
||||
|
||||
// Act - invoke to trigger actual request building
|
||||
adapter.invoke(request);
|
||||
|
||||
// Assert - verify the full document text is in the actual request body sent (not truncated)
|
||||
ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
|
||||
verify(httpClient).send(requestCaptor.capture(), any());
|
||||
|
||||
// Get the actual body that was sent in the request via test accessor
|
||||
String sentBody = adapter.getLastBuiltJsonBodyForTesting();
|
||||
assertThat(sentBody)
|
||||
.as("The actual HTTP request body should contain the full document text")
|
||||
.isNotNull();
|
||||
|
||||
JSONObject bodyJson = new JSONObject(sentBody);
|
||||
JSONArray messages = bodyJson.getJSONArray("messages");
|
||||
JSONObject userMessage = messages.getJSONObject(1); // User message is second
|
||||
String contentInBody = userMessage.getString("content");
|
||||
|
||||
// Prove the full text is sent in the actual request, not truncated by sentCharacterCount
|
||||
assertThat(contentInBody)
|
||||
.as("Document text in actual request body must be the full text")
|
||||
.isEqualTo(fullDocumentText);
|
||||
assertThat(contentInBody)
|
||||
.as("Sent text must not be truncated to sentCharacterCount")
|
||||
.isNotEqualTo(fullDocumentText.substring(0, sentCharacterCount));
|
||||
assertThat(contentInBody.length())
|
||||
.as("Text length must match full document, not truncated")
|
||||
.isEqualTo(fullDocumentText.length())
|
||||
.isGreaterThan(sentCharacterCount);
|
||||
}
|
||||
|
||||
@Test
|
||||
@DisplayName("should preserve request in success result")
|
||||
void testSuccessPreservesRequest() throws Exception {
|
||||
// Arrange
|
||||
HttpResponse<String> httpResponse = mockHttpResponse(200, "{\"result\":\"ok\"}");
|
||||
doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());
|
||||
|
||||
AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");
|
||||
|
||||
// Act
|
||||
AiInvocationResult result = adapter.invoke(request);
|
||||
|
||||
// Assert
|
||||
assertThat(result).isInstanceOf(AiInvocationSuccess.class);
|
||||
AiInvocationSuccess success = (AiInvocationSuccess) result;
|
||||
assertThat(success.request()).isSameAs(request);
|
||||
}
|
||||
|
||||
@Test
|
||||
@DisplayName("should preserve request in failure result")
|
||||
void testFailurePreservesRequest() throws Exception {
|
||||
// Arrange
|
||||
when(httpClient.send(any(HttpRequest.class), any()))
|
||||
.thenThrow(new ConnectException("Connection refused"));
|
||||
|
||||
AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");
|
||||
|
||||
// Act
|
||||
AiInvocationResult result = adapter.invoke(request);
|
||||
|
||||
// Assert
|
||||
assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
|
||||
AiInvocationTechnicalFailure failure = (AiInvocationTechnicalFailure) result;
|
||||
assertThat(failure.request()).isSameAs(request);
|
||||
}
|
||||
|
||||
@Test
|
||||
@DisplayName("should throw NullPointerException when request is null")
|
||||
void testNullRequestThrowsException() {
|
||||
assertThatThrownBy(() -> adapter.invoke(null))
|
||||
.isInstanceOf(NullPointerException.class)
|
||||
.hasMessageContaining("request must not be null");
|
||||
}
|
||||
|
||||
@Test
|
||||
@DisplayName("should throw NullPointerException when configuration is null")
|
||||
void testNullConfigurationThrowsException() {
|
||||
assertThatThrownBy(() -> new OpenAiHttpAdapter(null, httpClient))
|
||||
.isInstanceOf(NullPointerException.class)
|
||||
.hasMessageContaining("config must not be null");
|
||||
}
|
||||
|
||||
@Test
|
||||
@DisplayName("should throw NullPointerException when HttpClient is null")
|
||||
void testNullHttpClientThrowsException() {
|
||||
assertThatThrownBy(() -> new OpenAiHttpAdapter(testConfiguration, null))
|
||||
.isInstanceOf(NullPointerException.class)
|
||||
.hasMessageContaining("httpClient must not be null");
|
||||
}
|
||||
|
||||
@Test
|
||||
@DisplayName("should throw IllegalArgumentException when API base URL is null")
|
||||
void testNullApiBaseUrlThrowsException() {
|
||||
ProviderConfiguration invalidConfig = new ProviderConfiguration(
|
||||
API_MODEL, TIMEOUT_SECONDS, null, API_KEY);
|
||||
|
||||
assertThatThrownBy(() -> new OpenAiHttpAdapter(invalidConfig, httpClient))
|
||||
.isInstanceOf(IllegalArgumentException.class)
|
||||
.hasMessageContaining("API base URL must not be null");
|
||||
}
|
||||
|
||||
@Test
|
||||
@DisplayName("should throw IllegalArgumentException when API model is null")
|
||||
void testNullApiModelThrowsException() {
|
||||
ProviderConfiguration invalidConfig = new ProviderConfiguration(
|
||||
null, TIMEOUT_SECONDS, API_BASE_URL, API_KEY);
|
||||
|
||||
assertThatThrownBy(() -> new OpenAiHttpAdapter(invalidConfig, httpClient))
|
||||
.isInstanceOf(IllegalArgumentException.class)
|
||||
.hasMessageContaining("API model must not be null or empty");
|
||||
}
|
||||
|
||||
@Test
|
||||
@DisplayName("should throw IllegalArgumentException when API model is blank")
|
||||
void testBlankApiModelThrowsException() {
|
||||
ProviderConfiguration invalidConfig = new ProviderConfiguration(
|
||||
" ", TIMEOUT_SECONDS, API_BASE_URL, API_KEY);
|
||||
|
||||
assertThatThrownBy(() -> new OpenAiHttpAdapter(invalidConfig, httpClient))
|
||||
.isInstanceOf(IllegalArgumentException.class)
|
||||
.hasMessageContaining("API model must not be null or empty");
|
||||
}
|
||||
|
||||
@Test
|
||||
@DisplayName("should handle empty API key gracefully")
|
||||
void testEmptyApiKeyHandled() throws Exception {
|
||||
// Arrange
|
||||
OpenAiHttpAdapter adapterWithEmptyKey = new OpenAiHttpAdapter(
|
||||
new ProviderConfiguration(API_MODEL, TIMEOUT_SECONDS, API_BASE_URL, ""),
|
||||
httpClient);
|
||||
|
||||
HttpResponse<String> httpResponse = mockHttpResponse(200, "{}");
|
||||
doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());
|
||||
|
||||
AiRequestRepresentation request = createTestRequest("Test prompt", "Test document");
|
||||
|
||||
// Act - should not throw exception
|
||||
AiInvocationResult result = adapterWithEmptyKey.invoke(request);
|
||||
|
||||
// Assert
|
||||
assertThat(result).isInstanceOf(AiInvocationSuccess.class);
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
// Mandatory AP-003 test cases
|
||||
// =========================================================================
|
||||
|
||||
/**
|
||||
* Verifies that the adapter reads all values from the new {@link ProviderConfiguration}
|
||||
* namespace and uses them correctly in outgoing HTTP requests.
|
||||
*/
|
||||
@Test
|
||||
@DisplayName("openAiAdapterReadsValuesFromNewNamespace: all ProviderConfiguration fields are used")
|
||||
void openAiAdapterReadsValuesFromNewNamespace() throws Exception {
|
||||
// Arrange: ProviderConfiguration with values distinct from setUp defaults
|
||||
ProviderConfiguration nsConfig = new ProviderConfiguration(
|
||||
"ns-model-v2", 20, "https://provider-ns.example.com", "ns-api-key-abc");
|
||||
OpenAiHttpAdapter nsAdapter = new OpenAiHttpAdapter(nsConfig, httpClient);
|
||||
|
||||
HttpResponse<String> httpResponse = mockHttpResponse(200, "{}");
|
||||
doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());
|
||||
|
||||
AiRequestRepresentation request = createTestRequest("prompt", "document");
|
||||
nsAdapter.invoke(request);
|
||||
|
||||
ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
|
||||
verify(httpClient).send(requestCaptor.capture(), any());
|
||||
HttpRequest capturedRequest = requestCaptor.getValue();
|
||||
|
||||
// Verify baseUrl from ProviderConfiguration
|
||||
assertThat(capturedRequest.uri().toString())
|
||||
.as("baseUrl must come from ProviderConfiguration")
|
||||
.startsWith("https://provider-ns.example.com");
|
||||
|
||||
// Verify apiKey from ProviderConfiguration
|
||||
assertThat(capturedRequest.headers().firstValue("Authorization").orElse(""))
|
||||
.as("apiKey must come from ProviderConfiguration")
|
||||
.contains("ns-api-key-abc");
|
||||
|
||||
// Verify model from ProviderConfiguration
|
||||
String body = nsAdapter.getLastBuiltJsonBodyForTesting();
|
||||
assertThat(new JSONObject(body).getString("model"))
|
||||
.as("model must come from ProviderConfiguration")
|
||||
.isEqualTo("ns-model-v2");
|
||||
|
||||
// Verify timeout from ProviderConfiguration
|
||||
assertThat(capturedRequest.timeout())
|
||||
.as("timeout must come from ProviderConfiguration")
|
||||
.isPresent()
|
||||
.get()
|
||||
.isEqualTo(Duration.ofSeconds(20));
|
||||
}
|
||||
|
||||
/**
|
||||
* Verifies that adapter behavioral contracts (success mapping, error classification)
|
||||
* are unchanged after the constructor was changed from StartConfiguration to
|
||||
* ProviderConfiguration.
|
||||
*/
|
||||
@Test
|
||||
@DisplayName("openAiAdapterBehaviorIsUnchanged: HTTP success and error mapping contracts are preserved")
|
||||
void openAiAdapterBehaviorIsUnchanged() throws Exception {
|
||||
// Success case: HTTP 200 must produce AiInvocationSuccess with raw body
|
||||
String successBody = "{\"choices\":[{\"message\":{\"content\":\"result\"}}]}";
|
||||
HttpResponse<String> successResponse = mockHttpResponse(200, successBody);
|
||||
doReturn(successResponse).when(httpClient).send(any(HttpRequest.class), any());
|
||||
|
||||
AiInvocationResult result = adapter.invoke(createTestRequest("p", "d"));
|
||||
assertThat(result).isInstanceOf(AiInvocationSuccess.class);
|
||||
assertThat(((AiInvocationSuccess) result).rawResponse().content()).isEqualTo(successBody);
|
||||
|
||||
// Non-200 case: HTTP 429 must produce AiInvocationTechnicalFailure with HTTP_429 reason
|
||||
HttpResponse<String> rateLimitedResponse = mockHttpResponse(429, null);
|
||||
doReturn(rateLimitedResponse).when(httpClient).send(any(HttpRequest.class), any());
|
||||
result = adapter.invoke(createTestRequest("p", "d"));
|
||||
assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
|
||||
assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("HTTP_429");
|
||||
|
||||
// Timeout case: HttpTimeoutException must produce TIMEOUT reason
|
||||
when(httpClient.send(any(HttpRequest.class), any()))
|
||||
.thenThrow(new HttpTimeoutException("timed out"));
|
||||
result = adapter.invoke(createTestRequest("p", "d"));
|
||||
assertThat(result).isInstanceOf(AiInvocationTechnicalFailure.class);
|
||||
assertThat(((AiInvocationTechnicalFailure) result).failureReason()).isEqualTo("TIMEOUT");
|
||||
}
|
||||
|
||||
/**
|
||||
* Verifies that a port value of 0 in the base URL is not included in the endpoint URI.
|
||||
* <p>
|
||||
* {@link java.net.URI#getPort()} returns {@code 0} when the URL explicitly specifies
|
||||
* port 0. The endpoint builder must only include the port when it is greater than 0,
|
||||
* not when it is equal to 0 or negative.
|
||||
*/
|
||||
@Test
|
||||
@DisplayName("should not include port 0 in the endpoint URI")
|
||||
void buildEndpointUri_doesNotIncludePortZero() throws Exception {
|
||||
ProviderConfiguration configWithPortZero = new ProviderConfiguration(
|
||||
API_MODEL, TIMEOUT_SECONDS, "http://example.com:0", API_KEY);
|
||||
OpenAiHttpAdapter adapterWithPortZero = new OpenAiHttpAdapter(configWithPortZero, httpClient);
|
||||
|
||||
HttpResponse<String> httpResponse = mockHttpResponse(200,
|
||||
"{\"choices\":[{\"message\":{\"content\":\"test\"}}]}");
|
||||
doReturn(httpResponse).when(httpClient).send(any(HttpRequest.class), any());
|
||||
|
||||
adapterWithPortZero.invoke(createTestRequest("p", "d"));
|
||||
|
||||
ArgumentCaptor<HttpRequest> requestCaptor = ArgumentCaptor.forClass(HttpRequest.class);
|
||||
verify(httpClient).send(requestCaptor.capture(), any());
|
||||
assertThat(requestCaptor.getValue().uri().toString())
|
||||
.as("Port 0 must not appear in the endpoint URI")
|
||||
.doesNotContain(":0");
|
||||
}
|
||||
|
||||
// Helper methods
|
||||
|
||||
/**
|
||||
* Creates a mock HttpResponse with the specified status code and optional body.
|
||||
*/
|
||||
@SuppressWarnings("unchecked")
|
||||
private HttpResponse<String> mockHttpResponse(int statusCode, String body) {
|
||||
HttpResponse<String> response = (HttpResponse<String>) mock(HttpResponse.class);
|
||||
when(response.statusCode()).thenReturn(statusCode);
|
||||
if (body != null) {
|
||||
when(response.body()).thenReturn(body);
|
||||
}
|
||||
return response;
|
||||
}
|
||||
|
||||
private AiRequestRepresentation createTestRequest(String prompt, String documentText) {
|
||||
return new AiRequestRepresentation(
|
||||
new PromptIdentifier("test-v1"),
|
||||
prompt,
|
||||
documentText,
|
||||
documentText.length()
|
||||
);
|
||||
}
|
||||
}
|
||||
File diff suppressed because it is too large
@@ -0,0 +1,447 @@
|
||||
package de.gecheckt.pdf.umbenenner.adapter.out.configuration;
|
||||
|
||||
import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
|
||||
import static org.junit.jupiter.api.Assertions.assertEquals;
|
||||
import static org.junit.jupiter.api.Assertions.assertFalse;
|
||||
import static org.junit.jupiter.api.Assertions.assertThrows;
|
||||
import static org.junit.jupiter.api.Assertions.assertTrue;
|
||||
|
||||
import java.io.IOException;
|
||||
import java.nio.charset.StandardCharsets;
|
||||
import java.nio.file.Files;
|
||||
import java.nio.file.Path;
|
||||
import java.util.Properties;
|
||||
|
||||
import org.junit.jupiter.api.Test;
|
||||
import org.junit.jupiter.api.io.TempDir;
|
||||
|
||||
import de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException;
|
||||
import de.gecheckt.pdf.umbenenner.application.config.provider.MultiProviderConfiguration;
|
||||
|
||||
/**
|
||||
* Tests for {@link LegacyConfigurationMigrator}.
|
||||
* <p>
|
||||
* Covers all mandatory test cases for the legacy-to-multi-provider configuration migration.
|
||||
* Temporary files are managed via {@link TempDir} so no test artifacts remain on the file system.
|
||||
*/
|
||||
class LegacyConfigurationMigratorTest {
|
||||
|
||||
@TempDir
|
||||
Path tempDir;
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// Helpers
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
/** Full legacy configuration containing all four api.* keys plus other required keys. */
|
||||
private static String fullLegacyContent() {
|
||||
return "source.folder=./source\n"
|
||||
+ "target.folder=./target\n"
|
||||
+ "sqlite.file=./db.sqlite\n"
|
||||
+ "api.baseUrl=https://api.openai.com/v1\n"
|
||||
+ "api.model=gpt-4o\n"
|
||||
+ "api.timeoutSeconds=30\n"
|
||||
+ "max.retries.transient=3\n"
|
||||
+ "max.pages=10\n"
|
||||
+ "max.text.characters=5000\n"
|
||||
+ "prompt.template.file=./prompt.txt\n"
|
||||
+ "api.key=sk-test-legacy-key\n"
|
||||
+ "log.level=INFO\n"
|
||||
+ "log.ai.sensitive=false\n";
|
||||
}
|
||||
|
||||
private Path writeLegacyFile(String name, String content) throws IOException {
|
||||
Path file = tempDir.resolve(name);
|
||||
Files.writeString(file, content, StandardCharsets.UTF_8);
|
||||
return file;
|
||||
}
|
||||
|
||||
private Properties loadProperties(Path file) throws IOException {
|
||||
Properties props = new Properties();
|
||||
props.load(Files.newBufferedReader(file, StandardCharsets.UTF_8));
|
||||
return props;
|
||||
}
|
||||
|
||||
private LegacyConfigurationMigrator defaultMigrator() {
|
||||
return new LegacyConfigurationMigrator();
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
// Mandatory test case 1
|
||||
// =========================================================================
|
||||
|
||||
/**
|
||||
* Legacy file with all four {@code api.*} keys is correctly migrated.
|
||||
* Values in the migrated file must be identical to the originals; all other keys survive.
|
||||
*/
|
||||
@Test
|
||||
void migratesLegacyFileWithAllFlatKeys() throws IOException {
|
||||
Path file = writeLegacyFile("app.properties", fullLegacyContent());
|
||||
|
||||
defaultMigrator().migrateIfLegacy(file);
|
||||
|
||||
Properties migrated = loadProperties(file);
|
||||
assertEquals("https://api.openai.com/v1", migrated.getProperty("ai.provider.openai-compatible.baseUrl"));
|
||||
assertEquals("gpt-4o", migrated.getProperty("ai.provider.openai-compatible.model"));
|
||||
assertEquals("30", migrated.getProperty("ai.provider.openai-compatible.timeoutSeconds"));
|
||||
assertEquals("sk-test-legacy-key", migrated.getProperty("ai.provider.openai-compatible.apiKey"));
|
||||
assertEquals("openai-compatible", migrated.getProperty("ai.provider.active"));
|
||||
|
||||
// Old flat keys must be gone
|
||||
assertFalse(migrated.containsKey("api.baseUrl"), "api.baseUrl must be removed");
|
||||
assertFalse(migrated.containsKey("api.model"), "api.model must be removed");
|
||||
assertFalse(migrated.containsKey("api.timeoutSeconds"), "api.timeoutSeconds must be removed");
|
||||
assertFalse(migrated.containsKey("api.key"), "api.key must be removed");
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
// Mandatory test case 2
|
||||
// =========================================================================
|
||||
|
||||
/**
|
||||
* A {@code .bak} backup is created with the exact original content before any changes.
|
||||
*/
|
||||
@Test
|
||||
void createsBakBeforeOverwriting() throws IOException {
|
||||
String original = fullLegacyContent();
|
||||
Path file = writeLegacyFile("app.properties", original);
|
||||
Path bakFile = tempDir.resolve("app.properties.bak");
|
||||
|
||||
assertFalse(Files.exists(bakFile), "No .bak should exist before migration");
|
||||
|
||||
defaultMigrator().migrateIfLegacy(file);
|
||||
|
||||
assertTrue(Files.exists(bakFile), ".bak must be created during migration");
|
||||
assertEquals(original, Files.readString(bakFile, StandardCharsets.UTF_8),
|
||||
".bak must contain the exact original content");
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
// Mandatory test case 3
|
||||
// =========================================================================
|
||||
|
||||
/**
|
||||
* When {@code .bak} already exists, the new backup is written as {@code .bak.1}.
|
||||
* Neither the existing {@code .bak} nor the new {@code .bak.1} is overwritten.
|
||||
*/
|
||||
@Test
|
||||
void bakSuffixIsIncrementedIfBakExists() throws IOException {
|
||||
String original = fullLegacyContent();
|
||||
Path file = writeLegacyFile("app.properties", original);
|
||||
|
||||
// Pre-create .bak with different content
|
||||
Path existingBak = tempDir.resolve("app.properties.bak");
|
||||
Files.writeString(existingBak, "# existing bak", StandardCharsets.UTF_8);
|
||||
|
||||
defaultMigrator().migrateIfLegacy(file);
|
||||
|
||||
// Existing .bak must be untouched
|
||||
assertEquals("# existing bak", Files.readString(existingBak, StandardCharsets.UTF_8),
|
||||
"Existing .bak must not be overwritten");
|
||||
|
||||
// New backup must be .bak.1 with original content
|
||||
Path newBak = tempDir.resolve("app.properties.bak.1");
|
||||
assertTrue(Files.exists(newBak), ".bak.1 must be created when .bak already exists");
|
||||
assertEquals(original, Files.readString(newBak, StandardCharsets.UTF_8),
|
||||
".bak.1 must contain the original content");
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
// Mandatory test case 4
|
||||
// =========================================================================
|
||||
|
||||
/**
|
||||
* A file already in the new multi-provider schema triggers no write and no {@code .bak}.
|
||||
*/
|
||||
@Test
|
||||
void noOpForAlreadyMigratedFile() throws IOException {
|
||||
String newSchema = "ai.provider.active=openai-compatible\n"
|
||||
+ "ai.provider.openai-compatible.baseUrl=https://api.openai.com/v1\n"
|
||||
+ "ai.provider.openai-compatible.model=gpt-4o\n"
|
||||
+ "ai.provider.openai-compatible.timeoutSeconds=30\n"
|
||||
+ "ai.provider.openai-compatible.apiKey=sk-key\n";
|
||||
Path file = writeLegacyFile("app.properties", newSchema);
|
||||
long modifiedBefore = Files.getLastModifiedTime(file).toMillis();
|
||||
|
||||
defaultMigrator().migrateIfLegacy(file);
|
||||
|
||||
// File must not have been rewritten
|
||||
assertEquals(modifiedBefore, Files.getLastModifiedTime(file).toMillis(),
|
||||
"File modification time must not change for already-migrated files");
|
||||
|
||||
// No .bak should exist
|
||||
Path bakFile = tempDir.resolve("app.properties.bak");
|
||||
assertFalse(Files.exists(bakFile), "No .bak must be created for already-migrated files");
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
// Mandatory test case 5
|
||||
// =========================================================================
|
||||
|
||||
/**
|
||||
* After migration, the new parser and validator load the file without error.
|
||||
*/
|
||||
@Test
|
||||
void reloadAfterMigrationSucceeds() throws IOException {
|
||||
Path file = writeLegacyFile("app.properties", fullLegacyContent());
|
||||
|
||||
defaultMigrator().migrateIfLegacy(file);
|
||||
|
||||
// Reload and parse with the new parser+validator — must not throw
|
||||
Properties props = loadProperties(file);
|
||||
MultiProviderConfiguration config = assertDoesNotThrow(
|
||||
() -> new MultiProviderConfigurationParser().parse(props),
|
||||
"Migrated file must be parseable by MultiProviderConfigurationParser");
|
||||
assertDoesNotThrow(
|
||||
() -> new MultiProviderConfigurationValidator().validate(config),
|
||||
"Migrated file must pass MultiProviderConfigurationValidator");
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
// Mandatory test case 6
|
||||
// =========================================================================
|
||||
|
||||
/**
|
||||
* When post-migration validation fails, a {@link ConfigurationLoadingException} is thrown
|
||||
* and the {@code .bak} backup is preserved with the original content.
|
||||
*/
|
||||
@Test
|
||||
void migrationFailureKeepsBak() throws IOException {
|
||||
String original = fullLegacyContent();
|
||||
Path file = writeLegacyFile("app.properties", original);
|
||||
|
||||
// Validator that always rejects
|
||||
MultiProviderConfigurationValidator failingValidator = new MultiProviderConfigurationValidator() {
|
||||
@Override
|
||||
public void validate(MultiProviderConfiguration config) {
|
||||
throw new InvalidStartConfigurationException("Simulated validation failure");
|
||||
}
|
||||
};
|
||||
|
||||
LegacyConfigurationMigrator migrator = new LegacyConfigurationMigrator(
|
||||
new MultiProviderConfigurationParser(), failingValidator);
|
||||
|
||||
assertThrows(ConfigurationLoadingException.class,
|
||||
() -> migrator.migrateIfLegacy(file),
|
||||
"Migration must throw ConfigurationLoadingException when post-migration validation fails");
|
||||
|
||||
// .bak must be preserved with original content
|
||||
Path bakFile = tempDir.resolve("app.properties.bak");
|
||||
assertTrue(Files.exists(bakFile), ".bak must be preserved after migration failure");
|
||||
assertEquals(original, Files.readString(bakFile, StandardCharsets.UTF_8),
|
||||
".bak content must match the original file content");
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
// Mandatory test case 7
|
||||
// =========================================================================
|
||||
|
||||
/**
|
||||
* A file that contains {@code ai.provider.active} but no legacy {@code api.*} keys
|
||||
* is not considered legacy and triggers no migration.
|
||||
*/
|
||||
@Test
|
||||
void legacyDetectionRequiresAtLeastOneFlatKey() throws IOException {
|
||||
String notLegacy = "ai.provider.active=openai-compatible\n"
|
||||
+ "source.folder=./source\n"
|
||||
+ "max.pages=10\n";
|
||||
Path file = writeLegacyFile("app.properties", notLegacy);
|
||||
|
||||
Properties props = new Properties();
|
||||
props.load(Files.newBufferedReader(file, StandardCharsets.UTF_8));
|
||||
|
||||
boolean detected = defaultMigrator().isLegacyForm(props);
|
||||
|
||||
assertFalse(detected, "File with ai.provider.active and no api.* keys must not be detected as legacy");
|
||||
}
|
||||
|
||||
// =========================================================================
|
||||
// Mandatory test case 8
|
||||
// =========================================================================
|
||||
|
||||
/**
|
||||
* The four legacy values land in exactly the target keys in the openai-compatible namespace,
|
||||
* and {@code ai.provider.active} is set to {@code openai-compatible}.
|
||||
*/
|
||||
@Test
|
||||
void legacyValuesEndUpInOpenAiCompatibleNamespace() throws IOException {
|
||||
String content = "api.baseUrl=https://legacy.example.com/v1\n"
|
||||
+ "api.model=legacy-model\n"
|
||||
+ "api.timeoutSeconds=42\n"
|
||||
+ "api.key=legacy-key\n"
|
||||
+ "source.folder=./src\n";
|
||||
Path file = writeLegacyFile("app.properties", content);
|
||||
|
||||
defaultMigrator().migrateIfLegacy(file);
|
||||
|
||||
Properties migrated = loadProperties(file);
|
||||
assertEquals("https://legacy.example.com/v1", migrated.getProperty("ai.provider.openai-compatible.baseUrl"),
|
||||
"api.baseUrl must map to ai.provider.openai-compatible.baseUrl");
|
||||
assertEquals("legacy-model", migrated.getProperty("ai.provider.openai-compatible.model"),
|
||||
"api.model must map to ai.provider.openai-compatible.model");
|
||||
assertEquals("42", migrated.getProperty("ai.provider.openai-compatible.timeoutSeconds"),
|
||||
"api.timeoutSeconds must map to ai.provider.openai-compatible.timeoutSeconds");
|
||||
assertEquals("legacy-key", migrated.getProperty("ai.provider.openai-compatible.apiKey"),
|
||||
"api.key must map to ai.provider.openai-compatible.apiKey");
|
||||
assertEquals("openai-compatible", migrated.getProperty("ai.provider.active"),
|
||||
"ai.provider.active must be set to openai-compatible");
|
||||
}

    // =========================================================================
    // Mandatory test case 9
    // =========================================================================

    /**
     * Keys unrelated to the legacy api.* set survive the migration with identical values.
     */
    @Test
    void unrelatedKeysSurviveUnchanged() throws IOException {
        String content = "source.folder=./my/source\n"
                + "target.folder=./my/target\n"
                + "sqlite.file=./my/db.sqlite\n"
                + "max.pages=15\n"
                + "max.text.characters=3000\n"
                + "log.level=DEBUG\n"
                + "log.ai.sensitive=false\n"
                + "api.baseUrl=https://api.openai.com/v1\n"
                + "api.model=gpt-4o\n"
                + "api.timeoutSeconds=30\n"
                + "api.key=sk-unrelated-test\n";
        Path file = writeLegacyFile("app.properties", content);

        defaultMigrator().migrateIfLegacy(file);

        Properties migrated = loadProperties(file);
        assertEquals("./my/source", migrated.getProperty("source.folder"), "source.folder must be unchanged");
        assertEquals("./my/target", migrated.getProperty("target.folder"), "target.folder must be unchanged");
        assertEquals("./my/db.sqlite", migrated.getProperty("sqlite.file"), "sqlite.file must be unchanged");
        assertEquals("15", migrated.getProperty("max.pages"), "max.pages must be unchanged");
        assertEquals("3000", migrated.getProperty("max.text.characters"), "max.text.characters must be unchanged");
        assertEquals("DEBUG", migrated.getProperty("log.level"), "log.level must be unchanged");
        assertEquals("false", migrated.getProperty("log.ai.sensitive"), "log.ai.sensitive must be unchanged");
    }

    // =========================================================================
    // Mandatory test case 10
    // =========================================================================

    /**
     * Migration writes via a temporary {@code .tmp} file followed by a move/rename.
     * After successful migration, no {@code .tmp} file remains, and the original path
     * holds the fully migrated content (never partially overwritten).
     */
    @Test
    void inPlaceWriteIsAtomic() throws IOException {
        Path file = writeLegacyFile("app.properties", fullLegacyContent());
        Path tmpFile = tempDir.resolve("app.properties.tmp");

        defaultMigrator().migrateIfLegacy(file);

        // .tmp must have been cleaned up (moved to target, not left behind)
        assertFalse(Files.exists(tmpFile),
                ".tmp file must not exist after migration (must have been moved to target)");

        // Target must contain migrated content
        Properties migrated = loadProperties(file);
        assertTrue(migrated.containsKey("ai.provider.active"),
                "Migrated file must contain ai.provider.active (complete write confirmed)");
        assertTrue(migrated.containsKey("ai.provider.openai-compatible.model"),
                "Migrated file must contain the new namespaced model key (complete write confirmed)");
    }
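The write-then-rename pattern this test exercises can be sketched with `java.nio.file` alone (a minimal illustration, not the migrator's actual implementation; the helper name is hypothetical):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicWrite {
    /**
     * Writes the content to a sibling .tmp file, then moves it onto the target,
     * so readers never observe a partially written file. Falls back to a plain
     * replacing move when the file system cannot perform an atomic move.
     */
    public static void writeAtomically(Path target, String content) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.writeString(tmp, content, StandardCharsets.UTF_8);
        try {
            Files.move(tmp, target,
                    StandardCopyOption.ATOMIC_MOVE, StandardCopyOption.REPLACE_EXISTING);
        } catch (AtomicMoveNotSupportedException e) {
            // Non-atomic fallback still removes the .tmp file.
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Either branch ends with the `.tmp` file gone and the target holding the complete new content, which is what the two assertions above verify.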

    // =========================================================================
    // Tests: isLegacyForm – each individual legacy key triggers detection
    // =========================================================================

    /**
     * A properties set containing only {@code api.baseUrl} (without {@code ai.provider.active})
     * must be detected as legacy.
     */
    @Test
    void isLegacyForm_detectedWhenOnlyBaseUrlPresent() {
        Properties props = new Properties();
        props.setProperty(LegacyConfigurationMigrator.LEGACY_BASE_URL, "https://api.example.com");
        assertTrue(defaultMigrator().isLegacyForm(props),
                "Properties with only api.baseUrl must be detected as legacy");
    }

    /**
     * A properties set containing only {@code api.model} (without {@code ai.provider.active})
     * must be detected as legacy.
     */
    @Test
    void isLegacyForm_detectedWhenOnlyModelPresent() {
        Properties props = new Properties();
        props.setProperty(LegacyConfigurationMigrator.LEGACY_MODEL, "gpt-4o");
        assertTrue(defaultMigrator().isLegacyForm(props),
                "Properties with only api.model must be detected as legacy");
    }

    /**
     * A properties set containing only {@code api.timeoutSeconds} (without {@code ai.provider.active})
     * must be detected as legacy.
     */
    @Test
    void isLegacyForm_detectedWhenOnlyTimeoutPresent() {
        Properties props = new Properties();
        props.setProperty(LegacyConfigurationMigrator.LEGACY_TIMEOUT, "30");
        assertTrue(defaultMigrator().isLegacyForm(props),
                "Properties with only api.timeoutSeconds must be detected as legacy");
    }

    /**
     * A properties set containing only {@code api.key} (without {@code ai.provider.active})
     * must be detected as legacy.
     */
    @Test
    void isLegacyForm_detectedWhenOnlyApiKeyPresent() {
        Properties props = new Properties();
        props.setProperty(LegacyConfigurationMigrator.LEGACY_API_KEY, "sk-test");
        assertTrue(defaultMigrator().isLegacyForm(props),
                "Properties with only api.key must be detected as legacy");
    }

    // =========================================================================
    // Tests: lineDefinesKey / generateMigratedContent – prefix-only match must not fire
    // =========================================================================

    /**
     * A line whose key is a prefix of a legacy key (e.g. {@code api.baseUrlExtra}) must not
     * be treated as defining the legacy key ({@code api.baseUrl}) and must survive migration
     * unchanged while the actual legacy key is correctly replaced.
     */
    @Test
    void generateMigratedContent_doesNotReplacePrefixMatchKey() {
        String content = "api.baseUrlExtra=should-not-change\n"
                + "api.baseUrl=https://real.example.com\n"
                + "api.model=gpt-4o\n"
                + "api.timeoutSeconds=30\n"
                + "api.key=sk-real\n";

        String migrated = defaultMigrator().generateMigratedContent(content);

        assertTrue(migrated.contains("api.baseUrlExtra=should-not-change"),
                "Line with key that is a prefix of a legacy key must not be modified");
        assertTrue(migrated.contains("ai.provider.openai-compatible.baseUrl=https://real.example.com"),
                "The actual legacy key api.baseUrl must be replaced with the namespaced key");
    }
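The prefix-safety these tests pin down comes from matching the whole key, not just a leading substring: after the key there must be end-of-line or a properties separator. A minimal sketch of such a check (illustrative; the real `lineDefinesKey` may handle more of the `.properties` grammar, e.g. escaped characters):

```java
public class KeyLineMatcher {
    /**
     * Returns true only when the line defines exactly the given key: the key
     * must be followed by end-of-line or a properties key/value separator
     * ('=', ':', space, or tab). "api.baseUrlExtra" therefore never matches
     * the key "api.baseUrl", while a key-only line with no value still does.
     */
    public static boolean lineDefinesKey(String line, String key) {
        String trimmed = line.stripLeading();
        if (!trimmed.startsWith(key)) {
            return false;
        }
        if (trimmed.length() == key.length()) {
            return true; // key-only line, no separator, no value
        }
        char next = trimmed.charAt(key.length());
        return next == '=' || next == ':' || next == ' ' || next == '\t';
    }
}
```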

    /**
     * A line that defines a legacy key with no value (key only, no separator)
     * must be recognized as defining that key and be replaced in migration.
     */
    @Test
    void generateMigratedContent_handlesKeyWithoutValue() {
        String content = "api.baseUrl\n"
                + "api.model=gpt-4o\n"
                + "api.timeoutSeconds=30\n"
                + "api.key=sk-test\n";

        String migrated = defaultMigrator().generateMigratedContent(content);

        assertTrue(migrated.contains("ai.provider.openai-compatible.baseUrl"),
                "Key-only line (no value, no separator) must still be recognized and replaced");
        assertFalse(migrated.contains("api.baseUrl\n") || migrated.contains("api.baseUrl\r"),
                "Original key-only line must not survive unchanged");
    }
}
@@ -0,0 +1,463 @@
package de.gecheckt.pdf.umbenenner.adapter.out.configuration;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.util.Properties;
import java.util.function.Function;

import org.junit.jupiter.api.Test;

import de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException;
import de.gecheckt.pdf.umbenenner.application.config.provider.AiProviderFamily;
import de.gecheckt.pdf.umbenenner.application.config.provider.MultiProviderConfiguration;

/**
 * Tests for the multi-provider configuration parsing and validation pipeline.
 * <p>
 * Covers all mandatory test cases for the new configuration schema as defined
 * in the active work package specification.
 */
class MultiProviderConfigurationTest {

    private static final Function<String, String> NO_ENV = key -> null;

    // -------------------------------------------------------------------------
    // Helpers
    // -------------------------------------------------------------------------

    private Properties fullOpenAiProperties() {
        Properties props = new Properties();
        props.setProperty("ai.provider.active", "openai-compatible");
        props.setProperty("ai.provider.openai-compatible.baseUrl", "https://api.openai.com");
        props.setProperty("ai.provider.openai-compatible.model", "gpt-4o");
        props.setProperty("ai.provider.openai-compatible.timeoutSeconds", "30");
        props.setProperty("ai.provider.openai-compatible.apiKey", "sk-openai-test");
        // Claude side intentionally not set (inactive)
        return props;
    }

    private Properties fullClaudeProperties() {
        Properties props = new Properties();
        props.setProperty("ai.provider.active", "claude");
        props.setProperty("ai.provider.claude.baseUrl", "https://api.anthropic.com");
        props.setProperty("ai.provider.claude.model", "claude-3-5-sonnet-20241022");
        props.setProperty("ai.provider.claude.timeoutSeconds", "60");
        props.setProperty("ai.provider.claude.apiKey", "sk-ant-test");
        // OpenAI side intentionally not set (inactive)
        return props;
    }

    private MultiProviderConfiguration parseAndValidate(Properties props,
            Function<String, String> envLookup) {
        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(envLookup);
        MultiProviderConfiguration config = parser.parse(props);
        new MultiProviderConfigurationValidator().validate(config);
        return config;
    }

    private MultiProviderConfiguration parseAndValidate(Properties props) {
        return parseAndValidate(props, NO_ENV);
    }

    // =========================================================================
    // Mandatory test case 1
    // =========================================================================

    /**
     * Full new schema with OpenAI-compatible active, all required values present.
     * Parser and validator must both succeed.
     */
    @Test
    void parsesNewSchemaWithOpenAiCompatibleActive() {
        MultiProviderConfiguration config = parseAndValidate(fullOpenAiProperties());

        assertEquals(AiProviderFamily.OPENAI_COMPATIBLE, config.activeProviderFamily());
        assertEquals("gpt-4o", config.openAiCompatibleConfig().model());
        assertEquals(30, config.openAiCompatibleConfig().timeoutSeconds());
        assertEquals("https://api.openai.com", config.openAiCompatibleConfig().baseUrl());
        assertEquals("sk-openai-test", config.openAiCompatibleConfig().apiKey());
    }

    // =========================================================================
    // Mandatory test case 2
    // =========================================================================

    /**
     * Full new schema with Claude active, all required values present.
     * Parser and validator must both succeed.
     */
    @Test
    void parsesNewSchemaWithClaudeActive() {
        MultiProviderConfiguration config = parseAndValidate(fullClaudeProperties());

        assertEquals(AiProviderFamily.CLAUDE, config.activeProviderFamily());
        assertEquals("claude-3-5-sonnet-20241022", config.claudeConfig().model());
        assertEquals(60, config.claudeConfig().timeoutSeconds());
        assertEquals("https://api.anthropic.com", config.claudeConfig().baseUrl());
        assertEquals("sk-ant-test", config.claudeConfig().apiKey());
    }

    // =========================================================================
    // Mandatory test case 3
    // =========================================================================

    /**
     * Claude active, {@code ai.provider.claude.baseUrl} absent.
     * The default {@code https://api.anthropic.com} must be applied; validation must pass.
     */
    @Test
    void claudeBaseUrlDefaultsWhenMissing() {
        Properties props = fullClaudeProperties();
        props.remove("ai.provider.claude.baseUrl");

        MultiProviderConfiguration config = parseAndValidate(props);

        assertNotNull(config.claudeConfig().baseUrl(),
                "baseUrl must not be null when Claude default is applied");
        assertEquals(MultiProviderConfigurationParser.CLAUDE_DEFAULT_BASE_URL,
                config.claudeConfig().baseUrl(),
                "Default Claude baseUrl must be https://api.anthropic.com");
    }

    // =========================================================================
    // Mandatory test case 4
    // =========================================================================

    /**
     * {@code ai.provider.active} is absent. Parser must throw with a clear message.
     */
    @Test
    void rejectsMissingActiveProvider() {
        Properties props = fullOpenAiProperties();
        props.remove("ai.provider.active");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        ConfigurationLoadingException ex = assertThrows(
                ConfigurationLoadingException.class,
                () -> parser.parse(props));

        assertTrue(ex.getMessage().contains("ai.provider.active"),
                "Error message must reference the missing property");
    }

    // =========================================================================
    // Mandatory test case 5
    // =========================================================================

    /**
     * {@code ai.provider.active=foo} – unrecognised value. Parser must throw.
     */
    @Test
    void rejectsUnknownActiveProvider() {
        Properties props = fullOpenAiProperties();
        props.setProperty("ai.provider.active", "foo");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        ConfigurationLoadingException ex = assertThrows(
                ConfigurationLoadingException.class,
                () -> parser.parse(props));

        assertTrue(ex.getMessage().contains("foo"),
                "Error message must include the unrecognised value");
    }

    // =========================================================================
    // Mandatory test case 6
    // =========================================================================

    /**
     * Active provider has a mandatory field blank (model removed). Validation must fail.
     */
    @Test
    void rejectsMissingMandatoryFieldForActiveProvider() {
        Properties props = fullOpenAiProperties();
        props.remove("ai.provider.openai-compatible.model");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        MultiProviderConfiguration config = parser.parse(props);

        InvalidStartConfigurationException ex = assertThrows(
                InvalidStartConfigurationException.class,
                () -> new MultiProviderConfigurationValidator().validate(config));

        assertTrue(ex.getMessage().contains("model"),
                "Error message must mention the missing field");
    }

    // =========================================================================
    // Mandatory test case 7
    // =========================================================================

    /**
     * Inactive provider has incomplete configuration (Claude fields missing while OpenAI is active).
     * Validation must pass; inactive provider fields are not required.
     */
    @Test
    void acceptsMissingMandatoryFieldForInactiveProvider() {
        // OpenAI active, Claude completely unconfigured
        Properties props = fullOpenAiProperties();
        // No ai.provider.claude.* keys set

        MultiProviderConfiguration config = parseAndValidate(props);

        assertEquals(AiProviderFamily.OPENAI_COMPATIBLE, config.activeProviderFamily(),
                "Active provider must be openai-compatible");
        // Claude config may have null/blank fields – no exception expected
    }

    // =========================================================================
    // Mandatory test case 8
    // =========================================================================

    /**
     * Environment variable for the active provider overrides the properties value.
     * <p>
     * Sub-case A: {@code OPENAI_COMPATIBLE_API_KEY} set, OpenAI active.
     * Sub-case B: {@code ANTHROPIC_API_KEY} set, Claude active.
     */
    @Test
    void envVarOverridesPropertiesApiKeyForActiveProvider() {
        // Sub-case A: OpenAI active, OPENAI_COMPATIBLE_API_KEY set
        Properties openAiProps = fullOpenAiProperties();
        openAiProps.setProperty("ai.provider.openai-compatible.apiKey", "properties-key");

        Function<String, String> envWithOpenAiKey = key ->
                MultiProviderConfigurationParser.ENV_OPENAI_API_KEY.equals(key)
                        ? "env-openai-key" : null;

        MultiProviderConfiguration openAiConfig = parseAndValidate(openAiProps, envWithOpenAiKey);
        assertEquals("env-openai-key", openAiConfig.openAiCompatibleConfig().apiKey(),
                "Env var must override properties API key for OpenAI-compatible");

        // Sub-case B: Claude active, ANTHROPIC_API_KEY set
        Properties claudeProps = fullClaudeProperties();
        claudeProps.setProperty("ai.provider.claude.apiKey", "properties-key");

        Function<String, String> envWithClaudeKey = key ->
                MultiProviderConfigurationParser.ENV_CLAUDE_API_KEY.equals(key)
                        ? "env-claude-key" : null;

        MultiProviderConfiguration claudeConfig = parseAndValidate(claudeProps, envWithClaudeKey);
        assertEquals("env-claude-key", claudeConfig.claudeConfig().apiKey(),
                "Env var must override properties API key for Claude");
    }
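Taken together, the env-var tests in this class pin down one precedence chain per provider: primary env var, then (for OpenAI-compatible) the legacy env var, then the properties value. A minimal sketch of that resolution, assuming the env-var names the test constants refer to (`OPENAI_COMPATIBLE_API_KEY`, `PDF_UMBENENNER_API_KEY`); the real parser's implementation may differ:

```java
import java.util.function.Function;

public class ApiKeyResolver {
    /**
     * Resolves the OpenAI-compatible API key in precedence order:
     * primary env var, then the legacy env var, then the properties value.
     * Blank env values are treated as absent.
     */
    public static String resolveOpenAiApiKey(Function<String, String> env, String propertiesValue) {
        String primary = env.apply("OPENAI_COMPATIBLE_API_KEY");
        if (primary != null && !primary.isBlank()) {
            return primary;
        }
        String legacy = env.apply("PDF_UMBENENNER_API_KEY");
        if (legacy != null && !legacy.isBlank()) {
            return legacy;
        }
        return propertiesValue;
    }
}
```

Passing the env lookup as a `Function<String, String>` (as the tests do with `NO_ENV` and the lambdas above) keeps the chain deterministic and testable without touching the real process environment.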

    // =========================================================================
    // Test: legacy env var PDF_UMBENENNER_API_KEY
    // =========================================================================

    /**
     * {@code PDF_UMBENENNER_API_KEY} is set, {@code OPENAI_COMPATIBLE_API_KEY} is absent.
     * The legacy variable must be accepted as a fallback for the OpenAI-compatible provider.
     */
    @Test
    void legacyEnvVarPdfUmbenennerApiKeyUsedWhenPrimaryAbsent() {
        Properties props = fullOpenAiProperties();
        props.remove("ai.provider.openai-compatible.apiKey");

        Function<String, String> envWithLegacy = key ->
                MultiProviderConfigurationParser.ENV_LEGACY_OPENAI_API_KEY.equals(key)
                        ? "legacy-env-key" : null;

        MultiProviderConfiguration config = parseAndValidate(props, envWithLegacy);
        assertEquals("legacy-env-key", config.openAiCompatibleConfig().apiKey(),
                "Legacy env var PDF_UMBENENNER_API_KEY must be used when OPENAI_COMPATIBLE_API_KEY is absent");
    }

    /**
     * {@code OPENAI_COMPATIBLE_API_KEY} takes precedence over {@code PDF_UMBENENNER_API_KEY}.
     */
    @Test
    void primaryEnvVarTakesPrecedenceOverLegacyEnvVar() {
        Properties props = fullOpenAiProperties();
        props.remove("ai.provider.openai-compatible.apiKey");

        Function<String, String> envBoth = key -> {
            if (MultiProviderConfigurationParser.ENV_OPENAI_API_KEY.equals(key)) return "primary-key";
            if (MultiProviderConfigurationParser.ENV_LEGACY_OPENAI_API_KEY.equals(key)) return "legacy-key";
            return null;
        };

        MultiProviderConfiguration config = parseAndValidate(props, envBoth);
        assertEquals("primary-key", config.openAiCompatibleConfig().apiKey(),
                "OPENAI_COMPATIBLE_API_KEY must take precedence over PDF_UMBENENNER_API_KEY");
    }

    /**
     * Neither env var is set; the properties value is used as final fallback.
     */
    @Test
    void propertiesApiKeyUsedWhenNoEnvVarSet() {
        Properties props = fullOpenAiProperties();
        props.setProperty("ai.provider.openai-compatible.apiKey", "props-only-key");

        MultiProviderConfiguration config = parseAndValidate(props, NO_ENV);
        assertEquals("props-only-key", config.openAiCompatibleConfig().apiKey(),
                "Properties API key must be used when no env var is set");
    }

    // =========================================================================
    // Tests: base URL validation
    // =========================================================================

    /**
     * OpenAI-compatible provider with an invalid (non-URI) base URL must be rejected.
     */
    @Test
    void rejectsInvalidBaseUrlForActiveOpenAiProvider() {
        Properties props = fullOpenAiProperties();
        props.setProperty("ai.provider.openai-compatible.baseUrl", "not a valid url at all ://");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        MultiProviderConfiguration config = parser.parse(props);

        InvalidStartConfigurationException ex = assertThrows(
                InvalidStartConfigurationException.class,
                () -> new MultiProviderConfigurationValidator().validate(config));

        assertTrue(ex.getMessage().contains("baseUrl"),
                "Error message must reference baseUrl");
    }

    /**
     * Claude provider with an invalid base URL must be rejected when Claude is active.
     */
    @Test
    void rejectsInvalidBaseUrlForActiveClaudeProvider() {
        Properties props = fullClaudeProperties();
        props.setProperty("ai.provider.claude.baseUrl", "ftp://api.anthropic.com");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        MultiProviderConfiguration config = parser.parse(props);

        InvalidStartConfigurationException ex = assertThrows(
                InvalidStartConfigurationException.class,
                () -> new MultiProviderConfigurationValidator().validate(config));

        assertTrue(ex.getMessage().contains("baseUrl"),
                "Error message must reference baseUrl");
        assertTrue(ex.getMessage().contains("ftp"),
                "Error message must mention the invalid scheme");
    }

    /**
     * A relative URI (no scheme, no host) must be rejected.
     */
    @Test
    void rejectsRelativeUriAsBaseUrl() {
        Properties props = fullOpenAiProperties();
        props.setProperty("ai.provider.openai-compatible.baseUrl", "/v1/chat");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        MultiProviderConfiguration config = parser.parse(props);

        InvalidStartConfigurationException ex = assertThrows(
                InvalidStartConfigurationException.class,
                () -> new MultiProviderConfigurationValidator().validate(config));

        assertTrue(ex.getMessage().contains("baseUrl"),
                "Error message must reference baseUrl");
    }

    /**
     * A non-http/https scheme (e.g. {@code ftp://}) must be rejected.
     */
    @Test
    void rejectsNonHttpSchemeAsBaseUrl() {
        Properties props = fullOpenAiProperties();
        props.setProperty("ai.provider.openai-compatible.baseUrl", "ftp://api.example.com");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        MultiProviderConfiguration config = parser.parse(props);

        InvalidStartConfigurationException ex = assertThrows(
                InvalidStartConfigurationException.class,
                () -> new MultiProviderConfigurationValidator().validate(config));

        assertTrue(ex.getMessage().contains("baseUrl"),
                "Error message must reference baseUrl");
        assertTrue(ex.getMessage().contains("ftp"),
                "Error message must mention the invalid scheme");
    }
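The four base-URL tests above reduce to one predicate: the value must parse as a URI with an `http` or `https` scheme and a host. A minimal sketch using `java.net.URI` (illustrative only; the validator's actual checks and error messages may be richer):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class BaseUrlCheck {
    /**
     * A base URL is acceptable only if it parses as a URI with an http or
     * https scheme and a host. Relative URIs (no scheme, no host) and other
     * schemes such as ftp are rejected, matching the validator tests above.
     */
    public static boolean isValidBaseUrl(String value) {
        try {
            URI uri = new URI(value);
            String scheme = uri.getScheme();
            return ("http".equals(scheme) || "https".equals(scheme))
                    && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false; // not parseable as a URI at all
        }
    }
}
```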

    // =========================================================================
    // Mandatory test case 9
    // =========================================================================

    /**
     * Environment variable is set only for the inactive provider.
     * The active provider must use its own properties value; the inactive provider's
     * env var must not affect the active provider's resolved key.
     */
    @Test
    void envVarOnlyResolvesForActiveProvider() {
        // OpenAI is active with a properties apiKey.
        // ANTHROPIC_API_KEY is set (for the inactive Claude provider).
        // The OpenAI config must use its properties key, not the Anthropic env var.
        Properties props = fullOpenAiProperties();
        props.setProperty("ai.provider.openai-compatible.apiKey", "openai-properties-key");

        Function<String, String> envWithClaudeKeyOnly = key ->
                MultiProviderConfigurationParser.ENV_CLAUDE_API_KEY.equals(key)
                        ? "anthropic-env-key" : null;

        MultiProviderConfiguration config = parseAndValidate(props, envWithClaudeKeyOnly);

        assertEquals("openai-properties-key",
                config.openAiCompatibleConfig().apiKey(),
                "Active provider (OpenAI) must use its own properties key, "
                        + "not the inactive provider's env var");
        // The Anthropic env var IS applied to the Claude config (inactive),
        // but that does not affect the active provider.
        assertEquals("anthropic-env-key",
                config.claudeConfig().apiKey(),
                "Inactive Claude config should still pick up its own env var");
    }

    // =========================================================================
    // Tests: timeout validation
    // =========================================================================

    /**
     * Active provider has timeout set to 0. Validation must fail and mention timeoutSeconds.
     * This verifies that validateTimeoutSeconds is called and that the boundary is strictly
     * positive (i.e. 0 is rejected, not just negative values).
     */
    @Test
    void rejectsZeroTimeoutForActiveProvider() {
        Properties props = fullOpenAiProperties();
        props.setProperty("ai.provider.openai-compatible.timeoutSeconds", "0");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        MultiProviderConfiguration config = parser.parse(props);

        InvalidStartConfigurationException ex = assertThrows(
                InvalidStartConfigurationException.class,
                () -> new MultiProviderConfigurationValidator().validate(config));

        assertTrue(ex.getMessage().contains("timeoutSeconds"),
                "Error message must reference timeoutSeconds");
    }

    /**
     * Active Claude provider has timeout set to 0. Same invariant for the other provider family.
     */
    @Test
    void rejectsZeroTimeoutForActiveClaudeProvider() {
        Properties props = fullClaudeProperties();
        props.setProperty("ai.provider.claude.timeoutSeconds", "0");

        MultiProviderConfigurationParser parser = new MultiProviderConfigurationParser(NO_ENV);
        MultiProviderConfiguration config = parser.parse(props);

        InvalidStartConfigurationException ex = assertThrows(
                InvalidStartConfigurationException.class,
                () -> new MultiProviderConfigurationValidator().validate(config));

        assertTrue(ex.getMessage().contains("timeoutSeconds"),
                "Error message must reference timeoutSeconds");
    }
}
@@ -1,23 +1,28 @@
package de.gecheckt.pdf.umbenenner.adapter.out.configuration;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.adapter.out.configuration.PropertiesConfigurationPortAdapter;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.io.FileWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;

import static org.junit.jupiter.api.Assertions.*;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.config.provider.AiProviderFamily;

/**
 * Unit tests for {@link PropertiesConfigurationPortAdapter}.
 * <p>
 * Tests cover valid configuration loading, missing mandatory properties,
 * invalid property values, and API-key environment variable precedence.
 * invalid property values, and API-key environment variable precedence
 * for the multi-provider schema.
 */
class PropertiesConfigurationPortAdapterTest {

@@ -40,13 +45,20 @@ class PropertiesConfigurationPortAdapterTest {
        var config = adapter.loadConfiguration();

        assertNotNull(config);
        // Use endsWith to handle platform-specific path separators
        assertTrue(config.sourceFolder().toString().endsWith("source"));
        assertTrue(config.targetFolder().toString().endsWith("target"));
        assertTrue(config.sqliteFile().toString().endsWith("db.sqlite"));
        assertEquals("https://api.example.com", config.apiBaseUrl().toString());
        assertEquals("gpt-4", config.apiModel());
        assertEquals(30, config.apiTimeoutSeconds());
        assertNotNull(config.multiProviderConfiguration());
        assertEquals(AiProviderFamily.OPENAI_COMPATIBLE,
                config.multiProviderConfiguration().activeProviderFamily());
        assertEquals("https://api.example.com",
                config.multiProviderConfiguration().activeProviderConfiguration().baseUrl());
        assertEquals("gpt-4",
                config.multiProviderConfiguration().activeProviderConfiguration().model());
        assertEquals(30,
                config.multiProviderConfiguration().activeProviderConfiguration().timeoutSeconds());
        assertEquals("test-api-key-from-properties",
                config.multiProviderConfiguration().activeProviderConfiguration().apiKey());
        assertEquals(3, config.maxRetriesTransient());
        assertEquals(100, config.maxPages());
        assertEquals(50000, config.maxTextCharacters());
@@ -54,57 +66,60 @@
        assertTrue(config.runtimeLockFile().toString().endsWith("lock.lock"));
        assertTrue(config.logDirectory().toString().endsWith("logs"));
        assertEquals("DEBUG", config.logLevel());
        assertEquals("test-api-key-from-properties", config.apiKey());
    }

    @Test
    void loadConfiguration_usesPropertiesApiKeyWhenEnvVarIsAbsent() throws Exception {
    void loadConfiguration_rejectsBlankApiKeyWhenAbsentAndNoEnvVar() throws Exception {
        Path configFile = createConfigFile("no-api-key.properties");

        PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

        var config = adapter.loadConfiguration();

        assertEquals("", config.apiKey(), "API key should be empty when not in properties and no env var");
        assertThrows(
                de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException.class,
                adapter::loadConfiguration,
                "Missing API key must be rejected as invalid configuration");
    }

    @Test
    void loadConfiguration_usesPropertiesApiKeyWhenEnvVarIsNull() throws Exception {
    void loadConfiguration_rejectsBlankApiKeyWhenEnvVarIsNull() throws Exception {
        Path configFile = createConfigFile("no-api-key.properties");

        Function<String, String> envLookup = key -> null;

        PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(envLookup, configFile);

        var config = adapter.loadConfiguration();

        assertEquals("", config.apiKey());
        assertThrows(
                de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException.class,
                adapter::loadConfiguration,
                "Null env var with no properties API key must be rejected as invalid configuration");
    }

    @Test
    void loadConfiguration_usesPropertiesApiKeyWhenEnvVarIsEmpty() throws Exception {
    void loadConfiguration_rejectsBlankApiKeyWhenEnvVarIsEmpty() throws Exception {
        Path configFile = createConfigFile("no-api-key.properties");

        Function<String, String> envLookup = key -> "";

        PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(envLookup, configFile);

        var config = adapter.loadConfiguration();

        assertEquals("", config.apiKey(), "Empty env var should fall back to empty string");
        assertThrows(
                de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException.class,
                adapter::loadConfiguration,
                "Empty env var with no properties API key must be rejected as invalid configuration");
    }

    @Test
    void loadConfiguration_usesPropertiesApiKeyWhenEnvVarIsBlank() throws Exception {
    void loadConfiguration_rejectsBlankApiKeyWhenEnvVarIsBlank() throws Exception {
        Path configFile = createConfigFile("no-api-key.properties");

        Function<String, String> envLookup = key -> " ";

        PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(envLookup, configFile);
|
||||
|
||||
var config = adapter.loadConfiguration();
|
||||
|
||||
assertEquals("", config.apiKey(), "Blank env var should fall back to empty string");
|
||||
assertThrows(
|
||||
de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.InvalidStartConfigurationException.class,
|
||||
adapter::loadConfiguration,
|
||||
"Blank env var with no properties API key must be rejected as invalid configuration");
|
||||
}
|
||||
|
||||
@Test
|
||||
@@ -112,7 +127,7 @@ class PropertiesConfigurationPortAdapterTest {
Path configFile = createConfigFile("valid-config.properties");

Function<String, String> envLookup = key -> {
if ("PDF_UMBENENNER_API_KEY".equals(key)) {
if (MultiProviderConfigurationParser.ENV_OPENAI_API_KEY.equals(key)) {
return "env-api-key-override";
}
return null;
@@ -122,17 +137,19 @@ class PropertiesConfigurationPortAdapterTest {

var config = adapter.loadConfiguration();

assertEquals("env-api-key-override", config.apiKey(), "Environment variable should override properties");
assertEquals("env-api-key-override",
config.multiProviderConfiguration().activeProviderConfiguration().apiKey(),
"Environment variable must override properties API key");
}

@Test
void loadConfiguration_throwsIllegalStateExceptionWhenRequiredPropertyMissing() throws Exception {
void loadConfiguration_throwsConfigurationLoadingExceptionWhenRequiredPropertyMissing() throws Exception {
Path configFile = createConfigFile("missing-required.properties");

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

IllegalStateException exception = assertThrows(
IllegalStateException.class,
ConfigurationLoadingException exception = assertThrows(
ConfigurationLoadingException.class,
() -> adapter.loadConfiguration()
);

@@ -141,13 +158,13 @@ class PropertiesConfigurationPortAdapterTest {
}

@Test
void loadConfiguration_throwsRuntimeExceptionWhenConfigFileNotFound() {
void loadConfiguration_throwsConfigurationLoadingExceptionWhenConfigFileNotFound() {
Path nonExistentFile = tempDir.resolve("nonexistent.properties");

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, nonExistentFile);

RuntimeException exception = assertThrows(
RuntimeException.class,
ConfigurationLoadingException exception = assertThrows(
ConfigurationLoadingException.class,
() -> adapter.loadConfiguration()
);

@@ -161,21 +178,22 @@ class PropertiesConfigurationPortAdapterTest {
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"api.baseUrl=https://api.example.com\n" +
"api.model=gpt-4\n" +
"api.timeoutSeconds=60\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=60\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=5\n" +
"max.pages=200\n" +
"max.text.characters=100000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"api.key=test-key\n"
"prompt.template.file=/tmp/prompt.txt\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertEquals(60, config.apiTimeoutSeconds());
assertEquals(60, config.multiProviderConfiguration().activeProviderConfiguration().timeoutSeconds());
assertEquals(5, config.maxRetriesTransient());
assertEquals(200, config.maxPages());
assertEquals(100000, config.maxTextCharacters());
@@ -187,43 +205,47 @@ class PropertiesConfigurationPortAdapterTest {
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"api.baseUrl=https://api.example.com\n" +
"api.model=gpt-4\n" +
"api.timeoutSeconds= 45 \n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds= 45 \n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=2\n" +
"max.pages=150\n" +
"max.text.characters=75000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"api.key=test-key\n"
"prompt.template.file=/tmp/prompt.txt\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertEquals(45, config.apiTimeoutSeconds(), "Whitespace should be trimmed from integer values");
assertEquals(45,
config.multiProviderConfiguration().activeProviderConfiguration().timeoutSeconds(),
"Whitespace should be trimmed from integer values");
}

@Test
void loadConfiguration_throwsIllegalStateExceptionForInvalidIntegerValue() throws Exception {
void loadConfiguration_throwsConfigurationLoadingExceptionForInvalidIntegerValue() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"api.baseUrl=https://api.example.com\n" +
"api.model=gpt-4\n" +
"api.timeoutSeconds=not-a-number\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=not-a-number\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=2\n" +
"max.pages=150\n" +
"max.text.characters=75000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"api.key=test-key\n"
"prompt.template.file=/tmp/prompt.txt\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

IllegalStateException exception = assertThrows(
IllegalStateException.class,
ConfigurationLoadingException exception = assertThrows(
ConfigurationLoadingException.class,
() -> adapter.loadConfiguration()
);

@@ -231,26 +253,28 @@ class PropertiesConfigurationPortAdapterTest {
}

@Test
void loadConfiguration_parsesUriCorrectly() throws Exception {
void loadConfiguration_parsesBaseUrlStringCorrectly() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"api.baseUrl=https://api.example.com:8080/v1\n" +
"api.model=gpt-4\n" +
"api.timeoutSeconds=30\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com:8080/v1\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"api.key=test-key\n"
"prompt.template.file=/tmp/prompt.txt\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertEquals("https://api.example.com:8080/v1", config.apiBaseUrl().toString());
assertEquals("https://api.example.com:8080/v1",
config.multiProviderConfiguration().activeProviderConfiguration().baseUrl());
}

@Test
@@ -259,14 +283,15 @@ class PropertiesConfigurationPortAdapterTest {
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"api.baseUrl=https://api.example.com\n" +
"api.model=gpt-4\n" +
"api.timeoutSeconds=30\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"api.key=test-key\n"
"prompt.template.file=/tmp/prompt.txt\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);
@@ -278,11 +303,278 @@ class PropertiesConfigurationPortAdapterTest {
assertEquals("INFO", config.logLevel(), "log.level should default to INFO");
}

@Test
void allConfigurationFailuresAreClassifiedAsConfigurationLoadingException() throws Exception {
// File I/O failure
Path nonExistentFile = tempDir.resolve("nonexistent.properties");
PropertiesConfigurationPortAdapter adapter1 = new PropertiesConfigurationPortAdapter(emptyEnvLookup, nonExistentFile);
assertThrows(ConfigurationLoadingException.class, () -> adapter1.loadConfiguration(),
"File I/O failure should throw ConfigurationLoadingException");

// Missing required property
Path missingPropFile = createConfigFile("missing-required.properties");
PropertiesConfigurationPortAdapter adapter2 = new PropertiesConfigurationPortAdapter(emptyEnvLookup, missingPropFile);
assertThrows(ConfigurationLoadingException.class, () -> adapter2.loadConfiguration(),
"Missing required property should throw ConfigurationLoadingException");

// Invalid integer value
Path invalidIntFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=invalid\n" +
"ai.provider.openai-compatible.apiKey=key\n" +
"max.retries.transient=2\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n"
);
PropertiesConfigurationPortAdapter adapter3 = new PropertiesConfigurationPortAdapter(emptyEnvLookup, invalidIntFile);
assertThrows(ConfigurationLoadingException.class, () -> adapter3.loadConfiguration(),
"Invalid integer value should throw ConfigurationLoadingException");

// Unknown ai.provider.active value
Path unknownProviderFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=unknown-provider\n" +
"max.retries.transient=2\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n"
);
PropertiesConfigurationPortAdapter adapter4 = new PropertiesConfigurationPortAdapter(emptyEnvLookup, unknownProviderFile);
assertThrows(ConfigurationLoadingException.class, () -> adapter4.loadConfiguration(),
"Unknown provider identifier should throw ConfigurationLoadingException");
}

@Test
void loadConfiguration_logAiSensitiveDefaultsFalseWhenAbsent() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n"
// log.ai.sensitive intentionally omitted
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertFalse(config.logAiSensitive(),
"log.ai.sensitive must default to false when the property is absent");
}

@Test
void loadConfiguration_logAiSensitiveParsedTrueWhenExplicitlySet() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"log.ai.sensitive=true\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertTrue(config.logAiSensitive(),
"log.ai.sensitive must be parsed as true when explicitly set to 'true'");
}

@Test
void loadConfiguration_logAiSensitiveParsedFalseWhenExplicitlySet() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"log.ai.sensitive=false\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertFalse(config.logAiSensitive(),
"log.ai.sensitive must be parsed as false when explicitly set to 'false'");
}

@Test
void loadConfiguration_logAiSensitiveHandlesCaseInsensitiveTrue() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"log.ai.sensitive=TRUE\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertTrue(config.logAiSensitive(),
"log.ai.sensitive must handle case-insensitive 'TRUE'");
}

@Test
void loadConfiguration_logAiSensitiveHandlesCaseInsensitiveFalse() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"log.ai.sensitive=FALSE\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

var config = adapter.loadConfiguration();

assertFalse(config.logAiSensitive(),
"log.ai.sensitive must handle case-insensitive 'FALSE'");
}

@Test
void loadConfiguration_throwsConfigurationLoadingExceptionForInvalidLogAiSensitive() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"log.ai.sensitive=maybe\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

ConfigurationLoadingException exception = assertThrows(
ConfigurationLoadingException.class,
() -> adapter.loadConfiguration()
);

assertTrue(exception.getMessage().contains("Invalid value for log.ai.sensitive"),
"Invalid log.ai.sensitive value should throw ConfigurationLoadingException");
assertTrue(exception.getMessage().contains("'maybe'"),
"Error message should include the invalid value");
}

@Test
void loadConfiguration_throwsConfigurationLoadingExceptionForInvalidLogAiSensitiveYes() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"log.ai.sensitive=yes\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

ConfigurationLoadingException exception = assertThrows(
ConfigurationLoadingException.class,
() -> adapter.loadConfiguration()
);

assertTrue(exception.getMessage().contains("Invalid value for log.ai.sensitive"),
"Invalid log.ai.sensitive value 'yes' should throw ConfigurationLoadingException");
}

@Test
void loadConfiguration_throwsConfigurationLoadingExceptionForInvalidLogAiSensitive1() throws Exception {
Path configFile = createInlineConfig(
"source.folder=/tmp/source\n" +
"target.folder=/tmp/target\n" +
"sqlite.file=/tmp/db.sqlite\n" +
"ai.provider.active=openai-compatible\n" +
"ai.provider.openai-compatible.baseUrl=https://api.example.com\n" +
"ai.provider.openai-compatible.model=gpt-4\n" +
"ai.provider.openai-compatible.timeoutSeconds=30\n" +
"ai.provider.openai-compatible.apiKey=test-key\n" +
"max.retries.transient=3\n" +
"max.pages=100\n" +
"max.text.characters=50000\n" +
"prompt.template.file=/tmp/prompt.txt\n" +
"log.ai.sensitive=1\n"
);

PropertiesConfigurationPortAdapter adapter = new PropertiesConfigurationPortAdapter(emptyEnvLookup, configFile);

ConfigurationLoadingException exception = assertThrows(
ConfigurationLoadingException.class,
() -> adapter.loadConfiguration()
);

assertTrue(exception.getMessage().contains("Invalid value for log.ai.sensitive"),
"Invalid log.ai.sensitive value '1' should throw ConfigurationLoadingException");
}

private Path createConfigFile(String resourceName) throws Exception {
Path sourceResource = Path.of("src/test/resources", resourceName);
Path targetConfigFile = tempDir.resolve("application.properties");

// Copy content from resource file
Files.copy(sourceResource, targetConfigFile);
return targetConfigFile;
}
@@ -294,4 +586,4 @@ class PropertiesConfigurationPortAdapterTest {
}
return configFile;
}
}
}

@@ -0,0 +1,156 @@
package de.gecheckt.pdf.umbenenner.adapter.out.fingerprint;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.port.out.FingerprintResult;
import de.gecheckt.pdf.umbenenner.application.port.out.FingerprintSuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.FingerprintTechnicalError;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

/**
 * Unit tests for {@link Sha256FingerprintAdapter}.
 */
class Sha256FingerprintAdapterTest {

private Sha256FingerprintAdapter adapter;

@TempDir
Path tempDir;

@BeforeEach
void setUp() {
adapter = new Sha256FingerprintAdapter();
}

@Test
void computeFingerprint_shouldReturnSuccess_whenFileExistsAndReadable() throws IOException {
// Given
String content = "Test PDF content for fingerprinting";
Path testFile = tempDir.resolve("test.pdf");
Files.write(testFile, content.getBytes());

SourceDocumentLocator locator = new SourceDocumentLocator(testFile.toString());
SourceDocumentCandidate candidate = new SourceDocumentCandidate("test.pdf", content.length(), locator);

// When
FingerprintResult result = adapter.computeFingerprint(candidate);

// Then
assertThat(result).isInstanceOf(FingerprintSuccess.class);
FingerprintSuccess success = (FingerprintSuccess) result;
assertThat(success.fingerprint().sha256Hex()).hasSize(64);
assertThat(success.fingerprint().sha256Hex()).matches("[0-9a-f]{64}");
}

@Test
void computeFingerprint_shouldReturnSameFingerprint_forSameContent() throws IOException {
// Given
String content = "Identical content for testing deterministic behavior";
Path testFile1 = tempDir.resolve("test1.pdf");
Path testFile2 = tempDir.resolve("test2.pdf");
Files.write(testFile1, content.getBytes());
Files.write(testFile2, content.getBytes());

SourceDocumentLocator locator1 = new SourceDocumentLocator(testFile1.toString());
SourceDocumentLocator locator2 = new SourceDocumentLocator(testFile2.toString());
SourceDocumentCandidate candidate1 = new SourceDocumentCandidate("test1.pdf", content.length(), locator1);
SourceDocumentCandidate candidate2 = new SourceDocumentCandidate("test2.pdf", content.length(), locator2);

// When
FingerprintResult result1 = adapter.computeFingerprint(candidate1);
FingerprintResult result2 = adapter.computeFingerprint(candidate2);

// Then
assertThat(result1).isInstanceOf(FingerprintSuccess.class);
assertThat(result2).isInstanceOf(FingerprintSuccess.class);

FingerprintSuccess success1 = (FingerprintSuccess) result1;
FingerprintSuccess success2 = (FingerprintSuccess) result2;

assertThat(success1.fingerprint().sha256Hex())
.isEqualTo(success2.fingerprint().sha256Hex());
}

@Test
void computeFingerprint_shouldReturnDifferentFingerprints_forDifferentContent() throws IOException {
// Given
String content1 = "First PDF content";
String content2 = "Second PDF content";
Path testFile1 = tempDir.resolve("test1.pdf");
Path testFile2 = tempDir.resolve("test2.pdf");
Files.write(testFile1, content1.getBytes());
Files.write(testFile2, content2.getBytes());

SourceDocumentLocator locator1 = new SourceDocumentLocator(testFile1.toString());
SourceDocumentLocator locator2 = new SourceDocumentLocator(testFile2.toString());
SourceDocumentCandidate candidate1 = new SourceDocumentCandidate("test1.pdf", content1.length(), locator1);
SourceDocumentCandidate candidate2 = new SourceDocumentCandidate("test2.pdf", content2.length(), locator2);

// When
FingerprintResult result1 = adapter.computeFingerprint(candidate1);
FingerprintResult result2 = adapter.computeFingerprint(candidate2);

// Then
assertThat(result1).isInstanceOf(FingerprintSuccess.class);
assertThat(result2).isInstanceOf(FingerprintSuccess.class);

FingerprintSuccess success1 = (FingerprintSuccess) result1;
FingerprintSuccess success2 = (FingerprintSuccess) result2;

assertThat(success1.fingerprint().sha256Hex())
.isNotEqualTo(success2.fingerprint().sha256Hex());
}

@Test
void computeFingerprint_shouldReturnTechnicalError_whenFileDoesNotExist() {
// Given
Path nonExistentFile = tempDir.resolve("nonexistent.pdf");
SourceDocumentLocator locator = new SourceDocumentLocator(nonExistentFile.toString());
SourceDocumentCandidate candidate = new SourceDocumentCandidate("nonexistent.pdf", 0, locator);

// When
FingerprintResult result = adapter.computeFingerprint(candidate);

// Then
assertThat(result).isInstanceOf(FingerprintTechnicalError.class);
FingerprintTechnicalError error = (FingerprintTechnicalError) result;
assertThat(error.errorMessage()).contains("nonexistent.pdf");
assertThat(error.errorMessage()).contains("Failed to read file");
assertThat(error.cause()).isNotNull();
}

@Test
void computeFingerprint_shouldReturnTechnicalError_whenLocatorValueIsInvalid() {
// Given
SourceDocumentLocator locator = new SourceDocumentLocator("\0invalid\0path");
SourceDocumentCandidate candidate = new SourceDocumentCandidate("invalid.pdf", 0, locator);

// When
FingerprintResult result = adapter.computeFingerprint(candidate);

// Then
assertThat(result).isInstanceOf(FingerprintTechnicalError.class);
FingerprintTechnicalError error = (FingerprintTechnicalError) result;
assertThat(error.errorMessage()).contains("invalid.pdf");
assertThat(error.errorMessage()).contains("Invalid file path");
assertThat(error.cause()).isNotNull();
}

@Test
void computeFingerprint_shouldThrowNullPointerException_whenCandidateIsNull() {
// When & Then
assertThatThrownBy(() -> adapter.computeFingerprint(null))
.isInstanceOf(NullPointerException.class)
.hasMessage("candidate must not be null");
}
}
@@ -1,15 +1,18 @@
package de.gecheckt.pdf.umbenenner.adapter.out.lock;

import de.gecheckt.pdf.umbenenner.adapter.out.lock.FilesystemRunLockPortAdapter;
import de.gecheckt.pdf.umbenenner.application.port.out.RunLockUnavailableException;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;
import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.nio.file.Files;
import java.nio.file.Path;

import static org.junit.jupiter.api.Assertions.*;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.port.out.RunLockUnavailableException;

/**
 * Unit tests for {@link FilesystemRunLockPortAdapter}.

@@ -1,23 +1,5 @@
package de.gecheckt.pdf.umbenenner.adapter.out.pdfextraction;

import de.gecheckt.pdf.umbenenner.adapter.out.pdfextraction.PdfTextExtractionPortAdapter;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionResult;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionSuccess;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.util.HashSet;
import java.util.Set;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertInstanceOf;
@@ -25,15 +7,31 @@ import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.util.HashSet;
import java.util.Set;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionResult;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionSuccess;
import de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

/**
 * Tests for {@link PdfTextExtractionPortAdapter}.
 * <p>
 * M3-AP-003: Minimal tests validating basic extraction functionality and technical error handling.
 * In AP-003 scope: all extraction problems are treated as TechnicalError, not ContentError.
 * No business-level validation of text content (that is AP-004).
 * Validates basic extraction functionality and technical error handling.
 * All extraction problems are treated as {@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError},
 * not content errors. Content usability (text quality assessment) is handled in the application layer.
 * PDFs are created programmatically using PDFBox to avoid external dependencies on test files.
 *
 * @since M3-AP-003
 */
class PdfTextExtractionPortAdapterTest {

@@ -170,8 +168,8 @@ class PdfTextExtractionPortAdapterTest {

        PdfExtractionResult result = adapter.extractTextAndPageCount(candidate);

        // AP-003: Empty text is SUCCESS, not an error
        // Business-level evaluation of text content happens in AP-004
        // Empty text is SUCCESS at extraction level, not an error
        // Business-level evaluation of text content happens in the application layer
        assertInstanceOf(PdfExtractionSuccess.class, result);
        PdfExtractionSuccess success = (PdfExtractionSuccess) result;
        assertEquals(1, success.pageCount().value());
@@ -200,6 +198,50 @@ class PdfTextExtractionPortAdapterTest {
        assertFalse(error.errorMessage().isBlank(), "TechnicalError message must not be blank");
    }

    @Test
    void testExtractingLargePdfReturnsSuccess() throws Exception {
        // Create a large PDF with many pages
        Path largePdfFile = tempDir.resolve("large.pdf");
        createMultiPagePdf(largePdfFile, 50);

        SourceDocumentCandidate candidate = new SourceDocumentCandidate(
                "large.pdf",
                Files.size(largePdfFile),
                new SourceDocumentLocator(largePdfFile.toAbsolutePath().toString())
        );

        PdfExtractionResult result = adapter.extractTextAndPageCount(candidate);

        assertInstanceOf(PdfExtractionSuccess.class, result);
        PdfExtractionSuccess success = (PdfExtractionSuccess) result;
        assertEquals(50, success.pageCount().value());
        assertNotNull(success.extractedText());
    }

    @Test
    void testPartiallyCorruptedPdfStillReturnsError() throws Exception {
        // Create a file that starts like PDF but has corrupted content
        Path partialCorruptFile = tempDir.resolve("partial-corrupt.pdf");
        byte[] pdfSignature = new byte[] {0x25, 0x50, 0x44, 0x46}; // %PDF in hex
        byte[] corruptData = "This is corrupted PDF content that will fail parsing".getBytes();
        byte[] combined = new byte[pdfSignature.length + corruptData.length];
        System.arraycopy(pdfSignature, 0, combined, 0, pdfSignature.length);
        System.arraycopy(corruptData, 0, combined, pdfSignature.length, corruptData.length);
        Files.write(partialCorruptFile, combined);

        SourceDocumentCandidate candidate = new SourceDocumentCandidate(
                "partial-corrupt.pdf",
                Files.size(partialCorruptFile),
                new SourceDocumentLocator(partialCorruptFile.toAbsolutePath().toString())
        );

        PdfExtractionResult result = adapter.extractTextAndPageCount(candidate);

        assertInstanceOf(PdfExtractionTechnicalError.class, result);
        PdfExtractionTechnicalError error = (PdfExtractionTechnicalError) result;
        assertNotNull(error.errorMessage());
    }

    // --- Helper methods to create test PDFs ---

    /**

@@ -0,0 +1,202 @@
package de.gecheckt.pdf.umbenenner.adapter.out.prompt;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.port.out.PromptLoadingFailure;
import de.gecheckt.pdf.umbenenner.application.port.out.PromptLoadingResult;
import de.gecheckt.pdf.umbenenner.application.port.out.PromptLoadingSuccess;

/**
 * Unit tests for {@link FilesystemPromptPortAdapter}.
 */
class FilesystemPromptPortAdapterTest {

    private FilesystemPromptPortAdapter adapter;

    @TempDir
    Path tempDir;

    @BeforeEach
    void setUp() {
        // Adapter will be created with a specific prompt file path in each test
    }

    @Test
    void loadPrompt_shouldReturnSuccess_whenPromptFileExists() throws IOException {
        // Given
        String promptContent = "You are a helpful AI assistant that renames documents.";
        Path promptFile = tempDir.resolve("prompt.txt");
        Files.writeString(promptFile, promptContent, StandardCharsets.UTF_8);
        adapter = new FilesystemPromptPortAdapter(promptFile);

        // When
        PromptLoadingResult result = adapter.loadPrompt();

        // Then
        assertThat(result).isInstanceOf(PromptLoadingSuccess.class);
        PromptLoadingSuccess success = (PromptLoadingSuccess) result;
        assertThat(success.promptContent()).isEqualTo(promptContent);
        assertThat(success.promptIdentifier().identifier()).isEqualTo("prompt.txt");
    }

    @Test
    void loadPrompt_shouldTrimWhitespace_whenPromptContainsLeadingTrailingWhitespace() throws IOException {
        // Given
        String promptContent = " \n Helpful AI assistant \n ";
        Path promptFile = tempDir.resolve("prompt_whitespace.txt");
        Files.writeString(promptFile, promptContent, StandardCharsets.UTF_8);
        adapter = new FilesystemPromptPortAdapter(promptFile);

        // When
        PromptLoadingResult result = adapter.loadPrompt();

        // Then
        assertThat(result).isInstanceOf(PromptLoadingSuccess.class);
        PromptLoadingSuccess success = (PromptLoadingSuccess) result;
        assertThat(success.promptContent()).isEqualTo("Helpful AI assistant");
    }

    @Test
    void loadPrompt_shouldDeriveIdentifierFromFilename() throws IOException {
        // Given
        String promptContent = "Test prompt";
        Path promptFile = tempDir.resolve("prompt_v2_de.txt");
        Files.writeString(promptFile, promptContent, StandardCharsets.UTF_8);
        adapter = new FilesystemPromptPortAdapter(promptFile);

        // When
        PromptLoadingResult result = adapter.loadPrompt();

        // Then
        assertThat(result).isInstanceOf(PromptLoadingSuccess.class);
        PromptLoadingSuccess success = (PromptLoadingSuccess) result;
        assertThat(success.promptIdentifier().identifier()).isEqualTo("prompt_v2_de.txt");
    }

    @Test
    void loadPrompt_shouldReturnFailure_whenFileDoesNotExist() {
        // Given
        Path nonExistentFile = tempDir.resolve("nonexistent_prompt.txt");
        adapter = new FilesystemPromptPortAdapter(nonExistentFile);

        // When
        PromptLoadingResult result = adapter.loadPrompt();

        // Then
        assertThat(result).isInstanceOf(PromptLoadingFailure.class);
        PromptLoadingFailure failure = (PromptLoadingFailure) result;
        assertThat(failure.failureReason()).isEqualTo("FILE_NOT_FOUND");
        assertThat(failure.failureMessage()).contains("nonexistent_prompt.txt");
    }

    @Test
    void loadPrompt_shouldReturnFailure_whenPromptIsEmpty() throws IOException {
        // Given
        Path promptFile = tempDir.resolve("empty_prompt.txt");
        Files.writeString(promptFile, "", StandardCharsets.UTF_8);
        adapter = new FilesystemPromptPortAdapter(promptFile);

        // When
        PromptLoadingResult result = adapter.loadPrompt();

        // Then
        assertThat(result).isInstanceOf(PromptLoadingFailure.class);
        PromptLoadingFailure failure = (PromptLoadingFailure) result;
        assertThat(failure.failureReason()).isEqualTo("EMPTY_CONTENT");
        assertThat(failure.failureMessage()).contains("empty or contains only whitespace");
    }

    @Test
    void loadPrompt_shouldReturnFailure_whenPromptContainsOnlyWhitespace() throws IOException {
        // Given
        Path promptFile = tempDir.resolve("whitespace_only_prompt.txt");
        Files.writeString(promptFile, " \n\n \t \n", StandardCharsets.UTF_8);
        adapter = new FilesystemPromptPortAdapter(promptFile);

        // When
        PromptLoadingResult result = adapter.loadPrompt();

        // Then
        assertThat(result).isInstanceOf(PromptLoadingFailure.class);
        PromptLoadingFailure failure = (PromptLoadingFailure) result;
        assertThat(failure.failureReason()).isEqualTo("EMPTY_CONTENT");
        assertThat(failure.failureMessage()).contains("empty or contains only whitespace");
    }

    @Test
    void loadPrompt_shouldReturnFailure_withIOError_whenFileCannotBeRead() throws IOException {
        // Given
        Path promptFile = tempDir.resolve("readable_prompt.txt");
        Files.writeString(promptFile, "Test prompt content", StandardCharsets.UTF_8);
        adapter = new FilesystemPromptPortAdapter(promptFile);

        // Delete the file before calling loadPrompt to simulate a file disappearing
        Files.delete(promptFile);

        // When
        PromptLoadingResult result = adapter.loadPrompt();

        // Then
        assertThat(result).isInstanceOf(PromptLoadingFailure.class);
        PromptLoadingFailure failure = (PromptLoadingFailure) result;
        assertThat(failure.failureReason()).isEqualTo("FILE_NOT_FOUND");
    }

    @Test
    void loadPrompt_shouldThrowNullPointerException_whenPromptFilePathIsNull() {
        // When & Then
        assertThatThrownBy(() -> new FilesystemPromptPortAdapter(null))
                .isInstanceOf(NullPointerException.class)
                .hasMessage("promptFilePath must not be null");
    }

    @Test
    void loadPrompt_shouldPreserveMultilineContent() throws IOException {
        // Given
        String promptContent = "Line 1\nLine 2\nLine 3";
        Path promptFile = tempDir.resolve("multiline_prompt.txt");
        Files.writeString(promptFile, promptContent, StandardCharsets.UTF_8);
        adapter = new FilesystemPromptPortAdapter(promptFile);

        // When
        PromptLoadingResult result = adapter.loadPrompt();

        // Then
        assertThat(result).isInstanceOf(PromptLoadingSuccess.class);
        PromptLoadingSuccess success = (PromptLoadingSuccess) result;
        assertThat(success.promptContent()).isEqualTo(promptContent);
    }

    @Test
    void loadPrompt_shouldBeDeterministic_whenCalledMultipleTimes() throws IOException {
        // Given
        String promptContent = "Deterministic prompt content";
        Path promptFile = tempDir.resolve("stable_prompt.txt");
        Files.writeString(promptFile, promptContent, StandardCharsets.UTF_8);
        adapter = new FilesystemPromptPortAdapter(promptFile);

        // When
        PromptLoadingResult result1 = adapter.loadPrompt();
        PromptLoadingResult result2 = adapter.loadPrompt();

        // Then
        assertThat(result1).isInstanceOf(PromptLoadingSuccess.class);
        assertThat(result2).isInstanceOf(PromptLoadingSuccess.class);

        PromptLoadingSuccess success1 = (PromptLoadingSuccess) result1;
        PromptLoadingSuccess success2 = (PromptLoadingSuccess) result2;

        assertThat(success1.promptContent()).isEqualTo(success2.promptContent());
        assertThat(success1.promptIdentifier()).isEqualTo(success2.promptIdentifier());
    }
}
@@ -1,23 +1,25 @@
package de.gecheckt.pdf.umbenenner.adapter.out.sourcedocument;

import de.gecheckt.pdf.umbenenner.adapter.out.sourcedocument.SourceDocumentCandidatesPortAdapter;
import de.gecheckt.pdf.umbenenner.application.port.out.SourceDocumentAccessException;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

import static org.junit.jupiter.api.Assertions.*;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.port.out.SourceDocumentAccessException;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;

/**
 * Tests for {@link SourceDocumentCandidatesPortAdapter}.
 *
 * @since M3-AP-002
 */
class SourceDocumentCandidatesPortAdapterTest {

@@ -194,7 +196,7 @@ class SourceDocumentCandidatesPortAdapterTest {

    @Test
    void testLoadCandidates_EmptyPdfFilesAreIncluded() throws IOException {
        // Create empty PDF files (M3-AP-002 requirement: PDF files in the source folder)
        // Create empty PDF files
        Files.createFile(tempDir.resolve("empty1.pdf"));
        Files.createFile(tempDir.resolve("empty2.pdf"));
        // Also add a non-empty PDF for contrast
@@ -203,8 +205,26 @@ class SourceDocumentCandidatesPortAdapterTest {
        List<SourceDocumentCandidate> candidates = adapter.loadCandidates();

        assertEquals(3, candidates.size(),
                "Empty PDF files should be included as candidates; content evaluation happens in AP-004");
                "Empty PDF files should be included as candidates; content evaluation happens during document processing");
        assertTrue(candidates.stream().allMatch(c -> c.uniqueIdentifier().endsWith(".pdf")),
                "All candidates should be PDF files");
    }

    /**
     * A directory whose name ends with {@code .pdf} must not be included as a candidate.
     * <p>
     * The regular-file filter must exclude directories even when their name matches the
     * PDF extension, so that only actual PDF files are returned.
     */
    @Test
    void testLoadCandidates_DirectoryWithPdfExtensionIsExcluded() throws IOException {
        Files.write(tempDir.resolve("real.pdf"), "content".getBytes());
        Files.createDirectory(tempDir.resolve("looks-like.pdf"));

        List<SourceDocumentCandidate> candidates = adapter.loadCandidates();

        assertEquals(1, candidates.size(),
                "A directory with .pdf extension must not be included as a candidate");
        assertEquals("real.pdf", candidates.get(0).uniqueIdentifier());
    }
}

@@ -0,0 +1,394 @@
package de.gecheckt.pdf.umbenenner.adapter.out.sqlite;

import static org.assertj.core.api.Assertions.assertThat;

import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.List;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingAttempt;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
import de.gecheckt.pdf.umbenenner.domain.model.RunId;

/**
 * Tests for the additive {@code ai_provider} column in {@code processing_attempt}.
 * <p>
 * Covers schema migration (idempotency, nullable default for existing rows),
 * write/read round-trips for both supported provider identifiers, and
 * backward compatibility with databases created before provider tracking was introduced.
 */
class SqliteAttemptProviderPersistenceTest {

    private String jdbcUrl;
    private SqliteSchemaInitializationAdapter schemaAdapter;
    private SqliteProcessingAttemptRepositoryAdapter repository;

    @TempDir
    Path tempDir;

    @BeforeEach
    void setUp() {
        Path dbFile = tempDir.resolve("provider-test.db");
        jdbcUrl = "jdbc:sqlite:" + dbFile.toAbsolutePath();
        schemaAdapter = new SqliteSchemaInitializationAdapter(jdbcUrl);
        repository = new SqliteProcessingAttemptRepositoryAdapter(jdbcUrl);
    }

    // -------------------------------------------------------------------------
    // Schema migration tests
    // -------------------------------------------------------------------------

    /**
     * A fresh database must contain the {@code ai_provider} column after schema initialisation.
     */
    @Test
    void addsProviderColumnOnFreshDb() throws SQLException {
        schemaAdapter.initializeSchema();

        assertThat(columnExists("processing_attempt", "ai_provider"))
                .as("ai_provider column must exist in processing_attempt after fresh schema init")
                .isTrue();
    }

    /**
     * A database that already has the {@code processing_attempt} table without
     * {@code ai_provider} (simulating an existing installation before this column was added)
     * must receive the column via the idempotent schema evolution.
     */
    @Test
    void addsProviderColumnOnExistingDbWithoutColumn() throws SQLException {
        // Bootstrap schema without the ai_provider column (simulate legacy DB)
        createLegacySchema();

        assertThat(columnExists("processing_attempt", "ai_provider"))
                .as("ai_provider must not be present before evolution")
                .isFalse();

        // Running initializeSchema must add the column
        schemaAdapter.initializeSchema();

        assertThat(columnExists("processing_attempt", "ai_provider"))
                .as("ai_provider column must be added by schema evolution")
                .isTrue();
    }

    /**
     * Running schema initialisation multiple times must not fail and must not change the schema.
     */
    @Test
    void migrationIsIdempotent() throws SQLException {
        schemaAdapter.initializeSchema();
        // Second and third init must not throw or change the schema
        schemaAdapter.initializeSchema();
        schemaAdapter.initializeSchema();

        assertThat(columnExists("processing_attempt", "ai_provider"))
                .as("Column must still be present after repeated init calls")
                .isTrue();
    }

    /**
     * Rows that existed before the {@code ai_provider} column was added must have
     * {@code NULL} as the column value, not a non-null default.
     */
    @Test
    void existingRowsKeepNullProvider() throws SQLException {
        // Create legacy schema and insert a row without ai_provider
        createLegacySchema();
        DocumentFingerprint fp = fingerprint("aa");
        insertLegacyDocumentRecord(fp);
        insertLegacyAttemptRow(fp, "READY_FOR_AI");

        // Now evolve the schema
        schemaAdapter.initializeSchema();

        // Read the existing row; ai_provider must be NULL
        List<ProcessingAttempt> attempts = repository.findAllByFingerprint(fp);
        assertThat(attempts).hasSize(1);
        assertThat(attempts.get(0).aiProvider())
                .as("Existing rows must have NULL ai_provider after schema evolution")
                .isNull();
    }

    // -------------------------------------------------------------------------
    // Write tests
    // -------------------------------------------------------------------------

    /**
     * A new attempt written with an active OpenAI-compatible provider must
     * persist {@code "openai-compatible"} in {@code ai_provider}.
     */
    @Test
    void newAttemptsWriteOpenAiCompatibleProvider() {
        schemaAdapter.initializeSchema();
        DocumentFingerprint fp = fingerprint("bb");
        insertDocumentRecord(fp);

        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        ProcessingAttempt attempt = new ProcessingAttempt(
                fp, new RunId("run-oai"), 1, now, now.plusSeconds(1),
                ProcessingStatus.READY_FOR_AI,
                null, null, false,
                "openai-compatible",
                null, null, null, null, null, null,
                null, null, null, null);

        repository.save(attempt);

        List<ProcessingAttempt> saved = repository.findAllByFingerprint(fp);
        assertThat(saved).hasSize(1);
        assertThat(saved.get(0).aiProvider()).isEqualTo("openai-compatible");
    }

    /**
     * A new attempt written with an active Claude provider must persist
     * {@code "claude"} in {@code ai_provider}.
     * <p>
     * The provider selection is simulated at the data level here; the actual
     * Claude adapter is wired in a later step.
     */
    @Test
    void newAttemptsWriteClaudeProvider() {
        schemaAdapter.initializeSchema();
        DocumentFingerprint fp = fingerprint("cc");
        insertDocumentRecord(fp);

        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        ProcessingAttempt attempt = new ProcessingAttempt(
                fp, new RunId("run-claude"), 1, now, now.plusSeconds(1),
                ProcessingStatus.READY_FOR_AI,
                null, null, false,
                "claude",
                null, null, null, null, null, null,
                null, null, null, null);

        repository.save(attempt);

        List<ProcessingAttempt> saved = repository.findAllByFingerprint(fp);
        assertThat(saved).hasSize(1);
        assertThat(saved.get(0).aiProvider()).isEqualTo("claude");
    }

    // -------------------------------------------------------------------------
    // Read tests
    // -------------------------------------------------------------------------

    /**
     * The repository must correctly return the persisted provider identifier
     * when reading an attempt back from the database.
     */
    @Test
    void repositoryReadsProviderColumn() {
        schemaAdapter.initializeSchema();
        DocumentFingerprint fp = fingerprint("dd");
        insertDocumentRecord(fp);

        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        repository.save(new ProcessingAttempt(
                fp, new RunId("run-read"), 1, now, now.plusSeconds(2),
                ProcessingStatus.FAILED_RETRYABLE,
                "Timeout", "Connection timed out", true,
                "openai-compatible",
                null, null, null, null, null, null,
                null, null, null, null));

        List<ProcessingAttempt> loaded = repository.findAllByFingerprint(fp);
        assertThat(loaded).hasSize(1);
        assertThat(loaded.get(0).aiProvider())
                .as("Repository must return the persisted ai_provider value")
                .isEqualTo("openai-compatible");
    }

    /**
     * Reading a database that was created without the {@code ai_provider} column
     * (a pre-extension database) must succeed; the new field must be empty/null
     * for historical attempts.
     */
    @Test
    void legacyDataReadingDoesNotFail() throws SQLException {
        // Set up legacy schema with a row that has no ai_provider column
        createLegacySchema();
        DocumentFingerprint fp = fingerprint("ee");
        insertLegacyDocumentRecord(fp);
        insertLegacyAttemptRow(fp, "FAILED_RETRYABLE");

        // Evolve schema: now the ai_provider column exists but legacy rows have NULL
        schemaAdapter.initializeSchema();

        // Reading must not throw and must return null for ai_provider
        List<ProcessingAttempt> attempts = repository.findAllByFingerprint(fp);
        assertThat(attempts).hasSize(1);
        assertThat(attempts.get(0).aiProvider())
                .as("Legacy attempt (from before provider tracking) must have null aiProvider")
                .isNull();
        // Other fields must still be readable
        assertThat(attempts.get(0).status()).isEqualTo(ProcessingStatus.FAILED_RETRYABLE);
    }

    /**
     * All existing attempt history tests must remain green: the repository
     * handles null {@code ai_provider} values transparently without errors.
     */
    @Test
    void existingHistoryTestsRemainGreen() {
        schemaAdapter.initializeSchema();
        DocumentFingerprint fp = fingerprint("ff");
        insertDocumentRecord(fp);

        Instant base = Instant.now().truncatedTo(ChronoUnit.MICROS);

        // Save attempt with null provider (as in legacy path or non-AI attempt)
        ProcessingAttempt nullProviderAttempt = ProcessingAttempt.withoutAiFields(
                fp, new RunId("run-legacy"), 1,
                base, base.plusSeconds(1),
                ProcessingStatus.FAILED_RETRYABLE,
                "Err", "msg", true);
        repository.save(nullProviderAttempt);

        // Save attempt with explicit provider
        ProcessingAttempt withProvider = new ProcessingAttempt(
                fp, new RunId("run-new"), 2,
                base.plusSeconds(10), base.plusSeconds(11),
                ProcessingStatus.READY_FOR_AI,
                null, null, false,
                "openai-compatible",
                null, null, null, null, null, null,
                null, null, null, null);
        repository.save(withProvider);

        List<ProcessingAttempt> all = repository.findAllByFingerprint(fp);
        assertThat(all).hasSize(2);
        assertThat(all.get(0).aiProvider()).isNull();
        assertThat(all.get(1).aiProvider()).isEqualTo("openai-compatible");
        // Ordering preserved
        assertThat(all.get(0).attemptNumber()).isEqualTo(1);
        assertThat(all.get(1).attemptNumber()).isEqualTo(2);
    }

    // -------------------------------------------------------------------------
    // Helpers
    // -------------------------------------------------------------------------

    private boolean columnExists(String table, String column) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            DatabaseMetaData meta = conn.getMetaData();
            try (ResultSet rs = meta.getColumns(null, null, table, column)) {
                return rs.next();
            }
        }
    }

    /**
     * Creates the base tables that existed before the {@code ai_provider} column was added,
     * without running the schema evolution that adds that column.
     */
    private void createLegacySchema() throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement()) {
            stmt.execute("PRAGMA foreign_keys = ON");
            stmt.execute("""
                    CREATE TABLE IF NOT EXISTS document_record (
                        id INTEGER PRIMARY KEY AUTOINCREMENT,
                        fingerprint TEXT NOT NULL,
                        last_known_source_locator TEXT NOT NULL,
                        last_known_source_file_name TEXT NOT NULL,
                        overall_status TEXT NOT NULL,
                        content_error_count INTEGER NOT NULL DEFAULT 0,
                        transient_error_count INTEGER NOT NULL DEFAULT 0,
                        last_failure_instant TEXT,
                        last_success_instant TEXT,
                        created_at TEXT NOT NULL,
                        updated_at TEXT NOT NULL,
                        CONSTRAINT uq_document_record_fingerprint UNIQUE (fingerprint)
                    )""");
            stmt.execute("""
                    CREATE TABLE IF NOT EXISTS processing_attempt (
                        id INTEGER PRIMARY KEY AUTOINCREMENT,
                        fingerprint TEXT NOT NULL,
                        run_id TEXT NOT NULL,
                        attempt_number INTEGER NOT NULL,
                        started_at TEXT NOT NULL,
                        ended_at TEXT NOT NULL,
                        status TEXT NOT NULL,
                        failure_class TEXT,
                        failure_message TEXT,
                        retryable INTEGER NOT NULL DEFAULT 0,
                        model_name TEXT,
                        prompt_identifier TEXT,
                        processed_page_count INTEGER,
                        sent_character_count INTEGER,
                        ai_raw_response TEXT,
                        ai_reasoning TEXT,
                        resolved_date TEXT,
                        date_source TEXT,
                        validated_title TEXT,
                        final_target_file_name TEXT,
                        CONSTRAINT fk_processing_attempt_fingerprint
                            FOREIGN KEY (fingerprint) REFERENCES document_record (fingerprint),
                        CONSTRAINT uq_processing_attempt_fingerprint_number
                            UNIQUE (fingerprint, attempt_number)
                    )""");
        }
    }

    private void insertLegacyDocumentRecord(DocumentFingerprint fp) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = conn.prepareStatement("""
                     INSERT INTO document_record
                     (fingerprint, last_known_source_locator, last_known_source_file_name,
                      overall_status, created_at, updated_at)
                     VALUES (?, '/tmp/test.pdf', 'test.pdf', 'READY_FOR_AI',
                     strftime('%Y-%m-%dT%H:%M:%SZ', 'now'),
                     strftime('%Y-%m-%dT%H:%M:%SZ', 'now'))""")) {
            ps.setString(1, fp.sha256Hex());
            ps.executeUpdate();
        }
    }

    private void insertLegacyAttemptRow(DocumentFingerprint fp, String status) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = conn.prepareStatement("""
                     INSERT INTO processing_attempt
                     (fingerprint, run_id, attempt_number, started_at, ended_at, status, retryable)
                     VALUES (?, 'run-legacy', 1, strftime('%Y-%m-%dT%H:%M:%SZ', 'now'),
                     strftime('%Y-%m-%dT%H:%M:%SZ', 'now'), ?, 1)""")) {
            ps.setString(1, fp.sha256Hex());
            ps.setString(2, status);
            ps.executeUpdate();
        }
    }

    private void insertDocumentRecord(DocumentFingerprint fp) {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = conn.prepareStatement("""
                     INSERT INTO document_record
                     (fingerprint, last_known_source_locator, last_known_source_file_name,
                      overall_status, created_at, updated_at)
                     VALUES (?, '/tmp/test.pdf', 'test.pdf', 'READY_FOR_AI',
                     strftime('%Y-%m-%dT%H:%M:%SZ', 'now'),
                     strftime('%Y-%m-%dT%H:%M:%SZ', 'now'))""")) {
            ps.setString(1, fp.sha256Hex());
            ps.executeUpdate();
        } catch (SQLException e) {
            throw new RuntimeException("Failed to insert test document record", e);
        }
    }

    private static DocumentFingerprint fingerprint(String suffix) {
        return new DocumentFingerprint(
                ("0".repeat(64 - suffix.length()) + suffix));
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,664 @@
package de.gecheckt.pdf.umbenenner.adapter.out.sqlite;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

import java.nio.file.Path;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.port.out.DocumentKnownProcessable;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentPersistenceException;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentRecord;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentRecordLookupResult;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentTerminalFinalFailure;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentTerminalSuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentUnknown;
import de.gecheckt.pdf.umbenenner.application.port.out.FailureCounters;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

/**
 * Tests for {@link SqliteDocumentRecordRepositoryAdapter}.
 */
class SqliteDocumentRecordRepositoryAdapterTest {

    private SqliteDocumentRecordRepositoryAdapter repository;
    private String jdbcUrl;

    @TempDir
    Path tempDir;

    @BeforeEach
    void setUp() {
        Path dbFile = tempDir.resolve("test.db");
        jdbcUrl = "jdbc:sqlite:" + dbFile.toAbsolutePath();

        // Initialize schema first
        SqliteSchemaInitializationAdapter schemaInitializer =
                new SqliteSchemaInitializationAdapter(jdbcUrl);
        schemaInitializer.initializeSchema();

        repository = new SqliteDocumentRecordRepositoryAdapter(jdbcUrl);
    }

    @Test
    void findByFingerprint_shouldReturnDocumentUnknown_whenRecordDoesNotExist() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "0000000000000000000000000000000000000000000000000000000000000000");

        // When
        DocumentRecordLookupResult result = repository.findByFingerprint(fingerprint);

        // Then
        assertThat(result).isInstanceOf(DocumentUnknown.class);
    }

    @Test
    void create_and_findByFingerprint_shouldWorkForNewRecord() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "1111111111111111111111111111111111111111111111111111111111111111");
        DocumentRecord record = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/path/to/document.pdf"),
                "document.pdf",
                ProcessingStatus.PROCESSING,
                FailureCounters.zero(),
                null,
                null,
                Instant.now().truncatedTo(ChronoUnit.MICROS),
                Instant.now().truncatedTo(ChronoUnit.MICROS),
                null,
                null
        );

        // When
        repository.create(record);
        DocumentRecordLookupResult result = repository.findByFingerprint(fingerprint);

        // Then
        assertThat(result).isInstanceOf(DocumentKnownProcessable.class);
        DocumentKnownProcessable known = (DocumentKnownProcessable) result;
        DocumentRecord foundRecord = known.record();

        assertThat(foundRecord.fingerprint()).isEqualTo(fingerprint);
        assertThat(foundRecord.lastKnownSourceLocator().value()).isEqualTo("/path/to/document.pdf");
        assertThat(foundRecord.lastKnownSourceFileName()).isEqualTo("document.pdf");
        assertThat(foundRecord.overallStatus()).isEqualTo(ProcessingStatus.PROCESSING);
        assertThat(foundRecord.failureCounters()).isEqualTo(FailureCounters.zero());
        assertThat(foundRecord.lastFailureInstant()).isNull();
        assertThat(foundRecord.lastSuccessInstant()).isNull();
        assertThat(foundRecord.createdAt()).isEqualTo(record.createdAt());
        assertThat(foundRecord.updatedAt()).isEqualTo(record.updatedAt());
    }

    @Test
    void update_shouldModifyExistingRecord() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "2222222222222222222222222222222222222222222222222222222222222222");
        DocumentRecord initialRecord = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/initial/path.pdf"),
                "initial.pdf",
                ProcessingStatus.PROCESSING,
                FailureCounters.zero(),
                null,
                null,
                Instant.now().minusSeconds(60).truncatedTo(ChronoUnit.MICROS),
                Instant.now().minusSeconds(60).truncatedTo(ChronoUnit.MICROS),
                null,
                null
        );

        repository.create(initialRecord);

        // Updated record
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        DocumentRecord updatedRecord = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/updated/path.pdf"),
                "updated.pdf",
                ProcessingStatus.SUCCESS,
                new FailureCounters(0, 0),
                null,
                now,
                initialRecord.createdAt(),
                now,
                null,
                null
        );

        // When
        repository.update(updatedRecord);
        DocumentRecordLookupResult result = repository.findByFingerprint(fingerprint);

        // Then
        assertThat(result).isInstanceOf(DocumentTerminalSuccess.class);
        DocumentTerminalSuccess success = (DocumentTerminalSuccess) result;
        DocumentRecord foundRecord = success.record();

        assertThat(foundRecord.lastKnownSourceLocator().value()).isEqualTo("/updated/path.pdf");
        assertThat(foundRecord.lastKnownSourceFileName()).isEqualTo("updated.pdf");
        assertThat(foundRecord.overallStatus()).isEqualTo(ProcessingStatus.SUCCESS);
        assertThat(foundRecord.lastSuccessInstant()).isEqualTo(now);
        assertThat(foundRecord.updatedAt()).isEqualTo(now);
    }

    @Test
    void create_shouldThrowException_whenFingerprintAlreadyExists() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "3333333333333333333333333333333333333333333333333333333333333333");
        DocumentRecord record1 = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/path1.pdf"),
                "doc1.pdf",
                ProcessingStatus.PROCESSING,
                FailureCounters.zero(),
                null,
                null,
                Instant.now().truncatedTo(ChronoUnit.MICROS),
                Instant.now().truncatedTo(ChronoUnit.MICROS),
                null,
                null
        );

        repository.create(record1);

        DocumentRecord record2 = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/path2.pdf"),
                "doc2.pdf",
                ProcessingStatus.PROCESSING,
                FailureCounters.zero(),
                null,
                null,
                Instant.now().truncatedTo(ChronoUnit.MICROS),
                Instant.now().truncatedTo(ChronoUnit.MICROS),
                null,
                null
        );

        // When / Then
        assertThatThrownBy(() -> repository.create(record2))
                .isInstanceOf(DocumentPersistenceException.class);
    }

    @Test
    void update_shouldThrowException_whenFingerprintDoesNotExist() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "4444444444444444444444444444444444444444444444444444444444444444");
        DocumentRecord record = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/nonexistent.pdf"),
                "nonexistent.pdf",
                ProcessingStatus.PROCESSING,
                FailureCounters.zero(),
                null,
                null,
                Instant.now().truncatedTo(ChronoUnit.MICROS),
                Instant.now().truncatedTo(ChronoUnit.MICROS),
                null,
                null
        );

        // When / Then
        assertThatThrownBy(() -> repository.update(record))
                .isInstanceOf(DocumentPersistenceException.class)
                .hasMessageContaining("Expected to update 1 row but affected 0 rows");
    }

    @Test
    void findByFingerprint_shouldReturnDocumentTerminalFinalFailure_whenStatusIsFailedFinal() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "5555555555555555555555555555555555555555555555555555555555555555");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        // Create initially as PROCESSING
        DocumentRecord initialRecord = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/path/to/document.pdf"),
                "document.pdf",
                ProcessingStatus.PROCESSING,
                FailureCounters.zero(),
                null,
                null,
                now.minusSeconds(120),
                now.minusSeconds(120),
                null,
                null
        );
        repository.create(initialRecord);

        // Update to FAILED_FINAL (second content error: count=2)
        Instant failureInstant = now.minusSeconds(60);
        DocumentRecord failedFinalRecord = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/path/to/document.pdf"),
                "document.pdf",
                ProcessingStatus.FAILED_FINAL,
                new FailureCounters(2, 0),
                failureInstant,
                null,
                now.minusSeconds(120),
                failureInstant,
                null,
                null
        );
        repository.update(failedFinalRecord);

        // When
        DocumentRecordLookupResult result = repository.findByFingerprint(fingerprint);

        // Then
        assertThat(result).isInstanceOf(DocumentTerminalFinalFailure.class);
        DocumentTerminalFinalFailure terminalFailure = (DocumentTerminalFinalFailure) result;
        DocumentRecord foundRecord = terminalFailure.record();
        assertThat(foundRecord.overallStatus()).isEqualTo(ProcessingStatus.FAILED_FINAL);
        assertThat(foundRecord.failureCounters().contentErrorCount()).isEqualTo(2);
        assertThat(foundRecord.failureCounters().transientErrorCount()).isEqualTo(0);
        assertThat(foundRecord.lastFailureInstant()).isEqualTo(failureInstant);
        assertThat(foundRecord.lastSuccessInstant()).isNull();
    }

    @Test
    void update_shouldPersistNonZeroFailureCountersAndFailureInstant() {
        // Given: create a new document as PROCESSING
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "6666666666666666666666666666666666666666666666666666666666666666");
        Instant createdAt = Instant.now().minusSeconds(120).truncatedTo(ChronoUnit.MICROS);
        DocumentRecord initialRecord = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/invoice.pdf"),
                "invoice.pdf",
                ProcessingStatus.PROCESSING,
                FailureCounters.zero(),
                null,
                null,
                createdAt,
                createdAt,
                null,
                null
        );
        repository.create(initialRecord);

        // Update to FAILED_RETRYABLE with content_error_count=1, transient_error_count=0
        Instant failureInstant = Instant.now().truncatedTo(ChronoUnit.MICROS);
        DocumentRecord failedRetryableRecord = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/invoice.pdf"),
                "invoice.pdf",
                ProcessingStatus.FAILED_RETRYABLE,
                new FailureCounters(1, 0),
                failureInstant,
                null,
                createdAt,
                failureInstant,
                null,
                null
        );

        // When
        repository.update(failedRetryableRecord);
        DocumentRecordLookupResult result = repository.findByFingerprint(fingerprint);

        // Then: lookup returns DocumentKnownProcessable (FAILED_RETRYABLE is not terminal)
        assertThat(result).isInstanceOf(DocumentKnownProcessable.class);
        DocumentKnownProcessable known = (DocumentKnownProcessable) result;
        DocumentRecord foundRecord = known.record();
        assertThat(foundRecord.overallStatus()).isEqualTo(ProcessingStatus.FAILED_RETRYABLE);
        assertThat(foundRecord.failureCounters().contentErrorCount()).isEqualTo(1);
        assertThat(foundRecord.failureCounters().transientErrorCount()).isEqualTo(0);
        assertThat(foundRecord.lastFailureInstant()).isEqualTo(failureInstant);
        assertThat(foundRecord.lastSuccessInstant()).isNull();
        assertThat(foundRecord.createdAt()).isEqualTo(createdAt);
        assertThat(foundRecord.updatedAt()).isEqualTo(failureInstant);
    }

    @Test
    void update_shouldTransitionFromFailedRetryableToFailedFinalWithIncrementedCounter() {
        // Given: create as FAILED_RETRYABLE with content_error_count=1 (first content error already recorded)
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "7777777777777777777777777777777777777777777777777777777777777777");
        Instant createdAt = Instant.now().minusSeconds(300).truncatedTo(ChronoUnit.MICROS);
        Instant firstFailureAt = Instant.now().minusSeconds(120).truncatedTo(ChronoUnit.MICROS);

        DocumentRecord initialRecord = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/report.pdf"),
                "report.pdf",
                ProcessingStatus.FAILED_RETRYABLE,
                new FailureCounters(1, 0),
                firstFailureAt,
                null,
                createdAt,
                firstFailureAt,
                null,
                null
        );
        repository.create(initialRecord);

        // Second content error: update to FAILED_FINAL with content_error_count=2
        Instant secondFailureAt = Instant.now().truncatedTo(ChronoUnit.MICROS);
        DocumentRecord failedFinalRecord = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/report.pdf"),
                "report.pdf",
                ProcessingStatus.FAILED_FINAL,
                new FailureCounters(2, 0),
                secondFailureAt,
                null,
                createdAt,
                secondFailureAt,
                null,
                null
        );

        // When
        repository.update(failedFinalRecord);
        DocumentRecordLookupResult result = repository.findByFingerprint(fingerprint);

        // Then: terminal final failure
        assertThat(result).isInstanceOf(DocumentTerminalFinalFailure.class);
        DocumentTerminalFinalFailure terminalFailure = (DocumentTerminalFinalFailure) result;
        DocumentRecord foundRecord = terminalFailure.record();
        assertThat(foundRecord.overallStatus()).isEqualTo(ProcessingStatus.FAILED_FINAL);
        assertThat(foundRecord.failureCounters().contentErrorCount()).isEqualTo(2);
        assertThat(foundRecord.failureCounters().transientErrorCount()).isEqualTo(0);
        assertThat(foundRecord.lastFailureInstant()).isEqualTo(secondFailureAt);
        assertThat(foundRecord.lastSuccessInstant()).isNull();
    }

    @Test
    void update_shouldPersistTransientErrorCounter() {
        // Given: create as PROCESSING
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "8888888888888888888888888888888888888888888888888888888888888888");
        Instant createdAt = Instant.now().minusSeconds(60).truncatedTo(ChronoUnit.MICROS);
        DocumentRecord initialRecord = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/scan.pdf"),
                "scan.pdf",
                ProcessingStatus.PROCESSING,
                FailureCounters.zero(),
                null,
                null,
                createdAt,
                createdAt,
                null,
                null
        );
        repository.create(initialRecord);

        // Update to FAILED_RETRYABLE with transient_error_count=3 (technical errors)
        Instant failureInstant = Instant.now().truncatedTo(ChronoUnit.MICROS);
        DocumentRecord transientFailureRecord = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/scan.pdf"),
                "scan.pdf",
                ProcessingStatus.FAILED_RETRYABLE,
                new FailureCounters(0, 3),
                failureInstant,
                null,
                createdAt,
                failureInstant,
                null,
                null
        );

        // When
        repository.update(transientFailureRecord);
        DocumentRecordLookupResult result = repository.findByFingerprint(fingerprint);

        // Then
        assertThat(result).isInstanceOf(DocumentKnownProcessable.class);
        DocumentKnownProcessable known = (DocumentKnownProcessable) result;
        DocumentRecord foundRecord = known.record();
        assertThat(foundRecord.failureCounters().contentErrorCount()).isEqualTo(0);
        assertThat(foundRecord.failureCounters().transientErrorCount()).isEqualTo(3);
        assertThat(foundRecord.lastFailureInstant()).isEqualTo(failureInstant);
    }

    @Test
    void findByFingerprint_shouldThrowNullPointerException_whenFingerprintIsNull() {
        // Given
        DocumentFingerprint nullFingerprint = null;

        // When / Then
        assertThatThrownBy(() -> repository.findByFingerprint(nullFingerprint))
                .isInstanceOf(NullPointerException.class);
    }

    @Test
    void create_shouldThrowNullPointerException_whenRecordIsNull() {
        // When / Then
        assertThatThrownBy(() -> repository.create(null))
                .isInstanceOf(NullPointerException.class);
    }

    @Test
    void update_shouldThrowNullPointerException_whenRecordIsNull() {
        // When / Then
        assertThatThrownBy(() -> repository.update(null))
                .isInstanceOf(NullPointerException.class);
    }

    @Test
    void findByFingerprint_shouldReturnDocumentKnownProcessable_whenStatusIsSkippedAlreadyProcessed() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "9999999999999999999999999999999999999999999999999999999999999999");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        DocumentRecord record = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/skipped.pdf"),
                "skipped.pdf",
                ProcessingStatus.SKIPPED_ALREADY_PROCESSED,
                FailureCounters.zero(),
                null,
                null,
                now,
                now,
                null,
                null
        );
        repository.create(record);

        // When
        DocumentRecordLookupResult result = repository.findByFingerprint(fingerprint);

        // Then: SKIPPED_ALREADY_PROCESSED is not terminal, should be DocumentKnownProcessable
        assertThat(result).isInstanceOf(DocumentKnownProcessable.class);
        DocumentKnownProcessable known = (DocumentKnownProcessable) result;
        assertThat(known.record().overallStatus()).isEqualTo(ProcessingStatus.SKIPPED_ALREADY_PROCESSED);
    }

    @Test
    void findByFingerprint_shouldReturnDocumentKnownProcessable_whenStatusIsSkippedFinalFailure() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        DocumentRecord record = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/final-skipped.pdf"),
                "final-skipped.pdf",
                ProcessingStatus.SKIPPED_FINAL_FAILURE,
                new FailureCounters(2, 0),
                now.minusSeconds(60),
                null,
                now,
                now,
                null,
                null
        );
        repository.create(record);

        // When
        DocumentRecordLookupResult result = repository.findByFingerprint(fingerprint);

        // Then: SKIPPED_FINAL_FAILURE is not terminal, should be DocumentKnownProcessable
        assertThat(result).isInstanceOf(DocumentKnownProcessable.class);
        DocumentKnownProcessable known = (DocumentKnownProcessable) result;
        assertThat(known.record().overallStatus()).isEqualTo(ProcessingStatus.SKIPPED_FINAL_FAILURE);
    }

    @Test
    void create_and_update_shouldPreserveNullTimestamps() {
        // Given: create with null timestamps
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        DocumentRecord record = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/no-timestamps.pdf"),
                "no-timestamps.pdf",
                ProcessingStatus.PROCESSING,
                FailureCounters.zero(),
                null, // lastFailureInstant is null
                null, // lastSuccessInstant is null
                now,
                now,
                null,
                null
        );
        repository.create(record);

        // When
        DocumentRecordLookupResult result = repository.findByFingerprint(fingerprint);

        // Then
        assertThat(result).isInstanceOf(DocumentKnownProcessable.class);
        DocumentKnownProcessable known = (DocumentKnownProcessable) result;
        assertThat(known.record().lastFailureInstant()).isNull();
        assertThat(known.record().lastSuccessInstant()).isNull();
    }

    @Test
    void create_and_update_shouldPersistAndReadTargetPathAndTargetFileName() {
        // Given: create a record with null target fields initially
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "dddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        DocumentRecord initialRecord = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/doc.pdf"),
                "doc.pdf",
                ProcessingStatus.PROCESSING,
                FailureCounters.zero(),
                null, null,
                now, now,
                null, null
        );
        repository.create(initialRecord);

        // Update with target path and filename
        DocumentRecord successRecord = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/doc.pdf"),
                "doc.pdf",
                ProcessingStatus.SUCCESS,
                FailureCounters.zero(),
                null, now,
                now, now,
                "/target/folder",
                "2026-01-15 - Rechnung.pdf"
        );

        // When
        repository.update(successRecord);
        DocumentRecordLookupResult result = repository.findByFingerprint(fingerprint);

        // Then
        assertThat(result).isInstanceOf(DocumentTerminalSuccess.class);
        DocumentRecord found = ((DocumentTerminalSuccess) result).record();
        assertThat(found.lastTargetPath()).isEqualTo("/target/folder");
        assertThat(found.lastTargetFileName()).isEqualTo("2026-01-15 - Rechnung.pdf");
    }

    @Test
    void update_shouldPersistNullTargetFields_whenNotYetCopied() {
        // Given: a record with null target path and filename (not yet in SUCCESS)
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        DocumentRecord record = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/pending.pdf"),
                "pending.pdf",
                ProcessingStatus.FAILED_RETRYABLE,
                new FailureCounters(0, 1),
                now, null,
                now, now,
                null, null
        );
        repository.create(record);

        // When
        DocumentRecordLookupResult result = repository.findByFingerprint(fingerprint);

        // Then
        assertThat(result).isInstanceOf(DocumentKnownProcessable.class);
        DocumentRecord found = ((DocumentKnownProcessable) result).record();
        assertThat(found.lastTargetPath()).isNull();
        assertThat(found.lastTargetFileName()).isNull();
    }

    @Test
    void update_shouldPreserveCreatedAtTimestamp() {
        // Given: create with specific createdAt
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc");
        Instant createdAt = Instant.now().minusSeconds(1000).truncatedTo(ChronoUnit.MICROS);
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);

        DocumentRecord initialRecord = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/test.pdf"),
                "test.pdf",
                ProcessingStatus.PROCESSING,
                FailureCounters.zero(),
                null,
                null,
                createdAt, // Much older createdAt
                createdAt,
                null,
                null
        );
        repository.create(initialRecord);

        // Update with new timestamps
        DocumentRecord updated = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/test.pdf"),
                "test.pdf",
                ProcessingStatus.SUCCESS,
                FailureCounters.zero(),
                null,
                now,
                createdAt, // createdAt should remain unchanged
                now,
                null,
                null
        );

        // When
        repository.update(updated);
        DocumentRecordLookupResult result = repository.findByFingerprint(fingerprint);

        // Then: createdAt should be preserved
        assertThat(result).isInstanceOf(DocumentTerminalSuccess.class);
        DocumentTerminalSuccess success = (DocumentTerminalSuccess) result;
        assertThat(success.record().createdAt()).isEqualTo(createdAt);
        assertThat(success.record().updatedAt()).isEqualTo(now);
    }
}
@@ -0,0 +1,886 @@
package de.gecheckt.pdf.umbenenner.adapter.out.sqlite;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.time.Instant;
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.List;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.port.out.DocumentPersistenceException;
import de.gecheckt.pdf.umbenenner.application.port.out.ProcessingAttempt;
import de.gecheckt.pdf.umbenenner.domain.model.DateSource;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
import de.gecheckt.pdf.umbenenner.domain.model.RunId;

/**
 * Tests for {@link SqliteProcessingAttemptRepositoryAdapter}.
 * <p>
 * Covers base attempt persistence, AI traceability field round-trips,
 * proposal-ready lookup, and non-AI-attempt status storability.
 */
class SqliteProcessingAttemptRepositoryAdapterTest {

    private SqliteProcessingAttemptRepositoryAdapter repository;
    private String jdbcUrl;

    @TempDir
    Path tempDir;

    @BeforeEach
    void setUp() {
        Path dbFile = tempDir.resolve("test.db");
        jdbcUrl = "jdbc:sqlite:" + dbFile.toAbsolutePath();

        // Initialize schema first
        SqliteSchemaInitializationAdapter schemaInitializer =
                new SqliteSchemaInitializationAdapter(jdbcUrl);
        schemaInitializer.initializeSchema();

        repository = new SqliteProcessingAttemptRepositoryAdapter(jdbcUrl);
    }

// -------------------------------------------------------------------------
|
||||
// Construction
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
@Test
|
||||
void constructor_rejectsNullJdbcUrl() {
|
||||
assertThatThrownBy(() -> new SqliteProcessingAttemptRepositoryAdapter(null))
|
||||
.isInstanceOf(NullPointerException.class)
|
||||
.hasMessageContaining("jdbcUrl");
|
||||
}
|
||||
|
||||
@Test
|
||||
void constructor_rejectsBlankJdbcUrl() {
|
||||
assertThatThrownBy(() -> new SqliteProcessingAttemptRepositoryAdapter(" "))
|
||||
.isInstanceOf(IllegalArgumentException.class)
|
||||
.hasMessageContaining("jdbcUrl");
|
||||
}
|
||||
|
||||
@Test
|
||||
void getJdbcUrl_returnsConfiguredUrl() {
|
||||
String url = "jdbc:sqlite:/some/path/test.db";
|
||||
SqliteProcessingAttemptRepositoryAdapter adapter = new SqliteProcessingAttemptRepositoryAdapter(url);
|
||||
assertThat(adapter.getJdbcUrl()).isEqualTo(url);
|
||||
}
|

    // -------------------------------------------------------------------------
    // loadNextAttemptNumber
    // -------------------------------------------------------------------------

    @Test
    void loadNextAttemptNumber_shouldReturnOne_whenNoAttemptsExist() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "0000000000000000000000000000000000000000000000000000000000000000");

        // When
        int nextAttemptNumber = repository.loadNextAttemptNumber(fingerprint);

        // Then
        assertThat(nextAttemptNumber).isEqualTo(1);
    }

    @Test
    void loadNextAttemptNumber_shouldReturnNextNumber_whenAttemptsExist() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "1111111111111111111111111111111111111111111111111111111111111111");
        RunId runId = new RunId("test-run-1");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);

        // Insert a document record first (FK constraint)
        insertDocumentRecord(fingerprint);

        // Insert first attempt
        ProcessingAttempt firstAttempt = ProcessingAttempt.withoutAiFields(
                fingerprint,
                runId,
                1,
                now,
                now.plusSeconds(10),
                ProcessingStatus.FAILED_RETRYABLE,
                "IOException",
                "File not found",
                true
        );
        repository.save(firstAttempt);

        // When
        int nextAttemptNumber = repository.loadNextAttemptNumber(fingerprint);

        // Then
        assertThat(nextAttemptNumber).isEqualTo(2);
    }

    @Test
    void loadNextAttemptNumber_shouldBeMonotonicAcrossMultipleCalls() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "2222222222222222222222222222222222222222222222222222222222222222");
        RunId runId = new RunId("test-run-2");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);

        // Insert a document record first (FK constraint)
        insertDocumentRecord(fingerprint);

        // Insert multiple attempts
        for (int i = 1; i <= 5; i++) {
            ProcessingAttempt attempt = ProcessingAttempt.withoutAiFields(
                    fingerprint,
                    runId,
                    i,
                    now.plusSeconds(i * 10),
                    now.plusSeconds(i * 10 + 5),
                    ProcessingStatus.FAILED_RETRYABLE,
                    "IOException",
                    "File not found",
                    true
            );
            repository.save(attempt);
        }

        // When
        int nextAttemptNumber = repository.loadNextAttemptNumber(fingerprint);

        // Then
        assertThat(nextAttemptNumber).isEqualTo(6);
    }

    @Test
    void loadNextAttemptNumber_shouldRejectNullFingerprint() {
        assertThatThrownBy(() -> repository.loadNextAttemptNumber(null))
                .isInstanceOf(NullPointerException.class)
                .hasMessageContaining("fingerprint");
    }

    // -------------------------------------------------------------------------
    // save
    // -------------------------------------------------------------------------

    @Test
    void save_shouldPersistProcessingAttempt_withAllFields() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "3333333333333333333333333333333333333333333333333333333333333333");
        RunId runId = new RunId("test-run-3");
        Instant startedAt = Instant.now().minusSeconds(60).truncatedTo(ChronoUnit.MICROS);
        Instant endedAt = Instant.now().truncatedTo(ChronoUnit.MICROS);

        // Insert a document record first (FK constraint)
        insertDocumentRecord(fingerprint);

        ProcessingAttempt attempt = ProcessingAttempt.withoutAiFields(
                fingerprint,
                runId,
                1,
                startedAt,
                endedAt,
                ProcessingStatus.FAILED_RETRYABLE,
                "IOException",
                "File not found",
                true
        );

        // When
        repository.save(attempt);

        // Then
        List<ProcessingAttempt> attempts = repository.findAllByFingerprint(fingerprint);
        assertThat(attempts).hasSize(1);

        ProcessingAttempt saved = attempts.get(0);
        assertThat(saved.fingerprint()).isEqualTo(fingerprint);
        assertThat(saved.runId()).isEqualTo(runId);
        assertThat(saved.attemptNumber()).isEqualTo(1);
        assertThat(saved.startedAt()).isEqualTo(startedAt);
        assertThat(saved.endedAt()).isEqualTo(endedAt);
        assertThat(saved.status()).isEqualTo(ProcessingStatus.FAILED_RETRYABLE);
        assertThat(saved.failureClass()).isEqualTo("IOException");
        assertThat(saved.failureMessage()).isEqualTo("File not found");
        assertThat(saved.retryable()).isTrue();
    }

    @Test
    void save_shouldPersistProcessingAttempt_withNullFailureFields() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "4444444444444444444444444444444444444444444444444444444444444444");
        RunId runId = new RunId("test-run-4");
        Instant startedAt = Instant.now().minusSeconds(60).truncatedTo(ChronoUnit.MICROS);
        Instant endedAt = Instant.now().truncatedTo(ChronoUnit.MICROS);

        // Insert a document record first (FK constraint)
        insertDocumentRecord(fingerprint);

        ProcessingAttempt attempt = ProcessingAttempt.withoutAiFields(
                fingerprint,
                runId,
                1,
                startedAt,
                endedAt,
                ProcessingStatus.SUCCESS,
                null,  // null failure class
                null,  // null failure message
                false  // not retryable
        );

        // When
        repository.save(attempt);

        // Then
        List<ProcessingAttempt> attempts = repository.findAllByFingerprint(fingerprint);
        assertThat(attempts).hasSize(1);

        ProcessingAttempt saved = attempts.get(0);
        assertThat(saved.failureClass()).isNull();
        assertThat(saved.failureMessage()).isNull();
        assertThat(saved.retryable()).isFalse();
    }

    @Test
    void save_shouldRejectNullAttempt() {
        assertThatThrownBy(() -> repository.save(null))
                .isInstanceOf(NullPointerException.class)
                .hasMessageContaining("attempt");
    }

    // -------------------------------------------------------------------------
    // findAllByFingerprint
    // -------------------------------------------------------------------------

    @Test
    void findAllByFingerprint_shouldReturnEmptyList_whenNoAttemptsExist() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "5555555555555555555555555555555555555555555555555555555555555555");

        // When
        List<ProcessingAttempt> attempts = repository.findAllByFingerprint(fingerprint);

        // Then
        assertThat(attempts).isEmpty();
    }

    @Test
    void findAllByFingerprint_shouldReturnAllAttemptsOrderedByNumber() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "6666666666666666666666666666666666666666666666666666666666666666");
        RunId runId1 = new RunId("test-run-5");
        RunId runId2 = new RunId("test-run-6");
        Instant baseTime = Instant.now().truncatedTo(ChronoUnit.MICROS);

        // Insert a document record first (FK constraint)
        insertDocumentRecord(fingerprint);

        // Insert attempts out of order to verify sorting
        ProcessingAttempt attempt3 = ProcessingAttempt.withoutAiFields(
                fingerprint,
                runId2,
                3,
                baseTime.plusSeconds(20),
                baseTime.plusSeconds(25),
                ProcessingStatus.FAILED_FINAL,
                "ContentError",
                "No text extractable",
                false
        );
        repository.save(attempt3);

        ProcessingAttempt attempt1 = ProcessingAttempt.withoutAiFields(
                fingerprint,
                runId1,
                1,
                baseTime,
                baseTime.plusSeconds(5),
                ProcessingStatus.FAILED_RETRYABLE,
                "IOException",
                "File not found",
                true
        );
        repository.save(attempt1);

        ProcessingAttempt attempt2 = ProcessingAttempt.withoutAiFields(
                fingerprint,
                runId1,
                2,
                baseTime.plusSeconds(10),
                baseTime.plusSeconds(15),
                ProcessingStatus.SKIPPED_ALREADY_PROCESSED,
                null,
                null,
                false
        );
        repository.save(attempt2);

        // When
        List<ProcessingAttempt> attempts = repository.findAllByFingerprint(fingerprint);

        // Then
        assertThat(attempts).hasSize(3);
        assertThat(attempts.get(0).attemptNumber()).isEqualTo(1);
        assertThat(attempts.get(1).attemptNumber()).isEqualTo(2);
        assertThat(attempts.get(2).attemptNumber()).isEqualTo(3);

        // Verify all fields
        ProcessingAttempt first = attempts.get(0);
        assertThat(first.runId()).isEqualTo(runId1);
        assertThat(first.status()).isEqualTo(ProcessingStatus.FAILED_RETRYABLE);
        assertThat(first.failureClass()).isEqualTo("IOException");

        ProcessingAttempt second = attempts.get(1);
        assertThat(second.status()).isEqualTo(ProcessingStatus.SKIPPED_ALREADY_PROCESSED);
        assertThat(second.failureClass()).isNull();

        ProcessingAttempt third = attempts.get(2);
        assertThat(third.status()).isEqualTo(ProcessingStatus.FAILED_FINAL);
        assertThat(third.retryable()).isFalse();
    }

    @Test
    void findAllByFingerprint_shouldReturnImmutableList() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "7777777777777777777777777777777777777777777777777777777777777777");

        // When
        List<ProcessingAttempt> attempts = repository.findAllByFingerprint(fingerprint);

        // Then
        assertThat(attempts).isEmpty();
        assertThatThrownBy(() -> attempts.add(null))
                .isInstanceOf(UnsupportedOperationException.class);
    }

    @Test
    void findAllByFingerprint_shouldRejectNullFingerprint() {
        assertThatThrownBy(() -> repository.findAllByFingerprint(null))
                .isInstanceOf(NullPointerException.class)
                .hasMessageContaining("fingerprint");
    }

    // -------------------------------------------------------------------------
    // AI traceability fields — round-trip persistence
    // -------------------------------------------------------------------------

    @Test
    void save_persistsAllAiTraceabilityFields_andFindAllReadsThemBack() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
        RunId runId = new RunId("ai-run-1");
        Instant startedAt = Instant.now().minusSeconds(30).truncatedTo(ChronoUnit.MICROS);
        Instant endedAt = Instant.now().truncatedTo(ChronoUnit.MICROS);
        LocalDate resolvedDate = LocalDate.of(2026, 3, 15);

        insertDocumentRecord(fingerprint);

        ProcessingAttempt attempt = new ProcessingAttempt(
                fingerprint, runId, 1, startedAt, endedAt,
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                "openai-compatible",
                "gpt-4o", "prompt-v1.txt",
                5, 1234,
                "{\"date\":\"2026-03-15\",\"title\":\"Stromabrechnung\",\"reasoning\":\"Invoice date found.\"}",
                "Invoice date found.",
                resolvedDate, DateSource.AI_PROVIDED,
                "Stromabrechnung",
                null
        );

        // When
        repository.save(attempt);

        // Then
        List<ProcessingAttempt> saved = repository.findAllByFingerprint(fingerprint);
        assertThat(saved).hasSize(1);
        ProcessingAttempt result = saved.get(0);

        assertThat(result.modelName()).isEqualTo("gpt-4o");
        assertThat(result.promptIdentifier()).isEqualTo("prompt-v1.txt");
        assertThat(result.processedPageCount()).isEqualTo(5);
        assertThat(result.sentCharacterCount()).isEqualTo(1234);
        assertThat(result.aiRawResponse()).contains("Stromabrechnung");
        assertThat(result.aiReasoning()).isEqualTo("Invoice date found.");
        assertThat(result.resolvedDate()).isEqualTo(resolvedDate);
        assertThat(result.dateSource()).isEqualTo(DateSource.AI_PROVIDED);
        assertThat(result.validatedTitle()).isEqualTo("Stromabrechnung");
    }

    @Test
    void save_persistsAiFieldsWithFallbackDateSource() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb");
        RunId runId = new RunId("ai-run-2");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        LocalDate fallbackDate = LocalDate.of(2026, 4, 7);

        insertDocumentRecord(fingerprint);

        ProcessingAttempt attempt = new ProcessingAttempt(
                fingerprint, runId, 1, now, now.plusSeconds(5),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                "openai-compatible",
                "claude-sonnet-4-6", "prompt-v2.txt",
                3, 800,
                "{\"title\":\"Kontoauszug\",\"reasoning\":\"No date in document.\"}",
                "No date in document.",
                fallbackDate, DateSource.FALLBACK_CURRENT,
                "Kontoauszug",
                null
        );

        repository.save(attempt);

        List<ProcessingAttempt> saved = repository.findAllByFingerprint(fingerprint);
        assertThat(saved).hasSize(1);
        ProcessingAttempt result = saved.get(0);

        assertThat(result.dateSource()).isEqualTo(DateSource.FALLBACK_CURRENT);
        assertThat(result.resolvedDate()).isEqualTo(fallbackDate);
    }

    @Test
    void save_persistsNullAiFields_whenNoAiCallWasMade() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc");
        RunId runId = new RunId("no-ai-run");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);

        insertDocumentRecord(fingerprint);

        ProcessingAttempt attempt = ProcessingAttempt.withoutAiFields(
                fingerprint, runId, 1, now, now.plusSeconds(1),
                ProcessingStatus.FAILED_RETRYABLE,
                "NoTextError", "No extractable text", true
        );

        repository.save(attempt);

        List<ProcessingAttempt> saved = repository.findAllByFingerprint(fingerprint);
        assertThat(saved).hasSize(1);
        ProcessingAttempt result = saved.get(0);

        assertThat(result.modelName()).isNull();
        assertThat(result.promptIdentifier()).isNull();
        assertThat(result.processedPageCount()).isNull();
        assertThat(result.sentCharacterCount()).isNull();
        assertThat(result.aiRawResponse()).isNull();
        assertThat(result.aiReasoning()).isNull();
        assertThat(result.resolvedDate()).isNull();
        assertThat(result.dateSource()).isNull();
        assertThat(result.validatedTitle()).isNull();
    }

    // -------------------------------------------------------------------------
    // findLatestProposalReadyAttempt
    // -------------------------------------------------------------------------

    @Test
    void findLatestProposalReadyAttempt_returnsNull_whenNoAttemptsExist() {
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "dddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd");

        ProcessingAttempt result = repository.findLatestProposalReadyAttempt(fingerprint);

        assertThat(result).isNull();
    }

    @Test
    void findLatestProposalReadyAttempt_returnsNull_whenNoProposalReadyAttemptExists() {
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);

        insertDocumentRecord(fingerprint);
        ProcessingAttempt attempt = ProcessingAttempt.withoutAiFields(
                fingerprint, new RunId("run-x"), 1, now, now.plusSeconds(1),
                ProcessingStatus.FAILED_RETRYABLE, "Err", "msg", true
        );
        repository.save(attempt);

        ProcessingAttempt result = repository.findLatestProposalReadyAttempt(fingerprint);

        assertThat(result).isNull();
    }

    @Test
    void findLatestProposalReadyAttempt_returnsSingleProposalReadyAttempt() {
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        LocalDate date = LocalDate.of(2026, 2, 1);

        insertDocumentRecord(fingerprint);

        ProcessingAttempt attempt = new ProcessingAttempt(
                fingerprint, new RunId("run-p"), 1, now, now.plusSeconds(2),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                null,
                "gpt-4o", "prompt-v1.txt", 2, 500,
                "{\"title\":\"Rechnung\",\"reasoning\":\"Found.\"}",
                "Found.", date, DateSource.AI_PROVIDED, "Rechnung",
                null
        );
        repository.save(attempt);

        ProcessingAttempt result = repository.findLatestProposalReadyAttempt(fingerprint);

        assertThat(result).isNotNull();
        assertThat(result.status()).isEqualTo(ProcessingStatus.PROPOSAL_READY);
        assertThat(result.validatedTitle()).isEqualTo("Rechnung");
        assertThat(result.resolvedDate()).isEqualTo(date);
        assertThat(result.dateSource()).isEqualTo(DateSource.AI_PROVIDED);
    }

    @Test
    void findLatestProposalReadyAttempt_returnsLatest_whenMultipleExist() {
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "1111111111111111111111111111111111111111111111111111111111111112");
        Instant base = Instant.now().truncatedTo(ChronoUnit.MICROS);

        insertDocumentRecord(fingerprint);

        // First PROPOSAL_READY attempt
        repository.save(new ProcessingAttempt(
                fingerprint, new RunId("run-1"), 1, base, base.plusSeconds(1),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                null,
                "model-a", "prompt-v1.txt", 1, 100,
                "{}", "First.", LocalDate.of(2026, 1, 1), DateSource.AI_PROVIDED, "TitelEins",
                null
        ));

        // Subsequent FAILED attempt
        repository.save(ProcessingAttempt.withoutAiFields(
                fingerprint, new RunId("run-2"), 2,
                base.plusSeconds(10), base.plusSeconds(11),
                ProcessingStatus.FAILED_RETRYABLE, "Err", "msg", true
        ));

        // Second PROPOSAL_READY attempt (newer)
        repository.save(new ProcessingAttempt(
                fingerprint, new RunId("run-3"), 3, base.plusSeconds(20), base.plusSeconds(21),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                null,
                "model-b", "prompt-v2.txt", 2, 200,
                "{}", "Second.", LocalDate.of(2026, 2, 2), DateSource.AI_PROVIDED, "TitelZwei",
                null
        ));

        ProcessingAttempt result = repository.findLatestProposalReadyAttempt(fingerprint);

        assertThat(result).isNotNull();
        assertThat(result.attemptNumber()).isEqualTo(3);
        assertThat(result.validatedTitle()).isEqualTo("TitelZwei");
        assertThat(result.modelName()).isEqualTo("model-b");
    }

    @Test
    void save_persistsFinalTargetFileName_forSuccessAttempt() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "4444444444444444444444444444444444444444444444444444444444444445");
        RunId runId = new RunId("success-run");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        LocalDate date = LocalDate.of(2026, 1, 15);
        String expectedFileName = "2026-01-15 - Rechnung.pdf";

        insertDocumentRecord(fingerprint);

        ProcessingAttempt attempt = new ProcessingAttempt(
                fingerprint, runId, 1, now, now.plusSeconds(3),
                ProcessingStatus.SUCCESS,
                null, null, false,
                null,
                "gpt-4", "prompt-v1.txt", 2, 600,
                "{\"title\":\"Rechnung\",\"reasoning\":\"Invoice.\"}",
                "Invoice.",
                date, DateSource.AI_PROVIDED,
                "Rechnung",
                expectedFileName
        );

        // When
        repository.save(attempt);

        // Then
        List<ProcessingAttempt> saved = repository.findAllByFingerprint(fingerprint);
        assertThat(saved).hasSize(1);
        assertThat(saved.get(0).finalTargetFileName()).isEqualTo(expectedFileName);
        assertThat(saved.get(0).status()).isEqualTo(ProcessingStatus.SUCCESS);
    }

    @Test
    void save_persistsNullFinalTargetFileName_forNonSuccessAttempt() {
        // finalTargetFileName must remain null for PROPOSAL_READY and non-SUCCESS attempts
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "5555555555555555555555555555555555555555555555555555555555555556");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);

        insertDocumentRecord(fingerprint);

        ProcessingAttempt attempt = new ProcessingAttempt(
                fingerprint, new RunId("run-prop"), 1, now, now.plusSeconds(1),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                null,
                "gpt-4", "prompt-v1.txt", 1, 200,
                "{}", "reason",
                LocalDate.of(2026, 3, 1), DateSource.AI_PROVIDED,
                "Kontoauszug",
                null  // no target filename yet
        );

        repository.save(attempt);

        List<ProcessingAttempt> saved = repository.findAllByFingerprint(fingerprint);
        assertThat(saved).hasSize(1);
        assertThat(saved.get(0).finalTargetFileName()).isNull();
    }

    @Test
    void save_proposalAttemptNotOverwrittenBySubsequentSuccessAttempt() {
        // Verifies that the leading PROPOSAL_READY attempt remains unchanged when
        // a subsequent SUCCESS attempt is added (no update, only a new insert).
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "6666666666666666666666666666666666666666666666666666666666666667");
        Instant base = Instant.now().truncatedTo(ChronoUnit.MICROS);
        LocalDate date = LocalDate.of(2026, 2, 10);

        insertDocumentRecord(fingerprint);

        // First attempt: PROPOSAL_READY
        ProcessingAttempt proposalAttempt = new ProcessingAttempt(
                fingerprint, new RunId("run-1"), 1, base, base.plusSeconds(2),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                null,
                "model-a", "prompt-v1.txt", 3, 700,
                "{}", "reason.", date, DateSource.AI_PROVIDED, "Bescheid", null
        );
        repository.save(proposalAttempt);

        // Second attempt: SUCCESS (target copy completed)
        ProcessingAttempt successAttempt = new ProcessingAttempt(
                fingerprint, new RunId("run-1"), 2,
                base.plusSeconds(5), base.plusSeconds(6),
                ProcessingStatus.SUCCESS,
                null, null, false,
                null, null, null, null, null, null,
                null, null, null, null,
                "2026-02-10 - Bescheid.pdf"
        );
        repository.save(successAttempt);

        // Both attempts must be present
        List<ProcessingAttempt> all = repository.findAllByFingerprint(fingerprint);
        assertThat(all).hasSize(2);

        // The original PROPOSAL_READY attempt must remain unchanged
        ProcessingAttempt first = all.get(0);
        assertThat(first.status()).isEqualTo(ProcessingStatus.PROPOSAL_READY);
        assertThat(first.validatedTitle()).isEqualTo("Bescheid");
        assertThat(first.finalTargetFileName()).isNull();

        // The SUCCESS attempt carries the final filename
        ProcessingAttempt second = all.get(1);
        assertThat(second.status()).isEqualTo(ProcessingStatus.SUCCESS);
        assertThat(second.finalTargetFileName()).isEqualTo("2026-02-10 - Bescheid.pdf");
    }

    @Test
    void findLatestProposalReadyAttempt_rejectsNullFingerprint() {
        assertThatThrownBy(() -> repository.findLatestProposalReadyAttempt(null))
                .isInstanceOf(NullPointerException.class)
                .hasMessageContaining("fingerprint");
    }

    // -------------------------------------------------------------------------
    // READY_FOR_AI and PROPOSAL_READY status storability
    // -------------------------------------------------------------------------

    @Test
    void save_canPersistReadyForAiStatus() {
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "2222222222222222222222222222222222222222222222222222222222222223");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);

        insertDocumentRecord(fingerprint);

        ProcessingAttempt attempt = ProcessingAttempt.withoutAiFields(
                fingerprint, new RunId("run-r"), 1, now, now.plusSeconds(1),
                ProcessingStatus.READY_FOR_AI, null, null, false
        );
        repository.save(attempt);

        List<ProcessingAttempt> saved = repository.findAllByFingerprint(fingerprint);
        assertThat(saved).hasSize(1);
        assertThat(saved.get(0).status()).isEqualTo(ProcessingStatus.READY_FOR_AI);
    }

    @Test
    void save_canPersistProposalReadyStatus() {
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "3333333333333333333333333333333333333333333333333333333333333334");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);

        insertDocumentRecord(fingerprint);

        ProcessingAttempt attempt = new ProcessingAttempt(
                fingerprint, new RunId("run-p2"), 1, now, now.plusSeconds(1),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                null,
                "model-x", "prompt-v1.txt", 1, 50,
                "{}", "Reasoning.", LocalDate.of(2026, 1, 15), DateSource.AI_PROVIDED, "Titel",
                null
        );
        repository.save(attempt);

        List<ProcessingAttempt> saved = repository.findAllByFingerprint(fingerprint);
        assertThat(saved).hasSize(1);
        assertThat(saved.get(0).status()).isEqualTo(ProcessingStatus.PROPOSAL_READY);
    }

    // -------------------------------------------------------------------------
    // AI field persistence is independent of logging configuration
    // -------------------------------------------------------------------------

    /**
     * Verifies that the repository always stores the complete AI raw response and reasoning,
     * independent of any logging sensitivity configuration.
     * <p>
     * The {@code AiContentSensitivity} setting controls only whether sensitive content is
     * written to log files. It has no influence on what the repository persists. This test
     * demonstrates that full AI fields are stored regardless of any logging configuration by
     * verifying a round-trip with both full content and long reasoning text.
     */
    @Test
    void save_persistsFullAiResponseAndReasoning_unaffectedByLoggingConfiguration() {
        // The repository has no dependency on AiContentSensitivity.
        // It always stores the complete AI raw response and reasoning.
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "d1d2d3d4d5d6d7d8d9dadbdcdddedfd0d1d2d3d4d5d6d7d8d9dadbdcdddedfd0".substring(0, 64));
        RunId runId = new RunId("persistence-independence-run");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);

        // Deliberately long and complete AI raw response — must be stored in full
        String fullRawResponse = "{\"date\":\"2026-03-01\",\"title\":\"Stromabrechnung\","
                + "\"reasoning\":\"Invoice date clearly stated on page 1. Utility provider named.\"}";
        // Deliberately complete reasoning — must be stored in full
        String fullReasoning = "Invoice date clearly stated on page 1. Utility provider named.";

        insertDocumentRecord(fingerprint);

        ProcessingAttempt attempt = new ProcessingAttempt(
                fingerprint, runId, 1, now, now.plusSeconds(5),
                ProcessingStatus.PROPOSAL_READY,
                null, null, false,
                null,
                "gpt-4o", "prompt-v1.txt",
                3, 750,
                fullRawResponse,
                fullReasoning,
                LocalDate.of(2026, 3, 1), DateSource.AI_PROVIDED,
                "Stromabrechnung",
                null
        );

        repository.save(attempt);

        List<ProcessingAttempt> saved = repository.findAllByFingerprint(fingerprint);
        assertThat(saved).hasSize(1);
        ProcessingAttempt result = saved.get(0);

        // Full raw response is stored completely — not truncated, not suppressed
        assertThat(result.aiRawResponse()).isEqualTo(fullRawResponse);
        // Full reasoning is stored completely — not truncated, not suppressed
        assertThat(result.aiReasoning()).isEqualTo(fullReasoning);
    }

    // -------------------------------------------------------------------------
    // Integration with document records (FK constraints)
    // -------------------------------------------------------------------------

    @Test
    void save_shouldFail_whenFingerprintDoesNotExistInDocumentRecord() {
        // Given
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "8888888888888888888888888888888888888888888888888888888888888888");
        RunId runId = new RunId("test-run-7");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);

        ProcessingAttempt attempt = ProcessingAttempt.withoutAiFields(
                fingerprint,
                runId,
                1,
                now,
                now.plusSeconds(10),
                ProcessingStatus.FAILED_RETRYABLE,
                "IOException",
                "File not found",
                true
        );

        // When / Then
        assertThatThrownBy(() -> repository.save(attempt))
                .isInstanceOf(DocumentPersistenceException.class);
    }

    // -------------------------------------------------------------------------
    // Helpers
    // -------------------------------------------------------------------------

    private void insertDocumentRecord(DocumentFingerprint fingerprint) {
        String sql = """
                INSERT INTO document_record (
                    fingerprint,
                    last_known_source_locator,
                    last_known_source_file_name,
                    overall_status,
                    content_error_count,
                    transient_error_count,
                    created_at,
                    updated_at
                ) VALUES (?, ?, ?, ?, ?, ?, ?, ?)
                """;

        try (Connection connection = DriverManager.getConnection(jdbcUrl);
             var statement = connection.prepareStatement(sql)) {

            statement.setString(1, fingerprint.sha256Hex());
            statement.setString(2, "/test/path/document.pdf");
            statement.setString(3, "document.pdf");
            statement.setString(4, ProcessingStatus.PROCESSING.name());
            statement.setInt(5, 0);
            statement.setInt(6, 0);
            Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
            statement.setString(7, now.toString());
            statement.setString(8, now.toString());

            statement.executeUpdate();
        } catch (SQLException e) {
            throw new RuntimeException("Failed to insert document record for testing", e);
        }
    }
}
||||
@@ -0,0 +1,469 @@
package de.gecheckt.pdf.umbenenner.adapter.out.sqlite;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;
import static org.junit.jupiter.api.Assertions.assertThrows;

import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.HashSet;
import java.util.Set;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.port.out.DocumentPersistenceException;

/**
 * Tests for {@link SqliteSchemaInitializationAdapter}.
 * <p>
 * Verifies that the two-level schema is created correctly, that schema evolution
 * (idempotent addition of AI traceability columns) works, that the idempotent
 * status migration of earlier positive intermediate states to {@code READY_FOR_AI}
 * is correct, and that invalid configuration is rejected.
 */
class SqliteSchemaInitializationAdapterTest {

    @TempDir
    Path tempDir;

    // -------------------------------------------------------------------------
    // Construction
    // -------------------------------------------------------------------------

    @Test
    void constructor_rejectsNullJdbcUrl() {
        assertThatThrownBy(() -> new SqliteSchemaInitializationAdapter(null))
                .isInstanceOf(NullPointerException.class)
                .hasMessageContaining("jdbcUrl");
    }

    @Test
    void constructor_rejectsBlankJdbcUrl() {
        assertThatThrownBy(() -> new SqliteSchemaInitializationAdapter(" "))
                .isInstanceOf(IllegalArgumentException.class)
                .hasMessageContaining("jdbcUrl");
    }

    @Test
    void getJdbcUrl_returnsConfiguredUrl() {
        String url = "jdbc:sqlite:/some/path/test.db";
        SqliteSchemaInitializationAdapter adapter = new SqliteSchemaInitializationAdapter(url);
        assertThat(adapter.getJdbcUrl()).isEqualTo(url);
    }

    // -------------------------------------------------------------------------
    // Schema creation – tables present
    // -------------------------------------------------------------------------

    @Test
    void initializeSchema_createsBothTables(@TempDir Path dir) throws SQLException {
        String jdbcUrl = jdbcUrl(dir, "schema_test.db");
        SqliteSchemaInitializationAdapter adapter = new SqliteSchemaInitializationAdapter(jdbcUrl);

        adapter.initializeSchema();

        Set<String> tables = readTableNames(jdbcUrl);
        assertThat(tables).contains("document_record", "processing_attempt");
    }

    @Test
    void initializeSchema_documentRecordHasAllMandatoryColumns(@TempDir Path dir) throws SQLException {
        String jdbcUrl = jdbcUrl(dir, "columns_test.db");
        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();

        Set<String> columns = readColumnNames(jdbcUrl, "document_record");
        assertThat(columns).containsExactlyInAnyOrder(
                "id",
                "fingerprint",
                "last_known_source_locator",
                "last_known_source_file_name",
                "overall_status",
                "content_error_count",
                "transient_error_count",
                "last_failure_instant",
                "last_success_instant",
                "created_at",
                "updated_at",
                "last_target_path",
                "last_target_file_name"
        );
    }

    @Test
    void initializeSchema_processingAttemptHasAllMandatoryColumns(@TempDir Path dir) throws SQLException {
        String jdbcUrl = jdbcUrl(dir, "attempt_columns_test.db");
        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();

        Set<String> columns = readColumnNames(jdbcUrl, "processing_attempt");
        assertThat(columns).containsExactlyInAnyOrder(
                "id",
                "fingerprint",
                "run_id",
                "attempt_number",
                "started_at",
                "ended_at",
                "status",
                "failure_class",
                "failure_message",
                "retryable",
                "model_name",
                "prompt_identifier",
                "processed_page_count",
                "sent_character_count",
                "ai_raw_response",
                "ai_reasoning",
                "resolved_date",
                "date_source",
                "validated_title",
                "final_target_file_name",
                "ai_provider"
        );
    }

    // -------------------------------------------------------------------------
    // Idempotency
    // -------------------------------------------------------------------------

    @Test
    void initializeSchema_isIdempotent_calledTwice(@TempDir Path dir) {
        String jdbcUrl = jdbcUrl(dir, "idempotent_test.db");
        SqliteSchemaInitializationAdapter adapter = new SqliteSchemaInitializationAdapter(jdbcUrl);

        // Must not throw on the second call
        adapter.initializeSchema();
        adapter.initializeSchema();
    }

    // -------------------------------------------------------------------------
    // Unique constraint: fingerprint in document_record
    // -------------------------------------------------------------------------

    @Test
    void documentRecord_fingerprintUniqueConstraintIsEnforced(@TempDir Path dir) throws SQLException {
        String jdbcUrl = jdbcUrl(dir, "unique_test.db");
        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();

        String insertSql = """
                INSERT INTO document_record
                    (fingerprint, last_known_source_locator, last_known_source_file_name,
                     overall_status, created_at, updated_at)
                VALUES (?, 'locator', 'file.pdf', 'SUCCESS', '2026-01-01T00:00:00Z', '2026-01-01T00:00:00Z')
                """;
        String fp = "a".repeat(64);

        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            try (var ps = conn.prepareStatement(insertSql)) {
                ps.setString(1, fp);
                ps.executeUpdate();
            }
            // Second insert with the same fingerprint must fail
            try (var ps = conn.prepareStatement(insertSql)) {
                ps.setString(1, fp);
                assertThrows(
                        SQLException.class, ps::executeUpdate,
                        "Expected UNIQUE constraint violation on document_record.fingerprint");
            }
        }
    }

    // -------------------------------------------------------------------------
    // Unique constraint: (fingerprint, attempt_number) in processing_attempt
    // -------------------------------------------------------------------------

    @Test
    void processingAttempt_fingerprintAttemptNumberUniqueConstraintIsEnforced(@TempDir Path dir)
            throws SQLException {
        String jdbcUrl = jdbcUrl(dir, "attempt_unique_test.db");
        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();

        String fp = "b".repeat(64);

        // Insert the master record first (FK)
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            try (var ps = conn.prepareStatement("""
                    INSERT INTO document_record
                        (fingerprint, last_known_source_locator, last_known_source_file_name,
                         overall_status, created_at, updated_at)
                    VALUES (?, 'loc', 'f.pdf', 'FAILED_RETRYABLE', '2026-01-01T00:00:00Z', '2026-01-01T00:00:00Z')
                    """)) {
                ps.setString(1, fp);
                ps.executeUpdate();
            }

            String attemptSql = """
                    INSERT INTO processing_attempt
                        (fingerprint, run_id, attempt_number, started_at, ended_at, status, retryable)
                    VALUES (?, 'run-1', 1, '2026-01-01T00:00:00Z', '2026-01-01T00:01:00Z', 'FAILED_RETRYABLE', 1)
                    """;

            try (var ps = conn.prepareStatement(attemptSql)) {
                ps.setString(1, fp);
                ps.executeUpdate();
            }
            // A duplicate (fingerprint, attempt_number) must fail
            try (var ps = conn.prepareStatement(attemptSql)) {
                ps.setString(1, fp);
                assertThrows(
                        SQLException.class, ps::executeUpdate,
                        "Expected UNIQUE constraint violation on (fingerprint, attempt_number)");
            }
        }
    }

    // -------------------------------------------------------------------------
    // Skip attempts are storable
    // -------------------------------------------------------------------------

    @Test
    void processingAttempt_skipStatusIsStorable(@TempDir Path dir) throws SQLException {
        String jdbcUrl = jdbcUrl(dir, "skip_test.db");
        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();

        String fp = "c".repeat(64);

        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            // Insert the master record
            try (var ps = conn.prepareStatement("""
                    INSERT INTO document_record
                        (fingerprint, last_known_source_locator, last_known_source_file_name,
                         overall_status, created_at, updated_at)
                    VALUES (?, 'loc', 'f.pdf', 'SUCCESS', '2026-01-01T00:00:00Z', '2026-01-01T00:00:00Z')
                    """)) {
                ps.setString(1, fp);
                ps.executeUpdate();
            }

            // Insert a SKIPPED_ALREADY_PROCESSED attempt (null failure fields, retryable=0)
            try (var ps = conn.prepareStatement("""
                    INSERT INTO processing_attempt
                        (fingerprint, run_id, attempt_number, started_at, ended_at,
                         status, failure_class, failure_message, retryable)
                    VALUES (?, 'run-2', 2, '2026-01-02T00:00:00Z', '2026-01-02T00:00:01Z',
                            'SKIPPED_ALREADY_PROCESSED', NULL, NULL, 0)
                    """)) {
                ps.setString(1, fp);
                int rows = ps.executeUpdate();
                assertThat(rows).isEqualTo(1);
            }
        }
    }

    // -------------------------------------------------------------------------
    // Schema evolution — AI traceability columns
    // -------------------------------------------------------------------------

    @Test
    void initializeSchema_addsAiTraceabilityColumnsToExistingSchema(@TempDir Path dir)
            throws SQLException {
        // Simulate a pre-evolution schema: create the base tables without AI columns
        String jdbcUrl = jdbcUrl(dir, "evolution_test.db");
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             var stmt = conn.createStatement()) {
            stmt.execute("""
                    CREATE TABLE IF NOT EXISTS document_record (
                        id INTEGER PRIMARY KEY AUTOINCREMENT,
                        fingerprint TEXT NOT NULL,
                        last_known_source_locator TEXT NOT NULL,
                        last_known_source_file_name TEXT NOT NULL,
                        overall_status TEXT NOT NULL,
                        content_error_count INTEGER NOT NULL DEFAULT 0,
                        transient_error_count INTEGER NOT NULL DEFAULT 0,
                        last_failure_instant TEXT,
                        last_success_instant TEXT,
                        created_at TEXT NOT NULL,
                        updated_at TEXT NOT NULL,
                        CONSTRAINT uq_document_record_fingerprint UNIQUE (fingerprint)
                    )
                    """);
            stmt.execute("""
                    CREATE TABLE IF NOT EXISTS processing_attempt (
                        id INTEGER PRIMARY KEY AUTOINCREMENT,
                        fingerprint TEXT NOT NULL,
                        run_id TEXT NOT NULL,
                        attempt_number INTEGER NOT NULL,
                        started_at TEXT NOT NULL,
                        ended_at TEXT NOT NULL,
                        status TEXT NOT NULL,
                        failure_class TEXT,
                        failure_message TEXT,
                        retryable INTEGER NOT NULL DEFAULT 0
                    )
                    """);
        }

        // Running initializeSchema on the existing base schema must succeed (evolution)
        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();

        Set<String> columns = readColumnNames(jdbcUrl, "processing_attempt");
        assertThat(columns).contains(
                "model_name", "prompt_identifier", "processed_page_count",
                "sent_character_count", "ai_raw_response", "ai_reasoning",
                "resolved_date", "date_source", "validated_title");
    }

    // -------------------------------------------------------------------------
    // Status migration — earlier positive intermediate state → READY_FOR_AI
    // -------------------------------------------------------------------------

    @Test
    void initializeSchema_migrates_legacySuccessWithoutProposal_toReadyForAi(@TempDir Path dir)
            throws SQLException {
        String jdbcUrl = jdbcUrl(dir, "migration_test.db");
        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();

        // Insert a document with SUCCESS status and no PROPOSAL_READY attempt
        String fp = "d".repeat(64);
        insertDocumentRecordWithStatus(jdbcUrl, fp, "SUCCESS");

        // Run schema initialization again (the migration step runs every time)
        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();

        String status = readOverallStatus(jdbcUrl, fp);
        assertThat(status).isEqualTo("READY_FOR_AI");
    }

    @Test
    void initializeSchema_migration_isIdempotent(@TempDir Path dir) throws SQLException {
        String jdbcUrl = jdbcUrl(dir, "migration_idempotent_test.db");
        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();

        String fp = "e".repeat(64);
        insertDocumentRecordWithStatus(jdbcUrl, fp, "SUCCESS");

        // Run the migration twice — it must neither corrupt data nor throw
        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();
        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();

        String status = readOverallStatus(jdbcUrl, fp);
        assertThat(status).isEqualTo("READY_FOR_AI");
    }

    @Test
    void initializeSchema_doesNotMigrate_successWithProposalReadyAttempt(@TempDir Path dir)
            throws SQLException {
        String jdbcUrl = jdbcUrl(dir, "migration_proposal_test.db");
        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();

        String fp = "f".repeat(64);
        // A SUCCESS document that already has a PROPOSAL_READY attempt must NOT be migrated
        insertDocumentRecordWithStatus(jdbcUrl, fp, "SUCCESS");
        insertAttemptWithStatus(jdbcUrl, fp, "PROPOSAL_READY");

        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();

        String status = readOverallStatus(jdbcUrl, fp);
        assertThat(status).isEqualTo("SUCCESS");
    }

    @Test
    void initializeSchema_doesNotMigrate_terminalFailureStates(@TempDir Path dir)
            throws SQLException {
        String jdbcUrl = jdbcUrl(dir, "migration_failure_test.db");
        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();

        String fpRetryable = "1".repeat(64);
        String fpFinal = "2".repeat(64);
        insertDocumentRecordWithStatus(jdbcUrl, fpRetryable, "FAILED_RETRYABLE");
        insertDocumentRecordWithStatus(jdbcUrl, fpFinal, "FAILED_FINAL");

        new SqliteSchemaInitializationAdapter(jdbcUrl).initializeSchema();

        assertThat(readOverallStatus(jdbcUrl, fpRetryable)).isEqualTo("FAILED_RETRYABLE");
        assertThat(readOverallStatus(jdbcUrl, fpFinal)).isEqualTo("FAILED_FINAL");
    }

    // -------------------------------------------------------------------------
    // Error handling
    // -------------------------------------------------------------------------

    @Test
    void initializeSchema_throwsDocumentPersistenceException_onInvalidUrl() {
        // SQLite is lenient with paths; use a truly invalid JDBC URL format
        SqliteSchemaInitializationAdapter badAdapter =
                new SqliteSchemaInitializationAdapter("not-a-jdbc-url-at-all");

        assertThatThrownBy(badAdapter::initializeSchema)
                .isInstanceOf(DocumentPersistenceException.class);
    }

    // -------------------------------------------------------------------------
    // Helpers
    // -------------------------------------------------------------------------

    private static String jdbcUrl(Path dir, String filename) {
        return "jdbc:sqlite:" + dir.resolve(filename).toAbsolutePath();
    }

    private static Set<String> readTableNames(String jdbcUrl) throws SQLException {
        Set<String> tables = new HashSet<>();
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            DatabaseMetaData meta = conn.getMetaData();
            try (ResultSet rs = meta.getTables(null, null, "%", new String[]{"TABLE"})) {
                while (rs.next()) {
                    tables.add(rs.getString("TABLE_NAME").toLowerCase());
                }
            }
        }
        return tables;
    }

    private static Set<String> readColumnNames(String jdbcUrl, String tableName) throws SQLException {
        Set<String> columns = new HashSet<>();
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            DatabaseMetaData meta = conn.getMetaData();
            try (ResultSet rs = meta.getColumns(null, null, tableName, "%")) {
                while (rs.next()) {
                    columns.add(rs.getString("COLUMN_NAME").toLowerCase());
                }
            }
        }
        return columns;
    }

    private static void insertDocumentRecordWithStatus(String jdbcUrl, String fingerprint,
            String status) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             var ps = conn.prepareStatement("""
                     INSERT INTO document_record
                         (fingerprint, last_known_source_locator, last_known_source_file_name,
                          overall_status, created_at, updated_at)
                     VALUES (?, '/src', 'doc.pdf', ?, '2026-01-01T00:00:00Z', '2026-01-01T00:00:00Z')
                     """)) {
            ps.setString(1, fingerprint);
            ps.setString(2, status);
            ps.executeUpdate();
        }
    }

    private static void insertAttemptWithStatus(String jdbcUrl, String fingerprint,
            String status) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             var ps = conn.prepareStatement("""
                     INSERT INTO processing_attempt
                         (fingerprint, run_id, attempt_number, started_at, ended_at, status, retryable)
                     VALUES (?, 'run-1', 1, '2026-01-01T00:00:00Z', '2026-01-01T00:01:00Z', ?, 0)
                     """)) {
            ps.setString(1, fingerprint);
            ps.setString(2, status);
            ps.executeUpdate();
        }
    }

    private static String readOverallStatus(String jdbcUrl, String fingerprint) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             var ps = conn.prepareStatement(
                     "SELECT overall_status FROM document_record WHERE fingerprint = ?")) {
            ps.setString(1, fingerprint);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    return rs.getString("overall_status");
                }
                throw new IllegalStateException("No document record found for fingerprint: " + fingerprint);
            }
        }
    }
}
@@ -0,0 +1,234 @@
package de.gecheckt.pdf.umbenenner.adapter.out.sqlite;

import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertSame;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.nio.file.Path;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.port.out.DocumentPersistenceException;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentRecord;
import de.gecheckt.pdf.umbenenner.application.port.out.DocumentUnknown;
import de.gecheckt.pdf.umbenenner.application.port.out.FailureCounters;
import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

/**
 * Unit tests for {@link SqliteUnitOfWorkAdapter}.
 * <p>
 * Verifies transactional semantics: successful commits, rollback on first-write failure,
 * rollback on second-write failure, and proper handling of DocumentPersistenceException.
 */
class SqliteUnitOfWorkAdapterTest {

    @TempDir
    Path tempDir;

    private String jdbcUrl;
    private SqliteUnitOfWorkAdapter unitOfWorkAdapter;
    private SqliteSchemaInitializationAdapter schemaAdapter;

    @BeforeEach
    void setUp() throws Exception {
        Path dbFile = tempDir.resolve("test.db");
        jdbcUrl = "jdbc:sqlite:" + dbFile.toAbsolutePath().toString().replace('\\', '/');

        // Initialize the schema
        schemaAdapter = new SqliteSchemaInitializationAdapter(jdbcUrl);
        schemaAdapter.initializeSchema();

        // Create the unit-of-work adapter under test
        unitOfWorkAdapter = new SqliteUnitOfWorkAdapter(jdbcUrl);
    }

    /**
     * Verifies that a DocumentPersistenceException is re-thrown as-is,
     * without double-wrapping.
     */
    @Test
    void executeInTransaction_reThrowsDocumentPersistenceExceptionAsIs() {
        DocumentPersistenceException originalException =
                new DocumentPersistenceException("Original error");

        DocumentPersistenceException thrownException = assertThrows(
                DocumentPersistenceException.class,
                () -> {
                    unitOfWorkAdapter.executeInTransaction(txOps -> {
                        throw originalException;
                    });
                }
        );

        assertSame(originalException, thrownException,
                "DocumentPersistenceException should be re-thrown as-is, not wrapped");
    }

    /**
     * Verifies that a rollback occurs when the first write fails.
     * The transaction must not commit any data.
     */
    @Test
    void executeInTransaction_rollsBackWhenFirstWriteFails() {
        DocumentFingerprint fingerprint = new DocumentFingerprint("2222222222222222222222222222222222222222222222222222222222222222");

        // Create a repository for verification
        SqliteDocumentRecordRepositoryAdapter docRepository =
                new SqliteDocumentRecordRepositoryAdapter(jdbcUrl);

        assertThrows(DocumentPersistenceException.class, () -> {
            unitOfWorkAdapter.executeInTransaction(txOps -> {
                // First write: throw directly (simulates a write failure)
                throw new DocumentPersistenceException("Simulated first write failure");
            });
        });

        // Verify no records were persisted (rollback occurred)
        var lookupResult = docRepository.findByFingerprint(fingerprint);
        assertTrue(lookupResult instanceof DocumentUnknown,
                "No DocumentRecord should be persisted after rollback");
    }

    /**
     * Verifies that a rollback occurs when the second write fails after the first write succeeds.
     * The transaction must not commit the first write without the second.
     */
    @Test
    void executeInTransaction_rollsBackWhenSecondWriteFails() {
        DocumentFingerprint fingerprint = new DocumentFingerprint("3333333333333333333333333333333333333333333333333333333333333333");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        DocumentRecord record = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/test.pdf"),
                "test.pdf",
                ProcessingStatus.PROCESSING,
                FailureCounters.zero(),
                null,
                null,
                now,
                now,
                null,
                null
        );

        // Create a repository for verification
        SqliteDocumentRecordRepositoryAdapter docRepository =
                new SqliteDocumentRecordRepositoryAdapter(jdbcUrl);

        assertThrows(DocumentPersistenceException.class, () -> {
            unitOfWorkAdapter.executeInTransaction(txOps -> {
                // First write: succeeds
                txOps.createDocumentRecord(record);

                // Second write: fails by throwing DocumentPersistenceException
                throw new DocumentPersistenceException("Simulated write failure");
            });
        });

        // Verify no records were persisted due to the rollback
        var lookupResult = docRepository.findByFingerprint(fingerprint);
        assertTrue(lookupResult instanceof DocumentUnknown,
                "DocumentRecord should be rolled back when the second write fails");
    }

    /**
     * Verifies that a rollback occurs when an arbitrary RuntimeException is thrown.
     */
    @Test
    void executeInTransaction_rollsBackOnArbitraryRuntimeException() {
        DocumentFingerprint fingerprint = new DocumentFingerprint("4444444444444444444444444444444444444444444444444444444444444444");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        DocumentRecord record = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/test.pdf"),
                "test.pdf",
                ProcessingStatus.PROCESSING,
                FailureCounters.zero(),
                null,
                null,
                now,
                now,
                null,
                null
        );

        RuntimeException customException = new RuntimeException("Custom runtime error");

        // Create a repository for verification
        SqliteDocumentRecordRepositoryAdapter docRepository =
                new SqliteDocumentRecordRepositoryAdapter(jdbcUrl);

        DocumentPersistenceException thrownException = assertThrows(
                DocumentPersistenceException.class,
                () -> {
                    unitOfWorkAdapter.executeInTransaction(txOps -> {
                        txOps.createDocumentRecord(record);
                        throw customException;
                    });
                }
        );

        // Verify the exception was wrapped in DocumentPersistenceException
        assertSame(customException, thrownException.getCause(),
                "RuntimeException should be wrapped in DocumentPersistenceException");

        // Verify the rollback occurred
        var lookupResult = docRepository.findByFingerprint(fingerprint);
        assertTrue(lookupResult instanceof DocumentUnknown,
                "DocumentRecord should be rolled back on a runtime exception");
    }

    /**
     * Verifies that a null operations Consumer throws a NullPointerException.
     */
    @Test
    void executeInTransaction_throwsNullPointerExceptionForNullOperations() {
        assertThrows(NullPointerException.class, () -> {
            unitOfWorkAdapter.executeInTransaction(null);
        });
    }

    /**
     * Verifies that a document record written inside a successful transaction is persisted.
     * <p>
     * This confirms that the actual write operation is invoked and the transaction is
     * committed. Without an actual call to the underlying repository, the record would
     * not be retrievable after the transaction completes.
     */
    @Test
    void executeInTransaction_committedRecordIsRetrievable() {
        DocumentFingerprint fingerprint = new DocumentFingerprint(
                "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
        Instant now = Instant.now().truncatedTo(ChronoUnit.MICROS);
        DocumentRecord record = new DocumentRecord(
                fingerprint,
                new SourceDocumentLocator("/source/commit-test.pdf"),
                "commit-test.pdf",
                ProcessingStatus.PROCESSING,
                FailureCounters.zero(),
                null,
                null,
                now,
                now,
                null,
                null
        );

        SqliteDocumentRecordRepositoryAdapter docRepository =
                new SqliteDocumentRecordRepositoryAdapter(jdbcUrl);

        unitOfWorkAdapter.executeInTransaction(txOps -> txOps.createDocumentRecord(record));

        var result = docRepository.findByFingerprint(fingerprint);
        assertFalse(result instanceof DocumentUnknown,
                "Record must be persisted and retrievable after a successfully committed transaction");
    }
}
@@ -0,0 +1,229 @@
package de.gecheckt.pdf.umbenenner.adapter.out.targetcopy;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatNullPointerException;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyResult;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopySuccess;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFileCopyTechnicalFailure;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

/**
 * Tests for {@link FilesystemTargetFileCopyAdapter}.
 * <p>
 * Covers the happy path (copy via temp file and final move), source integrity,
 * technical failure cases, and cleanup after failure.
 */
class FilesystemTargetFileCopyAdapterTest {

    @TempDir
    Path sourceFolder;

    @TempDir
    Path targetFolder;

    private FilesystemTargetFileCopyAdapter adapter;

    @BeforeEach
    void setUp() {
        adapter = new FilesystemTargetFileCopyAdapter(targetFolder);
    }

    // -------------------------------------------------------------------------
    // Happy path – successful copy
    // -------------------------------------------------------------------------

    @Test
    void copyToTarget_success_returnsTargetFileCopySuccess() throws IOException {
        Path sourceFile = createSourceFile("source.pdf", "PDF content");
        String resolvedFilename = "2026-01-15 - Rechnung.pdf";

        TargetFileCopyResult result = adapter.copyToTarget(
                new SourceDocumentLocator(sourceFile.toAbsolutePath().toString()),
                resolvedFilename);

        assertThat(result).isInstanceOf(TargetFileCopySuccess.class);
    }

    @Test
    void copyToTarget_success_targetFileCreatedWithCorrectContent() throws IOException {
        byte[] content = "PDF content bytes".getBytes();
        Path sourceFile = sourceFolder.resolve("invoice.pdf");
        Files.write(sourceFile, content);
        String resolvedFilename = "2026-01-15 - Rechnung.pdf";

        adapter.copyToTarget(
                new SourceDocumentLocator(sourceFile.toAbsolutePath().toString()),
                resolvedFilename);

        Path targetFile = targetFolder.resolve(resolvedFilename);
        assertThat(targetFile).exists();
        assertThat(Files.readAllBytes(targetFile)).isEqualTo(content);
    }

    @Test
    void copyToTarget_success_sourceFileRemainsUnchanged() throws IOException {
        byte[] originalContent = "original PDF content".getBytes();
        Path sourceFile = sourceFolder.resolve("source.pdf");
        Files.write(sourceFile, originalContent);
        String resolvedFilename = "2026-01-15 - Rechnung.pdf";

        adapter.copyToTarget(
                new SourceDocumentLocator(sourceFile.toAbsolutePath().toString()),
                resolvedFilename);

        // The source must remain completely unchanged
        assertThat(Files.readAllBytes(sourceFile)).isEqualTo(originalContent);
        assertThat(sourceFile).exists();
    }

    @Test
    void copyToTarget_success_noTempFileRemainsInTargetFolder() throws IOException {
        Path sourceFile = createSourceFile("source.pdf", "content");
        String resolvedFilename = "2026-04-07 - Bescheid.pdf";

        adapter.copyToTarget(
                new SourceDocumentLocator(sourceFile.toAbsolutePath().toString()),
                resolvedFilename);

        // The .tmp file must not remain after a successful copy
        Path tempFile = targetFolder.resolve(resolvedFilename + ".tmp");
        assertThat(tempFile).doesNotExist();
    }

    @Test
    void copyToTarget_success_finalFileNameIsResolved() throws IOException {
        Path sourceFile = createSourceFile("source.pdf", "data");
        String resolvedFilename = "2026-03-05 - Kontoauszug.pdf";

        adapter.copyToTarget(
                new SourceDocumentLocator(sourceFile.toAbsolutePath().toString()),
                resolvedFilename);

        assertThat(targetFolder.resolve(resolvedFilename)).exists();
    }

    // -------------------------------------------------------------------------
    // Technical failure – source file does not exist
    // -------------------------------------------------------------------------

    @Test
    void copyToTarget_sourceDoesNotExist_returnsTargetFileCopyTechnicalFailure() {
        String nonExistentSource = sourceFolder.resolve("nonexistent.pdf").toAbsolutePath().toString();

        TargetFileCopyResult result = adapter.copyToTarget(
                new SourceDocumentLocator(nonExistentSource),
                "2026-01-01 - Rechnung.pdf");

        assertThat(result).isInstanceOf(TargetFileCopyTechnicalFailure.class);
    }

    @Test
    void copyToTarget_sourceDoesNotExist_failureContainsSourcePath() {
        String nonExistentSource = sourceFolder.resolve("nonexistent.pdf").toAbsolutePath().toString();

        TargetFileCopyResult result = adapter.copyToTarget(
                new SourceDocumentLocator(nonExistentSource),
                "2026-01-01 - Rechnung.pdf");

        assertThat(result).isInstanceOf(TargetFileCopyTechnicalFailure.class);
        TargetFileCopyTechnicalFailure failure = (TargetFileCopyTechnicalFailure) result;
        assertThat(failure.errorMessage()).contains(nonExistentSource);
    }

    // -------------------------------------------------------------------------
    // Technical failure – target folder does not exist
    // -------------------------------------------------------------------------

    @Test
    void copyToTarget_targetFolderDoesNotExist_returnsTargetFileCopyTechnicalFailure()
            throws IOException {
        Path sourceFile = createSourceFile("source.pdf", "content");
        Path nonExistentTargetFolder = targetFolder.resolve("nonexistent-subfolder");
        FilesystemTargetFileCopyAdapter adapterWithMissingFolder =
                new FilesystemTargetFileCopyAdapter(nonExistentTargetFolder);

        TargetFileCopyResult result = adapterWithMissingFolder.copyToTarget(
                new SourceDocumentLocator(sourceFile.toAbsolutePath().toString()),
                "2026-01-01 - Rechnung.pdf");

        assertThat(result).isInstanceOf(TargetFileCopyTechnicalFailure.class);
    }

    // -------------------------------------------------------------------------
    // Cleanup after failure – no temp file left
    // -------------------------------------------------------------------------

    @Test
    void copyToTarget_sourceDoesNotExist_noTempFileLeftInTargetFolder() {
        String nonExistentSource = sourceFolder.resolve("missing.pdf").toAbsolutePath().toString();
        String resolvedFilename = "2026-01-01 - Test.pdf";

        adapter.copyToTarget(
                new SourceDocumentLocator(nonExistentSource),
                resolvedFilename);

        // Even though the copy failed, no temp file should remain
        Path tempFile = targetFolder.resolve(resolvedFilename + ".tmp");
        assertThat(tempFile).doesNotExist();
    }
|
||||
// -------------------------------------------------------------------------
|
||||
// TargetFileCopyTechnicalFailure semantics
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
@Test
|
||||
void copyToTarget_failure_messageIsNonNull() {
|
||||
String nonExistentSource = sourceFolder.resolve("ghost.pdf").toAbsolutePath().toString();
|
||||
|
||||
TargetFileCopyTechnicalFailure failure = (TargetFileCopyTechnicalFailure)
|
||||
adapter.copyToTarget(
|
||||
new SourceDocumentLocator(nonExistentSource),
|
||||
"2026-01-01 - Test.pdf");
|
||||
|
||||
assertThat(failure.errorMessage()).isNotNull();
|
||||
}
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// Null guards
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
@Test
|
||||
void copyToTarget_rejectsNullSourceLocator() throws IOException {
|
||||
assertThatNullPointerException()
|
||||
.isThrownBy(() -> adapter.copyToTarget(null, "2026-01-01 - Test.pdf"));
|
||||
}
|
||||
|
||||
@Test
|
||||
void copyToTarget_rejectsNullResolvedFilename() throws IOException {
|
||||
Path sourceFile = createSourceFile("source.pdf", "content");
|
||||
assertThatNullPointerException()
|
||||
.isThrownBy(() -> adapter.copyToTarget(
|
||||
new SourceDocumentLocator(sourceFile.toAbsolutePath().toString()),
|
||||
null));
|
||||
}
|
||||
|
||||
@Test
|
||||
void constructor_rejectsNullTargetFolderPath() {
|
||||
assertThatNullPointerException()
|
||||
.isThrownBy(() -> new FilesystemTargetFileCopyAdapter(null));
|
||||
}
|
||||
|
||||
// -------------------------------------------------------------------------
|
||||
// Helpers
|
||||
// -------------------------------------------------------------------------
|
||||
|
||||
private Path createSourceFile(String filename, String content) throws IOException {
|
||||
Path file = sourceFolder.resolve(filename);
|
||||
Files.writeString(file, content);
|
||||
return file;
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,259 @@
package de.gecheckt.pdf.umbenenner.adapter.out.targetfolder;

import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatNullPointerException;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.gecheckt.pdf.umbenenner.application.port.out.ResolvedTargetFilename;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFilenameResolutionResult;
import de.gecheckt.pdf.umbenenner.application.port.out.TargetFolderTechnicalFailure;

/**
 * Tests for {@link FilesystemTargetFolderAdapter}.
 * <p>
 * Covers duplicate resolution (no conflict, single conflict, multiple conflicts),
 * suffix placement, rollback deletion, and error handling.
 */
class FilesystemTargetFolderAdapterTest {

    @TempDir
    Path targetFolder;

    private FilesystemTargetFolderAdapter adapter;

    @BeforeEach
    void setUp() {
        adapter = new FilesystemTargetFolderAdapter(targetFolder);
    }

    // -------------------------------------------------------------------------
    // getTargetFolderLocator
    // -------------------------------------------------------------------------

    @Test
    void getTargetFolderLocator_returnsAbsolutePath() {
        String locator = adapter.getTargetFolderLocator();

        assertThat(locator).isEqualTo(targetFolder.toAbsolutePath().toString());
    }

    @Test
    void getTargetFolderLocator_isNeverNullOrBlank() {
        assertThat(adapter.getTargetFolderLocator()).isNotNull().isNotBlank();
    }

    // -------------------------------------------------------------------------
    // resolveUniqueFilename – no conflict
    // -------------------------------------------------------------------------

    @Test
    void resolveUniqueFilename_noConflict_returnsBaseName() {
        String baseName = "2026-01-15 - Rechnung.pdf";

        TargetFilenameResolutionResult result = adapter.resolveUniqueFilename(baseName);

        assertThat(result).isInstanceOf(ResolvedTargetFilename.class);
        assertThat(((ResolvedTargetFilename) result).resolvedFilename()).isEqualTo(baseName);
    }

    // -------------------------------------------------------------------------
    // resolveUniqueFilename – collision with base name
    // -------------------------------------------------------------------------

    @Test
    void resolveUniqueFilename_baseNameTaken_returnsSuffixOne() throws IOException {
        String baseName = "2026-01-15 - Rechnung.pdf";
        Files.createFile(targetFolder.resolve(baseName));

        TargetFilenameResolutionResult result = adapter.resolveUniqueFilename(baseName);

        assertThat(result).isInstanceOf(ResolvedTargetFilename.class);
        assertThat(((ResolvedTargetFilename) result).resolvedFilename())
                .isEqualTo("2026-01-15 - Rechnung(1).pdf");
    }

    @Test
    void resolveUniqueFilename_baseAndOneTaken_returnsSuffixTwo() throws IOException {
        String baseName = "2026-01-15 - Rechnung.pdf";
        Files.createFile(targetFolder.resolve(baseName));
        Files.createFile(targetFolder.resolve("2026-01-15 - Rechnung(1).pdf"));

        TargetFilenameResolutionResult result = adapter.resolveUniqueFilename(baseName);

        assertThat(result).isInstanceOf(ResolvedTargetFilename.class);
        assertThat(((ResolvedTargetFilename) result).resolvedFilename())
                .isEqualTo("2026-01-15 - Rechnung(2).pdf");
    }

    @Test
    void resolveUniqueFilename_multipleTaken_returnsFirstFree() throws IOException {
        String baseName = "2026-03-31 - Stromabrechnung.pdf";
        // Create base + (1), (2), (3)
        Files.createFile(targetFolder.resolve(baseName));
        Files.createFile(targetFolder.resolve("2026-03-31 - Stromabrechnung(1).pdf"));
        Files.createFile(targetFolder.resolve("2026-03-31 - Stromabrechnung(2).pdf"));
        Files.createFile(targetFolder.resolve("2026-03-31 - Stromabrechnung(3).pdf"));

        TargetFilenameResolutionResult result = adapter.resolveUniqueFilename(baseName);

        assertThat(result).isInstanceOf(ResolvedTargetFilename.class);
        assertThat(((ResolvedTargetFilename) result).resolvedFilename())
                .isEqualTo("2026-03-31 - Stromabrechnung(4).pdf");
    }

    // -------------------------------------------------------------------------
    // Suffix placement: immediately before .pdf
    // -------------------------------------------------------------------------

    @Test
    void resolveUniqueFilename_suffixPlacedImmediatelyBeforePdf() throws IOException {
        String baseName = "2026-04-07 - Bescheid.pdf";
        Files.createFile(targetFolder.resolve(baseName));

        TargetFilenameResolutionResult result = adapter.resolveUniqueFilename(baseName);

        assertThat(result).isInstanceOf(ResolvedTargetFilename.class);
        String resolved = ((ResolvedTargetFilename) result).resolvedFilename();
        // Must end with "(1).pdf", not ".pdf(1)"
        assertThat(resolved).endsWith("(1).pdf");
        assertThat(resolved).doesNotContain(".pdf(");
    }
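The duplicate-resolution contract exercised by these tests — return the base name when it is free, otherwise insert `(n)` immediately before the `.pdf` extension, counting up from 1 until an unused name is found — can be sketched as follows. This is an illustrative re-implementation, not the adapter's actual code; the helper name `resolveUniqueName` is hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch of the duplicate-resolution rule the tests above expect.
// NOTE: resolveUniqueName is a hypothetical helper, not the production adapter.
class SuffixSketch {

    /** Returns baseName if free, otherwise the first free "stem(n).pdf" with n = 1, 2, ... */
    static String resolveUniqueName(Path targetFolder, String baseName) {
        if (!Files.exists(targetFolder.resolve(baseName))) {
            return baseName; // no conflict: base name is used as-is
        }
        // The suffix goes immediately before the ".pdf" extension.
        int dot = baseName.lastIndexOf(".pdf");
        if (dot < 0) {
            throw new IllegalArgumentException("no .pdf extension: " + baseName);
        }
        String stem = baseName.substring(0, dot);
        for (int n = 1; ; n++) {
            String candidate = stem + "(" + n + ").pdf";
            if (!Files.exists(targetFolder.resolve(candidate))) {
                return candidate;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("suffix-sketch");
        String base = "2026-01-15 - Rechnung.pdf";
        System.out.println(resolveUniqueName(dir, base)); // base name: no conflict yet
        Files.createFile(dir.resolve(base));
        System.out.println(resolveUniqueName(dir, base)); // "(1)" inserted before .pdf
    }
}
```

Note that in the production adapter a missing `.pdf` extension yields a `TargetFolderTechnicalFailure` result rather than an exception; the sketch throws only to keep the example self-contained.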

    // -------------------------------------------------------------------------
    // Suffix does not count against 20-char base title
    // -------------------------------------------------------------------------

    @Test
    void resolveUniqueFilename_20CharTitle_suffixDoesNotViolateTitleLimit() throws IOException {
        // Base title has exactly 20 chars; with (1) suffix the title exceeds 20, but that is expected
        String title = "A".repeat(20); // 20-char title
        String baseName = "2026-01-01 - " + title + ".pdf";
        Files.createFile(targetFolder.resolve(baseName));

        TargetFilenameResolutionResult result = adapter.resolveUniqueFilename(baseName);

        assertThat(result).isInstanceOf(ResolvedTargetFilename.class);
        String resolved = ((ResolvedTargetFilename) result).resolvedFilename();
        // The resolved filename must contain (1) even though overall length > 20 chars
        assertThat(resolved).contains("(1)");
    }

    // -------------------------------------------------------------------------
    // resolveUniqueFilename – base name without .pdf extension
    // -------------------------------------------------------------------------

    @Test
    void resolveUniqueFilename_baseNameWithoutPdfExtension_whenConflict_returnsFailure()
            throws IOException {
        // When there is no conflict (file does not exist), the adapter returns the name as-is,
        // because it only checks the extension when it needs to insert a suffix.
        String nameWithoutExt = "2026-01-15 - Rechnung";

        // Create a file with that name (no extension) to trigger conflict handling
        Files.createFile(targetFolder.resolve(nameWithoutExt));

        TargetFilenameResolutionResult result = adapter.resolveUniqueFilename(nameWithoutExt);

        // Without .pdf extension, suffix insertion fails
        assertThat(result).isInstanceOf(TargetFolderTechnicalFailure.class);
    }

    // -------------------------------------------------------------------------
    // resolveUniqueFilename – no conflict, name without .pdf (edge: no conflict → ok)
    // -------------------------------------------------------------------------

    @Test
    void resolveUniqueFilename_baseNameWithoutPdfExtension_whenNoConflict_returnsIt() {
        // If the name does not exist, the adapter returns it without checking the extension
        String nameWithoutExt = "2026-01-15 - Rechnung";

        TargetFilenameResolutionResult result = adapter.resolveUniqueFilename(nameWithoutExt);

        assertThat(result).isInstanceOf(ResolvedTargetFilename.class);
        assertThat(((ResolvedTargetFilename) result).resolvedFilename()).isEqualTo(nameWithoutExt);
    }

    // -------------------------------------------------------------------------
    // resolveUniqueFilename – null guard
    // -------------------------------------------------------------------------

    @Test
    void resolveUniqueFilename_rejectsNullBaseName() {
        assertThatNullPointerException()
                .isThrownBy(() -> adapter.resolveUniqueFilename(null));
    }

    // -------------------------------------------------------------------------
    // tryDeleteTargetFile – file exists, gets deleted
    // -------------------------------------------------------------------------

    @Test
    void tryDeleteTargetFile_fileExists_deletesFile() throws IOException {
        String filename = "2026-01-15 - Rechnung.pdf";
        Files.createFile(targetFolder.resolve(filename));
        assertThat(targetFolder.resolve(filename)).exists();

        adapter.tryDeleteTargetFile(filename);

        assertThat(targetFolder.resolve(filename)).doesNotExist();
    }

    // -------------------------------------------------------------------------
    // tryDeleteTargetFile – file does not exist, no error
    // -------------------------------------------------------------------------

    @Test
    void tryDeleteTargetFile_fileDoesNotExist_doesNotThrow() {
        // Must not throw even if the file is absent
        adapter.tryDeleteTargetFile("nonexistent.pdf");
    }

    // -------------------------------------------------------------------------
    // tryDeleteTargetFile – null guard
    // -------------------------------------------------------------------------

    @Test
    void tryDeleteTargetFile_rejectsNullFilename() {
        assertThatNullPointerException()
                .isThrownBy(() -> adapter.tryDeleteTargetFile(null));
    }

    // -------------------------------------------------------------------------
    // resolveUniqueFilename – non-existent target folder
    // -------------------------------------------------------------------------

    @Test
    void resolveUniqueFilename_nonExistentTargetFolder_returnsBaseName() {
        Path nonExistentFolder = targetFolder.resolve("does-not-exist");
        FilesystemTargetFolderAdapter adapterWithMissingFolder =
                new FilesystemTargetFolderAdapter(nonExistentFolder);

        String baseName = "2026-01-01 - Test.pdf";

        // Files.exists() on a file in a non-existent folder does not throw;
        // it simply returns false, so the adapter returns the base name.
        // This is consistent behaviour: no folder access error when just checking existence.
        TargetFilenameResolutionResult result = adapterWithMissingFolder.resolveUniqueFilename(baseName);

        // Adapter returns the base name since no conflict is detected for a non-existent folder
        assertThat(result).isInstanceOf(ResolvedTargetFilename.class);
        assertThat(((ResolvedTargetFilename) result).resolvedFilename()).isEqualTo(baseName);
    }

    // -------------------------------------------------------------------------
    // Construction – null guard
    // -------------------------------------------------------------------------

    @Test
    void constructor_rejectsNullTargetFolderPath() {
        assertThatNullPointerException()
                .isThrownBy(() -> new FilesystemTargetFolderAdapter(null));
    }
}
@@ -1,11 +1,12 @@
source.folder=/tmp/source
target.folder=/tmp/target
# sqlite.file is missing
api.baseUrl=https://api.example.com
api.model=gpt-4
api.timeoutSeconds=30
ai.provider.active=openai-compatible
ai.provider.openai-compatible.baseUrl=https://api.example.com
ai.provider.openai-compatible.model=gpt-4
ai.provider.openai-compatible.timeoutSeconds=30
ai.provider.openai-compatible.apiKey=test-api-key
max.retries.transient=3
max.pages=100
max.text.characters=50000
prompt.template.file=/tmp/prompt.txt
api.key=test-api-key
@@ -1,10 +1,11 @@
source.folder=/tmp/source
target.folder=/tmp/target
sqlite.file=/tmp/db.sqlite
api.baseUrl=https://api.example.com
api.model=gpt-4
api.timeoutSeconds=30
ai.provider.active=openai-compatible
ai.provider.openai-compatible.baseUrl=https://api.example.com
ai.provider.openai-compatible.model=gpt-4
ai.provider.openai-compatible.timeoutSeconds=30
max.retries.transient=3
max.pages=100
max.text.characters=50000
prompt.template.file=/tmp/prompt.txt
@@ -1,9 +1,11 @@
source.folder=/tmp/source
target.folder=/tmp/target
sqlite.file=/tmp/db.sqlite
api.baseUrl=https://api.example.com
api.model=gpt-4
api.timeoutSeconds=30
ai.provider.active=openai-compatible
ai.provider.openai-compatible.baseUrl=https://api.example.com
ai.provider.openai-compatible.model=gpt-4
ai.provider.openai-compatible.timeoutSeconds=30
ai.provider.openai-compatible.apiKey=test-api-key-from-properties
max.retries.transient=3
max.pages=100
max.text.characters=50000
@@ -11,4 +13,3 @@ prompt.template.file=/tmp/prompt.txt
runtime.lock.file=/tmp/lock.lock
log.directory=/tmp/logs
log.level=DEBUG
api.key=test-api-key-from-properties
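The flat key/value fixtures above are parsed into typed values at startup. A minimal loading sketch, assuming `java.util.Properties` and reusing the key names from the fixtures (the loader itself and the `TypedConfig` record are hypothetical, shown only to illustrate the string-to-type conversion):

```java
import java.io.StringReader;
import java.net.URI;
import java.nio.file.Path;
import java.util.Properties;

// Minimal sketch of turning the flat key/value fixture into typed values.
// The keys mirror the test fixtures above; TypedConfig and load() are hypothetical.
class PropertiesLoadingSketch {

    record TypedConfig(Path sourceFolder, URI apiBaseUrl, int maxPages) { }

    static TypedConfig load(Properties props) {
        return new TypedConfig(
                Path.of(props.getProperty("source.folder")),
                URI.create(props.getProperty("ai.provider.openai-compatible.baseUrl")),
                Integer.parseInt(props.getProperty("max.pages")));
    }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.load(new StringReader(
                "source.folder=/tmp/source\n"
                + "ai.provider.openai-compatible.baseUrl=https://api.example.com\n"
                + "max.pages=100\n"));
        TypedConfig config = load(props);
        System.out.println(config.maxPages()); // 100
    }
}
```

A loader like this fails fast (with `NullPointerException` or `NumberFormatException`) on a missing or malformed key, which is why the fixtures deliberately omit keys such as `sqlite.file` to exercise the validation path.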
@@ -19,10 +19,10 @@
      <version>${project.version}</version>
    </dependency>

    <!-- Logging -->
    <!-- JSON parsing for AI response parsing -->
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-api</artifactId>
      <groupId>org.json</groupId>
      <artifactId>json</artifactId>
    </dependency>

    <!-- Test dependencies -->
@@ -41,5 +41,48 @@
      <artifactId>mockito-junit-jupiter</artifactId>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.assertj</groupId>
      <artifactId>assertj-core</artifactId>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.jacoco</groupId>
        <artifactId>jacoco-maven-plugin</artifactId>
        <executions>
          <execution>
            <id>jacoco-check</id>
            <phase>verify</phase>
            <goals>
              <goal>check</goal>
            </goals>
            <configuration>
              <!-- Application module: use cases and orchestration, tight coverage -->
              <rules>
                <rule>
                  <element>BUNDLE</element>
                  <limits>
                    <limit>
                      <counter>LINE</counter>
                      <value>COVEREDRATIO</value>
                      <minimum>0.70</minimum>
                    </limit>
                    <limit>
                      <counter>BRANCH</counter>
                      <value>COVEREDRATIO</value>
                      <minimum>0.60</minimum>
                    </limit>
                  </limits>
                </rule>
              </rules>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
@@ -0,0 +1,69 @@
package de.gecheckt.pdf.umbenenner.application.config;

import de.gecheckt.pdf.umbenenner.application.port.out.AiContentSensitivity;

/**
 * Minimal runtime configuration for the application layer.
 * <p>
 * Contains only the application-level runtime parameters that are needed during
 * batch document processing. Technical infrastructure configuration (paths, API keys,
 * persistence parameters, etc.) is kept in the bootstrap layer.
 * <p>
 * This intentionally small contract ensures the application layer depends only on
 * the configuration values it actually uses, following hexagonal architecture principles.
 *
 * <h2>Validation invariants</h2>
 * <ul>
 *   <li>{@link #maxPages()} must be ≥ 1.</li>
 *   <li>{@link #maxRetriesTransient()} must be ≥ 1. The value {@code 0} is an invalid
 *       start configuration and must prevent the batch run from starting, with exit
 *       code 1.</li>
 *   <li>{@link #aiContentSensitivity()} must not be {@code null}. The safe default is
 *       {@link AiContentSensitivity#PROTECT_SENSITIVE_CONTENT}.</li>
 * </ul>
 *
 * <h2>AI content sensitivity</h2>
 * <p>
 * The {@link #aiContentSensitivity()} field is derived from the {@code log.ai.sensitive}
 * configuration property (default: {@code false}). It governs whether the complete AI raw
 * response and the complete AI {@code reasoning} may be written to log files. Sensitive AI
 * content is always persisted in SQLite regardless of this setting; only log output is
 * affected.
 * <p>
 * The safe default ({@link AiContentSensitivity#PROTECT_SENSITIVE_CONTENT}) must be used
 * whenever {@code log.ai.sensitive} is absent, {@code false}, or set to any value other
 * than the explicit opt-in.
 */
public record RuntimeConfiguration(
        /**
         * Maximum number of pages a document can have to be processed.
         * Documents exceeding this limit are rejected during pre-checks.
         */
        int maxPages,

        /**
         * Maximum number of historised transient technical errors allowed per fingerprint
         * across all scheduler runs.
         * <p>
         * The attempt that causes the counter to reach this value finalises the document
         * to {@code FAILED_FINAL}. Must be an integer ≥ 1; the value {@code 0} is an
         * invalid start configuration.
         * <p>
         * Example: {@code maxRetriesTransient = 1} means the first transient error
         * immediately finalises the document.
         */
        int maxRetriesTransient,

        /**
         * Sensitivity decision governing whether AI-generated content may be written to log files.
         * <p>
         * Derived from the {@code log.ai.sensitive} configuration property. The default is
         * {@link AiContentSensitivity#PROTECT_SENSITIVE_CONTENT} (do not log sensitive content).
         * {@link AiContentSensitivity#LOG_SENSITIVE_CONTENT} is produced only when
         * {@code log.ai.sensitive = true} is explicitly set.
         * <p>
         * Must not be {@code null}.
         */
        AiContentSensitivity aiContentSensitivity
) { }
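The `log.ai.sensitive` mapping described in the Javadoc — only the explicit opt-in `true` enables logging, everything else falls back to the safe default — can be sketched as follows. The enum here is a stand-in for the real `AiContentSensitivity` in `de.gecheckt.pdf.umbenenner.application.port.out`, and the `fromProperties` helper is hypothetical.

```java
import java.util.Properties;

// Illustrative sketch of the log.ai.sensitive mapping described above.
// The nested enum is a stand-in; the real AiContentSensitivity lives in
// de.gecheckt.pdf.umbenenner.application.port.out.
class SensitivityMappingSketch {

    enum AiContentSensitivity { PROTECT_SENSITIVE_CONTENT, LOG_SENSITIVE_CONTENT }

    /** Only the explicit opt-in "true" enables logging; everything else is the safe default. */
    static AiContentSensitivity fromProperties(Properties props) {
        String raw = props.getProperty("log.ai.sensitive");
        if ("true".equals(raw)) {
            return AiContentSensitivity.LOG_SENSITIVE_CONTENT;
        }
        // absent, "false", or any other value -> safe default
        return AiContentSensitivity.PROTECT_SENSITIVE_CONTENT;
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        System.out.println(fromProperties(p));  // PROTECT_SENSITIVE_CONTENT (key absent)
        p.setProperty("log.ai.sensitive", "yes");
        System.out.println(fromProperties(p));  // PROTECT_SENSITIVE_CONTENT (not the explicit opt-in)
        p.setProperty("log.ai.sensitive", "true");
        System.out.println(fromProperties(p));  // LOG_SENSITIVE_CONTENT
    }
}
```

Treating any value other than the literal `true` as the protective default keeps a typo in the configuration from silently enabling sensitive-content logging.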
@@ -1,26 +0,0 @@
package de.gecheckt.pdf.umbenenner.application.config;

import java.net.URI;
import java.nio.file.Path;

/**
 * Typed immutable configuration model for PDF Umbenenner startup parameters.
 * AP-005: Represents all M1-relevant configuration properties with strong typing.
 */
public record StartConfiguration(
        Path sourceFolder,
        Path targetFolder,
        Path sqliteFile,
        URI apiBaseUrl,
        String apiModel,
        int apiTimeoutSeconds,
        int maxRetriesTransient,
        int maxPages,
        int maxTextCharacters,
        Path promptTemplateFile,
        Path runtimeLockFile,
        Path logDirectory,
        String logLevel,
        String apiKey
) { }
@@ -1,268 +0,0 @@
package de.gecheckt.pdf.umbenenner.application.config;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/**
 * Validates {@link StartConfiguration} before processing can begin.
 * <p>
 * Performs mandatory field checks, numeric range validation, URI scheme validation,
 * and basic path existence checks. Throws {@link InvalidStartConfigurationException}
 * if any validation rule fails.
 * <p>
 * M3/AP-007: Supports injected source folder validation for testability
 * (allows mocking of platform-dependent filesystem checks).
 */
public class StartConfigurationValidator {

    private static final Logger LOG = LogManager.getLogger(StartConfigurationValidator.class);

    /**
     * Abstraction for source folder existence, type, and readability checks.
     * <p>
     * Separates filesystem operations from validation logic to enable
     * platform-independent unit testing (mocking) of readability edge cases.
     * <p>
     * Implementation note: The default implementation uses {@code java.nio.file.Files}
     * static methods directly; tests can substitute alternative implementations.
     */
    @FunctionalInterface
    public interface SourceFolderChecker {
        /**
         * Checks the source folder and returns a validation error message, or null if valid.
         * <p>
         * Checks (in order):
         * 1. Folder exists
         * 2. Is a directory
         * 3. Is readable
         *
         * @param path the source folder path
         * @return error message string, or null if all checks pass
         */
        String checkSourceFolder(Path path);
    }

    private final SourceFolderChecker sourceFolderChecker;

    /**
     * Creates a validator with the default source folder checker (NIO-based).
     */
    public StartConfigurationValidator() {
        this(new DefaultSourceFolderChecker());
    }

    /**
     * Creates a validator with a custom source folder checker (primarily for testing).
     *
     * @param sourceFolderChecker the checker to use (must not be null)
     */
    public StartConfigurationValidator(SourceFolderChecker sourceFolderChecker) {
        this.sourceFolderChecker = sourceFolderChecker;
    }

    /**
     * Validates the given configuration.
     * <p>
     * Checks all mandatory fields, numeric constraints, URI validity, and path existence.
     * If validation fails, throws {@link InvalidStartConfigurationException} with an
     * aggregated error message listing all problems.
     *
     * @param config the configuration to validate, must not be null
     * @throws InvalidStartConfigurationException if any validation rule fails
     */
    public void validate(StartConfiguration config) {
        List<String> errors = new ArrayList<>();

        // Mandatory string/path presence checks
        validateSourceFolder(config.sourceFolder(), errors);
        validateTargetFolder(config.targetFolder(), errors);
        validateSqliteFile(config.sqliteFile(), errors);
        validateApiBaseUrl(config.apiBaseUrl(), errors);
        validateApiModel(config.apiModel(), errors);
        validatePromptTemplateFile(config.promptTemplateFile(), errors);

        // Numeric validation
        validateApiTimeoutSeconds(config.apiTimeoutSeconds(), errors);
        validateMaxRetriesTransient(config.maxRetriesTransient(), errors);
        validateMaxPages(config.maxPages(), errors);
        validateMaxTextCharacters(config.maxTextCharacters(), errors);

        // Path relationship validation
        validateSourceAndTargetNotSame(config.sourceFolder(), config.targetFolder(), errors);

        // Optional path validations (only if present)
        validateRuntimeLockFile(config.runtimeLockFile(), errors);
        validateLogDirectory(config.logDirectory(), errors);

        if (!errors.isEmpty()) {
            String errorMessage = "Invalid startup configuration:\n" + String.join("\n", errors);
            throw new InvalidStartConfigurationException(errorMessage);
        }

        LOG.info("Configuration validation successful.");
    }

    private void validateSourceFolder(Path sourceFolder, List<String> errors) {
        if (sourceFolder == null) {
            errors.add("- source.folder: must not be null");
            return;
        }
        String checkError = sourceFolderChecker.checkSourceFolder(sourceFolder);
        if (checkError != null) {
            errors.add(checkError);
        }
    }

    private void validateTargetFolder(Path targetFolder, List<String> errors) {
        if (targetFolder == null) {
            errors.add("- target.folder: must not be null");
            return;
        }
        if (!Files.exists(targetFolder)) {
            errors.add("- target.folder: path does not exist: " + targetFolder);
        } else if (!Files.isDirectory(targetFolder)) {
            errors.add("- target.folder: path is not a directory: " + targetFolder);
        }
    }

    private void validateSqliteFile(Path sqliteFile, List<String> errors) {
        if (sqliteFile == null) {
            errors.add("- sqlite.file: must not be null");
            return;
        }
        Path parent = sqliteFile.getParent();
        if (parent == null) {
            errors.add("- sqlite.file: has no parent directory: " + sqliteFile);
        } else if (!Files.exists(parent)) {
            errors.add("- sqlite.file: parent directory does not exist: " + parent);
        } else if (!Files.isDirectory(parent)) {
            errors.add("- sqlite.file: parent is not a directory: " + parent);
        }
    }

    private void validateApiBaseUrl(java.net.URI apiBaseUrl, List<String> errors) {
        if (apiBaseUrl == null) {
            errors.add("- api.baseUrl: must not be null");
            return;
        }
        if (!apiBaseUrl.isAbsolute()) {
            errors.add("- api.baseUrl: must be an absolute URI: " + apiBaseUrl);
            return;
        }
        String scheme = apiBaseUrl.getScheme();
        if (scheme == null || (!"http".equalsIgnoreCase(scheme) && !"https".equalsIgnoreCase(scheme))) {
            errors.add("- api.baseUrl: scheme must be http or https, got: " + scheme);
        }
    }

    private void validateApiModel(String apiModel, List<String> errors) {
        if (apiModel == null || apiModel.isBlank()) {
            errors.add("- api.model: must not be null or blank");
        }
    }

    private void validateApiTimeoutSeconds(int apiTimeoutSeconds, List<String> errors) {
        if (apiTimeoutSeconds <= 0) {
            errors.add("- api.timeoutSeconds: must be > 0, got: " + apiTimeoutSeconds);
        }
    }

    private void validateMaxRetriesTransient(int maxRetriesTransient, List<String> errors) {
        if (maxRetriesTransient < 0) {
            errors.add("- max.retries.transient: must be >= 0, got: " + maxRetriesTransient);
        }
    }

    private void validateMaxPages(int maxPages, List<String> errors) {
        if (maxPages <= 0) {
            errors.add("- max.pages: must be > 0, got: " + maxPages);
        }
    }

    private void validateMaxTextCharacters(int maxTextCharacters, List<String> errors) {
        if (maxTextCharacters <= 0) {
            errors.add("- max.text.characters: must be > 0, got: " + maxTextCharacters);
        }
    }

    private void validatePromptTemplateFile(Path promptTemplateFile, List<String> errors) {
        if (promptTemplateFile == null) {
            errors.add("- prompt.template.file: must not be null");
            return;
        }
        if (!Files.exists(promptTemplateFile)) {
            errors.add("- prompt.template.file: path does not exist: " + promptTemplateFile);
        } else if (!Files.isRegularFile(promptTemplateFile)) {
            errors.add("- prompt.template.file: path is not a regular file: " + promptTemplateFile);
        }
    }

    private void validateSourceAndTargetNotSame(Path sourceFolder, Path targetFolder, List<String> errors) {
        if (sourceFolder != null && targetFolder != null) {
            try {
                Path normalizedSource = sourceFolder.toRealPath();
                Path normalizedTarget = targetFolder.toRealPath();
                if (normalizedSource.equals(normalizedTarget)) {
                    errors.add("- source.folder and target.folder must not resolve to the same path: " + normalizedSource);
                }
            } catch (Exception e) {
                // If toRealPath fails (e.g., path doesn't exist), skip this check;
                // the individual existence checks will catch missing paths
            }
        }
    }

    private void validateRuntimeLockFile(Path runtimeLockFile, List<String> errors) {
        if (runtimeLockFile != null && !runtimeLockFile.toString().isBlank()) {
            Path parent = runtimeLockFile.getParent();
            if (parent != null) {
                if (!Files.exists(parent)) {
                    errors.add("- runtime.lock.file: parent directory does not exist: " + parent);
                } else if (!Files.isDirectory(parent)) {
                    errors.add("- runtime.lock.file: parent is not a directory: " + parent);
                }
            }
        }
    }

    private void validateLogDirectory(Path logDirectory, List<String> errors) {
        if (logDirectory != null && !logDirectory.toString().isBlank()) {
            if (Files.exists(logDirectory)) {
                if (!Files.isDirectory(logDirectory)) {
                    errors.add("- log.directory: exists but is not a directory: " + logDirectory);
                }
            }
            // If it doesn't exist yet, that's acceptable - we don't auto-create
        }
    }

    /**
     * Default NIO-based implementation of {@link SourceFolderChecker}.
     * <p>
     * Uses {@code java.nio.file.Files} static methods to check existence, type, and readability.
     * <p>
     * M3/AP-007: This separation allows unit tests to inject alternative implementations
     * that control the outcome of readability checks without relying on actual filesystem
     * permissions (which are platform-dependent).
|
||||
*/
|
||||
private static class DefaultSourceFolderChecker implements SourceFolderChecker {
|
||||
@Override
|
||||
public String checkSourceFolder(Path path) {
|
||||
if (!Files.exists(path)) {
|
||||
return "- source.folder: path does not exist: " + path;
|
||||
}
|
||||
if (!Files.isDirectory(path)) {
|
||||
return "- source.folder: path is not a directory: " + path;
|
||||
}
|
||||
if (!Files.isReadable(path)) {
|
||||
return "- source.folder: directory is not readable: " + path;
|
||||
}
|
||||
return null; // All checks passed
|
||||
}
|
||||
}
|
||||
}
|
||||
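The validator methods above all follow the same accumulate-then-fail pattern: each check appends a formatted message to a shared error list instead of throwing, so a misconfigured startup reports every problem at once. A minimal self-contained sketch of that pattern (the `ConfigValidation` class and its two checks are illustrative, not project code):

```java
import java.util.ArrayList;
import java.util.List;

public class ConfigValidation {

    // Checks in the style of the validators above: append a message, never throw.
    static void validateHost(String host, List<String> errors) {
        if (host == null || host.isBlank()) {
            errors.add("- server.host: must not be null or blank");
        }
    }

    static void validatePort(int port, List<String> errors) {
        if (port <= 0 || port > 65535) {
            errors.add("- server.port: must be in 1..65535, got: " + port);
        }
    }

    // Run every check before deciding; the caller fails once with the full report.
    public static List<String> validate(String host, int port) {
        List<String> errors = new ArrayList<>();
        validateHost(host, errors);
        validatePort(port, errors);
        return errors;
    }

    public static void main(String[] args) {
        List<String> errors = validate(null, -1);
        if (!errors.isEmpty()) {
            System.out.println("Configuration invalid:\n" + String.join("\n", errors));
        }
    }
}
```

The payoff is operational: an operator fixing a config file sees all violations in one run rather than one per restart.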
@@ -1,5 +1,14 @@
 /**
- * Configuration model types for the PDF Umbenenner application.
- * Contains typed configuration objects representing startup parameters.
+ * Application-layer runtime configuration model.
+ * <p>
+ * Contains only the minimal runtime configuration contract ({@link RuntimeConfiguration})
+ * that the application layer actually depends on during batch document processing.
+ * <p>
+ * This is intentionally small and focused: it includes only the parameters the
+ * application needs (e.g., maxPages), not the broader infrastructure configuration
+ * (paths, API keys, persistence settings, etc.) which are handled at the bootstrap layer.
+ * <p>
+ * This separation follows hexagonal architecture principles by ensuring the application
+ * layer depends only on configuration values it actually uses.
  */
 package de.gecheckt.pdf.umbenenner.application.config;
@@ -0,0 +1,59 @@
package de.gecheckt.pdf.umbenenner.application.config.provider;

import java.util.Arrays;
import java.util.Optional;

/**
 * Supported AI provider families for the PDF renaming process.
 * <p>
 * Each constant represents a distinct API protocol family. Exactly one provider family
 * is active per application run, selected via the {@code ai.provider.active} configuration property.
 * <p>
 * The {@link #getIdentifier()} method returns the string that must appear as the value of
 * {@code ai.provider.active} to activate the corresponding provider family.
 * Use {@link #fromIdentifier(String)} to resolve a configuration string to the enum constant.
 */
public enum AiProviderFamily {

    /** OpenAI-compatible Chat Completions API – usable with OpenAI itself and compatible third-party endpoints. */
    OPENAI_COMPATIBLE("openai-compatible"),

    /** Native Anthropic Messages API for Claude models. */
    CLAUDE("claude");

    private final String identifier;

    AiProviderFamily(String identifier) {
        this.identifier = identifier;
    }

    /**
     * Returns the configuration identifier string for this provider family.
     * <p>
     * This value corresponds to valid values of the {@code ai.provider.active} property.
     *
     * @return the configuration identifier, never {@code null}
     */
    public String getIdentifier() {
        return identifier;
    }

    /**
     * Resolves a provider family from its configuration identifier string.
     * <p>
     * The comparison is case-sensitive and matches the exact identifier strings
     * defined by each constant (e.g., {@code "openai-compatible"}, {@code "claude"}).
     *
     * @param identifier the identifier as it appears in the {@code ai.provider.active} property;
     *                   {@code null} returns an empty Optional
     * @return the matching provider family, or {@link Optional#empty()} if not recognized
     */
    public static Optional<AiProviderFamily> fromIdentifier(String identifier) {
        if (identifier == null) {
            return Optional.empty();
        }
        return Arrays.stream(values())
                .filter(f -> f.identifier.equals(identifier))
                .findFirst();
    }
}
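Resolving the `ai.provider.active` property then reduces to a lookup plus an explicit rejection of unknown values. A self-contained sketch using a local, trimmed copy of the enum (the `ProviderSelectionDemo` wrapper and its `resolve` helper are illustrative, not project API):

```java
import java.util.Arrays;
import java.util.Optional;

public class ProviderSelectionDemo {

    // Local copy of AiProviderFamily, trimmed to what the demo needs.
    enum AiProviderFamily {
        OPENAI_COMPATIBLE("openai-compatible"),
        CLAUDE("claude");

        private final String identifier;

        AiProviderFamily(String identifier) { this.identifier = identifier; }

        // Case-sensitive match against the declared identifiers, as in the source above.
        static Optional<AiProviderFamily> fromIdentifier(String identifier) {
            if (identifier == null) {
                return Optional.empty();
            }
            return Arrays.stream(values())
                    .filter(f -> f.identifier.equals(identifier))
                    .findFirst();
        }
    }

    // Typical bootstrap-side usage: an unrecognised value becomes a configuration error.
    static AiProviderFamily resolve(String configured) {
        return AiProviderFamily.fromIdentifier(configured)
                .orElseThrow(() -> new IllegalArgumentException(
                        "ai.provider.active: unknown provider family: " + configured));
    }

    public static void main(String[] args) {
        System.out.println(resolve("claude")); // CLAUDE
    }
}
```

Returning `Optional` from the enum keeps the policy decision (reject vs. default) with the caller, which matches the Javadoc's note that the validator, not the enum, rejects bad values.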
@@ -0,0 +1,43 @@
package de.gecheckt.pdf.umbenenner.application.config.provider;

/**
 * Immutable multi-provider configuration model.
 * <p>
 * Represents the resolved configuration for both supported AI provider families
 * together with the selection of the one provider family that is active for this
 * application run.
 *
 * <h2>Invariants</h2>
 * <ul>
 * <li>Exactly one provider family is active per run.</li>
 * <li>Required fields are enforced only for the active provider; the inactive
 *     provider's configuration may be incomplete.</li>
 * <li>Validation of these invariants is performed by the corresponding validator
 *     in the adapter layer, not by this record itself.</li>
 * </ul>
 *
 * @param activeProviderFamily   the selected provider family for this run; {@code null}
 *                               indicates that {@code ai.provider.active} was absent or
 *                               held an unrecognised value – the validator will reject this
 * @param openAiCompatibleConfig configuration for the OpenAI-compatible provider family
 * @param claudeConfig           configuration for the Anthropic Claude provider family
 */
public record MultiProviderConfiguration(
        AiProviderFamily activeProviderFamily,
        ProviderConfiguration openAiCompatibleConfig,
        ProviderConfiguration claudeConfig) {

    /**
     * Returns the {@link ProviderConfiguration} for the currently active provider family.
     *
     * @return the active provider's configuration, never {@code null} when
     *         {@link #activeProviderFamily()} is not {@code null}
     * @throws NullPointerException if {@code activeProviderFamily} is {@code null}
     */
    public ProviderConfiguration activeProviderConfiguration() {
        return switch (activeProviderFamily) {
            case OPENAI_COMPATIBLE -> openAiCompatibleConfig;
            case CLAUDE -> claudeConfig;
        };
    }
}
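Because `activeProviderFamily` is an enum, the `switch` in `activeProviderConfiguration()` is exhaustive without a `default` branch: adding a third family would fail compilation until the switch is extended. A minimal sketch of that property with local stand-in types (all names here are illustrative simplifications, not the project's records):

```java
public class ActiveProviderDemo {

    enum Family { OPENAI_COMPATIBLE, CLAUDE }

    record ProviderConfig(String model, String baseUrl) { }

    record MultiProviderConfig(Family active, ProviderConfig openAi, ProviderConfig claude) {
        // Exhaustive switch over the enum: the compiler forces one branch per constant,
        // so the selection logic cannot silently ignore a newly added family.
        ProviderConfig activeConfig() {
            return switch (active) {
                case OPENAI_COMPATIBLE -> openAi;
                case CLAUDE -> claude;
            };
        }
    }

    public static void main(String[] args) {
        MultiProviderConfig cfg = new MultiProviderConfig(
                Family.CLAUDE,
                new ProviderConfig(null, null), // inactive provider may be incomplete
                new ProviderConfig("some-claude-model", "https://api.anthropic.com"));
        System.out.println(cfg.activeConfig().model());
    }
}
```

Note how the inactive provider's config is deliberately incomplete, mirroring the invariant that required fields are enforced only for the active family.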
@@ -0,0 +1,34 @@
package de.gecheckt.pdf.umbenenner.application.config.provider;

/**
 * Immutable configuration for a single AI provider family.
 * <p>
 * Holds all parameters needed to connect to and authenticate with one AI provider endpoint.
 * Instances are created by the configuration parser in the adapter layer; validation
 * of required fields is performed by the corresponding validator.
 *
 * <h2>Field semantics</h2>
 * <ul>
 * <li>{@code model} – the AI model name; required for the active provider, may be {@code null}
 *     for the inactive provider.</li>
 * <li>{@code timeoutSeconds} – HTTP connection/read timeout in seconds; must be positive for
 *     the active provider. {@code 0} indicates the value was not configured.</li>
 * <li>{@code baseUrl} – the base URL of the API endpoint. For the Anthropic Claude family a
 *     default of {@code https://api.anthropic.com} is applied by the parser when the property
 *     is absent; for the OpenAI-compatible family it is required and may not be {@code null}.</li>
 * <li>{@code apiKey} – the resolved API key after environment-variable precedence has been
 *     applied; may be blank for the inactive provider, must not be blank for the active provider.</li>
 * </ul>
 *
 * @param model the AI model name; {@code null} when not configured
 * @param timeoutSeconds HTTP timeout in seconds; {@code 0} when not configured
 * @param baseUrl the base URL of the API endpoint; {@code null} when not configured
 *                (only applicable to providers without a built-in default)
 * @param apiKey the resolved API key; blank when not configured
 */
public record ProviderConfiguration(
        String model,
        int timeoutSeconds,
        String baseUrl,
        String apiKey) {
}
@@ -0,0 +1,52 @@
package de.gecheckt.pdf.umbenenner.application.config.startup;

import java.nio.file.Path;

import de.gecheckt.pdf.umbenenner.application.config.provider.MultiProviderConfiguration;

/**
 * Typed immutable configuration model for PDF Umbenenner startup parameters.
 * <p>
 * Contains all technical infrastructure and runtime configuration parameters
 * loaded and validated at bootstrap time. This is a complete configuration model
 * for the entire application startup, including paths, AI provider selection, persistence,
 * and operational parameters.
 *
 * <h2>AI provider configuration</h2>
 * <p>
 * The {@link MultiProviderConfiguration} encapsulates the active provider selection
 * together with the per-provider connection parameters for all supported provider families.
 * Exactly one provider family is active per run; the selection is driven by the
 * {@code ai.provider.active} configuration property.
 *
 * <h2>AI content sensitivity ({@code log.ai.sensitive})</h2>
 * <p>
 * The boolean property {@code log.ai.sensitive} controls whether sensitive AI-generated
 * content (complete raw AI response, complete AI {@code reasoning}) may be written to
 * log files. The default is {@code false} (safe/protect). Set to {@code true} only when
 * explicit diagnostic logging of AI content is required.
 * <p>
 * Sensitive AI content is always persisted in SQLite regardless of this setting.
 * Only log output is affected.
 */
public record StartConfiguration(
        Path sourceFolder,
        Path targetFolder,
        Path sqliteFile,
        MultiProviderConfiguration multiProviderConfiguration,
        int maxRetriesTransient,
        int maxPages,
        int maxTextCharacters,
        Path promptTemplateFile,
        Path runtimeLockFile,
        Path logDirectory,
        String logLevel,

        /**
         * Whether sensitive AI content (raw response, reasoning) may be written to log files.
         * Corresponds to the {@code log.ai.sensitive} configuration property.
         * Default: {@code false} (do not log sensitive content).
         */
        boolean logAiSensitive
)
{ }
@@ -0,0 +1,13 @@
/**
 * Startup configuration model types.
 * <p>
 * Contains the complete technical startup configuration model ({@link StartConfiguration})
 * that encompasses all infrastructure parameters, paths, API settings, and operational
 * parameters required to initialize and run the application.
 * <p>
 * This is separate from the minimal {@link de.gecheckt.pdf.umbenenner.application.config.RuntimeConfiguration}
 * which represents only the configuration values the application layer actually depends on
 * during batch processing. The startup configuration is the complete technical model used
 * by bootstrap and adapter layers for initialization, validation, and infrastructure wiring.
 */
package de.gecheckt.pdf.umbenenner.application.config.startup;
@@ -10,15 +10,13 @@ package de.gecheckt.pdf.umbenenner.application.port.in;
  * The outcome is independent of individual document processing results;
  * it represents the batch operation itself (lock acquired, no critical startup failure, etc.).
  * <p>
- * Design Note: This contract is defined in AP-002 to enable AP-007 (exit code handling)
- * to derive exit codes systematically without requiring additional knowledge about
- * the batch run. Each outcome maps cleanly to an exit code semantic.
+ * Design Note: This contract enables exit code handling to derive exit codes systematically
+ * without requiring additional knowledge about the batch run. Each outcome maps cleanly
+ * to an exit code semantic.
  * <p>
- * AP-007: Three distinct outcomes are now defined to make the difference between a
- * technically successful run, a controlled early termination due to start protection,
- * and a hard bootstrap failure explicit in both code and logs.
- *
- * @since M2-AP-002
+ * Three distinct outcomes are defined to make the difference between a technically successful run,
+ * a controlled early termination due to start protection, and a hard bootstrap failure
+ * explicit in both code and logs.
  */
 public enum BatchRunOutcome {

@@ -42,7 +40,6 @@
      * <p>
      * Maps to exit code 1.
      *
-     * @since M2-AP-007
      */
     LOCK_UNAVAILABLE("Another instance is already running; this run terminates immediately"),

@@ -98,7 +95,6 @@
      * the run lock being held by another instance.
      *
      * @return true if outcome is {@link #LOCK_UNAVAILABLE}, false otherwise
-     * @since M2-AP-007
      */
     public boolean isLockUnavailable() {
         return this == LOCK_UNAVAILABLE;
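The Javadoc above states that each `BatchRunOutcome` maps cleanly to an exit code (success, lock unavailable → 1, hard bootstrap failure). That mapping can be centralised in one exhaustive switch so the Bootstrap/CLI layer needs no other knowledge of the run. A sketch with local stand-in constants — the constant names and the codes for success (0) and bootstrap failure (2) are illustrative assumptions; only `LOCK_UNAVAILABLE → 1` is stated in the source:

```java
public class ExitCodeDemo {

    // Local stand-in for BatchRunOutcome, trimmed to three outcomes.
    enum BatchRunOutcome { SUCCESS, LOCK_UNAVAILABLE, BOOTSTRAP_FAILURE }

    // One exhaustive mapping; LOCK_UNAVAILABLE -> 1 follows the Javadoc above,
    // the other two codes are illustrative assumptions.
    static int exitCode(BatchRunOutcome outcome) {
        return switch (outcome) {
            case SUCCESS -> 0;
            case LOCK_UNAVAILABLE -> 1;
            case BOOTSTRAP_FAILURE -> 2;
        };
    }

    public static void main(String[] args) {
        System.out.println(exitCode(BatchRunOutcome.LOCK_UNAVAILABLE)); // 1
    }
}
```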
@@ -17,18 +17,13 @@ import de.gecheckt.pdf.umbenenner.domain.model.BatchRunContext;
  * <p>
  * The returned outcome is designed to be independent of individual document results,
  * representing only the batch operation itself. Individual document successes/failures
- * are tracked separately in persistence (future milestones).
+ * are tracked separately in persistence.
  * <p>
- * M2-AP-002 Implementation:
+ * Implementation details:
  * <ul>
  * <li>Port is defined with a structured return contract ({@link BatchRunOutcome})</li>
- * <li>Return model allows Bootstrap/CLI to systematically derive exit codes (AP-007)</li>
- * <li>No implementation of the use case itself yet (that is AP-004)</li>
- * </ul>
- * <p>
- * M2-AP-003 Update:
- * <ul>
- * <li>execute() now accepts a {@link BatchRunContext} containing the run ID and timing</li>
+ * <li>Return model allows Bootstrap/CLI to systematically derive exit codes</li>
+ * <li>execute() accepts a {@link BatchRunContext} containing the run ID and timing</li>
  * <li>The context flows through the entire batch cycle for correlation and logging</li>
  * </ul>
  */
@@ -13,13 +13,11 @@
  * Return models:
  * <ul>
  * <li>{@link de.gecheckt.pdf.umbenenner.application.port.in.BatchRunOutcome}
- *     — Structured result of a batch run, designed for exit code mapping (AP-007)</li>
+ *     — Structured result of a batch run, designed for exit code mapping</li>
  * </ul>
  * <p>
  * Architecture Rule: Inbound ports are independent of implementation and contain no business logic.
  * They define "what can be done to the application". All dependencies point inward;
  * adapters depend on ports, not vice versa.
- *
- * @since M2-AP-002
  */
 package de.gecheckt.pdf.umbenenner.application.port.in;
@@ -0,0 +1,46 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

/**
 * Sensitivity decision governing whether AI-generated content may be written to log files.
 * <p>
 * The following AI-generated content items are classified as sensitive and are subject to
 * this decision:
 * <ul>
 * <li>The <strong>complete AI raw response</strong> (full JSON body returned by the
 *     AI service)</li>
 * <li>The <strong>complete AI {@code reasoning}</strong> field extracted from the
 *     AI response</li>
 * </ul>
 * <p>
 * Sensitive AI content is always written to SQLite (for traceability) regardless of
 * this decision. The decision controls only whether the content is also emitted into
 * log files.
 * <p>
 * <strong>Default behaviour:</strong> The default is {@link #PROTECT_SENSITIVE_CONTENT}.
 * Logging of sensitive AI content must be explicitly enabled by setting the boolean
 * configuration property {@code log.ai.sensitive = true}. Any other value, or the
 * absence of the property, results in {@link #PROTECT_SENSITIVE_CONTENT}.
 * <p>
 * <strong>Non-sensitive AI content</strong> (e.g. the resolved title, the resolved date,
 * the date source) is not covered by this decision and may always be logged.
 */
public enum AiContentSensitivity {

    /**
     * Sensitive AI content (raw response, reasoning) must <strong>not</strong> be written
     * to log files.
     * <p>
     * This is the safe default. It is active whenever {@code log.ai.sensitive} is absent,
     * {@code false}, or set to any value other than the explicit opt-in.
     */
    PROTECT_SENSITIVE_CONTENT,

    /**
     * Sensitive AI content (raw response, reasoning) <strong>may</strong> be written
     * to log files.
     * <p>
     * This value is only produced when {@code log.ai.sensitive = true} is explicitly set
     * in the application configuration. It must never be the implicit default.
     */
    LOG_SENSITIVE_CONTENT
}
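The safe-default rule described above maps directly onto a strict parse of the property value: only `true` opts in, and anything else — including an absent property — protects. A sketch of that mapping (`fromProperty` is an illustrative helper, not project API; note that `Boolean.parseBoolean` accepts any capitalisation of "true", a slight relaxation of a literal-string check):

```java
public class SensitivityDemo {

    enum AiContentSensitivity { PROTECT_SENSITIVE_CONTENT, LOG_SENSITIVE_CONTENT }

    // Only an explicit "true" enables logging; null (property absent) or any
    // other value falls through to the protective default.
    static AiContentSensitivity fromProperty(String logAiSensitive) {
        return Boolean.parseBoolean(logAiSensitive)
                ? AiContentSensitivity.LOG_SENSITIVE_CONTENT
                : AiContentSensitivity.PROTECT_SENSITIVE_CONTENT;
    }

    public static void main(String[] args) {
        System.out.println(fromProperty(null));   // PROTECT_SENSITIVE_CONTENT
        System.out.println(fromProperty("true")); // LOG_SENSITIVE_CONTENT
        System.out.println(fromProperty("yes"));  // PROTECT_SENSITIVE_CONTENT
    }
}
```

`Boolean.parseBoolean(null)` returns `false` without throwing, which is exactly the "absence of the property results in PROTECT" behaviour the Javadoc demands.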
@@ -0,0 +1,75 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import de.gecheckt.pdf.umbenenner.domain.model.AiRequestRepresentation;

/**
 * Outbound port for invoking an AI service over an OpenAI-compatible HTTP boundary.
 * <p>
 * This interface abstracts AI service communication, allowing the Application layer
 * to orchestrate AI-based naming without knowing about HTTP, authentication, or
 * provider-specific details.
 * <p>
 * <strong>Design principles:</strong>
 * <ul>
 * <li>Provider is configurable (OpenAI, Azure, local LLM, etc.), not hard-coded</li>
 * <li>Base URL, model name, and timeout are runtime configuration</li>
 * <li>Results are returned as structured types ({@link AiInvocationResult}),
 *     never as exceptions</li>
 * <li>Technical success (HTTP 200) is distinct from response content validity</li>
 * </ul>
 * <p>
 * <strong>Adapter responsibilities:</strong>
 * <ul>
 * <li>Construct an HTTP request from the {@link AiRequestRepresentation}</li>
 * <li>Apply all transport-level configuration (base URL, model, timeout, authentication)</li>
 * <li>Execute the HTTP request against the configured endpoint</li>
 * <li>Distinguish between successful reception of a response body and technical failure</li>
 * <li>Return either an invocation success with raw response or a classified technical error</li>
 * <li>Encapsulate all HTTP, JSON serialization, and authentication details</li>
 * </ul>
 * <p>
 * <strong>Non-goals of this port:</strong>
 * <ul>
 * <li>JSON parsing of the response body (Application layer handles this)</li>
 * <li>Validation of response content against domain rules</li>
 * <li>Prompt construction or text formatting (Application layer does this)</li>
 * <li>Handling of provider-specific output formats or structured output schemas</li>
 * </ul>
 * <p>
 * <strong>OpenAI compatibility:</strong> The adapter must support the OpenAI Chat
 * Completions API or a compatible endpoint. The {@code AiRequestRepresentation}
 * contains the prompt and document text; the adapter is responsible for formatting
 * these as needed (e.g., system message + user message in the Chat API).
 */
public interface AiInvocationPort {

    /**
     * Invokes an AI service with the given request representation.
     * <p>
     * This method sends a request to the configured AI endpoint and returns the result.
     * The request contains both the prompt and the document text, deterministically
     * composed by the Application layer.
     * <p>
     * <strong>Outcome distinction:</strong>
     * <ul>
     * <li>If the HTTP call succeeds and a response body is received,
     *     {@link AiInvocationSuccess} is returned, even if the body is invalid JSON
     *     or semantically problematic. The Application layer will parse and validate
     *     the content.</li>
     * <li>If the HTTP call fails (timeout, network error, endpoint unreachable,
     *     connection failure), {@link AiInvocationTechnicalFailure} is returned.</li>
     * </ul>
     *
     * @param request the complete request to send to the AI service; never null
     * @return an {@link AiInvocationResult} encoding either:
     *         <ul>
     *         <li>Success: response body was received (valid or not)</li>
     *         <li>Technical failure: HTTP communication failed</li>
     *         </ul>
     * @throws NullPointerException if request is null
     *
     * @see AiInvocationSuccess
     * @see AiInvocationTechnicalFailure
     */
    AiInvocationResult invoke(AiRequestRepresentation request);
}
@@ -0,0 +1,26 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

/**
 * Sealed interface representing the outcome of invoking an AI service.
 * <p>
 * Implementations allow the Application layer to distinguish between:
 * <ul>
 * <li>Successful HTTP communication with a response body (which may still contain
 *     functionally invalid content, but is at least technically received)</li>
 * <li>Technical failure (timeout, network error, endpoint unreachable, malformed response)</li>
 * </ul>
 * <p>
 * Permitted implementations:
 * <ul>
 * <li>{@link AiInvocationSuccess} — HTTP call completed with a response body</li>
 * <li>{@link AiInvocationTechnicalFailure} — HTTP call failed or no valid response was received</li>
 * </ul>
 * <p>
 * <strong>Critical distinction:</strong> A successful invocation means the HTTP request
 * was sent and a response was received, but the response content may still be unparseable
 * or semantically invalid. This is crucial for retry logic: a technical HTTP success
 * with unparseable JSON is different from a timeout or network error.
 */
public sealed interface AiInvocationResult
        permits AiInvocationSuccess, AiInvocationTechnicalFailure {
}
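Sealed result types like this pair naturally with pattern-matching `switch` (Java 21+): the compiler guarantees both cases are handled, and no exception-based control flow is needed at the call site. A self-contained sketch with simplified local stand-ins for the domain types (the `describe` helper and the string payloads are illustrative):

```java
public class InvocationResultDemo {

    // Simplified local stand-ins for the sealed hierarchy above.
    sealed interface AiInvocationResult permits Success, TechnicalFailure { }

    record Success(String rawResponse) implements AiInvocationResult { }

    record TechnicalFailure(String reason, String message) implements AiInvocationResult { }

    // The Application layer branches on the result type, not on exceptions.
    // The switch is exhaustive: adding a third permitted type fails compilation here.
    static String describe(AiInvocationResult result) {
        return switch (result) {
            case Success s -> "received " + s.rawResponse().length() + " chars";
            case TechnicalFailure f -> "failed (" + f.reason() + "): " + f.message();
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new Success("{\"title\":\"Invoice\"}")));
        System.out.println(describe(new TechnicalFailure("TIMEOUT", "read timed out")));
    }
}
```

This is why the port returns results instead of throwing: a retry policy can inspect `TechnicalFailure.reason()` as data, while an HTTP-200-with-bad-JSON case still arrives as `Success` for the parsing layer to judge.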
@@ -0,0 +1,50 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import java.util.Objects;

import de.gecheckt.pdf.umbenenner.domain.model.AiRawResponse;
import de.gecheckt.pdf.umbenenner.domain.model.AiRequestRepresentation;

/**
 * Represents successful HTTP communication with an AI service.
 * <p>
 * The HTTP request was sent and a response body was received. This indicates
 * technical success of the communication, but does NOT guarantee that the response
 * content is valid, parseable, or functionally usable.
 * <p>
 * <strong>Field semantics:</strong>
 * <ul>
 * <li>{@link #request()} — the exact request that was sent to the AI service,
 *     including prompt, document text, and character counts</li>
 * <li>{@link #rawResponse()} — the uninterpreted response body returned by the AI,
 *     which may be valid JSON, malformed, empty, or otherwise problematic</li>
 * </ul>
 * <p>
 * The Application layer is responsible for:
 * <ul>
 * <li>Parsing the raw response (JSON extraction, field validation)</li>
 * <li>Distinguishing between parseable and unparseable responses</li>
 * <li>Validating the content against rules (title length, date format, etc.)</li>
 * <li>Classifying any failures as technical or functional</li>
 * </ul>
 * <p>
 * <strong>Persistence:</strong> Both request and response are stored in the
 * processing attempt history for debugging and audit.
 *
 * @param request the AI request that was sent; never null
 * @param rawResponse the uninterpreted response body; never null (but may be empty)
 */
public record AiInvocationSuccess(
        AiRequestRepresentation request,
        AiRawResponse rawResponse) implements AiInvocationResult {

    /**
     * Compact constructor validating mandatory fields.
     *
     * @throws NullPointerException if either field is null
     */
    public AiInvocationSuccess {
        Objects.requireNonNull(request, "request must not be null");
        Objects.requireNonNull(rawResponse, "rawResponse must not be null");
    }
}
@@ -0,0 +1,52 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import java.util.Objects;

import de.gecheckt.pdf.umbenenner.domain.model.AiRequestRepresentation;

/**
 * Represents a technical failure during AI service invocation.
 * <p>
 * The HTTP request could not be sent, or no valid response body was received.
 * This covers network errors, timeouts, endpoint unreachability, connection failures,
 * and other infrastructure-level problems.
 * <p>
 * <strong>Field semantics:</strong>
 * <ul>
 * <li>{@link #request()} — the request that was attempted to be sent. Stored for
 *     debugging and audit, even though it may not have reached the AI service.</li>
 * <li>{@link #failureReason()} — a classification of the technical error
 *     (e.g., "TIMEOUT", "ENDPOINT_UNREACHABLE", "CONNECTION_ERROR")</li>
 * <li>{@link #failureMessage()} — a human-readable description of the error,
 *     suitable for logging and operational troubleshooting</li>
 * </ul>
 * <p>
 * <strong>Retry semantics:</strong> Technical failures are retryable. The Application
 * layer will record this as a transient error, and the document may be retried in
 * a later batch run up to the configured maximum transient-error count.
 * <p>
 * <strong>Distinction from functional errors:</strong> A 200 OK response with an
 * invalid JSON body is NOT a technical failure; it's an invocation success that
 * contains a functional error. Only communication/transport errors are classified here.
 *
 * @param request the request that was attempted (may not have been successfully sent);
 *                never null
 * @param failureReason classification of the error type; never null (may be empty)
 * @param failureMessage human-readable error description; never null (may be empty)
 */
public record AiInvocationTechnicalFailure(
        AiRequestRepresentation request,
        String failureReason,
        String failureMessage) implements AiInvocationResult {

    /**
     * Compact constructor validating mandatory fields.
     *
     * @throws NullPointerException if any field is null
     */
    public AiInvocationTechnicalFailure {
        Objects.requireNonNull(request, "request must not be null");
        Objects.requireNonNull(failureReason, "failureReason must not be null");
        Objects.requireNonNull(failureMessage, "failureMessage must not be null");
    }
}
@@ -8,17 +8,13 @@ import java.time.Instant;
  * This port abstracts access to the system clock, enabling the batch run to:
  * <ul>
  * <li>Record timestamps for batch run start and completion</li>
- * <li>Generate or verify document timestamps (later in M5+)</li>
+ * <li>Generate or verify document timestamps for traceability</li>
  * <li>Support testing with controlled time values</li>
  * </ul>
  * <p>
  * By isolating time access behind a port, the application can be tested with
  * deterministic time values without requiring system clock manipulation.
- * <p>
- * This port is defined in M2 for use in later milestones where timestamps
- * become relevant (e.g., run history, document date fallback).
  *
- * @since M2-AP-002
  */
 public interface ClockPort {

@@ -1,10 +1,10 @@
 package de.gecheckt.pdf.umbenenner.application.port.out;

-import de.gecheckt.pdf.umbenenner.application.config.StartConfiguration;
+import de.gecheckt.pdf.umbenenner.application.config.startup.StartConfiguration;

 /**
  * Outbound port for configuration access.
- * AP-005: Minimal interface for loading typed startup configuration.
+ * Provides a minimal interface for loading typed startup configuration.
  */
 public interface ConfigurationPort {
@@ -0,0 +1,90 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

/**
* Unified classification of all document-level errors in the end state.
* <p>
* This enumeration provides a single, exhaustive taxonomy for every error category
* that the retry policy and logging infrastructure must distinguish. It replaces
* any ad-hoc string-based classification where an authoritative type is needed.
* <p>
* <strong>Mapping to failure counters:</strong>
* <ul>
* <li>{@link #DETERMINISTIC_CONTENT_ERROR} → increments the content-error counter
* ({@link FailureCounters#contentErrorCount()}). The first occurrence leads to
* {@code FAILED_RETRYABLE}; the second leads to {@code FAILED_FINAL}.
* There is no further retry after the second deterministic content error.</li>
* <li>{@link #TRANSIENT_TECHNICAL_ERROR} → increments the transient-error counter
* ({@link FailureCounters#transientErrorCount()}). Remains retryable until the
* counter reaches the configured {@code max.retries.transient} limit (Integer ≥ 1).
* The attempt that reaches the limit finalises the document to {@code FAILED_FINAL}.</li>
* <li>{@link #TARGET_COPY_TECHNICAL_ERROR} → signals a failure on the physical target
* file copy path. Within the same run, exactly one immediate technical retry is
* allowed. If the immediate retry also fails, the error is treated as a
* {@link #TRANSIENT_TECHNICAL_ERROR} for the purposes of counter updates and
* cross-run retry evaluation.</li>
* </ul>
* <p>
* <strong>Scope of deterministic content errors:</strong>
* <ul>
* <li>No usable PDF text extracted</li>
* <li>Page limit exceeded</li>
* <li>AI response functionally invalid (generic/unusable title, unparseable date)</li>
* <li>Document content ambiguous or not uniquely interpretable</li>
* </ul>
* <p>
* <strong>Scope of transient technical errors:</strong>
* <ul>
* <li>AI service unreachable, HTTP timeout, network error</li>
* <li>Unparseable or structurally invalid AI JSON</li>
* <li>Temporary I/O error during PDF text extraction</li>
* <li>Temporary SQLite lock or persistence failure</li>
* <li>Any other non-deterministic infrastructure failure</li>
* </ul>
* <p>
* <strong>Architecture note:</strong> This type carries no infrastructure dependencies.
* It is safe to reference from Domain, Application and Adapter layers.
*/
public enum DocumentErrorClassification {

/**
* A deterministic content error that cannot be resolved by retrying with the same
* document content.
* <p>
* Examples: no extractable text, page limit exceeded, AI-returned title is generic
* or unusable, document content is ambiguous.
* <p>
* Retry rule: the first historised occurrence of this error for a fingerprint leads
* to {@code FAILED_RETRYABLE} (one later run may retry). The second historised
* occurrence leads to {@code FAILED_FINAL} (no further retries).
*/
DETERMINISTIC_CONTENT_ERROR,

/**
* A transient technical infrastructure failure unrelated to the document content.
* <p>
* Examples: AI endpoint not reachable, HTTP timeout, malformed or non-parseable
* JSON, temporary I/O failure, temporary SQLite lock.
* <p>
* Retry rule: remains {@code FAILED_RETRYABLE} until the transient-error counter
* reaches the configured {@code max.retries.transient} limit. The attempt that
* reaches the limit finalises the document to {@code FAILED_FINAL}.
* The configured limit must be an Integer ≥ 1; the value {@code 0} is invalid
* start configuration and prevents the batch run from starting.
*/
TRANSIENT_TECHNICAL_ERROR,

/**
* A technical failure specifically on the physical target-file copy path.
* <p>
* This error class is distinct from {@link #TRANSIENT_TECHNICAL_ERROR} because it
* triggers a special within-run handling: exactly one immediate technical retry of
* the copy operation is allowed within the same document run. No new AI call and no
* new naming proposal derivation occur during the immediate retry.
* <p>
* If the immediate retry succeeds, the document proceeds to {@code SUCCESS}.
* If the immediate retry also fails, the combined failure is recorded as a
* {@link #TRANSIENT_TECHNICAL_ERROR} for counter and cross-run retry evaluation.
* The immediate retry is not counted in the cross-run transient-error counter.
*/
TARGET_COPY_TECHNICAL_ERROR
}

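The counter and retry rules documented on the enum constants above can be condensed into a small decision function. The following is an illustrative sketch only, not the project's actual policy implementation: the nested `Classification` and `Status` enums are simplified stand-ins for `DocumentErrorClassification` and `ProcessingStatus`, and the counters (normally carried by `FailureCounters`) and the `max.retries.transient` limit are passed in as plain ints.

```java
// Illustrative sketch of the cross-run retry rules described above.
// All names are simplified stand-ins for the project's real types.
public class RetryPolicySketch {

    enum Classification {
        DETERMINISTIC_CONTENT_ERROR,
        TRANSIENT_TECHNICAL_ERROR,
        TARGET_COPY_TECHNICAL_ERROR
    }

    enum Status { FAILED_RETRYABLE, FAILED_FINAL }

    /**
     * Decides the document status after a historised failure.
     * Counters are the values AFTER incrementing for the current attempt.
     * A TARGET_COPY_TECHNICAL_ERROR is expected to be reclassified as
     * TRANSIENT_TECHNICAL_ERROR by the caller once its single immediate
     * in-run retry has also failed, so it never reaches this method directly.
     */
    static Status nextStatus(Classification classification,
                             int contentErrorCount,
                             int transientErrorCount,
                             int maxRetriesTransient) {
        if (maxRetriesTransient < 1) {
            // 0 is invalid start configuration and must prevent the run
            throw new IllegalArgumentException("max.retries.transient must be >= 1");
        }
        switch (classification) {
            case DETERMINISTIC_CONTENT_ERROR:
                // first occurrence retryable, second occurrence final
                return contentErrorCount >= 2
                        ? Status.FAILED_FINAL : Status.FAILED_RETRYABLE;
            case TRANSIENT_TECHNICAL_ERROR:
                // retryable until the configured limit is reached
                return transientErrorCount >= maxRetriesTransient
                        ? Status.FAILED_FINAL : Status.FAILED_RETRYABLE;
            default:
                throw new IllegalArgumentException(
                        "reclassify TARGET_COPY_TECHNICAL_ERROR before calling");
        }
    }
}
```

Keeping the decision in one pure function like this would make the two-strike content rule and the configurable transient limit directly unit-testable, but the real codebase may structure it differently.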
@@ -8,14 +8,13 @@ import java.util.Objects;
* The document is known (fingerprint exists in the persistence store) but its overall
* status is neither {@link de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus#SUCCESS}
* nor {@link de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus#FAILED_FINAL}.
* The use case may continue with normal M4 processing using the provided record.
* The use case may continue with normal processing using the provided record.
* <p>
* The existing {@link DocumentRecord} is supplied so the use case can inspect the
* current status, failure counters, and other fields required to apply M4 retry rules
* current status, failure counters, and other fields required to apply retry rules
* without an additional lookup.
*
* @param record the current master record for this document; never null
* @since M4-AP-001
*/
public record DocumentKnownProcessable(DocumentRecord record) implements DocumentRecordLookupResult {
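Since `DocumentKnownProcessable` is one variant of a `DocumentRecordLookupResult`, a caller typically branches on which variant came back. The sketch below is hypothetical: it models the result as a nested sealed hierarchy with made-up sibling variants (`Unknown`, `Finalised`) and a `String` fingerprint instead of the real `DocumentRecord`, purely to illustrate the dispatch pattern; the project's actual variants may differ.

```java
// Hypothetical sketch: branching on a lookup result via a sealed hierarchy
// and an exhaustive switch (Java 21 pattern matching). The variant names
// here are illustrative stand-ins, not the project's real types.
public class LookupResultSketch {

    sealed interface LookupResult permits KnownProcessable, Unknown, Finalised {}

    /** Known and neither SUCCESS nor FAILED_FINAL: retry rules apply. */
    record KnownProcessable(String fingerprint) implements LookupResult {}
    /** Fingerprint not found: a fresh record must be created. */
    record Unknown(String fingerprint) implements LookupResult {}
    /** Already SUCCESS or FAILED_FINAL: the document is skipped. */
    record Finalised(String fingerprint) implements LookupResult {}

    static String decide(LookupResult result) {
        // The compiler verifies exhaustiveness over the sealed hierarchy,
        // so no default branch is needed.
        return switch (result) {
            case KnownProcessable k -> "apply retry rules to " + k.fingerprint();
            case Unknown u -> "create new record for " + u.fingerprint();
            case Finalised f -> "skip " + f.fingerprint();
        };
    }
}
```

The benefit of the sealed-interface shape is exactly what the Javadoc above relies on: the use case receives the existing record in the processable branch and needs no additional lookup there.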