M4 AP-001 Refine core objects, status model, and port contracts
@@ -0,0 +1,30 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import java.util.Objects;

/**
 * Lookup result indicating that a master record exists and the document is not yet terminal.
 * <p>
 * The document is known (fingerprint exists in the persistence store) but its overall
 * status is neither {@link de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus#SUCCESS}
 * nor {@link de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus#FAILED_FINAL}.
 * The use case may continue with normal M4 processing using the provided record.
 * <p>
 * The existing {@link DocumentRecord} is supplied so the use case can inspect the
 * current status, failure counters, and other fields required to apply M4 retry rules
 * without an additional lookup.
 *
 * @param record the current master record for this document; never null
 * @since M4-AP-001
 */
public record DocumentKnownProcessable(DocumentRecord record) implements DocumentRecordLookupResult {

    /**
     * Compact constructor validating the non-null contract.
     *
     * @throws NullPointerException if {@code record} is null
     */
    public DocumentKnownProcessable {
        Objects.requireNonNull(record, "record must not be null");
    }
}
@@ -0,0 +1,48 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

/**
 * Unchecked exception thrown by persistence write operations when a technical
 * infrastructure failure prevents the operation from completing.
 * <p>
 * This exception is thrown by {@link DocumentRecordRepository} and
 * {@link ProcessingAttemptRepository} write methods, and by
 * {@link PersistenceSchemaInitializationPort#initializeSchema()}, when the underlying
 * persistence layer (SQLite) cannot be reached or returns an unrecoverable error.
 * <p>
 * <strong>Batch run impact:</strong>
 * <ul>
 *   <li>If thrown during <em>schema initialisation</em> at startup, the run must abort
 *       with exit code 1.</li>
 *   <li>If thrown during <em>per-document write operations</em>, the current candidate
 *       is treated as a transient failure; the batch run continues with the remaining
 *       candidates.</li>
 * </ul>
 * <p>
 * The exception is <em>not</em> used for read operations; read failures are modelled
 * as {@link PersistenceLookupTechnicalFailure} in the sealed
 * {@link DocumentRecordLookupResult} hierarchy to allow exhaustive pattern matching
 * at the call site.
 *
 * @since M4-AP-001
 */
public class DocumentPersistenceException extends RuntimeException {

    /**
     * Constructs a new {@code DocumentPersistenceException} with the given message.
     *
     * @param message human-readable description of the persistence failure
     */
    public DocumentPersistenceException(String message) {
        super(message);
    }

    /**
     * Constructs a new {@code DocumentPersistenceException} with message and cause.
     *
     * @param message human-readable description of the persistence failure
     * @param cause   the underlying throwable that caused this failure
     */
    public DocumentPersistenceException(String message, Throwable cause) {
        super(message, cause);
    }
}
@@ -0,0 +1,83 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;

import java.time.Instant;
import java.util.Objects;

/**
 * Application-facing representation of the document master record (Dokument-Stammsatz).
 * <p>
 * One {@code DocumentRecord} exists per unique {@link DocumentFingerprint}. It carries
 * the current overall status, failure counters, and the most recently known source
 * location of the document.
 * <p>
 * <strong>Architecture boundary:</strong> This type contains no SQLite or JDBC types.
 * Mapping between {@code DocumentRecord} and the persistence layer is performed
 * exclusively by the repository adapter in {@code adapter-out}.
 * <p>
 * <strong>M4 field semantics:</strong>
 * <ul>
 *   <li>{@link #fingerprint()}: primary identity; never changes for a given record.</li>
 *   <li>{@link #lastKnownSourceLocator()}: opaque locator used by adapters; the
 *       application passes it through without interpreting the value.</li>
 *   <li>{@link #lastKnownSourceFileName()}: human-readable file name for logging and
 *       diagnostics; not used for identity.</li>
 *   <li>{@link #overallStatus()}: the current terminal or active status of the document
 *       across all runs. See {@link ProcessingStatus} for semantics.</li>
 *   <li>{@link #failureCounters()}: independent counters for content and transient errors;
 *       never increased by skip events.</li>
 *   <li>{@link #lastFailureInstant()}: timestamp of the most recent failure; {@code null}
 *       if no failure has been recorded yet.</li>
 *   <li>{@link #lastSuccessInstant()}: timestamp of the successful processing; {@code null}
 *       if the document has never been processed successfully.</li>
 *   <li>{@link #createdAt()}: timestamp when this master record was first created.</li>
 *   <li>{@link #updatedAt()}: timestamp of the most recent update to this master record.</li>
 * </ul>
 * <p>
 * <strong>Not included in M4:</strong> target path, target file name, AI-related fields.
 * These are added in later milestones.
 *
 * @param fingerprint             content-based identity; never null
 * @param lastKnownSourceLocator  opaque locator to the physical source file; never null
 * @param lastKnownSourceFileName file name at the time of the last known access; never null or blank
 * @param overallStatus           current processing status; never null
 * @param failureCounters         counters for content and transient errors; never null
 * @param lastFailureInstant      timestamp of the most recent failure, or {@code null}
 * @param lastSuccessInstant      timestamp of the successful processing, or {@code null}
 * @param createdAt               timestamp when this record was first created; never null
 * @param updatedAt               timestamp of the most recent update; never null
 * @since M4-AP-001
 */
public record DocumentRecord(
        DocumentFingerprint fingerprint,
        SourceDocumentLocator lastKnownSourceLocator,
        String lastKnownSourceFileName,
        ProcessingStatus overallStatus,
        FailureCounters failureCounters,
        Instant lastFailureInstant,
        Instant lastSuccessInstant,
        Instant createdAt,
        Instant updatedAt) {

    /**
     * Compact constructor validating mandatory non-null fields.
     *
     * @throws NullPointerException     if any mandatory field is null
     * @throws IllegalArgumentException if {@code lastKnownSourceFileName} is blank
     */
    public DocumentRecord {
        Objects.requireNonNull(fingerprint, "fingerprint must not be null");
        Objects.requireNonNull(lastKnownSourceLocator, "lastKnownSourceLocator must not be null");
        Objects.requireNonNull(lastKnownSourceFileName, "lastKnownSourceFileName must not be null");
        if (lastKnownSourceFileName.isBlank()) {
            throw new IllegalArgumentException("lastKnownSourceFileName must not be blank");
        }
        Objects.requireNonNull(overallStatus, "overallStatus must not be null");
        Objects.requireNonNull(failureCounters, "failureCounters must not be null");
        Objects.requireNonNull(createdAt, "createdAt must not be null");
        Objects.requireNonNull(updatedAt, "updatedAt must not be null");
    }
}
@@ -0,0 +1,32 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

/**
 * Sealed result type for a document master record lookup via {@link DocumentRecordRepository}.
 * <p>
 * The use case uses this result to make the per-document processing decision in M4
 * without additional assumptions:
 * <ul>
 *   <li>{@link DocumentUnknown}: the fingerprint is not yet in the persistence store;
 *       the document must be processed for the first time.</li>
 *   <li>{@link DocumentKnownProcessable}: a master record exists but the document is
 *       not in a terminal state; normal processing may continue.</li>
 *   <li>{@link DocumentTerminalSuccess}: the document was already processed
 *       successfully; skip with {@link de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus#SKIPPED_ALREADY_PROCESSED}.</li>
 *   <li>{@link DocumentTerminalFinalFailure}: the document has finally failed; skip
 *       with {@link de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus#SKIPPED_FINAL_FAILURE}.</li>
 *   <li>{@link PersistenceLookupTechnicalFailure}: the lookup itself failed due to a
 *       technical infrastructure problem; the document cannot be processed in this run.</li>
 * </ul>
 * <p>
 * <strong>Architecture boundary:</strong> No JDBC, SQLite, or filesystem types appear
 * in this sealed hierarchy or in any of its implementations.
 *
 * @since M4-AP-001
 */
public sealed interface DocumentRecordLookupResult
        permits DocumentUnknown,
                DocumentKnownProcessable,
                DocumentTerminalSuccess,
                DocumentTerminalFinalFailure,
                PersistenceLookupTechnicalFailure {
}
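
A minimal sketch of how a use case might dispatch on this sealed result, assuming Java 21 pattern matching for `switch`. The class `LookupDispatchSketch`, the method `decide`, the nested stand-in types, and the returned decision strings are hypothetical simplifications so the example compiles on its own; the real variants carry a `DocumentRecord` and live in `application.port.out`.

```java
// Hypothetical, simplified stand-ins for the sealed hierarchy shown above.
final class LookupDispatchSketch {

    sealed interface LookupResult
            permits Unknown, KnownProcessable, TerminalSuccess, TerminalFinalFailure, TechnicalFailure {
    }

    record Unknown() implements LookupResult {}
    record KnownProcessable(String recordInfo) implements LookupResult {}
    record TerminalSuccess(String recordInfo) implements LookupResult {}
    record TerminalFinalFailure(String recordInfo) implements LookupResult {}
    record TechnicalFailure(String errorMessage) implements LookupResult {}

    /** Maps each lookup outcome to an illustrative per-document decision label. */
    static String decide(LookupResult result) {
        // No default branch: the compiler rejects this switch if a permitted variant is missing.
        return switch (result) {
            case Unknown u -> "CREATE_AND_PROCESS";
            case KnownProcessable k -> "PROCESS";
            case TerminalSuccess s -> "SKIPPED_ALREADY_PROCESSED";
            case TerminalFinalFailure f -> "SKIPPED_FINAL_FAILURE";
            case TechnicalFailure t -> "TRANSIENT_FAILURE_NO_ATTEMPT_WRITTEN";
        };
    }
}
```

Because the interface is sealed, adding a sixth variant later would turn every such `switch` into a compile error until the new case is handled, which is presumably why read failures are modelled as a result variant rather than an exception.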
@@ -0,0 +1,72 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;

/**
 * Outbound port for reading and writing the document master record (Dokument-Stammsatz).
 * <p>
 * One master record exists per unique {@link DocumentFingerprint}. The repository is
 * responsible for the persistence of {@link DocumentRecord} values; it holds no
 * business logic about retry rules, skip decisions, or status transitions.
 * <p>
 * <strong>Lookup semantics:</strong>
 * {@link #findByFingerprint(DocumentFingerprint)} returns a sealed
 * {@link DocumentRecordLookupResult} that allows the use case to distinguish exhaustively
 * between an unknown document, a known processable document, a terminal success, a
 * terminal final failure, and a technical persistence failure, without additional
 * assumptions or null checks.
 * <p>
 * <strong>Write semantics:</strong>
 * <ul>
 *   <li>{@link #create(DocumentRecord)} inserts a new record for a previously unknown
 *       document.</li>
 *   <li>{@link #update(DocumentRecord)} replaces the mutable fields of an existing
 *       record identified by its fingerprint.</li>
 * </ul>
 * Both write methods throw {@link DocumentPersistenceException} on technical failure.
 * <p>
 * <strong>Architecture boundary:</strong> No JDBC, SQLite, or filesystem types appear
 * in this interface or in any type it references. Mapping to and from the persistence
 * schema is the exclusive responsibility of the adapter implementation.
 *
 * @since M4-AP-001
 */
public interface DocumentRecordRepository {

    /**
     * Looks up the master record for the given fingerprint.
     * <p>
     * Returns a {@link DocumentRecordLookupResult} that encodes all possible outcomes
     * including technical failures; this method never throws.
     *
     * @param fingerprint the content-based document identity to look up; must not be null
     * @return {@link DocumentUnknown} if no record exists,
     *         {@link DocumentKnownProcessable} if the document is known but not terminal,
     *         {@link DocumentTerminalSuccess} if the document succeeded,
     *         {@link DocumentTerminalFinalFailure} if the document finally failed, or
     *         {@link PersistenceLookupTechnicalFailure} if the lookup itself failed
     */
    DocumentRecordLookupResult findByFingerprint(DocumentFingerprint fingerprint);

    /**
     * Persists a new master record for a previously unknown document.
     * <p>
     * The fingerprint within {@code record} must not yet exist in the persistence store.
     *
     * @param record the new master record to persist; must not be null
     * @throws DocumentPersistenceException if the insert fails due to a technical error
     */
    void create(DocumentRecord record);

    /**
     * Updates the mutable fields of an existing master record.
     * <p>
     * The record is identified by its {@link DocumentFingerprint}; the fingerprint
     * itself is never changed. Mutable fields include the overall status, failure
     * counters, last known source location, and all timestamp fields.
     *
     * @param record the updated master record; must not be null; fingerprint must exist
     * @throws DocumentPersistenceException if the update fails due to a technical error
     */
    void update(DocumentRecord record);
}
@@ -0,0 +1,30 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import java.util.Objects;

/**
 * Lookup result indicating that the document has finally and irrecoverably failed.
 * <p>
 * The master record's overall status is
 * {@link de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus#FAILED_FINAL}.
 * The use case must skip further processing and historise a
 * {@link de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus#SKIPPED_FINAL_FAILURE}
 * attempt. No failure counters are changed.
 * <p>
 * The existing {@link DocumentRecord} is supplied so the use case can read the
 * current record for the skip attempt historisation without an additional lookup.
 *
 * @param record the current (finally failed) master record for this document; never null
 * @since M4-AP-001
 */
public record DocumentTerminalFinalFailure(DocumentRecord record) implements DocumentRecordLookupResult {

    /**
     * Compact constructor validating the non-null contract.
     *
     * @throws NullPointerException if {@code record} is null
     */
    public DocumentTerminalFinalFailure {
        Objects.requireNonNull(record, "record must not be null");
    }
}
@@ -0,0 +1,30 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import java.util.Objects;

/**
 * Lookup result indicating that the document was already successfully processed.
 * <p>
 * The master record's overall status is
 * {@link de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus#SUCCESS}.
 * The use case must skip further processing and historise a
 * {@link de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus#SKIPPED_ALREADY_PROCESSED}
 * attempt. No failure counters are changed.
 * <p>
 * The existing {@link DocumentRecord} is supplied so the use case can read the
 * current record for the skip attempt historisation without an additional lookup.
 *
 * @param record the current (successful) master record for this document; never null
 * @since M4-AP-001
 */
public record DocumentTerminalSuccess(DocumentRecord record) implements DocumentRecordLookupResult {

    /**
     * Compact constructor validating the non-null contract.
     *
     * @throws NullPointerException if {@code record} is null
     */
    public DocumentTerminalSuccess {
        Objects.requireNonNull(record, "record must not be null");
    }
}
@@ -0,0 +1,14 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

/**
 * Lookup result indicating that the fingerprint is not yet present in the persistence store.
 * <p>
 * The document has never been processed before. The use case must create a new
 * {@link DocumentRecord} and proceed with normal M4 processing.
 * <p>
 * This variant carries no data because there is no existing record to return.
 *
 * @since M4-AP-001
 */
public record DocumentUnknown() implements DocumentRecordLookupResult {
}
@@ -0,0 +1,75 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

/**
 * Immutable snapshot of the two independent failure counters maintained per document.
 * <p>
 * M4 tracks two distinct counters separately because they drive different retry rules:
 * <ul>
 *   <li><strong>Content error counter</strong> ({@link #contentErrorCount()}):
 *       counts how many times a deterministic content error occurred for this document
 *       (no usable text, page limit exceeded). At count 1 the document is
 *       {@code FAILED_RETRYABLE}; at count 2 it becomes {@code FAILED_FINAL}.
 *       Skip events do <em>not</em> increase this counter.</li>
 *   <li><strong>Transient error counter</strong> ({@link #transientErrorCount()}):
 *       counts how many times a technical infrastructure error occurred after the
 *       fingerprint was successfully computed. The document remains
 *       {@code FAILED_RETRYABLE}; a configurable maximum is introduced in later
 *       milestones. Skip events do <em>not</em> increase this counter.</li>
 * </ul>
 * <p>
 * A freshly discovered document starts with both counters at zero.
 * Counters are only written by the repository layer on the instructions of the
 * application use case; they never change as a side-effect of a read operation.
 *
 * @param contentErrorCount   number of deterministic content errors recorded so far;
 *                            must be >= 0
 * @param transientErrorCount number of transient technical errors recorded so far;
 *                            must be >= 0
 * @since M4-AP-001
 */
public record FailureCounters(int contentErrorCount, int transientErrorCount) {

    /**
     * Compact constructor validating that neither counter is negative.
     *
     * @throws IllegalArgumentException if either counter is negative
     */
    public FailureCounters {
        if (contentErrorCount < 0) {
            throw new IllegalArgumentException(
                    "contentErrorCount must be >= 0, but was: " + contentErrorCount);
        }
        if (transientErrorCount < 0) {
            throw new IllegalArgumentException(
                    "transientErrorCount must be >= 0, but was: " + transientErrorCount);
        }
    }

    /**
     * Returns a {@code FailureCounters} instance with both counters at zero.
     * Use this when initialising a master record for a newly discovered document.
     *
     * @return zero-value counters
     */
    public static FailureCounters zero() {
        return new FailureCounters(0, 0);
    }

    /**
     * Returns a copy with the content error counter incremented by one.
     *
     * @return new instance with {@code contentErrorCount + 1}
     */
    public FailureCounters withIncrementedContentErrorCount() {
        return new FailureCounters(contentErrorCount + 1, transientErrorCount);
    }

    /**
     * Returns a copy with the transient error counter incremented by one.
     *
     * @return new instance with {@code transientErrorCount + 1}
     */
    public FailureCounters withIncrementedTransientErrorCount() {
        return new FailureCounters(contentErrorCount, transientErrorCount + 1);
    }
}
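
The content error rule described in the Javadoc (first occurrence retryable, second occurrence final) could be applied by a use case roughly as follows. This is only a sketch: `ContentErrorRuleSketch`, `Counters`, `Status`, and `statusAfterContentError` are hypothetical names, and `Counters` is a reduced stand-in for `FailureCounters` so the example compiles on its own.

```java
final class ContentErrorRuleSketch {

    // Reduced stand-in for the FailureCounters record shown above.
    record Counters(int contentErrorCount, int transientErrorCount) {
        Counters withIncrementedContentErrorCount() {
            return new Counters(contentErrorCount + 1, transientErrorCount);
        }
    }

    // Hypothetical subset of ProcessingStatus, limited to the two failure outcomes.
    enum Status { FAILED_RETRYABLE, FAILED_FINAL }

    /**
     * Derives the overall status from counters that were already incremented
     * for the content error just observed.
     */
    static Status statusAfterContentError(Counters updated) {
        // Count 1 stays retryable; count 2 (or more) is final.
        return updated.contentErrorCount() >= 2 ? Status.FAILED_FINAL : Status.FAILED_RETRYABLE;
    }
}
```

Keeping the counters immutable means the use case computes the next snapshot first and then hands it to the repository, so a failed write never leaves a half-updated counter in memory.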
@@ -0,0 +1,40 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;

/**
 * Outbound port for computing the content-based fingerprint of exactly one
 * processing candidate.
 * <p>
 * Implementations must derive the fingerprint <em>exclusively</em> from the binary
 * content of the file referenced by the candidate. File name, path, and metadata must
 * not influence the result.
 * <p>
 * <strong>Architecture boundary:</strong> All hashing logic and file I/O are confined
 * to the {@code adapter-out} implementation. This interface exposes no
 * {@code java.nio.file.Path}, {@code java.io.File}, or cryptographic types to Domain
 * or Application.
 * <p>
 * <strong>Failure semantics:</strong> Technical failures (unreadable file, I/O error)
 * are returned as {@link FingerprintTechnicalError} rather than thrown as exceptions.
 * A {@link FingerprintTechnicalError} result means no
 * {@link de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint} is available
 * and the candidate cannot be identified; consequently no SQLite attempt record is
 * created for this candidate in M4.
 *
 * @since M4-AP-001
 */
public interface FingerprintPort {

    /**
     * Computes the fingerprint for the given candidate.
     * <p>
     * This method never throws. All outcomes, including technical failures, are
     * encoded in the returned {@link FingerprintResult}.
     *
     * @param candidate the candidate whose file content is to be hashed; must not be null
     * @return {@link FingerprintSuccess} on success, or {@link FingerprintTechnicalError}
     *         on any infrastructure failure
     */
    FingerprintResult computeFingerprint(SourceDocumentCandidate candidate);
}
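
On the adapter side, the "never throws" contract might be met along the following lines. This is a sketch under stated assumptions, not the project's adapter: `Sha256FingerprintSketch`, `compute`, and the nested result types are hypothetical stand-ins, the fingerprint is represented as a hex string, and a `java.nio.file.Path` is used directly because this code would sit in `adapter-out`, behind the port. Only JDK classes (`MessageDigest`, `Files`, `HexFormat`, Java 17+) are used.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

final class Sha256FingerprintSketch {

    // Hypothetical stand-ins for FingerprintResult and its two variants.
    sealed interface Result permits Success, TechnicalError {}
    record Success(String sha256Hex) implements Result {}
    record TechnicalError(String errorMessage, Throwable cause) implements Result {}

    /** Hashes only the file's bytes; name, path, and metadata never enter the digest. */
    static Result compute(Path file) {
        try (InputStream in = Files.newInputStream(file)) {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
            return new Success(HexFormat.of().formatHex(digest.digest()));
        } catch (IOException | NoSuchAlgorithmException | RuntimeException e) {
            // Every failure becomes a result value, so the caller never sees an exception.
            return new TechnicalError("Fingerprint computation failed: " + e, e);
        }
    }
}
```

Two files with identical bytes but different names produce the same `Success` value here, which is the property the port's contract asks for.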
@@ -0,0 +1,20 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

/**
 * Sealed result type for a fingerprint computation attempt via {@link FingerprintPort}.
 * <p>
 * Exhaustive variants:
 * <ul>
 *   <li>{@link FingerprintSuccess}: fingerprint computed successfully.</li>
 *   <li>{@link FingerprintTechnicalError}: fingerprint computation failed due to a
 *       technical infrastructure problem (e.g. I/O error, file no longer accessible).</li>
 * </ul>
 * <p>
 * <strong>Historisation impact:</strong> If the result is {@link FingerprintTechnicalError},
 * the document cannot be identified and <em>no</em> SQLite attempt record is created.
 * The failure is treated as a non-identifiable run event.
 *
 * @since M4-AP-001
 */
public sealed interface FingerprintResult permits FingerprintSuccess, FingerprintTechnicalError {
}
@@ -0,0 +1,27 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;

import java.util.Objects;

/**
 * Successful outcome of a fingerprint computation.
 * <p>
 * Carries the computed {@link DocumentFingerprint} that uniquely identifies the
 * document by its content. The fingerprint can now be used as the primary key
 * for all subsequent persistence operations in M4.
 *
 * @param fingerprint the successfully computed fingerprint; never null
 * @since M4-AP-001
 */
public record FingerprintSuccess(DocumentFingerprint fingerprint) implements FingerprintResult {

    /**
     * Compact constructor validating the non-null contract.
     *
     * @throws NullPointerException if {@code fingerprint} is null
     */
    public FingerprintSuccess {
        Objects.requireNonNull(fingerprint, "fingerprint must not be null");
    }
}
@@ -0,0 +1,34 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import java.util.Objects;

/**
 * Technical failure during fingerprint computation.
 * <p>
 * Returned by {@link FingerprintPort} when the adapter cannot read the file content
 * to compute the SHA-256 hash. Typical causes include the file no longer being
 * accessible between candidate discovery and hashing, I/O errors, or permission issues.
 * <p>
 * <strong>Historisation impact:</strong> Because no {@link de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint}
 * could be produced, this failure is <em>not</em> historised in SQLite. No
 * {@link ProcessingAttempt} is created.
 *
 * @param errorMessage human-readable description of the failure; never null or blank
 * @param cause        the underlying throwable, or {@code null} if not available
 * @since M4-AP-001
 */
public record FingerprintTechnicalError(String errorMessage, Throwable cause) implements FingerprintResult {

    /**
     * Compact constructor validating the error message.
     *
     * @throws NullPointerException     if {@code errorMessage} is null
     * @throws IllegalArgumentException if {@code errorMessage} is blank
     */
    public FingerprintTechnicalError {
        Objects.requireNonNull(errorMessage, "errorMessage must not be null");
        if (errorMessage.isBlank()) {
            throw new IllegalArgumentException("errorMessage must not be blank");
        }
    }
}
@@ -0,0 +1,36 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import java.util.Objects;

/**
 * Lookup result indicating that the master record lookup itself failed due to a
 * technical infrastructure problem.
 * <p>
 * The persistence layer (SQLite) could not be reached or returned an unexpected error.
 * The document state is unknown; the use case must treat this candidate as a
 * transient technical failure for this run and must not attempt to write any attempt
 * record (since the underlying persistence is unavailable).
 * <p>
 * This variant is distinct from a business-level "document not found" outcome
 * ({@link DocumentUnknown}): here, the lookup operation itself failed.
 *
 * @param errorMessage human-readable description of the persistence failure; never null or blank
 * @param cause        the underlying throwable, or {@code null} if not available
 * @since M4-AP-001
 */
public record PersistenceLookupTechnicalFailure(String errorMessage, Throwable cause)
        implements DocumentRecordLookupResult {

    /**
     * Compact constructor validating the error message.
     *
     * @throws NullPointerException     if {@code errorMessage} is null
     * @throws IllegalArgumentException if {@code errorMessage} is blank
     */
    public PersistenceLookupTechnicalFailure {
        Objects.requireNonNull(errorMessage, "errorMessage must not be null");
        if (errorMessage.isBlank()) {
            throw new IllegalArgumentException("errorMessage must not be blank");
        }
    }
}
@@ -0,0 +1,40 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

/**
 * Outbound port for initialising the SQLite persistence schema at program startup.
 * <p>
 * This port is invoked exactly once per program run, <em>before</em> the batch
 * document processing loop begins. The initialisation must ensure that all tables,
 * indices, and constraints required for M4 persistence are present in the SQLite file.
 * <p>
 * <strong>Timing:</strong> The adapter implementation must perform the schema
 * initialisation eagerly and synchronously. Lazy or deferred initialisation during
 * the document processing loop is not the intent of this port.
 * <p>
 * <strong>Failure handling:</strong> If the schema cannot be initialised, the
 * implementation must throw {@link DocumentPersistenceException}. The bootstrap
 * layer must catch this exception and abort the run with exit code 1.
 * <p>
 * <strong>Idempotency:</strong> Calling {@link #initializeSchema()} on a database
 * that already has the correct schema must succeed without error (e.g. via
 * {@code CREATE TABLE IF NOT EXISTS} semantics).
 * <p>
 * <strong>Architecture boundary:</strong> No JDBC, SQLite, or filesystem types appear
 * in this interface. All schema DDL and connection management are confined to the
 * {@code adapter-out} implementation.
 *
 * @since M4-AP-001
 */
public interface PersistenceSchemaInitializationPort {

    /**
     * Creates or verifies the M4 persistence schema.
     * <p>
     * Must be called once at program start, before any document processing begins.
     * The method must be idempotent: calling it on an already-initialised database
     * must not fail or alter existing data.
     *
     * @throws DocumentPersistenceException if the schema cannot be created or verified
     */
    void initializeSchema();
}
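
The failure handling described for the bootstrap layer might look roughly like this. The sketch uses hypothetical stand-ins (`SchemaBootstrapSketch`, `PersistenceException`, `SchemaInitializationPort`, `run`) instead of the real types, models the batch loop as a plain `Runnable`, and assumes exit code 0 for a normal run, which the source does not state.

```java
final class SchemaBootstrapSketch {

    // Stand-in for DocumentPersistenceException.
    static class PersistenceException extends RuntimeException {
        PersistenceException(String message) {
            super(message);
        }
    }

    // Stand-in for PersistenceSchemaInitializationPort.
    interface SchemaInitializationPort {
        void initializeSchema();
    }

    /** Returns the process exit code; 1 means the run was aborted before the batch loop. */
    static int run(SchemaInitializationPort schemaPort, Runnable batchLoop) {
        try {
            // Eager and synchronous: runs before any document is touched.
            schemaPort.initializeSchema();
        } catch (PersistenceException e) {
            System.err.println("Schema initialisation failed: " + e.getMessage());
            return 1; // abort; the batch loop never starts
        }
        batchLoop.run();
        return 0; // assumed success code, not specified by the port contract
    }
}
```

A caller would typically pass the result to `System.exit`, keeping the exit decision in one place at the outermost layer.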
@@ -0,0 +1,88 @@
|
||||
package de.gecheckt.pdf.umbenenner.application.port.out;
|
||||
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.ProcessingStatus;
|
||||
import de.gecheckt.pdf.umbenenner.domain.model.RunId;
|
||||
|
||||
import java.time.Instant;
|
||||
import java.util.Objects;
|
||||
|
||||
/**
|
||||
 * Application-facing representation of exactly one historised processing attempt
 * (Versuchshistorie-Eintrag) for an identified document.
 * <p>
 * <strong>Historisation boundary (M4):</strong> Only attempts for documents whose
 * {@link DocumentFingerprint} was successfully computed are historised. Failures that
 * occur <em>before</em> the fingerprint is available (e.g. the source file is
 * unreadable before hashing) are <em>not</em> represented by a {@code ProcessingAttempt}
 * and are <em>not</em> written to SQLite.
 * <p>
 * <strong>Attempt number semantics:</strong> The attempt number starts at 1 for the
 * first historised attempt per fingerprint and increases monotonically by 1 for every
 * subsequent attempt, including skip attempts
 * ({@link ProcessingStatus#SKIPPED_ALREADY_PROCESSED},
 * {@link ProcessingStatus#SKIPPED_FINAL_FAILURE}).
 * <p>
 * <strong>Field semantics:</strong>
 * <ul>
 *   <li>{@link #fingerprint()} — foreign key to the document master record.</li>
 *   <li>{@link #runId()} — identifies the batch run during which this attempt occurred.</li>
 *   <li>{@link #attemptNumber()} — monotonically increasing per fingerprint; assigned
 *       before the attempt is recorded.</li>
 *   <li>{@link #startedAt()} — wall-clock timestamp when processing of this candidate
 *       began in this run.</li>
 *   <li>{@link #endedAt()} — wall-clock timestamp when processing completed (success,
 *       failure, or skip).</li>
 *   <li>{@link #status()} — outcome status of this specific attempt.</li>
 *   <li>{@link #failureClass()} — short classification of the failure (e.g. enum constant
 *       name or exception class name); {@code null} for successful or skip attempts.</li>
 *   <li>{@link #failureMessage()} — human-readable failure description; {@code null} for
 *       successful or skip attempts.</li>
 *   <li>{@link #retryable()} — {@code true} if the failure is considered retryable in a
 *       later run; {@code false} for final failures, successes, and skip attempts.</li>
 * </ul>
 * <p>
 * <strong>Not included in M4:</strong> model name, prompt identifier, AI raw response,
 * AI reasoning, resolved date, date source, final title, final target file name.
 * These fields are added in later milestones (M5+).
 *
 * @param fingerprint    content-based document identity; never null
 * @param runId          identifier of the batch run; never null
 * @param attemptNumber  monotonic sequence number per fingerprint; must be {@code >= 1}
 * @param startedAt      start of this processing attempt; never null
 * @param endedAt        end of this processing attempt; never null
 * @param status         outcome status of this attempt; never null
 * @param failureClass   failure classification, or {@code null} for non-failure statuses
 * @param failureMessage failure description, or {@code null} for non-failure statuses
 * @param retryable      whether this failure should be retried in a later run;
 *                       {@code false} for non-failure statuses
 * @since M4-AP-001
 */
public record ProcessingAttempt(
        DocumentFingerprint fingerprint,
        RunId runId,
        int attemptNumber,
        Instant startedAt,
        Instant endedAt,
        ProcessingStatus status,
        String failureClass,
        String failureMessage,
        boolean retryable) {

    /**
     * Compact constructor validating mandatory non-null fields and numeric constraints.
     *
     * @throws NullPointerException     if any mandatory field is null
     * @throws IllegalArgumentException if {@code attemptNumber} is less than 1
     */
    public ProcessingAttempt {
        Objects.requireNonNull(fingerprint, "fingerprint must not be null");
        Objects.requireNonNull(runId, "runId must not be null");
        if (attemptNumber < 1) {
            throw new IllegalArgumentException(
                    "attemptNumber must be >= 1, but was: " + attemptNumber);
        }
        Objects.requireNonNull(startedAt, "startedAt must not be null");
        Objects.requireNonNull(endedAt, "endedAt must not be null");
        Objects.requireNonNull(status, "status must not be null");
    }
}
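The validation contract of the record above can be exercised in isolation. The following is a minimal sketch, not part of the commit: `DocumentFingerprint`, `RunId` and `ProcessingStatus` are reduced, hypothetical stand-ins for the real domain types (only the four status constants mentioned in this changeset are listed), and `skipAlreadyProcessed` is an illustrative helper that does not exist in the code base.

```java
import java.time.Instant;
import java.util.Objects;

public class ProcessingAttemptSketch {

    // Hypothetical stand-ins: the real domain types are not shown in this section.
    public record DocumentFingerprint(String sha256Hex) {}
    public record RunId(String value) {}
    public enum ProcessingStatus {
        SUCCESS, FAILED_FINAL, SKIPPED_ALREADY_PROCESSED, SKIPPED_FINAL_FAILURE
    }

    // Same components and validation as the record above.
    public record ProcessingAttempt(
            DocumentFingerprint fingerprint, RunId runId, int attemptNumber,
            Instant startedAt, Instant endedAt, ProcessingStatus status,
            String failureClass, String failureMessage, boolean retryable) {

        public ProcessingAttempt {
            Objects.requireNonNull(fingerprint, "fingerprint must not be null");
            Objects.requireNonNull(runId, "runId must not be null");
            if (attemptNumber < 1) {
                throw new IllegalArgumentException(
                        "attemptNumber must be >= 1, but was: " + attemptNumber);
            }
            Objects.requireNonNull(startedAt, "startedAt must not be null");
            Objects.requireNonNull(endedAt, "endedAt must not be null");
            Objects.requireNonNull(status, "status must not be null");
        }
    }

    // Hypothetical helper: a skip attempt carries no failure details and is not retryable.
    public static ProcessingAttempt skipAlreadyProcessed(
            DocumentFingerprint fingerprint, RunId runId, int attemptNumber,
            Instant startedAt, Instant endedAt) {
        return new ProcessingAttempt(fingerprint, runId, attemptNumber, startedAt, endedAt,
                ProcessingStatus.SKIPPED_ALREADY_PROCESSED, null, null, false);
    }

    public static void main(String[] args) {
        DocumentFingerprint fingerprint = new DocumentFingerprint("0123abcd");
        RunId runId = new RunId("run-1");
        Instant now = Instant.now();

        ProcessingAttempt skip = skipAlreadyProcessed(fingerprint, runId, 2, now, now);
        System.out.println(skip.status() + ", retryable=" + skip.retryable());

        try {
            // Attempt numbers below 1 are rejected by the compact constructor.
            skipAlreadyProcessed(fingerprint, runId, 0, now, now);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Note that the nullable components (`failureClass`, `failureMessage`) are deliberately left unchecked, matching the record above: the constructor enforces only the mandatory fields and the lower bound on `attemptNumber`.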
@@ -0,0 +1,70 @@
package de.gecheckt.pdf.umbenenner.application.port.out;

import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;

import java.util.List;

/**
 * Outbound port for writing and reading the processing attempt history
 * (Versuchshistorie).
 * <p>
 * Every historisable processing attempt for an <em>identified</em> document results
 * in exactly one {@link ProcessingAttempt} record written via {@link #save(ProcessingAttempt)}.
 * <p>
 * <strong>Historisation boundary:</strong> Only attempts with a successfully computed
 * {@link de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint} are historised.
 * Failures that occur before the fingerprint is available are <em>not</em> recorded
 * through this port.
 * <p>
 * <strong>Attempt number semantics:</strong>
 * Attempt numbers start at 1 per fingerprint and increase monotonically by 1
 * for every saved attempt, including skip attempts. The use case calls
 * {@link #loadNextAttemptNumber(DocumentFingerprint)} to obtain the correct sequence
 * number before constructing a {@link ProcessingAttempt}.
 * <p>
 * <strong>Architecture boundary:</strong> No JDBC, SQLite, or filesystem types appear
 * in this interface. Mapping to and from the persistence schema is the exclusive
 * responsibility of the adapter implementation.
 *
 * @since M4-AP-001
 */
public interface ProcessingAttemptRepository {

    /**
     * Returns the attempt number to assign to the <em>next</em> attempt for the given
     * fingerprint.
     * <p>
     * If no prior attempts exist for the fingerprint, returns 1.
     * Otherwise returns the current maximum attempt number plus 1.
     *
     * @param fingerprint the document identity; must not be null
     * @return the next monotonic attempt number; always {@code >= 1}
     * @throws DocumentPersistenceException if the query fails due to a technical error
     */
    int loadNextAttemptNumber(DocumentFingerprint fingerprint);

    /**
     * Persists exactly one processing attempt record.
     * <p>
     * The {@link ProcessingAttempt#attemptNumber()} must have been obtained from
     * {@link #loadNextAttemptNumber(DocumentFingerprint)} in the same run to guarantee
     * monotonic ordering.
     *
     * @param attempt the attempt to persist; must not be null
     * @throws DocumentPersistenceException if the insert fails due to a technical error
     */
    void save(ProcessingAttempt attempt);

    /**
     * Returns all historised attempts for the given fingerprint, ordered by
     * {@link ProcessingAttempt#attemptNumber()} ascending.
     * <p>
     * Returns an empty list if no attempts have been recorded yet.
     * Intended for use in tests and diagnostics; not required on the primary batch path.
     *
     * @param fingerprint the document identity; must not be null
     * @return immutable list of attempts, ordered by attempt number; never null
     * @throws DocumentPersistenceException if the query fails due to a technical error
     */
    List<ProcessingAttempt> findAllByFingerprint(DocumentFingerprint fingerprint);
}
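The attempt-number contract described above (1 for the first attempt, then maximum plus 1) can be illustrated with an in-memory stand-in. This is a sketch under stated assumptions, not the SQLite adapter: `InMemoryAttemptHistorySketch` is a hypothetical class, the two nested records are reduced stand-ins that carry only the fields needed here, and the sketch does not implement the real interface or throw `DocumentPersistenceException`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class InMemoryAttemptHistorySketch {

    // Hypothetical, reduced stand-ins; the real types carry more fields.
    public record DocumentFingerprint(String sha256Hex) {}
    public record ProcessingAttempt(DocumentFingerprint fingerprint, int attemptNumber) {}

    private final Map<DocumentFingerprint, List<ProcessingAttempt>> history = new HashMap<>();

    // Returns 1 when nothing is stored for the fingerprint, otherwise max + 1.
    public int loadNextAttemptNumber(DocumentFingerprint fingerprint) {
        Objects.requireNonNull(fingerprint, "fingerprint must not be null");
        return history.getOrDefault(fingerprint, List.of()).stream()
                .mapToInt(ProcessingAttempt::attemptNumber)
                .max()
                .orElse(0) + 1;
    }

    // Appends one attempt to the history of its fingerprint.
    public void save(ProcessingAttempt attempt) {
        Objects.requireNonNull(attempt, "attempt must not be null");
        history.computeIfAbsent(attempt.fingerprint(), key -> new ArrayList<>()).add(attempt);
    }

    // Returns an unmodifiable copy ordered by attempt number; empty if nothing is stored.
    public List<ProcessingAttempt> findAllByFingerprint(DocumentFingerprint fingerprint) {
        Objects.requireNonNull(fingerprint, "fingerprint must not be null");
        return history.getOrDefault(fingerprint, List.of()).stream()
                .sorted(Comparator.comparingInt(ProcessingAttempt::attemptNumber))
                .toList();
    }

    public static void main(String[] args) {
        InMemoryAttemptHistorySketch repository = new InMemoryAttemptHistorySketch();
        DocumentFingerprint fingerprint = new DocumentFingerprint("0123abcd");

        // Use-case order: obtain the number first, then construct and save the attempt.
        for (int run = 0; run < 3; run++) {
            int next = repository.loadNextAttemptNumber(fingerprint);
            repository.save(new ProcessingAttempt(fingerprint, next));
        }
        System.out.println(repository.findAllByFingerprint(fingerprint));
    }
}
```

The sketch makes one limitation of the contract visible: obtaining the number and saving the attempt are two separate calls, so monotonic ordering relies on a single writer per fingerprint. Whether the run lock from M2 is what provides that guarantee is an assumption here, not something this section states.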
@@ -22,12 +22,40 @@
 *       — Extract text content and page count from a single PDF</li>
 * </ul>
 * <p>
 * M4-AP-001 ports:
 * <ul>
 *   <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.FingerprintPort}
 *       — Compute the content-based SHA-256 fingerprint of a processing candidate</li>
 *   <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.DocumentRecordRepository}
 *       — Read and write the document master record (Dokument-Stammsatz)</li>
 *   <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.ProcessingAttemptRepository}
 *       — Write and read the per-document attempt history (Versuchshistorie)</li>
 *   <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.PersistenceSchemaInitializationPort}
 *       — Initialise the SQLite schema at program startup</li>
 * </ul>
 * <p>
 * M4-AP-001 value types and result types:
 * <ul>
 *   <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.FailureCounters}
 *       — Immutable snapshot of content-error and transient-error counters per document</li>
 *   <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.DocumentRecord}
 *       — Application-facing representation of the document master record</li>
 *   <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.ProcessingAttempt}
 *       — Application-facing representation of one historised processing attempt</li>
 *   <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.FingerprintResult}
 *       — Sealed result of a fingerprint computation (success or technical error)</li>
 *   <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.DocumentRecordLookupResult}
 *       — Sealed result of a master record lookup (unknown / processable / terminal / failure)</li>
 * </ul>
 * <p>
 * Exception types:
 * <ul>
 *   <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.RunLockUnavailableException}
 *       — Thrown when the run lock cannot be acquired because another instance is running (M2)</li>
 *   <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.SourceDocumentAccessException}
 *       — Thrown when the source folder cannot be read or accessed (M3)</li>
 *   <li>{@link de.gecheckt.pdf.umbenenner.application.port.out.DocumentPersistenceException}
 *       — Thrown when a persistence operation or schema initialisation fails (M4)</li>
 * </ul>
 * <p>
 * Architecture Rule: Outbound ports are implementation-agnostic and contain no business logic.