Follow-up: removed milestone references from production JavaDoc and
package-info
@@ -34,8 +34,6 @@ import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;
  * <strong>Technical failure handling:</strong> Any I/O errors, path resolution issues,
  * or cryptographic problems are converted to {@link FingerprintTechnicalError} results
  * without throwing exceptions. Pre-fingerprint failures are not historized in SQLite.
- *
- * @since M4-AP-002
  */
  public class Sha256FingerprintAdapter implements FingerprintPort {

@@ -7,6 +7,5 @@
  * <p>All file I/O and cryptographic operations are strictly confined to this adapter layer,
  * maintaining the hexagonal architecture boundary.
  *
- * @since M4-AP-002
  */
  package de.gecheckt.pdf.umbenenner.adapter.out.fingerprint;
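For orientation, the behaviour described in the Sha256FingerprintAdapter JavaDoc (technical failures become result values instead of exceptions) could look roughly like the following. This is a minimal sketch under stated assumptions, not the project's code: `FingerprintSketch`, `Result`, `Ok`, and `Failed` are hypothetical stand-ins for `FingerprintPort` and `FingerprintTechnicalError`, whose real shapes are not shown in this diff. It needs Java 17 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Hypothetical sketch; all type names here are assumptions, not the project's API.
final class FingerprintSketch {

    // Typed outcome: callers branch on the result type instead of catching exceptions.
    sealed interface Result permits Ok, Failed {}

    record Ok(String sha256Hex) implements Result {}

    record Failed(String message, Throwable cause) implements Result {}

    private FingerprintSketch() {}

    static Result fingerprint(Path file) {
        try (InputStream in = Files.newInputStream(file)) {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] buffer = new byte[8192];
            int read;
            // Stream the file so large PDFs are never held fully in memory.
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
            return new Ok(HexFormat.of().formatHex(digest.digest()));
        } catch (IOException | NoSuchAlgorithmException | RuntimeException e) {
            // I/O, path, and crypto problems all become a result value.
            return new Failed("Fingerprint failed for " + file + ": " + e.getMessage(), e);
        }
    }
}
```

A caller would then switch on `Ok` versus `Failed` and never needs a try/catch around the fingerprint step.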
@@ -4,10 +4,10 @@
  * Components:
  * <ul>
  * <li>{@link de.gecheckt.pdf.umbenenner.adapter.out.lock.FilesystemRunLockPortAdapter}
- * — File-based run lock that prevents concurrent instances (AP-006)</li>
+ * — File-based run lock that prevents concurrent instances</li>
  * </ul>
  * <p>
- * AP-006: Uses atomic file creation ({@code CREATE_NEW}) to establish an exclusive lock.
+ * Implementation details: Uses atomic file creation ({@code CREATE_NEW}) to establish an exclusive lock.
  * Stores the acquiring process PID in the lock file for diagnostics.
  * Release is best-effort and logs a warning on failure without throwing.
  */
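The locking scheme described in this package-info (atomic `CREATE_NEW`, PID stored in the lock file, best-effort release) can be sketched as below. This is an illustration only, under assumptions: `RunLockSketch` and its method names are hypothetical, the boolean return value is a simplification of whatever the real port reports, and `System.err` stands in for the adapter's logger.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; not the FilesystemRunLockPortAdapter from the project.
final class RunLockSketch {

    private final Path lockFile;

    RunLockSketch(Path lockFile) {
        this.lockFile = lockFile;
    }

    /** Returns true if the lock was acquired, false if a lock file already exists. */
    boolean tryAcquire() {
        byte[] pid = Long.toString(ProcessHandle.current().pid())
                .getBytes(StandardCharsets.UTF_8);
        try {
            // CREATE_NEW makes "check that the file is absent" and "create it" one atomic step.
            Files.write(lockFile, pid, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // another instance holds the lock
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Best-effort release: a failure is reported as a warning, never thrown. */
    void release() {
        try {
            Files.deleteIfExists(lockFile);
        } catch (IOException e) {
            System.err.println("WARN: could not remove lock file " + lockFile + ": " + e.getMessage());
        }
    }
}
```

Because the existence check and the creation happen in a single file-system call, two processes starting at the same moment cannot both succeed in `tryAcquire()`.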
@@ -19,7 +19,7 @@ import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
  /**
  * PDFBox-based implementation of {@link PdfTextExtractionPort}.
  * <p>
- * AP-003 Implementation: Extracts text content and page count from a single PDF document
+ * Extracts text content and page count from a single PDF document
  * using Apache PDFBox. All technical problems during extraction are reported as
  * {@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError}.
  * <p>
@@ -29,7 +29,7 @@ import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
  * <li>Extracts complete text from all pages (may be empty)</li>
  * <li>Counts total page count</li>
  * <li>Returns results as typed {@link PdfExtractionResult} (no exceptions thrown)</li>
- * <li>All extraction failures are treated as technical errors (AP-003 scope)</li>
+ * <li>All extraction failures are treated as technical errors</li>
  * <li>PDFBox is encapsulated and never exposed beyond this adapter</li>
  * </ul>
  * <p>
@@ -41,7 +41,7 @@ import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
  * <li>All three values are combined into {@link PdfExtractionSuccess}</li>
  * </ul>
  * <p>
- * Technical error cases (AP-003):
+ * Technical error cases:
  * <ul>
  * <li>File not found or unreadable</li>
  * <li>PDF cannot be loaded by PDFBox (any load error)</li>
@@ -49,14 +49,12 @@ import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentCandidate;
  * <li>Text extraction fails or throws exception</li>
  * </ul>
  * <p>
- * Non-goals (handled in later APs):
+ * Out of scope (handled elsewhere):
  * <ul>
- * <li>Fachliche Bewertung des extrahierten Texts (AP-004)</li>
- * <li>Page limit checking (AP-004)</li>
+ * <li>Fachliche Bewertung des extrahierten Texts</li>
+ * <li>Page limit checking</li>
  * <li>Text normalization or preprocessing</li>
  * </ul>
- *
- * @since M3-AP-003
  */
  public class PdfTextExtractionPortAdapter implements PdfTextExtractionPort {

@@ -68,8 +66,8 @@ public class PdfTextExtractionPortAdapter implements PdfTextExtractionPort {
  * <p>
  * The locator is expected to contain an absolute file path as a String (adapter-internal convention).
  * <p>
- * In M3-AP-003, all technical problems are reported as {@link PdfExtractionTechnicalError}.
- * Fachliche Bewertungen like "text is not usable" are deferred to AP-004.
+ * All technical problems are reported as {@link PdfExtractionTechnicalError}.
+ * Fachliche Bewertungen like "text is not usable" are handled elsewhere.
  *
  * @param candidate the document to extract; must be non-null
  * @return a {@link PdfExtractionResult} encoding the outcome:
@@ -113,12 +111,12 @@ public class PdfTextExtractionPortAdapter implements PdfTextExtractionPort {
  }

  // Extract text from all pages
- // Note: extractedText may be empty string, which is valid in M3 (no fachliche validation here)
+ // Note: extractedText may be empty string, which is valid (no fachliche validation here)
  PDFTextStripper textStripper = new PDFTextStripper();
  String extractedText = textStripper.getText(document);

  // Success: return extracted text and page count
- // (Empty text is not an error in AP-003; fachliche validation is AP-004)
+ // (Empty text is not an error; fachliche validation is handled elsewhere)
  PdfPageCount pageCountTyped = new PdfPageCount(pageCount);
  return new PdfExtractionSuccess(extractedText, pageCountTyped);
  } finally {
@@ -1,11 +1,11 @@
  /**
  * PDFBox-based adapter for PDF text extraction.
  * <p>
- * <strong>M3-AP-003:</strong> This package contains the sole implementation
+ * This package contains the sole implementation
  * of {@link de.gecheckt.pdf.umbenenner.application.port.out.PdfTextExtractionPort},
  * using Apache PDFBox to extract text and page count from PDF documents.
  * <p>
- * <strong>Scope (AP-003):</strong>
+ * <strong>Scope:</strong>
  * <ul>
  * <li>Pure technical extraction: read PDF, extract text, count pages</li>
  * <li>All extraction problems (file not found, PDF unreadable, PDFBox errors) → {@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError}</li>
@@ -14,21 +14,18 @@
  * <li>Results always typed as {@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionResult}, never exceptions</li>
  * </ul>
  * <p>
- * <strong>Restriction:</strong>
+ * <strong>Result types used:</strong>
  * <ul>
- * <li>{@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionContentError} is reserved for later APs</li>
- * <li>AP-003 adapter uses only {@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionSuccess} and
- * {@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError}</li>
+ * <li>{@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionSuccess} for successful text extraction</li>
+ * <li>{@link de.gecheckt.pdf.umbenenner.domain.model.PdfExtractionTechnicalError} for technical problems</li>
  * </ul>
  * <p>
- * <strong>Out of scope (handled in later APs):</strong>
+ * <strong>Out of scope:</strong>
  * <ul>
- * <li>Text validation or quality assessment (AP-004)</li>
- * <li>Page limit checking (AP-004)</li>
+ * <li>Text validation or quality assessment</li>
+ * <li>Page limit checking</li>
  * <li>Text normalization or preprocessing</li>
  * <li>Fachliche Bewertung of extracted content</li>
  * </ul>
- *
- * @since M3-AP-003
  */
  package de.gecheckt.pdf.umbenenner.adapter.out.pdfextraction;
@@ -1,12 +1,10 @@
  /**
  * Source document adapters for discovering and accessing PDF candidates.
  * <p>
- * M3-AP-002 implementations:
+ * Implementations:
  * <ul>
  * <li>{@link de.gecheckt.pdf.umbenenner.adapter.out.sourcedocument.SourceDocumentCandidatesPortAdapter}
  * — File-system based discovery of PDF candidates from the source folder</li>
  * </ul>
- *
- * @since M3-AP-002
  */
  package de.gecheckt.pdf.umbenenner.adapter.out.sourcedocument;
@@ -34,8 +34,6 @@ import de.gecheckt.pdf.umbenenner.domain.model.SourceDocumentLocator;
  * <strong>Architecture boundary:</strong> All JDBC and SQLite details are strictly
  * confined to this class. No JDBC types appear in the port interface or in any
  * application/domain type.
- *
- * @since M4-AP-004
  */
  public class SqliteDocumentRecordRepositoryAdapter implements DocumentRecordRepository {

@@ -28,8 +28,6 @@ import de.gecheckt.pdf.umbenenner.domain.model.DocumentFingerprint;
  * <strong>Architecture boundary:</strong> All JDBC and SQLite details are strictly
  * confined to this class. No JDBC types appear in the port interface or in any
  * application/domain type.
- *
- * @since M4-AP-005
  */
  public class SqliteProcessingAttemptRepositoryAdapter implements ProcessingAttemptRepository {

@@ -15,7 +15,7 @@ import de.gecheckt.pdf.umbenenner.application.port.out.PersistenceSchemaInitiali
  /**
  * SQLite implementation of {@link PersistenceSchemaInitializationPort}.
  * <p>
- * Creates or verifies the M4 two-level persistence schema in the configured SQLite
+ * Creates or verifies the two-level persistence schema in the configured SQLite
  * database file. All DDL uses {@code IF NOT EXISTS} semantics, making the operation
  * fully idempotent: calling {@link #initializeSchema()} on an already-initialised
  * database succeeds without error and without modifying existing data.
@@ -39,8 +39,6 @@ import de.gecheckt.pdf.umbenenner.application.port.out.PersistenceSchemaInitiali
  * <p>All JDBC connections, SQL DDL, and SQLite-specific behaviour are strictly confined
  * to this class. No JDBC or SQLite types appear in the port interface or in any
  * application/domain type.
- *
- * @since M4-AP-003
  */
  public class SqliteSchemaInitializationAdapter implements PersistenceSchemaInitializationPort {

@@ -49,7 +47,7 @@ public class SqliteSchemaInitializationAdapter implements PersistenceSchemaIniti
  /**
  * DDL for the document master record table.
  * <p>
- * <strong>Columns (M4 mandatory fields):</strong>
+ * <strong>Columns (mandatory fields):</strong>
  * <ul>
  * <li>{@code id} — internal surrogate primary key (auto-increment).</li>
  * <li>{@code fingerprint} — SHA-256 hex string; unique natural key; never null.</li>
@@ -95,7 +93,7 @@ public class SqliteSchemaInitializationAdapter implements PersistenceSchemaIniti
  /**
  * DDL for the processing attempt history table.
  * <p>
- * <strong>Columns (M4 mandatory fields):</strong>
+ * <strong>Columns (mandatory fields):</strong>
  * <ul>
  * <li>{@code id} — internal surrogate primary key (auto-increment).</li>
  * <li>{@code fingerprint} — foreign key reference to
@@ -179,7 +177,7 @@ public class SqliteSchemaInitializationAdapter implements PersistenceSchemaIniti
  }

  /**
- * Creates or verifies the M4 persistence schema in the SQLite database.
+ * Creates or verifies the persistence schema in the SQLite database.
  * <p>
  * Executes the following DDL statements in order:
  * <ol>
@@ -202,7 +200,7 @@ public class SqliteSchemaInitializationAdapter implements PersistenceSchemaIniti
  */
  @Override
  public void initializeSchema() {
- logger.info("Initialising M4 SQLite schema at: {}", jdbcUrl);
+ logger.info("Initialising SQLite persistence schema at: {}", jdbcUrl);
  try (Connection connection = DriverManager.getConnection(jdbcUrl);
  Statement statement = connection.createStatement()) {

@@ -226,7 +224,7 @@ public class SqliteSchemaInitializationAdapter implements PersistenceSchemaIniti
  logger.info("M4 SQLite schema initialisation completed successfully.");

  } catch (SQLException e) {
- String message = "Failed to initialise M4 SQLite schema at '" + jdbcUrl + "': " + e.getMessage();
+ String message = "Failed to initialise SQLite persistence schema at '" + jdbcUrl + "': " + e.getMessage();
  logger.error(message, e);
  throw new DocumentPersistenceException(message, e);
  }
@@ -19,8 +19,6 @@ import de.gecheckt.pdf.umbenenner.application.port.out.UnitOfWorkPort;
  * <p>
  * Provides transactional semantics for coordinated writes to both the document record
  * and processing attempt repositories.
- *
- * @since M4-AP-006-fix
  */
  public class SqliteUnitOfWorkAdapter implements UnitOfWorkPort {

@@ -1,14 +1,14 @@
  /**
- * SQLite persistence adapter for the M4 two-level persistence model.
+ * SQLite persistence adapter for the two-level persistence model.
  *
  * <h2>Purpose</h2>
- * <p>This package contains the technical SQLite infrastructure for the M4 persistence
+ * <p>This package contains the technical SQLite infrastructure for the persistence
  * layer. It is the only place in the entire application where JDBC connections, SQL DDL,
  * and SQLite-specific types are used. No JDBC or SQLite types leak into the
  * {@code application} or {@code domain} modules.
  *
  * <h2>Two-level persistence model</h2>
- * <p>M4 persistence is structured in exactly two levels:
+ * <p>Persistence is structured in exactly two levels:
  * <ol>
  * <li><strong>Document master record</strong> ({@code document_record} table) —
  * one row per unique SHA-256 fingerprint; carries the current overall status,
@@ -31,7 +31,5 @@
  * confined to this package. The application layer interacts exclusively through the
  * port interfaces defined in
  * {@code de.gecheckt.pdf.umbenenner.application.port.out}.
- *
- * @since M4-AP-003
  */
  package de.gecheckt.pdf.umbenenner.adapter.out.sqlite;