Optimization: bootstrap and configuration documentation selectively sharpened
2026-04-06 08:26:14 +02:00
parent b5db3fb361
commit 7bac60c66c
6 changed files with 118 additions and 59 deletions

@@ -212,7 +212,9 @@ public class StartConfigurationValidator {
// === Helper methods for common validation patterns ===
/**
* Validates that a required path is not null, exists, and is a directory.
* Validates that a required directory path is not null, exists, and is a directory.
* <p>
* Used for paths like source and target folders that must already exist before processing can begin.
*/
private void validateRequiredExistingDirectory(Path path, String fieldName, List<String> errors) {
if (path == null) {
@@ -228,6 +230,10 @@ public class StartConfigurationValidator {
/**
* Validates that a required file path is not null and its parent directory exists and is a directory.
* <p>
* The file itself need not exist yet (e.g., SQLite will create it on first use), but the parent
* directory must be present and writable. Used for files like sqlite.file where the application
* will create the file if needed.
*/
private void validateRequiredFileParentDirectory(Path filePath, String fieldName, List<String> errors) {
if (filePath == null) {
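
The two helper Javadocs above distinguish a directory that must already exist from a file whose parent directory must exist. Since the diff only shows the first line of each method body, the following is a minimal sketch of how such checks could look, not the project's actual implementation. The error message wording, the field names `source.folder` and `target.folder`, and the explicit writability check are assumptions made for illustration; only `sqlite.file` appears in the source.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the project's StartConfigurationValidator. */
final class PathValidationSketch {

    /** Directory variant: the path must be non-null, exist, and be a directory. */
    static void validateRequiredExistingDirectory(Path path, String fieldName, List<String> errors) {
        if (path == null) {
            errors.add(fieldName + ": is required but not set");
            return;
        }
        if (!Files.exists(path)) {
            errors.add(fieldName + ": path does not exist: " + path);
        } else if (!Files.isDirectory(path)) {
            errors.add(fieldName + ": path is not a directory: " + path);
        }
    }

    /** File variant: only the parent directory is checked; the file may be created later. */
    static void validateRequiredFileParentDirectory(Path filePath, String fieldName, List<String> errors) {
        if (filePath == null) {
            errors.add(fieldName + ": is required but not set");
            return;
        }
        // toAbsolutePath() ensures that a bare file name still resolves to a parent directory.
        Path parent = filePath.toAbsolutePath().getParent();
        if (parent == null || !Files.isDirectory(parent)) {
            errors.add(fieldName + ": parent directory does not exist: " + filePath);
        } else if (!Files.isWritable(parent)) {
            // Assumption: writability is verified here because the Javadoc mentions it.
            errors.add(fieldName + ": parent directory is not writable: " + parent);
        }
    }

    public static void main(String[] args) {
        Path tmp = Path.of(System.getProperty("java.io.tmpdir"));
        List<String> errors = new ArrayList<>();
        validateRequiredExistingDirectory(tmp, "source.folder", errors);
        validateRequiredFileParentDirectory(tmp.resolve("app.db"), "sqlite.file", errors);
        validateRequiredExistingDirectory(tmp.resolve("no-such-folder-sketch"), "target.folder", errors);
        errors.forEach(System.out::println);
    }
}
```

Collecting messages in a shared `errors` list, as the method signatures in the diff suggest, lets the validator report all problems in one pass instead of stopping at the first one.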

@@ -1,8 +1,21 @@
/**
* Bootstrap-phase technical configuration validation.
* <p>
* Handles startup configuration validation before the batch application begins.
* Validates mandatory fields, numeric ranges, URI schemes, and path existence.
* Technical responsibility that does not belong to the application layer.
* Handles startup configuration validation as a separate step after configuration loading.
* Validates mandatory fields, numeric ranges, URI schemes, and path existence before
* the batch application begins. If validation fails, the application exits with code 1.
* <p>
* Validation concerns include:
* <ul>
* <li>Mandatory field presence and non-nullness</li>
* <li>Numeric constraints (timeout, retry limits, page counts, character limits)</li>
* <li>URI validity (API base URL must be absolute with http or https scheme)</li>
* <li>Path existence and type (source/target folders exist and are readable, etc.)</li>
* <li>Path relationships (source and target folders are not the same)</li>
* </ul>
* <p>
* This validation is a technical responsibility that does not belong to the application layer
* and is distinct from configuration loading. The validator is created and invoked by the
* bootstrap phase after configuration is loaded.
*/
package de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation;
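
Two of the validation concerns listed in this package description, URI validity and path relationships, can be sketched with standard library calls. This is a hedged illustration rather than the validator's real code: the class name, the field name `api.baseUrl`, and the message texts are hypothetical.

```java
import java.net.URI;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; names and messages are assumptions. */
final class UriAndFolderChecksSketch {

    /** The API base URL must be absolute and use the http or https scheme. */
    static void validateApiBaseUrl(URI baseUrl, String fieldName, List<String> errors) {
        if (baseUrl == null) {
            errors.add(fieldName + ": is required but not set");
            return;
        }
        // URI.isAbsolute() is true exactly when the URI has a scheme component.
        if (!baseUrl.isAbsolute()) {
            errors.add(fieldName + ": must be an absolute URI: " + baseUrl);
            return;
        }
        String scheme = baseUrl.getScheme();
        if (!"http".equalsIgnoreCase(scheme) && !"https".equalsIgnoreCase(scheme)) {
            errors.add(fieldName + ": scheme must be http or https: " + baseUrl);
        }
    }

    /** Source and target folders must not resolve to the same location. */
    static void validateDistinctFolders(Path source, Path target, List<String> errors) {
        if (source == null || target == null) {
            return; // missing values are reported by the required-field checks
        }
        // normalize() is purely lexical; it does not resolve symbolic links.
        Path normalizedSource = source.toAbsolutePath().normalize();
        Path normalizedTarget = target.toAbsolutePath().normalize();
        if (normalizedSource.equals(normalizedTarget)) {
            errors.add("source and target folder must not be the same: " + normalizedSource);
        }
    }

    public static void main(String[] args) {
        List<String> errors = new ArrayList<>();
        validateApiBaseUrl(URI.create("ftp://example.com"), "api.baseUrl", errors);
        validateDistinctFolders(Path.of("in"), Path.of("./in/../in"), errors);
        errors.forEach(System.out::println);
    }
}
```

If symbolic links matter, `Path.toRealPath()` would be the stricter comparison, at the cost of requiring both folders to exist and handling an `IOException`.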

@@ -20,8 +20,10 @@ import de.gecheckt.pdf.umbenenner.application.port.out.ConfigurationPort;
/**
* Properties-based implementation of {@link ConfigurationPort}.
* <p>
* Loads configuration from config/application.properties with environment variable
* precedence for sensitive values like the API key.
* Loads configuration from config/application.properties as the primary source.
* For sensitive values, environment variables take precedence: if the environment variable
* {@code PDF_UMBENENNER_API_KEY} is set, it overrides the {@code api.key} property from the file.
* This allows credentials to be managed securely without storing them in the configuration file.
*/
public class PropertiesConfigurationPortAdapter implements ConfigurationPort {
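
The precedence rule described in this Javadoc (environment variable `PDF_UMBENENNER_API_KEY` over the `api.key` property) can be sketched as follows. The class and method names are hypothetical, the environment is passed in as a map so the rule can be exercised without touching the real process environment, and treating a blank variable as "not set" is an assumption the source does not confirm.

```java
import java.util.Map;
import java.util.Properties;

/** Illustrative sketch only; not the project's PropertiesConfigurationPortAdapter. */
final class ApiKeyPrecedenceSketch {

    static final String ENV_API_KEY = "PDF_UMBENENNER_API_KEY";
    static final String PROP_API_KEY = "api.key";

    /** Returns the environment value if present, otherwise the value from the properties file. */
    static String resolveApiKey(Properties fileProperties, Map<String, String> environment) {
        String fromEnvironment = environment.get(ENV_API_KEY);
        // Assumption: a blank environment variable is treated like an unset one.
        if (fromEnvironment != null && !fromEnvironment.isBlank()) {
            return fromEnvironment;
        }
        return fileProperties.getProperty(PROP_API_KEY);
    }

    public static void main(String[] args) {
        Properties fileProperties = new Properties();
        fileProperties.setProperty(PROP_API_KEY, "key-from-file");
        // System.getenv() supplies the real process environment in production use.
        String apiKey = resolveApiKey(fileProperties, System.getenv());
        System.out.println("API key resolved: " + (apiKey != null));
    }
}
```

The demo prints only whether a key was resolved, never the key itself, in keeping with the goal of not exposing credentials.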

@@ -1,12 +1,21 @@
/**
* Configuration loading adapters.
* Configuration loading adapters for the bootstrap phase.
* <p>
* Contains implementations of the {@link de.gecheckt.pdf.umbenenner.application.port.out.ConfigurationPort}
* that load the {@link de.gecheckt.pdf.umbenenner.application.config.startup.StartConfiguration}
* that load the complete {@link de.gecheckt.pdf.umbenenner.application.config.startup.StartConfiguration}
* from external sources (e.g., properties files, environment variables).
* <p>
* Responsibilities:
* <ul>
* <li>Load configuration from the properties file (default: config/application.properties)</li>
* <li>Apply environment variable precedence for sensitive values (e.g., API key)</li>
* <li>Construct the typed StartConfiguration object with all technical infrastructure parameters</li>
* </ul>
* <p>
* These adapters bridge the outbound port contract with concrete infrastructure
* (property file parsing, environment variable lookup) without leaking infrastructure
* details into the application or bootstrap layers.
* (property file parsing, environment variable lookup) without leaking infrastructure details
* into the application or bootstrap layers. Validation of the loaded configuration is performed
* separately by the {@link de.gecheckt.pdf.umbenenner.adapter.out.bootstrap.validation.StartConfigurationValidator}
* in the bootstrap phase.
*/
package de.gecheckt.pdf.umbenenner.adapter.out.configuration;

@@ -41,17 +41,19 @@ import de.gecheckt.pdf.umbenenner.domain.model.BatchRunContext;
import de.gecheckt.pdf.umbenenner.domain.model.RunId;
/**
* Manual bootstrap runner that constructs the object graph and drives the startup flow.
* Orchestrator for the complete startup sequence and object graph construction.
* <p>
* Responsibilities:
* Separates startup concerns into two distinct phases:
* <ol>
* <li>Load and validate the startup configuration.</li>
* <li>Initialise the SQLite persistence schema.</li>
* <li>Resolve the run-lock file path (with default fallback).</li>
* <li>Create and wire all ports and adapters via configured factories.</li>
* <li>Start the CLI adapter and execute the batch use case.</li>
* <li>Map the batch outcome to a process exit code.</li>
* <li><strong>Bootstrap Phase:</strong> Load and validate configuration, initialize persistence schema,
* establish run-lock, and prepare all adapters and ports.</li>
* <li><strong>Execution Phase:</strong> Wire and execute the batch processing use case, then map outcome to exit code.</li>
* </ol>
* <p>
* The startup configuration encompasses all technical infrastructure and runtime parameters
* needed for bootstrap and execution. Once validated and the schema is initialized,
* configuration is handed to the use case factory which extracts the minimal runtime
* configuration for the application layer.
*
* <h2>Exit code semantics</h2>
* <ul>
@@ -65,18 +67,20 @@ import de.gecheckt.pdf.umbenenner.domain.model.RunId;
*
* <h2>Adapter wiring</h2>
* <p>
* The production constructor wires the following adapters:
* The production constructor wires the following key adapters:
* <ul>
* <li>{@link SqliteSchemaInitializationAdapter} — SQLite schema DDL at startup.</li>
* <li>{@link Sha256FingerprintAdapter} — SHA-256 content fingerprinting.</li>
* <li>{@link SqliteDocumentRecordRepositoryAdapter} — document master record CRUD.</li>
* <li>{@link SqliteProcessingAttemptRepositoryAdapter} — attempt history CRUD.</li>
* <li>{@link SqliteUnitOfWorkAdapter} — atomic persistence operations.</li>
* <li>{@link PropertiesConfigurationPortAdapter} — loads configuration from properties and environment.</li>
* <li>{@link FilesystemRunLockPortAdapter} — ensures exclusive execution via a lock file.</li>
* <li>{@link SqliteSchemaInitializationAdapter} — initializes SQLite schema at startup.</li>
* <li>{@link Sha256FingerprintAdapter} — provides content-based document identification.</li>
* <li>{@link SqliteDocumentRecordRepositoryAdapter} — manages document master records.</li>
* <li>{@link SqliteProcessingAttemptRepositoryAdapter} — maintains attempt history.</li>
* <li>{@link SqliteUnitOfWorkAdapter} — coordinates atomic persistence operations.</li>
* </ul>
* <p>
* Schema initialisation is performed once in {@link #run()} before the batch loop starts.
* A {@link DocumentPersistenceException} during schema initialisation is treated
* as a hard startup failure and results in exit code 1.
* Schema initialization is performed exactly once in {@link #run()} before the batch processing loop
* begins. A {@link DocumentPersistenceException} during schema initialization is treated as a hard
* startup failure and results in exit code 1.
*/
public class BootstrapRunner {
@@ -122,12 +126,18 @@ public class BootstrapRunner {
}
/**
* Functional interface for creating a BatchRunProcessingUseCase.
* Factory for creating a properly wired BatchRunProcessingUseCase.
* <p>
* Receives the full startup configuration (for infrastructure adapter wiring) and run lock port.
* The factory extracts the runtime configuration and wires all outbound ports
* required by the use case (e.g., source document port, PDF extraction port,
* persistence and fingerprint ports).
* The factory receives the complete startup configuration (for infrastructure adapter wiring)
* and the run lock port. Its responsibility is to:
* <ol>
* <li>Extract the minimal runtime configuration needed by the application layer.</li>
* <li>Construct all outbound adapter ports (document candidates, PDF extraction, fingerprint, persistence).</li>
* <li>Wire the use case with all required ports and dependencies.</li>
* </ol>
* <p>
* This factory is the primary responsibility boundary between startup configuration
* (complete technical infrastructure setup) and runtime configuration (minimal application needs).
*/
@FunctionalInterface
public interface UseCaseFactory {
@@ -219,28 +229,31 @@ public class BootstrapRunner {
}
/**
* Runs the application startup sequence.
* Runs the complete application startup sequence.
* <p>
* Startup flow:
* Startup flow consists of two phases:
* <ol>
* <li>Load and validate the configuration.</li>
* <li>Initialise the SQLite persistence schema. A {@link DocumentPersistenceException}
* here is a hard startup failure and causes exit code 1.</li>
* <li>Execute the batch processing pipeline with dependencies wired.</li>
* <li><strong>Bootstrap Phase (hard failures only):</strong> Load and validate configuration,
* then initialize the SQLite persistence schema.</li>
* <li><strong>Execution Phase (document failures tolerated):</strong> Execute the batch processing pipeline
* with all adapters and ports wired.</li>
* </ol>
* <p>
* Document-level failures during the batch loop are not startup failures and
* do not change the exit code as long as the run itself completes without a hard
* A {@link DocumentPersistenceException} during schema initialization is treated as a hard startup
* failure and causes exit code 1. Document-level failures during the batch loop are not startup
* failures and do not change the exit code as long as the run itself completes without a hard
* infrastructure error.
*
* @return exit code: 0 for a technically completed run, 1 for any hard startup or
* bootstrap failure (configuration invalid, schema init failed, etc.)
* @return exit code: 0 for a technically completed run, 1 for any hard bootstrap,
* configuration, or persistence failure
*/
public int run() {
LOG.info("Bootstrap flow started.");
try {
// Bootstrap Phase: prepare configuration and persistence
StartConfiguration config = loadAndValidateConfiguration();
initializeSchema(config);
// Execution Phase: run batch processing
return executeWithStartConfiguration(config);
} catch (ConfigurationLoadingException e) {
LOG.error("Configuration loading failed: {}", e.getMessage());
@@ -280,13 +293,17 @@ public class BootstrapRunner {
/**
* Executes the batch processing pipeline with the prepared startup configuration.
* <p>
* Wires all runtime dependencies, constructs adapters and the batch use case,
* invokes the CLI command, and maps the outcome to an exit code.
* Wires all runtime dependencies, constructs adapters and the batch use case via
* the use case factory, invokes the CLI command, and maps the outcome to an exit code.
* <p>
* The use case factory is responsible for extracting the minimal runtime configuration
* from the complete startup configuration. This separation ensures the application layer
* depends only on the configuration it actually needs, following hexagonal architecture principles.
* <p>
* This represents the execution phase after startup configuration is validated
* and persistence schema is initialized.
*
* @param config the validated startup configuration
* @param config the validated startup configuration (complete technical configuration)
* @return exit code: 0 for batch completion, 1 for critical runtime failures
*/
private int executeWithStartConfiguration(StartConfiguration config) {
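
The Javadoc changes in this file describe a bootstrap phase with hard failures, an execution phase driven by a use case factory, and the narrowing of the startup configuration to a minimal runtime configuration. The sketch below shows one way these pieces could fit together. Every type in it (`StartConfig`, `RuntimeConfig`, `HardStartupException`, `BatchUseCase`) is a hypothetical stand-in and not one of the project's classes, and the `maxPages` field is chosen only as an example parameter.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Illustrative sketch only; all nested types are hypothetical stand-ins. */
final class TwoPhaseStartupSketch {

    /** Stand-in for the complete startup configuration. */
    static final class StartConfig {
        final String sqliteFile;
        final int maxPages;
        StartConfig(String sqliteFile, int maxPages) {
            this.sqliteFile = sqliteFile;
            this.maxPages = maxPages;
        }
    }

    /** Stand-in for the minimal runtime configuration seen by the application layer. */
    static final class RuntimeConfig {
        final int maxPages;
        RuntimeConfig(int maxPages) {
            this.maxPages = maxPages;
        }
    }

    /** Stand-in for configuration or schema failures that abort startup. */
    static final class HardStartupException extends RuntimeException {
        HardStartupException(String message) {
            super(message);
        }
    }

    @FunctionalInterface
    interface BatchUseCase {
        /** Returns true if the run completed technically, even if single documents failed. */
        boolean execute();
    }

    @FunctionalInterface
    interface UseCaseFactory {
        BatchUseCase create(StartConfig config);
    }

    static int run(Supplier<StartConfig> loadAndValidate,
                   Consumer<StartConfig> initializeSchema,
                   UseCaseFactory factory) {
        try {
            // Bootstrap phase: any failure here is a hard startup failure.
            StartConfig config = loadAndValidate.get();
            initializeSchema.accept(config);
            // Execution phase: the factory narrows the configuration and wires the use case.
            BatchUseCase useCase = factory.create(config);
            return useCase.execute() ? 0 : 1;
        } catch (HardStartupException e) {
            System.err.println("Startup failed: " + e.getMessage());
            return 1;
        }
    }

    public static void main(String[] args) {
        UseCaseFactory factory = config -> {
            // Only the values the application layer needs are handed over.
            RuntimeConfig runtime = new RuntimeConfig(config.maxPages);
            return () -> runtime.maxPages > 0;
        };
        int exitCode = run(() -> new StartConfig("data/app.db", 10), config -> { }, factory);
        System.out.println("exit code: " + exitCode);
    }
}
```

Passing the three steps as functional interfaces mirrors the pluggable-factory approach the documentation describes and makes each phase replaceable in tests.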

@@ -1,34 +1,46 @@
/**
* Bootstrap module for application startup and technical object graph construction.
* <p>
* Responsibility: Orchestrate the startup flow, load configuration, validate it,
* create and wire all application components, and invoke the CLI adapter entry point.
* Responsibility: Orchestrate the complete startup sequence in two phases: (1) bootstrap phase
* for configuration loading, validation, and schema initialization, and (2) execution phase
* for wiring all adapters and running the batch processing pipeline.
* <p>
* Components:
* <ul>
* <li>{@link de.gecheckt.pdf.umbenenner.bootstrap.BootstrapRunner}
* — Orchestrator of startup sequence and object graph construction</li>
* — Orchestrator of startup sequence, schema initialization, and object graph construction</li>
* <li>{@link de.gecheckt.pdf.umbenenner.bootstrap.PdfUmbenennerApplication}
* — Main entry point that invokes BootstrapRunner</li>
* — Application entry point that invokes BootstrapRunner</li>
* <li>{@link de.gecheckt.pdf.umbenenner.bootstrap.adapter.Log4jProcessingLogger}
* — Logging adapter for application-layer coordination and use case processing</li>
* </ul>
* <p>
* Implementation approach:
* <ul>
* <li>Uses factory pattern with pluggable interfaces for all ports and use cases</li>
* <li>Manually constructs object graph without framework dependencies</li>
* <li>Ensures strict inward dependency direction toward application and domain</li>
* <li>Provides a minimal, controlled startup path without dependency injection frameworks</li>
* <li>Uses factory pattern with pluggable interfaces for configuration, run lock, schema initialization, use case, and command creation</li>
* <li>Manually constructs the object graph without framework dependencies</li>
* <li>Ensures strict inward dependency direction: all adapters depend on ports, never the other way around</li>
* <li>Separates startup configuration (complete technical parameters for bootstrap and adapter wiring) from
* runtime configuration (minimal parameters the application layer actually depends on)</li>
* <li>Schema initialization happens exactly once at startup, before document processing begins</li>
* </ul>
* <p>
* Startup sequence:
* <ul>
* <li>Load and validate configuration from properties file and environment variables</li>
* <li>Wire run lock adapter to prevent concurrent instances</li>
* <li>Initialize SQLite persistence schema at startup via
* {@link de.gecheckt.pdf.umbenenner.application.port.out.PersistenceSchemaInitializationPort},
* ensuring the database is ready before document processing begins.</li>
* <li>Load and validate complete startup configuration from properties file and environment variables</li>
* <li>Initialize SQLite persistence schema via {@link de.gecheckt.pdf.umbenenner.application.port.out.PersistenceSchemaInitializationPort},
* ensuring the database is ready before any batch processing</li>
* <li>Schema initialization failure is treated as a hard bootstrap error and causes exit code 1</li>
* <li>Invoke the batch processing CLI adapter</li>
* <li>Create run lock adapter and acquire exclusive lock</li>
* <li>Wire all outbound adapters (document candidates, PDF extraction, fingerprint, persistence, logging)</li>
* <li>Wire and invoke the batch processing CLI adapter</li>
* <li>Map batch outcome to process exit code</li>
* </ul>
* <p>
* Exit codes:
* <ul>
* <li>0 = batch run completed technically successfully (even if individual documents failed)</li>
* <li>1 = hard bootstrap, configuration, schema initialization, or critical infrastructure failure</li>
* </ul>
*/
package de.gecheckt.pdf.umbenenner.bootstrap;