This service worker consumes messages from the Kafka topic that System A publishes to, downloads the uploaded CSV files from MinIO (an S3-compatible object store used here as a local stand-in for Amazon S3), validates their content, and updates the status of each processed file in a MySQL database.
System B processes CSV files uploaded by System A. The goal of this service worker is to:
- Consume Kafka messages to identify which files need to be processed.
- Download CSV files from MinIO (S3 simulation) based on the information received from Kafka.
- Validate the CSV contents using custom validation logic to ensure the files conform to the expected structure.
- Update the status of each file in the database to either Processed or Fail, depending on the validation outcome.
This service worker processes large CSV files (up to 7 MB, roughly 100,000 rows) in the background, using asynchronous operations for validation and processing.
- .NET: The primary backend framework for building the service worker.
- Swagger: API documentation for interacting with the service endpoints.
- MySQL: Relational database for storing metadata and tracking file processing status.
- MinIO: S3-compatible object storage for storing and retrieving CSV files.
- Apache Kafka: Message queue system for communication between System A and System B.
- Kafka Consumer: Listens for file processing requests via Kafka topics published by System A.
- File Download: Downloads files from MinIO based on the message consumed from Kafka.
- CSV Validation: Ensures the CSV file has the correct structure, headers, and content.
- Database Update: Updates the status of each file in the database (Processed or Fail).
- Asynchronous Processing: Uses C# async/await to process large files efficiently in the background.
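
As an illustration of the CSV validation feature, here is a minimal sketch of a streaming, asynchronous validator. It is not the project's actual validation logic: the class name `CsvValidator`, the expected header set, and the `maxErrors` parameter are all assumptions made for the example, and the naive comma splitting does not handle quoted fields.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class CsvValidator
{
    // Hypothetical header set: the real expected columns are not listed in this README.
    private static readonly string[] ExpectedHeaders = { "Id", "Name", "Email" };

    // Reads the CSV line by line and returns a list of problems (empty means valid).
    // Naive comma splitting: quoted fields that contain commas are NOT handled.
    public static async Task<List<string>> ValidateAsync(TextReader reader, int maxErrors = 10)
    {
        var errors = new List<string>();

        string? headerLine = await reader.ReadLineAsync();
        if (headerLine is null)
        {
            errors.Add("File is empty.");
            return errors;
        }

        string[] headers = headerLine.Split(',').Select(h => h.Trim()).ToArray();
        if (!headers.SequenceEqual(ExpectedHeaders, StringComparer.OrdinalIgnoreCase))
        {
            errors.Add($"Line 1: unexpected header row '{headerLine}'.");
            return errors;
        }

        int lineNumber = 1;
        while (errors.Count < maxErrors)
        {
            string? line = await reader.ReadLineAsync();
            if (line is null) break;          // end of file
            lineNumber++;
            if (line.Length == 0) continue;   // tolerate blank lines

            string[] fields = line.Split(',');
            if (fields.Length != ExpectedHeaders.Length)
                errors.Add($"Line {lineNumber}: expected {ExpectedHeaders.Length} fields, found {fields.Length}.");
            else if (fields.Any(string.IsNullOrWhiteSpace))
                errors.Add($"Line {lineNumber}: empty field.");
        }

        return errors;
    }
}
```

Because the reader is consumed one line at a time, a file of around 100,000 rows never has to be held in memory as a whole. Capping the error list keeps the result small when a file is badly malformed.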
- Clone the repository:

      git clone https://github.com/user/repository.git
      cd repository

- Install dependencies:

  Ensure you have the following installed:
  - .NET SDK (version 8 or higher)
  - MySQL
  - MinIO (or equivalent S3 service)
  - Apache Kafka

- Configure application settings:

  Update appsettings.json with the correct database connection strings, MinIO credentials, and Kafka configurations.

- Run the service worker:

      dotnet run --project FileUploaderPartB.Worker/FileUploaderPartB.Worker.csproj
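
For the configuration step, the fragment below shows roughly what an appsettings.json for a local setup might contain. Every section and key name here (`ConnectionStrings:DefaultConnection`, `Minio`, `Kafka`, and their children) is a guess for illustration, as are the database, bucket, topic, and group names. Check the repository's own appsettings.json for the names the worker actually reads.

```json
{
  // Hypothetical layout: key names are assumptions, not taken from the project.
  "ConnectionStrings": {
    "DefaultConnection": "Server=localhost;Port=3306;Database=file_uploader;User ID=app_user;Password=change-me;"
  },
  "Minio": {
    "Endpoint": "localhost:9000",
    "AccessKey": "minioadmin",
    "SecretKey": "minioadmin",
    "Bucket": "uploads",
    "UseSsl": false
  },
  "Kafka": {
    "BootstrapServers": "localhost:9092",
    "GroupId": "file-processor",
    "Topic": "file-uploads"
  }
}
```

The ports shown are the usual local defaults for MySQL (3306), MinIO (9000), and Kafka (9092). Remove the comment line if your tooling requires strict JSON.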
- Kafka message consumption: When System A uploads a CSV file, a message is sent to a Kafka topic. The service worker listens to the topic and retrieves the file details.
- File download: The service worker downloads the CSV file from MinIO using the file details from Kafka.
- CSV validation: The service worker validates the CSV content using predefined validation rules.
- Database update: Based on the validation result, the file status is updated in the MySQL database. If the file passes validation, it is marked as Processed; otherwise, it is marked as Fail.
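
The four steps above can be sketched as a single handler that runs once per consumed message. This is an outline under stated assumptions, not the project's code: `FileMessage`, `IFileStore`, `IStatusRepository`, and `FileProcessor` are hypothetical names, the Kafka payload shape is invented, and the in-memory classes at the end only exist so the sketch can be tried without MinIO or MySQL.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical message shape: the real Kafka payload is not documented in this README.
public sealed record FileMessage(string FileId, string Bucket, string ObjectKey);

public enum FileStatus { Processed, Fail }

// Hypothetical abstractions standing in for the MinIO client and the MySQL data access layer.
public interface IFileStore
{
    Task<Stream> OpenReadAsync(string bucket, string objectKey, CancellationToken ct);
}

public interface IStatusRepository
{
    Task SetStatusAsync(string fileId, FileStatus status, CancellationToken ct);
}

public sealed class FileProcessor
{
    private readonly IFileStore _store;
    private readonly IStatusRepository _statuses;
    private readonly Func<TextReader, Task<bool>> _isValidCsv;

    public FileProcessor(IFileStore store, IStatusRepository statuses, Func<TextReader, Task<bool>> isValidCsv)
    {
        _store = store;
        _statuses = statuses;
        _isValidCsv = isValidCsv;
    }

    // Handles one consumed message: download, validate, then record the outcome.
    public async Task<FileStatus> ProcessAsync(FileMessage message, CancellationToken ct = default)
    {
        FileStatus status;
        try
        {
            Stream stream = await _store.OpenReadAsync(message.Bucket, message.ObjectKey, ct);
            using var reader = new StreamReader(stream); // disposing the reader also closes the stream
            status = (await _isValidCsv(reader)) ? FileStatus.Processed : FileStatus.Fail;
        }
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            // Assumption: download or parsing errors count as a failed file.
            status = FileStatus.Fail;
        }

        await _statuses.SetStatusAsync(message.FileId, status, ct);
        return status;
    }
}

// In-memory stand-ins for trying the sketch without MinIO or MySQL.
public sealed class InMemoryFileStore : IFileStore
{
    public Dictionary<string, string> Files { get; } = new();

    public Task<Stream> OpenReadAsync(string bucket, string objectKey, CancellationToken ct)
    {
        if (!Files.TryGetValue($"{bucket}/{objectKey}", out var content))
            throw new FileNotFoundException($"{bucket}/{objectKey}");
        return Task.FromResult<Stream>(new MemoryStream(Encoding.UTF8.GetBytes(content)));
    }
}

public sealed class InMemoryStatusRepository : IStatusRepository
{
    public Dictionary<string, FileStatus> Statuses { get; } = new();

    public Task SetStatusAsync(string fileId, FileStatus status, CancellationToken ct)
    {
        Statuses[fileId] = status;
        return Task.CompletedTask;
    }
}
```

Keeping storage and persistence behind small interfaces means the download, validate, and update sequence can be exercised in a unit test, while the real worker would supply implementations backed by MinIO and MySQL. Treating a failed download as Fail is a design choice made for this sketch; the actual worker may retry or report such errors differently.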
- The project structure diagram was created using GitDiagram by @ahmedkhaleel2004.
