This application demonstrates a streaming event pipeline built around Apache Kafka and ecosystem tools such as REST Proxy and Kafka Connect. It was created as part of the course Data Ingestion with Kafka & Kafka Streaming.
Using public data from the Chicago Transit Authority, the pipeline simulates and displays the status of train lines in real time.
The architecture looks like so:
The following are required to run the application:
- Docker
- Python 3.7
To run the simulation, you must first start up the Kafka ecosystem using Docker Compose.
%> docker-compose up
Once docker-compose is ready, the following services will be available:
| Service | Host URL | Docker URL | Username | Password |
|---|---|---|---|---|
| Public Transit Status | http://localhost:8888 | n/a | ||
| Landoop Kafka Connect UI | http://localhost:8084 | http://connect-ui:8084 | ||
| Landoop Kafka Topics UI | http://localhost:8085 | http://topics-ui:8085 | ||
| Landoop Schema Registry UI | http://localhost:8086 | http://schema-registry-ui:8086 | ||
| Kafka | PLAINTEXT://localhost:9092 | PLAINTEXT://kafka0:9092 | ||
| REST Proxy | http://localhost:8082 | http://rest-proxy:8082/ | ||
| Schema Registry | http://localhost:8081 | http://schema-registry:8081/ | ||
| Kafka Connect | http://localhost:8083 | http://kafka-connect:8083 | ||
| KSQL | http://localhost:8088 | http://ksql:8088 | ||
| PostgreSQL | jdbc:postgresql://localhost:5432/cta | jdbc:postgresql://postgres:5432/cta | cta_admin | chicago |
Note that to access these services from your own machine, you will always use the Host URL column.
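As a quick way to confirm that the Host URLs respond from your machine, a minimal sketch like the following can be used. It only opens a TCP connection to each port, so it shows that something is listening, not that the service is healthy. The helper names `host_and_port` and `is_reachable` are illustrative and not part of the project. PostgreSQL is left out because its `jdbc:` URL does not parse the same way.

```python
import socket
from urllib.parse import urlparse

# Host URLs copied from the service table (PostgreSQL omitted, see note above).
HOST_URLS = {
    "Public Transit Status": "http://localhost:8888",
    "Kafka": "PLAINTEXT://localhost:9092",
    "REST Proxy": "http://localhost:8082",
    "Schema Registry": "http://localhost:8081",
    "Kafka Connect": "http://localhost:8083",
    "KSQL": "http://localhost:8088",
}


def host_and_port(url):
    """Split a URL such as PLAINTEXT://localhost:9092 into (host, port)."""
    parsed = urlparse(url)
    return parsed.hostname, parsed.port


def is_reachable(url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_and_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, url in HOST_URLS.items():
        status = "up" if is_reachable(url) else "DOWN"
        print(f"{name:25s} {url:30s} {status}")
```

The Public Transit Status page will report as down until the consumer web server has been started.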
When configuring services that run within Docker Compose, like Kafka Connect, you must use the Docker URL. When you configure the JDBC Source Kafka Connector, for example, use the value from the Docker URL column.
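The distinction can be sketched as follows: a script running on your machine reaches Kafka Connect through its Host URL, while the connector configuration it submits points at PostgreSQL through the Docker URL, because the connector itself runs inside Docker. This is only an illustration, not the project's actual connector setup. The connector name, the table name `stations`, the column `stop_id`, and the topic prefix are assumptions made for the example.

```python
import json
import urllib.request

# Host URL: this script runs on your machine, outside Docker.
CONNECT_HOST_URL = "http://localhost:8083"
# Docker URL: Kafka Connect runs inside Docker Compose and resolves "postgres".
POSTGRES_DOCKER_URL = "jdbc:postgresql://postgres:5432/cta"


def build_jdbc_source_config(name="stations", table="stations"):
    """Build a JDBC source connector payload (name/table are assumed values)."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
            "connection.url": POSTGRES_DOCKER_URL,
            "connection.user": "cta_admin",
            "connection.password": "chicago",
            "table.whitelist": table,
            "mode": "incrementing",
            "incrementing.column.name": "stop_id",  # assumed column name
            "topic.prefix": "connect-",  # assumed topic prefix
        },
    }


def create_connector(payload):
    """POST the payload to the Kafka Connect REST API and return the HTTP status."""
    request = urllib.request.Request(
        f"{CONNECT_HOST_URL}/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    print(create_connector(build_jdbc_source_config()))
```

Using `localhost` in `connection.url` would fail here, since inside the Kafka Connect container `localhost` refers to that container rather than to PostgreSQL.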
To run the application, open a separate terminal window for each piece and run them all at the same time. The application will not work unless the producer and consumer are running simultaneously.
To run the producer:

%> cd producers
%> virtualenv venv
%> . venv/bin/activate
%> pip install -r requirements.txt
%> python simulation.py

To run the Faust stream processing application:

%> cd consumers
%> virtualenv venv
%> . venv/bin/activate
%> pip install -r requirements.txt
%> faust -A faust_stream worker -l info

To run the KSQL script:

%> cd consumers
%> virtualenv venv
%> . venv/bin/activate
%> pip install -r requirements.txt
%> python ksql.py

To run the consumer:

**NOTE**: Do not run the consumer until you have reached Step 6!

%> cd consumers
%> virtualenv venv
%> . venv/bin/activate
%> pip install -r requirements.txt
%> python server.py
To view the Transit Status Page, open a web browser to http://localhost:8888.

