Complete DevOps pipeline: GitLab CE deployed via Ansible on a Vagrant VM, CI/CD pipelines for 3 microservices + 1 infrastructure repository, deployed on AWS ECS (staging & production) using Terraform.
- Architecture Overview
- Prerequisites
- Project Structure
- Setup Guide
- Pipelines Description
- Security & Cybersecurity
- Technical Decisions
- Audit Commands
- Troubleshooting
┌─────────────────────────────────────────────────────────────────┐
│ LOCAL MACHINE │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Vagrant VM (Ubuntu 22.04) │ │
│ │ 192.168.56.11 | 4GB RAM | 2 CPUs │ │
│ │ │ │
│ │ ┌──────────────┐ ┌───────────────────────────────┐ │ │
│ │ │ GitLab CE │ │ GitLab Runners │ │ │
│ │ │ :80 │ │ shell+terraform | docker+py │ │ │
│ │ └──────────────┘ └───────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ Ansible Playbooks: │
│ gitlab.yml → runners.yml → gitlab_users.yml │
│ → gitlab_projects.yml → push_repos.yml → protect_branches.yml│
└─────────────────────────────────────────────────────────────────┘
│
GitLab CI/CD Pipelines
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ AWS CLOUD │
│ │
│ ┌──────────────────────────┐ ┌──────────────────────────┐ │
│ │ STAGING ENVIRONMENT │ │ PRODUCTION ENVIRONMENT │ │
│ │ │ │ │ │
│ │ ┌────────────────────┐ │ │ ┌────────────────────┐ │ │
│ │ │ ECS Cluster │ │ │ │ ECS Cluster │ │ │
│ │ │ ┌──────────────┐ │ │ │ │ ┌──────────────┐ │ │ │
│ │ │ │ api-gateway │ │ │ │ │ │ api-gateway │ │ │ │
│ │ │ │ inventory-app│ │ │ │ │ │ inventory-app│ │ │ │
│ │ │ │ billing-app │ │ │ │ │ │ billing-app │ │ │ │
│ │ │ └──────────────┘ │ │ │ │ └──────────────┘ │ │ │
│ │ └────────────────────┘ │ │ └────────────────────┘ │ │
│ │ ┌────────────────────┐ │ │ ┌────────────────────┐ │ │
│ │ │ ALB + VPC │ │ │ │ ALB + VPC │ │ │
│ │ │ Security Groups │ │ │ │ Security Groups │ │ │
│ │ └────────────────────┘ │ │ └────────────────────┘ │ │
│ └──────────────────────────┘ └──────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ ECR: api-gateway-app | inventory-app | billing-app │ │
│ │ inventory-database | billing-database | rabbitmq │ │
│ └──────────────────────────────────────────────────────────┘ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ S3: Terraform state backend (staging + production) │ │
│ │ DynamoDB: Terraform state lock │ │
│ │ Secrets Manager: DB passwords, RabbitMQ credentials │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Code Push to main (protected branch only)
│
▼
┌───────────────┐
│ BUILD │ pip install / compile
└───────┬───────┘
│
▼
┌───────────────┐
│ TEST │ pytest (unit + integration)
└───────┬───────┘
│
▼
┌───────────────┐
│ SCAN │ Bandit (SAST) + Trivy (image CVEs)
└───────┬───────┘
│
▼
┌───────────────┐
│ CONTAINERIZE │ docker build + push to ECR
└───────┬───────┘
│
▼
┌───────────────┐
│ APPROVE ✋ │ Manual gate — validate before staging
└───────┬───────┘
│
▼
┌───────────────┐
│DEPLOY STAGING │ aws ecs update-service (rolling update)
└───────┬───────┘
│
▼
┌───────────────┐
│ APPROVE ✋ │ Manual gate — stakeholder review
└───────┬───────┘
│
▼
┌───────────────┐
│ DEPLOY PROD │ aws ecs update-service + ecs wait
└───────────────┘ (zero downtime rolling update)
| Tool | Version | Purpose |
|---|---|---|
| Vagrant | >= 2.3 | VM provisioning |
| VirtualBox | >= 6.1 | Hypervisor |
| Ansible | >= 2.14 | Configuration management |
| AWS CLI | >= 2.0 | AWS interaction |
| Git | >= 2.30 | Source control |
# Install required Ansible collection
ansible-galaxy collection install community.general
- AWS account with IAM user (gitlab-pipeline-user), created automatically by scripts/create-iam-pipeline-user.sh
- ECR repositories created via Terraform
- S3 buckets for Terraform state backend: staging-cloud-design-ssm-2026 and production-cloud-design-ssm-2026
- DynamoDB table for state locking: terraform-state-lock
- Secrets in AWS Secrets Manager (prefix: cloud-design/)
- Minimum 8GB RAM on host machine (4GB allocated to VM)
- 20GB free disk space
code-keeper/
├── Vagrantfile # VM definition (Ubuntu 22.04, 4GB RAM)
├── inventory/
│ └── hosts.yml # Ansible inventory
├── playbooks/
│ ├── gitlab.yml # Install & configure GitLab CE
│ ├── runners.yml # Install runners + AWS CLI v2 + Terraform
│ ├── gitlab_users.yml # Create dev1, dev2 users
│ ├── gitlab_projects.yml # Create 4 repos + CI/CD variables
│ ├── protect_branches.yml # Protect main branches
│ └── push_repos.yml # Push source code to GitLab
├── scripts/
│ ├── setup.sh # Guided interactive setup script
│ └── create-iam-pipeline-user.sh # Create AWS IAM user + policy
└── srcs/
├── crud-master/
│ ├── api-gateway/ # API Gateway service
│ │ ├── app/
│ │ ├── tests/
│ │ └── .gitlab-ci.yml
│ ├── billing-app/ # Billing service
│ │ ├── app/
│ │ ├── billing-database/
│ │ ├── rabbitmq/
│ │ ├── tests/
│ │ └── .gitlab-ci.yml
│ └── inventory-app/ # Inventory service
│ ├── app/
│ ├── inventory-database/
│ ├── tests/
│ └── .gitlab-ci.yml
└── cloud-design-infra/
└── cloud-design/
├── environments/
│ ├── staging/ # Staging Terraform config + scripts
│ │ └── scripts/
│ │ └── update-dns.sh
│ └── production/ # Production Terraform config + scripts
│ └── scripts/
│ └── update-dns.sh
├── modules/ # Reusable Terraform modules
│ ├── alb/
│ ├── ecs-ec2/
│ ├── network/
│ └── monitoring/
└── .gitlab-ci.yml
Before starting, create the AWS IAM user that GitLab pipelines will use:
# Configure AWS CLI with admin credentials first
aws configure
# Run the IAM setup script
chmod +x scripts/create-iam-pipeline-user.sh
./scripts/create-iam-pipeline-user.sh
# Save the generated credentials; you will need them in Step 2
cd code-keeper/
vagrant up
This automatically runs playbooks/gitlab.yml, which:
- Installs GitLab CE
- Disables public signup, enables admin approval
- Sets CI/CD defaults (artifact expiry, max size)
- Configures private visibility and main as default branch
Wait ~10 minutes for GitLab to fully initialize. Verify at: http://192.168.56.11
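Rather than waiting a fixed ten minutes, the readiness check can be polled. The following is a minimal sketch, not part of the project's scripts: the function name `wait_for_http` and its default retry values are assumptions, and it treats any 2xx or 3xx response as "ready" (GitLab typically answers 502 while it is still starting).

```shell
# Poll a URL until it answers with an HTTP 2xx/3xx status, or give up.
# Usage: wait_for_http URL [MAX_TRIES] [SLEEP_SECONDS]
# (hypothetical helper; defaults of 60 tries x 10 s are illustrative)
wait_for_http() {
  local url="$1" max_tries="${2:-60}" delay="${3:-10}"
  local try=1 code
  while [ "$try" -le "$max_tries" ]; do
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code
    code="$(curl -s -o /dev/null -w '%{http_code}' "$url")"
    case "$code" in
      2??|3??)
        echo "Ready after $try attempt(s) (HTTP $code)"
        return 0
        ;;
    esac
    echo "Attempt $try/$max_tries: HTTP $code, retrying in ${delay}s" >&2
    sleep "$delay"
    try=$((try + 1))
  done
  echo "Gave up after $max_tries attempts" >&2
  return 1
}

# Example (commented out so that sourcing this file has no side effects):
# wait_for_http http://192.168.56.11 60 10
```

With the defaults this polls for up to ten minutes, which matches the startup time quoted above.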
chmod +x scripts/setup.sh
./scripts/setup.sh
The script guides you through all steps interactively:
| Step | Action |
|---|---|
| 1 | Auto-retrieves root password from VM |
| 2 | Prompts for GitLab Admin Token (create at GitLab UI → User Settings → Access Tokens) |
| 1b | Change root password |
| 3 | Prompts for Runner Tokens (GitLab UI → Admin → Runners → New instance runner) |
| 4 | Prompts for AWS credentials (from Step 0 output) |
| 5 | Prompts for dev1/dev2 passwords (min 12 chars) |
| 6 | Installs & registers GitLab Runners (shell+terraform + docker+python) |
| 7 | Creates GitLab users (dev1, dev2) |
| 8 | Creates 4 projects + injects CI/CD variables |
| 9 | Pushes source code to all 4 repos |
| 10 | Protects main branches on all 4 repos |
Trigger the cloud-design-infra pipeline from GitLab UI or push a change to main. The pipeline will:
- Initialize and validate Terraform
- Show the plan for review
- Wait for manual approval before applying to staging
- Wait for manual approval before applying to production
- Run update-dns.sh after each apply to update DuckDNS with the new ALB DNS name
For each app repo (inventory-app, billing-app, api-gateway), push a change to main or trigger the pipeline manually. After containerization, manually approve staging and then production deployment.
init_staging ──► validate_staging ──► plan_staging ──► approve_staging ✋
──► apply_staging + update-dns.sh
│
└──► init_production ──► validate_production ──► plan_production
──► approve_production ✋ ──► apply_production + update-dns.sh
| Stage | Job | Description |
|---|---|---|
| init | init_staging / init_production | terraform init -reconfigure: providers + S3 backend |
| validate | validate_staging / validate_production | terraform validate + terraform fmt -check |
| plan | plan_staging / plan_production | terraform plan: preview changes (no artifact stored) |
| apply_staging | approve_staging ✋ | Manual gate before staging |
| apply_staging | apply_staging | Fresh plan + apply + update-dns.sh |
| approve_production | approve_production ✋ | Manual gate, requires staging success |
| apply_production | apply_production | Fresh plan + apply + update-dns.sh |
| destroy | destroy_staging / destroy_production | Manual emergency destroy |
Runner: shell (Terraform 1.10.0 + AWS CLI v2 pre-installed)
Note: Each apply job generates a fresh plan in the same job, immediately before applying. This prevents "stale plan" errors when the Terraform state has been modified by a previous partial run.
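The fresh-plan approach described in the note could be sketched as follows. This is an illustration under assumptions, not the project's actual job script: the function name `plan_and_apply` and the plan file name `fresh.tfplan` are hypothetical, and the update-dns.sh call is only indicated by a comment.

```shell
# Plan and apply within one job so the applied plan matches the current state.
# Usage: plan_and_apply ENV_DIR
# (hypothetical helper; "fresh.tfplan" is an illustrative file name)
plan_and_apply() {
  local env_dir="$1" plan_file="fresh.tfplan"
  (
    cd "$env_dir" || exit 1
    # Generate the plan now, against the state as it is at this moment.
    terraform plan -input=false -out="$plan_file" || exit 1
    # Apply exactly that plan; a saved plan file is applied without a prompt.
    terraform apply -input=false "$plan_file" || exit 1
    # Remove the plan file afterwards: it can contain sensitive values.
    rm -f "$plan_file"
    # The pipeline would run scripts/update-dns.sh here.
  )
}

# Example (commented out so that sourcing this file has no side effects):
# plan_and_apply srcs/cloud-design-infra/cloud-design/environments/staging
```

Because the plan never leaves the job, there is no artifact that can go stale between stages. The trade-off is that the plan shown in the earlier plan stage is a preview only.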
build ──► test ──► scan ──► containerize
| Stage | Tool | Description |
|---|---|---|
| build | pip | Install dependencies, validate build |
| test | pytest | Unit + integration tests |
| scan | Bandit | SAST: Python security analysis (SQL injection, hardcoded secrets, insecure patterns) |
| containerize | Docker + Trivy + ECR | Build image, scan for CVEs (HIGH/CRITICAL), push to AWS ECR with retry |
Runner: docker (docker-in-docker, image: docker:24.0.5)
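The containerize stage (build, Trivy scan, push with retry) might look roughly like the sketch below. The helper names `retry`, `ecr_login` and `build_scan_push`, the retry count and the delay are assumptions for illustration; the real job may be structured differently.

```shell
# Run a command up to MAX_TRIES times, sleeping DELAY seconds between failures.
# Usage: retry MAX_TRIES DELAY command [args...]
retry() {
  local max_tries="$1" delay="$2" try=1
  shift 2
  until "$@"; do
    if [ "$try" -ge "$max_tries" ]; then
      echo "Failed after $try attempt(s): $*" >&2
      return 1
    fi
    echo "Attempt $try failed, retrying in ${delay}s: $*" >&2
    sleep "$delay"
    try=$((try + 1))
  done
}

# Log Docker in to a private ECR registry using the AWS CLI credentials.
ecr_login() {
  local registry="$1" region="$2"
  aws ecr get-login-password --region "$region" \
    | docker login --username AWS --password-stdin "$registry"
}

# Build, scan and push one image; the push only happens if the scan passes.
build_scan_push() {
  local registry="$1" repo="$2" tag="$3"
  local image="$registry/$repo:$tag"
  docker build -t "$image" . || return 1
  # Fail the job when Trivy reports HIGH or CRITICAL vulnerabilities.
  trivy image --severity HIGH,CRITICAL --exit-code 1 "$image" || return 1
  # Pushes can fail transiently, so retry a few times (3 x 10 s is illustrative).
  retry 3 10 docker push "$image"
}

# Example (commented out so that sourcing this file has no side effects):
# REGISTRY="${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com"
# ecr_login "$REGISTRY" "$AWS_DEFAULT_REGION"
# build_scan_push "$REGISTRY" inventory-app "$CI_COMMIT_SHORT_SHA"
```

Running the scan before the push means a vulnerable image never reaches ECR, at the cost of a rebuild when the scan fails.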
containerize ──► approve_staging ✋ ──► deploy_staging
──► approve_production ✋ ──► deploy_production
| Stage | Description |
|---|---|
| approve_staging | Manual gate: validate image before staging deployment |
| deploy_staging | aws ecs update-service --force-new-deployment on staging cluster |
| approve_production | Manual gate: validate staging before production |
| deploy_production | aws ecs update-service + aws ecs wait services-stable (zero downtime) |
Runner: docker (image: public.ecr.aws/docker/library/python:3.11-slim)
Note: ECR Public is used instead of Docker Hub to avoid pull rate limiting on the Vagrant VM.
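As a rough sketch of the two deploy jobs, the AWS CLI calls named in the table can be wrapped in one function. The function name `deploy_service` and its optional `wait` argument are assumptions; the cluster and service names in the example are the staging names used elsewhere in this README.

```shell
# Force a rolling redeploy of one ECS service and optionally wait for stability.
# Usage: deploy_service CLUSTER SERVICE [wait]
# (hypothetical helper wrapping the two AWS CLI calls used by the deploy jobs)
deploy_service() {
  local cluster="$1" service="$2" wait_flag="${3:-}"
  aws ecs update-service \
    --cluster "$cluster" \
    --service "$service" \
    --force-new-deployment >/dev/null || return 1
  if [ "$wait_flag" = "wait" ]; then
    # Blocks until the service reaches a steady state, or fails once the
    # waiter's built-in timeout is exceeded.
    aws ecs wait services-stable \
      --cluster "$cluster" --services "$service" || return 1
  fi
  echo "Deployment of $service on $cluster requested"
}

# Example (commented out so that sourcing this file has no side effects):
# deploy_service cloud-design-cluster-staging inventory-app-service-staging
# Pass "wait" as third argument to block until stable, as production does.
```

Waiting on `services-stable` makes the production job fail visibly if the new tasks never become healthy, instead of reporting success as soon as the update is accepted.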
All pipelines trigger only on protected branch main:
rules:
  - if: '$CI_COMMIT_BRANCH == "main" && $CI_COMMIT_REF_PROTECTED == "true"'
Application containerize and deploy jobs additionally require file changes in the relevant directories (app/**/*, billing-database/**/*, etc.) to avoid unnecessary rebuilds.
All pipeline jobs use:
rules:
  - if: '$CI_COMMIT_BRANCH == "main" && $CI_COMMIT_REF_PROTECTED == "true"'
Only Maintainers can push to main. Force push is disabled. Developers can only merge via MR.
- No credentials in code, Terraform files, or Ansible playbooks
- AWS credentials injected as masked + protected GitLab CI/CD variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_ACCOUNT_ID, AWS_DEFAULT_REGION)
- DB passwords and RabbitMQ credentials stored in AWS Secrets Manager (prefix: cloud-design/)
- ECS tasks retrieve secrets at runtime via IAM task execution role
- Runner tokens passed via the -e flag at playbook execution time (never stored in files)
- dev1, dev2 have Developer role: cannot push directly to main, cannot manage runners or variables
- gitlab-pipeline-user IAM user has scoped permissions (ECR repositories, specific S3 buckets, ECS, specific Secrets Manager paths)
- ECS task roles have only the permissions required by each service
- gitlab-runner system user runs with minimal OS permissions
- Bandit runs on every pipeline and flags Python security issues, reported under its B-prefixed test IDs
- Trivy scans every Docker image for CVEs (HIGH and CRITICAL) before push to ECR
- Docker base images use pinned versions (docker:24.0.5, python:3.11-slim)
- AWS provider pinned in Terraform (~> 5.0)
- Pipeline uses the public.ecr.aws mirror for Python images, which avoids Docker Hub throttling and provides faster pulls from us-east-1
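The point in the list above about passing runner tokens with -e at execution time could look like the sketch below. The variable name `runner_token`, the function name `register_runner` and the `ANSIBLE_PLAYBOOK` override are hypothetical; use whatever variable playbooks/runners.yml actually expects.

```shell
# Pass a runner token to Ansible as an extra var instead of storing it in a file.
# "runner_token" is a hypothetical variable name, not confirmed by the playbook.
# ANSIBLE_PLAYBOOK can be overridden (for example with "echo") for a dry run.
register_runner() {
  local token="$1"
  local ansible_cmd="${ANSIBLE_PLAYBOOK:-ansible-playbook}"
  if [ -z "$token" ]; then
    echo "Runner token must not be empty" >&2
    return 1
  fi
  "$ansible_cmd" -i inventory/hosts.yml playbooks/runners.yml \
    -e "runner_token=$token"
}

# Interactive use (commented out so that sourcing this file has no side effects).
# stty -echo hides the typed token; nothing is written to disk.
# printf 'Runner token: '; stty -echo; read -r TOKEN; stty echo; echo
# register_runner "$TOKEN"; unset TOKEN
```

One caveat of this approach: extra vars given on the command line are visible in the host's process list while the playbook runs, so it suits a single-user workstation better than a shared machine.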
GitLab CE provides an all-in-one self-hosted solution: source control + CI/CD + runner management + protected branches in a single instance. No external dependencies, full control over the pipeline environment, and closer to a real enterprise setup.
Ansible provides idempotent, declarative configuration. The same playbooks can be re-run without side effects — making the setup reproducible and auditable. It also serves as living documentation of every configuration decision made on the VM.
For a local development environment, Vagrant provides fully reproducible VMs without cloud costs. Any team member can reproduce the exact environment with vagrant up + ./scripts/setup.sh.
ECS integrates natively with ALB, ECR, Secrets Manager, CloudWatch, and Service Discovery with minimal operational overhead. EC2 launch type gives visibility into the underlying infrastructure while keeping costs lower than Fargate for always-on workloads.
Terraform is cloud-agnostic, has a rich module ecosystem, and HCL is more readable than JSON/YAML CloudFormation. State management via S3 + DynamoDB locking enables safe team collaboration. Separate staging/production environments with independent backends prevent state cross-contamination.
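To illustrate the independent backends, the sketch below initialises one environment with explicit backend settings. The bucket and lock-table names come from this README; the state key, the region fallback and the function name `init_backend` are assumptions, and the project may instead hard-code these values in each environment's backend block.

```shell
# Initialise Terraform for one environment with explicit S3 backend settings.
# Usage: init_backend staging|production
# (hypothetical helper; "key" and the region fallback are illustrative values)
init_backend() {
  local env="$1"
  case "$env" in
    staging|production) ;;
    *)
      echo "Unknown environment: $env" >&2
      return 1
      ;;
  esac
  # Each environment gets its own bucket, so the two states cannot mix;
  # the DynamoDB table serialises concurrent runs through state locking.
  terraform init -reconfigure -input=false \
    -backend-config="bucket=${env}-cloud-design-ssm-2026" \
    -backend-config="key=terraform.tfstate" \
    -backend-config="region=${AWS_DEFAULT_REGION:-us-east-1}" \
    -backend-config="dynamodb_table=terraform-state-lock"
}

# Example (commented out so that sourcing this file has no side effects):
# cd srcs/cloud-design-infra/cloud-design/environments/staging && init_backend staging
```

Rejecting unknown environment names up front avoids accidentally initialising against a bucket that does not exist.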
- Bandit is purpose-built for Python SAST — catches SQL injection patterns, hardcoded secrets, insecure deserialization with zero configuration
- Trivy covers runtime CVEs in OS packages and Python dependencies — what Bandit cannot see
- Together they cover code and image security without SonarQube's 2GB+ RAM requirement (critical given the 4GB VM constraint)
Terraform requires direct filesystem access for provider binaries and state operations. The shell executor runs directly on the VM where Terraform and AWS CLI v2 are pre-installed, avoiding Docker-in-Docker complexity for infrastructure jobs.
Application pipelines need isolated, reproducible environments. Each job gets a clean container with the exact Python/Docker version specified — preventing environment drift between runs and ensuring test reproducibility.
Instead of passing a tfplan artifact between jobs, each apply job generates a fresh plan immediately before applying. This prevents "Saved plan is stale" errors when a previous pipeline run partially created resources and modified the Terraform state.
# List all tasks in GitLab playbook
ansible-playbook --list-tasks playbooks/gitlab.yml
# List all tasks in runners playbook
ansible-playbook --list-tasks playbooks/runners.yml
# Check GitLab service status
vagrant ssh -c "sudo gitlab-ctl status"
# Check GitLab Runner service status
vagrant ssh -c "sudo systemctl status gitlab-runner"
# List registered runners
vagrant ssh -c "sudo gitlab-runner list"
# Verify GitLab is accessible
curl -s -o /dev/null -w "%{http_code}" http://192.168.56.11
# Verify protected branches via API
curl -s --header "PRIVATE-TOKEN: <your-token>" \
"http://192.168.56.11/api/v4/projects/root%2Finventory-app/protected_branches" \
| python3 -m json.tool
# Verify CI/CD variables (masking applies to job logs; the API response can include the values, so avoid sharing the output)
curl -s --header "PRIVATE-TOKEN: <your-token>" \
"http://192.168.56.11/api/v4/projects/root%2Finventory-app/variables" \
| python3 -m json.tool
# Verify Terraform on runner
vagrant ssh -c "terraform version"
# Verify AWS CLI on runner
vagrant ssh -c "aws --version"
# Check Terraform state
cd srcs/cloud-design-infra/cloud-design/environments/staging
terraform state list
# Verify AWS credentials
aws sts get-caller-identity
# Check ECS services — staging
aws ecs list-services --cluster cloud-design-cluster-staging
# Check ECS services — production
aws ecs list-services --cluster cloud-design-cluster-prod
# Check ECR repositories
aws ecr describe-repositories
# Check Secrets Manager
aws secretsmanager list-secrets --query 'SecretList[?starts_with(Name, `cloud-design/`)].Name'
GitLab takes too long to start
vagrant ssh -c "sudo gitlab-ctl tail"
# Wait until: "gitlab Reconfigured!"
Runner not picking up jobs
vagrant ssh -c "sudo gitlab-runner verify"
vagrant ssh -c "sudo systemctl restart gitlab-runner"
# Check config
vagrant ssh -c "sudo cat /etc/gitlab-runner/config.toml"
Terraform state lock
terraform force-unlock <LOCK_ID>
ECR login fails in pipeline
Verify AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_ACCOUNT_ID, AWS_DEFAULT_REGION are set as protected + masked variables in GitLab project Settings → CI/CD → Variables.
VM clock drift (AWS signature errors)
vagrant ssh -c "sudo timedatectl set-ntp true && sudo systemctl restart systemd-timesyncd"
vagrant ssh -c "timedatectl status"
ECS deployment fails
aws ecs describe-services \
--cluster cloud-design-cluster-staging \
--services inventory-app-service-staging \
  --query 'services[0].events[:5]'
| Repository | Description | Runner | Pipeline |
|---|---|---|---|
| inventory-app | Inventory service + PostgreSQL | docker | CI + CD |
| billing-app | Billing service + PostgreSQL + RabbitMQ | docker | CI + CD |
| api-gateway | API Gateway routing to microservices | docker | CI + CD |
| cloud-design-infra | Terraform infrastructure (staging + prod) | shell | Infra pipeline |
All repositories accessible at: http://192.168.56.11
- Ahmadou Bamba Diéne — [sdiene]
- Mouhamed Ngom — [mhnom]
- Seynabou Niang — [sniang]
Zone01 — DevOps Project