🚀 Code Keeper — CI/CD Pipeline for Microservices on AWS

Complete DevOps pipeline: GitLab CE deployed via Ansible on a Vagrant VM, CI/CD pipelines for 3 microservices + 1 infrastructure repository, deployed on AWS ECS (staging & production) using Terraform.


📋 Table of Contents

  • Architecture Overview
  • CI/CD Flow
  • Prerequisites
  • Project Structure
  • Setup Guide
  • Pipelines Description
  • Security & Cybersecurity
  • Technical Decisions
  • Audit Commands
  • Troubleshooting
  • Repositories
  • Authors

Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                        LOCAL MACHINE                            │
│                                                                 │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │              Vagrant VM (Ubuntu 22.04)                  │   │
│   │              192.168.56.11 | 4GB RAM | 2 CPUs           │   │
│   │                                                         │   │
│   │   ┌──────────────┐   ┌───────────────────────────────┐  │   │
│   │   │  GitLab CE   │   │       GitLab Runners          │  │   │
│   │   │  :80         │   │  shell+terraform | docker+py  │  │   │
│   │   └──────────────┘   └───────────────────────────────┘  │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│   Ansible Playbooks:                                            │
│   gitlab.yml → runners.yml → gitlab_users.yml                   │
│   → gitlab_projects.yml → push_repos.yml → protect_branches.yml │
└─────────────────────────────────────────────────────────────────┘
                              │
                    GitLab CI/CD Pipelines
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                         AWS CLOUD                               │
│                                                                 │
│   ┌──────────────────────────┐  ┌──────────────────────────┐   │
│   │    STAGING ENVIRONMENT   │  │  PRODUCTION ENVIRONMENT  │   │
│   │                          │  │                          │   │
│   │  ┌────────────────────┐  │  │  ┌────────────────────┐  │   │
│   │  │   ECS Cluster      │  │  │  │   ECS Cluster      │  │   │
│   │  │  ┌──────────────┐  │  │  │  │  ┌──────────────┐  │  │   │
│   │  │  │ api-gateway  │  │  │  │  │  │ api-gateway  │  │  │   │
│   │  │  │ inventory-app│  │  │  │  │  │ inventory-app│  │  │   │
│   │  │  │ billing-app  │  │  │  │  │  │ billing-app  │  │  │   │
│   │  │  └──────────────┘  │  │  │  │  └──────────────┘  │  │   │
│   │  └────────────────────┘  │  │  └────────────────────┘  │   │
│   │  ┌────────────────────┐  │  │  ┌────────────────────┐  │   │
│   │  │   ALB + VPC        │  │  │  │   ALB + VPC        │  │   │
│   │  │   Security Groups  │  │  │  │   Security Groups  │  │   │
│   │  └────────────────────┘  │  │  └────────────────────┘  │   │
│   └──────────────────────────┘  └──────────────────────────┘   │
│                                                                 │
│   ┌──────────────────────────────────────────────────────────┐  │
│   │  ECR: api-gateway-app | inventory-app | billing-app      │  │
│   │       inventory-database | billing-database | rabbitmq   │  │
│   └──────────────────────────────────────────────────────────┘  │
│   ┌──────────────────────────────────────────────────────────┐  │
│   │  S3: Terraform state backend (staging + production)      │  │
│   │  DynamoDB: Terraform state lock                          │  │
│   │  Secrets Manager: DB passwords, RabbitMQ credentials     │  │
│   └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```

CI/CD Flow

```
Code Push to main (protected branch only)
            │
            ▼
    ┌───────────────┐
    │     BUILD     │  pip install / compile
    └───────┬───────┘
            │
            ▼
    ┌───────────────┐
    │     TEST      │  pytest (unit + integration)
    └───────┬───────┘
            │
            ▼
    ┌───────────────┐
    │     SCAN      │  Bandit (SAST) + Trivy (image CVEs)
    └───────┬───────┘
            │
            ▼
    ┌───────────────┐
    │ CONTAINERIZE  │  docker build + push to ECR
    └───────┬───────┘
            │
            ▼
    ┌───────────────┐
    │  APPROVE   ✋ │  Manual gate — validate before staging
    └───────┬───────┘
            │
            ▼
    ┌───────────────┐
    │DEPLOY STAGING │  aws ecs update-service (rolling update)
    └───────┬───────┘
            │
            ▼
    ┌───────────────┐
    │   APPROVE  ✋ │  Manual gate — stakeholder review
    └───────┬───────┘
            │
            ▼
    ┌───────────────┐
    │ DEPLOY PROD   │  aws ecs update-service + ecs wait
    └───────────────┘  (zero downtime rolling update)
```

Prerequisites

Local Machine

| Tool | Version | Purpose |
| --- | --- | --- |
| Vagrant | >= 2.3 | VM provisioning |
| VirtualBox | >= 6.1 | Hypervisor |
| Ansible | >= 2.14 | Configuration management |
| AWS CLI | >= 2.0 | AWS interaction |
| Git | >= 2.30 | Source control |

```bash
# Install required Ansible collection
ansible-galaxy collection install community.general
```

AWS Requirements

  • AWS account with IAM user (gitlab-pipeline-user) — created automatically by scripts/create-iam-pipeline-user.sh
  • ECR repositories created via Terraform
  • S3 buckets for Terraform state backend:
    • staging-cloud-design-ssm-2026
    • production-cloud-design-ssm-2026
  • DynamoDB table for state locking: terraform-state-lock
  • Secrets in AWS Secrets Manager (prefix: cloud-design/)
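If the state buckets and lock table do not exist yet, they can be bootstrapped once with admin credentials. A minimal sketch — bucket and table names come from this README, but the `us-east-1` region is an assumption (the backend region is not pinned here):

```bash
# One-time bootstrap of the Terraform state backend (run with admin credentials).
# Assumes us-east-1; in other regions, add --create-bucket-configuration.
for bucket in staging-cloud-design-ssm-2026 production-cloud-design-ssm-2026; do
  aws s3api create-bucket --bucket "$bucket" --region us-east-1
  # Versioning lets you recover an earlier state file if an apply goes wrong
  aws s3api put-bucket-versioning --bucket "$bucket" \
    --versioning-configuration Status=Enabled
done

# LockID is the hash key Terraform expects for its DynamoDB lock table
aws dynamodb create-table \
  --table-name terraform-state-lock \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST
```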

Hardware

  • Minimum 8GB RAM on host machine (4GB allocated to VM)
  • 20GB free disk space

Project Structure

```
code-keeper/
├── Vagrantfile                      # VM definition (Ubuntu 22.04, 4GB RAM)
├── inventory/
│   └── hosts.yml                    # Ansible inventory
├── playbooks/
│   ├── gitlab.yml                   # Install & configure GitLab CE
│   ├── runners.yml                  # Install runners + AWS CLI v2 + Terraform
│   ├── gitlab_users.yml             # Create dev1, dev2 users
│   ├── gitlab_projects.yml          # Create 4 repos + CI/CD variables
│   ├── protect_branches.yml         # Protect main branches
│   └── push_repos.yml               # Push source code to GitLab
├── scripts/
│   ├── setup.sh                     # Guided interactive setup script
│   └── create-iam-pipeline-user.sh  # Create AWS IAM user + policy
└── srcs/
    ├── crud-master/
    │   ├── api-gateway/             # API Gateway service
    │   │   ├── app/
    │   │   ├── tests/
    │   │   └── .gitlab-ci.yml
    │   ├── billing-app/             # Billing service
    │   │   ├── app/
    │   │   ├── billing-database/
    │   │   ├── rabbitmq/
    │   │   ├── tests/
    │   │   └── .gitlab-ci.yml
    │   └── inventory-app/           # Inventory service
    │       ├── app/
    │       ├── inventory-database/
    │       ├── tests/
    │       └── .gitlab-ci.yml
    └── cloud-design-infra/
        └── cloud-design/
            ├── environments/
            │   ├── staging/         # Staging Terraform config + scripts
            │   │   └── scripts/
            │   │       └── update-dns.sh
            │   └── production/      # Production Terraform config + scripts
            │       └── scripts/
            │           └── update-dns.sh
            ├── modules/             # Reusable Terraform modules
            │   ├── alb/
            │   ├── ecs-ec2/
            │   ├── network/
            │   └── monitoring/
            └── .gitlab-ci.yml
```

Setup Guide

Step 0 — Create AWS IAM Pipeline User

Before starting, create the AWS IAM user that GitLab pipelines will use:

```bash
# Configure AWS CLI with admin credentials first
aws configure

# Run the IAM setup script
chmod +x scripts/create-iam-pipeline-user.sh
./scripts/create-iam-pipeline-user.sh

# Save the generated credentials — you will need them in Step 2
```
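Before handing the credentials to GitLab, it is worth confirming they actually work. A quick sketch — the placeholder values are illustrative and come from the script's output:

```bash
# Sanity-check the generated pipeline credentials (values from the script output)
export AWS_ACCESS_KEY_ID=<key-from-script-output>
export AWS_SECRET_ACCESS_KEY=<secret-from-script-output>

# The returned ARN should end with :user/gitlab-pipeline-user
aws sts get-caller-identity
```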

Step 1 — Start the GitLab VM

```bash
cd code-keeper/
vagrant up
```

This automatically runs playbooks/gitlab.yml which:

  • Installs GitLab CE
  • Disables public signup, enables admin approval
  • Sets CI/CD defaults (artifact expiry, max size)
  • Configures private visibility and main as default branch

Wait ~10 minutes for GitLab to fully initialize. Verify at: http://192.168.56.11
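Instead of waiting a fixed 10 minutes, you can poll GitLab until it responds. A small sketch (the `/users/sign_in` page is a standard GitLab endpoint that returns 200 once the instance is up):

```bash
# Poll GitLab until the sign-in page answers with HTTP 200
until [ "$(curl -s -o /dev/null -w '%{http_code}' http://192.168.56.11/users/sign_in)" = "200" ]; do
  echo "GitLab not ready yet, retrying in 15s..."
  sleep 15
done
echo "GitLab is up."
```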

Step 2 — Run the Setup Script

```bash
chmod +x scripts/setup.sh
./scripts/setup.sh
```

The script guides you through all steps interactively:

| Step | Action |
| --- | --- |
| 1 | Auto-retrieves root password from VM |
| 1b | Change root password |
| 2 | Prompts for GitLab Admin Token (create at GitLab UI → User Settings → Access Tokens) |
| 3 | Prompts for Runner Tokens (GitLab UI → Admin → Runners → New instance runner) |
| 4 | Prompts for AWS credentials (from Step 0 output) |
| 5 | Prompts for dev1/dev2 passwords (min 12 chars) |
| 6 | Installs & registers GitLab Runners (shell+terraform + docker+python) |
| 7 | Creates GitLab users (dev1, dev2) |
| 8 | Creates 4 projects + injects CI/CD variables |
| 9 | Pushes source code to all 4 repos |
| 10 | Protects main branches on all 4 repos |

Step 3 — Deploy Infrastructure

Trigger the cloud-design-infra pipeline from GitLab UI or push a change to main. The pipeline will:

  1. Initialize and validate Terraform
  2. Show the plan for review
  3. Wait for manual approval before applying to staging
  4. Wait for manual approval before applying to production
  5. Run update-dns.sh after each apply to update DuckDNS with the new ALB DNS

Step 4 — Deploy Applications

For each app repo (inventory-app, billing-app, api-gateway), push a change to main or trigger the pipeline manually. After containerization, manually approve staging and then production deployment.


Pipelines Description

Infrastructure Pipeline (cloud-design-infra)

```
init_staging ──► validate_staging ──► plan_staging ──► approve_staging ✋
    ──► apply_staging + update-dns.sh
         │
         └──► init_production ──► validate_production ──► plan_production
                  ──► approve_production ✋ ──► apply_production + update-dns.sh
```
| Stage | Job | Description |
| --- | --- | --- |
| init | init_staging / init_production | `terraform init -reconfigure` — providers + S3 backend |
| validate | validate_staging / validate_production | `terraform validate` + `fmt -check` |
| plan | plan_staging / plan_production | `terraform plan` — preview changes (no artifact stored) |
| approve_staging | approve_staging | Manual gate before staging apply |
| apply_staging | apply_staging | Fresh plan + apply + update-dns.sh |
| approve_production | approve_production | Manual gate — requires staging success |
| apply_production | apply_production | Fresh plan + apply + update-dns.sh |
| destroy | destroy_staging / destroy_production | Manual emergency destroy |

Runner: shell (Terraform 1.10.0 + AWS CLI v2 pre-installed)

Note: Each apply job generates a fresh plan atomically just before applying — this prevents "stale plan" errors when the Terraform state has been modified by a previous partial run.
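In shell terms, an apply job looks roughly like this — a sketch, not the repo's exact `.gitlab-ci.yml` script:

```bash
# Sketch of an apply job: plan and apply in the same job, atomically
cd environments/staging
terraform init -reconfigure

# Fresh plan generated seconds before apply — never reused across jobs
terraform plan -out=tfplan
terraform apply -auto-approve tfplan

# Point DuckDNS at the (possibly new) ALB DNS name
./scripts/update-dns.sh
```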

Application CI Pipeline (all 3 app repos)

```
build ──► test ──► scan ──► containerize
```
| Stage | Tool | Description |
| --- | --- | --- |
| build | pip | Install dependencies, validate build |
| test | pytest | Unit + integration tests |
| scan | Bandit | SAST — Python security analysis (SQL injection, hardcoded secrets, insecure patterns) |
| containerize | Docker + Trivy + ECR | Build image, scan for CVEs (HIGH/CRITICAL), push to AWS ECR with retry |

Runner: docker (docker-in-docker, image: docker:24.0.5)
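The containerize stage boils down to four commands. A sketch using `inventory-app` as an example image name and standard GitLab CI variables — the actual job script may differ:

```bash
# Authenticate Docker against the private ECR registry
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" \
  | docker login --username AWS --password-stdin \
    "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"

IMAGE="$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/inventory-app:$CI_COMMIT_SHORT_SHA"

docker build -t "$IMAGE" .

# Fail the job on HIGH/CRITICAL CVEs before anything reaches ECR
trivy image --severity HIGH,CRITICAL --exit-code 1 "$IMAGE"

docker push "$IMAGE"
```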

Application CD Pipeline (all 3 app repos)

```
containerize ──► approve_staging ✋ ──► deploy_staging
    ──► approve_production ✋ ──► deploy_production
```
| Stage | Description |
| --- | --- |
| approve_staging | Manual gate — validate image before staging deployment |
| deploy_staging | `aws ecs update-service --force-new-deployment` on staging cluster |
| approve_production | Manual gate — validate staging before production |
| deploy_production | `aws ecs update-service` + `aws ecs wait services-stable` (zero downtime) |

Runner: docker (image: public.ecr.aws/docker/library/python:3.11-slim)

Note: ECR Public is used instead of Docker Hub to avoid pull rate limiting on the Vagrant VM.
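A production deploy step then reduces to two commands. A sketch — the cluster name appears elsewhere in this README, but the `-prod` service name follows the staging naming pattern and is an assumption:

```bash
# Trigger a rolling replacement of all tasks with the newest image tag
aws ecs update-service \
  --cluster cloud-design-cluster-prod \
  --service inventory-app-service-prod \
  --force-new-deployment

# Block until the rolling update has converged — this is the zero-downtime gate
aws ecs wait services-stable \
  --cluster cloud-design-cluster-prod \
  --services inventory-app-service-prod
```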

Pipeline Triggers

All pipelines trigger only on protected branch main:

```yaml
rules:
  - if: '$CI_COMMIT_BRANCH == "main" && $CI_COMMIT_REF_PROTECTED == "true"'
```

Application containerize and deploy jobs additionally require file changes in the relevant directories (app/**/*, billing-database/**/*, etc.) to avoid unnecessary rebuilds.


Security & Cybersecurity

1. Triggers Restricted to Protected Branches

All pipeline jobs use:

```yaml
rules:
  - if: '$CI_COMMIT_BRANCH == "main" && $CI_COMMIT_REF_PROTECTED == "true"'
```

Only Maintainers can push to main. Force push is disabled. Developers can only merge via MR.

2. Credentials Separated from Code

  • No credentials in code, Terraform files, or Ansible playbooks
  • AWS credentials injected as masked + protected GitLab CI/CD variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_ACCOUNT_ID, AWS_DEFAULT_REGION)
  • DB passwords and RabbitMQ credentials stored in AWS Secrets Manager (prefix: cloud-design/)
  • ECS tasks retrieve secrets at runtime via IAM task execution role
  • Runner tokens passed via -e flag at playbook execution time (never stored in files)
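To confirm the pipeline user can actually read a secret, the retrieval can be tested by hand. A sketch — the secret name below is illustrative; the real names live under the `cloud-design/` prefix:

```bash
# Verify a secret is readable with the pipeline user's credentials
# (secret name is a hypothetical example under the cloud-design/ prefix)
aws secretsmanager get-secret-value \
  --secret-id cloud-design/inventory-db-password \
  --query SecretString --output text
```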

3. Least Privilege Principle

  • dev1, dev2 have Developer role — cannot push directly to main, cannot manage runners or variables
  • gitlab-pipeline-user IAM user has scoped permissions (ECR repositories, specific S3 buckets, ECS, specific Secrets Manager paths)
  • ECS task roles have only the permissions required by each service
  • gitlab-runner system user runs with minimal OS permissions

4. Dependency Updates & Security Scanning

  • Bandit runs on every pipeline — flags Python security issues (B-level severity)
  • Trivy scans every Docker image for CVEs (HIGH and CRITICAL) before push to ECR
  • Docker base images use pinned versions (docker:24.0.5, python:3.11-slim)
  • AWS provider pinned in Terraform (~> 5.0)
  • Pipeline uses public.ecr.aws mirror for Python images — avoids Docker Hub throttling and provides faster pulls from us-east-1

Technical Decisions

Why GitLab CE (not GitHub Actions or Jenkins)?

GitLab CE provides an all-in-one self-hosted solution: source control + CI/CD + runner management + protected branches in a single instance. No external dependencies, full control over the pipeline environment, and closer to a real enterprise setup.

Why Ansible for GitLab deployment?

Ansible provides idempotent, declarative configuration. The same playbooks can be re-run without side effects — making the setup reproducible and auditable. It also serves as living documentation of every configuration decision made on the VM.

Why Vagrant + VirtualBox (not AWS EC2)?

For a local development environment, Vagrant provides fully reproducible VMs without cloud costs. Any team member can reproduce the exact environment with vagrant up + ./setup.sh.

Why AWS ECS EC2 (not Kubernetes or Fargate)?

ECS integrates natively with ALB, ECR, Secrets Manager, CloudWatch, and Service Discovery with minimal operational overhead. EC2 launch type gives visibility into the underlying infrastructure while keeping costs lower than Fargate for always-on workloads.

Why Terraform (not CloudFormation)?

Terraform is cloud-agnostic, has a rich module ecosystem, and HCL is more readable than JSON/YAML CloudFormation. State management via S3 + DynamoDB locking enables safe team collaboration. Separate staging/production environments with independent backends prevent state cross-contamination.

Why Bandit + Trivy (not SonarQube)?

  • Bandit is purpose-built for Python SAST — catches SQL injection patterns, hardcoded secrets, insecure deserialization with zero configuration
  • Trivy covers runtime CVEs in OS packages and Python dependencies — what Bandit cannot see
  • Together they cover code and image security without SonarQube's 2GB+ RAM requirement (critical given the 4GB VM constraint)

Why shell runner for Terraform?

Terraform requires direct filesystem access for provider binaries and state operations. The shell executor runs directly on the VM where Terraform and AWS CLI v2 are pre-installed, avoiding Docker-in-Docker complexity for infrastructure jobs.

Why docker runner for applications?

Application pipelines need isolated, reproducible environments. Each job gets a clean container with the exact Python/Docker version specified — preventing environment drift between runs and ensuring test reproducibility.

Why a fresh plan in each apply job?

Instead of passing a tfplan artifact between jobs, each apply job generates a fresh plan immediately before applying. This prevents "Saved plan is stale" errors when a previous pipeline run partially created resources and modified the Terraform state.


Audit Commands

GitLab & Runners

```bash
# List all tasks in GitLab playbook
ansible-playbook --list-tasks playbooks/gitlab.yml

# List all tasks in runners playbook
ansible-playbook --list-tasks playbooks/runners.yml

# Check GitLab service status
vagrant ssh -c "sudo gitlab-ctl status"

# Check GitLab Runner service status
vagrant ssh -c "sudo systemctl status gitlab-runner"

# List registered runners
vagrant ssh -c "sudo gitlab-runner list"

# Verify GitLab is accessible
curl -s -o /dev/null -w "%{http_code}" http://192.168.56.11
```

Protected Branches

# Verify protected branches via API
```bash
# Verify protected branches via API
curl -s --header "PRIVATE-TOKEN: <your-token>" \
  "http://192.168.56.11/api/v4/projects/root%2Finventory-app/protected_branches" \
  | python3 -m json.tool
```

CI/CD Variables

```bash
# Verify CI/CD variables (values are masked in output)
curl -s --header "PRIVATE-TOKEN: <your-token>" \
  "http://192.168.56.11/api/v4/projects/root%2Finventory-app/variables" \
  | python3 -m json.tool
```

Terraform

```bash
# Verify Terraform on runner
vagrant ssh -c "terraform version"

# Verify AWS CLI on runner
vagrant ssh -c "aws --version"

# Check Terraform state
cd srcs/cloud-design-infra/cloud-design/environments/staging
terraform state list
```

AWS

```bash
# Verify AWS credentials
aws sts get-caller-identity

# Check ECS services — staging
aws ecs list-services --cluster cloud-design-cluster-staging

# Check ECS services — production
aws ecs list-services --cluster cloud-design-cluster-prod

# Check ECR repositories
aws ecr describe-repositories

# Check Secrets Manager
aws secretsmanager list-secrets --query 'SecretList[?starts_with(Name, `cloud-design/`)].Name'
```

Troubleshooting

GitLab takes too long to start

```bash
vagrant ssh -c "sudo gitlab-ctl tail"
# Wait until: "gitlab Reconfigured!"
```

Runner not picking up jobs

```bash
vagrant ssh -c "sudo gitlab-runner verify"
vagrant ssh -c "sudo systemctl restart gitlab-runner"
# Check config
vagrant ssh -c "sudo cat /etc/gitlab-runner/config.toml"
```

Terraform state lock

```bash
terraform force-unlock <LOCK_ID>
```
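Terraform prints the lock ID in the error message when it fails to acquire the lock; if that output is gone, the ID can also be read from the lock table directly. A sketch using the table name from this README:

```bash
# Read stale lock entries from the DynamoDB state-lock table
aws dynamodb scan --table-name terraform-state-lock \
  --projection-expression "LockID"
```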

ECR login fails in pipeline

Verify AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_ACCOUNT_ID, and AWS_DEFAULT_REGION are set as protected + masked variables in GitLab project Settings → CI/CD → Variables.

VM clock drift (AWS signature errors)

```bash
vagrant ssh -c "sudo timedatectl set-ntp true && sudo systemctl restart systemd-timesyncd"
vagrant ssh -c "timedatectl status"
```

ECS deployment fails

```bash
aws ecs describe-services \
  --cluster cloud-design-cluster-staging \
  --services inventory-app-service-staging \
  --query 'services[0].events[:5]'
```

Repositories

| Repository | Description | Runner | Pipeline |
| --- | --- | --- | --- |
| inventory-app | Inventory service + PostgreSQL | docker | CI + CD |
| billing-app | Billing service + PostgreSQL + RabbitMQ | docker | CI + CD |
| api-gateway | API Gateway routing to microservices | docker | CI + CD |
| cloud-design-infra | Terraform infrastructure (staging + prod) | shell | Infra pipeline |

All repositories accessible at: http://192.168.56.11


Authors

  • Ahmadou Bamba Diéne — [sdiene]
  • Mouhamed Ngom — [mhnom]
  • Seynabou Niang — [sniang]

Zone01 — DevOps Project
