🚀 Code Keeper — CI/CD Pipeline for Microservices on AWS

Complete DevOps pipeline: GitLab CE deployed via Ansible on a Vagrant VM, CI/CD pipelines for 3 microservices + 1 infrastructure repository, deployed on AWS ECS (staging & production) using Terraform.


📋 Table of Contents

  • Architecture Overview
  • CI/CD Flow
  • Prerequisites
  • Project Structure
  • Setup Guide
  • Pipelines Description
  • Security & Cybersecurity
  • Technical Decisions
  • Audit Commands
  • Troubleshooting
  • Repositories
  • Authors

Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                        LOCAL MACHINE                            │
│                                                                 │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │              Vagrant VM (Ubuntu 22.04)                  │   │
│   │              192.168.56.11 | 4GB RAM | 2 CPUs           │   │
│   │                                                         │   │
│   │   ┌──────────────┐   ┌───────────────────────────────┐  │   │
│   │   │  GitLab CE   │   │       GitLab Runners          │  │   │
│   │   │  :80         │   │  shell+terraform | docker+py  │  │   │
│   │   └──────────────┘   └───────────────────────────────┘  │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│   Ansible Playbooks:                                            │
│   gitlab.yml → runners.yml → gitlab_users.yml                   │
│   → gitlab_projects.yml → push_repos.yml → protect_branches.yml │
└─────────────────────────────────────────────────────────────────┘
                              │
                    GitLab CI/CD Pipelines
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                         AWS CLOUD                               │
│                                                                 │
│   ┌──────────────────────────┐  ┌──────────────────────────┐   │
│   │    STAGING ENVIRONMENT   │  │  PRODUCTION ENVIRONMENT  │   │
│   │                          │  │                          │   │
│   │  ┌────────────────────┐  │  │  ┌────────────────────┐  │   │
│   │  │   ECS Cluster      │  │  │  │   ECS Cluster      │  │   │
│   │  │  ┌──────────────┐  │  │  │  │  ┌──────────────┐  │  │   │
│   │  │  │ api-gateway  │  │  │  │  │  │ api-gateway  │  │  │   │
│   │  │  │ inventory-app│  │  │  │  │  │ inventory-app│  │  │   │
│   │  │  │ billing-app  │  │  │  │  │  │ billing-app  │  │  │   │
│   │  │  └──────────────┘  │  │  │  │  └──────────────┘  │  │   │
│   │  └────────────────────┘  │  │  └────────────────────┘  │   │
│   │  ┌────────────────────┐  │  │  ┌────────────────────┐  │   │
│   │  │   ALB + VPC        │  │  │  │   ALB + VPC        │  │   │
│   │  │   Security Groups  │  │  │  │   Security Groups  │  │   │
│   │  └────────────────────┘  │  │  └────────────────────┘  │   │
│   └──────────────────────────┘  └──────────────────────────┘   │
│                                                                 │
│   ┌──────────────────────────────────────────────────────────┐  │
│   │  ECR: api-gateway-app | inventory-app | billing-app      │  │
│   │       inventory-database | billing-database | rabbitmq   │  │
│   └──────────────────────────────────────────────────────────┘  │
│   ┌──────────────────────────────────────────────────────────┐  │
│   │  S3: Terraform state backend (staging + production)      │  │
│   │  DynamoDB: Terraform state lock                          │  │
│   │  Secrets Manager: DB passwords, RabbitMQ credentials     │  │
│   └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```

CI/CD Flow

```
Code Push to main (protected branch only)
            │
            ▼
    ┌───────────────┐
    │     BUILD     │  pip install / compile
    └───────┬───────┘
            │
            ▼
    ┌───────────────┐
    │     TEST      │  pytest (unit + integration)
    └───────┬───────┘
            │
            ▼
    ┌───────────────┐
    │     SCAN      │  Bandit (SAST) + Trivy (image CVEs)
    └───────┬───────┘
            │
            ▼
    ┌───────────────┐
    │ CONTAINERIZE  │  docker build + push to ECR
    └───────┬───────┘
            │
            ▼
    ┌───────────────┐
    │  APPROVE   ✋ │  Manual gate — validate before staging
    └───────┬───────┘
            │
            ▼
    ┌───────────────┐
    │DEPLOY STAGING │  aws ecs update-service (rolling update)
    └───────┬───────┘
            │
            ▼
    ┌───────────────┐
    │   APPROVE  ✋ │  Manual gate — stakeholder review
    └───────┬───────┘
            │
            ▼
    ┌───────────────┐
    │ DEPLOY PROD   │  aws ecs update-service + ecs wait
    └───────────────┘  (zero downtime rolling update)
```

Prerequisites

Local Machine

| Tool | Version | Purpose |
| --- | --- | --- |
| Vagrant | >= 2.3 | VM provisioning |
| VirtualBox | >= 6.1 | Hypervisor |
| Ansible | >= 2.14 | Configuration management |
| AWS CLI | >= 2.0 | AWS interaction |
| Git | >= 2.30 | Source control |

```bash
# Install required Ansible collection
ansible-galaxy collection install community.general
```

AWS Requirements

  • AWS account with IAM user (gitlab-pipeline-user) — created automatically by scripts/create-iam-pipeline-user.sh
  • ECR repositories created via Terraform
  • S3 buckets for Terraform state backend:
    • staging-cloud-design-ssm-2026
    • production-cloud-design-ssm-2026
  • DynamoDB table for state locking: terraform-state-lock
  • Secrets in AWS Secrets Manager (prefix: cloud-design/)
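If the state buckets and lock table do not exist yet, they can be bootstrapped once with admin credentials. A minimal sketch — bucket and table names come from this README, but the `us-east-1` region is an assumption (the backend region is not pinned here):

```bash
# One-time bootstrap of the Terraform state backend (run with admin credentials).
# Assumes us-east-1; in other regions, add --create-bucket-configuration.
for bucket in staging-cloud-design-ssm-2026 production-cloud-design-ssm-2026; do
  aws s3api create-bucket --bucket "$bucket" --region us-east-1
  # Versioning lets you recover an earlier state file if an apply goes wrong
  aws s3api put-bucket-versioning --bucket "$bucket" \
    --versioning-configuration Status=Enabled
done

# LockID is the hash key Terraform expects for its DynamoDB lock table
aws dynamodb create-table \
  --table-name terraform-state-lock \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST
```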

Hardware

  • Minimum 8GB RAM on host machine (4GB allocated to VM)
  • 20GB free disk space

Project Structure

```
code-keeper/
├── Vagrantfile                      # VM definition (Ubuntu 22.04, 4GB RAM)
├── inventory/
│   └── hosts.yml                    # Ansible inventory
├── playbooks/
│   ├── gitlab.yml                   # Install & configure GitLab CE
│   ├── runners.yml                  # Install runners + AWS CLI v2 + Terraform
│   ├── gitlab_users.yml             # Create dev1, dev2 users
│   ├── gitlab_projects.yml          # Create 4 repos + CI/CD variables
│   ├── protect_branches.yml         # Protect main branches
│   └── push_repos.yml               # Push source code to GitLab
├── scripts/
│   ├── setup.sh                     # Guided interactive setup script
│   └── create-iam-pipeline-user.sh  # Create AWS IAM user + policy
└── srcs/
    ├── crud-master/
    │   ├── api-gateway/             # API Gateway service
    │   │   ├── app/
    │   │   ├── tests/
    │   │   └── .gitlab-ci.yml
    │   ├── billing-app/             # Billing service
    │   │   ├── app/
    │   │   ├── billing-database/
    │   │   ├── rabbitmq/
    │   │   ├── tests/
    │   │   └── .gitlab-ci.yml
    │   └── inventory-app/           # Inventory service
    │       ├── app/
    │       ├── inventory-database/
    │       ├── tests/
    │       └── .gitlab-ci.yml
    └── cloud-design-infra/
        └── cloud-design/
            ├── environments/
            │   ├── staging/         # Staging Terraform config + scripts
            │   │   └── scripts/
            │   │       └── update-dns.sh
            │   └── production/      # Production Terraform config + scripts
            │       └── scripts/
            │           └── update-dns.sh
            ├── modules/             # Reusable Terraform modules
            │   ├── alb/
            │   ├── ecs-ec2/
            │   ├── network/
            │   └── monitoring/
            └── .gitlab-ci.yml
```

Setup Guide

Step 0 — Create AWS IAM Pipeline User

Before starting, create the AWS IAM user that GitLab pipelines will use:

```bash
# Configure AWS CLI with admin credentials first
aws configure

# Run the IAM setup script
chmod +x scripts/create-iam-pipeline-user.sh
./scripts/create-iam-pipeline-user.sh

# Save the generated credentials — you will need them in Step 2
```
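Before handing the credentials to GitLab, it is worth confirming they actually work. A quick sketch — the placeholder values are illustrative and come from the script's output:

```bash
# Sanity-check the generated pipeline credentials (values from the script output)
export AWS_ACCESS_KEY_ID=<key-from-script-output>
export AWS_SECRET_ACCESS_KEY=<secret-from-script-output>

# The returned ARN should end with :user/gitlab-pipeline-user
aws sts get-caller-identity
```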

Step 1 — Start the GitLab VM

```bash
cd code-keeper/
vagrant up
```

This automatically runs playbooks/gitlab.yml which:

  • Installs GitLab CE
  • Disables public signup, enables admin approval
  • Sets CI/CD defaults (artifact expiry, max size)
  • Configures private visibility and main as default branch

Wait ~10 minutes for GitLab to fully initialize. Verify at: http://192.168.56.11
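Instead of waiting a fixed 10 minutes, you can poll GitLab until it responds. A small sketch (the `/users/sign_in` page is a standard GitLab endpoint that returns 200 once the instance is up):

```bash
# Poll GitLab until the sign-in page answers with HTTP 200
until [ "$(curl -s -o /dev/null -w '%{http_code}' http://192.168.56.11/users/sign_in)" = "200" ]; do
  echo "GitLab not ready yet, retrying in 15s..."
  sleep 15
done
echo "GitLab is up."
```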

Step 2 — Run the Setup Script

```bash
chmod +x scripts/setup.sh
./scripts/setup.sh
```

The script guides you through all steps interactively:

| Step | Action |
| --- | --- |
| 1 | Auto-retrieves root password from VM |
| 1b | Change root password |
| 2 | Prompts for GitLab Admin Token (create at GitLab UI → User Settings → Access Tokens) |
| 3 | Prompts for Runner Tokens (GitLab UI → Admin → Runners → New instance runner) |
| 4 | Prompts for AWS credentials (from Step 0 output) |
| 5 | Prompts for dev1/dev2 passwords (min 12 chars) |
| 6 | Installs & registers GitLab Runners (shell+terraform + docker+python) |
| 7 | Creates GitLab users (dev1, dev2) |
| 8 | Creates 4 projects + injects CI/CD variables |
| 9 | Pushes source code to all 4 repos |
| 10 | Protects main branches on all 4 repos |

Step 3 — Deploy Infrastructure

Trigger the cloud-design-infra pipeline from GitLab UI or push a change to main. The pipeline will:

  1. Initialize and validate Terraform
  2. Show the plan for review
  3. Wait for manual approval before applying to staging
  4. Wait for manual approval before applying to production
  5. Run update-dns.sh after each apply to update DuckDNS with the new ALB DNS

Step 4 — Deploy Applications

For each app repo (inventory-app, billing-app, api-gateway), push a change to main or trigger the pipeline manually. After containerization, manually approve staging and then production deployment.


Pipelines Description

Infrastructure Pipeline (cloud-design-infra)

```
init_staging ──► validate_staging ──► plan_staging ──► approve_staging ✋
    ──► apply_staging + update-dns.sh
         │
         └──► init_production ──► validate_production ──► plan_production
                  ──► approve_production ✋ ──► apply_production + update-dns.sh
```
| Stage | Job | Description |
| --- | --- | --- |
| init | init_staging / init_production | `terraform init -reconfigure` — providers + S3 backend |
| validate | validate_staging / validate_production | `terraform validate` + `fmt -check` |
| plan | plan_staging / plan_production | `terraform plan` — preview changes (no artifact stored) |
| approve_staging | approve_staging | Manual gate before staging apply |
| apply_staging | apply_staging | Fresh plan + apply + update-dns.sh |
| approve_production | approve_production | Manual gate — requires staging success |
| apply_production | apply_production | Fresh plan + apply + update-dns.sh |
| destroy | destroy_staging / destroy_production | Manual emergency destroy |

Runner: shell (Terraform 1.10.0 + AWS CLI v2 pre-installed)

Note: Each apply job generates a fresh plan atomically just before applying — this prevents "stale plan" errors when the Terraform state has been modified by a previous partial run.
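In shell terms, an apply job looks roughly like this — a sketch, not the repo's exact `.gitlab-ci.yml` script:

```bash
# Sketch of an apply job: plan and apply in the same job, atomically
cd environments/staging
terraform init -reconfigure

# Fresh plan generated seconds before apply — never reused across jobs
terraform plan -out=tfplan
terraform apply -auto-approve tfplan

# Point DuckDNS at the (possibly new) ALB DNS name
./scripts/update-dns.sh
```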

Application CI Pipeline (all 3 app repos)

```
build ──► test ──► scan ──► containerize
```
| Stage | Tool | Description |
| --- | --- | --- |
| build | pip | Install dependencies, validate build |
| test | pytest | Unit + integration tests |
| scan | Bandit | SAST — Python security analysis (SQL injection, hardcoded secrets, insecure patterns) |
| containerize | Docker + Trivy + ECR | Build image, scan for CVEs (HIGH/CRITICAL), push to AWS ECR with retry |

Runner: docker (docker-in-docker, image: docker:24.0.5)
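The containerize stage boils down to four commands. A sketch using `inventory-app` as an example image name and standard GitLab CI variables — the actual job script may differ:

```bash
# Authenticate Docker against the private ECR registry
aws ecr get-login-password --region "$AWS_DEFAULT_REGION" \
  | docker login --username AWS --password-stdin \
    "$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com"

IMAGE="$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/inventory-app:$CI_COMMIT_SHORT_SHA"

docker build -t "$IMAGE" .

# Fail the job on HIGH/CRITICAL CVEs before anything reaches ECR
trivy image --severity HIGH,CRITICAL --exit-code 1 "$IMAGE"

docker push "$IMAGE"
```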

Application CD Pipeline (all 3 app repos)

```
containerize ──► approve_staging ✋ ──► deploy_staging
    ──► approve_production ✋ ──► deploy_production
```
| Stage | Description |
| --- | --- |
| approve_staging | Manual gate — validate image before staging deployment |
| deploy_staging | `aws ecs update-service --force-new-deployment` on staging cluster |
| approve_production | Manual gate — validate staging before production |
| deploy_production | `aws ecs update-service` + `aws ecs wait services-stable` (zero downtime) |

Runner: docker (image: public.ecr.aws/docker/library/python:3.11-slim)

Note: ECR Public is used instead of Docker Hub to avoid pull rate limiting on the Vagrant VM.
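A production deploy step then reduces to two commands. A sketch — the cluster name appears elsewhere in this README, but the `-prod` service name follows the staging naming pattern and is an assumption:

```bash
# Trigger a rolling replacement of all tasks with the newest image tag
aws ecs update-service \
  --cluster cloud-design-cluster-prod \
  --service inventory-app-service-prod \
  --force-new-deployment

# Block until the rolling update has converged — this is the zero-downtime gate
aws ecs wait services-stable \
  --cluster cloud-design-cluster-prod \
  --services inventory-app-service-prod
```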

Pipeline Triggers

All pipelines trigger only on protected branch main:

```yaml
rules:
  - if: '$CI_COMMIT_BRANCH == "main" && $CI_COMMIT_REF_PROTECTED == "true"'
```

Application containerize and deploy jobs additionally require file changes in the relevant directories (app/**/*, billing-database/**/*, etc.) to avoid unnecessary rebuilds.


Security & Cybersecurity

1. Triggers Restricted to Protected Branches

All pipeline jobs use:

```yaml
rules:
  - if: '$CI_COMMIT_BRANCH == "main" && $CI_COMMIT_REF_PROTECTED == "true"'
```

Only Maintainers can push to main. Force push is disabled. Developers can only merge via MR.

2. Credentials Separated from Code

  • No credentials in code, Terraform files, or Ansible playbooks
  • AWS credentials injected as masked + protected GitLab CI/CD variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_ACCOUNT_ID, AWS_DEFAULT_REGION)
  • DB passwords and RabbitMQ credentials stored in AWS Secrets Manager (prefix: cloud-design/)
  • ECS tasks retrieve secrets at runtime via IAM task execution role
  • Runner tokens passed via -e flag at playbook execution time (never stored in files)
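To confirm the pipeline user can actually read a secret, the retrieval can be tested by hand. A sketch — the secret name below is illustrative; the real names live under the `cloud-design/` prefix:

```bash
# Verify a secret is readable with the pipeline user's credentials
# (secret name is a hypothetical example under the cloud-design/ prefix)
aws secretsmanager get-secret-value \
  --secret-id cloud-design/inventory-db-password \
  --query SecretString --output text
```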

3. Least Privilege Principle

  • dev1, dev2 have Developer role — cannot push directly to main, cannot manage runners or variables
  • gitlab-pipeline-user IAM user has scoped permissions (ECR repositories, specific S3 buckets, ECS, specific Secrets Manager paths)
  • ECS task roles have only the permissions required by each service
  • gitlab-runner system user runs with minimal OS permissions

4. Dependency Updates & Security Scanning

  • Bandit runs on every pipeline — flags Python security issues (B-level severity)
  • Trivy scans every Docker image for CVEs (HIGH and CRITICAL) before push to ECR
  • Docker base images use pinned versions (docker:24.0.5, python:3.11-slim)
  • AWS provider pinned in Terraform (~> 5.0)
  • Pipeline uses public.ecr.aws mirror for Python images — avoids Docker Hub throttling and provides faster pulls from us-east-1

Technical Decisions

Why GitLab CE (not GitHub Actions or Jenkins)?

GitLab CE provides an all-in-one self-hosted solution: source control + CI/CD + runner management + protected branches in a single instance. No external dependencies, full control over the pipeline environment, and closer to a real enterprise setup.

Why Ansible for GitLab deployment?

Ansible provides idempotent, declarative configuration. The same playbooks can be re-run without side effects — making the setup reproducible and auditable. It also serves as living documentation of every configuration decision made on the VM.

Why Vagrant + VirtualBox (not AWS EC2)?

For a local development environment, Vagrant provides fully reproducible VMs without cloud costs. Any team member can reproduce the exact environment with vagrant up + ./setup.sh.

Why AWS ECS EC2 (not Kubernetes or Fargate)?

ECS integrates natively with ALB, ECR, Secrets Manager, CloudWatch, and Service Discovery with minimal operational overhead. EC2 launch type gives visibility into the underlying infrastructure while keeping costs lower than Fargate for always-on workloads.

Why Terraform (not CloudFormation)?

Terraform is cloud-agnostic, has a rich module ecosystem, and HCL is more readable than JSON/YAML CloudFormation. State management via S3 + DynamoDB locking enables safe team collaboration. Separate staging/production environments with independent backends prevent state cross-contamination.

Why Bandit + Trivy (not SonarQube)?

  • Bandit is purpose-built for Python SAST — catches SQL injection patterns, hardcoded secrets, insecure deserialization with zero configuration
  • Trivy covers runtime CVEs in OS packages and Python dependencies — what Bandit cannot see
  • Together they cover code and image security without SonarQube's 2GB+ RAM requirement (critical given the 4GB VM constraint)

Why shell runner for Terraform?

Terraform requires direct filesystem access for provider binaries and state operations. The shell executor runs directly on the VM where Terraform and AWS CLI v2 are pre-installed, avoiding Docker-in-Docker complexity for infrastructure jobs.

Why docker runner for applications?

Application pipelines need isolated, reproducible environments. Each job gets a clean container with the exact Python/Docker version specified — preventing environment drift between runs and ensuring test reproducibility.

Why a fresh plan in each apply job?

Instead of passing a tfplan artifact between jobs, each apply job generates a fresh plan immediately before applying. This prevents "Saved plan is stale" errors when a previous pipeline run partially created resources and modified the Terraform state.


Audit Commands

GitLab & Runners

```bash
# List all tasks in GitLab playbook
ansible-playbook --list-tasks playbooks/gitlab.yml

# List all tasks in runners playbook
ansible-playbook --list-tasks playbooks/runners.yml

# Check GitLab service status
vagrant ssh -c "sudo gitlab-ctl status"

# Check GitLab Runner service status
vagrant ssh -c "sudo systemctl status gitlab-runner"

# List registered runners
vagrant ssh -c "sudo gitlab-runner list"

# Verify GitLab is accessible
curl -s -o /dev/null -w "%{http_code}" http://192.168.56.11
```

Protected Branches

# Verify protected branches via API
```bash
# Verify protected branches via API
curl -s --header "PRIVATE-TOKEN: <your-token>" \
  "http://192.168.56.11/api/v4/projects/root%2Finventory-app/protected_branches" \
  | python3 -m json.tool
```

CI/CD Variables

```bash
# Verify CI/CD variables (values are masked in output)
curl -s --header "PRIVATE-TOKEN: <your-token>" \
  "http://192.168.56.11/api/v4/projects/root%2Finventory-app/variables" \
  | python3 -m json.tool
```

Terraform

```bash
# Verify Terraform on runner
vagrant ssh -c "terraform version"

# Verify AWS CLI on runner
vagrant ssh -c "aws --version"

# Check Terraform state
cd srcs/cloud-design-infra/cloud-design/environments/staging
terraform state list
```

AWS

```bash
# Verify AWS credentials
aws sts get-caller-identity

# Check ECS services — staging
aws ecs list-services --cluster cloud-design-cluster-staging

# Check ECS services — production
aws ecs list-services --cluster cloud-design-cluster-prod

# Check ECR repositories
aws ecr describe-repositories

# Check Secrets Manager
aws secretsmanager list-secrets --query 'SecretList[?starts_with(Name, `cloud-design/`)].Name'
```

Troubleshooting

GitLab takes too long to start

```bash
vagrant ssh -c "sudo gitlab-ctl tail"
# Wait until: "gitlab Reconfigured!"
```

Runner not picking up jobs

```bash
vagrant ssh -c "sudo gitlab-runner verify"
vagrant ssh -c "sudo systemctl restart gitlab-runner"
# Check config
vagrant ssh -c "sudo cat /etc/gitlab-runner/config.toml"
```

Terraform state lock

```bash
terraform force-unlock <LOCK_ID>
```
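Terraform prints the lock ID in the error message when it fails to acquire the lock; if that output is gone, the ID can also be read from the lock table directly. A sketch using the table name from this README:

```bash
# Read stale lock entries from the DynamoDB state-lock table
aws dynamodb scan --table-name terraform-state-lock \
  --projection-expression "LockID"
```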

ECR login fails in pipeline

Verify AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_ACCOUNT_ID, and AWS_DEFAULT_REGION are set as protected + masked variables in GitLab project Settings → CI/CD → Variables.

VM clock drift (AWS signature errors)

```bash
vagrant ssh -c "sudo timedatectl set-ntp true && sudo systemctl restart systemd-timesyncd"
vagrant ssh -c "timedatectl status"
```

ECS deployment fails

```bash
aws ecs describe-services \
  --cluster cloud-design-cluster-staging \
  --services inventory-app-service-staging \
  --query 'services[0].events[:5]'
```

Repositories

| Repository | Description | Runner | Pipeline |
| --- | --- | --- | --- |
| inventory-app | Inventory service + PostgreSQL | docker | CI + CD |
| billing-app | Billing service + PostgreSQL + RabbitMQ | docker | CI + CD |
| api-gateway | API Gateway routing to microservices | docker | CI + CD |
| cloud-design-infra | Terraform infrastructure (staging + prod) | shell | Infra pipeline |

All repositories accessible at: http://192.168.56.11


Authors

  • Ahmadou Bamba Diéne — [sdiene]
  • Mouhamed Ngom — [mhnom]
  • Seynabou Niang — [sniang]

Zone01 — DevOps Project
