GitHub - aaa2ppp/pg-bulk-flow: Bulk Data Insertion Benchmark for PostgreSQL

Bulk Data Insertion Benchmark for PostgreSQL

Overview

A high-performance benchmarking tool designed to evaluate and compare different bulk insertion methods in PostgreSQL. The tool provides empirical data to help determine the optimal insertion strategy for large-scale data loading scenarios.

Key Features

Insertion Methods

| Method | Description |
|---|---|
| `copyfrom` | Direct PostgreSQL COPY protocol |
| `pgxbatch` | Batched prepared statements using pgx library |
| `unnestbatch` | Array-based bulk operations using UNNEST |
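
To illustrate the array-based approach behind `unnestbatch`: a multi-row insert through UNNEST passes one array per column, so row-oriented data has to be transposed before it is sent. The sketch below is only an illustration under stated assumptions, not the tool's code. The `Person` type, the `people` table, and its column names are hypothetical; a driver such as pgx would bind the two slices to `$1` and `$2`.

```go
package main

import "fmt"

// Person is a hypothetical row type used only for this illustration.
type Person struct {
	First, Last string
}

// unnestInsert is an assumed statement shape; the table and column names
// are hypothetical. PostgreSQL expands the two arrays in parallel, producing
// one row per array index, so the statement has two parameters no matter
// how many rows a batch contains.
const unnestInsert = `INSERT INTO people (first_name, last_name)
SELECT * FROM unnest($1::text[], $2::text[])`

// toColumns transposes a slice of rows into one slice per column,
// which is the shape the UNNEST statement expects.
func toColumns(rows []Person) (firsts, lasts []string) {
	firsts = make([]string, 0, len(rows))
	lasts = make([]string, 0, len(rows))
	for _, p := range rows {
		firsts = append(firsts, p.First)
		lasts = append(lasts, p.Last)
	}
	return firsts, lasts
}

func main() {
	firsts, lasts := toColumns([]Person{{"Ada", "Lovelace"}, {"Alan", "Turing"}})
	fmt.Println(unnestInsert)
	fmt.Println(firsts, lasts)
}
```

Because the parameter count stays fixed, the same statement text can be reused for every batch size.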

Benchmarking Capabilities

Stream processing
Configurable batch sizes (for pgxbatch and unnestbatch)
Memory and CPU profiling integration
Pipeline mode for concurrent processing
Clean environment management (--truncate)

Performance Metrics

The tool outputs detailed statistics including:

Total execution time
Records inserted
CPU usage
Memory allocation

Installation & Setup

# Clone and build
git clone https://github.com/aaa2ppp/pg-bulk-flow.git
cd pg-bulk-flow
make build

# Configure environment
cp env.example .env
nano .env  # Set your DB parameters and other settings

# Start test environment
make db-up # Launches PostgreSQL in Docker
make migrate-up

# Or, if using an external database:
make migrate-up USE_EXTERNAL_DB=yes # uses DB_ADDR from .env

Usage Examples

Basic Benchmark

./bin/fillnames -method pgxbatch -batch 5000 -truncate

Comparative Analysis

mkdir -p ./tmp

# Test all methods with a batch size of 10,000
# (the batch size applies to pgxbatch and unnestbatch)
for method in copyfrom pgxbatch unnestbatch; do
  ./bin/fillnames -method "$method" -batch 10000 -truncate | tee "./tmp/results_${method}.json"
done

Advanced Profiling

mkdir -p ./tmp

# CPU profiling
./bin/fillnames -method pgxbatch -cpuprofile=./tmp/pgx_cpu.pprof

# Memory analysis
./bin/fillnames -method unnestbatch -memprofile=./tmp/unnest_mem.pprof

Visualization

For results analysis, consider:

# Render each profile as a PNG call graph (requires Graphviz)
find ./tmp -name '*.pprof' | while read -r file; do
  go tool pprof -png -output="${file%.pprof}.png" ./bin/fillnames "$file"
done

# Process JSON results
jq '{method: .config.method, records_sec: (.stats.inserted/(.stats.elapsed/1000))}' ./tmp/*.json
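
The same throughput calculation can be done in Go when the results need further processing. The sketch below mirrors the jq expression above and assumes the output shape it implies: a `config.method` string, a `stats.inserted` count, and a `stats.elapsed` value in milliseconds. The `result` type, the `recordsPerSecond` function, and the sample values are illustrative assumptions, not the tool's code or measured results.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// result models only the fields the jq expression reads.
// The field layout is inferred from that expression, not from the tool's source.
type result struct {
	Config struct {
		Method string `json:"method"`
	} `json:"config"`
	Stats struct {
		Inserted float64 `json:"inserted"`
		Elapsed  float64 `json:"elapsed"` // assumed to be milliseconds
	} `json:"stats"`
}

// recordsPerSecond returns the method name and inserted/(elapsed/1000).
func recordsPerSecond(data []byte) (string, float64, error) {
	var r result
	if err := json.Unmarshal(data, &r); err != nil {
		return "", 0, err
	}
	if r.Stats.Elapsed <= 0 {
		return "", 0, fmt.Errorf("elapsed must be positive, got %v", r.Stats.Elapsed)
	}
	return r.Config.Method, r.Stats.Inserted / (r.Stats.Elapsed / 1000), nil
}

func main() {
	// Made-up sample input for illustration only.
	sample := []byte(`{"config":{"method":"copyfrom"},"stats":{"inserted":500000,"elapsed":2500}}`)
	method, rate, err := recordsPerSecond(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s: %.0f records/sec\n", method, rate)
}
```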
