Merged
11 changes: 4 additions & 7 deletions COPYING
@@ -1,4 +1,4 @@
Copyright 2021 Oden Technologies Inc. except as noted below.
Copyright 2021-2024 Oden Technologies Inc. except as noted below.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use
this file except in compliance with the License. You may obtain a copy of the
@@ -23,7 +23,7 @@ https://opensource.org/licenses/GPL-2.0
https://opensource.org/licenses/BSD-2-Clause
https://opensource.org/licenses/MIT

PGPool-II is Copyright 2003-2021 the PgPool Global Development Group and is
PGPool-II is Copyright 2003-2024 the PgPool Global Development Group and is
released under the PGPool license:
https://github.com/pgpool/pgpool2/blob/master/COPYING

@@ -34,15 +34,12 @@ pgpool2_exporter is Copyright (c) 2021 PgPool Global Development Group and is
released under the MIT license:
https://github.com/pgpool/pgpool2_exporter/blob/master/LICENSE

Telegraf is Copyright (c) 2015-2020 InfluxData Inc. and is released under the
MIT license: https://github.com/influxdata/telegraf/blob/master/LICENSE

PostgreSQL and its libraries are released under the PostgreSQL License:
https://www.postgresql.org/about/licence/
Portions Copyright (c) 1996-2021, The PostgreSQL Global Development Group
Portions Copyright (c) 1996-2024, The PostgreSQL Global Development Group
Portions Copyright (c) 1994, The Regents of the University of California

The Google Cloud SDK and the "gcloud" CLI tool are Copyright (c) 2021 by Google
The Google Cloud SDK and the "gcloud" CLI tool are Copyright (c) 2024 by Google
Inc, and are released under the Apache License v2.0:
https://www.apache.org/licenses/LICENSE-2.0

4 changes: 0 additions & 4 deletions Dockerfile
@@ -83,10 +83,6 @@ FROM --platform=${PLATFORM} alpine:${ALPINE_VERSION}
RUN apk update
RUN apk add --no-cache curl python3

ARG TELEGRAF_VERSION=1.26.2
RUN curl -sfL https://dl.influxdata.com/telegraf/releases/telegraf-${TELEGRAF_VERSION}_linux_amd64.tar.gz |\
tar zxf - --strip-components=2 -C /

# we build this in the deploy container because there's no guarantee
# that golang:XXX-alpine and alpine:YYY will have the same python versions
RUN mkdir -p /usr/local/gcloud \
90 changes: 32 additions & 58 deletions README.md
@@ -26,12 +26,6 @@ The primary moving parts are:
exposes a Prometheus-compatible `/metrics` HTTP endpoint with statistics gleaned
from PGPool's [internal stats commands](https://www.pgpool.net/docs/42/en/html/sql-commands.html).

- The [telegraf](https://github.com/influxdata/telegraf) monitoring agent,
configured to forward the metrics exposed by the exporter to Google Cloud
Monitoring, formerly known as Stackdriver. (This portion is optional; if you
have an existing apparatus for scraping prometheus metrics, you can point
it directly at the exporter.)

In general you should expect that if you add a new replica, it should start
taking 1/Nth of the select query load (where N is the number of replicas)
within 5 to 15 minutes (beware of stackdriver reporting lag!).
@@ -50,7 +44,8 @@ instance no matter what. This is configurable at deploy time as

Old Version | New Version | Upgrade Guide
--- | --- | ---
v1.3.2 | v1.3.3 | [link](UPGRADE.md#v132--v13r)
v1.3.3 | v1.4.0 | [link](UPGRADE.md#v133--v140)
v1.3.2 | v1.3.3 | [link](UPGRADE.md#v132--v133)
v1.3.1 | v1.3.2 | [link](UPGRADE.md#v131--v132)
v1.3.0 | v1.3.1 | [link](UPGRADE.md#v130--v131)
v1.2.0 | v1.3.0 | [link](UPGRADE.md#v120--v130)
@@ -81,7 +76,7 @@ helm repo update
```sh
export RELEASE_NAME=my-pgpool-service # a name (you will need 1 installed chart for each primary DB)
export NAMESPACE=my-k8s-namespace # a kubernetes namespace
export CHART_VERSION=1.3.3 # a chart version: https://github.com/odenio/pgpool-cloudsql/releases
export CHART_VERSION=1.4.0 # a chart version: https://github.com/odenio/pgpool-cloudsql/releases
export VALUES_FILE=./my_values.yaml # your values file

helm install \
@@ -133,7 +128,6 @@ Parameter | Description | Default
`deploy.resources.pgpool` | Kubernetes [resource block](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/) for the pgpool container | `{}`
`deploy.resources.discovery` | Kubernetes [resource block](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/) for the discovery container. | `{}`
`deploy.resources.exporter` | Kubernetes [resource block](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/) for the pgpool2_exporter container. | `{}`
`deploy.resources.telegraf` | Kubernetes [resource block](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/) for the telegraf container. | `{}`
`deploy.startupProbe.pgpool.enabled` | whether to create a [startup probe](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes) for the pgpool container | `true`
`deploy.startupProbe.pgpool.initialDelaySeconds` | | `5`
`deploy.startupProbe.pgpool.periodSeconds` | | `5`
@@ -213,20 +207,6 @@ Parameter | Description | Default
<hr>
</details>

## telegraf options

<details>
<summary>Show More</summary>
<hr>

Parameter | Description | Default
--- | --- | ---
`telegraf.enabled` | If true, deploy and configure the telegraf container | `true`
`telegraf.exitOnError` | Exit the container if the telegraf process exits | `false`

<hr>
</details>

## pgpool options

<details>
@@ -266,11 +246,36 @@ pgpool.maxSpareChildren | When using [dynamic process management](https://www.pg
<hr>
</details>

# Monitoring with Google Cloud Monitoring (AKA Stackdriver)
# Monitoring

## Prometheus configuration

The `pgpool2_exporter` container exposes a Prometheus-style `/metrics` endpoint
on port 9090/tcp (exposed as the `metrics` port in the pod's port
configuration). If you have an existing local Prometheus infrastructure, or if
you are using the [Google Managed Service for
Prometheus](https://cloud.google.com/stackdriver/docs/managed-prometheus), you
can scrape and ingest those metrics directly.

The details will be specific to your local Prometheus setup, but
Prometheus-on-Kubernetes collection agents (whether the native Prometheus
agent or the OpenTelemetry Collector) traditionally use Kubernetes annotations
to configure their scrape targets. To add annotations to your pods, use the
`.deploy.annotations` value in your Helm values.yaml, for example:

```yaml
deploy:
  annotations:
    prometheus.io/scrape: enabled
    prometheus.io/path: /metrics
    prometheus.io/port: metrics
```
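
If you run your own Prometheus, a scrape job that honors these annotations could look like the sketch below. The job name and relabeling rules are illustrative assumptions, not part of the chart; note that `prometheus.io/port` above is a port *name*, so this sketch matches on the container port name rather than on a port number.

```yaml
# Illustrative scrape_config sketch -- the job name and relabeling
# choices are assumptions, not shipped by the pgpool-cloudsql chart.
scrape_configs:
  - job_name: pgpool-cloudsql
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods annotated prometheus.io/scrape: enabled.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: enabled
      # Keep only the container port named "metrics" (9090/tcp here).
      - source_labels: [__meta_kubernetes_pod_container_port_name]
        action: keep
        regex: metrics
      # Use the path declared in prometheus.io/path.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
```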

## Metrics list

If the telegraf container is enabled, pgpool-cloudsql exports the following Google Cloud Monitoring
Pgpool-cloudsql exports the following Google Cloud Monitoring
[metricDescriptors](https://cloud.google.com/monitoring/custom-metrics/creating-metrics)
with the [gke_container](https://cloud.google.com/monitoring/api/resources#tag_gke_container)
with the [prometheus_target](https://cloud.google.com/monitoring/api/resources#tag_prometheus_target)
resource type and all resource labels automatically filled in.

An example Stackdriver dashboard definition can be found in
@@ -283,7 +288,7 @@ The full list of metricDescriptor definitions is in
<summary>Full metric descriptor list</summary>
<hr>

All metricDescriptors are created under the `custom.googleapis.com/telegraf/` prefix.
All metricDescriptors are created under the `prometheus.googleapis.com/` prefix.

Metric Descriptor | List of Metric Labels
--- | ---
@@ -334,30 +339,6 @@ Metric Descriptor | List of Metric Labels
<hr>
</details>

# Monitoring with Prometheus directly

Using telegraf to forward prometheus metrics to Google Cloud
Monitoring/Stackdriver is optional: the `pgpool2_exporter` container exposes a
prometheus-style `/metrics` endpoint on port 9090/tcp (and named as the
`metrics` port in the pod port configuration), and if you have an existing
local Prometheus infrastructure or if you are using the [Google Managed Service
for Prometheus](https://cloud.google.com/stackdriver/docs/managed-prometheus),
you can scrape and ingest those metrics directly.

The details of this will be specific to your local Prometheus setup, but
traditionally prometheus-on-kubernetes collection agents (whether the native
prometheus one or the Opentelemetry Collector) use Kubernetes annotations to
configure their scrape targets. To add annotations to your pods, use the
`.deploy.annotations` value in your Helm values.yaml, for example:

```yaml
deploy:
  annotations:
    prometheus.io/scrape: enabled
    prometheus.io/path: /metrics
    prometheus.io/port: metrics
```

# background info: maybe the real friends were all the yaks we shaved along the way

A billion fussy details stood in between us and the simple-seeming state of
@@ -423,10 +404,3 @@ But there was some good news: the
binary that scrapes and parses the data returned by the "sql-like" commands and
exports it as a prometheus-compatible `/metrics` endpoint.

## Oh god, Telegraf

Lastly, it's worth calling out the [telegraf config
file](Helm/templates/configmap.yaml); in order to correctly fill in the
required attributes of a stackdriver `gke_container` resource, we run telegraf
under a [wrapper script](bin/telegraf.sh) that queries the GCP instance
metadata API in order to fill out the relevant environment variables.
10 changes: 10 additions & 0 deletions UPGRADE.md
@@ -1,5 +1,15 @@
# Upgrading Steps

## `v1.3.3` → `v1.4.0`

The 1.4.0 release removes support for using
[telegraf](https://github.com/influxdata/telegraf) to publish metrics to Google
Cloud Monitoring: it is recommended that you configure [Google Cloud Managed
Service for Prometheus](https://cloud.google.com/stackdriver/docs/managed-prometheus) if
you are running pgpool-cloudsql in a Google Kubernetes Engine cluster, or use
the [Prometheus Receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/prometheusreceiver/README.md)
of the [Opentelemetry Collector](https://opentelemetry.io/docs/collector/) otherwise.

## `v1.3.2` → `v1.3.3`

This is a maintenance release:
36 changes: 0 additions & 36 deletions bin/telegraf.sh

This file was deleted.

2 changes: 1 addition & 1 deletion charts/pgpool-cloudsql/Chart.yaml
@@ -15,7 +15,7 @@ apiVersion: v2
description: the pgpool-ii connection pooling postgres proxy with automatic discovery of GCP CloudSQL backends
name: pgpool-cloudsql
type: application
version: 1.3.3
version: 1.4.0
keywords:
- postgresql
- pgpool
44 changes: 0 additions & 44 deletions charts/pgpool-cloudsql/templates/configmap.yaml

This file was deleted.

21 changes: 0 additions & 21 deletions charts/pgpool-cloudsql/templates/deployment.yaml
@@ -225,27 +225,6 @@ spec:
        - name: EXIT_ON_ERROR
          value: {{ .Values.exporter.exitOnError | default "false" | quote }}

      {{- if eq .Values.telegraf.enabled "true" }}
      - name: telegraf
        image: "{{ .image }}"
        imagePullPolicy: Always
        {{- with .Values.deploy.resources.telegraf }}
        resources:
          {{- toYaml . | nindent 10 }}
        {{- end }}
        command:
        - /usr/bin/telegraf.sh
        env:
        - name: EXIT_ON_ERROR
          value: {{ .Values.telegraf.exitOnError | default "false" | quote }}
        volumeMounts:
        - name: telegraf-config
          mountPath: /etc/telegraf
      {{- end }}

      volumes:
      - name: etcdir
        emptyDir: {}
      - name: telegraf-config
        configMap:
          name: "{{ .Release.Name }}-telegraf"