[[_TOC_]]

# Helm Chart for GCP Cloud SQL Archive

## Overview

This chart archives specified GCP Cloud SQL databases into a GCP Cloud
Storage bucket.

GCP Cloud SQL has facilities for backups, but these serve principally for
disaster recovery. This chart is intended for archiving. The idea is
that you use this chart to back up your databases every so often to a
Cloud Storage bucket with versioning enabled. For example, you might
back up once a month with your bucket configured to delete a version
only if the version is at least a year old and there are at least 12
newer versions. This way you have monthly archives going back a year.
See also [Cloud Storage "Object Lifecycle Management"][7].
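
A minimal sketch of such a retention policy, assuming the bucket
`sql-dump1` used in the example below (adjust the numbers to your
retention needs):

```
# Sketch: enable versioning, then add a lifecycle rule that deletes a
# version only when it is at least a year old AND at least 12 newer
# versions exist. The bucket name is an assumption from the example below.
gsutil versioning set on gs://sql-dump1

cat > lifecycle.json <<'EOF'
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"age": 365, "numNewerVersions": 12}
    }
  ]
}
EOF

gsutil lifecycle set lifecycle.json gs://sql-dump1
```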

This Helm chart is a thin wrapper around the [`gcloud` SQL export
command][1] `gcloud sql export sql`. The wrapper has the advantage that it
can dump multiple databases into individual files whereas calling `gcloud
sql export sql` directly puts all the databases into a single file.
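
Per database, the wrapper effectively runs something like the following
(an illustrative sketch using the values from the example below; the
exact invocation lives in
[`gcp-cloudsql-archive.py`](gcp-cloudsql-archive.py)):

```
# Rough per-database equivalent of what the wrapper runs. A destination
# URI ending in .gz makes gcloud compress the dump.
gcloud sql export sql mysql-1 gs://sql-dump1/mysql-1--addresses.gz \
    --database=addresses \
    --project=gcp-proj-id
```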

The chart consists of a single Kubernetes CronJob. The assumption is that
the CronJob will run in the context of [Workload Identity][2] which gives
the CronJob the access it needs to both read the database and write to the
Cloud Storage bucket. It is up to the user of this chart to ensure that
the Kubernetes Service Account (specified by the
`kubernetesServiceAccountName` configuration directive) has been set up
with Workload Identity and links to a GCP service account with the
necessary permissions; see [section GCP service account
permissions](#gcp-service-account-permissions) below for more detail on
what permissions are required.
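
Wiring that up is outside the chart itself, but a minimal sketch of the
standard Workload Identity setup looks like this (the service-account
and namespace names here are placeholders):

```
# Sketch of the Workload Identity wiring (all names are placeholders).
# 1. Allow the Kubernetes service account to impersonate the GCP
#    service account.
gcloud iam service-accounts add-iam-policy-binding \
    archiver@gcp-proj-id.iam.gserviceaccount.com \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:gcp-proj-id.svc.id.goog[my-namespace/my-ksa]"

# 2. Annotate the Kubernetes service account with the GCP service account.
kubectl annotate serviceaccount my-ksa --namespace my-namespace \
    iam.gke.io/gcp-service-account=archiver@gcp-proj-id.iam.gserviceaccount.com
```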

The archiving wrapper script is copied via a `git clone` into the Pod
during initialization, so this repository must remain *publicly*
viewable.

Note: The Cloud SQL database and the destination bucket **must** be in the same
GCP project.

## Example of use

To dump the database `addresses` from the Cloud SQL instance `mysql-1` in
the GCP project `gcp-proj-id` to the bucket `sql-dump1`, set these values
in your `values.yaml` file:

```
backup:
  project_id: gcp-proj-id
  instance:   mysql-1
  bucket:     gs://sql-dump1
  databases:
    - addresses
```

When the CronJob is done running, the archive file will be in
`gs://<bucket>/<instance>--<database>.gz` which, in this case, is
`gs://sql-dump1/mysql-1--addresses.gz`.
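
To deploy the chart with these values, a minimal sketch, assuming the
chart is installed from a local checkout of this repository (the
release name `cloudsql-archive` and the chart path are assumptions):

```
# Sketch: install or upgrade the chart from a local checkout. Release
# name and chart path are assumptions, not fixed by this README.
helm upgrade --install cloudsql-archive ./common/helm -f values.yaml
```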

## Design

The chart is a single CronJob. The CronJob has an InitContainer that
copies the Python 3 script
[`gcp-cloudsql-archive.py`](gcp-cloudsql-archive.py) from this Git
repository to the `/root` directory. The main container is Google's
[`cloud-sdk` image][3], which contains the [`gcloud` executable][4];
it runs `gcp-cloudsql-archive.py` with parameters and options from
[`values.yaml`](common/helm/values.yaml) (see "Configuration" below).
The program `gcp-cloudsql-archive.py` in turn calls `gcloud`.

See also the [man page for
`gcp-cloudsql-archive.py`](gcp-cloudsql-archive.md).

## Configuration

All configuration is done in
[`values.yaml`](common/helm/values.yaml). Values are required unless
marked otherwise.

* `namespace`: The Kubernetes namespace

* `kubernetesServiceAccountName`: The name of the Kubernetes service
account to run the CronJob under. It needs to be bound to a GCP service
account using Workload Identity. The GCP service account must have
permission to dump the SQL data and write to the bucket; see [GCP
service account permissions](#gcp-service-account-permissions) below
for more details on the permissions needed.

* `backup:project_id`: The GCP project ID where the Cloud SQL instance
and the bucket reside.

* `backup:instance`: The name of the Cloud SQL instance.

* `backup:bucket`: The destination bucket. Should be of the form `gs://mybucket`.

* `backup:databases`: A list of databases to back up. Each
will be backed up to a separate object in the bucket. Example:

        backup:
          databases:
            - abc
            - def
            - xyz

* `backup:verbosity`: (optional) How verbose we want the `gcloud sql export sql`
command to be. See the [`gcloud sql export` documentation][1] for the allowed values.
Default: `info`.

* `backup:verbose`: (optional) Make the wrapper script itself more verbose.
Note that this is different from `backup:verbosity`.
Default: `false`.

* `backup:dry_run`: (optional) Run in "dry-run" mode, that is, show what
will happen but do not actually run the backups. Default: `false`.

* `crontab`: A standard five-field [crontab time string][5].
Default: `"30 6 1 * *"` (run on the first of the month at 06:30).

* `sleepmode`: (optional) Run the CronJob but execute `sleep infinity` rather
than the gcloud backup. This is useful for debugging; see the sketch
after this list.
Default: `false`.
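
As an example of the two debug-oriented values, a sketch that deploys a
release which makes no changes (release name and chart path are
assumptions, as in the install sketch above):

```
# Sketch: temporarily enable dry-run and sleep mode for debugging.
helm upgrade --install cloudsql-archive ./common/helm -f values.yaml \
    --set backup.dry_run=true \
    --set sleepmode=true
```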

## Important Notes

* Trying to export a non-existent database will cause the application
to hang, so be careful when configuring; see the check after this list
for one way to verify database names.

* The Cloud SQL database and the destination bucket **must** be in the same
GCP project.

* This repository must be _publicly_ readable so that the InitContainer can copy the
script [`gcp-cloudsql-archive.py`](gcp-cloudsql-archive.py) to the main Pod.
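
To avoid the hang mentioned in the first note above, you can check the
database names in your `values.yaml` against what actually exists on
the instance (using the example values from earlier):

```
# List the databases that actually exist on the instance, so the names
# in values.yaml can be verified before the CronJob runs.
gcloud sql databases list --instance=mysql-1 --project=gcp-proj-id
```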

## GCP service account permissions

The assumption is that users of this Helm chart will link the CronJob's
Kubernetes service account to a GCP service account using [Workload
Identity][2]. The GCP service account must have the permissions to read
the data from the Cloud SQL instance and write to the Cloud Storage
bucket. The permissions needed are described on the ["Exporting and
importing using SQL dump files" page][1].

You will also need to grant the service account associated with the Cloud
SQL instance the right to create storage objects. For more details on
this, as well as Terraform code to enable these permissions, see [this
StackOverflow answer][6].
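
A minimal sketch of that grant, again with the example names (the
StackOverflow answer above shows a Terraform equivalent):

```
# Sketch: grant the Cloud SQL instance's own service account write
# access to the bucket. Instance and bucket names are from the example;
# tighten the role if objectAdmin is broader than you want.
SA=$(gcloud sql instances describe mysql-1 --project=gcp-proj-id \
    --format='value(serviceAccountEmailAddress)')
gsutil iam ch "serviceAccount:${SA}:roles/storage.objectAdmin" gs://sql-dump1
```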

[1]: https://cloud.google.com/sdk/gcloud/reference/sql/export/sql

[2]: https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity

[3]: https://gcr.io/google.com/cloudsdktool/cloud-sdk

[4]: https://cloud.google.com/sdk/gcloud

[5]: https://en.wikipedia.org/wiki/Cron

[6]: https://stackoverflow.com/a/69561246/268847

[7]: https://cloud.google.com/storage/docs/lifecycle