Cloud SQL 是 Google Cloud 上的一个数据库托管服务，支持 MySQL, PostgreSQL 和 SQL Server。 Cloud SQL 提供了高可用性 (HA)，Read replica，自动备份等功能，降低数据库运维工作量。

OneTracker (https://onetracker.app) 的后端数据库使用 MySQL，在项目刚开始时使用了 Cloud SQL 上最小的 instance，节省了开发时间。由于流量逐渐增加，需要使用更大的 instance，Cloud SQL 的额外收费也越来越多。本文介绍我把数据库从 Cloud SQL 迁移到 GCE (Google Compute Engine) 上 MySQL 的过程。

价格

Cloud SQL 的收费大概是相同 GCE VM 的两倍，并且不支持稍高性价比的 E2 型 VM，如 e2-medium (1 vCPU, 4GB RAM) $24.46/mo。HA 配置价格 x2。

Instance	GCE	Cloud SQL	Ratio
g1-small (0.5 vCPU, 1.7GB RAM)	$11.76	$25.55	217%
n1-standard-1 (1 vCPU, 3.75GB RAM)	$24.27	$49.31	203%
n1-standard-2 (2 vCPU, 7.5GB RAM)	$48.54	$98.62	203%

功能

先对比一下比较有可能用到的 Cloud SQL 的功能在 GCE 上的对应。

Feature	Cloud SQL	GCE VM
VM	f1, g1, n1	All types
HA	Regional PD + Active/Standby + VIP	Regional disk + secondary launch
Auto backup	Automatic	Snapshot schedule
Monitoring	Automatic	Stackdriver agent
Storage increase	Automatic	Manual PD resizing + growpart
Point-in-time recovery	Half-manual	Backups + binlog

HA

Cloud SQL 的 HA 实现是使用 Regional PD（Regional Persistent Disk，一个同步备份到另一个 Availability Zone 的块存储设备），在 Primary instance 无响应时把数据库 VIP 接收到的流量转发到 Standby instance。这个过程从检测到完成切换需要数分钟时间。

在 GCE 里可以使用任意一种 HA 方案，我选择了使用 Regional PD 并在 Primary 所在的 Zone 失效时启动 Secondary，挂载同一个 PD。这个过程可以手动触发，也可以自动化。使用脚本启动 GCE VM 并且 MySQL 就绪需要 1-2 分钟时间。由于这个方法不需要一直维持两个 VM 运行，相对 Cloud SQL 可以省更多的钱。

自动备份

GCE 支持 Snapshot schedule，可以定期自动备份 Persistent Disk，snapshot 支持 regional 和 multi-regional 存储。

监控

Stackdriver Agent 可以把 VM 上采集的监控数据上传到 GCP，支持采集 MySQL 和 InnoDB 相关数据。

存储扩容

GCE PD 支持在挂载状态下调整大小，结合 growpart 命令调整分区大小，可以实现在不中断服务的情况下扩展存储空间。

Point-in-time recovery

Cloud SQL 使用备份 + binary log 实现 Point-in-time recovery，GCE 上的部署可以用类似的方法。

性能

目前 OneTracker 数据库的 QPS 峰值大概在 500 左右，我做了一些 benchmark 来测试 GCE 部署是否能满足需求。

测试参数

Client:       n2-standard-8
MySQL Server: e2-small
Disk:         32GB Regional PD-SSD, 960 IOPS, 15.36/7.68 MB/s r/w throughput
Table size:   10,000,000
Threads:      8

# create table
sysbench oltp_read_write --table_size=10000000 --db-driver=mysql --mysql-host=x.x.x.x --mysql-db=test --mysql-user=benchmark --mysql-password=*** prepare

# r/w
sysbench oltp_read_write --table_size=10000000 --db-driver=mysql --mysql-host=x.x.x.x --mysql-db=test --mysql-user=benchmark --mysql-password=*** --threads=8 --time=120 --report-interval=10 run

# read only
sysbench oltp_read_only --table_size=10000000 --db-driver=mysql --mysql-host=x.x.x.x --mysql-db=test --mysql-user=benchmark --mysql-password=*** --threads=8 --time=120 --report-interval=10 run

结果

[ 10s ] thds: 8 tps: 244.16 qps: 4891.33 (r/w/o: 3425.39/976.83/489.11) lat (ms,95%): 51.94
[ 20s ] thds: 8 tps: 298.21 qps: 5963.54 (r/w/o: 4174.40/1192.73/596.41) lat (ms,95%): 43.39
[ 30s ] thds: 8 tps: 335.10 qps: 6701.90 (r/w/o: 4691.10/1340.60/670.20) lat (ms,95%): 36.24
[ 40s ] thds: 8 tps: 326.70 qps: 6536.50 (r/w/o: 4576.20/1306.90/653.40) lat (ms,95%): 34.33
[ 50s ] thds: 8 tps: 353.10 qps: 7061.10 (r/w/o: 4942.50/1412.40/706.20) lat (ms,95%): 33.72
[ 60s ] thds: 8 tps: 316.70 qps: 6336.30 (r/w/o: 4435.30/1267.60/633.40) lat (ms,95%): 40.37
[ 70s ] thds: 8 tps: 249.30 qps: 4985.63 (r/w/o: 3489.42/997.61/498.60) lat (ms,95%): 48.34
[ 80s ] thds: 8 tps: 82.50 qps: 1646.00 (r/w/o: 1152.60/328.40/165.00) lat (ms,95%): 227.40
[ 90s ] thds: 8 tps: 88.60 qps: 1774.11 (r/w/o: 1242.10/354.80/177.20) lat (ms,95%): 219.36
[ 100s ] thds: 8 tps: 85.60 qps: 1715.90 (r/w/o: 1201.20/343.50/171.20) lat (ms,95%): 223.34
[ 110s ] thds: 8 tps: 89.80 qps: 1792.20 (r/w/o: 1253.80/358.90/179.50) lat (ms,95%): 219.36
[ 120s ] thds: 8 tps: 88.50 qps: 1770.50 (r/w/o: 1239.80/353.60/177.10) lat (ms,95%): 223.34
SQL statistics:
    queries performed:
        read:                            358274
        write:                           102364
        other:                           51182
        total:                           511820
    transactions:                        25591  (212.89 per sec.)
    queries:                             511820 (4257.87 per sec.)
    ignored errors:                      0      (0.00 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          120.2044s
    total number of events:              25591

Latency (ms):
         min:                                    8.80
         avg:                                   37.57
         max:                                  736.28
         95th percentile:                      207.82
         sum:                               961491.94

Threads fairness:
    events (avg/stddev):           3198.8750/263.25
    execution time (avg/stddev):   120.1865/0.00

由于 e2-small 是 burstable instance，只有 0.5 vCPU sustainable，但可以在短时间使用 2 个 vCPU。从上面的测试结果可以看出在测试开始 60 秒左右 QPS 开始下降。

性能下降时，CPU 被 throttle (75.1 steal)。剩下 0.25 * 2 = 0.5 vCPU。但即使是这种情况 e2-small 也能满足现在的需求。

部署

使用 Packer + Terraform 实现 Infrastructure as code。

Packer 是自动化镜像构建工具。用指定的 base image（通常是云服务平台上提供的标准 image）创建一个临时 VM，在上面执行一系列操作（安装+更新软件, 放置配置文件等等），并把操作后的 VM 打包成 image。下面的例子配置文件定义了构建 image 的临时 VM 的参数和要在上面执行的操作（provisioners）。其中 provision.sh 在 VM 上执行具体的安装配置工作。

{
    "builders": [
        {
            "type": "googlecompute",
            "project_id": "<project_id>",
            "network": "<network>",
            "zone": "us-central1-b",
            "machine_type": "e2-standard-2",
            "disk_type": "pd-ssd",
            "disk_size": "16",
            "source_image_family": "ubuntu-2004-lts",
            "ssh_username": "packer",
            "tags": "packer",
            "image_name": "mysql-",
            "image_storage_locations": ["us-central1"]
        }
    ],
    "provisioners": [
        {
            "type": "file",
            "source": "etc",
            "destination": "/tmp/etc"
        },
        {
            "type": "shell",
            "script": "./provision.sh"
        }
    ]
}

Terraform 可以用配置文件指定 VM 的参数（如镜像，CPU/RAM，挂载硬盘等）并创建/更新 VM。实现“一键”管理资源，提高效率，避免手动操作 GUI 出错。如果需要修改 VM 的参数，只需修改配置文件并应用即可。以下是一个例子配置文件:

resource "google_compute_instance" "mysql" {
    project      = "<project_id>"
    zone         = "us-central1-b"

    name         = "<instance_name>"
    machine_type = "e2-small"

    boot_disk {
        initialize_params {
            image = "mysql-xxxxx"
            type  = "pd-balanced"
        }
    }

    attached_disk {
        device_name = "mysql-data"
        source      = "<regional_pd_url>"
    }

    network_interface {
        network = "<network>"
    }
}

实际的配置文件会稍复杂一些，以支持多个 VM, Shielded VM, Access scope 等。

新架构

由于自己部署的 GCE 在同一个 VPC 内，可避免使用 Public IP 或者 VPC Peering。VM 也不需要有公网连接，提高了安全性。

迁移计划

有多种迁移方案可选，比如导出+导入，设置 replication，master-master 等等。由于 OneTracker 的特性（快递追踪）我选择在 12 月 25 凌晨进行迁移，这时圣诞礼物快递基本已经到了。主要用户都在北美，基本没有快递会在深夜送达。因此认为短时间的宕机（<30 分钟）不会造成大的影响，选择了最简单的导出+导入方案。在迁移过程中，源数据库设置为只读模式，这样服务不会完全中断，用户还可以看到数据。计划如下：

临时提高 GCE VM 的配置，启动。
启动迁移用 VM (n2-highmem-8 - 8vCPU, 64GB RAM)，在上面启动 Cloud SQL Proxy。
将 Cloud SQL 设置为只读模式 SET GLOBAL read_only = ON;。

在迁移 VM 上导出数据库到内存文件系统（避免 IO 成为瓶颈）:

mysqldump --host=127.0.0.1 --port=3306 --column-statistics=0 --single-transaction --set-gtid-purged=OFF --password --user=onetracker_prd onetracker_prd | pv > /tmp/onetracker_prd.sql

导入数据库到 GCE MySQL:

pv /tmp/onetracker_prd.sql | mysql --host=<instance_name>.c.<project_id>.internal --port=3306 --password --user=onetracker_prd onetracker_prd

停止 GCE VM 上的 MySQL 服务。
GCE VM 关机，更改为正常配置，启动。
验证数据库可连接。
后端 API 部署新配置，指向 GCE MySQL。

UI 修改

迁移过程中由于数据库是只读状态，用户只能查看数据，尝试登录或提交任何更改时会返回不太友好的错误消息：

替换错误消息后:

迁移实施

12 月 24 号，在移动端远程添加了一条公告消息：

25 号 01:20 UTC-8，在执行到 SET GLOBAL read_only = ON; 时出错，发现 Cloud SQL 里的 root 用户没有 SUPER 权限。原来这一步竟然只在自己部署的 MySQL 服务器上测试过。由于已经到预先公布的维护时间，决定直接把后端设置为维护模式，拒绝所有请求。此时用户无法读写数据。

01:39 导入完成，服务恢复。

从 Cloud SQL 迁移到 GCE

价格

功能

HA