Regain Cloud SQL disk space with Database Migration Service
So you made a mistake: you pushed an application change that accidentally sent a massive surge of data to your database. Your instance then did exactly what it was configured to do and automatically expanded its storage as it got close to running out of disk space. Most of the time, this is a lifesaving feature that prevents downtime when a gradually growing database would otherwise run out of space. But this time a mistake caused your storage to spike much higher than it normally would. Thankfully you’ve now fixed the application bug and your database isn’t being overrun anymore, but the damage has been done and your disk is much bigger than it needs to be. And while the adage “Storage is cheap” might be true, when you’re renting that storage you want to be sure you’re not paying for space you’re not using.
There’s not (yet) a good way to simply shrink the disk of a Cloud SQL instance. Block storage gives us the performance we want, but it is unfortunately more challenging to shrink down to size. If you look at the techniques out there for doing this, the advice is often to export your data, create a new instance with an appropriately sized disk, and import the data into it. Or be a bit sneaky and set up an external replica (which, unlike a regular replica, can have a smaller disk than the primary), then promote it once everything is synced up.
There’s nothing wrong with doing that. Those methods really are the right ways to do this. But they all involve at least some downtime, and setting up the infrastructure to do it right while keeping production systems running takes effort, so the cost-benefit analysis might end with just leaving the disk alone and calling it good enough.
At the end of the day, this process is akin to a migration from the existing instance that’s grown unnecessarily to a right-sized instance. To make migrations a bit less daunting, we’ve built Database Migration Service (DMS), Google’s managed migration service, which makes this nice and simple. You just need to define a connection profile for the Cloud SQL source instance whose disk you want to reduce, and DMS guides you through creating the destination instance, where you can specify the smaller disk.
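Before creating the destination instance, it helps to decide how small the new disk should actually be. The following is a minimal sketch, not part of DMS: the function name, the 30% headroom, and the 10 GB floor are all assumptions chosen for illustration, so check the current Cloud SQL limits and your own growth rate before picking a number.

```python
import math


def recommended_disk_gb(used_gb: float, headroom_pct: int = 30, minimum_gb: int = 10) -> int:
    """Suggest a whole-GB disk size for the destination instance.

    used_gb:      space the data currently occupies on the source.
    headroom_pct: extra free space to leave on top of current usage
                  (hypothetical default, tune to your growth rate).
    minimum_gb:   smallest disk to ever suggest (assumed floor).
    """
    if used_gb < 0:
        raise ValueError("used_gb must not be negative")
    if headroom_pct < 0:
        raise ValueError("headroom_pct must not be negative")
    # Scale current usage by the headroom and round up to a whole GB.
    sized = math.ceil(used_gb * (100 + headroom_pct) / 100)
    return max(minimum_gb, sized)


if __name__ == "__main__":
    # Example: the data occupies 120 GB, but the disk auto-grew far beyond that.
    print(recommended_disk_gb(120))
```

Leaving headroom matters here because an undersized destination would simply trigger automatic storage growth again soon after the cutover.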
There are a few things to think about that can make this a bit easier. First, you need to prepare your instance for migration, which can, depending on your setup, involve a restart of the database. If you haven’t ever migrated a database, we’ve got a couple of good blog posts for MySQL and PostgreSQL.
Second, the IP address your application uses is going to change, because you’ll need to point at the new instance once it’s promoted. The good news is that DMS lets you set up the migration as a continuous (streaming) migration, so there are no worries about the export/import catch-up you might normally need to do. Replication continues right up until you promote the new instance to be the primary. This means you can take the time you need to prepare the application and other tooling, and to test the new instance, before cutting over.
One way to handle the address change is to leave the target instance as a replica while you cut over all your applications. That way you have coverage from both instances while your applications catch up with the new IP address. This results in the least downtime, but there may be complications depending on how the applications use the data in the database.
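One of those complications is promoting too early, while writes are still landing on the source or the destination has not caught up. As a rough sketch of a pre-promotion checklist, the snippet below encodes two conditions. The class and function names are hypothetical, and the values are assumed to come from your own monitoring and deployment tooling rather than from any DMS API.

```python
from dataclasses import dataclass


@dataclass
class CutoverCheck:
    # True once every application has stopped writing to the old instance.
    source_writes_stopped: bool
    # How far the destination trails the source, as reported by your monitoring.
    replication_lag_seconds: float


def blocking_reasons(check: CutoverCheck) -> list[str]:
    """Return the reasons promotion should wait; an empty list means go ahead."""
    reasons = []
    if not check.source_writes_stopped:
        reasons.append("applications are still writing to the source instance")
    if check.replication_lag_seconds > 0:
        reasons.append(
            f"destination is {check.replication_lag_seconds:g}s behind the source"
        )
    return reasons


if __name__ == "__main__":
    for reason in blocking_reasons(CutoverCheck(False, 12.5)):
        print("wait:", reason)
```

Returning the reasons instead of a bare boolean makes it easier to see what is still holding up the cutover.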
A good way to future-proof this, if you’re not doing it already, is to have your application use the Cloud SQL Auth Proxy.
Of course, if you have an established instance, adopting the proxy also means some downtime: the application’s connection logic has to change to point at wherever the proxy is running (preferably on the same machine as the application), and you have to cut over. The benefit of this setup is that repointing the application later involves only the briefest of downtime while you shut down the proxy pointing at the old instance and restart it pointing at the new one.
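To make the proxy swap concrete, here is a small sketch that only builds the command line for the proxy and does not run it. It assumes the v2 Auth Proxy binary is installed as `cloud-sql-proxy` and listens on a local port given with `--port`. The helper name and the instance connection names are made up for illustration.

```python
def proxy_command(instance_connection_name: str, port: int) -> list[str]:
    """Build the argument list for starting the Cloud SQL Auth Proxy.

    instance_connection_name is expected in the usual
    "project:region:instance" form (values used below are placeholders).
    """
    parts = instance_connection_name.split(":")
    if len(parts) < 3 or not all(parts):
        raise ValueError("expected a name like project:region:instance")
    # The application keeps connecting to localhost on this port;
    # only the instance named at the end changes at cutover.
    return ["cloud-sql-proxy", "--port", str(port), instance_connection_name]


if __name__ == "__main__":
    old = proxy_command("my-project:us-central1:oversized-db", 5432)
    new = proxy_command("my-project:us-central1:rightsized-db", 5432)
    print(" ".join(old))
    print(" ".join(new))
```

The two commands differ only in their last argument, which is why the application itself needs no change at cutover: it continues to connect to the same local port.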
Other ways of doing a similar thing include putting a load balancer or proxy in front of the instance so that your applications all have a consistent IP address to connect to. That leaves you able to swap the databases behind it without having to change application code and redeploy, which isolates infrastructure changes from your application code. These setups can be as complex or as simple as you need them to be. For example, in the following diagram you see how a load balancer is paired with a high availability proxy to spread out read operations.
If you’re interested in diving down the rabbit hole on proxying in various ways to your Cloud SQL instance, there’s a great article here to check out.