Tuesday, July 9, 2019

Re: [google-cloud-sql-discuss] Modify database schema and data as a part of deployment process


As a startup, we want to avoid complex deployments where you have to maintain support for both the old and new schema in the code, run migrations manually, and deal with all the related pain. I did a lot of that on larger projects, so I know how hard it can get.

Migrations are challenging but also common: every ORM or query builder I can find for Node or Ruby ships with a built-in way of running migrations. It is strange that Google Cloud, as a platform that provides both App Engine deployments and a SQL database, has no official mainstream way of applying migrations for a simple application. There must be a community-defined or officially documented process for database migrations to start from, but I didn't find any.


I like the idea of a Docker container whose sole job is to run migrations.
Can you explain how such a container could be defined and plugged into the deployment process so that it runs after the image is built but before the traffic split is set, and so that the deployment only continues if it exits with a success status code? A sketch of what I imagine follows the deployment log below.
Doc links are welcome.


Step #1: INFO[0116] Taking snapshot of full filesystem...
Step #1: INFO[0146] CMD yarn start
....

Finished Step #1
PUSH
DONE
-------------------------

Updating service [default] (this may take several minutes)...done.
<BASICALLY HERE>
Setting traffic split for service [default]...done.
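
Something like this is what I have in mind, sketched with placeholders (the version name, the image name, the `yarn migrate` script inside it, and the database connection are all assumptions, not anything the platform prescribes):

set -e  # abort the deployment if any step, including the migration, fails

# Deploy the new version without shifting any traffic to it yet.
gcloud app deploy --no-promote --version=v2 --quiet

# Run the migration-only container against the database
# (e.g. through a Cloud SQL proxy started beforehand).
docker run --rm --network=host gcr.io/my-project/my-app:v2 yarn migrate

# Only reached if the migration exited with status 0: move traffic over.
gcloud app services set-traffic default --splits=v2=1 --quiet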


On Tuesday, July 9, 2019 at 5:21:47 PM UTC+3, Jeffrey Eliasen wrote:
Migrations are a challenging topic, and a single email can only scratch the surface.

If your application is deployed in a Docker container, you can run migrations by triggering a one-time task that starts a container whose sole job is to run `python manage.py migrate` and exit; all the migrations that exist in that container will be applied. If you're on Kubernetes, the mechanism for one-time tasks is the Job: https://cloud.google.com/kubernetes-engine/docs/how-to/jobs. A Job is essentially a one-time operation, so it runs once when triggered rather than on every instance in a cluster.
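
For example (a sketch only; the image name and tag are whatever your build produces, not anything GKE mandates):

# Start a one-off Job from the same image that was just built.
kubectl create job migrate-v2 \
  --image=gcr.io/my-project/my-app:v2 \
  -- python manage.py migrate

# Block until the Job completes (or time out), so the rest of the
# rollout only proceeds once the schema change has succeeded.
kubectl wait --for=condition=complete job/migrate-v2 --timeout=5m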

Regarding schema changes and compatibility, it *is* possible to define migrations in a way that running code never breaks when the migration happens, though it involves multiple migration jobs and intermediate deployments. The basic idea is to run a migration that is 100% backwards compatible (adding a table or field is fine, but changing existing tables and fields is not), then deploy code that can handle both the before and after schemas (but prefers the "after" schema), then update all your entities so the newly added fields hold the "right" data, then switch to the "final" code that only knows about the "after" schema, and finally run a second migration that deletes the fields and tables no longer referenced by any running code. Using this sequence you can update your application without downtime or failed requests, but it's definitely harder than the "usual" way of just deploying and running the single migration right away.
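
As an illustration of that sequence, here is roughly what renaming a `users.name` column to `users.full_name` could look like; the table, the columns, and the use of `psql` are just for the example:

# 1. Backwards-compatible migration: only add the new column.
psql "$DATABASE_URL" -c 'ALTER TABLE users ADD COLUMN full_name text;'

# 2. Deploy code that writes both columns but prefers full_name when reading.
gcloud app deploy --quiet

# 3. Backfill so every existing row has the "right" data in the new column.
psql "$DATABASE_URL" -c 'UPDATE users SET full_name = name WHERE full_name IS NULL;'

# 4. Deploy the "final" code that only knows about full_name.
gcloud app deploy --quiet

# 5. Second migration: drop the column that no running code references any more.
psql "$DATABASE_URL" -c 'ALTER TABLE users DROP COLUMN name;'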

----------

jeffrey k eliasen - technologist, philosopher, agent of change


On Jul 8, 2019, at 02:58, Bogdan Gusiev <agr...@gmail.com> wrote:

I'm used to the idea that database migrations (scripts that change the schema or data) are shipped as part of the deployment process.


The Google Cloud platform doesn't have a recommended way to run migrations.


However, running them at application startup causes migrations to be run multiple times, once for each instance of the server being launched.

That can cause problems if the migration takes more than 10 seconds to run and performs non-trivial data changes.

Is there a way to plug a `migrate` command into the deployment process so that it is executed only once per deployment?


Also, it would be great to perform migrations as close as possible to the service restart, because in some cases the old deployed version can be incompatible with the new DB schema.


