"They're version dates not version numbers" is hairsplitting. You still have external information that you must rely on to determine what state your production db is in, and that's bad.
If an on-call SRE calls me in the middle of the night asking if they can add an index to solve a performance problem, I'd rather say "yes, no problem" than present a series of additional steps for them to jump through.
You review the script, and when the time comes to apply, recheck the database to make sure it's still the same state you generated the script against. Generally people tell you if they've made changes on live dbs that you're working on, but it's nice to double-check regardless.
> You still have external information that you must rely on to determine what state your production db is in, and that's bad
What external information? Whether each migration has been applied or not is stored in the database itself. The dates exist just to ensure correct ordering - that's literally their only purpose.
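To make the point concrete, here's a minimal sketch of how that tracking typically works (sqlite3 and illustrative table/column names are my assumptions, not any particular tool's schema): the applied set lives in a table inside the database, and the date prefix on each filename is used only for ordering.

```python
import sqlite3

def pending_migrations(conn, available):
    """Return migration filenames not yet applied, in date order.

    `available` is a list of date-prefixed filenames, e.g.
    ["20200101_create_users.sql", "20200215_add_index.sql"].
    Whether a migration has run is recorded in the database itself,
    so no external state is needed to answer "what's pending?".
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations ("
        "  version TEXT PRIMARY KEY,"
        "  applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    applied = {row[0] for row in
               conn.execute("SELECT version FROM schema_migrations")}
    # The date in the filename only determines ordering; applied-or-not
    # is answered by the schema_migrations table above.
    return sorted(f for f in available if f not in applied)
```

Recording a migration as applied is then just an INSERT into that table in the same transaction as the DDL, where the engine supports transactional DDL.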
> not present a series of additional steps for them to jump through.
If someone can't write the change they want to make into a file, write the opposite action into another file, commit and push that change to a VCS repo, I don't think they should be given access to a god damn toaster oven much less your production database.
> You review the script, and when the time comes to apply, recheck the database to make sure it's still the same state you generated the script against.
...How can that possibly work with automated deployments? And how on earth do you "recheck the database to make sure it's still the same", with any degree of certainty?
Your entire approach smells like a very manual process that doesn't work for teams any larger than 1 person.
> Whether each migration has been applied or not is stored in the database itself.
You're dragging this further into pedantic territory here. A chain of scripts and a version table are external to the structure of the database itself.
> If someone can't write the change they want to make into a file, write the opposite action into another file, commit and push that change to a VCS repo...
The recurring theme here is that you have a preference for mandatory busywork instead of a direct approach. People putting out fires ought to be focused on what will directly solve the problem most quickly and safely. In larger environments with dedicated ops people supporting multiple applications/environments/databases, not every ops person is going to be familiar with your code and preferred workflow.
> How can that possibly work with automated deployments? And how on earth do you "recheck the database to make sure it's still the same", with any degree of certainty?
...with a diff tool.
> Your entire approach smells like a very manual process that doesn't work for teams any larger than 1 person.
The whole point is that it is automatic rather than manual. I've used it before in teams "larger than 1 person" and it has worked fine.
> Your entire approach smells like a very manual process that doesn't work for teams any larger than 1 person.
You may be misunderstanding the concept. Automated declarative schema management (AKA diff-based approach) has been successfully used company-wide by Facebook for nearly a decade, to manage schema changes for one of the largest relational database installations on the planet. It's also a widely used approach for MS SQL Server shops. It's not some newfangled untested crazy thing.
- Generate migration (M) by comparing intended state (I) to production (P0): I - P0 = M
- Edit M as necessary, test for correctness, commit to master (meets your second criterion)
- Deploy code, with the migration running as a deploy step (meets your first criterion)
- Migration works as follows:
- Inspect P again, save as P1. If P0 != P1, abort the process (this prevents any issues from out-of-band changes, per your third criterion, and means the pending script won't run more than once)
- Apply M.
- Inspect P once more, save as P2. Double-check that P2 == I, as expected.
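The guard step above can be sketched in a few lines - a minimal illustration of the P0/P1 precondition check, using sqlite3 and names of my own choosing (any real tool would snapshot via its engine's catalog instead):

```python
import sqlite3

def fingerprint(conn):
    # Canonical name -> DDL snapshot, same idea as any schema diff tool.
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE sql IS NOT NULL ORDER BY name"
    )
    return dict(rows.fetchall())

def guarded_apply(conn, migration_sql, p0):
    """Apply migration M only if the live schema still matches P0.

    Returns True if applied, False if the precondition failed.
    """
    p1 = fingerprint(conn)
    if p1 != p0:
        return False   # out-of-band change detected: abort, don't re-run
    conn.executescript(migration_sql)
    return True        # caller then verifies fingerprint(conn) == I
```

Because P0 becomes stale the moment M is applied, re-running the same deploy step is a no-op rather than a double-application.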
I disagree. This is definitely all possible at once with proper tooling.
Ideally this workflow is wrapped in a CI/CD pipeline. To request a schema change, you create a pull request which modifies the CREATE statement in a .sql file in the schema repo. CI then automatically generates and shows you what DDL it wants to translate this change to.
If that DDL matches your intention, merge the PR and the CD system will apply the change automatically. If that DDL doesn't match your intention, you can either modify the PR by adding more commits, or take some manual out-of-band action, or modify the generated DDL (if the system materializes it to a file prior to execution).
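The "CI translates a changed CREATE into DDL" step might look roughly like this. This is a deliberately naive sketch with made-up helper names - real declarative tools parse SQL properly and handle far more than column add/drop - but it shows the desired-state-to-DDL translation:

```python
import re

def columns(create_sql):
    """Very naive column extraction from a single-line CREATE TABLE.

    Maps column name -> full column definition. Real tools use a proper
    SQL parser; this only illustrates the idea.
    """
    body = re.search(r"\((.*)\)", create_sql).group(1)
    return {c.strip().split()[0]: c.strip() for c in body.split(",")}

def diff_to_ddl(table, old_create, new_create):
    """Emit ALTER statements turning old_create's table into new_create's."""
    old, new = columns(old_create), columns(new_create)
    ddl = [f"ALTER TABLE {table} ADD COLUMN {d}"
           for name, d in new.items() if name not in old]
    ddl += [f"ALTER TABLE {table} DROP COLUMN {name}"
            for name in old if name not in new]
    return ddl
```

In the PR workflow described above, the reviewer sees this generated DDL in the CI output and approves or adjusts before the CD system applies it.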
In any case, renames are basically the only general situation requiring such manual action. Personally I feel they're a FUD example, for reasons I've mentioned here: https://news.ycombinator.com/item?id=21758143