Still have to decide on a single OS to reduce maintenance problems. Could just have installed all the services (which are all available as packages) and handled the configuration files, instead of configuration files + docker file + s3 costs of docker image with nothing but the base os + one package and a configuration file.
That's not a fair assessment and I'm surprised this is the top comment. In this case, the huge advantage to a containerized setup is that everything is now easily portable. If his server goes down, or he just decides to move, OP can now deploy all of his websites onto another server instantly. He also cites the ability to build (and test) locally before shipping images to production, which is a really neat workflow. Improved security comes as an added bonus.
As for the "s3 costs of docker image", it's a few cents per month.
> Without adding all that docker specific complexity described in the post:
My guess is that you never used containers at all, let alone Docker.
A Dockerfile is just a setup script that installs the needed services on an image, which you can run on any machine. That's it. There is no added complexity.
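For instance, a complete Dockerfile for one of these services can be as small as this (just a sketch; the service and config file names are made up for illustration):

    FROM alpine:3.9
    # install the service from the distro's own package repository, exactly like a setup script would
    RUN apk add --no-cache nginx
    # drop in the same config file you would otherwise manage on the host
    COPY nginx.conf /etc/nginx/nginx.conf
    # run the service in the foreground so the container supervises it
    CMD ["nginx", "-g", "daemon off;"]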
This is not about a single Docker file vs a setup script. If you read my post you will see that I describe the steps the author took. And there are plenty of them.
My guess is that you did not read the article at all.
He was "building Docker images for each of the services". So not a single one. 10 of them. And he signed up for a commercial registry to host them. An additional service he depends on now.
Yet even a single Docker file would not be as simple as a setup script. A setup script on the host OS would install some packages that the host OS will keep up to date. Using a Docker image instead puts the burden on you to keep it up to date.
I agree with this statement that Dockerizing creates more dependencies that you need to track. But...
> A setup script on the host OS would install some packages that the host OS will keep up to date.
This is simply not as easy as you make it out to be. Installing dozens of services from the OS inherently creates a nest of dependencies which is hard to reproduce explicitly on other systems.
Whereas Docker provides explicit isolated environments for each service so it's far easier to reproduce on other systems. This appeals to me for cloud environments but Docker on the desktop might be a bit too far for me...
Yes, isolation is a big win. It means I can update the “os” each service resides on independently of each other, so I don’t have to tackle 10 upgrades at once.
It also removes attack vectors and weirdness that happens when a package sees optional dependencies on the system. I.e., if I need ldap for one thing, I don't have services in other containers trying to work with ldap.
Now every time a package in Alpine gets an update you have to update all 10 containers. Because you will have no way of knowing if that package impacts the security of the service running in that container.
Yes, most docker enthusiasts don't do this. They run a bunch of containers full of security holes.
I expect this to become a hot topic as soon as we start witnessing data breaches that have outdated containers as their source.
> Now every time a package in Alpine gets an update you have to update all 10 containers. Because you will have no way of knowing if that package impacts the security of the service running in that container.
That's pretty much the baseline when dealing with any software system, whether it's a bare metal install of a distro, a distro running on a VM, or software running in a container.
> Now every time a package in Alpine gets an update you have to update all 10 containers.
All it takes is inheriting the latest version of an image and running docker build prior to redeploying.
I mean, this stuff is handled automatically by any CI/CD pipeline.
If you don't care about running reproducible containers you can also sh into a container and upgrade it yourself.
Do you also complain about package managers such as deb or rpm because most debian and redhat users run a bunch of software full of security holes?
Software updates are not a container issue; they are a software deployment issue. I mean, when you complain about keeping packages updated you are in fact complaining about the OS running on the base image.
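To make the rebuild point concrete: when your Dockerfiles reference the base image by tag, picking up an Alpine security update is roughly this (a sketch; the registry and image names are placeholders):

    # --pull refreshes the alpine base layer, picking up its security updates
    docker build --pull -t registry.example.com/my-service:latest .
    docker push registry.example.com/my-service:latest
    # on the server: fetch the new image and recreate the container from it
    docker pull registry.example.com/my-service:latest
    docker stop my-service && docker rm my-service
    docker run -d --name my-service registry.example.com/my-service:latest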
> That's pretty much the baseline when dealing with any software system
Exactly. And now instead of one system, he has 11.
> All it takes is inheriting the latest version of an image
He is not using "an image". From the article: "After the Alpine Linux 3.9.1 release I noticed the official Docker images had not been updated so I built my own."
> I mean, this stuff is handled automatically by any CI/CD pipeline.
He has not described any CI/CD pipeline involved in his infrastructure. Yet another aspect he has to build.
> you can also sh into a container and upgrade it yourself
> He was "building Docker images for each of the services". So not a single one. 10 of them.
10 services, 10 installers, 10 installations.
Where exactly do you see any problem or issue?
> even a single Docker file would not be as simple as a setup script. A setup script on the host OS would install some packages that the host OS will keep up to date. Using a Docker image instead puts the burden on you to keep it up to date.
That's simply wrong on many levels. Yes, a single Dockerfile is as simple as (if not simpler than) a setup script. A Dockerfile is a setup script.
And yes, you can update individual containers or even build updated images.
Again, you seem to be commenting on stuff you know nothing about.
> Yes, a single Dockerfile is as simple as (if not simpler than) a setup script. A Dockerfile is a setup script.
Sure, but:
a) you have 10 setup scripts rather than 1. This would make sense if you actually wanted to have different dependencies/OS setup/whatever for your 10 services. But if you've decided to standardise on a common baseline set of dependencies for the sake of consistency (which is a valid choice) then why repeat them 10 times over?
b) You have the extra intermediate artifacts of the images which just give you one more thing to get out of date, go wrong, or slow down your process. Rather than run script -> get updated things, it's run script -> generate images and then deploy those images. Sure, it's all automatable, but what's it gaining you for this use case?
If you have a single setup script to build, package and deploy all 10 services, and you can't build and/or deploy each service independently, then you have more important things to worry about than figuring how containers are used in the real world.
Actually, it is, because you're criticizing proper deployment strategies, which are not specific to containers, with a use case that has many technical red flags. You can't simply criticize deployment best practices by giving a blatant anti-pattern as an example. And do note that this has nothing to do with containers at all, because this applies equally well to VM and bare metal deployments.
To have a productive discussion you have to actually engage. If there's really a "blatant anti-pattern" then it shouldn't be so hard to explain what's wrong with it. Your replies so far have been no more substantial than "you're wrong".
You're giving the credit of automation to docker, which isn't where the credit belongs. It's pretty easy to get the same portability and testing without containers (this is what was happening long before docker was launched). Not to say that the OP shouldn't have done it, but I'm kind of tired of seeing the whole portability thing still being put up as if it's only viable with containerised solutions.
To be fair, docker itself introduced no radical new technologies, but it did introduce a lot of convenience. Containers had been available for a long time, but the convenience of a Dockerfile + docker hub made it accessible to people who aren't hard-core linux/bsd users.
What other solution for the easy portability do you know? Or how would you propose to handle this?
If it is easier than docker build && docker push and docker pull on the other side, I'm all ears!
The main benefit docker introduced was leading developers to at least consider "configuration injection" and "stateless installs" ("12 factor apps").
If upstream supplies a decent docker image, chances are the package is more amenable to scripting and running in a chroot/jail/container - and documents its dependencies somewhat.
That said, snapshotting the "state" of your container/jail can be nice. Recently I used the official ruby images, and could build a custom image to use with our self-hosted gitlab (with built-in docker registry) that a) got a standard Debian based ruby image and applied updates, and b) built freetds for connecting to mssql.
Now I can quickly update that base image as needed, while the CI jobs testing our rails projects "only" need a "bundle install" before running tests.
And the scripts are largely reusable across heterogeneous ruby versions (yes I hope we can get all projects up on a recent ruby..).
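The Dockerfile for that kind of custom base image stays short; something along these lines (untested sketch, Debian package names assumed):

    FROM ruby:2.6
    # apply current security updates on top of the standard Debian-based ruby image
    RUN apt-get update && apt-get -y upgrade
    # FreeTDS libraries/headers so the mssql gems can build, then trim the apt cache
    RUN apt-get install -y --no-install-recommends freetds-dev freetds-bin \
        && rm -rf /var/lib/apt/lists/*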
> What other solution for the easy portability do you know? Or how would you propose to handle this?
Puppet for the server-automation part. Languages that make it easy to produce a "fat binary" for the isolation part.
Docker solves a real problem for languages where deployment is a mess, like Python. It just grates on me when the same people who spent the last 10 years mocking Java (which does essentially the useful parts of docker) are suddenly enthusiastic about the same kind of solution to the same kind of problem now that it has a trendy name.
> portability thing still being put up as if it's only viable with containerised solutions.
You're arguing a point never made. That containers make things portable is not saying that's the ONLY thing that makes things portable.
I find containers make portability a lot easier when I have multiple apps that bizarrely require different versions of, say, python, or different python libs on the same version of python.
You get portability by using any provisioning system: Ansible, Puppet, Chef.
Although, it's not exactly the same thing, because with Docker you have everything already installed in the image. I've only used Ansible and I was never happy with its dynamic nature.
You don't get portability from chef and others. You get a framework where you can implement your deployment with a case for each system you want to target. Past some toy examples, it's on you to make it "portable".
This is 100% correct. If you go look at the Chef cookbooks for any popular piece of software, say Apache or MySQL, the code is littered with conditional logic and attributes for different Linux distributions (not even considering entirely different operating systems). Every distro has different packages as dependencies, install locations, configuration file locations, service management, etc.
Docker (all container solutions really) aren't a panacea, but they solve a very real problem.
I was referring to the language you write playbooks in (YAML). There are no static checks, other than a dry-run that only tests for syntax errors. Frankly, I haven't heard of any provisioning system written in a compiled language. I wonder why.
NixOS kind of fits the bill (it can generate complete OS images from a recipe which is IIRC statically typed and "compiled")
If it looks waaaay different to puppet, ansible and chef, there's a reason for that :) Doing provisioning "properly" means managing every file on the drive...
I know about nix, but I'm referring to the language you use to describe the final image. Actually, this part is not a problem. The problem comes in the deployment part.
For example, there's no concept of `Maybe this request failed, you should handle it`. So when you run the deployment script, the request fails and so does the rest of your deployment process.
Defining the possibility of failure with a type system would force you to handle it in your deployment code and provide a backup solution.
Concerns about the server going down or changing cloud provider are imo not a particularly interesting or even useful advantage to mention for personal infrastructure. Considering that we're likely to change our personal infrastructure less than once a year, and I've never seen a case where an unmaintained docker setup still runs 6 months later, I'm not sure the value of portability is that high.
> Concerns about the server going down or changing cloud provider are imo not a particularly interesting or even useful advantage to mention for personal infrastructure.
Why? Personal projects aren't more stable or bound to a single provider. If anything, personal projects may benefit more from a deployment strategy that makes it quite trivial to move everything around and automatically restart a service in a way that automatically takes dependencies into account.
> Considering that we're likely to change our personal infrastructure less than once a year
In my experience, personal projects tend to be more susceptible to infrastructure changes as they are used to experiment with stuff.
> and I've never seen a case where an unmaintained docker setup still runs 6 months later,
The relevant point is that the system is easier to maintain when things go wrong and no one is looking or able to react at a moment's notice. It doesn't matter if you decide to shut down a service 3 or 4 months after you launch it, because that's not the use case.
> I'm not sure the value of portability is that high.
That assertion is only valid if you compare Docker and docker-compose with an alternative, which you didn't. When compared with manual deployment there is absolutely no question that Docker is by far a better deployment solution, even if we don't take into account the orchestration functionalities.
> Concerns about the server going down or changing cloud provider are imo not a particularly interesting or even useful advantage to mention for personal infrastructure.
I look at this from a different perspective: I have plenty of actual things to do, personal infra should be the least of my concerns, and I should be able to get it up and running in the least amount of time.
> I've never seen a case where an unmaintained docker setup still runs 6 months later
It really depends on the well-being of the host and the containerized application. I have plenty of containers running for more than a year without a single hiccup.
I've been upgrading my OVH dedicated server once a year. So far it has been possible to get a slightly better server for the same price from their Black Friday sale. Thanks to a simple bootstrap shell script, docker and docker-compose, I'm able to migrate my ten random services and two KVM VMs in two hours (copying the /data directory takes most of the time, obviously).
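On the new box the move boils down to a handful of commands (a sketch; bootstrap.sh here stands in for whatever the bootstrap script actually does, and host/path names are illustrative):

    ./bootstrap.sh                            # install docker + docker-compose, base setup
    rsync -a olduser@oldhost:/data/ /data/    # copy the state; this is the slow part
    docker-compose up -d                      # recreate all services from the compose file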
> Concerns about the server going down or changing cloud provider are imo not a particularly interesting or even useful advantage to mention for personal infrastructure.
I’ll let you know that my kids personally disagree with you on this one if Plex on the TV or iPad suddenly doesn’t work.
Being able to easily migrate apps is super nice when changing hardware/servers.
But he had to migrate away from FreeBSD to use Docker, so that doesn't sound like a portability advantage at all, in the usual sense of the word "portability". "Improved security" is also a red herring. You can't make something more secure by adding an additional layer of abstraction. He even mentions that he found it problematic that inside the Docker container, everything runs as root by default. Plus the docker daemon must needlessly run as root on the host.
> he found it problematic that inside the Docker container, everything runs as root by default
That's technically right, but it doesn't mean what you'd expect. Docker runs root in a few restricted namespaces and behind seccomp by default. The syscall exploits people are worried about are often simply not available. Even then, it's easy to take it another step and create new users. You could even have root mapped to a different user on the outside if you prefer that.
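Roughly, the two options look like this (a sketch; the user/group names are arbitrary and the remap details are worth checking against the docs):

    # in the Dockerfile: create and switch to an unprivileged user (Alpine syntax)
    RUN addgroup -S app && adduser -S -G app app
    USER app

    # on the host: map container root to an unprivileged uid range by adding
    #   { "userns-remap": "default" }
    # to /etc/docker/daemon.json and restarting the daemon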
Last I checked, Docker containers are not hard to break out of unless you go to extra lengths (SELinux, AppArmor, etc--not entirely sure; I'm not an expert). Most people use Docker as a way to avoid their programs accessing the wrong version of a shared dependency or similar. I believe there may be other container runtimes with stronger guarantees or better defaults, some of which are even suitable for running untrusted code (or so they advertise).
The advantage is not that huge if you compare to the author's previous setup, which was based on Ansible.
Unless you like to run a different OS per virtual machine, moving your websites to a new machine (or adding a new machine to the bunch) is as easy as with Docker, and you can test your setup locally too (though you will need a VM running on your computer).
The biggest advantage of Docker in my opinion is that it makes it much easier to make conflicting software coexist on the same machine (for example two websites requiring two different versions of node.js or php). Also it is nice that you can build the image once and then deploy it many times; Ansible's equivalent would be rebuilding the image every time you want to deploy it.
Also I find it a bit easier to accomplish what I want with a Dockerfile than with an Ansible script and if you make some mistakes it is easier to rebuild an image than "cleanup" a VM instance.
So, Docker smooths many edges compared to Ansible, but I wouldn't consider that a _huge_ advantage, especially in the context of a personal infrastructure.
Another advantage is that you can run more up to date packages than your distro would allow, or different versions of the same one.
The downside is that you should rebuild the containers daily to be sure you have all the security patches. Not as convenient as apt-get. Maybe it's more cost effective to run another VPS or two.
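If you do go the daily-rebuild route, it can at least be a dumb cron job, something like this (paths and file names invented for the example):

    #!/bin/sh
    # /etc/cron.daily/rebuild-containers: re-pull base images, rebuild, recreate what changed
    cd /srv/stack || exit 1
    docker-compose build --pull
    docker-compose up -d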
Yeah, but you can also use Ansible or a comparable tool. Moving with that is equally easy. Also without the ever-rising storage usage that results from configuration tweaking, which can be especially problematic if you deploy heavyweight Java servers.
Sure, the containers he can just set up again, but what about all the DATA?
Where is all the data for his mattermost instances being held? You still have to back that up somehow/somewhere and feed it into your "new containers" that are on another server "instantly"
It's not over-engineering; it's quite a lot easier to get Docker up and running and run things in it rather than dealing with init systems, package managers (and dependency conflicts), ansible, etc for every app. You get a sane, consistent, and standard interface to logging, networking, data management (volumes), declarative infrastructure, packaging, etc and Docker swarm makes it relatively simple to scale up to multiple nodes.
I would disagree with this even if I had a really nice cloud operator with great interfaces and utilities for logging, networking, monitoring, image management, volume management, secrets management, declarative infrastructure, etc, and could afford to run that many VMs... I'd still probably be running lots of cruft (SSH servers, process managers, etc) in each VM (or I'd have to go through the trouble of opting out), and I'd still need to get logs out of the VM and into the logging service, which usually implies faffing with ansible and friends. Nooooo thank you.
Also, `docker commit` is pretty easy, and you can also just back up individual volumes.
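Backing up a named volume is a one-liner of the usual tar-through-a-throwaway-container kind (the volume name here is just an example):

    # archive the contents of the "mattermost-data" volume into the current directory
    docker run --rm -v mattermost-data:/data -v "$(pwd)":/backup alpine \
        tar czf /backup/mattermost-data.tar.gz -C /data .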
I disagree -- with traditional VMs, you have to deal with multiple mutable systems. In the Docker/OCI container world, containers are immutable, so you can manage all your changes atomically, from a single source of truth (your Dockerfile collection).
In my view, LXD/LXC splits the difference pretty nicely between VMs and Docker.
Portability with LXD is even cleaner, as all the data is in the LXC container. It's not immutable, though, and the initial setup is a little more involved: you have to set up services on each container yourself (no dockerfiles), and you need to figure out ingress on the host, often less declaratively -- normally routing 80/443 via iptables to an nginx or haproxy container, which then reverse proxies to the relevant container per domain-based ACLs, etc.
But I still prefer it to Docker. I don't really mind setting up and configuring the needed services the first time on each container... For me that's a good way to get familiar with unfamiliar packages and services/applications that I might want to try or play with, rather than just finding a dockerfile for Application X or Y and not actually learning that much about the service or application I am deploying. Speaking for myself only -- obviously there are gurus who already know common services and applications inside and out and can configure them blindfolded, so Dockerfile-everything would make sense for them.
Fully agree and pretty much exactly my setup. A haproxy container which directs traffic (not only websites, but also syncthing, caldav/carddav etc.) and renews all Let's Encrypt certificates.
It's fun, easy to backup, easy to migrate, easy to just test something and cleanly throw it away. And in practice the containers are pretty much like VMs (talking about personal projects here, corporate is more complicated of course).
And the upfront work is not that much. Do the quick start guide and one or two things. Maybe you don't even need to configure iptables manually, "lxc config device add haproxy myport80 proxy listen=tcp:0.0.0.0:80 connect=tcp:localhost:80" does a lot for you.
Just like how most programs are available as packages in your favourite distribution's package manager, most programs are available as Docker images on Docker hub, which is as gratis as packages built for distros. And for the ones that don't have an image on Docker hub, you can include the Dockerfile to build it on the spot in the production environment.
The real benefit comes from having the _ability_ to use a different base distribution if required. While alpine is suitable for most applications, sometimes you might be required to run a different distro; perhaps you're packaging a proprietary application compiled for Glibc.
Also, networking isolation is straightforward in Docker. Let's say there's a serious security bug in Postgres that allows you to log in without any password. If someone could perform an RCE on a public-facing website, this would be a catastrophe as they'd be able to exploit the database server as well. With Docker, you can easily put each service in its own network and only give access to the database network to those services that need it. Or you can be even more paranoid and have a separate database and a separate network for each service.
One more feature I'm a fan of is the ability to run multiple versions of a piece of software. Maybe some applications require Postgres 9.x and some require Postgres 10.x. No problem, run both in separate containers. You can't do that with regular distributions and package managers (at least not in any I know of).
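Both of those last points are only a few commands (image and network names below are made up):

    # one isolated network per stack
    docker network create blog_net
    docker network create wiki_net
    # two different Postgres major versions, each reachable only from its own network
    docker run -d --name blog-db --network blog_net -e POSTGRES_PASSWORD=changeme postgres:9.6
    docker run -d --name wiki-db --network wiki_net -e POSTGRES_PASSWORD=changeme postgres:10
    # each web container joins only the network of the database it needs
    docker run -d --name blog --network blog_net my-blog-image
    docker run -d --name wiki --network wiki_net my-wiki-image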
Nix is... tough. I really want to like it, but every time I pick it up I end up having to write some custom package definition for some obscure transitive C dependency with its own snowflake build system. Couple that with poor documentation for existing packages, the terrible search engine experience ("Nix" and "Nix packages" usually turn up things about Unix or "Congress Nixes Aid Package"), and a thousand other papercuts and it just seems to create more problems than it solves. I dearly hope this changes.
Guix is like Nix, but I don't think Nix has an equivalent command to bundle a bunch of packages into a self-contained file for running on arbitrary distros (without Nix/Guix installed).
That's a bit like saying "ORMs are too complex" before proceeding to eventually build your own crappy ORM. I think by accepting docker you gain a lot of support from a developer community which you'd otherwise not see if you assembled your own.
Before proceeding to finally learn modern SQL and write a couple of stored procedures, to stop pumping terabytes of almost-raw data to your "application server" code to be filtered by an ORM that is too complex to understand, yet at the same time too stupid to use half of the features your DBMS provides.
You're complaining about a scenario where you happened to pick an ORM that didn't meet your personal performance requirements. Even if you ignore the fact that you are free to pick any ORM you choose, and even if you assume that you have all the time in the world to refactor your project to an experimental setup that was never tested at all, your comment still sounds like a revamp of the old argument that hand-rolled assembly is always superior to any compiled or interpreted code.
Anyone building out a setup based on cgroups and namespaces will eventually arrive at a poorly specified, bug ridden mini Docker. Might as well get with the program early.
Anyone calling bc from a bash script will eventually arrive at a poorly specified, bug ridden mini Mathematica?
Wrt "bug ridden"
> if I were having trouble with something Docker-related I would honestly feel like there was a 50/50 chance between it being my fault or a Docker bug/limitation
Sounds like you don't understand Docker or containerisation in general
> Still have to decide on a single OS to reduce maintenance problems.
No you don't, you can run various distros in docker containers. We're using a mix of Debian (to run legacy services developed to run on old Debian LAMP servers) and Alpine (for our sexy new microservices) at my current job.
> Could just have installed all the services (which are all available as packages) and handled the configuration files
Then you would have a system dependent on the volatile state you configured by hand, meaning the system configuration is not declarative or reproducible.
> No you don't, you can run various distros in docker containers. We're using a mix...
I think you are missing the point. For each distro base you introduce via docker you must track and update the security releases. Standardizing on one base distro absolutely reduces the ongoing maintenance work.
> Then you would have a system dependent on the volatile state you configured by hand, meaning the system configuration is not declarative or reproducible.
If you're using Ansible as the author already was, you essentially have this already. I can do a one command deploy to cloud servers, dedicated servers and colo boxes with just Ansible. Docker gets you a slightly more guaranteed environment and an additional layer of abstraction (which has its own set of pros and cons), but that's about it.
Slightly more guaranteed is something of an understatement; if you specify base images with specific hashes and pin package versions, you can get quite close to reproducible builds of the environment.
In support of your argument: look, for example, at the Dockerfile for the official Golang container. They pin exact sha256 hashes for each architecture, and the source release in case you're on an architecture without binary releases.
Pin specific versions of your packages, couple that with caching, and you're sitting pretty.
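In a Dockerfile that pattern looks roughly like this (the digest is a placeholder; substitute whatever you have actually verified):

    # pin the base image to an exact content digest instead of a floating tag
    FROM golang:1.12@sha256:<digest-you-verified>
    # distro packages can be pinned too, e.g. apt-get install -y somepkg=<exact-version>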
Yes, but you still can't guarantee anything about the host the docker container has to run on, so you're still impacted by host configuration and therefore still need to provide a good base environment for running your containers. This is fairly simple with most providers, but in such a case, using Ansible or similar to deploy directly to the host has similar results.
The problem that docker solves for me is reinstalling.
Some software, like an RDBMS, is heavy to set up. You can do it once and it's not that hard, but then the day comes to reinstall, upgrade, or move to another version:
- You need to remember how to do that
- If you upgrade the OS, the steps have changed
- If the install (of the OS) goes kaput, you can lose the work of setting everything up so far -- the steps (i.e. I have a few times managed to ruin an ubuntu install; instead of wasting time fixing it, I spin up another server and redo)
- If the install (of the app) goes kaput, you are left with an inconsistent state (I managed to break a PG database in an upgrade because I was looking at the wrong tutorial; instead of wasting time fixing it, I spun up another docker image and redid it from backup)
- I try to upgrade everything fast. A new ubuntu version? Upgrade. A new major PG version? Upgrade. Not always in production, but I don't like to wake up 2 years later and face a BIG upgrade cycle; I prefer to spread the pain over the year. If something is going to break, I want to see it coming. So I reinstall a lot.
And it works across OSes. So all of the problems above applied to my dev machine too!
P.S.: I want something like docker; I don't care about orchestration (so far), only something that lets me package apps/dependencies, works on my small servers, and allows (re)building as fast as possible. What other option could work? (P.S.: My main deps are postgres, nginx, redis, python3, .net core, rust (coming). I'd like to set up android/ios toolchains too.)
> s3 costs of docker image with nothing but the base os
An alpine base image is in the mid double-digit MB range and will be reused by all other images built on it. The space costs are trivial. Furthermore, the base image is hosted on Dockerhub, and last I checked you can host up to 3 images privately on Dockerhub and unlimited images in public.
Clearly you don't understand the advantage of containers. You can swap them without causing problems to other services, in contrast with traditional installation on a single OS.
It is a surprise that, since he was already a FreeBSD user, he did not use its facilities, for example jails (perhaps with a program to manage them) or bhyve for VMs.
Which features are Jails missing? You have to orchestrate Docker just as you have to orchestrate Jails; there is a reason why Kubernetes is on the rise.
He's taken the first level step of containerization. What he gets is a deterministic way to build his environment.
You could just install the services and handle the configuration files, but then you suffer from replicability and deterministic build problems:
- The configuration files are all over the place, and while you can remember most of them, you can't be 100% sure that you got everything in /var/lib/blahblah/.config/something.conf, /etc/defaults/blahblah, /etc/someotherthingd/blahblah.conf, etc.
- Rebuilding the server becomes a problem because you don't remember a year and a half later everything you did to set the damn thing up. Even worse, you've been tweaking your config over time, installing other packages that you can't remember, putting in custom scripts for various things...
- Recovering from a catastrophic failure is even worse, because now you don't even have the old configuration to puzzle through.
- If you set up another failover system, you can't be sure it's configured 100% the same as the old one. And even if you did get them 100% the same, they WILL drift over time as you forget to migrate changes from one to the other.
You can mitigate this by using ansible or chef or docker or the like.
The other handy thing with container tech is that you can tear down and rebuild without having to wipe your hard drive and spend 30 minutes reinstalling the OS from the iso. Iterating your builds until you have the perfect setup becomes a lot more appealing when your turnaround time is 2-5 minutes.
Damage control becomes a lot easier. A bug in one program doesn't damage something in another container.
After this level, you can move up to the orchestration layer, where each container is just a component, and you tell your orchestration layer to "launch two of these with failover and the following address, link them to instances of postgresql, do an nginx frontend with this cache and config, pointing to this NAS for data", and it just works.
It's a lot like programming languages. You COULD do it all in assembly, but C is nicer. You COULD do it all in C, but for most people, a gc functional language with lambdas and coroutines etc gives you the sweet spot between cognitive load and performance.
With a shared language for doing these powerful things, you now gain the ability to easily share with others (like docker hub), increasing everyone's power exponentially. Yes, you need to vet whose components you are using, but that's far less work than building the world yourself.
Yes, there's a learning curve, but the payoff over time from the resulting power of expression is huge.