Whether you are a student or a professional, if you work in web development you'll always have to manage multiple database-powered projects on your computer at the same time. This is when things get messy.
Think about installing an instance of MySQL or Postgres system-wide and then using it for all of your projects. You have to create databases and manage permissions for each project. You have to keep track of the tables if you share the default database among multiple hobby projects. You have to perform manual cleanups occasionally. And worse, you may find yourself struggling to log in as the root user and to reset its password (if you haven't gone through this so far, just search "mysql root password reset" and see how many complaints and wildly diverse suggestions show up).
Is there a solution? Will containers help?
Why not use SQLite?
Sure, there's SQLite, in which a database is just a file, with no concept of a server, users or permissions. That's why many use SQLite for local development and MySQL or Postgres in production. But this is practical only if you are using a framework with sufficient abstraction. Every RDBMS is incompatible with every other in one way or another; the differences range from syntax to the way transactions are carried out. So if you are planning on SQLite in production, use it in your dev environment. But if you're planning on Postgres in production, you should use the same in development too.
How containers can help
Okay, you want to experiment with multiple MySQL or Postgres projects but still don't want to have a difficult time managing them. This is where containerization comes to help.
The idea is to create a container whenever you need a new database for a new project. You can stop the container, or even delete it, when you are done with it. No installations, no administration. With database servers, it doesn't get much closer to the simplicity of SQLite.
Need more explanation for the no-installations-no-administration part? Docker has pre-built images for services like Postgres and MySQL. Once you've downloaded such an image, you can spawn as many instances as you want. Since each project gets its own DB server container, the work required to set up the database, roles and permissions is trivial.
NOTE
Please note that the only intention of this article is to serve as an introduction to Docker and Docker Compose while sharing a solution to some problems faced by database application developers. Docker is a great development tool. But there is a certain amount of risk in using it in production, especially in terms of security and performance. Use Docker in production only if there is a real need, and be extra careful when you do so.
Installing Docker
Installing Docker used to be a bit difficult, but not any more. All you have to do on Ubuntu 20.04 is run these commands:
sudo apt update
sudo apt install docker.io docker-compose
Don't forget to add yourself to the docker group:
sudo usermod -aG docker YOUR_USERNAME
You'll have to log out and log in again for the group change to apply, but that's all it takes. Now you can use this Docker setup to run as many containers and projects as you want. However, remember to monitor the containers and clean up the junk frequently, or your system resources will be wasted.
Let's start the container!
Now that you have Docker installed on your machine, you can run this command to start a Postgres instance:
docker run --name mypostgres1 -e POSTGRES_PASSWORD=mypgpass -d postgres
It'll pull the Docker image from the Internet automatically and start an instance of it. Pulling is a one-time process, which means subsequent runs will be faster.
In the above command, mypostgres1 is a name of your choice, which you can use in commands like docker stop mypostgres1. The flag -e sets an environment variable, and here we use it to set POSTGRES_PASSWORD to mypgpass, the password of your choice. -d means detached, which makes the container run in the background, leaving the shell free for other purposes. Finally, postgres at the end of the command is the image name.
To enter the Postgres shell of this instance, run this:
docker exec -it mypostgres1 psql -U postgres
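Inside that shell you can run ordinary SQL to confirm the server is alive. An illustrative session (the table name is arbitrary):

```sql
-- Create a throwaway table, add a row, and read it back.
CREATE TABLE greetings (id serial PRIMARY KEY, message text);
INSERT INTO greetings (message) VALUES ('hello from a container');
SELECT * FROM greetings;
```

Type \q to leave psql.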
But this setup is only useful to try out some SQL commands. To run a real project, we have to manage ports, persist data, etc. It's time to get introduced to Docker Compose.
Docker Compose
Docker Compose is an abstraction over Docker that makes life easier when you need more control over your containers: setting environment variables, persisting data, coordinating multiple services (e.g. a database server and a web server), and so on.
With Compose, you can define all of your services in a human-readable YAML file, and run the command docker-compose up with minimal arguments to get them started. It's still possible to manage containers started this way using the native Docker commands.
Postgres with Compose
Let's first create a working directory with the following structure:
- myproject/
  - initdb.d/
  - pgdatadir/
  - docker-compose.yml
  - pg.env
initdb.d is a folder to keep the SQL files that are to be executed automatically while initializing the database. pgdatadir is an empty folder in which we ask our Postgres instance to store all the data. Both are optional. If there is no initial SQL, the database will be empty. If there is no pgdatadir given, Docker will keep the data in an automatic volume.
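For illustration, a file like initdb.d/01-schema.sql (the filename is hypothetical; the image executes .sql, .sql.gz and .sh files from that directory in alphabetical order, and only when initializing an empty data directory) could contain:

```sql
-- Runs automatically the first time the database is initialized;
-- ignored on subsequent starts of the same data directory.
CREATE TABLE users (
    id serial PRIMARY KEY,
    name text NOT NULL
);
INSERT INTO users (name) VALUES ('alice');
```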
pg.env is where we plan to keep all the environment variables. We could put them inside the Compose file itself, but that's a bad idea (for example, someone may ask you to share your Compose file, and your passwords shouldn't travel with it).
Yes, all the filenames are of our own choice. We'll map them using the Compose file and only the targets matter.
Now let's define the contents of docker-compose.yml, the Compose file:
version: "3.3"
services:
  db:
    image: postgres
    container_name: mypostgres2
    volumes:
      - ./initdb.d:/docker-entrypoint-initdb.d/
      - ./pgdatadir:/var/lib/postgresql/data
    env_file:
      - pg.env
    ports:
      - "9000:5432"
It should already make sense to you. We are defining just one service (db, again a name of our choice) here. The lines under volumes: map the local directories to the container's filesystem. Similarly, there's a mapping going on under ports:. Here we are mapping the container's internal port 5432 (the default Postgres port) to the host machine's port 9000 (once again, our choice). This mapping is not needed if the database is only accessed from the services defined in the same Compose file. But with this mapping, the service is exposed as part of the host itself, making it available to any application running on the host, the LAN, or even the Internet, as long as the routers and firewalls permit. Now you know one security pitfall. I recommend doing some research on exposing ports and container IPs (also, try the command docker inspect CONTAINER_NAME).
Okay, let's see the contents of the environment file:
POSTGRES_PASSWORD=mypgpass
The env file syntax expects one variable per line. Everything after the equals sign is part of the value, which means no quotes or escaping.
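The same file can carry more than one variable. For example, the official Postgres image also recognizes POSTGRES_USER and POSTGRES_DB (both optional, defaulting to postgres; the values below are placeholders of our choice):

```
POSTGRES_PASSWORD=mypgpass
POSTGRES_USER=myuser
POSTGRES_DB=myprojectdb
```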
Finally, how do we start the service? Make sure you are inside the directory myproject and run this:
docker-compose up -d
You can still use the basic Docker commands like docker exec and docker logs. However, to stop the services, it's better to run:
docker-compose down
Connecting to the database
To interact with the database, you can use docker exec. But now that the port is open, you can also connect from a client outside Docker (say, psql installed system-wide using apt install), or from your application, just as you always do. Based on the above example, the host will be localhost and the port will be 9000. If your application is run by the same Compose file, you can use the container's name (here, mypostgres2) as the host.
MySQL
Things that change with MySQL include the environment variables, the path of the data directory and the internal port.
The useful variables are:
- MYSQL_ROOT_PASSWORD (required by the image)
- MYSQL_DATABASE
- MYSQL_USER
- MYSQL_PASSWORD
The data directory is /var/lib/mysql and the default MySQL port is 3306.
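Putting those differences together, a Compose file for MySQL could look like this (a sketch mirroring the Postgres example; the names mymysql1, mysqldatadir, mysql.env and host port 9001 are our choice, and mysql.env must set at least MYSQL_ROOT_PASSWORD):

```yaml
version: "3.3"
services:
  db:
    image: mysql
    container_name: mymysql1
    volumes:
      - ./initdb.d:/docker-entrypoint-initdb.d/
      - ./mysqldatadir:/var/lib/mysql
    env_file:
      - mysql.env
    ports:
      - "9001:3306"
```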
Monitoring and clean up
You can use the command docker ps to list the running containers. docker stop CONTAINER_NAME is how you stop a container, but it's recommended to stop containers using docker-compose down if you started them using docker-compose up. Run docker system prune to clean up unused files and cache. Docker volumes can also consume disk space; run docker volume to learn how you can manage them.
If you want to stop the whole Docker service for some reason, run sudo systemctl stop docker. You can also use systemctl to restart Docker.
Reference
- https://hub.docker.com/_/postgres/
- https://hub.docker.com/_/mysql/
- https://docs.docker.com/compose/