One of the aims of Overpass API is to make installation as easy as possible. In fact, it has been simplified to just the three standard commands:
./configure
make
make install
You also need the prerequisites (a compiler, wget, and expat), but you are highly unlikely to encounter a working Linux installation without them. Even if they are not installed, they are just one command away:
sudo apt-get install wget g++ make expat libexpat1-dev zlib1g-dev
An installation only makes sense if you execute the installed software afterwards. This is also designed to be easy:
Each of these steps is again a simple command. There is no need to build wrapper scripts or edit configuration files.
Things get complicated once one tries to actively involve systemd. Essentially this is because systemd is not designed to run a DBMS. I will explain details below.
Overpass API shall and can run on a wide range of systems:
The only way to support such a wide range of systems from a small project is to find a common base for all the operating systems. That base is to compile the software on the target system and to rely solely on POSIX standard features.
The system is carefully designed to never need root permissions. Unlike on Windows, user permissions have always been a vital layer of security in the POSIX model. For this reason, most software is carefully designed to require no root permissions. Examples are web servers like Apache and database systems like PostgreSQL. Each of them has a dedicated user with as few permissions as possible.
As an extra bonus of the no-root policy, you can install an Overpass instance on a system where you do not have any special permissions at all. Such shared responsibilities have, for example, been in place on the Rambler instance.
There are only a few permanently running components, in particular the dispatcher, fetch_osc, and apply_osc_to_db. They are designed to be controlled from the command line. Do not forget: a useful server is permanently up. The public instance is rebooted once or twice per year or so, and rebooting almost always happens in the context of serious trouble. A manual check is unavoidable in such a situation anyway.
The second most probable scenario is running the server on a laptop. Tests with suspend-to-RAM and suspend-to-disk ran smoothly. Such a system almost never reboots.
If you really need to reboot regularly on systems that run permanent services, then you should use the crontab. Read the help that you get from man crontab on the command line. Then edit the file bin/reboot.sh according to the instructions in the file. Then run crontab -e and add:
@reboot nohup /path/to/overpass/bin/reboot.sh
The crontab is a feature to run a command at a defined point in time. This includes the option to run it each time the system comes back from a reboot. In my tests, the crontab turned out to be widely available, even though it is not part of the POSIX standard.
Besides security we also want safety. This means that the system should come to a controlled stop if something is wrong. This protects the database from becoming corrupted.
This safety is ensured via lock files. Both the update process and the dispatcher write certain files, check for their presence or absence, and check whether the files belong to the process that tries to obtain them. If a process errors out instead of deleting or rewriting them, it is because it has most likely run into a severe error.
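To illustrate the general idea, here is a minimal sketch of a PID-based lock file in POSIX shell. It is not the code Overpass API uses: the path in LOCK_FILE and the function names acquire_lock and release_lock are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Hypothetical sketch of PID-based lock files; NOT Overpass API's actual code.
LOCK_FILE="${LOCK_FILE:-/tmp/example_db.lock}"   # hypothetical path

acquire_lock() {
    # With noclobber (set -C) the redirect fails if the file already exists,
    # so checking for the lock and creating it happen in one step.
    if ( set -C; echo "$$" > "$LOCK_FILE" ) 2>/dev/null; then
        return 0
    fi
    # Someone else holds the lock: error out and leave the file untouched.
    echo "lock $LOCK_FILE is held by PID $(cat "$LOCK_FILE" 2>/dev/null); giving up" >&2
    return 1
}

release_lock() {
    # Only the process whose PID is recorded in the file may remove it.
    if [ "$(cat "$LOCK_FILE" 2>/dev/null)" = "$$" ]; then
        rm -f "$LOCK_FILE"
    fi
}
```

The point of the sketch is the failure path: a process that finds a lock it does not own stops and reports the problem, and nothing removes a lock on behalf of another process.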
The problems reported in the context of systemd invocation were all either caused by or severely worsened by external deletion of lock files. External means triggered by the systemd config file. By the way, the example file for systemd's predecessor, called Upstart, had the same problem.
We have just seen one thing to avoid with systemd. How should systemd be invoked properly instead? Answering that requires some background information about systemd in addition to the background information about Overpass API.
systemd has been designed to speed up the boot process of desktop computers. As explained above, booting is of little relevance on any system that runs a service waiting to answer requests. Having to reboot such a system is an anti-pattern that indicates a faulty design.
In particular, systemd makes some default assumptions that are lethal to DBMSs:
One feature that DBMSs use is the so-called socket. A socket is a point in the file system that you can connect to. One can compare it with browsing the web: from the URL, your web browser figures out a place to send data to (the request) and then gets other data back (the response; in this metaphor, the web page). For sockets, instead of a URL you have a path in the file system that you send data to. There should be a service listening there and answering you. Hence, the socket should exist exactly as long as the process that listens on that socket is running.
systemd is by default configured to instead delete the socket when a user logs out. This is usually an unrelated event. A common workflow is to log in, start the daemon, and then log out. Of course, you do not want to bring down the daemon at that point. I will not judge whether there is a use case for this in the context of a desktop environment. But for a daemon this is both surprising and lethal.
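If you suspect that a socket has vanished while the daemon is still running, a quick check from the shell can tell. The following is only a sketch: the function name socket_state is hypothetical, and you have to supply the path of the socket you are interested in yourself.

```shell
#!/bin/sh
# Report what is found at a given path. The path is supplied by the caller.
socket_state() {
    if [ -S "$1" ]; then
        echo "socket"          # a socket exists at this path
    elif [ -e "$1" ]; then
        echo "not-a-socket"    # something exists, but it is not a socket
    else
        echo "missing"         # nothing exists at this path
    fi
}
```

Calling socket_state with the socket's path prints one of the three words. A result of "missing" while the listening process is still alive is the situation described above.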
There is another means of communication used by a DBMS, called shared memory. Again, systemd interferes with that. And Overpass API relies on both Unix Domain Sockets and shared memory.
Both systemd and the idea to start a DBMS by a systemd config file are at odds with the Unix philosophy:
Do one thing and do that well
There have been numerous complaints that systemd has feature creep:
But I am not even referring to that.
Organising the steps that are needed during the boot process to bring all hardware into proper operation is a serious task, and it is systemd's task. Starting the software payload is a completely different story. This is even more true if the software is meant to run permanently.
I would like to give you a metaphor: In most offices there is a network printer. The network printer is permanently running, but still connected via a plug to a power outlet. systemd's job is to do the electric wiring and paint the walls, and finally to make electricity available via power outlets that adhere to a norm (in our case: POSIX). It is a stupid idea to carve away the power outlet, cut off the printer's plug, and hard-wire the printer to the building's wiring.
This is what you do when you try to start Overpass API directly from systemd.
Deleting the lock files corresponds in this metaphor to bypassing the fuses both in the device and the building. This jeopardizes the building, the printer, and all users. Do not do it. It is the step from bad idea to gross negligence.
So what to do instead?
First, the suggestions for you as a user: The preferred way to run the service is to log in, issue
nohup /path/to/overpass/bin/dispatcher --osm-base --meta \
  --db-dir="/path/to/the/overpass/data/" &
and the subsequent commands from the installation guide. Then log out. Whether or not systemd is running in the background, the POSIX specification assures you that everything is going to run smoothly.
As an extra bonus, this works on any of the mentioned operating systems and without special permissions.
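The pattern behind this can be tried out safely before touching the real service. In the following sketch, sleep 30 is merely a stand-in for the daemon, and the variable name DAEMON_PID is chosen for illustration. nohup makes the command ignore the hangup signal, and kill -0 only tests whether the process exists without sending it a signal.

```shell
#!/bin/sh
# "sleep 30" is a harmless stand-in for the real daemon command.
nohup sleep 30 >/dev/null 2>&1 &
DAEMON_PID=$!

# kill -0 sends no signal; it only tests whether the process exists.
if kill -0 "$DAEMON_PID" 2>/dev/null; then
    echo "process $DAEMON_PID is running"
else
    echo "process $DAEMON_PID is not running" >&2
fi
```

If you note down the process ID before logging out, you can run the same kill -0 check after logging in again to confirm that the process is still there.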
If you do need a server that starts Overpass API in the course of the boot process, edit the file bin/reboot.sh as explained there and put it into the crontab:
@reboot nohup /path/to/overpass/bin/reboot.sh
This mode is relatively new. Therefore, you are encouraged to ask if you run into trouble. I'm happy to help you because I want to make this into a well-understood mode of operation.
Please do not try to run anything as root or to programmatically delete lock files.