Published: 2017-04-24, updated 2019-10-31
I have just finished reorganizing the examples page in the wiki. The reorganization may or may not help in navigating the page. My motivation is to understand what goals Overpass API users are after. I will revise the individual items later on.
As it turns out, one large subject is understanding and counting tagging combinations. This is a good opportunity to answer some recurring questions about how public transport tagging is actually used in the field.
The most loudly announced issue in the Public Transport v2 scheme is the transition from highway=bus_stop to public_transport=platform and public_transport=stop_position. The former is for bus stops that are modeled by their sign beside the road. The latter is for bus stops that are modeled by an estimation of the stop position of the vehicle. We will first take care of the platforms.
Along with public_transport=platform these stops should have been tagged bus=yes to distinguish them from platforms for other means of transport. So we already have three tags to care for: highway=bus_stop, public_transport=platform, and bus=yes.
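Three tags that are each either present or absent give 2^3 = 8 combinations. The following is a minimal sketch in Python, not one of the original queries, that enumerates them as Overpass tag filters. The function name `combination_filters` is made up for illustration, and writing `!=` for the absent case is my choice of how to express absence:

```python
from itertools import product

# The three tags discussed above, as (key, value) pairs.
TAGS = [
    ("highway", "bus_stop"),
    ("public_transport", "platform"),
    ("bus", "yes"),
]


def combination_filters():
    """Return one Overpass filter string per present/absent combination."""
    filters = []
    for presence in product([True, False], repeat=len(TAGS)):
        parts = []
        for (key, value), present in zip(TAGS, presence):
            # "=" requires the tag, "!=" excludes it.
            op = "=" if present else "!="
            parts.append(f"[{key}{op}{value}]")
        filters.append("".join(parts))
    return filters


for tag_filter in combination_filters():
    print("node(area)" + tag_filter + ";")
```

The last combination negates all three filters and so matches every node that carries none of the tags; it is listed only for completeness.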
A clean solution would be to check each of the eight possible combinations. However, this example should also teach you to notice when we deceive ourselves. I am doing so right now by making the unchecked assumptions that:
This means we have to master the following building blocks:
From these requirements I suggest:
    area[name="Antwerpen"];
    ( node(area)[highway=bus_stop];
      node(area)[public_transport=platform]; );
    node._[name]->.with_name;
    node._[public_transport=platform]->.pt;
    node.pt[bus=no]->.not_bus;
    node.pt[bus=yes]->.explicit_bus;
    make count all=count(nodes),
        with_name=with_name.count(nodes),
        pt=pt.count(nodes),
        not_bus=not_bus.count(nodes),
        explicit_bus=explicit_bus.count(nodes);
    out;
Let us map the lines back to the requirements:
For the friends of advanced presentation, a variant that really writes fractions in the usual percent notation:
    area[name="Antwerpen"];
    ( node(area)[highway=bus_stop];
      node(area)[public_transport=platform]; );
    node._[name]->.with_name;
    node._[public_transport=platform]->.pt;
    node.pt[bus=no]->.not_bus;
    node.pt[bus=yes]->.explicit_bus;
    make count all=count(nodes),
        with_name=with_name.count(nodes)/count(nodes)*100 + " %",
        pt=pt.count(nodes)/count(nodes)*100 + " %",
        not_bus=not_bus.count(nodes)/pt.count(nodes)*100 + " %",
        explicit_bus=explicit_bus.count(nodes)/pt.count(nodes)*100 + " %";
    out;
It looks like at least bus=yes has been carefully applied everywhere, and that the Antwerp tram and light rail are modeled in a somewhat different way.
Now to the nodes with public_transport=stop_position. It is time for another unchecked assumption: there are no stops where passengers can neither board nor alight (really? And houses always have entrances?). Hence we search for stop positions that have no nearby platforms.
This can be achieved in the following steps. First, being close to something is always reciprocal. Thus, it is always worth considering which side to search for first. I opt for searching for platforms first here. Second, platforms can be nodes, ways, or relations. Thus we need the usual construction of a union statement with multiple query statements.
We cut out of all the stop positions those stop positions that are close to a platform, i.e. at most 10 meters away from a platform:
    area[name="Milano"]->.a;
    ( node(area.a)[public_transport=platform];
      way(area.a)[public_transport=platform];
      rel(area.a)[public_transport=platform]; )->.platforms;
    ( node(area.a)[public_transport=stop_position];
      - node._(around.platforms:10)->.matched; )->.orphans;
    make count all=count(nodes),
        orphans=orphans.count(nodes);
    out;
This number may justify the hypothesis that there are still a lot of unmatched stop positions.
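To make the logic of the difference with the around filter more tangible, here is a minimal sketch of the same idea in plain Python. It rests on several assumptions: the coordinates are made up, platforms are treated as single points only (ways and relations are ignored), and the haversine formula is my stand-in for the distance computation, which is not necessarily how the Overpass API measures distance. The names `distance_m` and `orphans` are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, an approximation


def distance_m(a, b):
    """Great-circle (haversine) distance in meters between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))


def orphans(stop_positions, platforms, radius_m=10):
    """Keep only the stop positions that have no platform within radius_m meters."""
    return {
        name: pos
        for name, pos in stop_positions.items()
        if all(distance_m(pos, platform) > radius_m for platform in platforms)
    }


# Made-up coordinates, for illustration only.
platforms = [(45.46420, 9.19000)]
stop_positions = {
    "near": (45.46425, 9.19000),  # a few meters north of the platform
    "far": (45.46520, 9.19000),   # roughly a hundred meters north
}

print(sorted(orphans(stop_positions, platforms)))
```

Only the stop position named "far" survives the filter, which mirrors what the difference statement keeps in the query.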
It is time to cross-check whether our test cities are representative. For the sake of comfort the two queries can be merged into a single query. To get a concise overview it is best to take a sample of cities across Europe and to make a table out of the results:
    [out:csv("name", "all", "orphans", "with_name", "pt", "not_bus", "explicit_bus")];
    ( area[name="Hamburg"]["admin_level"=4];
      area[name~"^(München|Köln|Milano|Napoli|Birmingham|Manchester|Barcelona|Antwerpen)$"]["admin_level"=6];
      area[name~"^(Lille|Lyon|Marseille)$"]["admin_level"=7];
      area[name="Rotterdam"]["admin_level"=8]; )->.areas;
    foreach.areas->.a(
      ( node(area.a)[public_transport=platform];
        way(area.a)[public_transport=platform];
        rel(area.a)[public_transport=platform]; )->.platforms;
      ( node(area.a)[public_transport=stop_position];
        - node._(around.platforms:10); )->.orphans;
      ( node.platforms;
        node(area.a)[highway=bus_stop]; )->.stops;
      node.stops[name]->.with_name;
      node.stops[public_transport=platform]->.pt;
      node.pt[bus=no]->.not_bus;
      node.pt[bus=yes]->.explicit_bus;
      make count name=a.set(t["name"]),
          all=stops.count(nodes),
          orphans=orphans.count(nodes),
          with_name=with_name.count(nodes),
          pt=pt.count(nodes),
          not_bus=not_bus.count(nodes),
          explicit_bus=explicit_bus.count(nodes);
      out;
    );
On 24 Apr 2017 I got these results:
name | all | orphans | with_name | pt | not_bus | explicit_bus |
---|---|---|---|---|---|---|
Manchester | 1 | 0 | 0 | 0 | 0 | 0 |
Birmingham | 3991 | 18 | 3933 | 142 | 0 | 9 |
Barcelona | 3397 | 766 | 2685 | 574 | 0 | 360 |
Marseille | 2330 | 71 | 2297 | 712 | 1 | 7 |
Napoli | 1710 | 344 | 1235 | 70 | 0 | 65 |
Lyon | 3748 | 339 | 3712 | 3570 | 0 | 3522 |
Milano | 4529 | 535 | 4007 | 841 | 0 | 687 |
München | 2215 | 71 | 2206 | 128 | 0 | 122 |
Lille | 2873 | 149 | 2842 | 2711 | 0 | 450 |
Antwerpen | 7554 | 111 | 7519 | 7521 | 0 | 7300 |
Rotterdam | 737 | 481 | 733 | 137 | 0 | 94 |
Köln | 1659 | 307 | 1648 | 528 | 0 | 135 |
Hamburg | 3766 | 410 | 3717 | 1150 | 1 | 1110 |
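The raw counts are easier to compare as shares. The following is a minimal sketch in Python that applies the same arithmetic as the percent variant of the Antwerpen query to the rows of the table above. The helper name `percent` is made up for illustration. Guarding against an empty denominator is my addition, needed because Manchester has no platforms at all.

```python
# Rows copied from the results table of 24 Apr 2017:
# (name, all, orphans, with_name, pt, not_bus, explicit_bus)
ROWS = [
    ("Manchester", 1, 0, 0, 0, 0, 0),
    ("Birmingham", 3991, 18, 3933, 142, 0, 9),
    ("Barcelona", 3397, 766, 2685, 574, 0, 360),
    ("Marseille", 2330, 71, 2297, 712, 1, 7),
    ("Napoli", 1710, 344, 1235, 70, 0, 65),
    ("Lyon", 3748, 339, 3712, 3570, 0, 3522),
    ("Milano", 4529, 535, 4007, 841, 0, 687),
    ("München", 2215, 71, 2206, 128, 0, 122),
    ("Lille", 2873, 149, 2842, 2711, 0, 450),
    ("Antwerpen", 7554, 111, 7519, 7521, 0, 7300),
    ("Rotterdam", 737, 481, 733, 137, 0, 94),
    ("Köln", 1659, 307, 1648, 528, 0, 135),
    ("Hamburg", 3766, 410, 3717, 1150, 1, 1110),
]


def percent(part, whole):
    """Return part as a percentage of whole, or None if whole is zero."""
    return None if whole == 0 else 100 * part / whole


def fmt(value):
    """Format a percentage with one decimal place, or 'n/a' for None."""
    return "n/a" if value is None else f"{value:.1f} %"


for name, total, _orphans, with_name, pt, _not_bus, explicit_bus in ROWS:
    print(
        f"{name}: "
        f"with_name={fmt(percent(with_name, total))}, "
        f"pt={fmt(percent(pt, total))}, "
        f"explicit_bus of pt={fmt(percent(explicit_bus, pt))}"
    )
```

The orphans column is left out on purpose: it counts stop positions, while the all column counts stops, so the table does not contain a matching denominator for it.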
Hence, we can state that