This will be of particular interest if you want to provision your own server which produces map tiles in the Welsh language (or perhaps a different language of your preference). All code is licensed under the GPL, allowing you to run it for any purpose, modify it, and redistribute it.
Please note that a basic knowledge of how to use the Linux command line is assumed.
The main tool is a Lua script called cartonamecy2name.lua which is run when importing data. It determines the name to be stored in the database, for any given entity on the map. OSM data is the main source for Mapio Cymru, particularly the name:cy and name tags, and also Wikidata name information via OSM’s wikidata tag. In editing these sources we also refer to open data from the Welsh Language Commissioner.
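As a rough illustration of that kind of name selection, here is a minimal sketch in Python rather than Lua. It is not the script from the repository: the function name choose_welsh_name, the wikidata_labels lookup, the looks_welsh check and the order of preference are all assumptions made for the example.

```python
from typing import Callable, Mapping, Optional


def choose_welsh_name(
    tags: Mapping[str, str],
    wikidata_labels: Mapping[str, str],
    looks_welsh: Callable[[str], bool],
) -> Optional[str]:
    """Pick a Welsh name for one map feature, or None if nothing suitable is found.

    tags            -- the feature's OSM tags, e.g. {"name": "...", "name:cy": "..."}
    wikidata_labels -- hypothetical lookup from a Wikidata ID to its Welsh label
    looks_welsh     -- hypothetical check supplied by the caller
    """
    # 1. An explicit Welsh name in OSM.
    name_cy = tags.get("name:cy")
    if name_cy:
        return name_cy

    # 2. A Welsh label from Wikidata, reached through the OSM wikidata tag.
    qid = tags.get("wikidata")
    if qid and wikidata_labels.get(qid):
        return wikidata_labels[qid]

    # 3. The plain name tag, only if it passes the caller's check.
    name = tags.get("name")
    if name and looks_welsh(name):
        return name

    return None


if __name__ == "__main__":
    # "Q000" is a placeholder ID for the example, not a real Wikidata item.
    labels = {"Q000": "Caerdydd"}
    feature = {"name": "Cardiff", "wikidata": "Q000"}
    print(choose_welsh_name(feature, labels, lambda text: False))
```

Returning None rather than falling back to the English name keeps the gaps visible, so that missing Welsh names can be spotted and filled in at the source.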
Many more details are given in the README file in the repository, including step-by-step instructions for use.
If you just want to use a Welsh-language map which already exists, ignore the above and head to openstreetmap.cymru!
It was a pleasure to take part in the FOSS4G conference today in Cardiff.
Here’s the recording of my talk about provisioning the Mapio Cymru server along with the slides.
This presentation will be of interest to people who want to advance mapping in lesser-resourced languages around the world. Then again, anybody with an interest in the overall growth of OpenStreetMap would do well to pay attention to the multilingual aspects of the project – in my view.
Thanks to the organisers and all participants for an insightful day at FOSS4G!
If you’re interested in the server software and data aspects of Mapio Cymru and OpenStreetMap, come to my talk at FOSS4G.
FOSS4G is the international Free and Open Source Software Conference for Geospatial. This year it is happening on Thursday 17th November simultaneously over several physical sites around Wales, Scotland, and England – including Cardiff where I will be.
As Mapio Cymru we’ve shared our learnings and enthusiasm at several events and organised a few of our own events, variously through the medium of Welsh and English. This time my talk will be through the medium of English. I am going to focus on the more technical side – what happens on the servers, plus some context about the Welsh language situation for those who might need it.
I’ll be discussing how we’ve built a map of Wales in Welsh, what technical and other challenges we’ve overcome, and what the next steps will be, and sharing our overall vision for bringing mapping in Welsh to all.
Hopefully somebody out there will be inspired to contribute to mapping in the Welsh language or OpenStreetMap as a wider project, or start their own initiative – perhaps for another lesser-resourced language.
At the time of writing this I’ve just heard that the allocation of tickets for the Cardiff branch of the event is now running low. If you’re unable to get a ticket you can still watch the recording afterwards.
Look closely and you’ll see that all the place names are subtly different.
Why do this? The main purpose is to spot gaps in the data for names in Welsh. There are a few means by which a name can find its way to the main map. The map takes data from OpenStreetMap and Wikidata, and then processes it. We at the Mapio Cymru project wanted to convey the data source of each name on a map, but separately from the main map.
At the moment there are four potential sources noted in the experimental map’s key:
From the name:cy field (OpenStreetMap)
From the name field (OpenStreetMap) – while not labelled as being in Welsh the name looks as if it could be in Welsh, according to certain criteria. I need to blog about these criteria soon.
No suitable name found (at the moment)
Please note that this key could change in future. Please refer to the map and its own key for details.
You won’t be able to do all the things that you can do with the main map, like search and easy embedding.
What you can do is browse the experimental map to find deficiencies and then edit OSM to enter names, in instances where the data is incomplete.
Your changes will appear on the main map and the experimental map.
Ultimately the place name you enter could then appear in a multitude of apps and projects, thanks to its licensing status as open data. I am very glad to offer this resource as a means of helping anybody who wants to share place names in Welsh. Thanks again to the Welsh Government for supporting this work.
The experimental map server is called the Pwll Tywod, or sand pit. This is denoted by the sandy coloured border. Our use of this term is to convey that we are playing with how it appears. Please excuse any occasional tech glitches you might see on the experimental map – but that’s the point.
Combining Wikidata with OSM allows us to build on the work of Mapio Cymru which has been developing a map of Wales using only Welsh language data held in the OSM database. By aligning and combining this with Wikidata the map can begin to grow further, offering more information to users through the medium of Welsh.
And this is important. Many places in Wales, be they towns, villages, hills or beaches, have two names, or sometimes more. The names in Welsh are almost always the original place names, ancient in origin and steeped in history. These names are usually descriptive or refer to long lost saints, chieftains or fortresses. The English versions of place names are sometimes meaningless mutations of the Welsh originals or names imposed by medieval invaders or Victorian ‘modernisers’. Even today historic properties are renamed in English by their new owners and Welsh names are dropped from websites and maps in favour of English alternatives deemed to be ‘more easy to pronounce’.
This project aims to decolonise mapping in Wales, not by erasing English place names from the record but by giving users the option to view and explore a modern map of Wales solely through the medium of Welsh – a service that didn’t really exist until the launch of Mapio Cymru.
So the first challenge with this project is actually to encourage communities to contribute their local Welsh place names to OSM or Wikidata so that they can be included in the map, and this is done through a series of discussions, workshops and editing events. […]
We are building an open public map of Wales with all the names in Welsh. Because of recent work, the map will load much faster for you now.
Here’s what we did to improve the map loading speed.
When you load the map, what you’re seeing is a grid of tiles. Each is a square image file.
These tile images are rendered from the underlying map data in OpenStreetMap, which is stored as points, ways, and relations. As well as the data, the OpenStreetMap software stack developed by the project is also freely licensed.
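To make the grid concrete, here is a small sketch of how a point on the ground maps to one tile. It assumes the standard OpenStreetMap "slippy map" scheme, in which each tile is addressed by zoom level, x and y; the function name and the approximate coordinates for Cardiff are chosen just for the example.

```python
import math
from typing import Tuple


def latlon_to_tile(lat: float, lon: float, zoom: int) -> Tuple[int, int]:
    """Return the (x, y) index of the tile containing a point at the given zoom level."""
    n = 2 ** zoom  # the grid is n tiles wide and n tiles high
    x = math.floor((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the far edge still fall inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


if __name__ == "__main__":
    # Approximate coordinates for central Cardiff.
    for zoom in (3, 10, 17):
        print(zoom, latlon_to_tile(51.48, -3.18, zoom))
```

Each step up in zoom level splits every tile into four, which matters for the storage question discussed further on.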
The most time-consuming part of showing an up-to-date map to a user is converting the data into images. This tends to be done when the user loads the map: the images are generated and served, and also stored in a cache on the server.
If the map were completely finished and final then that would help. We could make sure the server has the tile images all rendered and stored, and serve them every time. But that’s not an option for the whole map for a couple of reasons.
At the moment the Mapio Cymru map is updated automatically once every night when most people in Wales are asleep, and server capacity tends to be higher. These updates are necessary because geographical map features and names change often, whenever somebody makes an edit to OpenStreetMap. This is often an improvement to the map, e.g. somebody adding a name to a feature. The open data elements are constantly being revised, making it a bit like a Wikipedia of maps. The edit can also be a response to something changing in the physical world, e.g. a café changing its name or perhaps a lovely new railway station.
Therefore we can’t preserve the map in aspic: it’s changing all the time.
It turns out there’s another snag to the idea of pre-rendering and storing the whole map to speed up loading. There is a different set of image tiles at each zoom level. For the furthest zoom levels it is possible to store all the tile images. But for closer zoom levels, the total number of tiles grows exponentially, roughly quadrupling with each level. Pretty soon we need a vast amount of time and storage space, much more than we have.
For example, all of Wales at zoom level 17 took a little over seven hours to render overnight. That’s too much.
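A back-of-envelope sketch shows how quickly this gets out of hand. It assumes that the same area needs four times as many tiles at each closer zoom level, and (a simplification) that rendering time scales in proportion to the number of tiles. The seven-hour figure is the one quoted above; the function name is made up for the example.

```python
def estimated_render_hours(hours_at_base: float, base_zoom: int, zoom: int) -> float:
    """Scale a measured render time to another zoom level.

    Assumes four times as many tiles per extra zoom level and a render
    time proportional to the tile count. Real timings will differ.
    """
    return hours_at_base * 4.0 ** (zoom - base_zoom)


if __name__ == "__main__":
    for zoom in (16, 17, 18):
        hours = estimated_render_hours(7.0, 17, zoom)
        print(f"zoom {zoom}: roughly {hours:g} hours")
```

On those assumptions, zoom level 18 alone would take longer than a full day, so it could never finish within a nightly window.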
Can we pre-render and store some selected tiles, and then render any others on demand? It turns out that we can. The challenge is to figure out what to pre-render for maximum speed advantage, given the constraints of time and storage.
What are the map areas of ‘interest’ or ‘relevance’, and how do we codify this more precisely?
Initially our hypothesis was that sparsely populated areas of the map would be visited less frequently than densely populated areas. One method would be to pre-render areas above a certain population density threshold.
I then realised that there was another solution much more ready to go, and even better. We could refer to aggregated browser requests for tiles to see which parts of our map were visited most. This allows us to look at the historical popularity of areas right down to individual tile level. This was data we already had, lying in the server logs.
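Counting tile requests from a log can be sketched as below. This is an illustration under assumptions: it expects access log lines in which each tile request looks like "GET /…/zoom/x/y.png", and the example path and log lines are invented. The real log format and URL layout on the server may differ.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

Tile = Tuple[int, int, int]  # (zoom, x, y)

# Matches the request part of a log line, e.g. "GET /tiles/12/2012/1345.png HTTP/1.1"
TILE_REQUEST = re.compile(r'"GET \S*?/(\d+)/(\d+)/(\d+)\.png')


def count_tile_requests(lines: Iterable[str]) -> Counter:
    """Count how many times each (zoom, x, y) tile appears in the log lines."""
    counts: Counter = Counter()
    for line in lines:
        match = TILE_REQUEST.search(line)
        if match:
            zoom, x, y = (int(part) for part in match.groups())
            counts[(zoom, x, y)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '203.0.113.5 - - [01/Jan/2023:10:00:00 +0000] "GET /tiles/12/2012/1345.png HTTP/1.1" 200 1234',
        '203.0.113.9 - - [01/Jan/2023:10:00:02 +0000] "GET /tiles/12/2012/1345.png HTTP/1.1" 200 1234',
        '203.0.113.9 - - [01/Jan/2023:10:00:03 +0000] "GET /index.html HTTP/1.1" 200 512',
    ]
    for tile, hits in count_tile_requests(sample).most_common():
        print(tile, hits)
```

Because the counts are keyed by tile rather than by visitor, nothing about individual users needs to be kept once the totals are in hand.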
Here’s a heat map produced by Ben Proctor.
The popular areas do seem to correspond to population density. There may also be a relationship with the number and/or percentage of Welsh speakers in different areas, which is available from Census data.
I’ve instructed the server to pre-render these tiles and store them. We have chosen these areas:
Zoom levels 3 to 16 are now entirely pre-rendered.
Zoom levels 17 and 18 are now partially pre-rendered.
Now the server automatically pre-renders these areas every night, immediately after importing the up-to-date data.
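For the partially pre-rendered zoom levels, one way to turn popularity figures into a nightly list is sketched here. It is only an illustration: the function name, the max_tiles budget and the tie-breaking rule are assumptions, and the request counts are invented.

```python
from typing import List, Mapping, Sequence, Tuple

Tile = Tuple[int, int, int]  # (zoom, x, y)


def select_tiles_to_prerender(
    request_counts: Mapping[Tile, int],
    partial_zooms: Sequence[int] = (17, 18),
    max_tiles: int = 1000,
) -> List[Tile]:
    """Return the most requested tiles at the given zoom levels, up to max_tiles."""
    candidates = [
        (tile, hits)
        for tile, hits in request_counts.items()
        if tile[0] in partial_zooms
    ]
    # Most requested first; ties are broken by tile index so the result is stable.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [tile for tile, _ in candidates[:max_tiles]]


if __name__ == "__main__":
    counts = {(17, 1, 1): 5, (17, 2, 2): 9, (18, 3, 3): 1, (12, 0, 0): 100}
    print(select_tiles_to_prerender(counts, max_tiles=2))
```

Capping the list by a tile budget is what keeps the nightly job within the available time and storage, whatever the traffic looked like.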
The difference was very noticeable when I loaded the site before and after the change. Beforehand I’d been a bit embarrassed about the huge blank areas and the apparent freeze-ups of the map, while the server wheezed along. I am not experiencing that anymore – at least for now!
On average, pre-rendered tiles are loading in 40% of the time they took before. That’s a dramatic improvement, although the degree of speed-up is highly dependent on how many users are accessing the server at once.
Even tiles that are not pre-rendered are loading faster because there is usually more capacity on the server.
Incidentally the smooth running of the server also depends on choosing the right settings and configuration. (We briefly considered nginx as an alternative to Apache but it appears not to have an equivalent of the mod_tile module.)