
Improving road names in Cymraeg mapping in Cymru

Plugging the gap

A key objective of the Mapio Cymru project is to help increase the number of names in Cymraeg (Welsh language) held by OpenStreetMap as open data, as a resource to support mapping apps and services available in Cymraeg in the future.

The biggest single gap in name data in OpenStreetMap in Cymru (Wales) is road names that should be used in Cymraeg. This data is held by all local authorities and passed to GeoPlace, which uses it in the National Street Gazetteer. GeoPlace also supplies the data to Ordnance Survey for inclusion in relevant products.

OS Open Roads and the National Street Gazetteer

The OS Open Roads dataset contains name data and is coded to provide the appropriate name in English and Cymraeg. When examining this product the Mapio Cymru team has been concerned to see inconsistencies in the data and what seem like data quality issues compared with on-the-ground signage. For this reason we resisted using OS Open Roads as a source until we could understand these inconsistencies and judge whether they were significant.

The authoritative dataset for road names should be the National Street Gazetteer (NSG). This is available to public bodies under the Public Sector Geospatial Agreement (PSGA) but it is not easily available to third parties to view or to re-use.

Comparing OS Open Roads to the National Street Gazetteer

We signed a sub-contractor agreement with Welsh Government and they obtained a version of the NSG for Wales. They supplied us with a simple CSV consisting of Unique Street Reference Number (USRN), name in Cymraeg and name in English.

We joined this dataset to the OS Open Roads product and this gave us the opportunity to compare name data between the datasets.

For all of Cymru, where a name value appears in OS Open Roads, it is an exact match for the values in the NSG 66% of the time.
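A comparison of this kind could be sketched as follows. This is a minimal illustration under assumptions, not the actual analysis: the column names (`usrn`, `name_cy`, `name_en`), the idea that each OS Open Roads name is already keyed by USRN, the rule that a match against either NSG name counts as exact, and every sample row are invented for the example.

```python
import csv
import io

# Hypothetical NSG extract: USRN, name in Cymraeg, name in English.
# Column names and all rows are invented for illustration.
NSG_CSV = """usrn,name_cy,name_en
10000001,STRYD FAWR,HIGH STREET
10000002,HEOL YR EGLWYS,CHURCH ROAD
10000003,DAN Y CASTELL,DAN Y CASTELL
"""

# Hypothetical OS Open Roads names, assumed to be keyed by USRN already.
OS_NAMES = {
    "10000001": "STRYD FAWR",
    "10000002": "CHURCH ROAD",
    "10000003": "DAN-Y-CASTELL",
    "10000004": "",  # no name in OS Open Roads, so not counted
}


def exact_match_rate(nsg_csv, os_names):
    """Return (matched, compared) for roads that have a name in OS Open Roads."""
    nsg = {row["usrn"]: row for row in csv.DictReader(io.StringIO(nsg_csv))}
    compared = 0
    matched = 0
    for usrn, os_name in os_names.items():
        row = nsg.get(usrn)
        if not os_name or row is None:
            continue  # only compare where both datasets have something
        compared += 1
        # Assumption: matching either the Cymraeg or English NSG name counts.
        if os_name in (row["name_cy"], row["name_en"]):
            matched += 1
    return matched, compared


matched, compared = exact_match_rate(NSG_CSV, OS_NAMES)
print(f"{matched} of {compared} names match exactly")
```

Only roads that carry a name in OS Open Roads are counted in the denominator, which mirrors the "where a name value appears" wording above.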

On the face of it this is a worrying error rate. To dig into this we filtered the data to include only Powys roads.

In Powys, where a name appears in OS Open Roads it is an exact match for the values in the NSG 73% of the time.

Of the cases where a name appears in OS Open Roads but is not an exact match for the values in the NSG:

  • 32% are simply different names and it is difficult to infer any reason for the difference
  • in 27% of cases the NSG contains a description rather than a name.
    For example: USRN 85320327 appears as “PRIVATE STREET MAES MAESMAENDU TO MAES-Y-FFYNNON LINK ROAD” in Cymraeg in NSG and “HEOL Y FFYNNON” in OS. The OS dataset does seem to contain street names that appear on physical signs but don’t appear in the NSG. Presumably this is down to OS surveyors adding the on-the-ground data.
  • in 18% of cases it appears as if there are data quality issues on the OS Open Roads side. We can’t be sure, of course, but this is an educated guess.
    For example: USRN 85318006 appears as “CLOS BURGESS” in Cymraeg in NSG and “BURGESS CLOSE” in OS Open Roads
  • in 14% of cases the difference between the NSG and OS Open Roads is down to different application of spaces, hyphens, apostrophes etc
    For example: USRN 85319893 appears as “DAN Y CASTELL” in Cymraeg in NSG and “DAN-Y-CASTELL” in OS Open Roads
  • The remaining 8% of cases were a mix of various other issues.
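Of the categories in the list above, differences in spaces, hyphens and apostrophes are the easiest to detect automatically. The sketch below shows one possible check, assuming that two names which become identical once those characters are removed differ only in punctuation. The function names are hypothetical; the example names are the ones quoted above.

```python
import re


def normalise(name):
    """Upper-case a name and drop spaces, hyphens and apostrophes."""
    return re.sub(r"[\s\-'’]", "", name.upper())


def differs_only_in_punctuation(nsg_name, os_name):
    """True when the names differ, but only in spacing or punctuation."""
    return nsg_name != os_name and normalise(nsg_name) == normalise(os_name)


print(differs_only_in_punctuation("DAN Y CASTELL", "DAN-Y-CASTELL"))
print(differs_only_in_punctuation("CLOS BURGESS", "BURGESS CLOSE"))
```

The first pair is flagged as a punctuation-only difference; the second is not, because the words themselves differ. The other categories still need a human editor to judge.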

OS Open Roads is good enough

Having broken down these various differences we concluded that the OS Open Roads dataset is a good enough source for human editing. An editor reasonably familiar with road names in Cymraeg would be able to detect many of the problems and resolve them correctly.

To demonstrate this we used OS Open Roads as a source to add Cymraeg names to roads in Aberhonddu, Powys.

This is what our map looked like before we started:

Screenshot from the Mapio Cymru website. It shows north Brecon with many un-labelled roads.

And this is what it looked like once we had added Cymraeg names:

Screenshot from Mapio Cymru website of a map of the north of Brecon. Almost all roads now have a label on them.

Lots more road names.

One of the potentially jarring things about this process is that it adds a lot of English-language names to the map. Which, on the face of it, seems odd if we’re trying to improve a Cymraeg map.

The truth is that many English-language names are used in Cymraeg. When we generate our Cymraeg map (which we do every day!) we label features with the name by which they are known in Cymraeg.

If we compare our map to the same view on the openstreetmap.org server, which in Cymru tends to show the names used in English, we see that many road names are the same but not all.

Screenshot of openstreetmap.org. A map of the north of Brecon. Most of the roads have labels and most of the labels are the same as the names in the Welsh-language map.

Notably Lôn Bupren appears on the Cymraeg map but appears as Peppercorn Lane on the English-language map. The other road names remain the same because they have the same name in Cymraeg and English.
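The difference between the two maps can be illustrated with a small sketch. It assumes the usual OpenStreetMap tagging convention, where `name` holds the default name and `name:cy` holds the name used in Cymraeg; how each renderer actually chooses its labels is an assumption here, and every road other than Peppercorn Lane is a made-up example.

```python
# Tags for three roads. Only the first pair of names comes from the text
# above; the other two roads are hypothetical.
ROADS = [
    {"name": "Peppercorn Lane", "name:cy": "Lôn Bupren"},
    {"name": "Example Road", "name:cy": "Example Road"},  # same name in both languages
    {"name": "Another Road"},  # no Cymraeg name recorded yet
]


def cymraeg_label(tags):
    """Label for a Cymraeg map: the name:cy tag, or None if it is missing."""
    return tags.get("name:cy")


def default_label(tags):
    """Label for the default map: the plain name tag."""
    return tags.get("name")


for road in ROADS:
    print(default_label(road), "->", cymraeg_label(road))
```

Under these assumptions a road with no `name:cy` tag gets no label on the Cymraeg map, which is why recording a name that happens to be English in `name:cy` still fills a gap.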

This year we will be putting some effort into improving road name labels in OpenStreetMap.

If you are an OpenStreetMap editor please help us.

 


The state of Welsh-language mapping

We worked with Transport for Wales to investigate Welsh language mapping, geolocation and route finding. This is what we found.

Mapping in Welsh

Transport for Wales asked us to undertake a piece of research for them. They wanted to know how they could build online mapping applications that treated Welsh and English language equally.

We’ve been thinking about these issues for several years and we maintain a Welsh language map of Wales at openstreetmap.cymru. This, however, was a real opportunity to think about these questions from the perspective of an organisation providing public transport services across Wales. We’ve produced a report for Transport for Wales which has a lot of detail in it and is very focused on their specific circumstances. This post is an opportunity to take a step back and think about some of the key things we have learned.

Digital mapping is really commodified… in English

If you want to spin up a transport application with slick background mapping, geo-location, route finding and lots of points of interest there are many robust options available. For many uses digital mapping applications can be rapidly assembled, at least in part, by stitching together commercial services that provide data on demand (for a fee).

For those of us on the team who remember when we had to drive to a minicomputer in Cardiff to do some fairly simple GIS tasks, this is really impressive. When you want those results in Welsh it suddenly becomes much harder.

Google Maps does not support Welsh… but…

Google Maps does not officially support Welsh.
This means that if you are a developer wanting to create an online mapping application using the Google Maps APIs you can’t ask for the data to be provided to you in Welsh. That should rule it out for most uses by public bodies in Wales. But… Google Maps does use Welsh in interesting ways. It seems to perform on-the-fly translation. For example: it knows that Prifysgol is the same as University and so if you search for Prifysgol… it will offer you universities not just in Wales but across the world. We searched for a pub called the Black Horse and were offered the Ceffyl Du. This is quite clever.

If you use Google Maps on your phone with your locale set to Cymraeg you will see Welsh place names on the map, including in England. But without official support there are real limits to where it should be used.

Bing surfaces a lot more Welsh than we had realised

Bing Maps is Microsoft’s mapping platform. I’m sure they wouldn’t like to be described this way but I think most of us would say they are “Microsoft’s version of Google Maps”.

Unlike Google Maps, Bing Maps does officially support Welsh. As a casual user of the Bing Maps website you might not notice this but as a developer you can amend your API calls to request responses in Welsh. Overall a developer can surface a great deal of Welsh in services from Bing Maps. At most zoom levels roads will have bilingual names (rather than the Welsh name or the English name), though there are some odd gaps.

If you have very simple requirements for displaying points on a background map of Wales that won’t be dominated by English-language names, Bing Maps is certainly worth a look.

We feel that most public bodies could probably go beyond what Bing offers, however.

Ordnance Survey could do better

Public bodies in Wales can use Ordnance Survey data under the Public Sector Geospatial Agreement. Ordnance Survey data is of extremely high quality and they offer a range of data downloads and APIs to support public bodies in their work.

But the way OS handles Welsh could be much improved.

OS publishes tiles: essentially electronic versions of the OS maps we are familiar with from walking trips. These include Welsh and English language names but OS policy means that English language names are more likely to appear than Welsh language names. It is not possible to request tiles that show only Welsh language names or where Welsh language names are more likely to appear than English language names.

OS also provides other datasets. Some of these contain the Welsh and English language names for features but we found that often the way that the data is labelled in the datasets made it difficult to identify what was the Welsh language name and what was the English language name.
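To illustrate the labelling problem, the sketch below groups a record's names by a language code. The field names (`name1`, `name1_lang`, `name2`, `name2_lang`) and the codes `cym` and `eng` are assumptions made for the example and may not match any particular OS product; the records themselves are hypothetical.

```python
def names_by_language(record):
    """Group a record's names by their language code.

    Names with no language code end up under "unknown", which is the
    case that makes Welsh and English names hard to tell apart.
    """
    grouped = {}
    for name_key, lang_key in (("name1", "name1_lang"), ("name2", "name2_lang")):
        name = record.get(name_key)
        if not name:
            continue  # this name slot is empty
        lang = record.get(lang_key) or "unknown"
        grouped.setdefault(lang, []).append(name)
    return grouped


# Hypothetical records: one fully labelled, one with the language missing.
labelled = {
    "name1": "Lôn Bupren", "name1_lang": "cym",
    "name2": "Peppercorn Lane", "name2_lang": "eng",
}
unlabelled = {"name1": "Dan-y-Castell", "name1_lang": ""}

print(names_by_language(labelled))
print(names_by_language(unlabelled))
```

When the language code is present the Welsh name can be picked out directly; when it is missing, a person familiar with both languages has to decide.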

Some of these are data quality issues, others are policy issues. Hopefully public bodies in Wales are working with OS to see improvements in these areas.

The “official” name

In many cases there isn’t an obvious, official source of the English language name and Welsh language name for a place, for a stream, a forest or an area.

This makes it hard to measure how good the coverage of a map is. There simply isn’t a “correct” dataset to compare it to. In many ways this is one of the strengths of the Welsh language: it truly is a living language and things are called what people using the language call them.

That said, computers need rules, and as we use maps on computers more and more the need for some rules around Welsh language names grows.

Our Welsh language map is based on the community-edited OpenStreetMap and Wikidata databases. Our research suggests that these datasets are likely to remain part of the mix in terms of naming places and features until, at least, commercial competitors catch up. We really encourage people to contribute to these datasets.

Overall

At the time of writing:

  • It is very straightforward to build web mapping applications in English.
  • It isn’t at all straightforward to build web mapping applications in Welsh. It is possible to build them in Welsh though.
  • We’d like to encourage public bodies and other organisations serving the people of Wales to look into how they can build bilingual mapping applications. The more organisations working on this problem the more solutions will be developed.

We will carry on working in this area and we would love to hear from others with questions or ideas about Welsh language and bilingual mapping.

We’d like to thank Transport for Wales for commissioning this work and for allowing us to share this summary of things we found as a result of this project.