UK Transit Data: TransXchange and ATOC

This page describes how to convert UK public transport data to GTFS format. Note that this step has already been done for you in the Workshop files, which include converted GTFS data for London and the South East, as well as for National Rail.

UK Public Transport Schedule Formats
While many cities and transport agencies around the world publish GTFS data directly, the UK has its own public transport schedule data formats. These need to be converted to GTFS before UK cities can be modelled in R5. One of the few exceptions is Greater Manchester, which publishes GTFS data for its bus and tram network.

To complicate matters, the bus and metro data (TransXchange) and the National Rail data (ATOC-CIF) use different formats. Fortunately, several programs can handle the conversion. We are going to use UK2GTFS, an R package by ITS Leeds, which includes functions to convert both the TransXchange bus and metro data and the National Rail ATOC-CIF data.

TransXchange Data Conversion
In the UK, local public transport schedule data is published in TransXchange, a format developed by the Department for Transport. You can access the latest TransXchange data from the Traveline National Data website (registration is required to get the FTP link). The data is published by region of Great Britain:

  • EA – East Anglia
  • EM – East Midlands
  • L – London
  • NE – North East
  • NW – North West
  • S – Scotland
  • SE – South East
  • SW – South West
  • W – Wales
  • WM – West Midlands
  • Y – Yorkshire

Your study area may be covered by a single region, or you may need data for more than one. For example, a model of London confined to the Greater London Authority boundary would only need the London region, but if you want to include commuting trips into London then you might also add the neighbouring South East and East Anglia regions.
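
As a minimal sketch of how a multi-region study area could be organised, the snippet below maps the region codes listed above to their names and builds the zip file paths for a chosen set of regions. The helper region_zip_paths and the data_dir argument are hypothetical names introduced for illustration, and the "<code>.zip" file naming is an assumption based on the NE.zip example used on this page.

```r
# Region codes and names, as listed above
regions <- c(EA = "East Anglia", EM = "East Midlands", L = "London",
             NE = "North East", NW = "North West", S = "Scotland",
             SE = "South East", SW = "South West", W = "Wales",
             WM = "West Midlands", Y = "Yorkshire")

# Hypothetical helper: build the zip file paths for a set of region codes,
# assuming the files are named "<code>.zip" inside data_dir
region_zip_paths <- function(codes, data_dir = ".") {
  unknown <- setdiff(codes, names(regions))
  if (length(unknown) > 0) {
    stop("Unknown region code(s): ", paste(unknown, collapse = ", "))
  }
  file.path(data_dir, paste0(codes, ".zip"))
}

# London plus its neighbouring commuter regions
study_area <- c("L", "SE", "EA")
paths <- region_zip_paths(study_area)

print(regions[study_area])
print(paths)
```

Each path could then be passed to the conversion function described below, one region at a time.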

Let’s look at how to convert a TransXchange file for one region. First of all, you will need to install UK2GTFS in R:

install.packages("remotes") # If you do not already have the remotes package
remotes::install_github("ITSleeds/UK2GTFS")

The library also has some dependencies linked to the Tidyverse package, so you will need to install that too if you do not already have it.

Next we will load the library and run the transxchange2gtfs function. Set the path to the location of your TransXchange zip file:

library(UK2GTFS)
path_in <- "NE.zip" # TransXchange zip file for the North East region
gtfs <- transxchange2gtfs(path_in = path_in,
                          ncores = 3) # Number of CPU cores to use
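
The value of ncores above is hard-coded. As a small sketch, it could instead be derived from the machine using the parallel package that ships with R. Leaving one core free is a common convention assumed here, not a UK2GTFS recommendation.

```r
library(parallel)

# detectCores() can return NA on some platforms, so fall back to one core
available <- detectCores()
ncores <- if (is.na(available)) 1 else max(1, available - 1)

print(ncores)
```

The resulting ncores value can then be passed to transxchange2gtfs in place of the fixed number.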

If the TransXchange file is clean, then we can export the output to a GTFS file:

gtfs_write(gtfs, folder = "C:/AccessibilityWorkshop/Traveline_National", name = "NorthEast_GTFS_April2023")

However, the file often contains errors that need to be fixed. If you get a merge error, you need to force the merge as follows:

gtfs <- gtfs_merge(gtfs, force = TRUE)

Furthermore, you often need to clean the GTFS output so that it can be imported into R5. This can be achieved using the following functions:

gtfs <- gtfs_clean(gtfs)
gtfs <- gtfs_force_valid(gtfs)

If you are having problems with invalid GTFS files, then a useful tool is the online GTFS Validator.
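
Before uploading a file to the validator, a quick local check can catch the most basic problems. The sketch below assumes the GTFS object is a named list of data frames (stops, trips, routes, stop_times) with the standard GTFS id columns, and uses a small made-up feed rather than real converted data. The function check_gtfs_references is a hypothetical helper, not part of UK2GTFS, and it covers far fewer rules than the GTFS Validator.

```r
# Toy GTFS-like object: stop "C" is used in stop_times but not defined in stops
gtfs_toy <- list(
  stops = data.frame(stop_id = c("A", "B"),
                     stop_lat = c(54.97, 54.98),
                     stop_lon = c(-1.61, -1.60),
                     stringsAsFactors = FALSE),
  routes = data.frame(route_id = "r1", stringsAsFactors = FALSE),
  trips = data.frame(trip_id = "t1", route_id = "r1", service_id = "s1",
                     stringsAsFactors = FALSE),
  stop_times = data.frame(trip_id = c("t1", "t1", "t1"),
                          stop_id = c("A", "B", "C"),
                          stop_sequence = 1:3,
                          stringsAsFactors = FALSE)
)

# Hypothetical helper: list ids that are referenced but never defined
check_gtfs_references <- function(gtfs) {
  list(
    stops_missing  = setdiff(unique(gtfs$stop_times$stop_id), gtfs$stops$stop_id),
    trips_missing  = setdiff(unique(gtfs$stop_times$trip_id), gtfs$trips$trip_id),
    routes_missing = setdiff(unique(gtfs$trips$route_id), gtfs$routes$route_id)
  )
}

problems <- check_gtfs_references(gtfs_toy)
print(problems)
```

An empty result for all three entries means every referenced stop, trip and route is defined somewhere in the feed.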

National Rail Data: ATOC-CIF
Unfortunately the GB National Rail data is even more of a pain to convert to GTFS than the TransXchange data! The rail data comes in an old format called ATOC-CIF. It can be downloaded from the Rail Delivery Group website (you need to register; the data you want is the Current Timetable Feed).

We can then use UK2GTFS to convert this file:

library(UK2GTFS)
path_in <- "ttis627.zip"
gtfs <- atoc2gtfs(path_in = path_in,
 ncores = 3)

Then you can run gtfs_clean and gtfs_force_valid on the output as above. One common issue with the rail data is that stop points are sometimes missing from the GTFS output. These are generally timing points rather than stops where passengers can board or alight, so accurate location data is not required. The online GTFS Validator can help to identify whether missing stops are causing an error. The problem can be fixed by adding the missing stops with a default location of 0,0.
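
One possible way to add the missing stops is sketched below. It assumes the GTFS object is a named list of data frames with the standard GTFS columns stop_id, stop_name, stop_lat and stop_lon, and it runs on a small made-up feed. The function add_missing_stops is a hypothetical helper written for this page, not a UK2GTFS function.

```r
# Toy GTFS-like object: stop "X" appears in stop_times but not in stops
gtfs_toy <- list(
  stops = data.frame(stop_id = c("A", "B"),
                     stop_name = c("Alpha", "Beta"),
                     stop_lat = c(51.5, 51.6),
                     stop_lon = c(-0.1, -0.2),
                     stringsAsFactors = FALSE),
  stop_times = data.frame(trip_id = "t1",
                          stop_id = c("A", "X", "B"),
                          stop_sequence = 1:3,
                          stringsAsFactors = FALSE)
)

# Hypothetical helper: append a placeholder row for every stop_id that is
# used in stop_times but absent from stops, located at lat/lon (default 0,0)
add_missing_stops <- function(gtfs, lat = 0, lon = 0) {
  missing_ids <- setdiff(unique(gtfs$stop_times$stop_id), gtfs$stops$stop_id)
  if (length(missing_ids) == 0) {
    return(gtfs)
  }
  # Start from all-NA rows so the placeholder rows share the stops columns
  filler <- gtfs$stops[rep(NA_integer_, length(missing_ids)), , drop = FALSE]
  filler$stop_id <- missing_ids
  if ("stop_name" %in% names(filler)) {
    filler$stop_name <- paste("Placeholder", missing_ids)
  }
  filler$stop_lat <- lat
  filler$stop_lon <- lon
  gtfs$stops <- rbind(gtfs$stops, filler)
  rownames(gtfs$stops) <- NULL
  gtfs
}

fixed <- add_missing_stops(gtfs_toy)
print(fixed$stops)
```

Any other columns in the stops table are left as NA for the placeholder rows, which may or may not be acceptable to the validator depending on the columns present.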
