In this post I finally share my answer to the question “How efficient is the placement of London Tube stations in minimising walking distances?”. To recap briefly: in part 1 we developed our strategy and R code for measuring the efficiency of the placement of Tube stations. In part 2 we modified the code to accept empirical data from OpenStreetMap and Transport for London (TfL), and I presented results for the TfL station network as-is alongside a randomly generated station network that acts as a performance comparator.
Final Results – How efficient is the placement of London Tube stations in minimising walking distances?
- The current placement of London Tube (TfL) stations is not efficient for the 10 km radius around London.
- The current placement is 149% less efficient than a random allocation of stations.
We calculated this by finding the shortest straight-line distance (as the crow flies) from every building to its nearest station and scoring each layout by the sum of square shortest distances (SSSD). We used squared distances because they penalise long distances more strongly than short ones, so a layout is punished most for distances that are becoming too long to walk. We measured efficiency as the percentage difference in SSSD relative to a random station allocation.
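The scoring described above can be sketched in a few lines of R. This is a minimal illustration, not the code from parts 1 and 2: the function names `nearest_dist`, `sssd` and `efficiency` are hypothetical, the inputs are assumed to be two-column matrices of projected x/y coordinates, and the efficiency formula is inferred from the figures reported in this post rather than taken from the original code.

```r
# Sketch only: `buildings` and `stations` are hypothetical two-column
# matrices of projected x/y coordinates (same units for both).
nearest_dist <- function(buildings, stations) {
  # Pairwise coordinate differences: one row per building, one column per station
  dx <- outer(buildings[, 1], stations[, 1], "-")
  dy <- outer(buildings[, 2], stations[, 2], "-")
  d  <- sqrt(dx^2 + dy^2)
  # Straight-line distance from each building to its nearest station
  apply(d, 1, min)
}

# Sum of square shortest distances for one station layout
sssd <- function(buildings, stations) {
  sum(nearest_dist(buildings, stations)^2)
}

# Assumed efficiency measure: percentage SSSD reduction relative to random
efficiency <- function(sssd_layout, sssd_random) {
  100 * (sssd_random - sssd_layout) / sssd_random
}

# Toy data: four buildings and two stations
buildings <- rbind(c(0, 0), c(3, 4), c(10, 0), c(10, 5))
stations  <- rbind(c(0, 0), c(10, 0))
toy_dist  <- nearest_dist(buildings, stations)
toy_sssd  <- sssd(buildings, stations)

# Efficiency of the TfL and partially optimised layouts from the reported SSSD values
eff_tfl <- efficiency(1.95e11, 7.83e10)
eff_opt <- efficiency(2.89e10, 7.83e10)
```

Rounded to whole percentages, `eff_tfl` and `eff_opt` agree with the -149% and 63% reported in this post, which is why this formula is assumed here.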
| Name | 1st Qu. | Median | Mean | 3rd Qu. | Max. | SSSD | Efficiency |
|---|---|---|---|---|---|---|---|
| Random Stations | 465.500 | 704.500 | 753.900 | 988.600 | 2246.000 | 7.83e10 | 0% |
| TfL Stations | 338.600 | 567.000 | 950.500 | 1253.000 | 5844.000 | 1.95e11 | -149% |
| *Partially Optimised | 283.800 | 425.300 | 454.000 | 585.300 | 2347.000 | 2.89e10 | 63% |
*I call my results partially optimised because optim hit the maximum number of iterations without reaching convergence. We can still safely say that it was converging, since the SSSD (Sum of Square Shortest Distances) ended up almost an order of magnitude lower than that of the TfL starting points (2.89e10 versus 1.95e11). This can be seen in the diagram below, where the stations are spread more evenly within the 10 km radius of study and generally sit close to areas with a high density of buildings.
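For readers who want to detect this situation in their own runs, here is a small, hedged sketch of checking whether `optim` stopped at its iteration limit and then resuming from where it left off. The data, the `cost` function and the tiny `maxit` value are all made up for illustration, and the default Nelder-Mead method is used only because the method used for the results above is not stated here.

```r
# Toy stand-in for the SSSD cost. `par` holds k station coordinates as
# c(x1, ..., xk, y1, ..., yk); `pts` is a hypothetical matrix of buildings.
set.seed(1)
pts <- cbind(runif(200), runif(200))

cost <- function(par, pts) {
  st <- matrix(par, ncol = 2)              # one row per station
  dx <- outer(pts[, 1], st[, 1], "-")
  dy <- outer(pts[, 2], st[, 2], "-")
  sum(apply(dx^2 + dy^2, 1, min))          # sum of squared nearest distances
}

k <- 3
start <- c(runif(k), runif(k))

# A deliberately small iteration budget so the run is cut short
fit <- optim(start, cost, pts = pts, control = list(maxit = 20))

# optim reports 0 for successful completion and 1 when maxit was reached
hit_limit <- fit$convergence == 1

# Resume from the best parameters found so far with a larger budget
fit2 <- optim(fit$par, cost, pts = pts, control = list(maxit = 2000))
```

Restarting from `fit$par` keeps the progress already made, so a long optimisation can be run in several shorter stages.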
What can we do better?
- We can improve the coverage of stations by including railway stations that are not operated by TfL. This is likely to fill in some coverage gaps in the South East and North East of London.
- We can improve the speed of the optimisation in a few ways:
  - selecting cleverer initial conditions. We simply used the current locations of TfL stations as starting points, but it would be better to place the initial stations with a heuristic, such as putting them at the points with the highest building densities;
  - improving the execution time of the SSSD cost function (min.distance), which is executed repeatedly by optim. To see how to get a ~40x speed improvement, check out my blog post on using Rcpp.
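One way the density-based starting heuristic mentioned above could look is sketched below. It is only an illustration under stated assumptions: `dense_starts` is a hypothetical helper, the buildings are assumed to be a two-column matrix of projected x/y coordinates, and the 500-unit grid cell is an arbitrary choice. The idea is to bin the buildings into square cells and start the stations at the centres of the most crowded cells.

```r
# Hypothetical helper: pick k starting points at the centres of the k grid
# cells that contain the most buildings. `cell` is the cell width, in the
# same units as the coordinates (arbitrary default).
dense_starts <- function(buildings, k, cell = 500) {
  # Integer grid index of each building along x and y
  ix <- floor(buildings[, 1] / cell)
  iy <- floor(buildings[, 2] / cell)
  # Count buildings per occupied cell, most crowded first
  counts <- sort(table(paste(ix, iy)), decreasing = TRUE)
  top <- names(counts)[seq_len(min(k, length(counts)))]
  # Recover the numeric cell indices from the "ix iy" keys
  idx <- matrix(as.numeric(do.call(rbind, strsplit(top, " "))), ncol = 2)
  # Convert cell indices to cell-centre coordinates
  cbind(x = (idx[, 1] + 0.5) * cell, y = (idx[, 2] + 0.5) * cell)
}

# Toy data: three buildings in one cell, two in another, one on its own
buildings <- rbind(c(100, 100), c(120, 90), c(80, 110),
                   c(1100, 100), c(1150, 130), c(2100, 2100))
starts <- dense_starts(buildings, k = 2)
```

The resulting matrix could then be flattened into the parameter vector passed to optim in place of the TfL station coordinates.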