# How efficient is the placement of London Tube stations in minimising walking distances? Part III

In this post I finally share my answer to the question “How efficient is the placement of London Tube stations in minimising walking distances?”. To recap briefly: in Part I we developed our strategy and R code for measuring the efficiency of the placement of Tube stations. In Part II we modified the code to take empirical data from OpenStreetMap and Transport for London (TfL), and I presented results for the TfL station network as-is alongside a randomly generated station network that acts as a performance comparator.

## Final Results – How efficient is the placement of London Tube stations in minimising walking distances?

• The current placement of London Tube (TfL) stations is not efficient within the 10 km study radius around London.
• Measured by the sum of squared shortest distances (SSSD), the current placement scores 149% worse than a random allocation of stations.

We calculated this by finding the shortest straight-line distance (as the crow flies) from every building to its nearest station and scoring each layout by the sum of the squares of these shortest distances. We square the distances because squaring penalises long distances more strongly than short ones, so the score rises sharply for distances that are becoming un-walkable. We measured efficiency as the percentage difference in this score relative to a random station allocation.
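As a minimal sketch of the scoring described above, the function below computes the sum of squared shortest distances for small coordinate matrices. The name `sssd`, the toy coordinates and the assumption that coordinates are already projected to metres are all illustrative; the cost function actually used in this series is min.distance, which is not reproduced here.

```r
# Hypothetical helper (not the min.distance function from this series).
# buildings, stations: numeric matrices with one row per point and
# columns x, y in the same projected units (assumed here to be metres).
sssd <- function(buildings, stations) {
  # Squared distance from every building (rows) to every station (columns)
  dx <- outer(buildings[, 1], stations[, 1], "-")
  dy <- outer(buildings[, 2], stations[, 2], "-")
  d2 <- dx^2 + dy^2
  # Keep only the nearest station for each building, then sum
  sum(apply(d2, 1, min))
}

# Toy data: three buildings and two stations
buildings <- rbind(c(0, 0), c(3, 4), c(10, 0))
stations  <- rbind(c(0, 0), c(10, 3))

score <- sssd(buildings, stations)
score  # 0 + 25 + 9 = 34
```

This version builds the full buildings-by-stations distance matrix, so it is only practical for small inputs; it is meant to show the definition, not to be fast.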

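| Name                 | 1st Qu. | Median  | Mean    | 3rd Qu.  | Max.     | SSSD    | Efficiency |
|----------------------|---------|---------|---------|----------|----------|---------|------------|
| Random Stations      | 465.500 | 704.500 | 753.900 | 988.600  | 2246.000 | 7.83e10 | 0%         |
| TfL Stations         | 338.600 | 567.000 | 950.500 | 1253.000 | 5844.000 | 1.95e11 | -149%      |
| *Partially Optimised | 283.800 | 425.300 | 454.000 | 585.300  | 2347.000 | 2.89e10 | 63%        |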
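
A quick check of the Efficiency column, assuming it is the percentage reduction in SSSD relative to the random layout. The formula and the helper name `efficiency` are assumptions, chosen because they reproduce the reported figures:

```r
# Assumed definition: percentage reduction in SSSD versus the random layout.
# Positive values are better than random, negative values are worse.
efficiency <- function(sssd_layout, sssd_random) {
  100 * (sssd_random - sssd_layout) / sssd_random
}

# SSSD values taken from the results table
sssd_values <- c(random = 7.83e10, tfl = 1.95e11, partially_optimised = 2.89e10)

eff <- round(efficiency(sssd_values, sssd_values[["random"]]))
eff  # random 0, tfl -149, partially_optimised 63
```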

*I call these results partially optimised because optim hit its maximum number of iterations without reaching convergence. We can still safely say that it was converging, since the SSSD (Sum of Square Shortest Distances) is almost an order of magnitude lower than for the TfL starting points (2.89e10 versus 1.95e11). This can be seen in the diagram below, where the stations are more evenly spread within the 10 km study radius and generally sit close to areas with a high density of buildings.
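
For readers who want to check for this situation themselves, here is a small sketch of how optim reports it. The toy cost function and the iteration limits are made up for illustration, and the default Nelder-Mead method is assumed, which may differ from the settings used for the station runs.

```r
# Toy cost function standing in for the SSSD cost (illustrative only)
toy_cost <- function(p) sum((p - c(1, 2))^2)

# Deliberately small iteration budget so that optim stops early
res <- optim(par = c(10, 10), fn = toy_cost, control = list(maxit = 10))

# optim documents convergence code 1 as "iteration limit maxit reached"
# and 0 as successful completion
hit_limit <- res$convergence == 1

# Resume from where the previous run stopped, with a larger budget
res2 <- optim(par = res$par, fn = toy_cost, control = list(maxit = 500))
```

Restarting from `res$par` in this way is one option for continuing a partially optimised run without throwing away the progress already made.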

## What can we do better?

• We can improve the coverage of stations by including railway stations that are not operated by TfL; this is likely to fill some coverage gaps in the South East and North East of London.
• We can improve the speed of the optimisation in a few ways:
    • selecting cleverer initial conditions: we simply used the current locations of TfL stations as starting points, but it would be better to place stations with a heuristic, such as putting them at the points with the highest building densities;
    • improving the execution time of the SSSD cost function (min.distance), which optim executes repeatedly. To see how to get a ~40x speed improvement, check out my blog post on using Rcpp.
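
One possible version of the density heuristic mentioned above is to bin buildings into a square grid and start stations at the centres of the fullest cells. This is only a sketch: the helper name `densest_cells`, the 500-unit cell size and the toy coordinates are assumptions, and I have not tested it on the London data.

```r
# Hypothetical helper: centres of the k grid cells containing most buildings.
# buildings: two-column numeric matrix (x, y); cell: cell size in the same units.
densest_cells <- function(buildings, k, cell = 500) {
  # Index of the grid cell each building falls into
  gx <- floor(buildings[, 1] / cell)
  gy <- floor(buildings[, 2] / cell)
  # Count buildings per cell and drop empty cells
  counts <- as.data.frame(table(gx = gx, gy = gy), stringsAsFactors = FALSE)
  counts <- counts[counts$Freq > 0, ]
  # Densest cells first, keep the top k
  top <- head(counts[order(-counts$Freq), ], k)
  # Return cell centres as candidate starting coordinates
  cbind(x = (as.numeric(top$gx) + 0.5) * cell,
        y = (as.numeric(top$gy) + 0.5) * cell)
}

# Toy data: three buildings in one cell, two in a second, one in a third
buildings <- rbind(
  c(100, 100), c(200, 300), c(400, 450),
  c(600, 100), c(900, 400),
  c(2100, 2300)
)

start <- densest_cells(buildings, k = 2)
start  # cell centres (250, 250) and (750, 250)
```

The resulting coordinates could then be flattened into a vector and passed to optim as `par` in place of the current TfL locations.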