Not all networks have geographic information (e.g., friendship networks, hyperlink networks, etc.). However, for networks that do have geographic data (such as networks of commuting or migration), geography can be a powerful way to position the nodes in a visualization. Geographic position allows viewers to interpret the network in a geographical context and also, enables viewers that are familiar with the geography to quickly find specific nodes. However, many networks include dense clusters (Figure 1a) that make a simple geographic layout inappropriate for static images. Even for interactive visualizations, users will have to pan and zoom frequently and hence, may struggle to see the big picture properties of the network.
We would like to retain the advantages of a geographic layout, but avoid the clustering issue. We first started by looking at ways to modify standard network force-directed layout algorithms to achieve this, but we have found that multidimensional scaling (MDS)—a technique typically used for data visualization and dimensionality reduction—works well. Given a matrix of pairwise distances, MDS produces a set of points whose pairwise Euclidean distances are as close as possible to those in the original matrix. In our case, we compute the Euclidean distance matrix between nodes, add a number λ to every entry (except those on the diagonal) and then compute new coordinates for the nodes using MDS. Note that by adding λ to every distance, we increases the relative distance between pairs of points that are close together.
There are many variants of MDS. We use the well-known Sammon mapping which focuses on the distances between nearby points. Figure 1 demonstrates the approach for different values of λ on a map of local authorities in the United Kingdom. In this case, a value of λ between 100-200km produces reasonable results—the clusters are spread out, but significant geographic information is retained.