Data analytics not only changes the processes of individual companies; it also has the potential to massively improve urban infrastructure and the way we live, work, and commute in our cities. It can be used to plan the (public) transportation networks of cities like Boston efficiently, and it helps to tackle the complexity within these networks: their high connectedness, the dynamic commuting patterns of the population, and a wide variety of external factors (weather, special events, etc.). Interestingly, companies that introduce new transportation concepts without an existing infrastructure seem to benefit especially from data analytics.
Uber’s ridesharing service is probably the best-known case, but local services like Boston’s bike sharing service Hubway also use data analytics. Hubway has rapidly grown in popularity over the last 3-4 years because it creates great value for both tourists and local commuters: riding a bike is usually cheap, healthy, environmentally friendly, and often even faster than commuting in a stifling, old subway. Most importantly, Hubway offers the convenience of over 140 bike stations in the Boston area where bikes can be picked up or dropped off (see map on the right). Hubway captures the value of these benefits through either a monthly subscription fee or a (significantly higher) fee for one-time users.
However, this value depends heavily on bike availability: if you want to bike to work in the morning and the rack is empty, Hubway not only loses revenue but also quickly loses customer satisfaction. Given the rapid growth in the number of bike sharing users, it is a major challenge for Hubway to continuously guarantee the availability of bikes and to avoid empty stations like the one on the left. Hubway does this with trucks that operate 16 hours a day to relocate bikes between stations. But how can they predict when, where, and how many bikes should be relocated to optimize the network? The solution lies in the data.
Hubway collects a variety of features for each trip taken on one of its bikes. These include a timestamp, start/end station, bike ID, subscription type (registered or “tourist”), and some other user-related data (gender, ZIP code, etc.). In 2014 they recorded 1,192,805 trips in total. Although I have no detailed knowledge of their internal prediction models, this dataset clearly offers immense potential for data-driven predictions to optimize Hubway’s business. Here are a couple of questions that Hubway can address by applying statistical models to the dataset:
- What do the commuting patterns in the Boston area look like? Which stations are affected by a significant imbalance of inbound/outbound traffic in the morning/evening?
- How do the time of day and the weekday influence demand? What’s the effect of holidays or special events?
- How does the weather influence the network?
- What’s the optimal policy for the bike relocation truck for a given day?
- Where should Hubway extend their existing locations or build new stations?
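To make the first question a bit more concrete, here is a minimal Python sketch of how an inbound/outbound imbalance per station could be computed from trip records. This is not Hubway’s actual method (which I don’t know); the record layout `(start_hour, start_station, end_station)`, the station names, and the function `station_imbalance` are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical trip records: (start_hour, start_station, end_station).
# Station names and values are invented for illustration only.
TRIPS = [
    (8, "A", "B"),
    (8, "A", "B"),
    (9, "C", "B"),
    (8, "B", "A"),
    (17, "B", "A"),
    (18, "B", "C"),
]


def station_imbalance(trips, first_hour, last_hour):
    """Return arrivals minus departures per station for all trips that
    start within the inclusive hour window [first_hour, last_hour]."""
    net = Counter()
    for hour, start, end in trips:
        if first_hour <= hour <= last_hour:
            net[start] -= 1  # one bike leaves the start station
            net[end] += 1    # one bike arrives at the end station
    return dict(net)


morning = station_imbalance(TRIPS, 7, 9)
evening = station_imbalance(TRIPS, 16, 19)

# Stations with the most negative value lose the most bikes in that window.
for station, net_bikes in sorted(morning.items(), key=lambda kv: kv[1]):
    print(station, net_bikes)
```

A strongly negative value means a station drains during that time window and would be a candidate for a truck delivery, while a strongly positive value points to a station that fills up. A real analysis would of course also need station capacities and the number of bikes docked at the start of the window.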
To give you a brief idea of the scale and potential of the Hubway data, I created a simple dashboard using their publicly available datasets from 2011-2013. Feel free to play around with it – it’s interactive! 😉
Hubway didn’t just use their internal capabilities to analyze the data, but soon discovered a new, creative way to leverage their data wealth: they used the crowd to get additional insights. Because many people benefit directly from improved local transportation services, the nature of Hubway’s business seems to be a perfect fit for a crowdsourcing approach. In 2013 and 2014 Hubway therefore set up a public data analytics challenge to visualize and analyze their dataset, which they made available to the public. The challenge resulted in a huge number of submissions and stunning visualizations.
Is that all there is? I guess not: there are many different options for further work on a data-driven bike sharing future in Boston. Potential improvements include dynamic pricing models to reduce the demand imbalance, connections with other public transportation services (Internet of Things), and further extensions to their (already very good) mobile services for smartphones and smartwatches.