Over the last decade, ride-hailing companies have changed the way we move around our cities. Founded in 2011, Cabify is one of the largest international players in this industry, operating in more than 40 cities in Latin America and Spain. Cabify services are built on a powerful technology stack that supports the smooth, simultaneous operation of millions of journeys worldwide, 24 hours a day, 7 days a week. One of the key components of this stack is the tool that calculates how long it will take drivers to travel between locations, whether they are on their way to pick up clients or driving them to their destination. This is known as the Estimated Time of Arrival (ETA). A good ETA calculator is crucial for driver assignment and pricing, among other things. There are different approaches to obtaining this ETA, depending on the company’s internal capabilities and external vendor policies.
At Cabify we have developed Cabimaps, our own ETA calculator. Cabimaps is a deep neural network that predicts the time it will take for a driver to travel between two locations. The main purpose of this talk is to tell the story of Cabimaps since its inception, explaining the most important features that have been incorporated into it over the years. Cabimaps has been an integral part of our stack since 2019, and we revise, retrain and fine-tune it frequently to keep up with market changes.
Cabimaps gives us the freedom to control how ETAs are calculated, avoiding the costs and vendor lock-in of relying on an external provider (e.g., Google Maps or HERE). Cabimaps is trained exclusively on our own data, removing any dependency on external sources. Cabify has been a data-driven company from the start, so we have a large amount of useful information to draw on.
Cabimaps is not a route calculator (it does not return the route the driver needs to follow), but a time estimator. This means Cabimaps does not need to rely on complex data representations of the city map. Its only inputs are the origin and destination coordinates and the date and time of the trip. Several transformations are applied before this information is fed to the neural network. Since Cabimaps does not rely on a city map, we need a smart way to incorporate geographical information into the model. Many aspects of the geographical surroundings can affect the driver’s route, such as the street layout or nearby services, and we have to generalize these features to incorporate them into the model. We achieve this through a combination of spatial indexing and feature embedding. Cabimaps uses spatial indexes to convert geographic coordinates into cell ids, leveraging different index levels to capture geographical information at different scales. It then uses specially trained neural subnets to calculate an embedding for each index cell.
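The combination of multi-level spatial indexing and cell embeddings can be illustrated with a minimal Python sketch. It is not the Cabimaps implementation: the square latitude/longitude grid, the three resolutions, the embedding size, the cyclical time encoding and all names (`multi_level_cells`, `EmbeddingTable`, `build_features`) are assumptions made for illustration, and the randomly initialized lookup table only stands in for the trained neural subnets mentioned above.

```python
import math
import random
from datetime import datetime

# Hypothetical grid resolutions (cells per degree), from coarse to fine.
CELLS_PER_DEGREE = (10, 100, 1000)
EMBED_DIM = 4  # hypothetical embedding size


def multi_level_cells(lat, lon, resolutions=CELLS_PER_DEGREE):
    """Return one (level, row, col) cell id per resolution for a coordinate."""
    return [
        (level, math.floor(lat * res), math.floor(lon * res))
        for level, res in enumerate(resolutions)
    ]


class EmbeddingTable:
    """Maps cell ids to vectors (random here; a real model would learn them)."""

    def __init__(self, dim=EMBED_DIM, seed=0):
        self.dim = dim
        self._rng = random.Random(seed)
        self._vectors = {}

    def lookup(self, cell):
        # Create a vector the first time a cell is seen, then reuse it.
        if cell not in self._vectors:
            self._vectors[cell] = [
                self._rng.gauss(0.0, 0.1) for _ in range(self.dim)
            ]
        return self._vectors[cell]


def time_features(when):
    """Cyclical encoding of hour of day and day of week (assumed transformation)."""
    hour_angle = 2 * math.pi * (when.hour + when.minute / 60) / 24
    day_angle = 2 * math.pi * when.weekday() / 7
    return [
        math.sin(hour_angle), math.cos(hour_angle),
        math.sin(day_angle), math.cos(day_angle),
    ]


def build_features(origin, destination, when, table):
    """Concatenate cell embeddings for both endpoints with the time features."""
    features = []
    for lat, lon in (origin, destination):
        for cell in multi_level_cells(lat, lon):
            features.extend(table.lookup(cell))
    features.extend(time_features(when))
    return features


# Example: two points in central Madrid on a weekday morning.
table = EmbeddingTable()
features = build_features(
    (40.4168, -3.7038), (40.4203, -3.7058), datetime(2023, 5, 12, 8, 30), table
)
```

In this sketch, two nearby points share the same coarse cell but fall into different finer cells, so the input vector carries both neighbourhood-level and street-level signal without any explicit map. In practice, an established hierarchical index such as H3 or S2 would likely replace the toy grid.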
We have trained Cabimaps models for more than 40 cities in 8 countries. Our objective is to obtain ETA calculations comparable to those of state-of-the-art commercial providers, so that we can rely entirely on our own service. Cabimaps has evolved over the years, and it currently produces results similar to or better than those of our best vendor in 65% of the cities, including very large markets like Buenos Aires, Madrid or Mexico DF. Where Cabimaps still comes in second place, the gap to the best vendor is 16 seconds on average.
From a purely technical point of view, we faced a difficult problem in terms of the necessary model accuracy, reliability and computational performance. From a business perspective, we managed to develop a piece of technology that helped us to significantly reduce operational costs and our dependence on external providers for one of our core processes, without compromising the quality of the services provided to our users.