Before switching to Avro, our internal interface posed three fundamental problems that made it hard for us to scale our product:
- Serialization: We wanted to make caching possible without either side of the internal interface having to do additional work. In order to make this type of caching possible, we needed to find a way to easily serialize the data that passes through the interface. At the time, we didn’t have a standardized format for that data, which made it hard to serialize.
- Decoupling: Our developer-facing interface of HTTP endpoints had a one-to-one mapping with our internal interface, which made it hard to make changes to one without compromising the functionality of the other.
When trying to solve these three challenges, we decided to tackle the serialization problem first. Soon, we ended up diving head over heels into the world of serialization formats. We spent many a day researching and trying out formats like Google’s Protocol Buffers, Apache Thrift, Apache Avro, MessagePack, and others. Some of them, like MessagePack, function as serialization formats only. Others, like Avro and Protocol Buffers, act as combinations of serialization formats and Remote Procedure Call (RPC) frameworks.
Our guess was that an RPC framework could help us decouple the two sides of the interface and thus solve our second problem as well. This narrowed our choices down to the formats that came with an RPC framework: Avro, Thrift, and Protocol Buffers.
Looking more into those three formats, we found that Avro’s features would solve all three of our challenges and make our product scalable in other ways too:
- Solving the three problems: Avro knocks out three birds with one stone: serialization format, RPC protocol, and schema validation. These three features would solve our three challenges respectively — all in one go.
- Logical types: Avro would give us the freedom to add our own types to Avro’s existing logical types. This would allow us to execute stricter schema validation in the protocol itself, thus solving our third problem.
- Binary encoding: Comparing Avro to other serialization formats, we found that Avro’s binary encoding was incredibly efficient for the specific schemas we were using.
Implementing the Avro protocol
Once we knew that Avro was the format to go with, it was time to start building!
When we started implementing, we found that both creating Avro schema files and following the avsc instructions were relatively easy. We also had to face some challenges:
- We wanted the binary encoding of the data to be as efficient as possible. This was a pretty ambitious goal, so we had to put a good amount of thought into how to structure the protocol and how to build the schemas. We even ended up encoding data manually — literally! — by hand on a whiteboard.
Replacing the internal interface
Once we had successfully replicated our internal interface in Avro, the last step was to replace our old interface with the new Avro interface.
Did I mention that our internal interface is on the critical path of almost every request to Smartcar? We couldn’t just go on a feature freeze and halt all work until the migration was complete. Thankfully, we were able to use some avsc features alongside some carefully designed shims — small chunks of temporary code that wrapped our old methods and made them compatible with our new interface. All this allowed developers to keep using the Smartcar API, while we integrated and deployed the new interface at the same time.
The work is done! Smartcar now has a brand new internal interface that uses Avro to connect the developer-facing side of our API with our OEM integrations. This new interface allows our team to scale the Smartcar platform more easily. It ensures that we can:
- efficiently cache large amounts of data,
- iterate on each side of the interface independently of each other,
- and catch errors and potential bugs much earlier.
Thank you, Avro and avsc! 🚀
Smartcar is the connected car API that allows mobile and web apps to communicate with connected vehicles across brands (think “check odometer” or “unlock doors.”)