September 18, 2018

How we use Apache Avro at Smartcar

Gurpreet Atwal

Director of Platform Engineering

We at Smartcar want to make it easy and secure for developers to integrate applications with cars. Our API allows apps to communicate with connected vehicles across car brands without the need for any additional hardware. Recently, our team overhauled the internal interface that connects our developer-facing API to each of our car manufacturer (OEM) integrations. Rather than calling the respective JavaScript functions directly, we introduced the Avro protocol. Why, you might ask?


Before switching to Avro, our internal interface posed three fundamental problems that made it hard for us to scale our product:

  • Serialization: We wanted to enable caching without either side of the internal interface having to do additional work. That required an easy way to serialize the data passing through the interface. At the time, we had no standardized format for that data, which made it hard to serialize.
  • Decoupling: Our developer-facing interface of HTTP endpoints had a one-to-one mapping with our internal interface, which made it hard to change one without breaking the other.
  • Type checks: We used type checks to ensure that the data we returned to developers was correct and well-formed. Because JavaScript is a weakly typed language, we had to perform those checks on both sides of the internal interface, which doubled our workload.

Choosing Avro

When trying to solve these three challenges, we decided to tackle the serialization problem first. Soon, we were diving headfirst into the world of serialization formats. We spent many a day researching and trying out formats like Google’s Protocol Buffers, Apache Thrift, Apache Avro, MessagePack, and others. Some of them, like MessagePack, function as serialization formats only. Others, like Avro and Protocol Buffers, combine a serialization format with a Remote Procedure Call (RPC) framework.

Our guess was that an RPC framework could help us decouple the two sides of the interface and thus solve our second problem as well. This narrowed our choices down to the formats that came with an RPC framework: Avro, Thrift, and Protocol Buffers.

Looking more into those three formats, we found that Avro’s features would solve all three of our challenges and make our product scalable in other ways too:

  • Solving the three problems: Avro knocks out three birds with one stone by combining a serialization format, an RPC protocol, and schema validation. These three features would solve our three challenges respectively, all in one go.
  • Logical types: Avro would give us the freedom to add our own types to Avro’s existing logical types. This would allow us to execute stricter schema validation in the protocol itself, thus solving our third problem.
  • Binary encoding: Comparing Avro to other serialization formats, we found that Avro’s binary encoding was very compact for the specific schemas we were using. Because both sides share the schema, the encoded data carries no field names or type tags.
  • Language support: Avro has been implemented in many different languages. If we ever wished to switch from JavaScript to another language in the future, Avro would give us the freedom to choose from a wider range of languages.
  • In-memory transport: This feature is specific to avsc, the JavaScript implementation of Avro, which you’ll read more about in just a minute. Avsc supports an in-memory transport in addition to the traditional network transports. If both sides of an interface run in the same process, this lets us avoid much of the latency associated with traditional RPC frameworks.
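
To make the logical-types point more concrete, here is a minimal sketch in plain JavaScript of the kind of check a range-style logical type could enforce. It deliberately does not use avsc's actual API. The helper name `makeRangeValidator` and the bounds are assumptions chosen for illustration, not Smartcar's implementation.

```javascript
// Hypothetical helper, not part of avsc: builds a validator that accepts
// only finite numbers within [min, max], similar in spirit to a "range"
// logical type layered on top of a numeric Avro type.
function makeRangeValidator(min, max) {
  if (!(min <= max)) {
    throw new RangeError(`invalid bounds: min=${min}, max=${max}`);
  }
  return function isValid(value) {
    return typeof value === 'number' &&
      Number.isFinite(value) &&
      value >= min &&
      value <= max;
  };
}

// Assumed bounds, for illustration only.
const isValidDistance = makeRangeValidator(0, 2000000);

console.log(isValidDistance(12345.6)); // true
console.log(isValidDistance(-1));      // false
console.log(isValidDistance('12345')); // false (a string, not a number)
```

Putting a check like this into the schema itself means it only has to be written once, rather than on both sides of the interface.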

Implementing the Avro protocol

Once we knew that Avro was the format to go with, it was time to start building!


By “building” I mean implementing the Avro protocol as a copy of Smartcar’s internal interface. Let’s recall the two sides of the internal interface: the developer-facing API and the OEM integrations. As both sides were written in JavaScript, we needed to use an npm library to implement the Avro protocol. By a fluke, we stumbled upon avsc — an npm library that implements Avro in pure JavaScript.

When we started implementing, we found that both creating Avro schema files and following the avsc instructions were relatively easy. We did, however, face some challenges:

  • We wanted the binary encoding of the data to be as efficient as possible. This was a pretty ambitious goal, so we had to put a good amount of thought into how to structure the protocol and how to build the schemas. We even ended up encoding data manually, quite literally by hand on a whiteboard.
  • We also ran into some issues with avsc due to the quirks of applying a typed schema language (Avro) in a weakly typed language (JavaScript). Matthieu, the author of avsc, helped us solve these problems. We’re very grateful for his support and for the many hours of debugging he saved us!

// Schema for Smartcar's odometer record in Avro's IDL

record OdometerDistance {
    // 2,000,000km ~ 1,242,742 miles
    @logicalType("range") @min(0) @max(2000000)
    float value;
}
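
For readers curious what encoding by hand involves, the sketch below follows the rules in the Avro specification: a `long` is zigzag-encoded and written as a variable-length integer, a `float` is written as 4 little-endian IEEE 754 bytes, and a record is simply its fields encoded in declaration order. The function names are hypothetical and the code uses only Node.js built-ins rather than avsc. It also ignores the range check, which is a validation concern rather than part of the byte layout.

```javascript
// Zigzag-encode a signed integer, then emit it as a base-128 varint,
// which is how the Avro specification encodes int and long values.
// BigInt keeps the arithmetic exact across the full 64-bit range.
function encodeLong(n) {
  const v = BigInt(n);
  // Zigzag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ...
  let z = BigInt.asUintN(64, (v << 1n) ^ (v >> 63n));
  const bytes = [];
  while (z > 0x7fn) {
    bytes.push(Number(z & 0x7fn) | 0x80); // low 7 bits, continuation bit set
    z >>= 7n;
  }
  bytes.push(Number(z)); // final byte, continuation bit clear
  return Buffer.from(bytes);
}

// An Avro float is 4 bytes, little-endian IEEE 754 single precision.
function encodeFloat(x) {
  const buf = Buffer.alloc(4);
  buf.writeFloatLE(x, 0);
  return buf;
}

// Hypothetical encoder for the OdometerDistance record above: a record
// is the concatenation of its fields, and this one has a single float.
function encodeOdometerDistance(record) {
  return Buffer.concat([encodeFloat(record.value)]);
}

console.log(encodeLong(64).toString('hex')); // 8001

const reading = { value: 12345.5 };
const encoded = encodeOdometerDistance(reading);
console.log(encoded.toString('hex'));        // 00e64046
console.log(encoded.length);                 // 4
console.log(JSON.stringify(reading).length); // 17
```

The record shrinks from 17 characters of JSON to 4 bytes because the field name lives in the schema rather than in every message.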

Replacing the internal interface

Once we had successfully replicated our internal interface in Avro, the last step was to replace our old interface with the new Avro interface.

Did I mention that our internal interface is on the critical path of almost every request to Smartcar? We couldn’t just go on a feature freeze and halt all work until the migration was complete. Thankfully, we were able to use some avsc features alongside some carefully designed shims — small chunks of temporary code that wrapped our old methods and made them compatible with our new interface. All this allowed developers to keep using the Smartcar API, while we integrated and deployed the new interface at the same time.
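
As a rough illustration of the shim idea, the sketch below wraps a legacy method that takes positional arguments so that it can be called through a request/response style handler. Everything here is hypothetical: `legacyGetOdometer`, `makeShim`, and the request and response shapes are assumptions made for the example, not Smartcar's actual code or avsc's API.

```javascript
// Hypothetical legacy method: positional arguments, bare return value.
function legacyGetOdometer(vehicleId) {
  // Stand-in for a real OEM integration call; returns a fixed reading.
  return 12345.5;
}

// Shim factory: adapts a legacy method to a handler that takes a single
// request object and returns a response object (the shape assumed here
// for the new interface).
function makeShim(legacyFn, toArgs, toResponse) {
  return function handler(request) {
    const result = legacyFn(...toArgs(request));
    return toResponse(result);
  };
}

// Temporary glue: map the request to the old arguments, and the old
// return value to the new response shape.
const getOdometer = makeShim(
  legacyGetOdometer,
  (request) => [request.vehicleId],
  (distance) => ({ value: distance })
);

console.log(getOdometer({ vehicleId: 'abc123' })); // { value: 12345.5 }
```

Because the shim is the only place that knows about both shapes, it can be deleted once the legacy method has been rewritten against the new interface.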

Conclusion

The work is done! Smartcar now has a brand new internal interface that uses Avro to connect the developer-facing side of our API with our OEM integrations. This new interface allows our team to scale the Smartcar platform more easily. It ensures that we can:

  • efficiently cache large amounts of data,
  • iterate on each side of the interface independently,
  • and catch errors and potential bugs much earlier.

Thank you, Avro and avsc! 🚀


Smartcar is the connected car API that allows mobile and web apps to communicate with connected vehicles across brands (think “check odometer” or “unlock doors”).

Want to take our API for a spin? Check out our docs and get started with our demo app! Have any questions or feedback? Shoot us an email!
