Recently at Smartcar, we started to build an integrated billing infrastructure that facilitates, tracks, and automates all payment processing for our API platform. Our first step in this process was to create a reliable billing service that would allow us to store and retrieve all of our users’ API usage data.
But before getting started, we needed to make a couple of decisions: Which type of data store should we use? What types of data should we store? And how should we build the data store? This blog post will outline why we chose to build a relational data model in PostgreSQL, why we chose to build it as a microservice, and why we're using Apache Avro for data serialization.
We hope that you’ll find this post useful when creating a data store for your own product. Happy reading and happy billing, everyone! 💰💳
Our first step in creating a data store was to choose what type of database would best serve our use case. To make this decision easier, we created a list of criteria that we found most important:
With those criteria in mind, we evaluated a few different databases. The table below shows whether each type of database is able to entirely meet each criterion.
When looking at this comparison, we quickly identified Postgres as the winner:
To ensure both durability and queryability, we decided that a relational database management system (RDBMS) like Postgres was the right choice for us. Postgres gives us transactional guarantees that makes our data durable. Data stored in relational databases is also highly queryable. Unlike noSQL alternatives or in-memory data store variants, relational databases support structured data organization and storage using schemas and tables. This allows us to easily construct our queries to calculate billing and flexibly change queries, should our business requirements change. A Postgres database is also extensible, as it can support different types of queries while keeping the same structure. Finally, as Postgres is open source, it offers a whole ecosystem of tools to export and transport data from our billing database to other services.
In an ideal world, we would want to track and store all types of requests made to Smartcar, but this would require a lot of instrumentation work. To save time and effort, we prioritized to only track what we called “billable” requests, with the possibility of expanding that scope in the future. We defined a billable request as a successful API request made to a real vehicle. For each billable request, we store only the data that’s necessary for invoicing and auditing, meaning:
Now that we decided on a type of data store and what data to store, it was time to choose an architecture and get building!
We knew that multiple services on Smartcar’s platform would need to interact with our billing database:
As several services would depend on our billing database, we decided to build it as a separate microservice. In its code base, we wanted to apply the logic of doing one thing really well. We looked for inspiration and decided to follow the principle of decoupled persistent storage. This Medium Engineering post about monolithic and decoupled data persistent data storage helped us make this decision.
Now that we had decided to build our data store as a separate microservice, we had to find a way to easily communicate to and from it.
We considered to either build an internal JSON-based HTTP server interface, or to use a binary serialization format with a Remote Procedure Call (RPC) framework like Apache Avro. We decided to go with the latter, for the following reasons:
With those three factors in mind, using Avro was the natural choice for us.
Let’s recap the steps we took when creating our first billing data store at Smartcar. We decided to:
This is how our billing data store interacts with other services on our platform:
We are now a few months into using this data store as the single source of truth for all our billing data. In this short amount of time, the way we architected the data store has already saved us a colossal amount of engineering work, and has improved user experience for the developers using our API. We have also begun building other services that depend on our billing data store, and we’re looking forward to using, testing, and improving it more as we build out our entire billing infrastructure.
If you've made it all the way to the end of this blog post, and this topic sounds like an interesting challenge to you, we'll tell you a secret: 🔐👉 We’re hiring!