The concept of “production readiness” is highly influenced by the product we make, and by the culture and the people involved in designing, developing and maintaining the code.
Having a common understanding of what production-ready code looks like has been fundamental in building shared libraries that facilitate our work. It also enables internal mobility, letting our engineers change teams, and helps us evaluate candidates for our engineering roles using the same approach and measures.
As engineers at SuperAwesome, we all review pull requests and code challenge submissions, and as reviewers we expect our teammates to give their thumbs up only when the code is “production-ready”. Here’s an outline of the process we use.
The User, the Team, and the System
The code powering our products serves three main stakeholders, each with their own needs.
The user’s top priority is that the program they run accomplishes its task in a reasonable amount of time with no negative side effects.
Therefore, when we review a piece of code through the user’s lens we start by carefully reading the requirements, and making sure each of them is fulfilled by the code.
We review the code's performance by analysing the runtime complexity of its functions.
Not everything needs to be optimised for performance; we also measure the success of our products by user adoption. Therefore, most of the time the right trade-off is the one that allows us to deliver a reasonably well-performing feature to our users in the shortest time possible. We can optimise for better performance later, following user feedback.
Technically speaking, O(N^2) complexity is only borderline reasonable, and O(2^N) is well beyond anything we would consider reasonable performance. In some conditions, such as when we iterate over a very small dataset, algorithms with these runtime complexities can be acceptable, but most of the time they will get flagged in our code reviews, as they can result in sluggish execution and a bad user experience.
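To illustrate the kind of thing a reviewer might flag, here is a minimal sketch of two ways to check a list for duplicates. The function names are hypothetical and the example is not taken from our codebase; it only shows how the same requirement can be met with O(N^2) or with roughly O(N) work.

```python
from typing import Hashable, Iterable, Sequence


def has_duplicates_quadratic(items: Sequence[Hashable]) -> bool:
    """O(N^2): compares every pair of items."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items: Iterable[Hashable]) -> bool:
    """O(N) on average: remembers the items already seen in a set."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

On a handful of items the two behave the same from the user's point of view, which is why the quadratic version can be acceptable for very small datasets. The linear version trades extra memory for the set against far fewer comparisons as the input grows.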
Double down on kids’ data privacy
Kids are a large portion of the global user base, and although they share the same sensitivity adults have to visible side effects such as bugs or crashes, they are more vulnerable to side effects that can compromise their privacy.
We are very serious about kids’ online privacy, and each engineer gets trained in COPPA and GDPR-K in the first months of their career at SuperAwesome.
Kidtech is built on the principle of “privacy-by-design”, hence production-ready kidtech code should never collect PII inadvertently (should such information be needed, verified parental consent must be obtained).
When we review a pull request, we scrutinise the code to verify that IP addresses are truncated, that user agents are stripped of unnecessary information, and that online identifiers are never stored in cookies, to prevent fingerprinting. If third-party libraries or services are used, they must already have been vetted by our team and marked as safe for kids.
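As an illustration of IP truncation, the sketch below zeroes the host portion of an address using Python's standard `ipaddress` module. The function name and the prefix lengths (/24 for IPv4, /48 for IPv6) are assumptions chosen for the example, not the values we use in production.

```python
import ipaddress


def truncate_ip(address: str, v4_prefix: int = 24, v6_prefix: int = 48) -> str:
    """Return the address with every bit after the prefix set to zero.

    The default prefix lengths are illustrative assumptions.
    """
    ip = ipaddress.ip_address(address)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets us build the network from a host address.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


print(truncate_ip("203.0.113.42"))           # 203.0.113.0
print(truncate_ip("2001:db8:abcd:1234::1"))  # 2001:db8:abcd::
```

Truncating before the value is logged or stored means the full address never reaches persistent storage, which is the property a reviewer would look for.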
The team needs to retain the capability to iterate on any system, regardless of the team member who wrote it. Hence, the team’s take on production-readiness is that code must be understandable, maintainable and extensible.
At SuperAwesome we are a team of generalists, and although each individual has their preferred part of the stack, we expect every team member to have a holistic view of the problems at hand, and to design a solution from front-end to back-end to infrastructure.
Writing performant code can require multiple iterations, and while production-ready code might not be optimised for performance on the first pass, it is always optimised for readability and extensibility.
We do this by ensuring that class, function and variable names are self-explanatory, that SOLID principles are followed, that trade-offs are made explicit either in readme files or inline docs, and that dependencies are clear and captured in consumer-driven contracts when possible.
Automated tests are our executable documentation, and they follow the same readability rules as every other piece of code.
We encourage every team member to practise TDD, and we expect the large majority of our code to be covered by well-written, comprehensive tests.
A good suite of tests creates a safe environment that enables the whole team to iterate fast and ship small incremental improvements to the users multiple times a day, without any negative side effects.
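As a small sketch of tests that read like documentation, the example below uses Python's standard `unittest` module. The `clamp_page_size` helper and its limits are hypothetical, invented only to give the tests something to describe.

```python
import unittest


def clamp_page_size(requested: int, maximum: int = 100) -> int:
    """Return a page size between 1 and `maximum` (hypothetical helper)."""
    return max(1, min(requested, maximum))


class ClampPageSizeTest(unittest.TestCase):
    # Each test name states one rule of the behaviour in plain words.

    def test_keeps_a_request_within_the_allowed_range(self):
        self.assertEqual(clamp_page_size(25), 25)

    def test_caps_a_request_above_the_maximum(self):
        self.assertEqual(clamp_page_size(5000), 100)

    def test_raises_a_non_positive_request_to_one(self):
        self.assertEqual(clamp_page_size(0), 1)
```

Someone new to the code can read the three test names and learn the rules of the function without opening its implementation.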
The System can be a deceptive stakeholder: it will allow any sort of code to run, and eventually it will leave the User or the Team dealing with the consequences.
Production-ready code ensures that the User and the Team have the best possible experience.
In practice, we want our code to play nicely with the system resources, consider memory constraints and security concerns, and handle failure gracefully.
This requires a high level of operational awareness, which is why at SuperAwesome we embraced the DevOps culture and trained all our engineers to be familiar with the infrastructure running our code.
During a code review, we look for the common sources of failure in a distributed system, such as sluggish I/O or networking, sudden instance termination, scarce memory availability and malicious attacks. Thinking about these aspects as fundamental system requirements helps us develop a resilient system capable of serving billions of kid-safe transactions every month.
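One common way to handle transient network failures gracefully is to retry with exponential backoff. The sketch below is a generic illustration under assumed defaults (three attempts, a 0.1 second base delay); the function name and parameters are hypothetical and do not describe our actual infrastructure code.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on connection errors and timeouts.

    The delay doubles after each failed attempt. `sleep` is injectable
    so that tests do not have to wait for real time to pass.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Only errors that are plausibly transient are retried; anything else propagates immediately, so a programming error is not hidden behind repeated attempts.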
As SuperAwesome grows we’ll iterate further on our definition of production-readiness and this document will evolve. Do you want to have your say on it? We’re always looking for bright minds to add to our great team of kidtech engineers. Have a look at our careers page and come join us!
Piergiorgio Niero is Head of Engineering at SuperAwesome.