Archives November 2023

Going Global: A Deep Dive to Build an Internationalization Framework

Key Takeaways

  • Internationalization (i18n) and localization are critical processes in web development: internationalization ensures software can be adapted for different languages and regions, while localization is the actual adaptation of the software to meet those specific requirements.
  • Though JavaScript-focused i18n libraries (like i18next, react-intl, and react-i18next) are dominant tools in the field, aiding developers in efficiently handling translations and locale-specific configurations, they are only available for JavaScript-based web applications. There is a need for a language-agnostic framework for internationalization.
  • JSON is a widely-accepted format for storing translations and locale-specific configurations, allowing for easy integration and dynamic content replacement in various applications irrespective of the language and framework used.
  • Content Delivery Networks (CDN) can be strategically used to efficiently serve locale-specific configuration files, mitigating potential downsides of loading large configurations.
  • Building and integrating a custom internationalization framework with databases or data storage solutions enables dynamic and context-aware translations, enhancing the user experience for different regions and languages.

Dipping your toes into the vast ocean of web development? You’ll soon realize that the web isn’t just for English speakers — it’s global. Before you’re swamped with complaints from a user in France staring at a confusing English-only error message, let’s talk about internationalization (often abbreviated as i18n) and localization.

What’s the i18n Buzz About?

Imagine a world where your software speaks fluently to everyone, irrespective of their native tongue. That’s what internationalization and localization achieve. While brushing it off is tempting, remember that localizing your app isn’t just about translating text. It’s about offering a tailored experience that resonates with your user’s culture, region, and language preferences.

However, a snag awaits. Dive into the tool chest of i18n libraries, and you’ll notice a dominance of JavaScript-focused solutions, particularly those orbiting React (like i18next, react-intl, and react-i18next).

Venture outside this JavaScript universe, and the choices start thinning out. More so, these readily available tools often wear a one-size-fits-all tag, lacking the finesse to cater to unique use cases.

But fret not! If the shoe doesn’t fit, why not craft one yourself? Stick around, and we’ll guide you on building an internationalization framework from scratch — a solution that’s tailored to your app and versatile across languages and frameworks.

Ready to give your application a global passport? Let’s embark on this journey.

The Basic Approach

One straightforward way to grasp the essence of internationalization is by employing a function that fetches messages based on the user’s locale. Below is an example crafted in Java, which offers a basic yet effective glimpse into the process:

public class InternationalizationExample {

    public static void main(String[] args) {
        System.out.println(getWelcomeMessage(getUserLocale()));
    }

    public static String getWelcomeMessage(String locale) {
        switch (locale) {
            case "en_US":
                return "Hello, World!";
            case "fr_FR":
                return "Bonjour le Monde!";
            case "es_ES":
                return "Hola Mundo!";
            default:
                return "Hello, World!";
        }
    }

    public static String getUserLocale() {
        // This is a placeholder method. In a real-world scenario,
        // you'd fetch the user's locale from their settings or system configuration.
        return "en_US";  // This is just an example.
    }
}

In the example above, the getWelcomeMessage function returns a welcome message in the language specified by the locale. The locale is determined by the getUserLocale method. This approach, though basic, showcases the principle of serving content based on user-specific locales.
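In a real application, the locale would typically be derived from the runtime environment rather than hard-coded. The sketch below shows one possible way to do this for a web application, assuming an HTTP Accept-Language header value is available and that the application maintains its own list of supported locales; both are illustrative assumptions, not part of the example above.

import java.util.List;
import java.util.Locale;

public class LocaleResolver {

    // Locales the application actually ships translations for (assumption for illustration).
    private static final List<Locale> SUPPORTED =
            List.of(Locale.US, Locale.FRANCE, new Locale("es", "ES"));

    // Resolves the best supported locale from an Accept-Language header,
    // e.g. "fr-FR,fr;q=0.9,en;q=0.8", falling back to en_US.
    public static String resolveLocale(String acceptLanguageHeader) {
        if (acceptLanguageHeader == null || acceptLanguageHeader.isBlank()) {
            return "en_US";
        }
        List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguageHeader);
        Locale match = Locale.lookup(ranges, SUPPORTED);
        return match != null ? match.getLanguage() + "_" + match.getCountry() : "en_US";
    }
}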

However, as we move forward, we’ll dive into more advanced techniques and see why this basic approach might not be scalable or efficient for larger applications.

Pros:

  • Extensive Coverage — Given that all translations are embedded within the code, you can potentially cater to many languages without worrying about external dependencies or missing translations.
  • No Network Calls — Translations are fetched directly from the code, eliminating the need for any network overhead or latency associated with fetching translations from an external source.
  • Easy Code Search — Since all translations are part of the source code, searching for specific translations or troubleshooting related issues becomes straightforward.
  • Readability — Developers can instantly understand the flow and the logic behind choosing a particular translation, simplifying debugging and maintenance.
  • Reduced External Dependencies — There’s no reliance on external translation services or databases, which means one less point of failure in your application.

Cons:

  • Updates Require New Versions — In the context of mobile apps or standalone applications, adding a new language or tweaking existing translations would necessitate users to download and update to the latest version of the app.
  • Redundant Code — As the number of supported languages grows, the switch or conditional statements would grow proportionally, leading to repetitive and bloated code.
  • Merge Conflicts — With multiple developers possibly working on various language additions or modifications, there’s an increased risk of merge conflicts in version control systems.
  • Maintenance Challenges — Over time, as the application scales and supports more locales, managing and updating translations directly in the code becomes cumbersome and error-prone.
  • Limited Flexibility — Adding features like pluralization, context-specific translations, or dynamically fetched translations with such a static approach is hard.
  • Performance Overhead — For high-scale applications, loading large chunks of translation data when only a tiny fraction is used can strain resources, leading to inefficiencies.

Config-Based Internationalization

Building on the previous approach, we aim to retain its advantages and simultaneously address its shortcomings. To accomplish this, we’ll transition from hard-coded string values in the codebase to a config-based setup. We’ll utilize separate configuration files for each locale, encoded in JSON format. This modular approach simplifies the addition or modification of translations without making code changes.

Here’s how a configuration might look for the English and Spanish locales:

Filename: en.json

{
    "welcome_message": "Hello, World"
}

Filename: es.json

{
    "welcome_message": "Hola, Mundo"
}

Implementation in Java:

First, we need a way to read the JSON files. This often involves utilizing a library like Jackson or GSON. For the sake of this example, we’ll use Jackson.

import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.File;
import java.io.IOException;
import java.util.Map;

public class Internationalization {

    private static final String CONFIG_PATH = "/path_to_configs/";
    private final Map<String, String> translations;

    public Internationalization(String locale) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        // Read the locale-specific JSON file into a key-value map
        translations = mapper.readValue(
                new File(CONFIG_PATH + locale + ".json"),
                new TypeReference<Map<String, String>>() {});
    }

    public String getTranslation(String key) {
        return translations.getOrDefault(key, "Key not found!");
    }
}

class Program {

    public static void main(String[] args) throws IOException {
        Internationalization i18n = new Internationalization(getUserLocale());
        System.out.println(i18n.getTranslation("welcome_message"));
    }

    private static String getUserLocale() {
        // This method should be implemented to fetch the user's locale.
        // For now, let's just return "en" for simplicity.
        return "en";
    }
}

In the above code, the Internationalization class reads the relevant JSON configuration for the provided locale when it is instantiated. The getTranslation method then fetches the desired translated string by its key.

Pros:

  • Retains all the benefits of the previous approach — It offers extensive coverage, no network calls for translations once loaded, and the code remains easily searchable and readable.
  • Dynamic Loading — Translations can be loaded dynamically based on the user’s locale. Only necessary translations are loaded, leading to potential performance benefits.
  • Scalability — It’s easier to add a new language. Simply add a new configuration file for that locale, and the application can handle it without any code changes.
  • Cleaner Code — The logic is separated from the translations, leading to cleaner, more maintainable code.
  • Centralized Management — All translations are in centralized files, making it easier to manage, review, and update. This approach provides a more scalable and cleaner way to handle internationalization, especially for larger applications.

Cons:

  • Potential for Large Config Files — As the application grows and supports multiple languages, the size of these config files can become quite large. This can introduce a lag in the initial loading of the application, especially if the config is loaded upfront.

Fetching Config from a CDN

One way to mitigate the downside of potentially large config files is to host them on a Content Delivery Network (CDN). By doing so, the application can load only the necessary config file based on the user’s locale. This ensures that the application remains fast and reduces the amount of unnecessary data the user has to download. As the user switches locales or detects a different locale, the relevant config can be fetched from the CDN as required. This provides an optimal balance between speed and flexibility in a high-scale application. For simplicity, let’s consider you’re using a basic HTTP library to fetch the config file. We’ll use the fictional HttpUtil library in this Java example:

import java.util.Map;
import org.json.JSONObject;

public class InternationalizationService {

    private static final String CDN_BASE_URL = "https://cdn.example.com/locales/";

    public String getTranslatedString(String key) {
        String locale = getUserLocale();
        String configContent = fetchConfigFromCDN(locale);
        JSONObject configJson = new JSONObject(configContent);
        return configJson.optString(key, "Translation not found");
    }

    private String fetchConfigFromCDN(String locale) {
        String url = CDN_BASE_URL + locale + ".json";
        return HttpUtil.get(url);  // Assuming this method fetches content from a given URL
    }

    private String getUserLocale() {
        // Implement method to get the user's locale
        // This can be fetched from user preferences, system settings, etc.
        return "en";  // Defaulting to English for this example
    }
}

Note: The above code is a simplified example and may require error handling, caching mechanisms, and other optimizations in a real-world scenario.

The idea here is to fetch the necessary config file based on the user’s locale directly from the CDN. The user’s locale determines the URL of the config file, and once fetched, the config is parsed to get the required translation. If the key isn’t found, a default message is returned. The benefit of this approach is that the application only loads the necessary translations, ensuring optimal performance.

Pros:

  • Inherits all advantages of the previous approach.
  • Easy to organize and add translations for new locales.
  • Efficient loading due to fetching only necessary translations.

Cons:

  • A large config file for a given locale might still slow the application's initial load.
  • Strings must be static. Dynamic strings or strings that require runtime computation aren’t supported directly. This can be a limitation if you need to insert dynamic data within your translations.
  • Dependency on an external service (CDN). If the CDN fails or has issues, the application’s ability to fetch translations is impaired.

However, to address the cons: the first can be mitigated by splitting the configuration per locale and loading each file from the CDN only when required. The second can be managed by using placeholders in the static strings and replacing them at runtime based on context. The third requires a robust error-handling mechanism and potentially some fallback strategies.
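To make those mitigations concrete, the sketch below wraps the CDN fetch with an in-memory cache, a fallback locale, and placeholder substitution. It reuses the fictional HttpUtil client from the earlier example; the cache and fallback design are illustrative assumptions, not a prescribed implementation.

import java.text.MessageFormat;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.json.JSONObject;

public class CachingInternationalizationService {

    private static final String CDN_BASE_URL = "https://cdn.example.com/locales/";
    private static final String FALLBACK_LOCALE = "en";

    // Cache one parsed config per locale so the CDN is hit at most once per locale.
    private final Map<String, JSONObject> cache = new ConcurrentHashMap<>();

    public String getTranslatedString(String locale, String key, Object... args) {
        JSONObject config = cache.computeIfAbsent(locale, this::loadConfig);
        String template = config.optString(key, null);
        if (template == null) {
            return "Translation not found";
        }
        // Replace {0}, {1}, ... placeholders with runtime values.
        return MessageFormat.format(template, args);
    }

    private JSONObject loadConfig(String locale) {
        try {
            return new JSONObject(HttpUtil.get(CDN_BASE_URL + locale + ".json"));
        } catch (Exception e) {
            // If the CDN is unavailable, fall back to the default locale,
            // which could itself be bundled with the application.
            if (!FALLBACK_LOCALE.equals(locale)) {
                return loadConfig(FALLBACK_LOCALE);
            }
            return new JSONObject();
        }
    }
}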

Dynamic String Handling

A more flexible solution is required for situations where parts of the translation string are dynamic. Let’s take Facebook as a real-life example. In News Feed, you would have seen custom strings to represent the “Likes” for each post. If there is only one like on a post, you may see the string “John likes your post.” If there are two likes, you may see “John and David like your post.” If there are more than two likes, you may see “John, David and 100 others like your post.” In this use case, several customizations are required. The verbs “like” and “likes” are used based on the number of people who liked the post. How is this done?

Consider the example: “John, David and 100 other people recently reacted to your post.” Here, “David,” “John,” “100,” “people,” and “reacted” are dynamic elements.

Let’s break this down:

  • “David” and “John” could be user names fetched from some user-related methods or databases.
  • “100” could be the total number of people reacting on a post excluding David and John, fetched from some post-related methods or databases.
  • “people” could be the plural form of the noun person when referring to a collective group.
  • “reacted” could be used when the user responds to a post with the heart, care, or anger icon instead of liking it.

One way to accommodate such dynamic content is to use placeholders in our configuration files and replace them at runtime based on context.

Here’s a Java example:

Configuration File (for English locale):

{
      "oneUserAction": "{0} {1} your post",
      "twoUserAction": "{0} and {1} {2} your post",
      "multiUserAction": "{0}, {1} and {2} other {3} recently {4} to your post",
      "people": "people",
      "likeSingular": "likes",
      "likePlural": "like"
}

Configuration File (for French locale):

{
      "oneUserAction": "{0} {1} votre publication",
      "twoUserAction": "{0} et {1} {2} votre publication",
      "multiUserAction": "{0}, {1} et {2} autres {3} ont récemment {4} à votre publication",
      "people": "personnes",
      "likeSingular": "aime",
      "likePlural": "aiment"
}

Java Implementation:

import java.text.MessageFormat;
import java.util.Locale;
import java.util.ResourceBundle;

public class InternationalizationExample {

    public static void main(String[] args) {
        // English examples
        System.out.println(createMessage("David", null, 1, new Locale("en", "US"))); // One user
        System.out.println(createMessage("David", "John", 2, new Locale("en", "US"))); // Two users
        System.out.println(createMessage("David", "John", 100, new Locale("en", "US"))); // Multiple users

        // French examples
        System.out.println(createMessage("David", null, 1, new Locale("fr", "FR"))); // One user
        System.out.println(createMessage("David", "John", 2, new Locale("fr", "FR"))); // Two users
        System.out.println(createMessage("David", "John", 100, new Locale("fr", "FR"))); // Multiple users
    }

    private static String createMessage(String user1, String user2, int count, Locale locale) {
        // Load the appropriate resource bundle for the locale
        ResourceBundle messages = ResourceBundle.getBundle("MessagesBundle", locale);

        if (count == 0) {
            return ""; // No likes received
        } else if (count == 1) {
            // MessageFormat fills in the {0}, {1}, ... placeholders
            return MessageFormat.format(
                  messages.getString("oneUserAction"),
                  user1,
                  messages.getString("likeSingular")
            ); // For one like, returns "David likes your post"
        } else if (count == 2) {
            return MessageFormat.format(
                  messages.getString("twoUserAction"),
                  user1,
                  user2,
                  messages.getString("likePlural")
            ); // For two likes, returns "David and John like your post"
        } else {
            return MessageFormat.format(
                  messages.getString("multiUserAction"),
                  user1,
                  user2,
                  count,
                  messages.getString("people"),
                  messages.getString("likePlural")
            ); // For more than two likes, e.g. "David, John and 100 other people recently like to your post"
        }
    }
}
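One caveat: ResourceBundle.getBundle("MessagesBundle", locale) loads Java properties files from the classpath rather than the JSON files shown above. Assuming the same key-value pairs are also packaged as resource bundles, the English bundle might look like the following sketch (the file name follows the default ResourceBundle naming convention):

Filename: MessagesBundle_en_US.properties

# English message templates, keyed identically to the JSON configuration above
oneUserAction={0} {1} your post
twoUserAction={0} and {1} {2} your post
multiUserAction={0}, {1} and {2} other {3} recently {4} to your post
people=people
likeSingular=likes
likePlural=like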

Conclusion

Developing an effective internationalization (i18n) and localization (l10n) framework is crucial for software applications, regardless of size. This approach ensures your application resonates with users in their native language and cultural context. While string translation is a critical aspect of i18n and l10n, it represents only one facet of the broader challenge of globalizing software.

Effective localization goes beyond mere translation, addressing other critical aspects such as writing direction, which varies in languages like Arabic (right-to-left), and text length or size, where languages like Tamil may feature longer words than English. By meticulously customizing these strategies to meet specific localization needs, you can deliver a truly global and culturally sensitive user experience.

Adopting Asynchronous Collaboration in Distributed Software Teams

Key Takeaways

  • A meeting-centric way of working on distributed teams can undermine deep work and flow, inclusion, flexible work and, in the long run, knowledge sharing. It also doesn’t lend itself to scale. Choosing asynchronous ways to collaborate can be an effective alternative to this meeting-centric approach.
  • Being “async-first” is not about being “async-only”. It’s about recognising the benefits of asynchronous and synchronous communication patterns and being thoughtful about when you use either. The default, of course, is to start most collaborations, asynchronously.
  • To adopt an async-first approach, you must find workable asynchronous collaboration alternatives, such as replacing status update meetings with up-to-date task boards or replacing onboarding presentations with self-paced videos.
  • Conduct a baselining exercise to assess your team’s current maturity with distributed work practices. The exercise will help you identify outcomes that asynchronous collaboration can help improve.
  • To accelerate your async-first shift, document your workflow, simplify your decision-making process, create a team handbook, and conduct a meeting audit.

Do you work on a distributed team? If so, you know that meetings can be a major time-sink. While meetings can be valuable, if we reach for them as a default way of working, we inadvertently create a fragmented team calendar. Such an interrupted schedule can be a major drain on productivity, especially for knowledge workers who need time to focus on deep work.

In this article, we’ll discuss the benefits of asynchronous collaboration and how to implement it on your team.

Just. Too many. Meetings.

“We have too many meetings!” As a leader or a member of a distributed team, you probably hear this complaint all the time. It won’t be surprising if you heard it last when you were in a meeting. While the pandemic drove up the number of meetings we find ourselves in, many teams have always had a meeting-centric collaboration approach.

Between design discussions, status updates, reporting calls, daily standups, planning meetings, development kick-offs, desk checks, demos, reviews and retrospectives, an average software development iteration packs in several such “synchronous” commitments. And those aren’t the only meetings we have!

Ad hoc conversations, troubleshooting sessions, brainstorms, problem-solving, and decision-making – they all become meetings on distributed teams.

Between 2021 and 2022, I conducted an informal survey targeting over 1800 technologists in India. I wanted to understand their distributed work patterns.

Amongst other questions, the survey asked respondents to estimate how many hours of meetings they attended each week. It also asked people to estimate how many times they checked instant messages each day. The responses were astonishing.

  • The average technologist spent 14 hours in meetings each week, which amounts to 80 days each year in meetings. As you may guess, the most experienced technologists face even more of the meeting burden.
  • People reported facing 18 interruptions on average, each day. The culprit? Instant messaging – a tool we employ for “quick responses”.

 
A slice of my survey results

Software development is inherently a creative process, whether you’re writing code, crafting an interaction or designing a screen. Leadership on software teams is a creative exercise too. Many of us spend time building technical roadmaps, articulating architectural decisions, prototyping, designing algorithms and analysing code. As Paul Graham noted in his 2009 essay, software development is a “maker’s” job (i.e. a job that’s creative), which needs a “maker’s schedule”. “You can’t write or program well in units of an hour. That’s barely enough time to get started,” Graham said back in the day. On a maker’s schedule, where people need “units of half a day at least”, meetings and interruptions are productivity kryptonite.

Unsurprisingly, this sentiment showed up in my surveys as well. The average technologist desires about 20 hours of deep work each week, but they get just 11. Imagine how much more every team can achieve if we could somehow buy back those lost hours of deep work.

Enter, “async-first” collaboration. It’s a simple concept that prioritises asynchronous collaboration over synchronous collaboration. I describe it through three underlying principles.

  1. Meetings are the last resort, not the first option.
  2. Writing is the primary means of communicating information in a distributed team.
  3. Everyone on the team builds comfort with reasonable lags in communication.

I understand if you’re dismayed by the first principle. In many organisational settings, meetings are indeed the primary tool for leaders to do their work. We can, however, agree on two fundamental observations.

  • For most of your team, a trigger-happy approach to meetings can lead to an interrupted, unsatisfying, and even unproductive day. As leaders, we owe it to our team to create a work environment where they can experience a state of flow.
  • Many meetings are unproductive because of a lack of prep, a lack of subsequent documentation and a poorly identified audience. When we go “async-first”, we force ourselves to prepare well for a synchronous interaction if we need one. It helps us think about who we need for the meeting and how we’ll communicate the outputs of the meeting to people who don’t attend it.

So, instead of reaching for a meeting the next time you wish to collaborate with your colleagues, consider if a slower, asynchronous medium may be more effective. You can use collaborative documents that support inline discussion, wikis, recorded video and even messaging and email.

Allow me to explain why you and your team will benefit from this async-first mindset and how you can adopt it, in your context.

The six benefits of going async-first

Ever since the pandemic-induced remote work revolution, knowledge workers now expect employers to offer location and time flexibility. Return-to-office drives from various employers notwithstanding, it’s fair to assume that some of our colleagues will always be remote in relation to us. This apart, even in a recession, companies are struggling to fill open tech positions. When markets pick up as they inevitably will, at some point, tech talent will feel even more scarce. Companies will have to hire people from wherever they can find them, and they’ll have to integrate them into their teams. Tech teams will be “distributed by default”. This is where an “async-first” collaboration approach can outshine the meeting-centric status quo.

Here are the six benefits, and how asynchronous work helps with each:

  • Work-life balance: When your team has people from across the country or even across the world, you don’t have to force-fit them into working a specific set of hours. People can choose the work hours that work best for them. Their life doesn’t have to be compromised by their work.
  • Diversity and inclusion: In synchronous interactions (i.e. meetings), we often don’t hear from introverts, non-native English speakers and neurodiverse people. Often the loudest, most fluent and most experienced voices win out. In an async-first environment, people who aren’t confident speaking can use the safety of tools and writing to communicate freely. And since async-first collaboration lets people work during the hours convenient to them, it helps people with various personal situations become part of your team.
  • Improved onboarding and knowledge sharing: Writing is the primary mode of communication on an async-first team. Regular writing and curation help build up referenceable team knowledge that you can use for knowledge sharing and onboarding. It also reduces FOMO.
  • Communication practices that support scale: Everyone can’t be in every meeting; that’s the perfect way to burn out your colleagues. People can read faster than they listen, and they don’t remember every piece of information they encounter. This makes referenceability critical, especially on distributed teams. Writing allows your team to share information at scale: it’s referenceable, fast to consume and easy to modify and update.
  • Deep work: When you don’t have unnecessary meetings on your calendar, you can free up large chunks of time to get deep, complex work done without interruptions. This has a direct impact on the quality of the work you do as a team.
  • A bias for action: When you work asynchronously, you’ll inevitably get stuck at some point. The usual choice in a meeting-centric culture would be to get a bunch of people on Zoom to make a decision. After all, you don’t want to screw up, do you? In an async-first culture, though, we adopt a bias for action. We make the best decision we can at the moment, document it, and move on. The focus is on getting things done. If something’s wrong, we learn from it, refactor and adapt. The bias for action improves the team’s ability to make and record decisions. Decisions are, after all, the fuel for high-performing teams.

All these benefits aside, an async-first culture also helps you improve your meetings. When you make meetings the last resort, the meetings you have, are the ones you need. You’ll gain back the time and focus to make these few, purposeful meetings useful to everyone who attends.

Async-first is not async-only

Going async-first does not mean that synchronous interactions are not valuable. Team interactions benefit from a fine balance between asynchronous and synchronous collaboration. On distributed teams, this balance should tilt towards the asynchronous, lest you fill everyone’s calendars with 80 days of meetings a year. That said, you can’t ignore the value of synchronous collaboration.

Written communication allows you to be slow, deliberate and thoughtful about your communication. You can write something up independently and share it with your colleagues without having to get on a call. It also allows you to achieve depth that you may not achieve in a fast-paced, real-time conversation. Everything you write is reusable and referenceable in the future as well. These are the reasons that asynchronous communication can be very effective. Let’s look at a few examples.

  1. Imagine an architect documenting a proposal to use a new library by explaining its various dimensions, e.g. integration plan, testing, validation, risks and alternatives. It can help bring the entire team and business stakeholders on board in a short time.
  2. When team members document the project in the flow of their work, through artefacts such as meeting minutes, architectural decision records, commit messages, pull requests, idea papers or design documents, it builds the collective memory of the project team. This helps you build the archaeology of your project, which explains how you got to your current state.
  3. Each of the above examples can benefit from both team members and leaders collaborating. For example, when a leader proposes to use the new library, team members can comment inline on the wiki page and share concerns, ideas, feedback and suggestions. Similarly, a junior team member can create a pull request to trigger a code review, while a team lead provides useful feedback when reviewing the pull request. This is how written communication becomes a collaborative exercise.

On the flip-side, synchronous communication helps you address urgent problems. This is where real-time messaging or video conferencing comes in handy. Every activity isn’t urgent though. Applying the pattern of synchronous work to non-urgent activities, often comes with the “price” of interruptions to flow. This is why we must balance synchrony and asynchrony!

Similarly, when you have to address a wide range of topics in a short time, or when you’re looking for spontaneous, unfiltered reactions and ideas you’ll want to go synchronous. And you’ll agree that after a while, most people want some “human” connection with their colleagues; especially on remote and distributed teams. This too, is where synchronous communication shines.

Asynchronous vs synchronous communication – the balance of values

So remember – “async-first” is not “async-only”. Think about the balance of collaboration patterns on your team as a set of tradeoffs. Use the right pattern for the right purpose.

How not to go async-first

The idea of async-first is deceptively simple. Meet less, write more, embrace lag – it’s an uncomplicated mantra. However, I’ve found that many teams struggle to adopt this approach. There are a few reasons for this.

  • Collaboration patterns work when everyone buys into them. That’s when the “network effect” kicks in. If a few people work async-first and the rest work in a meeting-centric way, you’re unlikely to see the benefits you’re aiming for.
  • Teams already practise many synchronous rituals that recur on their calendars – e.g. standups, huddles, reporting calls and planning. It’s hard to replace these rituals with async processes while still retaining their value, without careful thought.
  • Any change in collaboration patterns leads to a period of confusion and chaos. During this time, the team will see a dip in productivity before they see any benefits. Teams that don’t plan for this dip in productivity may give up on the change well before they’ve realised the benefits.

So, to make any new ways of working stick, you need business buy-in and team buy-in first. After that, you must go practice-by-practice and ensure that you don’t lose any value when introducing the new way of doing things. The same goes for async-first collaboration.

What it takes for a successful change to ways of working

Prepare to go “async-first” with your team

Assuming that the business sees the problem with too many meetings and so does the team, you should have some buy-in from these stakeholders already. To prioritise your areas of focus, I suggest polling your team to learn which of the six benefits they prize more than the others.

Reflecting as a team on the current state of your collaboration practices is a great way to prepare to be async-first. I recommend asking four sets of questions.

  • How diligent are you with the artefacts you create in the flow of your work? E.g. decision records, commit messages, READMEs and pull requests. Such artefacts lay the foundation of working out loud, in an async-first culture.
  • How disciplined are you with the meetings you run? Think about things like pre-defined agendas, nominated facilitators, time boxes and minutes. Do people have the safety to decline meetings where they won’t add or derive value? Are meeting sizes usually small; i.e. 8 people or fewer? Effective meetings are a side-effect of an effective async-first way of working.
  • How much time do you already get for deep work? How much time would you want instead, each day? What’s the deficit between the time the team wants, versus what it already gets?
  • How easy is it to onboard someone to your team? How much time does it take for someone to make their first commit, after they’ve got access to your systems? Onboarding as a process is a test for how well-documented your ways of working are.

The above questions will help you establish a baseline for your team’s maturity with distributed work practices. It’ll also help you identify areas for improvement that you can target, with your async-first shift. I suggest using a survey tool to assess this baseline, because it’s simple, and it’s … well … asynchronous!

Publish the results of this survey to the team and highlight the gaps, so they’re clear to everyone. That way it’s always clear why you’re going async-first, and what benefits you hope to get.

The fundamental shifts

In my experience every team’s shift to async-first ways of working is different. Indeed, teams that go through the baselining exercise I’ve recommended, often pick different areas to improve through asynchronous collaboration. There are a few fundamental shifts that I recommend every team makes.

Define your workflow

We all value “self-organising teams” and “autonomy” at work. However, as Cal Newport says in his book, “A World Without Email”, knowledge work is a combination of “work execution”, and “workflow”. While every individual should have autonomy over their work execution, teams must explicitly define their workflow and not leave it to guesswork. For example, you’ll notice that the screenshot below describes my current team’s two-track development approach. It spells out everyone’s responsibilities and how work flows through our team’s system. We’ve also set our team’s task board to implement this workflow.

Defining your workflow avoids confusion in async-first teams

When you document your development process this way, it reduces meetings that arise from the confusion of not knowing who does what, or what the next step is. It also becomes a ready reference for anyone new to the team. You don’t need a meeting to explain your development process to new colleagues. Write once and run many times!

Push decisions to the lowest level

Too often teams get into huddles and meetings for every single decision they make. Not only does this coordination have a cost, it also leads to frequent interruptions to work and, over time, to risk-averse teams. As a first step, work with your team to define the categories of decisions that are irreversible, in your project context. Don’t be surprised if there are very few such categories. Continuous delivery practices have matured to a point that almost all decisions in software are reversible, except the ones that have financial, compliance or regulatory impacts. Only these decisions need consensus-style decision-making. Other, reversible decisions can take a more lightweight approach.

For example, each time someone wants to decide something and wishes to seek other people’s inputs, they can write up the decision record and invite comments on it. If the decision is reversible, and there are no show-stopping concerns, they go ahead, while acting on workable suggestions. You can even organise large teams into small, short-lived pods of two or three people, along feature or capability lines, as you see in the diagram below. Each pod can have a directly responsible individual (DRI), who is accountable for decisions the pod makes. The DRI shouldn’t be an overhead. They should instead be a “first-amongst-equals” (FaE). Organising this way has three distinct features.

  1. Individuals can still adopt asynchronous decision-making techniques, but when there’s a difference of opinions, the DRI can play the tie-breaker
  2. You reduce the blast radius of communication, especially when you need a meeting. The pod owns its decisions for the capability it’s building. They decide and then share the record in a place where it’s visible to all other pods.
  3. You can still rotate people across pods to facilitate expertise sharing and resilience. That way you improve decision making autonomy while still helping team members build familiarity with the entire codebase.

Organising large teams into pods, to reduce the blast radius of decisions

Create a team handbook

High-quality documentation is a catalyst for asynchronous collaboration. Every software team should have one place to store team knowledge and documents – from decisions to reports to explainers to design documents and proposals. I call this the “team handbook”. There are ubiquitous tools that your team already uses, that you can employ for this purpose – Confluence, GitLab, SharePoint, Notion, Almanac, Mediawiki – are some that come to mind.

Spend some time planning the initial structure of your handbook with your team and organise lightweight barn-raising activities to set up the first version. From that point on, the team should own the handbook just as they collectively own their codebase. Improve it and restructure it as your team’s needs change.

An example of a team handbook

Reduce the mass of meetings

Teams that’ve been together for a while, often accumulate many rituals and commitments. On distributed teams, these commitments show up as meetings on the team calendar. It’s not uncommon for teams to be committed to 10 hours of meetings per sprint, simply through recurring meetings. To go async-first, it helps to trim down your team’s meeting list to the necessary ones. I use a framework that I call the ConveRel quadrants, to help teams identify which meetings they need and which meetings they can replace with asynchronous communication.

The framework is a standard 2×2 matrix. The x-axis represents the nature of communication. On the right side, you have “conveyance”. An example would be one-way information transfer. The left side of the axis stands for “convergence”. An example would be a workshop to make a high-stakes decision.

The y-axis denotes the strength of the relationship between the people communicating. The top half denotes a strong relationship and the bottom, a weak relationship.

You can plot your team’s meetings into the quadrants of this matrix to decide how they change when you go async-first.

  • If you’re only trying to convey information to people you have strong relationships with, don’t do meetings. Instead, make the information available online by sharing a document or wiki page or by sending an email or a message. This is as easy as it gets. Most status updates and reporting calls fit in this quadrant.
  • If the relationship is weak though, you may need some interactions to build up the relationship. Conveying information can be a contrivance for building that camaraderie. As the relationship gets stronger, replace these interactions with asynchronous ones.
  • If your relationship is weak and you’re trying to converge on a decision, I suggest defaulting to a meeting.
  • And finally, if your relationship is strong, and you’re trying to converge on a decision, you must collect all inputs asynchronously, do everything you can outside a meeting and then get together only for the last step.

The ConveRel quadrants

A simple meeting audit using the ConveRel quadrants will not only help you identify meetings you can replace with asynchronous processes, but also help you improve the meetings you keep.

With these fundamental shifts in place, you’ll have made some of the biggest improvements to help you work asynchronously. You can now work with your team to assess each practice that you execute over ad hoc video-calls and find asynchronous alternatives instead. This will take time, so address one or two improvements every development cycle and see how you go from there.

In summary

Async-first collaboration can help your distributed team become more efficient, inclusive, thoughtful, and fun to work in. It isn’t an overnight shift though. Many of us default to a synchronous way of working, because it mimics the office-culture which we were part of, before the pandemic. This, as we’ve learned, generates team calendars full of meetings!

If you wish to help your team be more asynchronous in their ways of working, you must enlist not just their support, but that of the business, so you have the space to absorb change. As part of this change process examine your team’s collaboration processes and be sure to find workable asynchronous alternatives. The idea is to not just reduce interruptions but also preserve the value that the team got from the synchronous variants.

Remember the value of reflection when directing this shift. A baselining exercise where you assess your team’s distributed work maturity, will help you mark the starting point. From that point, each change you influence should be in service of nudging your team towards the benefits it seeks. Start by working with your team to document your workflow, simplify your decision-making process, create a team handbook, and to conduct a meeting audit. That start will set the foundation for other, smaller, iterative improvements that you’ll make to each of your collaboration processes.

InfoQ Java Trends Report – November 2023

Key Takeaways

  • Java Virtual Threads were finalized in the recently released JDK 21, and we believe that adoption of this feature will continue to grow, as the latest editions of application frameworks, such as Helidon Níma and Vert.x, have already taken advantage of it.
  • Oracle has made a commitment to evolve the Java language so that students and beginners can more easily learn to write their first “Hello, world!” applications without the need to understand more complex features of the language.
  • Project Galahad, launched late last year, continues to aim to contribute Java-related GraalVM technologies to the OpenJDK Community and prepare them for possible incubation in a JDK mainline release.
  • There is increasing community interest in learning about modern microservice frameworks, such as Spring Boot, Quarkus, and Jakarta EE. The Spring Modulith project is now an official Spring project and allows the creation of better monoliths instead of microservices.
  • Since the release of Java 17, we’ve noticed faster adoption of newer Java versions than we did when Java 11 was released.
  • We see increasing development and application of Generative AI in the Java space, especially for code generation. We also see more development of SDKs or frameworks for AI and ML with Java, like Semantic Kernel, Deeplearning4J, djl, and Tribuo.

This report provides a summary of how the InfoQ Java editorial team currently sees the adoption of technology and emerging trends within the Java space. We focus on Java the language, as well as related languages like Kotlin and Scala, the Java Virtual Machine (JVM), and Java-based frameworks and utilities. We discuss trends in core Java, such as the adoption of new versions of Java, and also the evolution of frameworks such as Spring Framework, Jakarta EE, Quarkus, Micronaut, Helidon, MicroProfile and MicroStream.

You can also listen to the additional podcast discussion around Java Trends for 2023.

This report has two main goals:

  • To assist technical leaders in making mid- to long-term technology investment decisions.
  • To help individual developers in choosing where to invest their valuable time and resources for learning and skill development.

This is our fifth published Java trends report. However, this topic has received ample news coverage as we have been internally tracking Java and JVM trends since 2006.

To help navigate current and future trends at InfoQ and QCon, we make use of the “crossing the chasm” mental model for technology success pioneered by Geoffrey Moore in his book of the same name. We try to identify ideas that fit what Moore referred to as the early market, where “the customer base is made up of technology enthusiasts and visionaries who are looking to get ahead of either an opportunity or a looming problem.”

As we have done for the 2022, 2021, 2020 and 2019 Java trend reports, we present the internal topic graph for 2023:

For context, this was our internal topic graph for 2022:

Aside from several new technologies having been identified in the Innovator category, notable changes are described as follows.

Java 17+ has been recategorized as simply Java 17 and remains in the Early Adopter phase as more frameworks have committed to Java 17 as a baseline. Java 21 has been introduced in the Innovator category.

We have created a new label, Fast JVM Startup, with further refinements, Fast JVM Startup (CRaC), placed in the Innovators category, and Fast JVM Startup (GraalVM), placed in the Early Adopters phase. The rationale was to acknowledge the relatively new technologies that have been recently introduced to the Java community.

What follows is a lightly edited summary of the corresponding discussion on various topics among several InfoQ Java Queue editors and Java Champions:

  • Michael Redlich, Director at Garden State Java User Group and Java Queue Lead Editor at InfoQ. Retired Senior Research Technician at ExxonMobil Technology & Engineering Company
  • Johan Janssen, Software Architect at ASML and Java Queue Editor at InfoQ
  • Ixchel Ruiz, CDF Foundation Ambassador at The Linux Foundation
  • Alina Yurenko, Developer Advocate for GraalVM at Oracle Labs
  • Rustam Mehmandarov, Chief Engineer at Computas AS

We also acknowledge the Java Queue editors who provided input on updating our “crossing the chasm” model for 2023:

  • Ben Evans, Senior Principal Software Engineer at Red Hat, Java Queue Editor at InfoQ and Java Champion
  • Erik Costlow, Senior Director of Product Management and Java Queue Editor at InfoQ
  • Karsten Silz, Senior Full-Stack Java Developer and Java Queue Editor at InfoQ
  • Olimpiu Pop, Chief Technology Officer at mindit.io
  • Bazlur Rahman, Software Engineer and Java Champion
  • Shaaf Syed, Senior Principal Technical Marketing Manager at Red Hat

We believe this summary provides more context for our recommended positioning of some of the technologies on the internal topic graph.

GraalVM/Coordinated Restore at Checkpoint (CRaC)

Janssen: All the improvements in GraalVM and CRaC (Coordinated Restore at Checkpoint) to reduce the startup time of Java applications are impressive. It’s great to see the continuous improvements to GraalVM, and the integration with many frameworks makes it easy to use native images within your application. Apart from native image support, GraalVM also offers a Java runtime engine which may be used instead of the JVM from your vendor and may result in better performance for your application by just changing the runtime.

Redlich: Apart from the releases of JDK 20 and JDK 21, I believe the most significant change comes from Oracle Labs and GraalVM. Over the past year, we’ve seen: applicable parts of GraalVM technology being contributed to OpenJDK; the creation of Project Galahad, a project that will initially focus on the continued development and integration of the Graal just-in-time (JIT) compiler as an alternative to the existing HotSpot JIT compiler for possible inclusion in a future OpenJDK release; releases of GraalVM aligned with the releases of OpenJDK; and the elimination of GraalVM Enterprise in favor of a new license.

Java 17 and Beyond

Ruiz: The release cadence is not only bringing in new features in a more digestible way, but it is also allowing different users to pick them up and try them out. So overall, I have seen a good attitude towards early testing and richer feedback.

In a way, it has also simplified the roadmap for updating the Java version in production. The predictability allows for better synchronisation of development teams.

Yurenko: I see the speed of adoption of the latest Java versions increasing. This is something I often see being discussed at conferences, reflected in questions I receive, and also observed in the GraalVM Community Survey the GraalVM team ran last year — 63% of our users were already on Java 17 or higher.

Mehmandarov: We have Java 20 and Java 21 this year. Some of the most noticeable features are Record Patterns and Pattern Matching for switch that are finally out of preview. Those can be exciting features working with large amounts of data and can simplify the code.
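As a brief illustration of those two features (a sketch added here for context, not part of the discussion), record patterns and pattern matching for switch can be combined to deconstruct data directly inside a switch expression in Java 21:

sealed interface Shape permits Circle, Rectangle {}
record Circle(double radius) implements Shape {}
record Rectangle(double width, double height) implements Shape {}

public class PatternMatchingExample {

    static double area(Shape shape) {
        // Record patterns deconstruct the record components in place;
        // the switch is exhaustive because Shape is sealed, so no default is needed.
        return switch (shape) {
            case Circle(double radius) -> Math.PI * radius * radius;
            case Rectangle(double width, double height) -> width * height;
        };
    }

    public static void main(String[] args) {
        System.out.println(area(new Circle(2.0)));          // 12.566...
        System.out.println(area(new Rectangle(3.0, 4.0)));  // 12.0
    }
}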

Native Java (GraalVM/Spring Native/Project Leyden)

Yurenko: I might be biased, but I see a lot of projects and libraries adopting GraalVM, particularly Native Image. Spring Boot now supports Native Image out of the box along with other popular Java frameworks, and I see many libraries that have added support as well.

Java for Beginners

JEP 445, Unnamed Classes and Instance Main Methods (Preview), delivered in JDK 21, was inspired by the September 2022 blog post, Paving the on-ramp, by Brian Goetz, the Java language architect at Oracle. This feature will “evolve the Java language so that students can write their first programs without needing to understand language features designed for large programs.”
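Under JEP 445, still a preview feature in JDK 21, a beginner’s first program can be reduced to roughly the following sketch (it must be run with preview features enabled):

// HelloWorld.java - launch with: java --enable-preview --source 21 HelloWorld.java
void main() {
    System.out.println("Hello, World!");
}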

JEP 463, Implicitly Declared Classes and Instance Main Methods (Second Preview), was recently promoted from its JEP Draft 8315398 to Candidate status, and we anticipate that it will be targeted for JDK 22. Formerly known as Unnamed Classes and Instance Main Methods (Preview), Flexible Main Methods and Anonymous Main Classes (Preview) and Implicit Classes and Enhanced Main Methods (Preview), this JEP incorporates enhancements in response to feedback from JEP 445. Gavin Bierman, a consulting member of the technical staff at Oracle, has published the first draft of the specification document for review by the Java community.

Yurenko: Another great trend related to this one which I love, is how Java becomes more accessible for beginners. I think it’s very important for a community to stay open and welcoming for beginners, whether that’s students or newcomers to the industry, and features such as Records, Pattern Matching, Unnamed Classes, and Instance Main Methods allow beginners to learn faster, develop their first app more easily, and become increasingly productive.

What is the Java Community Saying?

Ruiz: A mixed bag, as many people were focused on the 21st release of Java. Project Loom, with Virtual Threads and Structured Concurrency, has been tantalising many developers since the first previews.

Others have noted the progress of JVM advances that are not strictly tied to syntax changes in the Java language.

There are also plans for migration from Java 8 to a newer version of Java, with a “now or never” attitude!

Yurenko: I recently saw a new interesting way to analyze the trends in our community – Marcus Hellberg, VP of Developer Relations at Vaadin, analyzed talks presented at four major Java conferences in 2023. You can check out the article yourself, and here are a few of my conclusions:

  • Microservices and Kubernetes are still the hottest topics
  • AI and ML in the third spot confirm my observation about how hot this topic is now
  • I always see a lot of interest in framework talks, in this report, it’s Spring Boot and Quarkus
  • What’s rather surprising for me is that the security topic is in the fifth spot

Another trend that I see is conversations about reducing the startup time of JVM applications and the evergreen topic of performance; for me, that’s a sign that we are doing the right thing 🙂

Mehmandarov: Lately, there have been numerous talks and general excitement around Virtual Threads, which is finally out of preview with Java 21. We also see more development and application of Generative AI, especially for code generation. It still needs more maturity, but it is an exciting start. We also see more development of SDKs or frameworks for AI and ML with Java, like Semantic Kernel, Deeplearning4J, djl, and Tribuo.

Janssen: The Java ecosystem is still booming with many new developments and improvements. Apart from the things already mentioned, there are interesting developments, mainly around AI, which is evolving quickly, and it’s great to see projects such as Spring AI instead of all the Python-based solutions. Next to the big hype topics, there are many other interesting projects like Spring Modulith, which is now an official Spring project and allows the creation of better monoliths instead of microservices.

Redlich: Most of the buzz that I have been hearing and seeing throughout 2023 is Project Loom. In particular, Virtual Threads was a final feature with the release of JDK 21. Leading up to this anticipated release in September 2023, there were numerous presentations and YouTube videos on virtual threads. At a special Java Community Process (JCP) 25th anniversary event held in New York City in September 2023, a panel of JCP Executive Committee members revealed their favorite feature of JDK 21, in which they unanimously said virtual threads.
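For readers who haven’t tried the feature yet, here is a minimal sketch (added for context, not part of the discussion) of the virtual threads API finalized in JDK 21:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class VirtualThreadsExample {

    public static void main(String[] args) throws Exception {
        // Start a single virtual thread directly.
        Thread vt = Thread.ofVirtual()
                .start(() -> System.out.println("Hello from " + Thread.currentThread()));
        vt.join();

        // Or submit many blocking tasks, one virtual thread per task.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                executor.submit(() -> {
                    Thread.sleep(100); // blocking calls are cheap on virtual threads
                    return null;
                });
            }
        } // close() waits for the submitted tasks to finish
    }
}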

What is New and Exciting That We Didn’t Expect?

Ruiz: Given all this attention to LLM, ML and AI, I would not be surprised to see new projects, libraries and APIs in Java to support use cases, workflows and products.

Mehmandarov: If I have to pick one thing, it is Generative AI, specifically focusing more on code generation. While it struggles with logical errors in the generated code, hallucinations, and other issues, it still fits nicely into the “unexpected, new, and exciting” category.

Janssen: It’s awesome to see many new features in Java 21, such as virtual threads from Project Loom. I was a bit afraid that some of those features would be postponed to later Java releases. As Java 21 is the new Long Term Support (LTS) version, it’s great that they are included, as many companies only use LTS versions. At first glance, it looks like nothing big has been removed, and hopefully this results in easy upgrades for our projects. Those upgrades are nowadays even easier with OpenRewrite, which allows automated upgrades not only for the Java language itself, but also for libraries such as JUnit.

Redlich: The new MicroProfile JWT Bridge specification, currently being developed, is a collaboration between Jakarta EE and MicroProfile Working Groups. This new specification will enable Jakarta Security applications to build on the MicroProfile JWT Authentication specification that will provide seamless integrations and eliminate duplication of effort and circular dependencies. The goal is to move the optional section of MicroProfile JWT Authentication to the new bridge specification together with TCKs and to have this specification ready for MicroProfile 7.0.

What’s Getting You, Personally, Really Excited in the Java Space?

Ruiz: The resurgence of CLI tools in the JVM space. We have seen examples of mature projects that solve or reduce the friction for developers to try, test, release and publish tools, projects and products. JBang and JReleaser are part of this set of resurgent tools.

Yurenko: I like seeing many new projects appearing every day. For example, artificial intelligence and machine learning are probably the hottest trends now, and there are many opportunities in this field for Java developers. One of my favorites is Tribuo, an open source ML library developed by my colleagues at Oracle Labs. OpenJDK projects Valhalla and Panama will also greatly benefit Java developers working with AI.

Mehmandarov: There are quite a few things to get excited about. Some of them are new and upcoming, like String Templates (still in preview) and various libraries supporting Machine Learning and Java, and some of them are more mature but still see improvements and new development, like the developments in the Cloud-Native stack for Java (like Jakarta EE and MicroProfile).

When it comes to working with large datasets, I am also excited to see more concepts like Data-Oriented Programming in Java, as well as improvements to memory efficiency, like Project Lilliput and value objects from Project Valhalla.

Also, the importance and the excitement I get from interacting with the Java community worldwide. It is a genuinely vibrant and supporting group eager to learn and share their knowledge.

Redlich: I have been putting together a presentation entitled “Jakarta EE 11: Going Beyond the Era of Java EE” that I will present numerous times starting in November 2023. It’s amazing how Jakarta EE has evolved since 2018, and it’s been awesome studying the Jakarta EE specifications.

The Java Community

Janssen: Of course, every year, we get two new Java releases packed with features and lots of improvements in all the tools, libraries and frameworks. Next to that, it’s good to see that the Java or JVM conferences are again happening and attracting more attendees, so make sure to visit them to learn more about Java and have some great discussions with fellow developers.

Conclusion

Please note that the viewpoints of our contributors only tell part of the story. Different groups, segments, and locales of the Java ecosystem may have different experiences. Our report for 2023 should be considered as a starting point for debate rather than a definitive statement, and an invitation to an open discussion about the direction the industry is taking.

Efficiently Arranging Test Data: Streamlining Setup With Instancio

Key Takeaways

  • Automated test data generation supports rigorous quality assurance. Using randomized test scenarios may uncover potential defects not evident when only using static data.
  • Using randomized test generation tools can complement standard testing methodologies, such as Arrange-Act-Assert (AAA) and Given-When-Then (GWT), enhancing a developer’s ability to identify complex bugs.
  • The decoupling of test code from data generation simplifies maintenance, as changes to the data model do not require corresponding updates to test cases.
  • Automated generation of randomized inputs broadens test coverage, providing a wide array of data permutations that help catch edge cases, thus reducing the dependency on handcrafted, parameterized tests.
  • Automation in test data creation eliminates the tedium of manual setup, providing ready-to-use, complex data structures that save time and minimize human error.

The school of thought about developing software products more effectively evolved from Waterfall to Agile, but one aspect of development was never in doubt: the need to ensure quality.

Current approaches, like Continuous Deployment and Continuous Delivery, indicate that reliable test suites are directly connected to the speed of development and a quicker customer feedback loop.

Approaches like Test-Driven Development (TDD) even help shape the structure of software products, and a healthy test suite is often an indicator of a well-maintained code base.

Regardless of the overall philosophy, one thing is certain: writing tests is beneficial.

Having the necessary tools to write tests will make the task less tedious and more enjoyable for developers, resulting in more comprehensive test coverage.

Patterns like Arrange Act Assert (AAA) or Given-When-Then (GWT) focus on providing a more predictable structure.

This article follows the AAA path and focuses on arguably the most challenging aspect of developing unit tests: test data generation, which is the Arrange step.

This article will compare manual test fixtures with automated data setup using Instancio, an open-source Java library for generating randomised data for unit and integration tests. Using randomised data in tests is not a novel concept. Property-based testing (PBT) frameworks, such as QuickCheck for Haskell, have used this approach for years. QuickCheck has inspired similar frameworks in other languages, including jqwik for Java. Although Instancio is not a PBT framework, it does have some similarities due to the randomised nature of the data.

In the following sections, we will cover the following in more detail:

  • Illustrate the usage of Instancio with a simple test refactoring exercise.
  • Highlight the library’s features using a few common testing use cases.

Testing Problem

Arguably, the quality of the data used is an important part of a test. The Arrange step of a unit test focuses on exactly that aspect, and some of its key challenges are:

  • Data setup may incur significant development effort and maintenance overhead, depending on the size and complexity of the data model.
  • Setup code is often brittle in the face of refactoring.
  • The setup code must be updated to reflect the changes when properties are added or removed.
  • Different test cases may require similar or overlapping data. This often leads to duplicated setup code with minor variations or helper factory methods with complex signatures.

Existing approaches to data setup include helper methods within a Test class, helper classes providing static factory methods (the Object Mother pattern), and test object builders.

Helper methods are the simplest and most common approach. They work well when we don’t need a lot of variation in the data. Since the setup code is within the test class, the tests are typically easier to read and understand.

The Object Mother pattern can be useful when the setup code needs to be reused across multiple tests, or the setup logic is more complicated (for example, we may need objects created in different states).

One of the downsides of the Object Mother pattern is its lack of flexibility when objects need to be created in different states. This is where test object builders come in, although they do make the setup code more complicated.
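To make the comparison concrete, here is a rough sketch of the two manual approaches, assuming a Person class with standard setters (the builder is trimmed to two fields for brevity):

import java.time.LocalDate;

// Object Mother: a helper class with static factory methods returning pre-canned objects
class Persons {
    static Person aDefaultPerson() {
        Person person = new Person();
        person.setFirstName("first-name");
        person.setLastName("last-name");
        person.setDateOfBirth(LocalDate.of(1980, 12, 31));
        return person;
    }
}

// Test object builder: more flexible, at the cost of more setup code to maintain
class PersonBuilder {
    private String firstName = "first-name";
    private LocalDate dateOfBirth = LocalDate.of(1980, 12, 31);

    PersonBuilder withFirstName(String firstName) {
        this.firstName = firstName;
        return this;
    }

    PersonBuilder withDateOfBirth(LocalDate dateOfBirth) {
        this.dateOfBirth = dateOfBirth;
        return this;
    }

    Person build() {
        Person person = new Person();
        person.setFirstName(firstName);
        person.setDateOfBirth(dateOfBirth);
        return person;
    }
}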

Instancio attempts to solve some of the problems mentioned above. First, we will briefly cover what the library does. Next, we’ll review a refactoring exercise to show how manual data setup can be eliminated. We will also discuss the Instancio extension for JUnit 5 and, finally, cover some common data setup use cases and how they can be implemented using the API.

Instancio Basics

Instancio’s API accepts a class and returns an instance of the class populated with sensible defaults:

  • Non-null values
  • Positive numbers
  • Non-empty collections with a few elements

This allows typical data setup code like the below:

Person person = new Person();
person.setFirstName("first-name");
person.setLastName("last-name");
person.setDateOfBirth(LocalDate.of(1980, 12, 31));
// etc

to be replaced with a more concise version:

Person person = Instancio.create(Person.class);

By automating the Arrange stage, we can Act and Assert within a shorter time frame and with fewer lines of code.

How it Works

Instancio populates objects with reproducible, random data, which can be customized as needed. First, the class structure is converted to a node hierarchy, where each node represents a class or class property.

The node hierarchy can be inspected by running Instancio in verbose mode:

Person person = Instancio.of(Person.class)
    .verbose()
    .create();

In the verbose output, the digit before the class name refers to the depth of the node, where the root node (the Person class) is at depth zero.

Once the node hierarchy has been constructed, Instancio traverses the nodes and populates values using reflection. By default, this is done by assigning values directly via fields; however, the behavior can be overridden to assign values via setters.

As mentioned earlier, the values are generated randomly. At first glance, the idea of tests based on random data may seem bad, but upon closer inspection, it is not without merit and actually offers some advantages. The fear of randomness stems from its inherent unpredictability. The concern is that it will make tests themselves unpredictable. However, it is important to note that many tests require a value to be present but do not actually care what the value is. For such tests, whether a person’s name is “John,” “foo,” or a random value like “MVEFZ,” makes no difference. The test is happy as long as it has some non-null value. For cases where tests require specific values, the library provides an API for customizing the generated data.

We will look at customizing generated values (and reproducing tests) later in the article. Let’s go through a refactoring exercise to illustrate the library’s usage with a concrete example.

A Simple Refactoring Exercise

Instead of making up a test case, we will modify an existing test from an open-source project. The test we’ll use is from the MapStruct samples repository. MapStruct is a great library for automating mapping between POJOs, and its samples are a perfect candidate for our purposes.

The following is the original version of the test that involves mapping a DTO to an entity (note: AssertJ is used for assertions). Although this is just a simple example, we may find similar tests in many real-world projects, often with dozens of fields. Some projects may have utility methods or classes for creating test objects, but the data setup is still done manually.

class CustomerMapperTest {
 
   @Test 
   void testMapDtoToEntity() { 
      CustomerDto customerDto = new CustomerDto(); 
      customerDto.id = 10L; 
      customerDto.customerName = "test-name"; 
      
      OrderItemDto order1 = new OrderItemDto(); 
      order1.name = "Table"; 
      order1.quantity = 2L; 
      customerDto.orders = new ArrayList<>(Collections.singleton(order1)); 

      Customer customer = CustomerMapper.MAPPER.toCustomer(customerDto); 

      assertThat(customer.getId()).isEqualTo(10); 
      assertThat(customer.getName()).isEqualTo("test-name"); 
      assertThat(customer.getOrderItems()) 
            .extracting("name", "quantity") 
            .containsExactly(tuple("Table", 2L)); 
  }
}

The refactored version of the test looks as follows:

@ExtendWith(InstancioExtension.class) 
class CustomerMapperTest { 

   @WithSettings 
   private final Settings settings = Settings.create() 
         .set(Keys.COLLECTION_MIN_SIZE, 0) 
         .set(Keys.COLLECTION_NULLABLE, true); 

   @RepeatedTest(10) 
   void testMapDtoToEntity() { 
      // Given 
      CustomerDto customerDto = Instancio.create(CustomerDto.class); 

      // When 
      Customer customer = CustomerMapper.MAPPER.toCustomer(customerDto); 

      // Then 
      assertThat(customer.getId()).isEqualTo(customerDto.id); 
      assertThat(customer.getName()).isEqualTo(customerDto.customerName); 
      assertThat(customer.getOrderItems()) 
            .usingRecursiveComparison() 
            .isEqualTo(customerDto.orders); 
   } 
} 

Manual setup, as shown in the first example, usually manifests itself in the following properties:

  • If fields are added or removed, the setup code needs to be updated to reflect the changes
  • Use of hard-coded values adds noise to the test
  • Verifying optional data requires additional setup, where the values are not present
  • Testing for null, empty, and non-empty collections also adds complexity to the setup code, so typically, only a collection with one or two elements is verified

The refactored test offers some improvements. First, the data setup code has been reduced and no longer contains hard-coded values, making the test’s intention clearer: the mapper should map source values to the target object as-is, without any transformations. In addition, automating object creation means that no changes to the setup code are necessary if, for example, new properties are added or removed from CustomerDto or any class it references. We only need to ensure that the mapper handles the new property and that an assertion covers it.

Another advantage is that the same test method verifies all possible collection states: null, empty, and with sizes >= 1. This is specified by custom Settings (since, by default, Instancio generates non-empty collections). For this reason, @Test was replaced with @RepeatedTest to test all permutations. However, unlike using @ParameterizedTest, as is common in this situation, using @RepeatedTest does not introduce additional complexity to the test setup.

As a side note, AssertJ’s usingRecursiveComparison is very well suited for this type of test. Using Instancio to populate the object and usingRecursiveComparison to verify the results, we may not even need to update the test when fields are added to or removed from classes that can be verified with usingRecursiveComparison:

  • Instancio automatically populates an object
  • MapStruct automatically maps the properties (assuming the names match)
  • AssertJ auto-verifies the result via recursive comparison (again, assuming the property names match)

In summary, when everything aligns – which does happen occasionally – creation, mapping, and assertion are all handled by the respective libraries.
Bottom line: as the size and complexity of a data model increases, manual data setup becomes less practical and more costly in terms of development and maintenance.

Instancio Extension for JUnit 5

Instancio can be used as a standalone library with any testing framework, such as JUnit or TestNG. When used standalone (for example, with JUnit 4 or TestNG), the seed used for generating the data can be obtained via the API:

Result<Person> result = Instancio.of(Person.class).asResult();
Person person = result.get();
long seed = result.getSeed();

This makes reproducing the data a little more cumbersome.

With JUnit 5, Instancio provides the InstancioExtension. The extension manages the seed value, ensuring that all objects created within a single test method can be reproduced in case of test failure. When a test fails, the extension reports the seed value as follows:

Test method 'testMapDtoToEntity' failed with seed: 12345

Using the reported seed value, we can reproduce the failure by annotating the test method with the @Seed annotation:

@Seed(12345) // to reproduce data
@Test 
void testMapDtoToEntity() { 
    // .. remaining code unchanged 
}

The seed value specified by the annotation will be used instead of a random seed. Once the test failure is resolved, the annotation can be removed.

Parameter Injection

In addition, the extension adds support for injecting arguments into a @ParameterizedTest. For example, we could also re-write the refactored test shown earlier as follows:

@InstancioSource 
@ParameterizedTest 
void testMapDtoToEntity(CustomerDto customerDto) { 
   // When 
   Customer customer = CustomerMapper.MAPPER.toCustomer(customerDto); 

   // Then 
   // same assertions as before... 
} 

This allows any number of arguments to be provided to a test method.
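For instance, if a test also needed a standalone OrderItemDto, both objects could be declared as parameters and would be populated automatically (a minimal sketch; the second parameter is only there to illustrate multiple arguments):

@InstancioSource
@ParameterizedTest
void testMapDtoToEntity(CustomerDto customerDto, OrderItemDto orderItemDto) {
   // both arguments arrive fully populated with random data
   Customer customer = CustomerMapper.MAPPER.toCustomer(customerDto);
   // ... assertions as before
}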

A Collection of Common Use Cases

When testing specific behaviors, there are instances where we require the creation of an object in a particular state. For this purpose, the library offers a fluent API that allows for the customization of objects. The subsequent examples outline several typical scenarios involving data setup commonly arising during testing. We’ll provide a use case and sample code illustrating how to achieve it.

Customizing an Object’s Values

In this example, we create an object populated with random data but with specific values for some of the fields. We will assume our test case requires a customer from Great Britain with a past registration date. We can achieve this as follows:

Customer customer = Instancio.of(Customer.class) 
    .set(field(Address::getCountry), "GB")
    .set(field(Phone::getCountryCode), "+44")
    .generate(field(Customer::getRegistrationDate), gen -> gen.temporal().localDate().past())
    .create();

The gen parameter (of type Generators) provides access to built-in generators for customizing values. Built-in generators are available for most common JDK classes, such as strings, numbers, dates, arrays, collections, etc.

The set() method works as a setter, but unlike a regular setter, it will be applied to all generated instances. For example, if the customer has more than one phone number, all of them will have country code +44.
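To make that behavior visible, a quick check could assert that every generated phone carries the configured country code (a sketch that assumes the Customer class exposes a getPhones() accessor):

Customer customer = Instancio.of(Customer.class)
    .set(field(Phone::getCountryCode), "+44")
    .create();

assertThat(customer.getPhones())
    .isNotEmpty()
    .allSatisfy(phone -> assertThat(phone.getCountryCode()).isEqualTo("+44"));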

Creating a Collection

Let’s assume we need a list of 10 orders that:

  • have a null id
  • have any status except CANCELLED or COMPLETED

Such a list can be generated as follows:

List<Order> orders = Instancio.ofList(Order.class)
    .size(10) 
    .ignore(field(Order::getId)) 
    .generate(field(Order::getStatus), gen -> gen.enumOf(OrderStatus.class)
             .excluding(OrderStatus.CANCELLED, OrderStatus.COMPLETED))
    .create();

While the Order class may have dozens of other fields, only those of interest are set explicitly. This has the benefit of highlighting which properties the test actually cares about. Other Order fields, for example, the shipping address, may also be required by the method under test to pass. However, they may not be pertinent to this particular test case and can therefore be filled with random data.

Customizing Collections within a Class

In addition to creating a collection of objects, the API also supports customizing collections declared somewhere within a class. For instance, let’s assume that we need to create a Customer that:

  • Has a specific id
  • Has 7 orders with expected statuses

We can generate such a Customer as follows:

Long expectedId = Gen.longs().get(); 

Customer customer = Instancio.of(Customer.class) 
   .set(field(Customer::getId), expectedId) 
   .generate(field(Customer::getOrders), gen -> gen.collection().size(7)) 
   .generate(field(Order::getStatus), gen -> gen.emit() 
         .items(OrderStatus.RECEIVED, OrderStatus.SHIPPED) 
         .item(OrderStatus.COMPLETED, 3) 
         .item(OrderStatus.CANCELLED, 2)) 
   .create(); 

The Gen class above also provides access to built-in generators via static methods. It’s a shorthand API for generating simple values. The emit() method allows generating a certain number of deterministic values. It can be useful for generating a collection containing objects with certain properties.

Generating Optional Data

In this example, we need to verify that the method under test does not fail if optional values are absent. Let’s say we need to create a Person that has:

  • An optional date of birth
  • An optional Spouse
  • The Spouse itself contains only optional fields, all of type String.

This can be done as shown below:

Person person = Instancio.of(Person.class) 
   .withNullable(all( 
      field(Person::getDateOfBirth), 
      field(Person::getSpouse))) 
   .generate(allStrings().within(scope(Spouse.class)), gen -> gen.string().nullable().allowEmpty()) 
   .create(); 

withNullable() will generate an occasional null value. The all() method groups the two selectors so that withNullable() does not have to be called twice. Finally, strings declared within Spouse will be generated as null, empty, or a random non-empty value.

This setup will result in different permutations of null and non-null values on each test run. Therefore, a single test method can verify different states without the use of parameterized tests or custom data setup logic.

Generating Conditional Data

Often, a class has interdependent fields, where the value of one field is determined by the value of another. Instancio provides an assignment API to handle such cases. The following example illustrates how the country code field in the Phone class can be set based on the country field of the Address class:

Assignment assignment = Assign.given(field(Address::getCountry), field(Phone::getCountryCode))
    .set(When.isIn("Canada", "USA"), "+1")
    .set(When.is("Italy"), "+39")
    .set(When.is("Poland"), "+48")
    .set(When.is("Germany"), "+49");

Person person = Instancio.of(Person.class)
    .generate(field(Address::getCountry), gen -> gen.oneOf("Canada", "USA", "Italy", "Poland", "Germany"))
    .assign(assignment)
    .create();

Bean Validation Constraints

Another common use case is creating a valid object based on Bean Validation constraints. Perhaps the method under test performs validation, or we must persist some entities in an integration test.

Assuming we have this data model:

class Person { 
   @Length(min = 2, max = 64) 
   String name; 

   @Range(min = 18, max = 65) 
   int age; 

   @Email 
   String email; 
} 

we can generate a valid object as follows:

Person person = Instancio.of(Person.class) 
       .withSettings(Settings.create()
            .set(Keys.BEAN_VALIDATION_ENABLED, true))
       .create(); 

// Sample output: Person(name=XGFK, age=23, [email protected]) 

It should be noted that this is an experimental feature. It can be enabled via Settings, as shown above, or a configuration file. Most constraints from the following packages are supported, depending on what’s available on the classpath:

• jakarta.validation.constraints 
• javax.validation.constraints 
• org.hibernate.validator.constraints 

Object Reuse via Models

Sometimes, we may have several test methods requiring the same object but in slightly different states. Let’s assume we are testing a method that accepts a loan Applicant. The loan application is approved if both of these conditions are met:

  • The applicant has an income of at least $25,000
  • The applicant has not declared bankruptcy within the past five years

To cover all the test cases, we will need three applicant states: one valid for the happy path and two invalid, one for each of the conditions. We start by defining a Model of a valid applicant with the required income and a bankruptcy date that is either null or over five years ago.

LocalDate maxDate = LocalDate.now().minusYears(5).minusDays(1);

Model<Applicant> validApplicantModel = Instancio.of(Applicant.class)
    .generate(field(Applicant::getIncome), gen -> gen.ints().min(25000))
    .generate(field(Applicant::getBankruptcyDate), gen -> gen.temporal()
            .localDate()
            .nullable()
            .range(LocalDate.MIN, maxDate))
    .toModel();

For the happy path test, we can then generate a valid applicant from the model:

Applicant applicant = Instancio.create(validApplicantModel);

The created Applicant will inherit all the properties specified by the model. Therefore, no other customizations are needed.
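A happy-path test built on top of the model could then look something like the sketch below (LoanService and its approve() method are hypothetical names introduced only for illustration):

@Test
void shouldApproveValidApplicant() {
    // Given: an applicant satisfying both approval conditions
    Applicant applicant = Instancio.create(validApplicantModel);

    // When: the hypothetical service under test evaluates the application
    boolean approved = loanService.approve(applicant);

    // Then
    assertThat(approved).isTrue();
}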

To create an invalid applicant, we can simply customize the object created from the model. For example, to generate an applicant with a bankruptcy date within the last five years, we can customize the object as follows:

Applicant applicant = Instancio.of(validApplicantModel) 
   .generate(field(Applicant::getBankruptcyDate), gen -> gen.temporal()
         .localDate()
         .range(LocalDate.now().minusYears(5), LocalDate.now())) 
   .create(); 

Applicant’s income can be modified in a similar manner.
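For example, an applicant with insufficient income could be derived from the same model (a sketch mirroring the bankruptcy-date example above):

Applicant applicant = Instancio.of(validApplicantModel)
   .generate(field(Applicant::getIncome), gen -> gen.ints().max(24_999))
   .create();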

Populating Objects via Setters

By default, Instancio populates POJOs by assigning values directly to fields. However, there may be cases where assigning values via setters is preferred. One such example is when setters contain logic that is relevant to the functionality under test, as shown below:

class Product { 
   // ... snip 
   String productCode; 

   void setProductCode(String productCode) { 
      Objects.requireNonNull(productCode); 
      this.productCode = productCode.trim().toUpperCase(); 
   } 
} 

To populate this class using setters, we can modify the assignment type via Settings:

Product product = Instancio.of(Product.class) 
   .withSettings(Settings.create()
     .set(Keys.ASSIGNMENT_TYPE, AssignmentType.METHOD))
   .create(); 

Instancio will then attempt to resolve setter method names from field names. By default, it assumes that mutators follow the JavaBeans convention and use the set prefix. If setters follow a different naming convention, for example, using with as the prefix, the behavior can be customized by modifying the SetterStyle option:

Settings.create() 
 .set(Keys.ASSIGNMENT_TYPE, AssignmentType.METHOD) 
 .set(Keys.SETTER_STYLE, SetterStyle.WITH); 

Generating Custom Types

Some applications have data classes that are used extensively throughout the data model. For instance, a GIS (Geographic Information System) application may define a Location class that is referenced by PointOfInterest and several other classes:

public class Location { 
   private final double lat; 
   private final double lon; 
   // snip ... 
} 

public class PointOfInterest { 
   private final Location location; 
   // snip... 
} 

Although we can generate valid locations as shown below:

PointOfInterest poi = Instancio.of(PointOfInterest.class) 
   .generate(field(Location::lat), gen -> gen.doubles().range(-90d, 90d))
   .generate(field(Location::lon), gen -> gen.doubles().range(-180d, 180d)) 
   .create(); 

it can get tedious if this needs to be done repeatedly across many tests. Defining a custom Generator can solve this problem:

import org.instancio.Random; 

class LocationGenerator implements Generator<Location> {
   @Override 
   public Location generate(Random random) { 
      double lat = random.doubleRange(-90, 90); 
      double lon = random.doubleRange(-180, 180); 
      return new Location(lat, lon); 
   } 
} 

Then the previous example can be modified as:

PointOfInterest poi = Instancio.of(PointOfInterest.class) 
    .supply(all(Location.class), new LocationGenerator()) 
    .create(); 

Although this is an improvement, we must manually bind the custom generator to the Location field. To take it further, we can also register the new generator via Instancio’s Service Provider Interface. Once registered, the following statement will automatically produce valid locations using the custom generator:

PointOfInterest poi = Instancio.create(PointOfInterest.class);

We will omit an example for brevity. For details, please refer to the Instancio Service Provider documentation.

Final Thoughts: Embracing Randomness

Some developers are apprehensive about using random data in unit tests. Their main concern is that random data will cause tests to become flaky and that failures will be impossible or difficult to reproduce. However, as we mentioned earlier, randomised data in tests has been used for many years, most notably in property-based testing frameworks.

Repeatability is a key feature of most testing libraries that generate random data. Therefore, the fear of randomness is misplaced. In addition, as we showed earlier, switching from hard-coded inputs to generated data can increase test coverage and reduce the need for more complicated data setup logic and parameterized tests.

If a test fails, it may have uncovered a potential bug, or perhaps it was an incorrectly set expectation. Regardless of the root cause, each time the test runs, it probes the subject under test from a different angle.

From Dependency to Autonomy: Building an In-House E-signing Service

Key Takeaways

  • The iCreditWorks team built a custom e-signing service in an attempt to sidestep the limitations of third-party service customizations, feature availability, and cost.
  • This case study demonstrates how technologies such as Java, Spring, cloud blob storage, and MySQL can be used to build the service.
  • Securing artifacts (i.e., the target PDF document) so that they can’t be tampered with after digital signing is crucial. The user’s signature is stored securely in cloud storage along with the signed documents.
  • The iCreditWorks solution was required to be cloud agnostic and support both Azure Blob and AWS S3 cloud storage. The design also supports future work to extend to the Google Cloud Platform as well.
  • By sharing the open source code with the broader community, the iCreditWorks team aims to empower other startups and fintechs to benefit from our development efforts and to foster a collaborative environment for continuous improvement.

In today’s digital age, the ability to securely sign documents online is not just a convenience; it’s a necessity. For fintech startups and other businesses, e-signing services are an integral part of operations. While many companies rely on third-party services, there’s a growing realization that an in-house solution can offer more control, flexibility, and cost savings.

In this article, we’ll delve into how to build an e-signing microservice. We’ll cover open-source tools and technologies used in building this service offering. In addition, we will cover a use case of the iCreditWorks startup, which transitioned to its own in-house e-signing service.

Compliance with the E-sign Act and Regulatory Standards

E-sign Act overview

Before diving into the technical aspects of building an in-house e-sign service, it’s crucial to understand the legal landscape governing electronic signatures. The Electronic Signatures in Global and National Commerce (E-sign) Act, enacted in 2000, plays a pivotal role in this domain.

The act provides a general rule of validity for electronic records and signatures for transactions in or affecting interstate or foreign commerce. The Act ensures that electronic signatures hold the same weight and legal effect as traditional paper documents and handwritten signatures.

Compliance measures

Our in-house e-signing service is meticulously designed to adhere to the stipulations of the E-sign Act. We ensure that:

  • All electronic signatures are uniquely linked to the signatory.
  • Once a document is signed, it becomes immutable, ensuring that no alterations can be made post-signature. This strict adherence guarantees the integrity and authenticity of the signed document.
  • The signatory is provided with a clear and concise way to do business electronically.
  • The signatory has the intent to sign the document.

The case for building an in-house e-signing service

Building an in-house e-signing service is not merely a technical decision but a strategic one. It involves evaluating the cost-effectiveness, operational efficiency, and customization possibilities that such a service brings compared to third-party e-signing services. In this section, we will delve into the compelling reasons that make a case for building an in-house e-signing service.

  1. Cost Efficiency: Relying on external third-party services often incurs a per-signature fee. When dealing with a large volume of documents, these costs can accumulate rapidly. For iCreditWorks, this was a pivotal consideration, as it significantly reduced the loan application expenses.
  2. Reduced External Dependencies: If a product offering hinges on numerous external APIs – with e-signing being one of them – having an in-house solution eliminates one external dependency, streamlining operations.
  3. Enhanced UI Customizations: The user interface (UI) possibilities for mobile apps or web browsers are confined to the capabilities offered by third-party services. iCreditWorks, for instance, wanted granular UI branding, a feature that the external partner did not easily support.

After understanding the compelling reasons and strategic advantages of having an in-house e-signing service, let’s transition into the architectural and technological aspects. In the following section, we will unveil the foundational elements and the tech stack of the e-signing microservice by looking at the case study of iCreditWorks.

Case study – iCreditWorks

Background: iCreditWorks aims to modernize and streamline the lending process at the point where sales occur, making it more efficient and customer-friendly. Currently, iCreditWorks caters to dental financing and is expanding to other industry verticals.

Challenge: The third-party e-signing service they relied on charged a fee of $0.70 for each digital signature. Considering their estimated 100,000 funded loans annually, this translated to a substantial expense of $70,000. Additionally, the external vendor’s platform limited their customization options, preventing iCreditWorks from achieving a consistent brand look and feel across their digital platforms.

Solution: Recognizing the dual challenges of cost and branding, iCreditWorks took the step to develop an in-house e-signing microservice. By transitioning to an in-house solution, iCreditWorks achieved cost savings, enhanced branding, and operational efficiency.

Documents processed: iCreditWorks has successfully processed a large number of agreement documents. This volume encompasses a diverse range of documents, from customer agreements pertaining to loan applications to intricate network provider agreements. This showcases the reliability of the in-house e-signing service, and it also highlights the capability to adeptly manage a variety of document types.

Transitioning to in-house solutions isn’t just about cost-saving. It’s about gaining control and flexibility, and ensuring that the user experience aligns with the brand’s vision.

Building an in-house e-signing microservice

Choosing a microservice architecture for the e-signing service was driven by several key considerations. This architectural style offers modularity, making the application easier to develop, test, deploy, and scale. It allows for the independent deployment of services, enhancing agility and resilience. Furthermore, microservices enable technology diversity, allowing us to choose the best technology stack for each service’s unique requirements, thereby optimizing performance and maintainability. Another key factor was to build this as a horizontal capability service to be leveraged across different business units.

Tech stack

When building any microservice, especially with critical business functionalities, it’s essential to choose a tech stack that’s robust, scalable, and has wider community support. The e-signing microservice is built on the following technology stack:

  1. Spring Boot with Java
  2. Generating and Securing PDFs
  3. Cloud Storage (AWS S3 or Azure Blob)
  4. Database (MySQL). If the database is hosted on Azure, you could use Azure Database for MySQL. However, if it’s on AWS, you could use Amazon RDS for MySQL.

Figure 1: E-sign Tech Stack

Spring Boot

This is a standard REST API interface developed in Java using the Spring Boot framework. Spring Boot provides rapid development capabilities and wider community support. There are two main APIs: one to fetch the initial document for signing and another to post the signature, which then returns the signed document.
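A skeleton of such a controller might look like the sketch below; the class, service, and DTO names (ESignService, DocumentRequest, GeneratedDocument) and the response header name are illustrative assumptions rather than the actual open-source implementation:

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

@RestController
@RequestMapping("/api/e-sign/v1/document")
public class ESignController {

    private final ESignService eSignService; // hypothetical service encapsulating the e-sign flow

    public ESignController(ESignService eSignService) {
        this.eSignService = eSignService;
    }

    // Generates the initial document for signing from a template and dynamic fields
    @PostMapping("/{contentType}")
    public ResponseEntity<String> generateDocument(@PathVariable String contentType,
                                                   @RequestBody DocumentRequest request) {
        GeneratedDocument document = eSignService.generateDocument(contentType, request);
        return ResponseEntity.ok()
                .header("X-Document-UUID", document.getUuid()) // header name is an assumption
                .body(document.getHtml());
    }

    // Accepts the user's signature image and returns the signed document
    @PostMapping("/signature/{documentUuid}/html")
    public ResponseEntity<String> signDocument(@PathVariable String documentUuid,
                                               @RequestParam("signature") MultipartFile signature) {
        return ResponseEntity.ok(eSignService.signDocument(documentUuid, signature));
    }
}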

Generating and securing PDFs

To generate PDF documents, we use headless Chrome. In Java, there’s a Maven package that serves as a wrapper for the Chrome dev tools, named chrome-devtools-java-client. To secure the PDFs, we employ an external third-party certificate. For iCreditWorks, the certificate is sourced from Entrust.

Cloud storage

It’s essential to store generated documents, user signatures, and published templates for documents that need signatures. All of these are stored in the cloud. If AWS is the cloud provider, AWS S3 is used. Alternatively, for Azure, Azure Blob storage is the choice. Additionally, Google Cloud Platform (GCP) also offers robust storage solutions like Cloud Storage, which could be integrated as per the organization’s needs.

Database

The database is mainly for state storage between the APIs. It also stores document-related artifacts and the payload sent with the initial request. The current GitHub implementation uses MySQL as the database, but it can be replaced with any other relational database like MS SQL Server, Postgres, or Oracle. We can use NoSQL databases like MongoDB as well.
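As an illustration of the kind of state being persisted, a simplified JPA entity for the document metadata could be sketched as follows (entity and field names are assumptions, not the actual schema from the GitHub project):

import jakarta.persistence.*;
import java.time.Instant;

@Entity
@Table(name = "esign_document")
public class ESignDocument {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    // UUID returned in the response header and used by subsequent API calls
    @Column(nullable = false, unique = true)
    private String documentUuid;

    // Template code used to generate the document, e.g. "GEN-MPN-01"
    private String docCode;

    // Original request payload, kept so the template can be re-populated at signing time
    @Lob
    private String requestPayload;

    // Cloud storage paths for the generated HTML/PDF and the user's signature image
    private String htmlPath;
    private String pdfPath;
    private String signaturePath;

    private Instant createdAt;

    // getters and setters omitted for brevity
}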

E-signing flow and API endpoints    

There are primarily two endpoints, which ensure a straightforward process for the user, from document generation to signing.

  1. Generate documents for signing
  2. Post-signature and get a signed document

We will delve into the above two steps in detail below, and we will start first with generating documents. Generating documents is the initial step in the e-signing process, where business-specific legal documents are created using predefined templates and dynamic data. This section will detail the process, from fetching and populating templates to converting them into ready-to-sign PDF documents.

Generate documents for signing

The e-sign service uses templates to create business-specific legal documents for signing. At iCreditWorks, there’s a specific process to publish a template, but that process is beyond the scope of this article. The assumption here is that the template has already been published and is stored on cloud storage. These templates are HTML files with static content and templated variables (e.g., {{CustomerAddressCity}}, {{CustomerAddressState}}, {{CustomerAddressZip}}). These tokenized variables are replaced with actual values at runtime.

Implementing tokenized variables within templates facilitates dynamic content creation, catering to diverse business needs. Additionally, it provides versatility by supporting various document types through distinct templates.

API Endpoint

POST {{baseUrl}}/api/e-sign/v1/document/{contentType}

Request Body

{
    "docCode": "GEN-MPN-01",
    "contextId": "LN-1000109-01",
    "localePreference": "en-US",
    "fields": {
        "CurrentDate": "09/03/2023",
        "UserAcctId": "LN-1000109-01",
        "CustomerFullName": "John Doe",
        "CustomerAddressLine": "1100 Fox Run Dr",
        "CustomerAddressCity": "Iselin",
        "CustomerAddressState": "NJ",
        "CustomerAddressZip": "08050"
    }
}

The “fields” are dynamic variables that must be filled in the HTML template during document generation.

The following steps occur under the hood of the API endpoint:

Figure 2: Generate document

  1. Fetch the template from cloud storage based on the provided template code.
  2. Populate the HTML template by replacing variable tokens with the dynamic values provided.

    Code snippet for replacing tokenized variables.
     

    // Uses java.util.regex.Pattern and java.util.regex.Matcher
    private String pattern = "\\{\\{([\\w]+)\\}\\}";

    Pattern pattern = Pattern.compile(this.pattern, Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(messageTemplate);
    StringBuffer messageBuffer = new StringBuffer();
    while (matcher.find()) {
        String messageToken = matcher.group();
        // Strip the surrounding {{ and }} to obtain the field name
        String fieldName = messageToken.substring(2, messageToken.length() - 2);
        String fieldValue = normalizedFieldMap.get(fieldName);
        if (fieldValue != null) {
            // Escape '$' so it is not interpreted as a group reference in the replacement
            if (fieldValue.contains("$")) {
                fieldValue = fieldValue.replace("$", "\\$");
            }
            matcher.appendReplacement(messageBuffer, fieldValue);
        } else {
            matcher.appendReplacement(messageBuffer, "");
        }
    }
    matcher.appendTail(messageBuffer);
  3. Use headless Chrome to convert the HTML file into a PDF document.
  4. Store the generated HTML and PDF document in cloud storage.
  5. Save the document metadata and other data in the database to maintain the state for that document request.
  6. The API returns an HTML response, but it can also return a PDF, depending on the requirement. The API also returns a document universally unique identifier (UUID) in the response header, which can be used for subsequent API calls.

Post-signature and get signed document

API Endpoint

POST {{baseUrl}}/api/e-sign/v1/document/signature/{{eSignDocumentUUID}}/html

Request Body (form-data):
Key: “signature”
Value: “User’s signed image”
The following steps occur under the hood of the API endpoint:

Figure 3: Get Signed Document

This is the core of the e-signing process, where the main actions occur. Here’s what happens step by step:

  1. The API request contains the document UUID as a path variable; this documentUUID is used to fetch the state from the previous request.
  2. The user’s signature image – part of the request body – is uploaded to cloud storage, and its path is saved in the database.
  3. Based on the state data, a new empty template is fetched, and the templated tokenized variables are then replaced with dynamic data, which is fetched from the state data.
  4. For the uploaded user signature image in cloud storage, a pre-signed public URL is created. This type of URL provides time-limited access. The URL is then inserted into the template variable for the user’s signature.

    Below is a sample of one of the templates where the user’s signature will be inserted.

    The template fragment where the signature is inserted uses the following tokenized variables:

    {{CustomerFullName}} ({{SignedDateTime}})
    {{SignedDateTime}}

    {{CustomerSignature}} – This variable is replaced with the pre-signed URL of the signature image.

    At runtime, these variables are replaced with actual values, for example:

    John Doe (10/01/2023 20:25:47 EDT)
    10/01/2023 20:25:47 EDT

    At iCreditWorks, we use the concept of a one-time token instead of a Pre-signed URL; a one-time token is appended to the URL, along with the UUID, which is associated with the signature image that has been stored in the cloud storage (AWS S3). The token in the URL can be used only once and will be short-lived. Below is a sample URL for reference:

    {{baseUrl}}/v1/document/157f55tt-e8d1-4f6e-90bb-dc5341bc4fa4?token=78902bed-60c2-11ee-acd2-0aa24276fac5

    Pre-signed URL code snippet for AWS S3

    // Uses java.util.Calendar and java.util.Date to compute the expiration time
    Calendar calendar = Calendar.getInstance();
    calendar.setTime(new Date());
    calendar.add(Calendar.MINUTE, 5); // Generated URL will be valid for 5 minutes
    return this.awsS3Client.generatePresignedUrl(s3BucketName, documentPath, calendar.getTime(), HttpMethod.GET).toString();
    
  5. After digitally signing the PDF document, both the HTML and PDF versions of the signed document are saved in cloud storage. The paths from the cloud storage are then saved in the database.
  6. Once all the steps are completed, the signed HTML document response is sent to the consuming application, allowing the user to view the signed document.

Having explored the detailed steps of the e-signing process, let’s shift our focus toward the security measures that ensure the integrity and confidentiality of the signed documents in the e-signing service.

Security measures in e-signing

E-signing Certificate: The e-signing service uses a digital certificate to authenticate the identity of the signer and to provide the signer with a unique digital ID. This certificate is issued by a trusted third-party Certificate Authority (CA) and ensures the integrity and authenticity of the signed document.
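Conceptually, the digital signature is produced by signing the document bytes with the private key behind the CA-issued certificate; any later change to the bytes invalidates the signature. Below is a minimal, generic sketch using the JDK's java.security API (keystore path, alias, and password are placeholders, and a production service would sign the PDF itself through a PDF library rather than raw bytes):

import java.io.FileInputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.KeyStore;
import java.security.PrivateKey;
import java.security.Signature;

public class DocumentSigner {

    public static byte[] sign(Path document, Path keystorePath, char[] password, String alias) throws Exception {
        // Load the private key associated with the CA-issued certificate
        KeyStore keyStore = KeyStore.getInstance("PKCS12");
        try (FileInputStream in = new FileInputStream(keystorePath.toFile())) {
            keyStore.load(in, password);
        }
        PrivateKey privateKey = (PrivateKey) keyStore.getKey(alias, password);

        // Sign the document bytes; any subsequent modification invalidates this signature
        Signature signature = Signature.getInstance("SHA256withRSA");
        signature.initSign(privateKey);
        signature.update(Files.readAllBytes(document));
        return signature.sign();
    }
}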

Storage Security: All signed documents and associated data are encrypted in a secure cloud storage.

Handshake Protocol: The e-sign service uses a secure handshake protocol to establish a secure connection between the client and the server. This ensures that the transmitted data is encrypted and secure from any potential eavesdropping or tampering.

Channel-agnostic shared service

The e-signing microservice is designed and built as a shared service, meaning it can be seamlessly integrated into various applications or systems within the organization. This modular approach ensures consistency in e-signing across different platforms and reduces redundancy.

The e-signing service is designed to work across various channels, be it mobile, web, or any other digital platform. This flexibility ensures a consistent user experience across all touchpoints and allows for easy scalability and adaptability to emerging channels.

Conclusion

In today’s digital-centric world, businesses must continuously seek ways to enhance operational efficiency, reduce costs, and provide a seamless user experience. The transition to an in-house e-signing solution, as illustrated in our case study, not only led to significant cost savings but also provided the flexibility to tailor services to unique branding and operational needs. While third-party solutions offer convenience, there’s undeniable value in building proprietary systems that align closely with a company’s vision. As industries continue to digitize, the ability to adapt, innovate, and customize becomes paramount. Investing in in-house solutions can be a strategic move toward achieving that agility and laying the groundwork for sustained growth in the digital era.

Open-source availability

The e-signing microservice’s codebase is available on GitHub. By sharing the solution with the broader community, we aim to empower other startups and fintechs to benefit from our development efforts and to foster a collaborative environment for continuous improvement. The potential of in-house e-signing solutions isn’t limited to fintech. Various other industries can leverage the open-source solution.

Beyond the Numbers: Decoding Metrics for Assessing Client-Side Engineer Impact

Key Takeaways

  • Over the years, managers have fine-tuned their understanding of metrics such as “queries per second” or “request latency” to gauge the impact and expertise of backend engineers.
  • Establishing universally accepted impact metrics for client-side engineers remains a challenge we have yet to overcome.
  • The notion that client-side engineering might be less relevant as a discipline is losing weight, but an unconscious bias persists.
  • A comprehensive grasp of impact metrics for client-side engineers will eliminate personal biases from promotions and performance assessments.
  • It’s crucial to discern what these metrics convey and what they omit. 

When people managers assess the performance of software engineers, they often rely on a set of established metrics, believing they offer a meaningful representation of an engineer’s impact. However, these metrics sometimes fail to provide a complete and nuanced view of an engineer’s daily responsibilities and their actual contribution to a project.

Consider this scenario: an engineer makes a change to a critical component of a product used by millions. On paper, it appears they’ve impacted a substantial user base, but the reality may be completely different.

Indeed, while most performance assessment guides try to enforce metrics that can be directly tied to an individual, there is often a lack of clarity and understanding of what these metrics truly represent in the broader context of the engineer’s role and skills.

This deficiency is particularly pronounced in evaluating the impact of client-side engineers. The metrics used for their assessment are not as well understood as those used for their server-side peers, thus creating potential gaps in evaluation.

This article will delve deep into metrics that can be used for assessing the impact of client-side engineers, offering insights into what they mean and what they don’t.

Our aim is to provide a more comprehensive perspective that can be useful when developing performance assessment guides for organizations building full-stack software, ensuring a more balanced and fair evaluation of engineers’ contributions and impact.

What This Document Is, and What It’s Not

Most performance assessment guides available today pivot on a few foundational elements to bring structure to the assessment of engineers. These elements, while expressed differently across various organizations, hold a consistent essence.

  • Firstly, engineers are generally assessed on the basis of their impact, or another term synonymous with impact. The evaluation begins with measuring the ripple effect of their work and contributions.
  • Secondly, as practitioners of computer science, engineers are anticipated to untangle complex computer science issues to endow the business with a durable advantage. It’s a quiet understanding that problem-solving prowess is at the heart of their role.
  • Thirdly, the silhouette of an engineer’s responsibilities morphs with varying levels of seniority. As they ascend the corporate ladder, their influence and leadership seamlessly integrate into the evaluation framework, becoming significant markers of their growth at senior echelons.

While most rubrics also include evaluations based on teamwork and other similar attributes, these are typically less contentious and more straightforward to calibrate across engineers working on diverse aspects of the stack. Hence, this document will not delve into those aspects, keeping the focus firmly on the aforementioned elements.

The following sections focus on a number of metrics we believe could be used to assess the performance of client side engineers. With each metric, we highlight the associated engineering impact, discuss the inherent technical intricacy, and offer examples to demonstrate how contributions can be effectively contextualized using these parameters.

Adoption / Scale

Let’s address the elephant in the room first. A prevalent impact metric for gauging the body of work accomplished by client-side engineers often orbits around the adoption, engagement, or retention of the feature they developed.

Now, pause and ponder. Boasting product metrics like installs or DAU might not always reflect the engineers’ brilliance (or perhaps, sometimes it does?). It’s crucial here to fine-tune the calibration of assessment metrics across different teams. It’s essential to evaluate these in tandem with the impact metrics used for backend engineers, which may, again, not always echo their expertise but predominantly highlight the product’s growth.

But don’t be led astray. Yes, there exist substantial engineering challenges intertwined with the scale that these metrics showcase. Yet, it’s paramount to remember it’s the overcoming of these challenges that should be the yardstick for their performance assessment, not merely the growth or the flashy numbers themselves.

 

Product Metric | Why is it important? | What it’s not | Examples

#Installs

#[Daily|Monthly] Active Users

#Day [7|30] retention

A whopping number of app installs or usage typically means a few things:

– It suggests a meticulously designed, universal implementation, particularly on the Web and Android platforms. Navigating the intricate maze of diverse lower API and browser versions on these platforms is indeed a commendable achievement in itself.

– It signals an ability to function effectively across a spectrum of geographic locations, each with its unique internet connectivity, privacy/legal mandates, and phone manufacturers.

– It signals an ability to handle the fragmented landscape of Android hardware, as well as the multiple form factors existing on Apple’s platforms (macOS, tvOS, watchOS, etc.).

– It underscores the ability to iron out nuanced bugs on obscure devices and browser versions.

– It emphasizes the critical role in safeguarding the ecosystem’s health, for apps that are truly ubiquitous (i.e., billion installs), and potentially capable of causing system wide catastrophe.

– Contrary to popular belief, a billion installs doesn’t inherently measure a client-side engineer’s prowess in making burgeoning products.

– In a more light-hearted vein, it’s akin to building a backend API that serves (say) 500K queries per second. While it’s impressive, it’s not the lone marker of an engineer’s capability or the definitive gauge of the product’s overall vitality and growth trajectory.

– Without the trusty sidekicks we call quality metrics (outlined below), the #installs metric is a bit like a superhero without their cape. Sure, it’s flashy and might get you some street cred, but it’s hardly enough to truly save the day. Alone, it mostly just flaunts product growth and lacks the depth to genuinely showcase impact. So, let’s not send it into battle without its full armor, shall we?

– Mitra enhanced our text editor to function on niche Android OEMs, expanding our user base by 1% for our 100M DAU product.

– Akira’s commitment to web standards streamlined our transition from a 2K-user private preview to a 1M-user public preview across browsers.

– Mireya’s design of our core mobile functionality in C++ allowed us to launch the iOS app just two months after the Android, resulting in an additional 1M DAU.

– Ila’s deep knowledge of Apple platform APIs enabled us to roll out an app version for Apple Silicon to 100K users within two weeks of its WWDC announcement.

– We boosted our CSAT score in India by 3% thanks to Laya addressing specific bugs impacting thousands of Android users on devices without emulation capabilities.

– Amal’s optimization of local storage was key to having power users be highly engaged in the product for 30 consecutive days without running out of disk space.

App Health and Stability

Among all metrics that highlight the challenges of building client-side applications compared to backends or APIs, app health and stability stand out the most. Rolling back client-side applications is inherently difficult, and they often require a slower release cadence, especially outside of web environments. This sets a high standard for their quality, stability, and overall health. Additionally, the performance of client apps can subtly influence API backends. Factors like caching, retrying, and recovery in client apps can directly correlate with essential metrics in application backends.

 

Metric | What it is | What it’s not | Examples
#Crashes

A reduction of crashes (or greater than 99% crash-free users) represents:

– The demonstration of foresight and rigorous engineering to deliver top-notch client-side software, which is inherently more difficult to roll back than backend development.

– An adherence to the evolving best practices on the Web, as well as on Apple and Android platforms.

– The capability to construct software that operates seamlessly in diverse computing environments, especially for Android. This highlights the team’s technical versatility and in-depth understanding of various computing landscapes, ensuring optimal functionality and user experience across all platforms.

An application may also be crash-free simply because it is intrinsically simple. These metrics need to be calibrated against the number of flows and features that the product supports.

– Maya has reduced the crash rate in sign-in flow supporting 4 different identity providers from 10K crashes a day to under 1K.

Startup Latency
Application Size
Memory Footprint

Build duration

– Most production-grade applications have a complex start-up path that involves the initialization of not just the required business logic, but also a complex set of dependencies. Having a snappy start-up experience is critical for retaining users. Any substantial improvement to the P95 of cold start demonstrates meticulous profiling, building hypotheses, and rigorous experimentation.

– An application that has a reasonable memory footprint, indicates the adoption of engineering best practices such as loading entities efficiently and reusing objects when appropriate. This is especially critical on Android platforms that have a lot of lower-end devices.

– A developer pain point in native mobile development compared to web engineering is the relatively long compilation times. Engineering effort that can make a significant dent in bringing the compilation times down, will have a multiplicative effect in terms of developer productivity. Among other strategies, this can be done by caching build artifacts, auditing dependencies to eliminate irrelevant ones, and using dynamic linking of libraries.

 

– Alice reduced the P95 of iOS cold start by 25% in the past 6 months. Alice did this by carefully profiling the start-up path and deferring the initialization of libraries that were not needed for the initial screen to load.

– Yu rewrote the view model cache. This resulted in better memory utilization and reduced OOM crashes in the past 6 months by 15%.

#Hot Fixes

A modest number of hotfixes typically signals:

– Stable, high-caliber releases that stand robust against the tides of user demands and technical challenges, showcasing a commitment to delivering exceptional and reliable software solutions.

– Thoroughly conceived experiment flags, highlighting a strategic and considerate approach to feature testing and implementation, further strengthening the software’s resilience and user-centric design.

 

– Kriti designed an experimentation framework on the client that allowed us to remotely configure client parameters on the backend without re-releasing apps, while not violating App Store and Play Store policies. We’ve gone from doing 6 hotfixes per quarter on average to under 2.

#Errors on backend (4XX) (and other backend metrics)

Where relevant, a decrease in backend error rates could suggest:

– Corrections in client implementations, reducing faulty RPCs. This indicates improved system communication and coordination.

– Improved client configurability, allowing for parameter adjustments post-launch, ensuring better performance and responsiveness to project needs and issues.

 

– Indra went through a painstaking process to understand what scenarios caused clients to send incorrect parameters and spike 400s on our backends. This has reduced the false alarms on our on-call rotations by over 30%.

Product excellence

Client applications are the main touchpoint for users across a majority of online apps. While this section covering product excellence may appear as a “catch-all,” the aim is to emphasize the direct connection between quick releases, accessibility, prompt bug resolutions, and overall customer satisfaction.

 

Customer Focus / Product Metric | What it is | What it’s not
#ExternalIssuesFixed

While there’s a potential for this to be seen as a vanity metric, it typically carries substantial significance for open-source client apps confronted with issues reported by users. This metric underscores the importance of addressing and resolving these issues to maintain and enhance the project’s reliability and reputation in the open-source community.

– For teams used to declaring bug bankruptcy, this metric loses its efficacy. It falls short as a reliable measure of performance and improvement when bugs are regularly written off en masse without resolution.

– Additionally, the metric operates on an assumption of good faith: it is only meaningful if issues are reported, triaged, and closed honestly rather than to move the number.

#releases

Frequent releases can imply:

– Consistent bug fixes and improvements, showing a commitment to refining the product and ensuring its reliability and effectiveness.

– Enhanced releasability, overcoming historical challenges and demonstrating an improved and more efficient release process.

– Diligent efforts to stay in sync with the rest of the ecosystem, including dependencies and platform updates, ensuring the product remains up-to-date, secure, and compatible with other elements of the ecosystem.

– A significantly high number of releases can also stem from instability, indicating that frequent updates are needed to address ongoing issues and ensure the product works as expected.

– This perspective underscores the importance of balancing release frequency with product stability to avoid overloading the team and the end-users.

#NPS/CSAT

Customer satisfaction surveys, albeit generic, are often profoundly influenced by client applications. Apps stand as the first, most tangible, and recurrent interaction customers have with a service.

– A subpar experience can etch a negative impression that proves hard to erase.

– On the flip side, a stellar app experience can compensate for deficiencies in service features, performance, and pricing, leaving a positive and lasting impact on the customers, fostering loyalty and satisfaction.

 
#Accessibility / Usability metrics

– Efforts to build an accessible, inclusive product that can be used by people with disabilities in hearing, vision, mobility, or speech.

– Improved usability is often reflected in reduced support costs for the product.

 

Conclusion

For a while, the perception of being a client-side engineer was that you were not as hardcore as some of your backend counterparts. Backend engineers often avoided client-side opportunities, as client-side engineering was seen as a minor, easier form of software engineering that prioritized vanity over correctness and software quality. While this view has significantly shifted over the last five years with the rise of serverless applications and SaaS backends, remnants are still there.

Even as these perspectives are on the path to correction, it’s crucial for us as people managers to ensure that our personal biases do not impact our decision-making, especially when our decisions profoundly affect the careers and well-being of client-side engineers. The metrics discussed here aim to offer a foundational point for ensuring an organization that is more equitable to client-side engineers.


Optimizing Resource Utilization: The Benefits and Challenges of Bin Packing in Kubernetes

Key Takeaways

  • Challenges in bin packing include balancing density versus workload isolation and distribution, as well as the risks of overpacking a node, which can lead to resource contention and performance degradation.
  • Kubernetes provides scheduling strategies such as resource requests and limits, pod affinity and anti-affinity rules, and pod topology spread constraints.
  • Examples of effective bin packing in Kubernetes include stateless applications, database instances, batch processing, and machine learning workloads, where resource utilization and performance can be optimized through strategic placement of containers.
  • Best practices for bin packing include careful planning and testing, right-sizing nodes and containers, and continuous monitoring and adjustment.
  • Implementing bin packing in Kubernetes can also have a positive environmental impact by reducing energy consumption and lowering greenhouse gas emissions.

Given Kubernetes’ status as the de facto standard for container orchestration, organizations are continually seeking ways to optimize resource utilization in their clusters. One such technique is bin packing: the efficient allocation of resources within a cluster to minimize the number of nodes required for running a workload. Bin packing lets organizations save costs by reducing the number of nodes necessary to support their applications.

The concept of bin packing in Kubernetes involves strategically placing containers (the items) onto nodes (the bins) to maximize resource utilization while minimizing waste. When done effectively, bin packing can lead to more efficient use of hardware resources and lower infrastructure costs. This is particularly important in cloud environments, where infrastructure spend makes up a significant portion of IT expenses.
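To make the idea concrete, below is a minimal, self-contained sketch of classic first-fit-decreasing bin packing over CPU requests. It is not a Kubernetes scheduler, and the node capacity and container requests are made-up numbers; it only illustrates why tighter packing reduces the number of nodes needed.

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Containers (items) are placed onto nodes (bins) using first-fit-decreasing.
public class BinPackingSketch {

    public static void main(String[] args) {
        double nodeCpu = 4.0; // assumed node capacity: 4 vCPUs
        List<Double> requests = List.of(2.5, 1.0, 0.5, 1.5, 2.0, 0.5, 1.0, 3.0);

        List<Double> nodes = packFirstFitDecreasing(requests, nodeCpu);
        System.out.printf("Containers: %d, nodes used: %d%n", requests.size(), nodes.size());
        for (int i = 0; i < nodes.size(); i++) {
            System.out.printf("node-%d utilization: %.0f%%%n", i, 100 * nodes.get(i) / nodeCpu);
        }
    }

    // Sort requests in descending order, then place each one on the first node
    // with enough spare capacity, opening a new node only when nothing fits.
    static List<Double> packFirstFitDecreasing(List<Double> requests, double capacity) {
        List<Double> used = new ArrayList<>(); // used.get(i) = CPU already placed on node i
        List<Double> sorted = new ArrayList<>(requests);
        sorted.sort(Comparator.reverseOrder());
        for (double request : sorted) {
            boolean placed = false;
            for (int i = 0; i < used.size(); i++) {
                if (used.get(i) + request <= capacity) {
                    used.set(i, used.get(i) + request);
                    placed = true;
                    break;
                }
            }
            if (!placed) {
                used.add(request); // no existing node has room: provision a new one
            }
        }
        return used;
    }
}

With these illustrative numbers, the eight containers fit onto three fully utilized nodes instead of spilling across more.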

In this article, we will explore the intricacies of bin packing in Kubernetes, discuss the challenges and trade-offs associated with this approach, and provide examples and best practices for implementing bin packing in your organization.

Challenges of Bin Packing in Kubernetes

While bin packing in Kubernetes offers significant benefits in terms of resource utilization and cost savings, it also presents some challenges that need to be addressed.

Density vs. Workload Isolation and Distribution

One of the main issues when implementing bin packing is finding a balance between maximizing resource density and maintaining workload isolation while ensuring the distribution of workloads across systems and availability zones (AZs) for resilience against hardware failures. Packing containers tightly onto nodes can lead to better resource utilization, but it can also increase the risk of contention for shared resources, such as CPU and memory.

This can result in performance degradation and potentially affect the stability of the entire cluster. Moreover, excessive bin packing can contradict the concept of distribution, presenting dangers to the system’s ability to sustain hardware failures. Therefore, it is essential to apply bin packing strategies judiciously and only when the use case makes sense, taking into account both resource optimization and system resilience.

To further understand the implications of this trade-off, it’s worth considering the impact of increasing density on the fault tolerance of your cluster. When containers are packed tightly onto a smaller number of nodes, the failure of a single node can have a more significant impact on the overall health and availability of your applications. This raises the question: how can you strike a balance between cost savings and ensuring your workloads are resilient to potential failures?

Risks of Over-Centralizing Applications on a Node

The risk of excessively bin-packing applications onto a node is the opposite of the “best practice” of a distributed deployment. It is the classic risk-management mistake of putting all your eggs in one basket: if the node dies, a bigger chunk of your deployment goes down with it. On the one hand, you want to be as distributed as possible for the sake of resiliency; on the other hand, you want to keep your costs under control, and bin packing is a good solution for that. The magic is in finding the sweet spot between these considerations.

These issues become more pronounced when multiple containers vie for the limited resources available on a single node, such as memory or CPU, resulting in resource starvation and suboptimal application performance. Additionally, scaling the system in bursts rather than gradually can cause unwanted failures, further exacerbating these challenges. To manage this, it helps to set policy limits that ensure a reliable supply of resources to applications.

Another aspect to consider when overpacking a node is the potential effect on maintenance and updates. With more containers running on a single node, the impact of maintenance tasks or software updates can be magnified, possibly leading to more extended periods of downtime or reduced performance for your applications. How can you manage updates and maintenance without negatively affecting the performance of your workloads when using bin packing is a critical question to consider.

Scheduling Strategies to Address the Challenges

Kubernetes provides several scheduling strategies to help remediate issues related to bin packing:

  • Resource requests and limits let you configure the Kubernetes scheduler to consider the available resources on each node when making scheduling decisions. This enables you to place containers on nodes with the appropriate amount of resources.
  • Pod affinity and anti-affinity rules allow you to specify which nodes a pod should or should not be placed on based on the presence of other pods. This can help ensure that workloads are spread evenly across the cluster or grouped together on certain nodes based on specific requirements. For example, data-critical systems, such as those handling essential customer data for production functionality, need to be distributed as much as possible to enhance reliability and performance. This approach can reduce the risk of single points of failure and promote better overall system resilience.
  • Pod topology spread constraints enable you to control how pods are distributed across nodes, considering factors such as zone or region. By using these, you can ensure that workloads are evenly distributed, minimizing the risk of overloading a single node and improving overall cluster resilience.

By carefully considering and implementing these scheduling strategies, you can effectively address the challenges of bin packing in Kubernetes while maintaining optimal resource utilization and performance.
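As a rough illustration of how these hints are expressed in code, the sketch below builds a pod spec with resource requests and limits plus a zone-level topology spread constraint using the generated model classes of the official Kubernetes Java client (io.kubernetes:client-java). The values and labels are placeholders, and the exact fluent method names (for example, putRequestsItem and addTopologySpreadConstraintsItem) come from the generated models and may differ between client versions; the same fields are more often written directly in a pod manifest.

import io.kubernetes.client.custom.Quantity;
import io.kubernetes.client.openapi.models.V1Container;
import io.kubernetes.client.openapi.models.V1LabelSelector;
import io.kubernetes.client.openapi.models.V1ObjectMeta;
import io.kubernetes.client.openapi.models.V1Pod;
import io.kubernetes.client.openapi.models.V1PodSpec;
import io.kubernetes.client.openapi.models.V1ResourceRequirements;
import io.kubernetes.client.openapi.models.V1TopologySpreadConstraint;

public class SchedulingHintsSketch {

    public static V1Pod webPod() {
        // Requests tell the scheduler how much room the container needs; limits cap
        // what it may consume, which keeps tightly packed nodes stable.
        V1ResourceRequirements resources = new V1ResourceRequirements()
                .putRequestsItem("cpu", Quantity.fromString("250m"))
                .putRequestsItem("memory", Quantity.fromString("256Mi"))
                .putLimitsItem("cpu", Quantity.fromString("500m"))
                .putLimitsItem("memory", Quantity.fromString("512Mi"));

        // Spread replicas across zones so that dense packing on individual nodes
        // does not concentrate the whole workload in one failure domain.
        V1TopologySpreadConstraint spreadAcrossZones = new V1TopologySpreadConstraint()
                .maxSkew(1)
                .topologyKey("topology.kubernetes.io/zone")
                .whenUnsatisfiable("ScheduleAnyway")
                .labelSelector(new V1LabelSelector().putMatchLabelsItem("app", "web"));

        return new V1Pod()
                .metadata(new V1ObjectMeta().name("web").putLabelsItem("app", "web"))
                .spec(new V1PodSpec()
                        .addContainersItem(new V1Container()
                                .name("web")
                                .image("nginx:1.25")
                                .resources(resources))
                        .addTopologySpreadConstraintsItem(spreadAcrossZones));
    }
}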

Examples of Bin Packing in Kubernetes

There are various examples of how Kubernetes can effectively implement bin packing for different types of workloads, from stateless web applications to database instances and beyond. We’ll explore some of them below.

Stateless Applications

Kubernetes can pack multiple instances of stateless applications into a single node while ensuring that each instance has sufficient resources. By using resource requests and limits, you can guide the Kubernetes scheduler to allocate the appropriate amount of CPU and memory for each instance. As long as the instances have enough resources, they stay up and running, ensuring high availability for stateless applications such as web or other client-facing apps.

Database Instances

When dealing with databases, Kubernetes can effectively pack individual instances of different stateful applications into nodes to maximize throughput and minimize latency. By leveraging pod affinity rules, you can ensure that database instances are placed on nodes with the necessary volumes and proximity to other components, such as cache servers or application servers. This can help optimize resource usage while maintaining high performance and low latency for database operations.

Batch Processing and Machine Learning Workloads

Bin packing can also be beneficial for batch processing and machine learning workloads. Kubernetes can use pod topology spread constraints to ensure these workloads are evenly distributed across nodes, preventing resource contention and maintaining optimal performance.

Large Clusters with Many Nodes

In cases where a service needs to be distributed to a large number of nodes (e.g., 2,000 nodes), resource optimization remains a priority. While spreading these services out is essential for fault tolerance, bin packing should still be considered for the remaining services to increase the utilization of all nodes.

Kubernetes can manage this through topology spread configurations such as PodTopologySpreadArgs when specific node resources are used. Cluster admins and cloud providers should ensure nodes are provisioned accordingly to balance the spread-out services and the bin-packed services.

By understanding and applying these examples in your Kubernetes environment, you can leverage bin packing to optimize resource utilization and improve the overall efficiency of your cluster.

Cost Efficiency Benefits of Bin Packing in Kubernetes

By efficiently allocating resources within a cluster and minimizing the number of nodes necessary to support workloads, bin packing can help reduce your infrastructure costs. This is achieved by consolidating multiple containers onto fewer nodes, which reduces the need for additional hardware or cloud-based resources. As a result, organizations can save on hardware, energy, and maintenance.

In cloud environments, where infrastructure costs are a significant portion of IT expenses, the cost savings from bin packing can be particularly impactful. Cloud providers typically charge customers based on the number and size of nodes used, so optimizing resource utilization through bin packing can directly translate to reduced cloud infrastructure bills.

Best Practices for Bin Packing in Kubernetes

To fully harness the benefits of bin packing in Kubernetes, it’s essential to follow best practices to ensure optimal resource utilization while preventing performance problems. We highlight three below.

Careful Planning and Testing

Before implementing bin packing in your Kubernetes environment, it’s crucial to carefully plan and test the placement of containers within nodes. This may involve analyzing the resource requirements of your workloads, determining the appropriate level of density, and testing the performance and stability of your cluster under various scenarios. Additionally, setting hard limits for memory is essential, as memory is a non-compressible resource and should be allocated carefully to avoid affecting surrounding applications. It is also important to account for potential memory leaks, ensuring that one leak does not cause chaos within the entire system.

By taking the time to plan and test, you can avoid potential pitfalls associated with bin packing, such as resource contention and performance degradation.

Right Sizing Nodes and Containers

Properly sizing nodes and containers is a key aspect of optimizing resource utilization in your Kubernetes environment. To achieve this, first assess the resource requirements of your applications, taking into account CPU, memory, and storage demands. This information helps in determining the most suitable node sizes and container resource limits to minimize waste and maximize efficiency. It is crucial to size nodes and containers appropriately for the workload because if your containers are too large and take up a significant proportion of the node, then you won’t be able to fit additional containers onto the node. If you’re running a very large container that takes up 75% of every node, for example, it will essentially force 25% waste regardless of how many bin packing rules were set. The resources allocated to a container and the resources a machine offers are critical factors to consider when optimizing your Kubernetes environment.
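A quick back-of-the-envelope sketch of that effect, with made-up node and container sizes, shows how oversized containers strand capacity no matter how good the packing rules are:

public class RightSizingSketch {

    public static void main(String[] args) {
        double nodeCpu = 16.0; // assumed node size in vCPUs

        report(nodeCpu, 12.0); // one oversized container per node: 4 vCPUs stranded (25%)
        report(nodeCpu, 4.0);  // four right-sized containers per node: nothing stranded
    }

    // Show how many containers of a given request fit on one node and how much
    // capacity is stranded once nothing else can be placed.
    static void report(double nodeCpu, double containerCpu) {
        int fits = (int) (nodeCpu / containerCpu);
        double stranded = nodeCpu - fits * containerCpu;
        System.out.printf("request=%.0f vCPU -> %d per node, %.0f%% stranded%n",
                containerCpu, fits, 100 * stranded / nodeCpu);
    }
}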

Monitoring and Adjusting Over Time

Continuous monitoring and adjustment are essential for maintaining optimal resource utilization in your Kubernetes clusters. As workloads and requirements evolve, you may need to reassess your bin packing strategy to ensure it remains effective.

Regular monitoring can help you identify issues early on, such as resource contention or underutilized nodes, allowing you to make adjustments before a problem escalates.

Utilizing Kubernetes Features for Bin Packing

  • Resource quotas allow you to limit the amount of resources a namespace can consume, ensuring that no single workload monopolizes the available resources in your cluster.
  • Resource requests and limits for your pods, already noted above, let you guide the Kubernetes scheduler to place containers on nodes with the appropriate amount of resources. This helps ensure workloads are allocated efficiently and resource contention is minimized.

One more aspect to consider is the environmental impact of your infrastructure. By optimizing resource utilization through bin packing, you can potentially reduce your organization’s carbon footprint. Running fewer nodes means consuming less energy and generating less heat, which can contribute to lower greenhouse gas emissions and a smaller environmental impact. This raises an important question: How can businesses balance their goals for cost efficiency and performance with their social responsibility to reduce their environmental footprint?

Conclusion

Bin packing in Kubernetes plays a crucial role in optimizing resource utilization and reducing infrastructure costs. But it’s also important to achieve the right balance between efficiency and performance when optimizing Kubernetes resources.

By strategically allocating resources within a cluster, organizations can minimize the number of nodes required to run workloads, ultimately resulting in lower spend and more efficient infrastructure management.

However, as discussed, bin packing comes with performance-related challenges and trade-offs, along with best practices for employing it effectively in your Kubernetes environment. By understanding and leveraging these techniques, you can maximize resource utilization in your cluster, save on infrastructure costs, and improve overall efficiency.

Distributed Transactions at Scale in Amazon DynamoDB

Key Takeaways

  • NoSQL cloud database services are popular for their key-value operations, high availability, high scalability, and predictable performance. These characteristics are generally considered to be at odds with support for transactions. DynamoDB supports transactions, and it does so without compromising on performance, availability, or scale.
  • DynamoDB added transactions using a timestamp ordering protocol while exploiting the semantics of a key-value store to achieve low latency for both transactional and non-transactional operations.
  • Amazon DynamoDB introduced two novel single-request operations: TransactGetItems and TransactWriteItems. These operations allow the execution of a set of operations atomically and serializably for any items in any table. Transactional operations on Amazon DynamoDB provide atomicity, consistency, isolation, and durability (ACID) guarantees within the region.
  • The results of experiments against a production implementation demonstrate that distributed transactions with full ACID properties can be supported without compromising on performance, availability, or scale.

Can we support transactions at scale with predictable performance? In this article, I explore why transactions are considered at odds with scalability for NoSQL databases and walk you through the journey of how we added transactions to Amazon DynamoDB.

NoSQL Databases

NoSQL databases, like DynamoDB, have gained adoption because of their flexible data model, simple interface, scale, and performance. Core features of relational databases, including SQL queries and transactions, were sacrificed to provide automatic partitioning for unlimited scalability, replication for fault-tolerance, and low latency access for predictable performance.

Amazon DynamoDB (not to be confused with Dynamo) powers applications for hundreds of thousands of customers and multiple high-traffic Amazon systems including Alexa, the Amazon.com sites, and all Amazon fulfillment centers.

In 2023, over the course of Prime Day, Amazon systems made trillions of calls to the DynamoDB API, and DynamoDB maintained high availability while delivering single-digit millisecond responses and peaking at 126 million requests per second.

When customers of DynamoDB requested ACID transactions, the challenge was how to integrate transactional operations without sacrificing the defining characteristics of this critical infrastructure service: high scalability, high availability, and predictable performance at scale.

To understand why transactions are important, let’s walk through an example of building an application without support for transactions in a NoSQL database, using only basic Put and Get operations.

Transactions

A transaction is a set of read and write operations that are executed together as a single logical unit. Transactions are associated with ACID properties:

  1. Atomicity ensures that either all or none of the operations in the transaction are executed, providing an all-or-nothing semantic.
  2. Consistency ensures that the operation results in a consistent and correct state for the database.
  3. Isolation allows multiple clients to read or write data concurrently, ensuring that concurrent operations are serialized.
  4. Durability guarantees that any data written during the transaction remains permanent.

Why do we need transactions in a NoSQL database? The value of transactions lies in their ability to help construct correct and reliable applications that need to maintain multi-item invariants. Invariants of this nature are commonly encountered in a wide range of applications. Imagine an online e-commerce application where a user, Mary, can purchase a book and a pen together as a single order. Some invariants in this context could be – a book cannot be sold if it is out of stock, a pen cannot be sold if it is out of stock, and Mary must be a valid customer in order to purchase both a book and a pen.

Figure 1: Simple e-commerce scenario

However, maintaining these invariants can be challenging, particularly when multiple instances of an application run in parallel and access the data concurrently. Furthermore, the task of preserving multi-item invariants becomes challenging in the event of failures such as node failures. Transactions provide a solution for applications to address both challenges of concurrent access and partial failures, alleviating the need for developers to write excessive amounts of additional code to deal with these two challenges.

Imagine that you are developing a client-side e-commerce application relying on a database without transactional support to create Mary’s order. Your application has three tables – the inventory, the customer, and the orders tables. When you want to execute a purchase, what do you need to consider?

Figure 2: Three separate NoSQL tables for inventory, customers, and orders

First, you need to ensure that Mary is a verified customer. Next, you need to check to ensure the book is in stock and in a sellable condition. You also need to do the same checks for the pen, then you need to create a new order and update the status and count of the book and pen in the inventory. One way to achieve this is by writing all the necessary logic on the client side.

The crucial aspect is that all the operations must execute atomically to ensure that the final state has the correct values and no other readers see inconsistent state of the database while the purchase order is getting created. Without transactions, if multiple users access the same data simultaneously, there is a possibility of encountering inconsistent data. For instance, a book might be marked as sold to Mary, but the order creation could fail. Transactions provide a means to execute these operations as a logical unit, ensuring that they either all succeed or all fail, while preventing customers from observing inconsistent states.

Figure 3: What if we experience a crash without transactions?

Building an application without transactions involves navigating various other potential pitfalls, such as network failures and application crashes. To mitigate these challenges, it becomes necessary to implement additional client-side logic for robust error handling and resilience. The developer needs to implement rollback logic to delete unfinished transactions. Multi-user scenarios introduce another layer of complexity, requiring the application to ensure that the data stored in the tables is consistent across all users.
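A conceptual sketch of that client-side choreography is shown below. The key-value interface, table names, and status values are invented for illustration; the comments mark the points where concurrent access or a crash leaves the tables inconsistent.

// What the application must do itself when the database offers only Get and Put.
public class OrderWithoutTransactionsSketch {

    interface KeyValueStore {
        String get(String table, String key);
        void put(String table, String key, String value);
    }

    void purchase(KeyValueStore db, String customerId, String orderId) {
        // 1. Verify the customer.
        if (!"VERIFIED".equals(db.get("customers", customerId))) return;

        // 2. Check stock for both items; another client may buy either of them
        //    right after these reads, invalidating the checks.
        if (!"IN_STOCK".equals(db.get("inventory", "book"))) return;
        if (!"IN_STOCK".equals(db.get("inventory", "pen"))) return;

        // 3. Create the order, then update the inventory. A crash between any of
        //    these writes leaves the tables inconsistent, so the application must
        //    also detect and roll back half-finished orders.
        db.put("orders", orderId, "book,pen");
        db.put("inventory", "book", "SOLD");
        db.put("inventory", "pen", "SOLD");
    }
}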

Transactions and NoSQL Concerns

There are often concerns about the trade-offs that come with implementing a transaction system in a database. NoSQL databases are expected to provide low-latency performance and scalability, often offering only Get and Put operations with consistent latency.

Figure 4: Can transactions provide predictable performance?

Many NoSQL databases do not provide transactions, with common concerns being breaking non-transactional workloads, the complexity of the APIs, and system issues such as deadlocks, contention, and interference between non-transactional and transactional workloads. Some databases attempted to address these concerns by offering restricted features, such as weaker isolation levels, or by limiting the scope of transactions so that they execute within a single partition. Others enforce constraints on the primary or hash key or require upfront identification of all partitions expected to be part of a transaction.

These restrictions are designed to make the system more predictable and reduce complexity, but they come at the expense of scalability. As the database grows and splits into multiple partitions, restricting a transaction’s data to a single partition can lead to availability issues.

DynamoDB Transaction Goals

When we set out to add transaction support to DynamoDB, the team aimed to provide customers with the capability to perform atomic and serializable execution of operations on items across tables within a region, with predictable performance and no impact on non-transactional workloads.

Customer Experience

Let’s focus on the customer experience and explore the options for offering transaction support in DynamoDB. Traditionally, transactions are initiated with a “begin transaction” statement and concluded with a “commit transaction” statement. In between, customers can issue any number of Get and Put operations. In this approach, existing single-item operations can simply be treated as implicit transactions consisting of a single operation. Isolation can be ensured with two-phase locking, while atomicity is achieved through two-phase commit.

However, DynamoDB is a multi-tenant system, and allowing long-running transactions could tie up system resources indefinitely. Enforcing the full transactional commit protocol for singleton Get and Put operations would have an adverse impact on performance for current customers who do not intend to utilize transactions. In addition, locking introduces the risk of deadlocks, which can significantly impact system availability.

Instead, we introduced two novel single-request operations – TransactGetItems and TransactWriteItems. These operations are executed atomically and in a serializable order with respect to other DynamoDB operations. TransactGetItems is designed for read-only transactions which retrieve multiple items from a consistent snapshot. This means that the read-only transaction is serialized with respect to other write transactions. TransactWriteItems is a synchronous and idempotent write operation that allows multiple items to be created, deleted, or updated atomically in one or more tables.

Such a transaction may optionally include one or more preconditions on the current values of the items. Preconditions enable checking for specific conditions on item attributes, such as existence, specific values, or numerical ranges. DynamoDB rejects the TransactWriteItems request if any of the preconditions are not met. Preconditions can be added not only for items that are modified but also for items that are not modified in the transaction.

These operations don’t restrict concurrency, don’t require versioning, don’t impact the performance of singleton operations, and allow for optimistic concurrency control on individual items. All transactions and singleton operations are serialized to ensure consistency. With TransactGetItems and TransactWriteItems, DynamoDB provides a scalable and cost-effective solution that meets ACID compliance.

Consider another example that shows the use of transactions in the context of a bank money transfer. Let’s assume Mary wants to transfer funds to Bob. Traditional transactions involve reading Mary and Bob’s account balances, checking funds availability, and executing the transfer within a TxBegin and TxCommit block. In DynamoDB, you can accomplish the same transactional behavior with a single request: the balance check becomes a condition expression, and the transfer is performed in one TransactWriteItems call, eliminating the need for TxBegin and TxCommit.
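As a rough sketch of what that single request can look like with the AWS SDK for Java 2.x, the example below debits Mary only if her balance covers the amount, credits Bob in the same atomic request, and attaches a client request token so that retries stay idempotent. The Accounts table and its attribute names are hypothetical.

import java.util.Map;
import java.util.UUID;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.TransactWriteItem;
import software.amazon.awssdk.services.dynamodb.model.TransactWriteItemsRequest;
import software.amazon.awssdk.services.dynamodb.model.TransactionCanceledException;
import software.amazon.awssdk.services.dynamodb.model.Update;

public class MoneyTransferSketch {

    public static void main(String[] args) {
        DynamoDbClient dynamoDb = DynamoDbClient.create();
        Map<String, AttributeValue> amount =
                Map.of(":amount", AttributeValue.builder().n("100").build());

        // Debit Mary only if her balance covers the amount (precondition).
        Update debitMary = Update.builder()
                .tableName("Accounts") // hypothetical table
                .key(Map.of("AccountId", AttributeValue.builder().s("Mary").build()))
                .updateExpression("SET Balance = Balance - :amount")
                .conditionExpression("Balance >= :amount")
                .expressionAttributeValues(amount)
                .build();

        // Credit Bob as part of the same atomic, serializable request.
        Update creditBob = Update.builder()
                .tableName("Accounts")
                .key(Map.of("AccountId", AttributeValue.builder().s("Bob").build()))
                .updateExpression("SET Balance = Balance + :amount")
                .expressionAttributeValues(amount)
                .build();

        TransactWriteItemsRequest transfer = TransactWriteItemsRequest.builder()
                .transactItems(
                        TransactWriteItem.builder().update(debitMary).build(),
                        TransactWriteItem.builder().update(creditBob).build())
                // The client token makes retries of the same transfer idempotent.
                .clientRequestToken(UUID.randomUUID().toString())
                .build();

        try {
            dynamoDb.transactWriteItems(transfer);
        } catch (TransactionCanceledException e) {
            // Thrown when a precondition fails, e.g. insufficient funds.
            System.err.println("Transfer canceled: " + e.getMessage());
        }
    }
}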

Transactions High Level Architecture

To better understand how transactions were implemented, let’s delve into the workflow of a DynamoDB request. When an application requests a Put/Get operation, the request is routed to a request router randomly selected by a front-end load balancer. The request router leverages a metadata service to map the table name and primary key to the set of storage nodes that store the item being accessed.

Figure 5: The router

Data in DynamoDB is replicated across multiple availability zones, with one replica serving as the leader. In the case of a Put operation, the request is routed to the leader storage node, which then propagates the data to other storage nodes in different availability zones. Once a majority of replicas have successfully written the item, a completion response is sent back to the application. Delete and update operations follow a similar process. Gets are similar, except they are processed by a single storage node. In the case of consistent reads, the leader replica serves the read request; for eventually consistent reads, any of the three replicas can serve the request.

To implement transactions, a dedicated fleet of transaction coordinators was introduced. Any transaction coordinator in the fleet can take responsibility for any transaction. When a transactional request is received, the request routers perform the needed authentication and authorization of the request and forward it to one of the transaction coordinators. These coordinators handle the routing of requests to the appropriate storage nodes responsible for the items involved in the transaction. After receiving the responses from the storage nodes, the coordinators generate a transactional response for the client, indicating the success or failure of the transaction.

Figure 6: The transaction coordinator

The transaction protocol is a two-phase process to ensure atomicity. In the first phase, the transaction coordinator sends a Prepare message to the leader storage nodes for the items being written. Upon receiving the Prepare message, each storage node verifies that the preconditions for its items are satisfied. If all the storage nodes accept the Prepare, the transaction proceeds to the second phase.

In this phase, the transaction coordinator commits the transaction and instructs the storage nodes to perform their writes. Once a transaction enters the second phase, it is guaranteed to be executed in its entirety exactly once. The coordinator retries each write operation until all writes succeed. Since the writes are idempotent, the coordinator can safely resend a write request in case of scenarios like encountering a timeout.

Figure 7: When a transaction fails

If the Prepare message is not accepted by any of the participating storage nodes, then the transaction coordinator will cancel the transaction. To cancel, the transaction coordinator sends a Release message to all the participating storage nodes and sends a response to the client, indicating that the transaction has been canceled. Since no writes occur during the first phase, there is no need for a rollback process.
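The prepare/commit-or-release flow described above can be summarized in a small conceptual sketch. It mirrors the narrative only and is not DynamoDB’s actual implementation; the interface is invented for illustration.

import java.util.List;

public class TransactionCoordinatorSketch {

    interface StorageNode {
        boolean prepare(long txId);      // check preconditions; accept or reject
        boolean commitWrite(long txId);  // idempotent, so it is safe to resend on timeout
        void release(long txId);         // nothing was written, so no rollback is needed
    }

    boolean execute(long txId, List<StorageNode> participants) {
        // Phase 1: every leader storage node must accept the Prepare.
        for (StorageNode node : participants) {
            if (!node.prepare(txId)) {
                participants.forEach(n -> n.release(txId));
                return false; // transaction canceled and reported back to the client
            }
        }
        // Phase 2: once all Prepares are accepted, the transaction is guaranteed to
        // complete exactly once; each idempotent write is retried until it succeeds.
        for (StorageNode node : participants) {
            while (!node.commitWrite(txId)) {
                // e.g. a timeout: resend the same write
            }
        }
        return true;
    }
}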

Transactions Recovery

To ensure atomicity of transactions and ensure completion of transactions in the event of failures, coordinators maintain a persistent record of each transaction and its outcome in a ledger. Periodically, a recovery manager scans the ledger to identify transactions that have not yet been completed. Such transactions are assigned to a new transaction coordinator which resumes execution of the transaction protocol. It is acceptable for multiple coordinators to work on the same transaction simultaneously, as the commit and release operations are idempotent.

Figure 8: The transaction coordinator and failures

Once the transaction has been successfully processed, it is marked as completed in the ledger, indicating that no further actions are necessary. To support idempotent TransactWriteItems requests, the transaction record is kept in the ledger for 10 minutes after completion and then purged. If a client reissues the same request within this 10-minute window, the information is looked up from the ledger so the request is treated idempotently.

Figure 9: The transaction coordinator and the ledger

Ensuring Serializability

Timestamp ordering is used to define the logical execution order of transactions. Upon receiving a transaction request, the transaction coordinator assigns a timestamp to the transaction using the value of its current clock. Once a timestamp has been assigned, the nodes participating in the transaction can perform their operations without coordination. Each storage node is responsible for ensuring that requests involving its items are executed in the proper order and for rejecting conflicting transactions that arrive out of order. If each transaction executes at its assigned timestamp, serializability is achieved.

Figure 10: Using a timestamp-based ordering protocol

To handle the load from transactions, a large number of transaction coordinators operate in parallel. To prevent unnecessary transaction aborts due to unsynchronized clocks, the system uses a time sync service provided by AWS to keep the clocks in coordinator fleets closely in sync. However, even with perfectly synchronized clocks, transactions can arrive at storage nodes out of order due to delays, network failures, and other issues. Storage nodes deal with transactions that arrive in any order using stored timestamps.
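A drastically simplified sketch of that idea: each item remembers the timestamp of the last write it accepted, and a Prepare that carries an older timestamp is rejected. The real acceptance rules are richer than this; the sketch only illustrates how a storage node can use stored timestamps to handle out-of-order arrivals.

import java.util.HashMap;
import java.util.Map;

public class TimestampOrderingSketch {

    private final Map<String, Long> lastAcceptedWrite = new HashMap<>();

    // Accept the write only if it does not violate the logical order defined
    // by the coordinator-assigned timestamps.
    public synchronized boolean acceptWrite(String itemKey, long txTimestamp) {
        long lastWrite = lastAcceptedWrite.getOrDefault(itemKey, Long.MIN_VALUE);
        if (txTimestamp <= lastWrite) {
            return false; // arrived out of order relative to an already accepted write
        }
        lastAcceptedWrite.put(itemKey, txTimestamp);
        return true;
    }
}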

TransactGetItems

The TransactGetItems API works similarly to the TransactWriteItems API but does not use the ledger to avoid latency and cost. TransactGetItems implements a two-phase write-less protocol for executing read transactions. In the first phase, the transaction coordinator reads all the items in the transaction’s read-set. If any of these items are being written by another transaction, then the read transaction is rejected; otherwise, the read transaction moves to the second phase.

In its response to the transaction coordinator, the storage node not only returns the item’s value but also its current committed log sequence number (LSN), representing the last acknowledged write by the storage node. In the second phase, the items are re-read. If there have been no changes to the items between the two phases, the read transaction returns successfully with fetched item values. However, if any item has been updated between the two phases, the read transaction is rejected.
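Conceptually, the write-less read protocol boils down to two read passes whose committed LSNs must match. The types and method names below are invented for illustration, and the sketch omits the phase-one rejection of items that are currently being written by another transaction.

import java.util.ArrayList;
import java.util.List;

public class ReadTransactionSketch {

    record VersionedItem(String key, String value, long committedLsn) {}

    interface StorageNode {
        VersionedItem read(String key); // returns the value plus its committed LSN
    }

    List<String> transactGet(StorageNode node, List<String> keys) {
        // Phase 1: read each item along with its committed log sequence number.
        List<VersionedItem> firstPass = new ArrayList<>();
        for (String key : keys) {
            firstPass.add(node.read(key));
        }
        // Phase 2: re-read; if any LSN moved, something changed in between.
        List<String> values = new ArrayList<>();
        for (VersionedItem before : firstPass) {
            VersionedItem after = node.read(before.key());
            if (after.committedLsn() != before.committedLsn()) {
                throw new IllegalStateException("Item changed between phases; read transaction rejected");
            }
            values.add(before.value());
        }
        return values;
    }
}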

Transactional vs Non-transactional Workloads

To ensure no performance degradation for applications not using transactions, non-transactional operations bypass the transaction coordinator and the two-phase protocol. These operations are directly routed from request routers to storage nodes, resulting in no performance impact.

Transaction Goals Revisited

What about the scalability concerns we raised at the beginning? Let’s see what we achieved by adding transactions to DynamoDB:

  • Traditional Get/Put operations have not been affected and have the same performance as non-transactional workloads.
  • Read transactions (TransactGetItems) avoid the ledger entirely, keeping their latency and cost low.
  • All operations maintain the same latency as the system scales.
  • By utilizing single-request transactions and timestamp ordering, we get both transactions and scalability.

Figure 11: Predictable latency for transactions

Best Practices

What are the best practices for using transactions on DynamoDB?

  1. Idempotent write transactions: When making a TransactWriteItems call, you have the option to include a client token to ensure the request is idempotent. Incorporating idempotence into your transactions helps prevent potential application errors in case the same operation is inadvertently submitted multiple times. This feature is available by default when utilizing the AWS SDKs.
  2. Auto-scaling or on-demand: It is recommended to enable auto-scaling or utilize on-demand tables. This ensures that the necessary capacity is available to handle the transaction workload effectively.
  3. Avoid transactions for bulk loading: For bulk loading purposes, it is more cost-effective and efficient to utilize the DynamoDB bulk Import feature instead of relying on transactions.

DynamoDB transactions have been greatly influenced by the invaluable feedback of our customers, who inspire us to innovate on their behalf. I am grateful to have such an outstanding team by my side throughout this journey. Special thanks to Elizabeth Solomon, Prithvi Ramanathan, and Somu Perianayagam for reviewing this article and sharing their feedback to refine it. You can learn more about DynamoDB in the paper published at USENIX ATC 2022 and about DynamoDB transactions in the paper published at USENIX ATC 2023.

The False Dichotomy of Monolith vs. Microservices

Key Takeaways

  • Microservices are the cure to, rather than the cause of, complexity. All applications will become complex; beyond a certain point, microservices will help us manage that complexity.
  • Microservices come with costs and benefits. If the benefits don’t outweigh the costs, you won’t have a good time with microservices.
  • There is no such thing as monolith vs. microservices. There is actually a spectrum of possibilities between them. If you have pegged yourself at either extreme of the spectrum, you are missing out on the wide variety of architectures in the middle.
  • We can stop our journey to microservices somewhere in the middle of the spectrum, what I like to call the hybrid model. At this point, we might have some big services mixed up with some smaller services. We can have the best of both worlds: the simplicity and convenience of the monolith combined with the flexibility and scalability of microservices.
  • We should stop talking about monolith vs. microservice and instead have a more nuanced debate about right-sized services.

The ongoing war: monolith vs. microservices

“In many ways, microservices is a zombie architecture. Another strain of an intellectual contagion that just refuses to die. It’s been eating brains since the dark days of J2EE (remote server beans, anyone??) through the WS-Deathstar nonsense, and now in the form of microservices and serverless.”
— David Heinemeier Hansson (source)

With the recent blog post from AWS saying they have ditched microservices and returned to the monolith, the old war of monolith vs. microservices has reignited.

What’s your position on this? Are you team microservices or team monolith? What if I told you the distinction was something of a fantasy and that people are fighting over a fiction: microservices vs. monolith is just one part of the bigger story.

The article from AWS has been taken as evidence that the company (as a longtime proponent of microservices) has backflipped on microservices and gone back to the monolith.

Despite the title of their blog post being calculated to get attention, the article seems to be about their conversion from functions as a service to what is now arguably a microservices architecture, if not a distributed application with services that are larger than micro (however you define micro).

But the point I’d like to make is that it doesn’t really matter. This is just one team at AWS acknowledging that their first attempt at an architecture didn’t work out (over time), so they tried a different architecture, and it worked better. But so what? From what I have seen in my career, this is just the normal way that good software development should work.

We all want to focus on what’s most important: doing the right thing for our customers. Taking sides in the debate of microservices vs. monolith gets in the way of that. Sometimes, we need microservices. Sometimes, we need a monolith. (I’m not yet convinced I’ll ever need FaaS, but I’m keeping an open mind.) Most of the time, we are better off somewhere between these extremes.

Why do we fear microservices?

Sure, microservices are more difficult to work with than a monolith — I’ll give you that. But that argument doesn’t pan out once you’ve seen a microservices architecture with good automation. Some of the most seamless and easy-to-work-with systems I have ever used were microservices with good automation. On the other hand, one of the most difficult projects I have worked on was a large old monolith with little to no automation. We can’t assume we will have a good time just because we choose monolith over microservices.

Is the fear of microservices a backlash to the hype? Yes, microservices have been overhyped. No, microservices are not a silver bullet. Like all potential solutions, they can’t be applied to every situation. When you apply any architecture to the wrong problem (or worse, are forced to apply the wrong architecture by management), I can understand why you might passionately hate that architecture.

Is some of the fear from earlier days when microservices were genuinely much more difficult? Ten years ago, microservices did make development significantly more difficult. But the tools and platforms have come a long way since then. It’s easier than ever before to create good automation that makes working with microservices a much more seamless and enjoyable experience.

Maybe some of the fear comes from the perceived complexity. I do think this is a big part of it. People naturally fear (or at least avoid) complexity. I say perceived complexity because it’s not just microservices that get complex. Every monolith will become complex as well — you just have to give it some time. Whereas with microservices, the complexity is just out there for all to see, and we have to deal with it early. In my book, Bootstrapping Microservices, I call this bringing the pain forward in the development process so that it’s easier and cheaper to deal with.

Unfortunately, it’s not possible to hide from complexity in modern software development. Our applications are growing larger and more complex — even the humble monolith is destined to become more complex than any single person can handle.

We can’t avoid complexity in large-scale modern development. We need tools to help us manage the complexity so that it doesn’t slow our development process or overwhelm us.

Why do microservices seem so difficult?

“You must be this tall to use microservices.”
— Martin Fowler (source)

Building distributed applications, not just microservices, requires higher technical proficiency. Managing a fleet of many services instead of just one means we must have tools to automate system management. There’s also a lot to keep track of, just trying to understand what our services are doing. The communication between services becomes exponentially more difficult to understand the more of them we have.

Suppose you are a small team or small project. In that case, if you are applying microservices to a situation where they aren’t warranted, or if you aren’t willing to pay down the investment in skills and technology required to build and run the distributed system, you can’t expect to have a good experience with it.

Another possible pain point is not aligning your services appropriately with the domain. I have seen microservices applications aligned with technological rather than business needs — leading to too many services and an avoidably overwhelming system to manage. There is such a thing as making your services too small, unnecessarily increasing the complexity and difficulty of managing the system.

If you can’t align your architecture correctly with your domain, you will have massive problems irrespective of whether you are using a monolith or microservices — but those problems will be massively amplified the more services you have. Microservices aren’t just good for scaling performance; they will also scale up whatever problems you already have.

Is it just a scaling problem?

“If you can’t build a monolith, what makes you think microservices are the answer?”
— Simon Brown (source)

Is the real problem with microservices just that they scale up our existing problems?

A bad microservices implementation will be at least X times worse than a bad monolith, where X is the number of services in your distributed application. It’s even worse than that, given the exponentially increasing communication pathways in a distributed application.

If you don’t have the tools, techniques, automation, process, and organization that work for your monolith, what makes you think you can scale up to microservices? You need to get your house in order before you can scale up.

Microservices don’t just scale for performance and the dev team; they also scale in difficulty. If you struggle to build and maintain a monolith, scaling to microservices isn’t going to help you.

A microservices application is just a monolith, but with the number of services dialed up and the sizes of the services dialed down. If you are struggling with a monolith and think microservices are the answer, please think again.

Microservices come with benefits, but they aren’t without their costs.

The cost of microservices

“Microservices are not a free lunch.”
— Sam Newman (from Building Microservices)

What are microservices really about? Why would we divide our application into separate services?

There are a bunch of well-known benefits:

  • Scalability
  • Fault tolerance
  • Independent (and less risky) deployment for rapid development cycles
  • Developer empowerment
  • Designing for disposability
  • Managing complexity

But the benefits aren’t the whole story. There are also costs that must be paid:

  • A higher level of technical skill
  • Better automation, management, and observability systems
  • Dealing with difficulty that scales along with the system

For any tool, technology, architecture, or whatever we want to use, we must ask ourselves the question: Do the benefits outweigh the costs? When the benefits outweigh the costs, you will have a good experience using that technology. When they don’t, you will have a bad time.

Managing complexity

“Microservices enable the continuous deployment of large, complex applications.”
— Chris Richardson (source)

Microservices have a ton of benefits, but the real reason we should use them is because they can help us manage the growing complexity of our application.

That’s right, you heard it here: microservices are not the cause of, but the cure to, complexity.

All applications will become complex; we can’t avoid that even if we are building a monolith. But microservices give us the tools to break up that complexity into smaller, simpler, and more manageable chunks.

Microservices help us manage complexity by breaking it into simple yet isolated pieces. Yes, we can do this with a monolith, but it takes a disciplined and proactive team to keep the design intact and stop it from degenerating into a big ball of mud.

We can use microservices to create abstractions and componentize our software. Of course, we can do this kind of thing with a monolith. Still, microservices also give us hard and difficult-to-breach boundaries between components, not to mention other important benefits like independent deployments and fault isolation.

The spectrum of possibilities

“There is not one architectural pattern to rule them all.”
— Dr. Werner Vogels (source)

I asked you a question at the start of this article. Are you team microservices or team monolith?

Returning to this article’s title, it’s not a one-or-the-other choice. There’s a sliding scale from one big service (the monolith) to many tiny services (microservices) with many other viable choices in between.

It’s not just monolith vs. microservices; there’s a whole spectrum of different possibilities. If you fix yourself to either team monolith or team microservices, you are missing out on the rich variety of architectures in between.

You don’t have to artificially align yourself at either end of this spectrum. You don’t even have to peg yourself to any particular position within it. Despite what some people want you to think, there is no right position. The location you choose must be appropriate for your team, business, project, or customers. Only you can decide where you should be on the spectrum of possible architectures.

A diminishing return on investment

The benefits from microservices will come as you move to the right on the spectrum of possibilities. But moving to the right also has costs and difficulties. We need to be sure that the cost of moving toward microservices is one that we are willing to pay.

If you aren’t trying to manage complexity, don’t need the other benefits of microservices, or are struggling to manage the automation and technology for a single service, you should be sticking as close as possible to the monolith on the left side of the spectrum. To the extent that you need microservices, you should be moving closer to microservices on the right side of the spectrum.

It might not be worth pushing all the way to the developer’s utopia of microservices due to a diminishing return on investment, but going part of the way there can yield a high return on investment.

It’s important to realize at this point that we don’t need to reach (what I like to call) the developer’s utopia of microservices to start getting the benefits of them. Any amount of movement we make toward the right-hand side of the spectrum will bring tangible benefits even if we don’t reach all the way to the other side!

There are good reasons why we don’t want to push all the way to perfect microservices. (For a start, who gets to decide what perfect means?) As we start pushing toward the right, we’ll start to see big payoffs. But as we continue to push further, there will be a diminishing return on investment. The more we push toward smaller services, the more the cost will outweigh the benefits. In the real world (it’s messy and complicated out there), it’s difficult, not to mention unnecessary, to achieve anyone’s notion of perfect microservices. But that doesn’t mean moving in that general direction doesn’t help.

The hybrid model

If we don’t need to push all the way to microservices, then where do we stop? The answer is somewhere in the middle where there is a set of trade-offs that improve our development speed and capability, and where the cost of development does not exceed the benefits.

I like to think of somewhere in the middle as the best of both worlds. Yes, we can have a monolith (or multiple monoliths) surrounded by a constellation of microservices. Am I some kind of heathen that I take this pragmatic position? The practical benefit is that I can mix and match the monolith’s benefits with the microservices’ benefits. The convenience and simplicity of the monolith for much of the codebase, and the flexibility, scalability, and other benefits of microservices that I can leverage when I need them make for an ideal environment. I can also incrementally excavate individual microservices from the monolith whenever it becomes apparent that certain features or tasks can benefit from doing so.

The hybrid model isn’t a new idea. It is what the real world often looks like (somewhere in the middle), despite the arguments that continue to rage online.

David Heinemeier Hansson (very much in team monolith) even seems to like the idea, which he calls The Citadel Architecture.

Does size really matter?

“Perhaps ‘micro’ is a misleading prefix here. These are not necessarily ‘small’ as in ‘little.’ Size doesn’t actually matter.”
— Ben Morris (source)

The smaller our services, the more micro they are, the less useful they will be, and the more of them we’ll need. The level of difficulty goes up as we reduce the size of our services and increase the number of them.

Maybe we should stop focusing on the “micro” part of microservices. I think it’s causing people to make their services way too small — and that’s a guarantee to have a bad time with microservices.

I’m not sure how we even got so fixated on making them as small as possible. The intention is to be able to split up our software into pieces, separating the responsibilities, where each of the parts is simpler than the whole, thus making it easier to manage the overall complexity of the system. But when we make our services too small, we risk being swamped by the complexity instead of managing it.

Even though everyone seems to have their own idea of how big or small a microservice should be, the reality is that there is no fixed size that a microservice should be. The “microservice police” aren’t out patrolling for offenders.

So let’s stop arguing about the size of our services and instead start talking about “right-sized” services, that is to say, whatever the right size is for our situation — monolith-sized or somewhere over on the smaller end of the spectrum. Our services, no matter how big or small, should be organized around our business and appropriate to our domain. The size is almost an afterthought; it’s the overall organization that is important.

It’s not about making our services as small as they can be. Beyond a certain point, making your services smaller is counterproductive. The smaller they are, the more they must communicate with the rest of the system to get work done. The more they communicate, the more we’ll pay the network transfer cost. Not to mention that it becomes much more difficult to understand who is talking to whom. We need a good balance between service size and how chatty our services are (thanks to Damian Maclennan for bringing me the term chatty).

Choose a size for your services that’s meaningful to you. It doesn’t matter if some services are bigger than others. Please don’t let your OCD decide on service size; that can get in the way of what could have been great architecture. There’s nothing inherently right or wrong about making them bigger or smaller, so long as you find something that works for you.

Don’t be afraid to change your mind

“Just to be honest — and I’ve done this before, gone from microservices to monoliths and back again. Both directions.”
— Kelsey Hightower

Sometimes, we have to try new things to understand whether they are a good fit for our project. So don’t be afraid to try new technologies. Don’t be scared to try microservices or the hybrid model to see if it works.

But later, don’t be afraid to change your mind and roll back whatever previous decisions you have made. It’s not bad to admit that something hasn’t worked out. That’s exactly what we need to do to find success. Try different things, do various experiments, and move on from the ones that didn’t work out. Just because microservices didn’t work out for you on a particular project doesn’t mean they are a bad choice for other teams or projects.

Or better yet, just keep an open mind. That’s the best way to not shut yourself off from new ideas and new thinking that could be what you need to truly shine in your next project.

Further reading