April 13, 2022

Addressing enterprise testing needs with Testcontainers & test data

Nicolai Baldin & Denis Borovikov are joined by Sergei Egorov, the CEO of AtomicJar. Together they examine AtomicJar’s work, the importance of Testcontainers, the current trends in software testing, and the challenges that large companies face shipping their products faster. 

The full recording of their discussion is available here.

Can you introduce yourself, and your work with AtomicJar and Testcontainers?

Sergei: I’m Sergei, originally from Russia. I lived in Europe for almost 10 years and then moved to the United States. I'm broadcasting from Brooklyn, New York City, and I’m the CEO of AtomicJar. But despite having this management title, I have been a developer my whole life.

I went from engineering to being a co-founder of a startup focused on Testcontainers, which is the foundation for AtomicJar. Testcontainers has been around for seven years as an open-source project that makes integration tests easy by wrapping Docker with an API that can be used directly from your code. AtomicJar is the next step for Testcontainers, the community, and the set of products we have. We are now building Testcontainers in the cloud, a hosted version that allows you to run the same tests but not on your machine.

Denis: I'm Denis, the CTO and co-founder at Synthesized. My background is in software engineering. I've been doing it for the last 14 or 15 years. I would say that I was primarily a backend engineer focused on the Java microservices stack, with Java and Scala. Synthesized is agile test data management for the cloud era. Synthesized is for test data what GitLab or GitHub is for DevOps: the next generation of data management tools.

Nicolai: I was particularly excited about this podcast because Testcontainers and Test Data Management are two complementary fields. Whenever we run tests or create Testcontainers within those tests, we need to make sure we have production data for some of the tests, such as exploratory testing or other types of testing. Ultimately, we are all focused on making sure we can automate the software testing process, or help QA engineers.

What core trends do you see in the market of software testing? 

Sergei: What has happened is that external factors such as migration to the cloud, microservices, and an exploding number of databases triggered a demand for better testing. Also, in old-school companies, the workforce wasn't concentrated on developers as it is right now. More or less everyone's a developer. Everyone tries to be a developer and there is a lot of shift-left movement. We now want more things to be done by developers. The old-school setup had a major QA department, a development department, and sysadmins.

Now there is DevOps, with its infinity symbol. But what I now see is that testing is involved at every stage. And since there was a major shift in the workforce towards developers, they started to dictate how they want to do testing. They don’t want to start browsers or do exploratory testing. They are used to this process but it's not the most efficient for them because they want to write code. And integration testing is something of a sweet spot because it gives much more confidence than unit testing, but it’s not as heavyweight as E2E testing, which needs to be done in some environments.

Denis: I agree. It’s more about the global trends of agile transformation, cloud adoption, and DevOps. There are even higher-level trends, such as constantly learning how to ship software quicker. That shift has been happening for the last 20 to 30 years. If you remember how software was developed 20 years ago, the release cycle would typically last around a year or even longer. Now we’re learning how to ship software more frequently, but not all companies keep up with that. Some companies are lagging for many reasons. They still have an old-school structure and they’re trying to jump into DevOps and CI/CD. Something else worth mentioning is that while we want to ship faster, the question is how do we do that? We eliminate redundant stages in our process, such as manual QA.

Sergei: I noticed the growth of QA engineers becoming brilliant engineers in general. Not just QA engineers, but also software engineers. And part of what you have described, such as the elimination of manual QA, was almost triggered by QA folks themselves because they started automating more processes. But at some point, someone asked: now that we have everything automated, why do we have a dedicated QA department? Why can’t developers do the same thing?

Nicolai: We all agree that companies want to build better products, and ship them faster, and more frequently. 

What are the challenges for different companies to build and ship products faster?

Sergei: One of the challenges is having confidence, because if you’re confident then you are only limited by your tooling. If you’re confident that your changes will make it to production and not break everyone's systems, or every user's experience, then you'd be shipping every single time. So you would not stop at continuous delivery; you would do continuous deployment. Anything that goes into the main branch should eventually be automatically deployed to the production environment.

But then we face reality. Sometimes changing a single line, especially in configuration, can have a tremendous effect. We have heard famous stories about outages, even at the biggest tech companies in the world. Amazon Web Services had an outage some time ago caused by a single-line change.

If I were the person who made the change, I would now be afraid to change any line anywhere. It would also trigger a lot of activities around gaining the confidence to make changes without anything going wrong, because I know that I have tests, I have monitoring, and I have canary releases so that I can test it on a subset of users. And all of that comes with a lot of automation, since you cannot be confident in manual processes because manual processes are prone to human error. Automated processes never lie as long as they are configured correctly.

Denis: That degree of confidence is also very much connected to other factors, such as trust. But confidence is feeling good about yourself. How confident are you about pushing code somewhere to production? That’s a problem in bigger institutions such as banks. 

Nicolai: We see many companies that understand that they need to get there. They need to adopt the cloud and enable agile test data management, but there are different ways to get there. For some younger companies, scale-ups, and startups, it’s a little bit easier. But for more mature organizations with arranged and established processes, it may take a little bit longer to get to shift-left testing. There is an intent to get there but it will take some time.

Organizations want to jump into DevOps and agile transformation, but they can’t necessarily do it right now. It's a long journey, but interestingly, even conservative companies that you would expect to be using mainframes are instead trying to adopt cloud, Docker, and containers.

Sergei: Last year I saw a report that cloud spending by large enterprises has increased dramatically. People spend much more money on the cloud nowadays, which shows the growth in cloud usage by these major enterprises. 

And for us as software engineering boosters, we should focus on the weakest links. We cannot deprecate some features in our code unless we know that nobody's using them. But legacy systems, slow-moving companies, and those who are not ready to change are still using that code. It's fun to be able to work with large enterprises because it takes a lot of time to just understand your buyer and how they think about their issues.

Nicolai: A very important problem that prevents some of the large companies from testing software in the cloud is data because we know that many organizations still use on-premise data sources, databases, and data warehouses. You want to make sure you adopt the cloud and agile and digital transformation, but it's hard. And it's important to work collaboratively on enabling companies to make it easier. One way to enable companies to use software that's in the cloud is to use test data.

Do you see demand for high-quality test data, which can be used to create test cases to eventually run software testing?

Sergei: We see a lot of demand for test data providers because we can shift everything to the left. But we cannot shift the QA mindset, and QA thinking. 

The truth is that someone who is good at creating a test should be a creative thinker. When planning, I would start with edge cases because those are the most interesting ones. This is where test data could also make a big difference, because one element is how you write and prepare your tests and another element is how you prepare your test data. Integration tests usually come with test data next to the test. And this is interesting because it is a big advantage to be able to see your whole test scenario in a single place. If test data comes as a snapshot from production, for example, then you have two sources of truth for your test, and it's hard to follow both of them.

Generated test data that follows the same patterns as seen in production data could be a major advantage that improves the tests. The same set of tests could be running with test data created by engineers to test their scenarios, but there is also a form of mutation testing here, because having generated test data allows us to do property-based testing: you define how the data comes in, and then the test should pass with any data of that type.
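As an editorial illustration of the property-based idea Sergei describes, here is a minimal sketch using only the Python standard library. The function under test (`normalize_email`) and the generator (`random_email`) are hypothetical names invented for this example; real projects would more likely use a dedicated property-based testing library or a test data generator.

```python
import random


def normalize_email(raw: str) -> str:
    """Hypothetical function under test: trims whitespace and lowercases."""
    return raw.strip().lower()


def random_email(rng: random.Random) -> str:
    """Generate an email-shaped string with random casing and padding."""
    letters = "abcXYZ"
    user = "".join(rng.choice(letters) for _ in range(rng.randint(1, 8)))
    domain = "".join(rng.choice(letters) for _ in range(rng.randint(1, 8)))
    padding = " " * rng.randint(0, 3)
    return f"{padding}{user}@{domain}.com{padding}"


def check_properties(runs: int = 200, seed: int = 42) -> int:
    """Assert properties that must hold for any generated input."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(runs):
        result = normalize_email(random_email(rng))
        # Property 1: normalizing twice changes nothing (idempotence).
        assert normalize_email(result) == result
        # Property 2: no surrounding whitespace and no uppercase letters.
        assert result == result.strip()
        assert result == result.lower()
    return runs


if __name__ == "__main__":
    print(f"{check_properties()} generated inputs satisfied all properties")
```

The test states what must be true for any input of the right shape, rather than listing hand-picked examples, so swapping in a richer data generator does not require rewriting the assertions.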

Denis: It’s also related to the topic of exploratory testing. There’s a mentality of basically hacking your software and being creative, because testing always has two components, system and creativity. For the system, you have automated tests, which are very structured. But quite often people overlook the creative part. It’s still a form of the testing pyramid, and at the very top there is exploratory testing, creative testing, and manual thinking.

When you have real data, you put all of this in real conditions. You can see how the application would work in practice, not in synthetic scenarios or your test, but how it would behave in practice. And if I was a user how could I break it? I think that helps with this process. 

Sergei: I agree. Having test data that is very close to real data is priceless. You can insert a couple of rows or entries into the database and then assert what your endpoint returns, which is easy. But we also want to start the application, call some endpoints, and get a feeling of how it will look once it’s deployed to production. We don't need to deploy it anywhere because we can do it locally with Testcontainers or Docker.

But the question remains: how do we get to a state that is close to or representative of our production environment? This is so I'm not calling endpoints that do not return any data, but rather so I can get results that look very similar to production. And once you start doing manual testing, you notice that an endpoint could always return up to 10 records, for example, and then you’d need to add a test that will cover a scenario that you weren’t aware of before.
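To make the "up to 10 records" point concrete, here is a small editorial sketch using Python's built-in `sqlite3` module as a stand-in for a containerized database. The `orders` table, the `list_orders` handler, and the `PAGE_SIZE` cap are all hypothetical and exist only for illustration.

```python
import sqlite3

PAGE_SIZE = 10  # hypothetical cap, mirroring the "up to 10 records" example


def list_orders(conn: sqlite3.Connection) -> list:
    """Stand-in for an endpoint handler: returns at most PAGE_SIZE orders."""
    rows = conn.execute(
        "SELECT id, customer FROM orders ORDER BY id LIMIT ?", (PAGE_SIZE,)
    )
    return rows.fetchall()


def make_db(row_count: int) -> sqlite3.Connection:
    """Create an in-memory database holding row_count generated orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
    conn.executemany(
        "INSERT INTO orders (id, customer) VALUES (?, ?)",
        [(i, f"customer-{i}") for i in range(1, row_count + 1)],
    )
    return conn


# Hand-written fixture: two rows, so the cap is never exercised.
small = list_orders(make_db(2))

# Production-like volume: the cap becomes visible and can be asserted on.
large = list_orders(make_db(250))

if __name__ == "__main__":
    print(len(small), len(large))
```

With two hand-inserted rows the handler looks correct and complete; only the larger, production-like data set reveals that results are silently truncated, which is the scenario that then deserves its own test.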

Denis: It’s important to not just do manual testing, but developers should give more input for actual tests. Explore most scenarios and when you find them you code them.

Sergei: I'm confident about that because that’s one of the questions we received when we were starting our company. We were openly talking about how the majority of testing at large enterprises is still manual testing. Developers start services and call some endpoints with Insomnia or other tools. QA engineers are also doing the same, and at best they have some automation. They were probing us because we knew the answer.

Developers will start the application once or twice, and the next time they would have to start again and go through the same flow of opening the login page, going to the next page, and they’ll wonder how much time they’re spending on it. Why spend five minutes doing something manually if you can automate it in five days? On the other hand, there’s a recurring value because the next time you do it you don't need to perform the same actions, and when you introduce a regression it will catch it. And manual testing, unless you perform it recurrently, will not catch it. 

Denis: You also have some large enterprises that do not have automated testing. It might be too late because the amount of technical debt is enormous. They are trying to automate, but if they are starting after decades of manual testing then it's a big challenge. This is why we support the shift-left movement and believe in the developer being involved in testing. But large enterprises quite often still do a lot of manual testing, in which you need production data. 

Nicolai: Also, not just for production data, but sometimes for performance and load testing, you need to have much more volume. If you have a new service or application running you might not have enough volume to test it against. If it's a payment system that you have to launch soon, you need to test some test cases against certain scenarios which may not even be available in the product. You need to think about how to create business examples and make them available to software testers and developers if you decide to move completely to the left in this framework.

Sergei: And there is no single type of testing that will cover everything, so people should use the right tools for the right types of testing. Sometimes I see activities being done with the wrong tools, or not necessarily the wrong tools, but tools that I never thought would be used a certain way. For example, the folks behind Apache Pulsar at StreamNative created a test suite for Apache BookKeeper, which is a major underlying component of Pulsar, with Testcontainers, because it lets them control networks and containers.

If one of the nodes goes down, they can inspect everything. That was not the usage that Richard, the original creator of Testcontainers and now our CTO, had in mind seven years ago. And once they told us, we were shocked. That’s a positive example, but there are a lot of negative examples where people use a testing tool for things it was not meant for, including testing databases with mocking frameworks. Around 90% of modern applications are code blocks with queries that just take data from one place, transform it, and return it to another.

But then what would you unit test there? You would only be asserting an assumption that the query string is correct. How will your database react to the string? It’s painful when people say that nobody needs integration testing, and they prefer to mock everything out.
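The gap between a mocked database and a real one can be sketched in a few lines. This is an editorial example, not something from the conversation: the `users` table, the `fetch_user_name` function, and the deliberately misspelled column are hypothetical, and SQLite stands in for whatever database a Testcontainers-based test would start.

```python
import sqlite3
from unittest.mock import MagicMock

# Deliberate typo in the column name ("nmae") to illustrate the point.
QUERY = "SELECT nmae FROM users WHERE id = ?"


def fetch_user_name(conn, user_id: int):
    """Code under test: runs QUERY and returns the first column, if any."""
    row = conn.execute(QUERY, (user_id,)).fetchone()
    return row[0] if row else None


def run_with_mock() -> str:
    """The mock never parses SQL, so the broken query goes unnoticed."""
    conn = MagicMock()
    conn.execute.return_value.fetchone.return_value = ("alice",)
    return fetch_user_name(conn, 1)


def run_with_real_database() -> str:
    """A real engine parses the SQL and rejects the unknown column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
    try:
        return fetch_user_name(conn, 1)
    except sqlite3.OperationalError as error:
        return f"query failed: {error}"


if __name__ == "__main__":
    print("mocked:", run_with_mock())
    print("real:  ", run_with_real_database())
```

The mocked test passes because it only checks that some string was handed to `execute`; the test against a real engine fails on the same code, which is the confidence gap being described.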

Denis: That is interesting because you have a proportion of unit tests versus integration tests. I also lean towards integration tests if they're good, and I find unit tests most useful for complex logic. For things such as CRUD applications and other applications that mostly pass data to and from SQL, unit tests are not a good fit. So for them, integration testing covers everything.

Sergei: At least it covers 80% to 90% of cases. This is good because, in terms of making your first investment, you always pick those that cover the most cases. We think of this as building a product.

How would you build your product? 

Sergei: You should build your product for the majority of your users, and then you’ll want to start deriving value as soon as possible. Integration tests give you that. You add a database, write a couple of tests, and then your coverage is already at 80%. With unit testing, the scope is much bigger.

Denis: Another factor is maintainability. Unit tests are tied to design. Whenever you change the design of components or units, you need to change tests. Whenever you do refactoring, you need to change all of your mocks, how components interact, etc. With an integration test, as long as you don’t change the specification, the test stays stable. Another interesting factor related to testing is the trend towards statically typed languages, where all sorts of linters and analyzers become a form of verification.

Sergei: I think it correlates very well with the common trend of a faster feedback cycle. 

And slightly off-topic, but as a Java Champion, I cannot forget to mention that the recent developments in the Java language have made it so much better, because now we have a lot of support for syntactic constructs that help us to build better systems. There are pattern matching and sealed classes, where I no longer need to have this default fallback. If I know that a given set of classes is all I will ever receive, I want the compiler to verify that, and I don't want to check for it at run time.

We always had to have a balance between being memory efficient and being strongly typed, and sometimes we had to shift the balance towards memory efficiency, otherwise you’re generating a lot of garbage in Java. That applies more to Java, and less to languages such as Golang. Nevertheless, it's great to know that Java is catching up, because by removing weaknesses you move the whole system forward. And if you remove these weaknesses, you align the baseline with all the great progress that has been made in the Java world.

Denis: That's probably the vision for the future: a strongly typed language, a lot of integration testing, minimal or no manual testing, and a lot of component testing. Also, more CI/CD with data.

Nicolai: And Sergei, in terms of AtomicJar and the work you are doing, we are big fans and we use one of your products in our team.

And we know that you're working with many other customers in the U.S., U.K., and Europe. What are your key learnings from working with enterprise clients?

Sergei: We are in private beta right now, and I would not say that we have many other customers. We have tens of customers right now that we are working closely with and making sure that the product, once we release it later this year, will be solid. That's part of the private beta process but we did take some interesting learnings from the private beta users and some customers verified our assumptions about our product and our go-to-market strategy and the overall product-market fit.

We started with the idea that running Testcontainers without having Docker available locally can be a great advantage, because it not only helps with security and speed of execution but also gives you back the resources you previously had to give to Docker. And I know that a lot of folks weren’t happy about having to give 8GB of RAM to Docker. Between Slack, Chrome, and Docker, you have to pick two.

With Testcontainers in the cloud, that is possible, but we were always under the assumption that the speed of delivery is one of the most important things. But then we wondered if it truly mattered. Would it matter if it was a second or two? It probably would not matter unless it's multiplied by 10,000. But then does it matter if it's 5 minutes or 15 minutes for total time? It does because these are different brackets.

But one thing that is underappreciated is responsiveness. I noticed that responsiveness was a priority before with IntelliJ. It is the best IDE, but it also needs indexing, which is a necessary step in your development process. Once your code is indexed, you can do crazy things with it. This is what differentiates IntelliJ from VS Code. Despite VS Code having some Java support, it's nowhere near IntelliJ. As a developer, if I need to index once per week then it’s fine. Waiting for 10 minutes once per week is fine.

But when I start to notice indexing as a regular occurrence when switching branches, it would be fine as long as my system remains responsive. There could be some background indexing that says I need to wait a bit longer, or I could use reduced functionality in the IDE until it’s indexed, but I would continue using my computer. Also, IntelliJ isn't the only thing that runs on my computer. I’d need to switch to some browsers or Slack, and if one of the applications takes 100% of the resources then I can only focus on that application. And when it comes to running tests, I realized that integration tests are also a single source of 100% consumption. And since our product allows the movement of workloads to the cloud, you no longer need to allocate these resources, such as CPU and RAM, while running the tests.

I realized that I could do more tasks such as researching topics whilst the tests are running, or even run them in parallel because previously I couldn't, since my resources were already dedicated to containers.

Do you work on the databases themselves to optimize them for containers?

Sergei: We do a lot of research. The Testcontainers team knows the most about many databases when it comes to development experience. We know how to optimize certain databases by using internal APIs, and we’ve worked closely with vendors. Now that we have an entity behind the project, we are establishing more partnerships with the database vendors. They're eager to collaborate and they know our worth. They know what can be optimized in a database, and they keep whatever is needed because they know that in the Testcontainers community there's a lot of demand. They see how the Testcontainers community is using their databases.

And sometimes we dig into database internals to optimize them. For example, the Kafka module is one of the most advanced modules in Testcontainers. It has a lot of optimizations. We also know a lot about how the Kafka protocol works, so Testcontainers is the only solution out there for running Kafka Docker images on a random port.

You cannot do it with Docker Compose, and there aren't many other solutions for that, but our model of how we define modules allows us to do that. And there is an advantage to running things on random ports, because you no longer get conflicts when you run multiple things in parallel, especially on the same CI machine.
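Why random ports avoid conflicts can be shown with plain sockets. The following editorial sketch illustrates the general operating-system mechanism (binding to port 0 so the OS picks a free port) and is not a description of how Testcontainers maps container ports internally; the function names are made up for the example.

```python
import socket


def start_listener() -> socket.socket:
    """Bind to port 0 so the OS assigns a currently free ephemeral port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen()
    return server


def allocate_two_ports() -> tuple:
    """Start two listeners side by side and report the ports they received."""
    first = start_listener()
    second = start_listener()
    try:
        # Both sockets are open at the same time, so the ports must differ.
        return first.getsockname()[1], second.getsockname()[1]
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    port_a, port_b = allocate_two_ports()
    print(f"two services, no conflict: {port_a} and {port_b}")
```

Had both listeners requested the same fixed port, the second `bind` call would fail with an "address already in use" error, which is the kind of conflict parallel test runs on a shared CI machine run into.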

But back to the point, we aren't contributing fixes or improvements to the databases, at least not yet. We prefer to collaborate with database vendors, who know better how to make these changes. But what we do give them is intel about what's slow. For example, we noticed that it takes three seconds to initialize or preallocate a file system. But for tests, you don't need a file system, you can do it in-memory. So they asked us if we could introduce some environment variables to make it in-memory and we said yes because it's a no-brainer, since it's the only interface to implement. And then they release it and the containers start in three seconds. Last but not least, we also realized that some technologies are hard to optimize.

It’s not easy to change a database, introduce in-memory storage, and everything else. You need to initialize it, and there are a lot of other things involved. Db2 is the same: it requires some privileges to reconfigure the system it runs inside. It's not just a process, but rather it’s part of your Linux operating system. And for that, we don't need to start it every time, as we could capture the state, and either we use an existing instance or we could be a bit more creative with some advanced containerization technologies, because our foundation, Docker containers, gives us the abstraction, and with that abstraction we can, for example, freeze and unfreeze any container. There is a lot of potential and we work in both ways: cooperating with vendors and skipping the startup overhead altogether.

Finally, Sergei, what is the next big development for you and the team that you're excited about? 

Sergei: The focus right now is on public beta, but once we release it and have happy customers who come in through self-service, it will be a major milestone for us. We want to explore options now that we control the environments where Testcontainers are running. We can do a lot of things.

Now that we control this environment we can do interesting things because we are optimizing for specific use cases, up to being able to make and apply patches to Docker, removing parts that would make sense for long-running workloads, for example, but for testing would only add overhead. This makes me very excited because I know that there is so much more we can do. I wish we could kind of have our own Docker so that we can do things with it. Why not?

Nicolai: Thank you for this incredibly insightful conversation. It’s been great discussing the challenges and trends in software testing, how we can enable companies to ship products and services faster, and how we can enable shift-left gently. The need for Testcontainers is clear. 

Sergei: You also have an amazing product. There is great potential for partnership and bringing our users together. And I'm sure that our users will appreciate what you are building.