Why you should use the Testing Pyramid in test automation

The Testing Pyramid is a widely used model in software development for building strong and effective test automation strategies. It gained prominence through Mike Cohn's 2009 book, "Succeeding with Agile." According to the article "The Practical Test Pyramid," although the concept has existed for a while, most teams still find it difficult to implement.

Pyramids are buildings with broad bases that narrow towards the top. The Testing Pyramid mirrors this form in its approach to software testing. As Mike Cohn originally outlined, the pyramid consists of Unit Tests, Service Tests, and User Interface Tests.

Fig.1: The Testing Pyramid for test automation

The Foundational Layer: Unit Tests

Unit tests sit at the bottom of the pyramid. They concentrate on the smallest testable pieces of code, such as individual functions or methods, and verify that each one works correctly in isolation.

Unit tests have some very important advantages: they are quick, isolated, and highly automatable. Their primary goal is to discover bugs early in the development process. A large number of unit tests provides a solid safety net when the codebase is being modified. In addition, writing unit tests before the code itself (Test-Driven Development, or TDD) is often regarded as a way to improve code quality. According to "The Practical Test Pyramid," a unit can range from a single method to an entire class in object-oriented programming languages.

Numerous tools can help automate unit tests, and the choice generally depends on the programming language. For example, in the Java ecosystem, JUnit is a standard testing framework, and Mockito is a widely used library for mocking dependencies to achieve isolation. Similarly, in JavaScript, Jest is a standard choice. Unit tests need to verify non-trivial paths through the code, such as happy paths and edge cases, without being tightly coupled to implementation details. They should generally verify the public interface of the code. The default structure for unit tests is the "Arrange, Act, Assert" pattern.
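To make the "Arrange, Act, Assert" pattern concrete, here is a minimal sketch in Python using only the standard library (unittest and unittest.mock) rather than the JUnit or Jest tooling named above. The PriceService class, its catalog dependency, and the price_of method are hypothetical names invented for this illustration.

```python
import unittest
from unittest.mock import Mock


class PriceService:
    """Hypothetical unit under test: applies a discount to a catalog price."""

    def __init__(self, catalog):
        # `catalog` is an injected dependency, so a test can replace it.
        self._catalog = catalog

    def discounted_price(self, sku, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        base = self._catalog.price_of(sku)
        return round(base * (100 - percent) / 100, 2)


class PriceServiceTest(unittest.TestCase):
    def test_applies_percentage_discount(self):
        # Arrange: stub the dependency so the test stays isolated.
        catalog = Mock()
        catalog.price_of.return_value = 200.0
        service = PriceService(catalog)

        # Act: call the public interface only.
        result = service.discounted_price("SKU-1", 25)

        # Assert: check the observable outcome, not internal details.
        self.assertEqual(result, 150.0)

    def test_rejects_discount_above_100_percent(self):
        # Arrange
        service = PriceService(Mock())

        # Act and Assert: the edge case is reported as an error.
        with self.assertRaises(ValueError):
            service.discounted_price("SKU-1", 101)
```

Saved in a test module, this can be run with `python -m unittest`. Both tests go through the public method only, so the internals of PriceService can be refactored without touching them.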

Fig.2: Useful tools for running unit level tests.

In contemporary software development, unit testing is usually treated as an integral part of any new project, and nobody needs to be convinced of its importance. The open questions are how extensive these tests should be and what level of source code coverage to aim for. The answer, as is often the case, is that it depends on the project. However, I believe that the test pyramid ratio remains relevant: there should be more unit-level tests than any other type. Another former point of controversy, who should write them, has already been settled: it is the developers' responsibility.

The Intermediate Level: Service (or Integration) Tests

Moving up the pyramid, the middle section contains service tests, sometimes called integration tests or component (module) tests. These tests verify the communication and interaction between different modules or services of the application. API testing, which verifies that application programming interfaces function correctly, is a typical example.

At this level, in a microservices architecture, contract tests play a vital role in ensuring that services abide by their contracts, which makes smooth data exchange possible.

Tools that can be used for API test automation include REST-assured and Playwright. Pact is a well-known contract testing tool: consumers describe what they expect of a service in automated tests, and providers run those tests to ensure compliance. WireMock (Java) or requests-mock (Python) can be used to stub out external services during integration tests, so that service boundaries can be tested in a faster and more controlled way.
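The idea behind consumer-driven contract testing can be sketched without any tooling. The snippet below is not Pact and does not use its API; it is a simplified illustration in plain Python, and USER_CONTRACT, contract_violations, and provider_get_user are hypothetical names. The consumer states which fields it reads and what types it expects, and the provider checks its own response against that statement.

```python
# Hypothetical contract written by the consumer: the fields it reads
# from the provider's "get user" response and the types it expects.
USER_CONTRACT = {"id": int, "name": str, "email": str}


def contract_violations(contract, response):
    """Return a list of human-readable violations; empty means compliant."""
    problems = []
    for field, expected_type in contract.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(response[field]).__name__}"
            )
    return problems


def provider_get_user(user_id):
    """Hypothetical provider handler; extra fields do not break consumers."""
    return {"id": user_id, "name": "Ada", "email": "ada@example.com", "plan": "pro"}


# Provider-side verification, run in the provider's own pipeline.
assert contract_violations(USER_CONTRACT, provider_get_user(7)) == []
```

A real contract testing tool adds much more, such as sharing contracts between teams and replaying recorded requests, but the core check is the same: the provider fails its build as soon as it stops satisfying what a consumer relies on.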

Fig.3: Useful tools for running service level tests.

These tests check the connection points between components or services without going through the entire user interface. They take a little more time than unit tests but much less than higher-level UI tests. Integration tests should focus on verifying data serialization/deserialization and interaction with outside systems such as databases and other APIs.
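As a sketch of what such a test can look like, the example below stubs an external HTTP service with Python's standard library instead of WireMock or requests-mock. The exchange-rate API, its /rates path, and the fetch_rate function are hypothetical and exist only for this illustration. The point is that the code under test performs a real HTTP call and real JSON deserialization, while the remote system is replaced by a local stub.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Ignore proxy environment variables so requests reach the local stub.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class StubRatesHandler(BaseHTTPRequestHandler):
    """Stub standing in for a hypothetical external exchange-rate API."""

    def do_GET(self):
        body = json.dumps({"base": "EUR", "rates": {"USD": 1.25}}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def fetch_rate(base_url, currency):
    """Code under test: calls the service over HTTP and deserializes JSON."""
    with _opener.open(f"{base_url}/rates", timeout=5) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["rates"][currency]


def run_against_stub():
    # Port 0 asks the OS for any free port, which avoids clashes on CI agents.
    server = HTTPServer(("127.0.0.1", 0), StubRatesHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        base_url = f"http://127.0.0.1:{server.server_address[1]}"
        return fetch_rate(base_url, "USD")
    finally:
        server.shutdown()
        server.server_close()
```

Calling run_against_stub() exercises the service boundary in well under a second and without network access to the real system, which is what keeps this layer faster and more stable than UI tests.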

In my opinion, this level is often overlooked, which is unfortunate because it can bring many benefits to a project. In a microservices architecture, contract testing is an innovation that redefined how we can take care of quality with little effort using modern tooling. Another concern I have about many projects these days is that this layer is poorly defined and often mixed with unit tests. This is a big mistake, as it prevents test execution from being managed optimally, especially in the fully automated CI/CD processes that are the goal for most projects I have worked on.

The Apex: End-to-End (E2E) Tests (or User Interface (UI) Tests)

UI tests, or end-to-end tests, sit at the topmost level. (The two terms are interchangeable only in the context of some projects; I explain why at the end of this section.) They simulate the way an end user uses the whole application through its user interface, verifying the overall system flow from start to finish.

While necessary for confirming the user experience and critical paths, UI tests tend to be slower, more difficult to automate, and more prone to flakiness because they depend on numerous interdependent elements. Thus, the Testing Pyramid recommends fewer UI tests, reserved for the most critical user journeys, since the underlying logic has already been tested at the lower levels.

The most widely used UI test automation tools are Selenium, Playwright, and Cypress. They allow the tester to automate browser interactions, for instance, clicking and filling in inputs. Cloud test platforms such as BrowserStack and Sauce Labs support cross-browser and cross-device testing without the need to set up and manage large test environments.

Fig.4: Useful tools for running end-to-end level tests.

There are also no-code or low-code tools such as IBM's Rational toolset, Katalon, Eggplant, and TestComplete. For single-page applications built with frameworks like React, Angular, or Vue.js, a testing library specific to the framework can also be used to write unit tests for UI components. Visual regression testing libraries can be added to test visual aspects of the UI. Tools like Playwright can report visual regressions at a chosen point in a test, within a specified tolerance.

This layer has many challenges, and they usually revolve around the term “test automation”. One common mistake is to always equate E2E tests with UI tests and use the terms interchangeably. In some cases, this is incorrect because of the chosen test strategy, which depends on the context of the project, e.g., the application architecture.

It is essential to distinguish E2E testing from UI testing that only checks the logic of the frontend code (understood as the UI layer in web development). The former belongs to this layer, whereas the latter is closer to the intermediate-level tests. This matters because it should be reflected in the ratio of those tests. The same goes for E2E tests that are not UI tests, e.g., API E2E tests; they should be counted at the apex of the test pyramid.

Keeping Focus: Avoiding Test Overlap

One of the fundamental principles of the Testing Pyramid is that tests on different levels should complement each other, each with a specific purpose and without duplication. Overlapping tests, where the same behavior is tested at several levels, can create many undesirable side effects:

Slow test suite runs
Testing the same logic more than once makes the whole test suite take longer to run.
Increased cost of maintenance
A change in the application may require the same test logic to be updated in several test layers.
Increased likelihood of flaky tests
Tests higher up in the pyramid are flakier, so duplicating checks in UI tests introduces unwanted flakiness.
Inefficient use of resources
Redundant tests waste effort and time that could be spent elsewhere.

Tests need to be pushed down the test pyramid as far as possible to prevent test duplication. This means performing as much logical verification as is feasible at the unit test level. For example, complex validation rules are best suited for unit tests. Service tests would focus on verifying the relationships and data exchange between different system components, e.g., API contracts. UI tests should primarily verify end-to-end user flows and the final user-perceived outcome, avoiding duplicating detailed validations performed at lower levels. Collaboration among developers and QA teams is important to determine the best possible level of testing in specific scenarios.
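As an illustration of pushing validation rules down to the unit level, the sketch below checks a hypothetical password rule set with a table of cases in Python's unittest. The validate_password function and its three rules are invented for this example. Each row covers a happy path, an edge case, or a negative scenario that would be slow and fragile to verify through a browser.

```python
import unittest


def validate_password(password):
    """Hypothetical validation rule: return a list of rule violations."""
    errors = []
    if len(password) < 8:
        errors.append("too short")
    if not any(ch.isdigit() for ch in password):
        errors.append("needs a digit")
    if not any(ch.isupper() for ch in password):
        errors.append("needs an uppercase letter")
    return errors


class ValidatePasswordTest(unittest.TestCase):
    def test_rules(self):
        cases = [
            ("Str0ngPass", []),      # happy path
            ("Ab345678", []),        # edge case: exactly 8 characters
            ("Ab3", ["too short"]),  # negative: one rule broken
            ("abcdefgh", ["needs a digit", "needs an uppercase letter"]),
            ("", ["too short", "needs a digit", "needs an uppercase letter"]),
        ]
        for password, expected in cases:
            # subTest reports each failing row separately.
            with self.subTest(password=password):
                self.assertEqual(validate_password(password), expected)
```

With the rules covered here, a single UI test that confirms an invalid password is rejected on the sign-up page is enough at the top of the pyramid; it does not need to repeat every row of the table.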

The Risks of Ignoring the Pyramid: The Anti-Pattern of the Ice-Cream Cone

Straying from Testing Pyramid principles generally results in too many slow and fragile UI tests and too few fast and robust unit tests, which creates a less-than-ideal "ice-cream cone" (or inverted pyramid) shape. This anti-pattern creates the following problems:

Slow feedback loops
Excessive use of slow UI tests delays feedback for code changes.
Unreliable test suites
A large number of flaky UI tests erodes confidence in the test suite.
High maintenance overhead
UI tests are costly to write and to maintain, and UI updates tend to break tests.
Hard debugging
UI test failures are difficult to diagnose because many underlying factors can cause them.
Lower confidence in software quality
An unreliable test suite reduces confidence in release stability.
Long release cycles
Long-running test suites within CI/CD pipelines may slow down the release process.
Inefficient use of resources
Overemphasis on UI testing diminishes investment in more effective lower-level tests.
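As a rough illustration of the two shapes, the number of tests per layer can be compared directly. The function name, the labels, and the strict greater-than rule below are assumptions made for this sketch, not a rule defined by the Testing Pyramid; a real assessment also needs judgment about what each test actually covers.

```python
def suite_shape(unit, service, ui):
    """Classify a suite by test counts per layer (hypothetical heuristic)."""
    if unit > service > ui:
        return "pyramid"
    if ui > service > unit:
        return "ice-cream cone"
    return "mixed"
```

For example, a suite with 800 unit, 150 service, and 50 UI tests is classified as a pyramid, while 40 unit, 120 service, and 600 UI tests form an ice-cream cone.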

In short, staying within the guidelines of the Testing Pyramid makes it possible to build a robust, fast, and manageable test automation strategy. By applying automation tools strategically at all levels of testing (e.g., JUnit and Mockito at the unit level, REST-assured and Pact at the contract and service level, and Selenium, Playwright, and Cypress at the UI level), teams can get quick feedback and contribute significantly to shipping quality software. The goal is a stable base of unit tests, a solid layer of service tests, and fewer UI tests, rather than the uneven and unhealthy ice-cream cone.

The Testing Pyramid thus serves as a fundamental model for building a solid test automation strategy, with more low-level, high-speed tests and fewer high-level, slow tests. To actually benefit from this model and build high-quality software cost-effectively, some related concepts are essential, including the shift-left approach, exploratory testing, writing clean test code, and understanding test coverage in this model.

Adopting the Shift-Left Approach

The Testing Pyramid naturally supports the shift-left methodology, which promotes moving testing activities earlier in the software development lifecycle. Instead of waiting until the application is completely built before testing begins (typically with slow and late UI tests), the Testing Pyramid encourages teams to begin at the unit level as soon as parts are built.

By focusing on unit tests, developers get fast feedback on their code changes, catching and fixing bugs early before they propagate to higher system levels. As "The Practical Test Pyramid" points out, this much shorter feedback loop, powered by automated testing, fits well with agile development practices.

Shifting testing left reduces the expense of bug fixing because issues caught earlier tend to be simpler and less costly to repair. If a bug is caught at the unit test level, it typically resides in a very small, localized piece of code, so it is easier to find and resolve. But a bug caught by a UI test might be caused by any number of different underlying components and interactions, and therefore represents a lot more work to diagnose.

The Test-Driven Development (TDD) process of writing unit tests prior to production code is a prime example of shifting left. It compels developers to think about requirements and testability early, leading to more stable and better-designed code.

The Complementary Role of Exploratory Testing

Although the Testing Pyramid places great importance on automated testing, it is important to appreciate the valuable contribution of exploratory testing. Exploratory testing is a manual technique that relies on the tester's freedom and creativity to discover quality issues in a running system.

Even the most carefully built automated test suites are not flawless, and exploratory testing helps identify edge cases and usability defects that automated tests are likely to miss.

Bug hunts and checklists are manual test activities performed to supplement automation.

Exploratory testing allows the testers to use their understanding of the system to examine tricky areas and look for what was not anticipated. It can be employed even to find usability and design problems, which are typically difficult to automate.

The findings of exploratory testing can then be utilized to drive the creation of new automated tests to prevent regressions of the found issues. This offers a feedback loop that continuously improves the test automation suite.

Writing Clean and Maintainable Test Code

Just as production code has to be high-quality and maintainable, so does test code. "The Practical Test Pyramid" simply states that test code is as significant as production code and should receive the same consideration and care.

Quote: Test code is as significant as production code and should receive the same consideration and care - Stanisław Madaliński-Piętka, QA Tech Lead, CodiLime

Readability is important in test code. Tests should be easy to read and easy to follow. Using unambiguous naming conventions and following the "Arrange, Act, Assert" (or "Given, When, Then") approach makes tests more expressive and easier to maintain.

While the DRY (Don't Repeat Yourself) principle is highly valued in production code, "The Practical Test Pyramid" suggests that duplication in test code is acceptable if it makes the code more readable. The goal is to find a balance between conciseness and readability.

Ideally, each test should check one condition, so that tests stay short and targeted and the cause of a failure is easier to identify.

Investing in good test code decreases the cost of maintaining the test suite. Poorly written tests can be fragile and break with even minor changes in the production code, so that effort goes into fixing tests rather than delivering value.
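The sketch below shows what one condition per test can look like in practice, with test names that follow the "Given, When, Then" wording. The Cart class is hypothetical and written only for this example. Note that the setup line is deliberately repeated in every test: the small amount of duplication keeps each test readable on its own.

```python
import unittest


class Cart:
    """Hypothetical class under test."""

    def __init__(self):
        self._items = {}

    def add(self, sku, quantity=1):
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self._items[sku] = self._items.get(sku, 0) + quantity

    def count(self):
        return sum(self._items.values())


class CartTest(unittest.TestCase):
    def test_given_empty_cart_when_item_added_then_count_is_one(self):
        cart = Cart()                      # Given
        cart.add("SKU-1")                  # When
        self.assertEqual(cart.count(), 1)  # Then

    def test_given_item_in_cart_when_same_item_added_then_quantities_sum(self):
        cart = Cart()
        cart.add("SKU-1", 2)
        cart.add("SKU-1", 3)
        self.assertEqual(cart.count(), 5)

    def test_given_empty_cart_when_zero_quantity_added_then_error_raised(self):
        cart = Cart()
        with self.assertRaises(ValueError):
            cart.add("SKU-1", 0)
```

When one of these tests fails, its name alone says which behavior broke, without reading the test body.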

Understanding Test Coverage in the Context of the Pyramid

Test coverage is a vital element of any test plan, yet it has to be interpreted carefully in the context of the Testing Pyramid. High code coverage, particularly at the unit test level, is desirable, but it is not the only factor for success.

The prime intention of the Testing Pyramid is to ensure the most critical portions of the application are adequately tested at the correct levels.

It prioritizes testing public code interfaces and non-trivial code paths, including happy paths, edge cases, and negative scenarios, but avoids testing trivial code such as simple getters and setters.

At higher levels of the pyramid, testing shifts from code-level details to user-centric use cases and interactions. The aim is to test that the entire system meets the user's requirements.

It is hard to quantify end-to-end test coverage, and it often depends on the team's experience with the application and the features being tested. While metrics can be collected, knowing what is actually being covered by the tests is essential. To name one example, a model-based approach can be used to track automation progress and give a helicopter view of the E2E coverage.
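One simple way to read "model-based" here, purely as an assumption for illustration, is to keep an explicit list of critical user journeys and record which automated E2E tests exercise each one. The journey names, test names, and the e2e_coverage function below are all hypothetical.

```python
# Hypothetical model: each critical user journey mapped to the automated
# E2E tests that exercise it (an empty list means not yet automated).
JOURNEYS = {
    "sign up": ["test_sign_up_with_email"],
    "log in": ["test_log_in", "test_log_in_with_sso"],
    "checkout": [],
    "password reset": [],
}


def e2e_coverage(journeys):
    """Return (share of journeys with at least one test, uncovered journeys)."""
    uncovered = sorted(name for name, tests in journeys.items() if not tests)
    covered = len(journeys) - len(uncovered)
    share = covered / len(journeys) if journeys else 0.0
    return share, uncovered
```

For the model above, the function reports that half of the journeys are automated and lists "checkout" and "password reset" as gaps. Unlike a line coverage percentage, such a view shows which user-facing flows are still unprotected.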

The secret is to have sufficient coverage at each layer to instill confidence in the software's quality, without creating unnecessary overlap or excessively slow and costly high-level tests. Developers and QA must collaborate to appropriately balance and focus testing efforts.

Summary

The Testing Pyramid is a well-known and simple concept that reveals more complexity once you look closer. By embracing the shift-left approach, recognizing the role of exploratory testing, valuing clean test code, and carefully planning test coverage within the framework of the Testing Pyramid, development teams can craft a solid and successful test automation strategy that makes a huge difference in delivering quality software at pace and with confidence.
