We never have enough time for testing, so let’s just write the test first.
—Kent Beck

Test-First Abstract

Agile testing differs from the big-bang, test-at-the-end approach of traditional development. Instead, code is developed and tested in small increments, often with the development of the test itself preceding the development of the code. In this way, tests serve to elaborate and better define the intended system behavior before the system is coded. Quality is built in from the beginning. This just-in-time approach to elaboration of the intended system behavior also reduces the need for the lengthy, detailed requirements specifications and sign-offs that are often used in traditional software development to control quality. Even better, these tests, unlike traditional requirements, are automated wherever possible. Even when they’re not, they provide a definitive statement of what the system actually does, rather than a statement of early thoughts about what it was supposed to do.

This article describes a comprehensive approach to Agile testing, and testing first, based on Brian Marick’s four-quadrant Agile Testing Matrix. Quadrants One and Two define the tests that support the development of the system and its intended behavior; they are described in this article. Quadrants Three and Four are more fully described in the Release and Nonfunctional Requirements articles, respectively.

Note: Of course, not all these tests can be written or executed first, but the authors believe that Test-First captures the proper sentiment!


Agile testing is a continuous process, not a one-time or end-game event. It is integral to Lean and Built-in-Quality. Simply put, Agile Teams and Agile Release Trains (ARTs) can’t go fast without endemic quality, and they can’t achieve endemic quality without continuous testing and, wherever possible, “testing first.”

The Agile Testing Matrix

XP proponent and Agile Manifesto signer Brian Marick helped pioneer Agile testing by describing a matrix that guides reasoning about such tests. This matrix was further developed in Agile Testing [1] and extended for the scaled Agile paradigm in Agile Software Requirements [2].

Figure 1 describes and extends the original matrix with guidance on what to test and when.

Figure 1. Agile Testing Matrix
  • The horizontal axis of the matrix contains business- or technology-facing tests. Business-facing tests are understandable by the user and are described using business language. Technology-facing tests are written in the language of the developer and are used to evaluate whether the code delivers the behaviors the developer intended.
  • The vertical axis contains tests supporting development (evaluating internal code) or critiquing the solution (evaluating the system against the user’s requirements).

Classification into these four quadrants (Q1 – Q4) enables a comprehensive testing strategy that helps ensure quality:

  • Q1 contains unit and component tests. These automated tests are written by developers and run before and after code changes, to confirm that the system works as intended. Automating their execution reduces the overhead of executing a large number of tests in different development or test environments.
  • Q2 contains functional tests (user acceptance tests) for Stories, Features, and Capabilities, to validate that they work the way the Product Owner (or Customer/user) intended. Feature- and capability-level acceptance tests validate the aggregate behavior of many user stories. Teams automate these tests whenever possible and only use manual tests when absolutely necessary.
  • Q3 contains system-level acceptance tests to validate that the aggregate behavior of the system meets usability and functionality requirements, including various scenarios that may be encountered in actual use. These can include exploratory testing, user acceptance testing, scenario-based testing, final usability testing, and more. These tests are often done manually because they involve users and testers using the system in actual or simulated deployment and usage scenarios. These tests are used for final system validation and are required before delivery to the end user.
  • Q4 contains system qualities tests to verify that the system meets its Nonfunctional Requirements (NFRs), as exhibited in part by Enabler tests. They are typically supported by a suite of automated testing tools, such as load and performance testing tools, designed specifically for this purpose. Since any system change can violate conformance with NFRs, these tests must be run continuously, or at least whenever reasonably practical.

Quadrants 1 and 2 test the functionality of the system. When these tests are developed before the code is committed, that is described as Test-First. Test-first methods include both test-driven development (TDD) and acceptance test-driven development (ATDD). Both methods use test automation to support continuous integration, team velocity, and development effectiveness. Quadrants 1 and 2 are described below; Quadrants 3 and 4 are described in the companion articles Release and Nonfunctional Requirements, respectively.

Test-Driven (Test-First) Development

Beck [3] and others have defined a set of XP practices described under the umbrella label of test-driven development, or TDD. The focus is on writing the unit test before writing the code, as described below:

  1. Write the test first. Writing the test first ensures that the developer understands the required behavior of the new code.
  2. Run the test, and watch it fail. Because there is as yet no code to be tested, this may seem silly initially, but it accomplishes two useful objectives: it tests the test itself and any test harnesses that hold the test in place, and it illustrates how the system will fail if the code is incorrect.
  3. Write the minimum amount of code that is necessary to pass the test. If the test fails, rework the code or the test as necessary until a module is created that routinely passes the test.
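The three steps above can be sketched with Python's standard unittest module. This is a minimal illustration rather than a prescribed implementation: the leap-year rule and the names is_leap_year and LeapYearTest are hypothetical and are not taken from the sources cited in this article.

```python
import unittest


# Step 1: write the test first. It states the required behavior
# before any production code exists.
class LeapYearTest(unittest.TestCase):
    def test_year_divisible_by_4_is_leap(self):
        self.assertTrue(is_leap_year(2024))

    def test_ordinary_year_is_not_leap(self):
        self.assertFalse(is_leap_year(2023))

    def test_century_is_leap_only_if_divisible_by_400(self):
        self.assertFalse(is_leap_year(1900))
        self.assertTrue(is_leap_year(2000))


# Step 2: run the tests and watch them fail. Before the function below
# is written, every test errors out with a NameError, which confirms
# that the tests and the harness are actually being exercised.

# Step 3: write the minimum code needed to pass the tests.
def is_leap_year(year: int) -> bool:
    """Hypothetical function under test."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
```

Saved as a module, the tests can be run with python -m unittest before and after each code change.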

In XP, this practice was primarily designed to operate in the context of unit tests, which are developer-written tests (also code) that test the classes and methods that are used. These are a form of “white-box testing” because they test the internals of the system and the various code paths that may be executed. Pair work is used extensively as well; when two sets of eyes have seen the code and the tests, it’s probable that the module is of high quality. Even when not pairing, the test is “the first other set of eyes” that sees the code, and developers note that they often Refactor the code in order to pass the test as simply and elegantly as possible. This is quality at the source, and it is one of the main reasons that SAFe relies on TDD.

Unit Tests

Most TDD is done in the context of unit testing, which frees QA and test personnel from spending most of their time finding and reporting code-level bugs. This allows additional focus on system-level testing challenges, where more complex behaviors emerge from the interactions of the unit code modules. In support of this, the open source community has built unit testing frameworks for most languages and technologies a developer is likely to encounter, including Java, C, C#, C++, XML, HTTP, and Python. These frameworks provide a harness for the development and maintenance of unit tests and for automatically executing unit tests against the system under development.

Because the unit tests are written before or concurrently with the code, and because the unit testing frameworks include test execution automation, unit testing can be accomplished within the Iteration. Moreover, the unit test frameworks hold and manage the accumulated unit tests, so regression testing automation for unit tests is largely free for the team. Unit testing is a cornerstone practice of software agility, and any investments a team makes toward more comprehensive unit testing will be well rewarded in quality and productivity.
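As a rough sketch of how a framework holds accumulated unit tests and re-runs them as a regression suite, the example below again uses Python's unittest. The Account class, its rules, and the run_regression helper are hypothetical names introduced only for illustration.

```python
import unittest


class Account:
    """Hypothetical class under test."""

    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


class AccountTest(unittest.TestCase):
    """White-box unit tests: one test per code path in Account."""

    def setUp(self):
        # The harness builds a fresh fixture before every test.
        self.account = Account(balance=100)

    def test_deposit_increases_balance(self):
        self.account.deposit(25)
        self.assertEqual(self.account.balance, 125)

    def test_non_positive_deposit_is_rejected(self):
        with self.assertRaises(ValueError):
            self.account.deposit(0)

    def test_withdraw_decreases_balance(self):
        self.account.withdraw(40)
        self.assertEqual(self.account.balance, 60)

    def test_overdraft_is_rejected_and_balance_unchanged(self):
        with self.assertRaises(ValueError):
            self.account.withdraw(101)
        self.assertEqual(self.account.balance, 100)


def run_regression():
    """Collect every accumulated test in AccountTest and execute it."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AccountTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test added during an iteration joins the suite automatically, so calling run_regression (or running python -m unittest in a build step) re-checks all earlier behavior at no extra authoring cost.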

Component Tests

In a like manner, component testing is used to test larger-scale components of the system. Many of these are present in various architectural layers, where they provide services needed by features or other components. Testing tools and practices for implementing component tests vary according to the nature of the component. For example, unit testing frameworks can hold arbitrarily complex tests written in the framework language (Java, C, C#, and so on), so many teams use their unit testing frameworks to build component tests. They may not even think of them differently, as it’s simply part of their testing strategy. In other cases, developers may use other testing tools or write fully customized tests in any language or environment that is most productive for them to test these larger system behaviors. These tests are automated as well, where they serve as a primary defense against unanticipated consequences of refactoring and new code.
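The sketch below shows one way a team might hold a component test in an ordinary unit testing framework. It assumes a hypothetical inventory component made of two collaborating classes, InventoryService and InMemoryStockRepository; the test drives the component only through its service interface, with the real collaborators wired together.

```python
import unittest


class InMemoryStockRepository:
    """Hypothetical storage layer used by the component."""

    def __init__(self):
        self._stock = {}

    def get(self, sku):
        return self._stock.get(sku, 0)

    def set(self, sku, quantity):
        self._stock[sku] = quantity


class InventoryService:
    """Hypothetical service that other features would call."""

    def __init__(self, repository):
        self._repository = repository

    def receive(self, sku, quantity):
        self._repository.set(sku, self._repository.get(sku) + quantity)

    def reserve(self, sku, quantity):
        available = self._repository.get(sku)
        if quantity > available:
            return False  # refuse the reservation, leave stock untouched
        self._repository.set(sku, available - quantity)
        return True

    def available(self, sku):
        return self._repository.get(sku)


class InventoryComponentTest(unittest.TestCase):
    """Exercises service and repository together as one component."""

    def setUp(self):
        self.service = InventoryService(InMemoryStockRepository())

    def test_reserving_received_stock_reduces_availability(self):
        self.service.receive("SKU-1", 10)
        self.assertTrue(self.service.reserve("SKU-1", 4))
        self.assertEqual(self.service.available("SKU-1"), 6)

    def test_reservation_beyond_stock_is_refused(self):
        self.service.receive("SKU-1", 2)
        self.assertFalse(self.service.reserve("SKU-1", 3))
        self.assertEqual(self.service.available("SKU-1"), 2)
```

Because the test depends only on the service's public methods, the internals of either class can be refactored and the same test will report whether the component's behavior has changed.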

Acceptance Test-Driven Development

Quadrant 2 of the Agile Testing Matrix shows that test-first applies as well to testing stories, features, and capabilities as it does to unit testing. After all, the goal is to have the system work as intended, not to simply have the code do as intended. This is called acceptance test-driven development (ATDD), and whether it is adopted formally or informally, many teams simply find it more efficient to write the acceptance test first, before developing the code. Pugh [4] notes that the emphasis here can be viewed more as expressing requirements in unambiguous terms than as a focus on the test per se. He further notes that there are three alternative labels for this requirement detailing process: ATDD, Specification by Example, and Behavior-Driven Design. There are some slight differences among these three versions, but they all emphasize understanding requirements prior to implementation. In particular, specification by example suggests that the Product Owner should be sure to provide examples, as they often do not write the acceptance tests themselves.

Whether it’s viewed as a form of requirements expression or as a test, the understanding that results is the same. The acceptance tests serve to record the decisions made in the conversation (see the “3Cs” in the “Writing Stories” section of Story, referring to card, conversation, and confirmation) between the team and the Product Owner, so that the team understands the specifics of the intended behavior the story represents.

Functional Tests

Story acceptance tests are functional tests intended to ensure that the implementation of each new user story delivers the intended behavior. The testing is performed during the course of an iteration. If all the new stories work as intended, then it’s likely that each new increment of software will ultimately satisfy the needs of the users.

Feature and capability acceptance testing is performed during the course of a Program Increment. The tools used are generally the same, but these tests operate at the next level of abstraction, typically testing how some number of stories work together to deliver a larger value to the user. Of course, there can easily be multiple feature acceptance tests associated with a more complex feature, and the same goes for stories. In this manner, there is strong verification that the system works as intended at the feature, capability, and story levels.

The following are characteristics of functional tests. They are:

  • Written in the language of the business
  • Developed in a conversation between the developers, testers, and the Product Owner
  • Black-box tests that verify only the outputs of the system and meet conditions of satisfaction, without concern for how the result is achieved
  • Implemented during the course of the iteration in which the story is implemented
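To make these characteristics concrete, here is a small sketch of a story acceptance test in Python's unittest. The story ("orders of 50.00 or more ship free"), the 5.99 shipping fee, and the quote_order entry point are all hypothetical and stand in for whatever the team and Product Owner agree on in conversation.

```python
import unittest
from decimal import Decimal


def quote_order(items):
    """Hypothetical system entry point: price a list of (name, price) items."""
    subtotal = sum((price for _, price in items), Decimal("0"))
    shipping = Decimal("0") if subtotal >= Decimal("50") else Decimal("5.99")
    return {"subtotal": subtotal, "shipping": shipping,
            "total": subtotal + shipping}


class FreeShippingStoryAcceptanceTest(unittest.TestCase):
    """Black-box tests: only inputs and outputs are examined."""

    def test_order_of_50_or_more_ships_free(self):
        # Given a cart worth exactly 50.00
        cart = [("book", Decimal("30.00")), ("pen set", Decimal("20.00"))]
        # When the shopper asks for a quote
        quote = quote_order(cart)
        # Then shipping is free and the total equals the subtotal
        self.assertEqual(quote["shipping"], Decimal("0"))
        self.assertEqual(quote["total"], Decimal("50.00"))

    def test_order_under_50_pays_standard_shipping(self):
        # Given a cart worth 49.99
        cart = [("book", Decimal("49.99"))]
        # When the shopper asks for a quote
        quote = quote_order(cart)
        # Then the standard shipping fee is added
        self.assertEqual(quote["shipping"], Decimal("5.99"))
        self.assertEqual(quote["total"], Decimal("55.98"))
```

The test names and Given/When/Then comments use the vocabulary of the story rather than of the code, so the Product Owner can review them even without reading the implementation.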

Although everyone can write tests, the Product Owner, as business owner/Customer proxy, is generally responsible for the efficacy of the tests. If a story does not pass its test, the team gets no credit for that story and it is carried over into the next iteration, when the code and/or the test are reworked until the story passes the test.

Features, capabilities, and stories cannot be considered done until they pass one or more acceptance tests. Stories realize the intended features and capabilities. There can be more than one test associated with a particular feature, capability, or story.

Automating Acceptance Testing

Because acceptance tests run at a level above the code, there are a variety of approaches to executing them, including handling them as manual tests. However, manual tests pile up very quickly (“the faster you go, the faster they grow, the slower you go”), and eventually, the number of manual tests required to run a regression slows down the team and introduces major delays in value delivery.

To avoid this, teams know that they have to automate most of their acceptance tests. They use a variety of tools to do so: the target programming language (Perl, Groovy, Java); natural language as supported by specific testing frameworks, such as Robot Framework or Cucumber; or table formats as supported by the Framework for Integrated Test (Fit). The preferred approach is to take a high level of abstraction that works directly against the business logic of the application, so that the tests are not encumbered by the presentation layer or other implementation details.
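The table-based style can be approximated in plain Python, as sketched below. This is not Fit, Cucumber, or Robot Framework syntax; it is a hand-rolled illustration of the idea, and the discount rule, the discount_rate function, and the parse_table helper are hypothetical. Note that the test calls the business logic directly and never touches a user interface.

```python
import unittest
from decimal import Decimal


def discount_rate(customer_type: str, order_total: Decimal) -> Decimal:
    """Hypothetical business rule, called directly by the tests."""
    if customer_type == "gold":
        return Decimal("0.10")
    if order_total >= Decimal("100"):
        return Decimal("0.05")
    return Decimal("0")


# Examples agreed with the Product Owner, kept in a readable table.
EXAMPLES = """
| customer type | order total | expected discount |
| regular       | 99.99       | 0                 |
| regular       | 100.00      | 0.05              |
| gold          | 10.00       | 0.10              |
"""


def parse_table(text):
    """Turn a pipe-delimited table into a list of dicts keyed by header."""
    rows = [[cell.strip() for cell in line.strip().strip("|").split("|")]
            for line in text.strip().splitlines()]
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


class DiscountAcceptanceTest(unittest.TestCase):
    def test_examples_from_table(self):
        for row in parse_table(EXAMPLES):
            # subTest reports each failing row separately.
            with self.subTest(example=row):
                actual = discount_rate(row["customer type"],
                                       Decimal(row["order total"]))
                self.assertEqual(actual, Decimal(row["expected discount"]))
```

Adding a new acceptance example then means adding one row to the table, which keeps the cost of automation low as the number of stories grows.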

Acceptance Test Template/Checklist

An ATDD checklist can help the team consider a simple list of things to do, review, and discuss each time a new story appears. ASR [2] provides an example of such a story acceptance-testing checklist.

Learn More

[1] Crispin, Lisa, and Janet Gregory. Agile Testing: A Practical Guide for Testers and Agile Teams. Addison-Wesley, 2009.

[2] Leffingwell, Dean. Agile Software Requirements: Lean Requirements Practices for Teams, Programs, and the Enterprise. Addison-Wesley, 2011.

[3] Beck, Kent. Test-Driven Development. Addison-Wesley, 2003.

[4] Pugh, Ken. Lean-Agile Acceptance Test-Driven Development: Better Software Through Collaboration. Addison-Wesley, 2011.


Last update: 29 September 2015