Refactoring towards testability

When working on a legacy codebase, you often have to make changes in code you don’t fully understand. When this code is not tested, you’re never sure that it will still work after adapting it. And will it have unexpected effects on other parts of the codebase? As a software engineer at madewithlove, I often help other companies dealing with their legacy applications. I’ve learned that to efficiently change this code, you need to have a very short feedback loop. This verifies that your application will work as intended so that you don’t have to manually test all possible paths that could be affected by your change. It is often quite easy to write end-to-end tests for your application or manually test the user interface or the input and output it gives. But these tests typically take a long time to run, so they are run less. This in turn makes it a lot harder to pinpoint what exactly is wrong when the tests fails and when exactly the issue was introduced.

Writing unit tests is the easiest way to create a short feedback loop between developing an app and knowing that something is wrong with it. Badly written code can make it really hard to write such tests and trickier to move towards a more confident development process. Luckily, there are some easily recognisable anti-patterns of untestable code, creating clear paths to get rid of them. Better still, testable code will almost always be code with a better design and readability!

Extract code handling I/O

When code does I/O – such as database or API calls, sending emails, pushing notifications, or putting messages in a queue – it can be really hard to test that code. After all, you don’t want to send an email to a customer every time your testsuite is run
If for example code is talking to a database, you should try to encapsulate this logic into a specific class that only has this as a responsibility. The repository pattern is a common way of building such objects. After extracting this logic to its own unit, you will be able to put it behind an interface and inject that piece of code everywhere you need to communicate with your database. This has two major advantages for testability:

You only have to unit test database (or other) interactions in one place.
You can unit test other places, without having the slow input/output calls, by replacing your repository or other abstraction by a mock or a different implementation (such as a repository that holds an in-memory array of data) that follows the same interface/contract.

A simplified example could be this RegistrationController that does a static database call and sends an email:

 $user->getUsername(),
                'password' => $user->getHashedPassword(),
            ]
        );

        mail(
            $user->getEmail(),
            'Welcome to our platform',
            'Hi ' . $user->getUsername() . ', activate using this link: ' . ActivationLink::forUser($user),
            'From: [email protected]'
        );

        return $this->render('registered-template', compact($user));
    }
}

If you wrap the code that sends the email and that does the database calls into its own class, you can easily test them in isolation. An example of an integration tested repository would be:

repository = new DatabaseUserRepository(
            TestHelper::getTestDatabaseConnection()
        );
    }

    /** @test */
    public function itCanAddUsers()
    {
        $user = new User('username', 'hashed-password');

        $this->repository->add($user);

        $usersFromDatabase = TestHelper::getTestDatabaseConnection()
            ->select('*')
            ->from('users')
            ->where('id', '=', $user->getId())
            ->fetchAll();

        $this->assertCount(1, $usersFromDatabase);
        $this->assertEquals($user->getUsername(), $usersFromDatabase[0]['username']);
        $this->assertEquals($user->getHashedPassword(), $usersFromDatabase[0]['password']);
    }
}

connection = $connection;
    }

    public function add(User $user): void
    {
        $this->connection->insert('users', [
            'username' => $user->getUsername(),
            'password' => $user->getHashedPassword(),
        ]);

        $user->setId($this->databaseConnection->lastInsertId());
    }
}

If you use a similar approach for the mail component, you could inject the mailer and the repository in the controller which makes it testable:

createMock(UserRepository::class);
        $controller = new RegisterController(
            $userRepository,
            $this->createMock(UserMailer::class)
        );
        $user = new User('username', 'hashed-password');

        $userRepository->expects($this->once())
            ->method('add')
            ->with($user);

        $controller->register($user);
    }

    /** @test */
    public function itSendsAnActivationMailToTheRegisteredUser()
    {
        $userMailer = $this->createMock(UserMailer::class);
        $controller = new RegisterController(
            $this->createMock(UserRepository::class),
            $userMailer
        );
        $user = new User('username', 'hashed-password');

        $userMailer->expects($this->once())
            ->method('sendActivation')
            ->with($user);

        $controller->register($user);
    }
}

userRepository = $userRepository;
        $this->userMailer = $userMailer;
    }

    public function register(User $user)
    {
        $this->userRepository->add($user);
        $this->userMailer->sendActivation($user);

        return $this->render('registered-template', compact($user));
    }
}

Split up complex code into understandable pieces

If a method, class or file is too big, this one unit of code should usually be split up into many smaller pieces. One method can evolve into multiple methods or even multiple classes. The two big advantages of doing this are that smaller pieces of code are much easier to understand and much easier to test.

If this unit of code does many different things, chances are high that it has one main goal and that the other code executed is a side effect of this first action. So the best solution is to cleanly refactor the unit of code towards an event-driven architecture.

An example of this is the following piece of code, which sends out a webhook every time a blog post is added.

 $title,
        'content' => $content,
        'author' => $currentUser->getId(),
    ]
);

$webHookData = [
    'id' => urlencode($postId),
    'title' => urlencode($title),
];

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, 'http://webhook.example.com');
curl_setopt($ch, CURLOPT_POST, count($webHookData));
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($webHookData));

// send post requeste to the webhook url
curl_exec($ch);

// there might even be more side effects like sending out notifications
// to subscribed users

// more code that shows that the blogpost has been stored

Using a publish-subscribe pattern, you could easily make an event listener (or multiple event listeners) that gets triggered every time this event is published. A first step to doing that might look like this:

 $title,
        'content' => $content,
        'author' => $currentUser->getId(),
    ]
);

EventPublisher::publish(new BlogPostWasAdded(
    $currentUser->getId(),
    $postId,
    $title
));

// more code that shows that the blogpost has been stored

 urlencode($event->getBlogPostId()),
            'title' => urlencode($event->getBlogPostTitle()),
        ];

        $ch = curl_init();

        curl_setopt($ch, CURLOPT_URL, 'http://webhook.example.com');
        curl_setopt($ch, CURLOPT_POST, count($webHookData));
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($webHookData));

        curl_exec($ch);
    }
}

This code is still not really testable, because the webhook is called directly in this class. But with the technique used in the previous part (extract code handling IO), it is now really easy to refactor the code in a testable state.

Get rid of global state

Using global state is extremely unpredictable, because every part of your code can alter it. Some global state can even be altered from outside your application. The default timezone in PHP is an example of global state that can be configured outside of your application, but it can also be altered inside of your app during every request.

If your application starts with just a bit of global states, it feels under If control. However this quickly gets out of hand when multiple places in your code either add or edit – or even remove – the same global properties in your application. Other pieces of global state could even change automatically, one example being the current time.

Here’s an example of an untestable class using global state:

There are multiple issues with this code:

Unpredictable methods are used (such as sys_get_temp_dir and temp_name).
We use a global variable.
The global variable can be changed by the CanWriteFiles::verify method which could make the function calls after that unpredictable.

By wrapping the static/unpredictable code in services, injecting these through the constructor and by passing the global user id to the method, we can easily extract all global state out of the method and make it testable. A better iteration for this class could look like this:

canWriteFiles = $canWriteFiles;
        $this->temporaryFiles = $temporaryFiles;
    }

    public function writeToTemporaryFile(string $data, int $userId): string
    {
        $this->canWriteFiles->verify($userId);

        return $this->temporaryFiles->write($userId, $data);
    }
}

We can now perfectly test the logic in this class using a test:

canWriteFiles = $this->createMock(CanWriteFiles::class);
        $this->temporaryFiles = $this->createMock(TemporaryFiles::class);
        $this->fileManager = new FileManager(
            $this->canWriteFiles,
            $this->temporaryFiles
        );
    }

    /**
     * @test
     */
    public function itDoesNotWriteAFileIfUserHasNotEnoughRights()
    {
        $this->givenUserCannotWriteFiles(1);

        $this->temporaryFiles->expects($this->never())
            ->method('write');

        $this->expectException(CannotWriteFiles::class);
        $this->fileManager->writeToTemporaryFile('data', 1);
    }

    /**
     * @test
     */
    public function itDoesReturnAFilenameIfUserCanWriteFiles()
    {
        $this->givenuserCanWriteFiles(1);

        $this->temporaryFiles->expects($this->once())
            ->method('write')
            ->with('data', 1)
            ->willReturn('/tmp/filename');

        $fileName = $this->fileManager->writeToTemporaryFile('data', 1);

        $this->assertEquals('/tmp/filename', $fileName);
    }

    private function givenUserCannotWriteFiles(int $userId)
    {
        $this->canWriteFiles->method('verify')
            ->with($userId)
            ->willThrowException(CannotWriteFiles::withUserId($userId));
    }

    private function givenUserCanWriteFiles(int $userId)
    {
        $this->canWriteFiles->method('verify')
            ->with($userId)
            ->willReturn(true);
    }
}

Avoid instantiating services in code

Whenever you see the new keyword pop up in your code and it is used to instantiate anything but a value object or an entity, you’re likely creating services here that you would prefer to replace by another implementation (e.g. a mock) in your tests.

A good example of this is a piece of code that has to call a fictional microservice, in order to see if a user is subscribed:

 'http://api-base-url.example.com/'
        ]);

        $response = $client->get($endpoint, $requestData);

        if ($response->getStatusCode() != 200) {
            throw new RequestFailedException('could not call ' . $endpoint);
        }

        return json_decode((string) $response->getData(), true);
    }
}

This code is untestable without doing real requests to the production version of a microservice that contains real application data. As a result, such code is prone to changes. When we pull out the client and pass it to the class using dependency injection, we can easily make the class testable. So we can also easily configure the client differently in other environments (e.g. a staging and a local microservice). The result could look like this:

createMock(HttpClientInterface::class);
        $caller = new MicroServiceCaller($client);
        $endpoint = '/isUserSubscribed';
        $requestData = ['user_id' => 123];

        $this->client->expects($this->once())
            ->method('get')
            ->with($endpoint, $requestData)
            ->willReturn(
                Response::withData(
                    '{"data":{"marketing_mails": false, "notifications": true}}'
                )
            );

        $response = $caller->call($endpoint, $requestData);

        $this->assertEquals(
            [
                'marketing_mails' => false,
                'notifications' => true,
            ],
            $response
        );
    }

    /** @test*/
    public function itThrowsWhenTheMicroServiceIsNotReachable()
    {
        $client = $this->createMock(HttpClientInterface::class);
        $caller = new MicroServiceCaller($client);

        $this->client->expects($this->once())
            ->method('get')
            ->willReturn(Response::withErrorCode(400));

        $this->expectException(RequestFailedException::class);
        $caller->call('/isUserSubscribed', ['user_id' => 123]);
    }
}

client = $client;
    }

    public function get(string $endpoint, array $requestData): array
    {
        $response = $this->client->get($endpoint, $requestData);

        if ($response->getStatusCode() != 200) {
            throw new RequestFailedException('could not call ' . $endpoint);
        }

        $responseData = json_decode((string) $response->getData(), true);

        return $responseData['data'];
    }
}

No more statics

The main issue with testing static methods is that it basically is procedural code. Since static methods never have state, they can be entirely unpredictable and can call other code that has side effects which you don’t want to (or even cannot) execute while running your tests. It is also extremely hard to mock code that is static. In PHP, the ‘patchwork’ framework lets you do exactly that. But trust me, you shouldn’t use it! This framework will slow down your tests enormously, because it hooks into all internal calls and checks to see if it has to monkey patch execution paths on the fly. So let’s avoid having to do that!

An example of untestable code is this controller, which uses static methods to see which user it has to show the detail page:

userRepository = $userRepository;
        $this->templating = $templating;
    }

    public function detail(): ResponseInterface
    {
        $userId = RequestData::get('id');

        if (!$userId) {
            $userId = Authentication::getCurrentUserId();
        }

        try {
            $user = $this->userRepository->find($userId);
        } catch (UserNotFound $exception) {
            return new NotFoundResponse();
        }

        return $this->templating->render('user-detail', compact($user));
    }
}

Both static calls might do a lot in the background. Maybe the Request::get method depends on $_POST variables to be available, while the Authentication::getCurrentUserId might depend on a session being started and having user data in there. We can do better! By using a request object that is instantiatable and can be passed to the controller, we can easily mock the request or change it with a different implementation that does not have any dependencies. This would result in something like this:

userRepository = $userRepository;
        $this->templating = $templating;
    }

    public function detail(AuthenticatedRequest $request): ResponseInterface
    {
        $userId = $request->getParameter('id');

        if (!$userId) {
            $userId = $request->getAuthenticatedUserId();
        }

        try {
            $user = $this->userRepository->find($userId);
        } catch (UserNotFound $exception) {
            return new NotFoundResponse();
        }

        return $this->templating->render('user-detail', compact($user));
    }
}

userRepository = $this->createMock(UserRepository::class);
        $this->userController = new UserController(
            $this->userRepository,
            new JsonTemplating() // could be a mock, but let's use a dummy implementation for once
        );
    }

    /** @test */
    public function itUsesTheUserIdProvidedInTheRequest()
    {
        $requestedUser = new User(12, 'John', 'Doe');
        $authenticatedUser = new User(17, 'Jane', 'Roe');
        $this->userRepository->expects($this->once())
            ->method('find')
            ->with($requestedUser->getId())
            ->willReturn();

        $response = $this->userController->detail(new AuthenticatedRequest(
            $authenticatedUser,
            ['id' => $requestedUser->getId()]
        ));

        $this->assertEquals(
            new JsonResponse(json_encode($requestedUser)),
            $response
        );
    }

    /** @test */
    public function itFallsBackToTheAuthenticatedUserIfNoIdIsProvided()
    {
        $authenticatedUser = new User(17, 'Jane', 'Roe');
        $this->userRepository->expects($this->once())
            ->method('find')
            ->with($authenticatedUser->getId())
            ->willReturn();

        $response = $this->userController->detail(new AuthenticatedRequest(
            new AuthenticatedUser(17)
        ));

        $this->assertEquals(
            new JsonResponse(json_encode($authenticatedUser)),
            $response
        );
    }

    /** @test */
    public function itReturnsANotFoundWhenNoUserCouldBeFound()
    {
        $this->userRepository->expects($this->once())
            ->method('find')
            ->willThrowException(UserNotFound::class);

        $response = $this->userController->detail(new AuthenticatedRequest(
            new AuthenticatedUser(17)
        ));

        $this->assertEquals(new NotFoundResponse(), $response);
    }
}

No die/exit statements

When your code contains statements that stop code execution, this will also stop execution of your tests when it reaches that point. Consequently your testsuite won’t be able to output the information it collected by running the testsuite and – even worse – it won’t be able to run all tests coming after your exit or die statement. This is clearly a no-go for testability.

A simplified example of a piece of code that does exactly that could be this untestable service:

mailer = $mailer;
    }

    public function notify(User $user): void
    {
        if (!$user->hasEmail()) {
            die('please add an email to the user');
        }

        $this->mailer->send(
            'Welcome to our platform',
            'Thanks for registering, please activate using ...',
            $user->getEmail()
        );
    }
}

There are two common strategies to get rid of die/exit statements in code:

Use a return instead of a die. This is commonly done to do redirects, instead of echoing headers and then doing an exit. You could return a RedirectResponse object instead: this object then gets converted to the right headers in the outer layer of your code.
Throw an exception instead of exiting immediately. This way, the execution of your code is still stopped, but you can choose to still do something with your exception by catching it wherever you want. This could for example allow you to create a global error handler in your application wich logs the error and shows a user friendly message to the user.

By using the second strategy, we can refactor the code and get to this result:

mailer = $mailer;
    }

    public function notify(User $user): void
    {
        if (!$user->hasEmail()) {
            throw CanNotSendEmail::withoutRecipients();
        }

        $this->mailer->send(
            'Welcome to our platform',
            'Thanks for registering, please activate using ...',
            $user->getEmail()
        );
    }
}

This can then easily be tested like this:

createMock(Mailer::class));
        $userWithoutEmail = new User('name');

        $this->expectException(CanNotSendEmail::class);
        $newUserNotifier->notify($userWithoutEmail);
    }

    /** @test */
    public function itEmailsNewUsersThatHaveAnEmailAddress()
    {
        $mailer = $this->createMock(Mailer::class);
        $newUserNotifier = new NewUserNotifier($mailer);

        $userWithEmail = new User('name', '[email protected]');

        $mailer->expects($this->once())
            ->method('send')
            ->with(
                $this->assertInternalType('string'),
                $this->assertInternalType('string'),
                '[email protected]'
            );
        $newUserNotifier->notify($userWithEmail);
    }
}

As you can see, some of the more common issues in untestable code aren’t that hard to resolve. Splitting up code and using dependency injection are the key strategies to get your untested legacy codebase finally under control. Have fun refactoring and adding tests!

Need help refactoring legacy code?

We love creating the best possible environment for your (technical) teams. We can help you or your team with:

Refactoring towards testability

Wouter Sioen

How to communicate a roadmap to VCs and investors

Never hire senior developers*

CTO Soundboard: our new leadership mentoring service for SaaS CTOs and tech leaders

Extract code handling I/O

Split up complex code into understandable pieces

Get rid of global state

Avoid instantiating services in code

No more statics

No die/exit statements

Need help refactoring legacy code?

Member discussion

Building a React Native/Flutter app is more than coding

PHP lied to me

Automating malware scanning on uploaded files

Why AI?

Be boring

Refactoring towards testability

Wouter Sioen

How to communicate a roadmap to VCs and investors

Never hire senior developers*

CTO Soundboard: our new leadership mentoring service for SaaS CTOs and tech leaders

Get all the latest posts delivered straight to your inbox.

Extract code handling I/O

Split up complex code into understandable pieces

Get rid of global state

Avoid instantiating services in code

No more statics

No die/exit statements

Need help refactoring legacy code?

Member discussion

Building a React Native/Flutter app is more than coding

PHP lied to me

Automating malware scanning on uploaded files

Why AI?

Be boring