Refactoring towards testability

Back-end

Refactoring towards testability

Wouter Sioen

Wouter Sioen

When working on a legacy codebase, you often have to make changes in code you don’t fully understand. When this code is not tested, you’re never sure that it will still work after adapting it. And will it have unexpected effects on other parts of the codebase?
As a software engineer at madewithlove, I often help other companies dealing with their legacy applications. I’ve learned that to efficiently change this code, you need to have a very short feedback loop. This verifies that your application will work as intended so that you don’t have to manually test all possible paths that could be affected by your change.
It is often quite easy to write end-to-end tests for your application or manually test the user interface or the input and output it gives. But these tests typically take a long time to run, so they are run less. This in turn makes it a lot harder to pinpoint what exactly is wrong when the tests fails and when exactly the issue was introduced.

Writing unit tests is the easiest way to create a short feedback loop between developing an app and knowing that something is wrong with it. Badly written code can make it really hard to write such tests and trickier to move towards a more confident development process. Luckily, there are some easily recognisable anti-patterns of untestable code, creating clear paths to get rid of them. Better still, testable code will almost always be code with a better design and readability!

Extract code handling I/O

When code does I/O – such as database or API calls, sending emails, pushing notifications, or putting messages in a queue –  it can be really hard to test that code. After all, you don’t want to send an email to a customer every time your testsuite is run
If for example code is talking to a database, you should try to encapsulate this logic into a specific class that only has this as a responsibility. The repository pattern is a common way of building such objects. After extracting this logic to its own unit, you will be able to put it behind an interface and inject that piece of code everywhere you need to communicate with your database. This has two major advantages for testability:

  • You only have to unit test database (or other) interactions in one place.
  • You can unit test other places, without having the slow input/output calls, by replacing your repository or other abstraction by a mock or a different implementation (such as a repository that holds an in-memory array of data) that follows the same interface/contract.

A simplified example could be this RegistrationController that does a static database call and sends an email:

<?php

class RegisterController extends BaseController
{
    public function register(User $user)
    {
        Database::insertUser(
            [
                'username' => $user->getUsername(),
                'password' => $user->getHashedPassword(),
            ]
        );

        mail(
            $user->getEmail(),
            'Welcome to our platform',
            'Hi ' . $user->getUsername() . ', activate using this link: ' . ActivationLink::forUser($user),
            'From: [email protected]'
        );

        return $this->render('registered-template', compact($user));
    }
}

If you wrap the code that sends the email and that does the database calls into its own class, you can easily test them in isolation. An example of an integration tested repository would be:

<?php

use PHPUnit\Framework\TestCase;

final class DatabaseUserRepositoryTest extends TestCase
{
    private $repository;

    public function setUp()
    {
        $this->repository = new DatabaseUserRepository(
            TestHelper::getTestDatabaseConnection()
        );
    }

    /** @test */
    public function itCanAddUsers()
    {
        $user = new User('username', 'hashed-password');

        $this->repository->add($user);

        $usersFromDatabase = TestHelper::getTestDatabaseConnection()
            ->select('*')
            ->from('users')
            ->where('id', '=', $user->getId())
            ->fetchAll();

        $this->assertCount(1, $usersFromDatabase);
        $this->assertEquals($user->getUsername(), $usersFromDatabase[0]['username']);
        $this->assertEquals($user->getHashedPassword(), $usersFromDatabase[0]['password']);
    }
}
<?php

class DatabaseUserRepository implements UserRepository
{
    private $connection;

    public function __construct(DatabaseConnection $connection)
    {
        $this->connection = $connection;
    }

    public function add(User $user): void
    {
        $this->connection->insert('users', [
            'username' => $user->getUsername(),
            'password' => $user->getHashedPassword(),
        ]);

        $user->setId($this->databaseConnection->lastInsertId());
    }
}

If you use a similar approach for the mail component, you could inject the mailer and the repository in the controller which makes it testable:

<?php

use PHPUnit\Framework\TestCase;

final class RegisterControllerTest extends TestCase
{
    /** @test */
    public function itAddsTheUser()
    {
        $userRepository = $this->createMock(UserRepository::class);
        $controller = new RegisterController(
            $userRepository,
            $this->createMock(UserMailer::class)
        );
        $user = new User('username', 'hashed-password');

        $userRepository->expects($this->once())
            ->method('add')
            ->with($user);

        $controller->register($user);
    }

    /** @test */
    public function itSendsAnActivationMailToTheRegisteredUser()
    {
        $userMailer = $this->createMock(UserMailer::class);
        $controller = new RegisterController(
            $this->createMock(UserRepository::class),
            $userMailer
        );
        $user = new User('username', 'hashed-password');

        $userMailer->expects($this->once())
            ->method('sendActivation')
            ->with($user);

        $controller->register($user);
    }
}
<?php

class RegisterController extends BaseController
{
    private $userRepository;
    private $userMailer;

    public function __construct(UserRepository $userRepository, UserMailer $userMailer)
    {
        $this->userRepository = $userRepository;
        $this->userMailer = $userMailer;
    }

    public function register(User $user)
    {
        $this->userRepository->add($user);
        $this->userMailer->sendActivation($user);

        return $this->render('registered-template', compact($user));
    }
}

Split up complex code into understandable pieces

If a method, class or file is too big, this one unit of code should usually be split up into many smaller pieces. One method can evolve into multiple methods or even multiple classes. The two big advantages of doing this are that smaller pieces of code are much easier to understand and much easier to test.

If this unit of code does many different things, chances are high that it has one main goal and that the other code executed is a side effect of this first action. So the best solution is to cleanly refactor the unit of code towards an event-driven architecture.

An example of this is the following piece of code, which sends out a webhook every time a blog post is added.

<?php

// a lot of code that does stuff like authentication/reading and validating POST data/...

$postId = Database::insert(
    'blog_posts',
    [
        'title' => $title,
        'content' => $content,
        'author' => $currentUser->getId(),
    ]
);

$webHookData = [
    'id' => urlencode($postId),
    'title' => urlencode($title),
];

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, 'http://webhook.example.com');
curl_setopt($ch, CURLOPT_POST, count($webHookData));
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($webHookData));

// send post requeste to the webhook url
curl_exec($ch);

// there might even be more side effects like sending out notifications
// to subscribed users

// more code that shows that the blogpost has been stored

Using a publish-subscribe pattern, you could easily make an event listener (or multiple event listeners) that gets triggered every time this event is published. A first step to doing that might look like this:

<?php

// a lot of code that does stuff like authentication/reading and validating POST data/...

$postId = Database::insert(
    'blog_posts',
    [
        'title' => $title,
        'content' => $content,
        'author' => $currentUser->getId(),
    ]
);

EventPublisher::publish(new BlogPostWasAdded(
    $currentUser->getId(),
    $postId,
    $title
));

// more code that shows that the blogpost has been stored
<?php

class WebHookPublisher
{
    public function onBlogPostWasAdded(BlogPostWasAdded $event)
    {
        $webHookData = [
            'id' => urlencode($event->getBlogPostId()),
            'title' => urlencode($event->getBlogPostTitle()),
        ];

        $ch = curl_init();

        curl_setopt($ch, CURLOPT_URL, 'http://webhook.example.com');
        curl_setopt($ch, CURLOPT_POST, count($webHookData));
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($webHookData));

        curl_exec($ch);
    }
}

This code is still not really testable, because the webhook is called directly in this class. But with the technique used in the previous part (extract code handling IO), it is now really easy to refactor the code in a testable state.

Get rid of global state

Using global state is extremely unpredictable, because every part of your code can alter it. Some global state can even be altered from outside your application. The default timezone in PHP is an example of global state that can be configured outside of your application, but it can also be altered inside of your app during every request.

If your application starts with just a bit of global states, it feels under If  control. However this quickly gets out of hand when multiple places in your code either add or edit – or even remove – the same global properties in your application. Other pieces of global state could even change automatically, one example being the current time.

Here’s an example of an untestable class using global state:

<?php

final class FileManager
{
    public function writeToTemporaryFile(string $data)
    {
        global $userId;

        CanWriteFiles::verify($userId);

        $file = tempnam(sys_get_temp_dir(), $userId);
        file_put_contents($file, $data);

        return $file;
    }
}

There are multiple issues with this code:

  • Unpredictable methods are used (such as sys_get_temp_dir and temp_name).
  • We use a global variable.
  • The global variable can be changed by the CanWriteFiles::verify method which could make the function calls after that unpredictable.

By wrapping the static/unpredictable code in services, injecting these through the constructor and by passing the global user id to the method, we can easily extract all global state out of the method and make it testable. A better iteration for this class could look like this:

<?php

final class FileManager
{
    /**
     * @var CanWritefiles
     */
    private $canWriteFiles;

    /**
     * @var TemporaryFiles
     */
    private $temporaryFiles;

    public function __construct(CanWriteFiles $canWriteFiles, TemporaryFiles $temporaryFiles)
    {
        $this->canWriteFiles = $canWriteFiles;
        $this->temporaryFiles = $temporaryFiles;
    }

    public function writeToTemporaryFile(string $data, int $userId): string
    {
        $this->canWriteFiles->verify($userId);

        return $this->temporaryFiles->write($userId, $data);
    }
}

We can now perfectly test the logic in this class using a test:

<?php

use PHPUnit\Framework\TestCase;
use PHPUnit_Framework_MockObject_MockObject;

final class FileManagerTest extends TestCase
{
    /**
     * @var CanWriteFiles | PHPUnit_Framework_MockObject_MockObject
     */
    private $canWriteFiles;

    /**
     * @var TemporaryFiles | PHPUnit_Framework_MockObject_MockObject
     */
    private $temporaryFiles;

    /**
     * @var FileManager
     */
    private $fileManager;

    public function setUp()
    {
        $this->canWriteFiles = $this->createMock(CanWriteFiles::class);
        $this->temporaryFiles = $this->createMock(TemporaryFiles::class);
        $this->fileManager = new FileManager(
            $this->canWriteFiles,
            $this->temporaryFiles
        );
    }

    /**
     * @test
     */
    public function itDoesNotWriteAFileIfUserHasNotEnoughRights()
    {
        $this->givenUserCannotWriteFiles(1);

        $this->temporaryFiles->expects($this->never())
            ->method('write');

        $this->expectException(CannotWriteFiles::class);
        $this->fileManager->writeToTemporaryFile('data', 1);
    }

    /**
     * @test
     */
    public function itDoesReturnAFilenameIfUserCanWriteFiles()
    {
        $this->givenuserCanWriteFiles(1);

        $this->temporaryFiles->expects($this->once())
            ->method('write')
            ->with('data', 1)
            ->willReturn('/tmp/filename');

        $fileName = $this->fileManager->writeToTemporaryFile('data', 1);

        $this->assertEquals('/tmp/filename', $fileName);
    }

    private function givenUserCannotWriteFiles(int $userId)
    {
        $this->canWriteFiles->method('verify')
            ->with($userId)
            ->willThrowException(CannotWriteFiles::withUserId($userId));
    }

    private function givenUserCanWriteFiles(int $userId)
    {
        $this->canWriteFiles->method('verify')
            ->with($userId)
            ->willReturn(true);
    }
}

Avoid instantiating services in code

Whenever you see the new keyword pop up in your code and it is used to instantiate anything but a value object or an entity, you’re likely creating services here that you would prefer to replace by another implementation (e.g. a mock) in your tests.

A good example of this is a piece of code that has to call a fictional microservice, in order to see if a user is subscribed:

<?php

class MicroServiceCaller
{
    public function get(string $endpoint, array $requestData): array
    {
        $client = new HttpClient([
            'base_uri' => 'http://api-base-url.example.com/'
        ]);

        $response = $client->get($endpoint, $requestData);

        if ($response->getStatusCode() != 200) {
            throw new RequestFailedException('could not call ' . $endpoint);
        }

        return json_decode((string) $response->getData(), true);
    }
}

This code is untestable without doing real requests to the production version of a microservice that contains real application data. As a result, such code is prone to changes. When we pull out the client and pass it to the class using dependency injection, we can easily make the class testable. So we can also easily configure the client differently in other environments (e.g. a staging and a local microservice). The result could look like this:

<?php

use PHPUnit\Framework\TestCase;

final class MicroServiceCallerTest extends TestCase
{
    /** @test */
    public function itReturnsTheReturnedJsonDataAsArray()
    {
        $client = $this->createMock(HttpClientInterface::class);
        $caller = new MicroServiceCaller($client);
        $endpoint = '/isUserSubscribed';
        $requestData = ['user_id' => 123];

        $this->client->expects($this->once())
            ->method('get')
            ->with($endpoint, $requestData)
            ->willReturn(
                Response::withData(
                    '{"data":{"marketing_mails": false, "notifications": true}}'
                )
            );

        $response = $caller->call($endpoint, $requestData);

        $this->assertEquals(
            [
                'marketing_mails' => false,
                'notifications' => true,
            ],
            $response
        );
    }

    /** @test*/
    public function itThrowsWhenTheMicroServiceIsNotReachable()
    {
        $client = $this->createMock(HttpClientInterface::class);
        $caller = new MicroServiceCaller($client);

        $this->client->expects($this->once())
            ->method('get')
            ->willReturn(Response::withErrorCode(400));

        $this->expectException(RequestFailedException::class);
        $caller->call('/isUserSubscribed', ['user_id' => 123]);
    }
}
<?php

class MicroServiceCaller
{
    public function __construct(HttpClientInterface $client)
    {
        $this->client = $client;
    }

    public function get(string $endpoint, array $requestData): array
    {
        $response = $this->client->get($endpoint, $requestData);

        if ($response->getStatusCode() != 200) {
            throw new RequestFailedException('could not call ' . $endpoint);
        }

        $responseData = json_decode((string) $response->getData(), true);

        return $responseData['data'];
    }
}

No more statics

The main issue with testing static methods is that it basically is procedural code. Since static methods never have state, they can be entirely unpredictable and can call other code that has side effects which you don’t want to (or even cannot) execute while running your tests. It is also extremely hard to mock code that is static. In PHP, the ‘patchwork’ framework lets you do exactly that. But trust me, you shouldn’t use it! This framework will slow down your tests enormously, because it hooks into all internal calls and checks to see if it has to monkey patch execution paths on the fly. So let’s avoid having to do that!

An example of untestable code is this controller, which uses static methods to see which user it has to show the detail page:

<?php

class UserController
{
    private $userRepository;
    private $templating;

    public function __construct(UserRepository $userRepository, TemplateEngine $templating)
    {
        $this->userRepository = $userRepository;
        $this->templating = $templating;
    }

    public function detail(): ResponseInterface
    {
        $userId = RequestData::get('id');

        if (!$userId) {
            $userId = Authentication::getCurrentUserId();
        }

        try {
            $user = $this->userRepository->find($userId);
        } catch (UserNotFound $exception) {
            return new NotFoundResponse();
        }

        return $this->templating->render('user-detail', compact($user));
    }
}

Both static calls might do a lot in the background. Maybe the Request::get method depends on $_POST variables to be available, while the Authentication::getCurrentUserId might depend on a session being started and having user data in there. We can do better! By using a request object that is instantiatable and can be passed to the controller, we can easily mock the request or change it with a different implementation that does not have any dependencies. This would result in something like this:

<?php

class UserController
{
    private $userRepository;
    private $templating;

    public function __construct(UserRepository $userRepository, TemplateEngine $templating)
    {
        $this->userRepository = $userRepository;
        $this->templating = $templating;
    }

    public function detail(AuthenticatedRequest $request): ResponseInterface
    {
        $userId = $request->getParameter('id');

        if (!$userId) {
            $userId = $request->getAuthenticatedUserId();
        }

        try {
            $user = $this->userRepository->find($userId);
        } catch (UserNotFound $exception) {
            return new NotFoundResponse();
        }

        return $this->templating->render('user-detail', compact($user));
    }
}
<?php

use PHPUnit\Framework\TestCase;

final class UserControllerTest extends TestCase
{
    private $userController;
    private $userRepository;

    public function setUp()
    {
        $this->userRepository = $this->createMock(UserRepository::class);
        $this->userController = new UserController(
            $this->userRepository,
            new JsonTemplating() // could be a mock, but let's use a dummy implementation for once
        );
    }

    /** @test */
    public function itUsesTheUserIdProvidedInTheRequest()
    {
        $requestedUser = new User(12, 'John', 'Doe');
        $authenticatedUser = new User(17, 'Jane', 'Roe');
        $this->userRepository->expects($this->once())
            ->method('find')
            ->with($requestedUser->getId())
            ->willReturn();

        $response = $this->userController->detail(new AuthenticatedRequest(
            $authenticatedUser,
            ['id' => $requestedUser->getId()]
        ));

        $this->assertEquals(
            new JsonResponse(json_encode($requestedUser)),
            $response
        );
    }

    /** @test */
    public function itFallsBackToTheAuthenticatedUserIfNoIdIsProvided()
    {
        $authenticatedUser = new User(17, 'Jane', 'Roe');
        $this->userRepository->expects($this->once())
            ->method('find')
            ->with($authenticatedUser->getId())
            ->willReturn();

        $response = $this->userController->detail(new AuthenticatedRequest(
            new AuthenticatedUser(17)
        ));

        $this->assertEquals(
            new JsonResponse(json_encode($authenticatedUser)),
            $response
        );
    }

    /** @test */
    public function itReturnsANotFoundWhenNoUserCouldBeFound()
    {
        $this->userRepository->expects($this->once())
            ->method('find')
            ->willThrowException(UserNotFound::class);

        $response = $this->userController->detail(new AuthenticatedRequest(
            new AuthenticatedUser(17)
        ));

        $this->assertEquals(new NotFoundResponse(), $response);
    }
}

No die/exit statements

When your code contains statements that stop code execution, this will also stop execution of your tests when it reaches that point. Consequently your testsuite won’t be able to output the information it collected by running the testsuite and – even worse – it won’t be able to run all tests coming after your exit or die statement. This is clearly a no-go for testability.

A simplified example of a piece of code that does exactly that could be this untestable service:

<?php

class NewUserNotifier
{
    private $mailer;

    public function __construct(Mailer $mailer)
    {
        $this->mailer = $mailer;
    }

    public function notify(User $user): void
    {
        if (!$user->hasEmail()) {
            die('please add an email to the user');
        }

        $this->mailer->send(
            'Welcome to our platform',
            'Thanks for registering, please activate using ...',
            $user->getEmail()
        );
    }
}

There are two common strategies to get rid of die/exit statements in code:

  1. Use a return instead of a die. This is commonly done to do redirects, instead of echoing headers and then doing an exit. You could return a RedirectResponse object instead: this object then gets converted to the right headers in the outer layer of your code.
  2. Throw an exception instead of exiting immediately. This way, the execution of your code is still stopped, but you can choose to still do something with your exception by catching it wherever you want. This could for example allow you to create a global error handler in your application wich logs the error and shows a user friendly message to the user.

By using the second strategy, we can refactor the code and get to this result:

<?php

class NewUserNotifier
{
    private $mailer;

    public function __construct(Mailer $mailer)
    {
        $this->mailer = $mailer;
    }

    public function notify(User $user): void
    {
        if (!$user->hasEmail()) {
            throw CanNotSendEmail::withoutRecipients();
        }

        $this->mailer->send(
            'Welcome to our platform',
            'Thanks for registering, please activate using ...',
            $user->getEmail()
        );
    }
}

 

<?php

final class CanNotSendEmail extends InvalidArgumentException
{
    public static function withoutRecipients()
    {
        return new self('add recipients before sending an email');
    }
}

This can then easily be tested like this:

<?php

use PHPUnit\Framework\TestCase;

final class NewUserNotifierTest extends TestCase
{
    /** @test */
    public function itDoesNotSendEmailsWithoutRecipients()
    {
        $newUserNotifier = new NewUserNotifier($this->createMock(Mailer::class));
        $userWithoutEmail = new User('name');

        $this->expectException(CanNotSendEmail::class);
        $newUserNotifier->notify($userWithoutEmail);
    }

    /** @test */
    public function itEmailsNewUsersThatHaveAnEmailAddress()
    {
        $mailer = $this->createMock(Mailer::class);
        $newUserNotifier = new NewUserNotifier($mailer);

        $userWithEmail = new User('name', '[email protected]');

        $mailer->expects($this->once())
            ->method('send')
            ->with(
                $this->assertInternalType('string'),
                $this->assertInternalType('string'),
                '[email protected]'
            );
        $newUserNotifier->notify($userWithEmail);
    }
}

As you can see, some of the more common issues in untestable code aren’t that hard to resolve. Splitting up code and using dependency injection are the key strategies to get your untested legacy codebase finally under control. Have fun refactoring and adding tests!


Need help refactoring legacy code?

We love creating the best possible environment for your (technical) teams. We can help you or your team with:

 

Hire Wouter as a speaker?

Hire Wouter as a speaker?
Wouter Sioen

Wouter Sioen

Wouter looks like the nice kid you knew at high school. But boy oh boy, when it comes to problem-solving, this developer is like a pitbull on steroids. It must have been that world trip he made, which cleared his mind to the extent he can now fully sink his shiny teeth into fixing unsolvable bugs and automatic testing. If he applied himself as much to playing his ukelele, his girlfriend wouldn’t have to listen to that one song all the time. And no, we’re not talking ‘Somewhere Over the Rainbow'.

our blog Related blog articles

Do you have a question? Leave a comment