Unit Testing

Testing software piece by piece

Terms defined: actual result (of test), assertion, caching, defensive programming, design pattern, dynamic loading, error (in a test), exception handler, expected result (of test), exploratory programming, fail (a test), fixture, global variable, introspection, lifecycle, pass (a test), side effect, Singleton pattern, test runner, throw (exception), unit test

We have written many small programs in the previous two chapters, but haven't really tested any of them. That's OK for exploratory programming, but if we are building software that is going to be used instead of just read, we should try to make sure it works.

A tool for writing and running unit tests is a good first step. Such a tool should be able to:

- find files containing tests;
- find the tests in those files;
- run those tests;
- capture their results; and
- report each test's result and a summary of those results.

Our design is inspired by tools like Mocha and Jest, which were in turn inspired by tools built for languages like Java [Meszaros2007, Tudose2020].

How should we handle unit testing?

As in Mocha and other frameworks, every one of our unit tests will be a function of zero arguments (so that the framework can run every test the same way). Each test will create a fixture to be tested and use assertions to compare the actual result against the expected result. Each test can have one of three outcomes:

- Pass: the actual result matches the expected result.
- Fail: the actual result is different from the expected result.
- Error: something went wrong with the test itself rather than with the thing being tested.

To make this work, we need some way to distinguish failing tests from broken ones. Our solution relies on the fact that exceptions are objects and that a program can use introspection to determine the class of an object. If a test throws an exception and that exception's class is assert.AssertionError, then we will assume the exception came from one of the assertions we put in the test as a check. Any other kind of exception indicates that the test itself contains an error.

Mental model of unit testing
Running tests that can pass, fail, or contain errors.

How can we separate test registration, execution, and reporting?

To start, let's use a handful of global variables to record tests and their results:

import assert from 'assert'

// State of tests.
const HopeTests = []
let HopePass = 0
let HopeFail = 0
let HopeError = 0

The function hopeThat saves a descriptive message and a callback function that implements a test in one of these global variables. (We don't run tests immediately because we want to wrap each one in our own exception handler.)

// Record a single test for running later.
const hopeThat = (message, callback) => {
  HopeTests.push([message, callback])
}

Because we're appending tests to an array, they will be run in the order in which they are registered, but we shouldn't rely on that. Every unit test should be independent so that an error or failure in an early test doesn't affect the result of a later one.
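
For example, if one test modifies state that another test reads, the second test's result depends on whether the first has already run. A sketch of the anti-pattern to avoid, using the hopeThat function above:

// Anti-pattern: this test mutates state that outlives it.
let shared = []

hopeThat('first test appends to shared list', () => {
  shared.push('x')
  assert(shared.length === 1)
})

// This test only passes if it runs before the one above.
hopeThat('second test sees an empty list', () => {
  assert(shared.length === 0)
})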

Finally, the function main runs all registered tests:

// Run all of the tests that have been asked for and report summary.
const main = () => {
  HopeTests.forEach(([message, test]) => {
    try {
      test()
      HopePass += 1
    } catch (e) {
      if (e instanceof assert.AssertionError) {
        HopeFail += 1
      } else {
        HopeError += 1
      }
    }
  })

  console.log(`pass ${HopePass}`)
  console.log(`fail ${HopeFail}`)
  console.log(`error ${HopeError}`)
}

If a test completes without an exception, it passes. If any of the assert calls inside the test raises an AssertionError, the test fails, and if it raises any other exception, it's an error. After all tests are run, main reports the number of results of each kind.

Let's try it out:

// Something to test (doesn't handle zero properly).
const sign = (value) => {
  if (value < 0) {
    return -1
  } else {
    return 1
  }
}

// These two should pass.
hopeThat('Sign of negative is -1', () => assert(sign(-3) === -1))
hopeThat('Sign of positive is 1', () => assert(sign(19) === 1))

// This one should fail.
hopeThat('Sign of zero is 0', () => assert(sign(0) === 0))

// This one is an error.
hopeThat('Sign misspelled is error', () => assert(sgn(1) === 1))

// Call the main driver.
main()
The output is:

pass 2
fail 1
error 1

Our simple "framework" does what it's supposed to, but:

  1. It doesn't tell us which tests have passed or failed.

  2. Those global variables should be consolidated somehow so that it's clear they belong together.

  3. It doesn't discover tests on its own.

  4. We don't have a way to test things that are supposed to raise AssertionError. Putting assertions into code to check that it is behaving correctly is called defensive programming; it's a good practice, but we should make sure those assertions are failing when they're supposed to, just as we should test our smoke detectors every once in a while.
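
We could close that last gap with a helper that expects an assertion to fail. A minimal sketch, built on the hopeThat framework above (the name hopeThrows is hypothetical):

// Hypothetical helper: the test passes only if the callback
// raises assert.AssertionError, i.e., a defensive check fired.
const hopeThrows = (message, callback) => {
  hopeThat(message, () => {
    try {
      callback()
    } catch (e) {
      if (e instanceof assert.AssertionError) {
        return // the assertion failed as expected, so the test passes
      }
      throw e // anything else really is an error in the test
    }
    // No exception at all means the defensive check did not fire.
    throw new assert.AssertionError({ message: `expected assertion: ${message}` })
  })
}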

How should we structure test registration?

The next version of our testing tool solves the first two problems in the original by putting the testing machinery in a class. It uses the Singleton design pattern to ensure that only one object of that class is ever created. Singletons are a way to manage related global variables like the ones we're using to record tests and their results, and if we change our mind later about only having one instance of the class, there will be less code to rewrite and re-test.

The file hope.js defines the class and exports one instance of it. The heart of the class records each test for later execution, then runs them all and classifies every outcome as a pass, fail, or error (the reporting methods are shown later):

import assert from 'assert'
import caller from 'caller'

class Hope {
  constructor () {
    this.todo = []
    this.passes = []
    this.fails = []
    this.errors = []
  }

  // Record a test, prefixing its message with the registering file's name.
  test (comment, callback) {
    this.todo.push([`${caller()}::${comment}`, callback])
  }

  // Run all recorded tests and classify each result.
  run () {
    this.todo.forEach(([comment, test]) => {
      try {
        test()
        this.passes.push(comment)
      } catch (e) {
        if (e instanceof assert.AssertionError) {
          this.fails.push(comment)
        } else {
          this.errors.push(comment)
        }
      }
    })
  }
}

export default new Hope()

This strategy relies on two things:

  1. Node executes the code in a JavaScript module as it loads it, which means that it runs new Hope() and exports the newly-created object.

  2. Node caches modules, which means that a given module is only loaded once no matter how many times it is imported.
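
We can see both points in action with a tiny experiment (the file name check.js is hypothetical):

// check.js: both imports resolve to the same cached module,
// so there is only ever one Hope instance.
import hopeA from './hope.js'
import hopeB from './hope.js'

console.log(hopeA === hopeB) // prints 'true'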

Once a program has imported hope, it can call Hope.test to record a test for later execution and Hope.run to execute all of the tests registered up until that point.

Recording and running tests
Creating a singleton, recording tests, and running them.

Finally, our class can report results both as a terse one-line summary and as a detailed listing. It can also provide the titles and results of individual tests so that if someone wants to format them in a different way (e.g., as HTML) they can do so:

  terse () {
    return this.cases()
      .map(([title, results]) => `${title}: ${results.length}`)
      .join(' ')
  }

  verbose () {
    let report = ''
    let prefix = ''
    for (const [title, results] of this.cases()) {
      report += `${prefix}${title}:`
      prefix = '\n'
      for (const r of results) {
        report += `${prefix}  ${r}`
      }
    }
    return report
  }

  cases () {
    return [
      ['passes', this.passes],
      ['fails', this.fails],
      ['errors', this.errors]]
  }

Who's calling?

Hope.test uses the caller module to get the name of the function that is registering a test. Reporting the test's name helps the user figure out where to start debugging, and getting it via introspection rather than requiring the user to pass it into the call reduces typing and eliminates the problem of a function called test_this telling the framework that its name is test_that.

How can we build a command-line driver for our test manager?

The most important concern in our design is to keep the files containing tests as simple as possible. A couple of import statements to get assert and hope and then one function call per test is about as simple as it gets:

import assert from 'assert'
import hope from './hope.js'

hope.test('Sum of 1 and 2', () => assert((1 + 2) === 3))

We don't want users to have to list files containing tests explicitly, so we will load test files dynamically. While import is usually written as a statement, it can also be used as an async function that takes a path as a parameter and loads the corresponding file. As before, loading files executes the code they contain, which registers tests as a side effect via calls to hope.test:

import minimist from 'minimist'
import glob from 'glob'
import hope from './hope.js'

const main = async (args) => {
  const options = parse(args)
  if (options.filenames.length === 0) {
    options.filenames = glob.sync(`${options.root}/**/test-*.js`)
  }
  for (const f of options.filenames) {
    await import(f)
  }
  hope.run()
  const result = (options.output === 'terse')
    ? hope.terse()
    : hope.verbose()
  console.log(result)
}
...
main(process.argv.slice(2))

By default, this program finds all files below the current working directory whose names match the pattern test-*.js and uses terse output. Since we may want to look for files somewhere else, or request verbose output, the program needs to handle command-line arguments.

The minimist module does this in a way that is consistent with Unix conventions. Given command-line arguments after the program's name (i.e., from process.argv[2] onward), it looks for patterns like -x something and creates an object with flags as keys and values associated with them.

Filenames in minimist

If we use a command line like pray.js -v something.js, then something.js becomes the value of -v. To indicate that we want something.js added to the list of trailing filenames associated with the special key _ (a single underscore), we have to write pray.js -v -- something.js. The double dash is a common Unix convention for signalling the end of parameters.
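
The parse function elided from pray.js can be built directly on minimist; for example, minimist(['-v', '--', 'a.js']) produces an object like { v: true, _: ['a.js'] }. A minimal sketch: the -v flag matches the usage shown below, while -r for the root directory is an assumption of this sketch, not something minimist requires:

import minimist from 'minimist'

// Hypothetical parse(): turn raw arguments into the options object
// that main() expects (root, output, and filenames).
const parse = (args) => {
  const options = { root: '.', output: 'terse', filenames: [] }
  const argv = minimist(args)
  for (const key in argv) {
    if (key === '_') {
      options.filenames = argv._ // trailing filenames
    } else if (key === 'r') {
      options.root = argv.r // where to search for test files
    } else if (key === 'v') {
      options.output = 'verbose' // detailed listing instead of one line
    } else {
      console.error(`unrecognized option ${key}`)
    }
  }
  return options
}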

Our test runner is now complete, so we can try it out with some files containing tests that pass, fail, and contain errors:

node pray.js -v
passes:
  /u/stjs/unit-test/test-add.js::Sum of 1 and 2
  /u/stjs/unit-test/test-sub.js::Difference of 1 and 2
fails:
  /u/stjs/unit-test/test-div.js::Quotient of 1 and 0
  /u/stjs/unit-test/test-mul.js::Product of 1 and 2
errors:
  /u/stjs/unit-test/test-missing.js::Sum of x and 0

Infinity is allowed

test-div.js contains the line:

hope.test('Quotient of 1 and 0', () => assert((1 / 0) === 0))

This test counts as a failure rather than an error because JavaScript thinks the result of dividing by zero is the special value Infinity rather than an arithmetic error.
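
We can check this at the Node prompt:

console.log(1 / 0)                // Infinity
console.log((1 / 0) === Infinity) // true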

The lifecycle of a pair of files test-add.js and test-sub.js is shown below:

  1. pray loads hope.js.
  2. Loading hope.js creates a single instance of the class Hope.
  3. pray uses glob to find files with tests.
  4. pray loads test-add.js using import as a function.
  5. As test-add.js runs, it loads hope.js. Since hope.js is already loaded, this does not create a new instance of Hope.
  6. test-add.js uses hope.test to register a test (which does not run yet).
  7. pray then loads test-sub.js
  8. …which loads Hope
  9. …then registers a test.
  10. pray can now ask the unique instance of Hope to run all of the tests, then get a report from the Hope singleton and display it.
Unit testing lifecycle
Lifecycle of dynamically-discovered unit tests.

Exercises

Asynchronous globbing

Modify pray.js to use the asynchronous version of glob rather than glob.sync.

Timing tests

Install the microtime package and then modify the dry-run.js example so that it records and reports the execution times for tests.

Approximately equal

  1. Write a function assertApproxEqual that does nothing if two values are within a certain tolerance of each other but throws an exception if they are not:

    # throws exception
    assertApproxEqual(1.0, 2.0, 0.01, 'Values are too far apart')
    
    # does not throw
    assertApproxEqual(1.0, 2.0, 10.0, 'Large margin of error')
    
  2. Modify the function so that a default tolerance is used if none is specified:

    # throws exception
    assertApproxEqual(1.0, 2.0, 'Values are too far apart')
    
    # does not throw
    assertApproxEqual(1.0, 2.0, 'Large margin of error', 10.0)
    
  3. Modify the function again so that it checks the relative error instead of the absolute error. (The relative error is the absolute value of the difference between the actual and expected value, divided by the absolute value.)

Rectangle overlay

A windowing application represents rectangles using objects with four values: x and y are the coordinates of the lower-left corner, while w and h are the width and height. All values are non-negative: the lower-left corner of the screen is at (0, 0) and the screen's size is WIDTHxHEIGHT.

  1. Write tests to check that an object represents a valid rectangle.

  2. The function overlay(a, b) takes two rectangles and returns either a new rectangle representing the region where they overlap or null if they do not overlap. Write tests to check that overlay is working correctly.

  3. Do your tests assume that two rectangles that touch on an edge overlap or not? What about two rectangles that only touch at a single corner?

Selecting tests

Modify pray.js so that if the user provides -s pattern or --select pattern then the program only runs tests in files that contain the string pattern in their name.

Tagging tests

Modify hope.js so that users can optionally provide an array of strings to tag tests:

hope.test('Difference of 1 and 2',
          () => assert((1 - 2) === -1),
          ['math', 'fast'])

Then modify pray.js so that if users specify either -t tagName or --tag tagName only tests with that tag are run.

Mock objects

A mock object is a simplified replacement for part of a program whose behavior is easier to control and predict than the thing it is replacing. For example, we may want to test that our program does the right thing if an error occurs while reading a file. To do this, we write a function that wraps fs.readFileSync:

const mockReadFileSync = (filename, encoding = 'utf-8') => {
  return fs.readFileSync(filename, encoding)
}

and then modify it so that it throws an exception under our control. For example, if we define MOCK_READ_FILE_CONTROL like this:

const MOCK_READ_FILE_CONTROL = [false, false, true, false, true]

then the third and fifth calls to mockReadFileSync throw an exception instead of reading data, as do any calls after the fifth. Write this function.

Setup and teardown

Testing frameworks often allow programmers to specify a setup function that is to be run before each test and a corresponding teardown function that is to be run after each test. (setup usually re-creates complicated test fixtures, while teardown functions are sometimes needed to clean up after tests, e.g., to close database connections or delete temporary files.)

Modify the testing framework in this chapter so that if a file of tests contains something like this:

const createFixtures = () => {
  ...do something...
}

hope.setup(createFixtures)

then the function createFixtures will be called exactly once before each test in that file. Add a similar way to register a teardown function with hope.teardown.

Multiple tests

Add a method hope.multiTest that allows users to specify multiple test cases for a function at once. For example, this:

hope.multiTest('check all of these', functionToTest, [
  [['arg1a', 'arg1b'], 'result1'],
  [['arg2a', 'arg2b'], 'result2'],
  [['arg3a', 'arg3b'], 'result3']
])

should be equivalent to this:

hope.test('check all of these 0',
  () => assert(functionToTest('arg1a', 'arg1b') === 'result1')
)
hope.test('check all of these 1',
  () => assert(functionToTest('arg2a', 'arg2b') === 'result2')
)
hope.test('check all of these 2',
  () => assert(functionToTest('arg3a', 'arg3b') === 'result3')
)

Assertions for sets and maps

  1. Write functions assertSetEqual and assertMapEqual that check whether two instances of Set or two instances of Map are equal.

  2. Write a function assertArraySame that checks whether two arrays have the same elements, even if those elements are in different orders.

Testing promises

Modify the unit testing framework to handle async functions, so that:

hope.test('delayed test', async () => {...})

does the right thing. (Note that you can use typeof to determine whether the object given to hope.test is a function or a promise.)