# Debugger

## Running programs under the control of a breakpointing debugger

Terms defined: breakpoint, source map

We have finally come to one of the topics that sparked this book: how does a debugger work? (The other was layout engines, discussed in .) Debuggers are as much a part of good programmers' lives as version control but are taught far less often (in part, we believe, because it's harder to create homework questions for them). In this chapter we will build a simple single-stepping debugger; in doing so, we will show one way to test interactive applications ().

## What is our starting point?

We would like to debug a higher-level language than the assembly code of , but we don't want to have to write a parser or wrestle with the ASTs of . As a compromise, we will represent programs as simple JSON data structures in which every element has the form [command ...args]:

// const a = [-3, -5, -1, 0, -2, 1, 3, 1]
// const b = Array()
// let largest = a[0]
// let i = 0
// while (i < length(a)) {
//   if (a[i] > largest) {
//     b.push(a[i])
//   }
//   i += 1
// }
// i = 0
// while (i < length(b)) {
//   console.log(b[i])
//   i += 1
// }

[
["defA", "a", ["data", -3, -5, -1, 0, -2, 1, 3, 1]],
["defA", "b", ["data"]],
["defV", "largest", ["getA", "a", ["num", 0]]],
["append", "b", ["getV", "largest"]],
["defV", "i", ["num", 0]],
["loop", ["lt", ["getV", "i"], ["len", "a"]],
["test", ["gt", ["getA", "a", ["getV", "i"]], ["getV", "largest"]],
["setV", "largest", ["getA", "a", ["getV", "i"]]],
["append", "b", ["getV", "largest"]]
],
["setV", "i", ["add", ["getV", "i"], ["num", 1]]]
],
["setV", "i", ["num", 0]],
["loop", ["lt", ["getV", "i"], ["len", "b"]],
["print", ["getA", "b", ["getV", "i"]]],
["setV", "i", ["add", ["getV", "i"], ["num", 1]]]
]
]
filter-base.json

Our virtual machine is structured like the one in . A real system would parse a program to create JSON, then translate JSON into assembly code, then assemble that to create machine instructions. Again, to keep things simple we will execute a program by removing comments and blank lines and then running commands by looking up the command name's and calling that method:

import assert from 'assert'

class VirtualMachineBase {
constructor (program) {
this.program = this.compile(program)
this.prefix = '>>'
}

compile (lines) {
const text = lines
.map(line => line.trim())
.filter(line => (line.length > 0) && !line.startsWith('//'))
.join('\n')
return JSON.parse(text)
}

run () {
this.env = {}
this.runAll(this.program)
}

runAll (commands) {
commands.forEach(command => this.exec(command))
}

exec (command) {
const [op, ...args] = command
assert(op in this,
Unknown op "${op}") return this[op](args) } } export default VirtualMachineBasevm-base.js Remember, functions and methods are just another kind of data, so if an object has a method called "meth", the expression this["meth"] will look it up and the expression this["meth"](args) will call it. If the string "meth" is stored in a variable called name, then this[name](args) will do exactly the same thing. The method in our VM that defines a new variable with an initial value looks like this:  defV (args) { this.checkOp('defV', 2, args) const [name, value] = args this.env[name] = this.exec(value) }vm-base.js while the one that adds two values looks like this:  add (args) { this.checkOp('add', 2, args) const left = this.exec(args[0]) const right = this.exec(args[1]) return left + right }vm-base.js Running a while loop is:  loop (args) { this.checkBody('loop', 1, args) const body = args.slice(1) while (this.exec(args[0])) { this.runAll(body) } }vm-base.js and checking that a variable name refers to an array is:  checkArray (op, name) { this.checkName(op, name) const array = this.env[name] assert(Array.isArray(array), Variable "${name}" used in "${op}" is not array) }vm-base.js The other operations are similar to these. ## How can we make a tracing debugger? The next thing we need in our debugger is a source map that keeps track of where in the source file each instruction came from. Since JSON is a subset of JavaScript, we could get line numbers by parsing our programs with Acorn. However, we would then have to scrape the information we want for this example out of the AST. Since this chapter is supposed to be about debugging, not parsing, we will instead cheat and add a line number to each interesting statement by hand so that our program looks like this:  [ [1, "defA", "a", ["data", -3, -5, -1, 0, -2, 1, 3, 1]], [2, "defA", "b", ["data"]], [3, "defV", "largest", ["getA", "a", ["num", 0]]], [4, "append", "b", ["getV", "largest"]], [5, "defV", "i", ["num", 0]], [6, "loop", ["lt", ["getV", "i"], ["len", "a"]], [7, "test", ["gt", ["getA", "a", ["getV", "i"]], ["getV", "largest"]], [8, "setV", "largest", ["getA", "a", ["getV", "i"]]], [9, "append", "b", ["getV", "largest"]] ], [11, "setV", "i", ["add", ["getV", "i"], ["num", 1]]] ], [13, "setV", "i", ["num", 0]], [14, "loop", ["lt", ["getV", "i"], ["len", "b"]], [15, "print", ["getA", "b", ["getV", "i"]]], [16, "setV", "i", ["add", ["getV", "i"], ["num", 1]]] ] ]filter-source-map.json Building the source map from that is simple; for now, we just modify exec to ignore the line number: import assert from 'assert' import VirtualMachineBase from './vm-base.js' class VirtualMachineSourceMap extends VirtualMachineBase { compile (lines) { const original = super.compile(lines) this.sourceMap = {} const result = original.map(command => this.transform(command)) return result } transform (node) { if (!Array.isArray(node)) { return node } if (Array.length === 0) { return [] } const [first, ...rest] = node if (typeof first !== 'number') { return [first, null, ...rest.map(arg => this.transform(arg))] } const [op, ...args] = rest this.sourceMap[first] = [op, first, ...args.map(arg => this.transform(arg))] return this.sourceMap[first] } exec (command) { const [op, lineNum, ...args] = command assert(op in this, Unknown op "${op}")
return this[op](args)
}
}

export default VirtualMachineSourceMap
vm-source-map.js

### It's not really cheating

We said that adding line numbers by hand was cheating, but it isn't. What we're actually doing is deferring a problem until we're sure we need to solve it. If our approach is clumsy or fails outright because of some aspect of design we didn't foresee, there will have been no point handling line numbers the "right" way. A good rule for software design is to tackle the thing you're least sure about first, using temporary code in place of what you think you'll eventually need.

The next step is to modify the VM's exec method so that it executes a callback function for each significant operation (where "significant" means "we bothered to record its line number"). Since we're not sure what our debugger is going to need, we give this callback the environment holding the current set of variables, the line number, and the operation being performed:

import assert from 'assert'

import VirtualMachineSourceMap from './vm-source-map.js'

class VirtualMachineCallback extends VirtualMachineSourceMap {
constructor (program, dbg) {
super(program)
this.dbg = dbg
this.dbg.setVM(this)
}

exec (command) {
const [op, lineNum, ...args] = command
this.dbg.handle(this.env, lineNum, op)
assert(op in this,
Unknown op "${op}") return this[op](args, lineNum) } message (prefix, val) { this.dbg.message(${prefix} ${val}) } } export default VirtualMachineCallback vm-callback.js We also modify the VM's constructor to record the debugger and give it a reference to the virtual machine (). We have to connect the two objects explicitly because each one needs a reference to the other, but one of them has to be created first. "A gets B then B tells A about itself" is a common pattern; we will look at other ways to manage it in the exercises. To run the program, we create a debugger object and pass it to the VM's constructor: import assert from 'assert' import readSource from './read-source.js' const main = () => { assert(process.argv.length === 5, 'Usage: run-debugger.js ./vm ./debugger input|-') const VM = require(process.argv[2]) const Debugger = require(process.argv[3]) const inFile = process.argv[4] const lines = readSource(inFile) const dbg = new Debugger() const vm = new VM(lines, dbg) vm.run() } main() run-debugger.js A simple debugger just traces interesting statements as they run: import DebuggerBase from './debugger-base.js' class DebuggerTrace extends DebuggerBase { handle (env, lineNum, op) { if (lineNum !== null) { console.log(${lineNum} / ${op}:${JSON.stringify(env)})
}
}
}

export default DebuggerTrace
debugger-trace.js

Let's try it on a program that adds the numbers in an array:

// const a = [-5, 1, 3]
// const total = 0
// let i = 0
// while (i < length(a)) {
//   total += a[i]
//   i += 1
// }
// console.log(total)

[
[1, "defA", "a", ["data", -5, 1, 3]],
[2, "defV", "total", ["num", 0]],
[3, "defV", "i", ["num", 0]],
[4, "loop", ["lt", ["getV", "i"], ["len", "a"]],
[5, "setV", "total",
["add", ["getV", "total"], ["getA", "a", ["getV", "i"]]]
],
[8, "setV", "i", ["add", ["getV", "i"], ["num", 1]]]
],
[10, "print", ["getV", "total"]]
]
sum-source-map.json
1 / defA: {}
2 / defV: {"a":[-5,1,3]}
3 / defV: {"a":[-5,1,3],"total":0}
4 / loop: {"a":[-5,1,3],"total":0,"i":0}
5 / setV: {"a":[-5,1,3],"total":0,"i":0}
8 / setV: {"a":[-5,1,3],"total":-5,"i":0}
5 / setV: {"a":[-5,1,3],"total":-5,"i":1}
8 / setV: {"a":[-5,1,3],"total":-4,"i":1}
5 / setV: {"a":[-5,1,3],"total":-4,"i":2}
8 / setV: {"a":[-5,1,3],"total":-1,"i":2}
10 / print: {"a":[-5,1,3],"total":-1,"i":3}
>> -1
sum-source-map-trace.out

## How can we make the debugger interactive?

What we have built so far is an always-on print statement. To turn it into an interactive debugger, we will use the prompt-sync module to manage user input with the following set of commands:

• ? or help to list commands.

• clear # to clear a breakpoint at a numbered line.

• list to list lines and breakpoints.

• next to go forward one line.

• print name to show a variable while at a breakpoint.

• run to run to the next breakpoint.

• stop # to break at a numbered line.

• variables to list all variable names.

• exit to exit immediately.

When the virtual machine calls the debugger, the debugger first checks whether or not it is on a numbered line. If it isn't, it hands control back to the VM. Otherwise, its action depends on our current state. If we are single-stepping or if this line is a breakpoint, Otherwise, it does nothing.

The overall structure of the interactive debugger is:

import prompt from 'prompt-sync'

import DebuggerBase from './debugger-base.js'

const PROMPT_OPTIONS = { sigint: true }

class DebuggerInteractive extends DebuggerBase {
constructor () {
super()
this.singleStep = true
this.breakpoints = new Set()
this.lookup = {
'?': 'help',
c: 'clear',
l: 'list',
n: 'next',
p: 'print',
r: 'run',
s: 'stop',
v: 'variables',
x: 'exit'
}
}

handle (env, lineNum, op) {
if (lineNum === null) {
return
}
if (this.singleStep) {
this.singleStep = false
this.interact(env, lineNum, op)
} else if (this.breakpoints.has(lineNum)) {
this.interact(env, lineNum, op)
}
}

}

export default DebuggerInteractivedebugger-interactive.js

It interacts with users by lookup up a command and invoking the corresponding method, just as the VM does:


interact (env, lineNum, op) {
let interacting = true
while (interacting) {
const command = this.getCommand(env, lineNum, op)
if (command.length === 0) {
continue
}
const [cmd, ...args] = command
if (cmd in this) {
interacting = this[cmd](env, lineNum, op, args)
} else if (cmd in this.lookup) {
interacting = this[this.lookup[cmd]](env, lineNum, op, args)
} else {
this.message(unknown command ${command} (use '?' for help)) } } } getCommand (env, lineNum, op) { const options = Object.keys(this.lookup).sort().join('') const display = [${lineNum} ${options}]  return this.input(display) .split(/\s+/) .map(s => s.trim()) .filter(s => s.length > 0) } input (display) { return prompt(PROMPT_OPTIONS)(display) }debugger-interactive.js ### Learning as we go We didn't originally put the input and output in methods that could be overridden, but realized later we needed to do this to make the debugger testable. Rather than coming back and rewriting this, we have done it here. With this structure in place, the command handlers are pretty straightforward. For example, this method moves us to the next line:  next (env, lineNum, op, args) { this.singleStep = true return false }debugger-interactive.js while this one prints the value of a variable:  print (env, lineNum, op, args) { if (args.length !== 1) { this.message('p[rint] requires one variable name') } else if (!(args[0] in env)) { this.message(unknown variable name "${args[0]}")
} else {
this.message(JSON.stringify(env[args[0]]))
}
return true
}debugger-interactive.js

After using this for a few moments, though we realized that we needed to change the signature of the loop method. We want to stop the loop each time it runs, and need to know where we are. We didn't allow for this in the base class, and we don't want to have to change every method, so we take advantage of the fact that JavaScript ignores any extra arguments passed to a method:

import VirtualMachineCallback from './vm-callback.js'

class VirtualMachineInteractive extends VirtualMachineCallback {
loop (args, lineNum) {
this.checkBody('loop', 1, args)
const body = args.slice(1)
while (this.exec(args[0])) {
this.dbg.handle(this.env, lineNum, 'loop')
this.runAll(body)
}
}
}

export default VirtualMachineInteractive
vm-interactive.js

This is sloppy, but it works; we will tidy it up in the exercises.

## How can we test an interactive application?

How can we test an interactive application like a debugger? The answer is, "By making it non-interactive." Like many tools over the past thirty years, our approach is based on a program called Expect. Our library replaces the input and output functions of the application being tested with callbacks, then provides input when asked and checks output when it is given ().

The results look like this:


describe('interactive debugger', () => {
it('runs and prints', (done) => {
setup('print-0.json')
.get('[1 ?clnprsvx] ')
.send('r')
.get('>> 0')
.run()
done()
})

it('breaks and resumes', (done) => {
setup('print-3.json')
.get('[1 ?clnprsvx] ')
.send('s 3')
.get('[1 ?clnprsvx] ')
.send('r')
.get('>> 0')
.get('>> 1')
.get('[3 ?clnprsvx] ')
.send('x')
.run()
done()
})
})test/test-expect.js

Our Expect class may be short, but it is hard to understand because it is so abstract:

import assert from 'assert'

class Expect {
constructor (subject, start) {
this.start = start
this.steps = []
subject.setTester(this)
}

send (text) {
this.steps.push({ op: 'toSystem', arg: text })
return this
}

get (text) {
this.steps.push({ op: 'fromSystem', arg: text })
return this
}

run () {
this.start()
assert.strictEqual(this.steps.length, 0,
'Extra steps at end of test')
}

toSystem () {
return this.next('toSystem')
}

fromSystem (actual) {
const expected = this.next('fromSystem')
assert.strictEqual(expected, actual,
Expected "${expected}" got "${actual}")
}

next (kind) {
assert(this.steps.length > 0,
'Unexpected end of steps')
assert.strictEqual(this.steps[0].op, kind,
Expected ${kind}, got "${this.steps[0].op}")
const text = this.steps[0].arg
this.steps = this.steps.slice(1)
return text
}
}

export default Expect
expect.js

Piece by piece:

• subject is the thing being tested.
• start is a callback to start the system running. It gives control to the subject, which then calls back into the test framework for input and output.
• get and send store things to be given to the subject and to be checked against its output. Both methods return this so that we can chain calls together.
• run starts the system and checks that all expected interactions have been used up when testing is done.
• toSystem and fromSystem use next to get the next test record, check its type, and return the string.

Let's modify the debugger to use the tester, keeping in mind that the prompt counts as an output (and yes, we forgot this in the first version):

import DebuggerInteractive from './debugger-interactive.js'

class DebuggerTest extends DebuggerInteractive {
constructor () {
super()
this.tester = null
}

setTester (tester) {
this.tester = tester
}

input (display) {
this.tester.fromSystem(display)
return this.tester.toSystem()
}

message (m) {
this.tester.fromSystem(m)
}
}

export default DebuggerTest
debugger-test.js

Again, we can't pass the tester as a constructor parameter because of initialization order, so we write a setup function to make sure everything is connected the right way:


import Expect from '../expect.js'
import VM from '../vm-interactive.js'
import Debugger from '../debugger-test.js'

const setup = (filename) => {
const dbg = new Debugger()
const vm = new VM(lines, dbg)
return new Expect(dbg, () => vm.run())
}test/test-expect.js

Let's try running our tests:

npm run test -- -g 'interactive debugger'
test-expect.sh

> stjs@1.0.0 test /u/stjs
> mocha */test/test-*.js "-g" "interactive debugger"

interactive debugger
✓ runs and prints
test-expect.out

That works—or does it? Why is only one test shown, and doesn't the summary appear? After a bit of digging, we realize that the debugger's exit command calls process.exit when the simulated program ends, so the whole program including the VM, debugger, and test framework stops immediately before the promises that contain the tests have run.

We could fix this by modifying the debugger callback to return an indication of whether or not execution should continue, then modify the VM to pay attention to that flag. However, this approach becomes very complicated when we have deeply-nested calls to exec, which will happen with loops and conditionals.

A better alternative is to use an exception for control flow. We can define our own kind of exception as an empty class: it doesn't need any data because we are only using it to get a typed object:

class HaltException {
}

export default HaltException
halt-exception.js

Next, we modify the debugger to throw this exception when asked to exit:

import HaltException from './halt-exception.js'
import DebuggerTest from './debugger-test.js'

class DebuggerExit extends DebuggerTest {
exit (env, lineNum, op, args) {
throw new HaltException()
}
}

export default DebuggerExit
debugger-exit.js

And finally we modify the VM to finish cleanly if this exception is thrown, but re-throw any other kind of exception:

import HaltException from './halt-exception.js'
import VirtualMachineInteractive from './vm-interactive.js'

class VirtualMachineExit extends VirtualMachineInteractive {
run () {
this.env = {}
try {
this.runAll(this.program)
} catch (exc) {
if (exc instanceof HaltException) {
return
}
throw exc
}
}
}

export default VirtualMachineExit
vm-exit.js

With these changes in place, we are finally able to test our interactive debugger:

npm run test -- -g 'exitable debugger'
test-exit.sh

> stjs@1.0.0 test /u/stjs
> mocha */test/test-*.js "-g" "exitable debugger"

exitable debugger
✓ runs and prints
✓ breaks and resumes

2 passing (7ms)
test-exit.out

## Exercises

Read the documentation for prompt-sync and then implement tab completion for the debugger.

Add a set command that sets the value of a variable to a new value in a running program. How do you handle setting array elements?

Modify the tracing debugger so that the statements inside loops and conditionals are indented for easier reading.

Our solution for handling loops is sloppy; fix it.

Modify the debugger and virtual machine to use a "continue executing" flag rather than throwing an exception when execution should end. Which approach is easier to understand? Which will be easier to extend in future?

Write a tool that takes a JSON program representation without statement numbers and produces one that numbers all of the interesting statements for debugging purposes. Use whatever definition of "interesting" you think would be most useful.

Implement a "next loop iteration" command that runs the program until it reaches the current point in the next iteration of the current loop.

Rather than having some objects call setXYZ methods in other objects, it is common practice to use a lookup table for mutual dependencies:

1. Every object initializes calls table.set(name, this) in its constructor.

2. Whenever object A needs the instance of object B, it calls table.lookup('B'). It does not store the result in a member variable.

Modify the virtual machine and debugger to use this pattern.

Modify the debugger and virtual machine to implement watchpoints that halt the program whenever the value of a variable changes.

Write a tool that translates the JSON program representation into the assembly code of . To simplify things, increase the number of registers so that there is always storage for intermediate results when doing arithmetic.