Code Generator
Modifying code to track coverage and execution times
Terms defined:
We've been writing tests since
Our code coverage tool will keep track of which functions have and haven't been called. Since we don't want to rewrite Node to make a note each time every function is called, we will modify the functions themselves by parsing the code with Acorn, inserting the instructions we need into the AST, and then turning the AST back into code.
How can we replace a function with another function?
The first thing we need is a way to wrap up an arbitrary function call.
If we declare a function in JavaScript with a parameter like ...args
,
all of the "extra" arguments in the call that don't line up with regular parameters
are stuffed into args
(func(...var)
to spread those values out.
We can use ...args
to capture all of the arguments to a function call
and forward them to another function.
Let's start by creating functions with a varying number of parameters
that run to completion or throw an exception,
then run them to make sure they do what we want:
let funcZero = () => console.log('funcZero')
let funcOne = (first) => console.log(`funcOne(${first})`)
let funcTwo = (first, second) => console.log(`funcTwo(${first}, ${second})`)
let funcError = () => {
console.log('funcError')
throw new Error('from funcError')
console.log('should not reach this')
}
const runAll = (title) => {
console.log(title)
funcZero()
funcOne(1)
funcTwo(1, 2)
try {
funcError()
} catch (error) {
console.log(`caught ${error} as expected`)
}
console.log()
}
runAll('first time')
We can now write a function that takes a function as an input and creates a new function that handles all of the errors in the original function:
const replace = (func) => {
return (...args) => {
console.log('before')
try {
const result = func(...args)
console.log('after')
return result
} catch (error) {
console.log('error')
throw error
}
}
}
funcZero = replace(funcZero)
funcOne = replace(funcOne)
funcTwo = replace(funcTwo)
funcError = replace(funcError)
runAll('second time')
Let's try it out:
first time
funcZero
funcOne(1)
funcTwo(1, 2)
funcError
caught Error: from funcError as expected
second time
before
funcZero
after
before
funcOne(1)
after
before
funcTwo(1, 2)
after
before
funcError
error
caught Error: from funcError as expected
This is an example of the
How can we generate JavaScript?
We could use a decorator to replace every function in our program with one that kept track of whether or not it was called, but we would have to apply it to every one of our functions. What we really want is a way to do this automatically for everything; for that, we need to parse and generate code.
Other ways to do it
A third way to achieve what we want is
to let the system turn code into runnable instructions
and then modify those instructions.
We can't do this because Node doesn't save the generated
Our tool will parse the JavaScript with Acorn to create an AST, modify the AST, and then use a library called Escodegen to turn the AST back into JavaScript. To start, let's look at the Acorn parse tree for a simple function definition, which is 76 lines of pretty-printed JSON:
import acorn from 'acorn'
const text = `const func = (param) => {
return param + 1
}`
const ast = acorn.parse(text, { sourceType: 'module' })
console.log(JSON.stringify(ast, null, 2))
{
"type": "Program",
"start": 0,
"end": 46,
"body": [
{
"type": "VariableDeclaration",
"start": 0,
"end": 46,
"declarations": [
{
"type": "VariableDeclarator",
"start": 6,
"end": 46,
"id": {
"type": "Identifier",
"start": 6,
"end": 10,
"name": "func"
},
"init": {
"type": "ArrowFunctionExpression",
"start": 13,
"end": 46,
"id": null,
"expression": false,
"generator": false,
"async": false,
"params": [
{
"type": "Identifier",
"start": 14,
"end": 19,
"name": "param"
}
],
"body": {
"type": "BlockStatement",
"start": 24,
"end": 46,
"body": [
{
"type": "ReturnStatement",
"start": 28,
"end": 44,
"argument": {
"type": "BinaryExpression",
"start": 35,
"end": 44,
"left": {
"type": "Identifier",
"start": 35,
"end": 40,
"name": "param"
},
"operator": "+",
"right": {
"type": "Literal",
"start": 43,
"end": 44,
"value": 1,
"raw": "1"
}
}
}
]
}
}
}
],
"kind": "const"
}
],
"sourceType": "module"
}
After inspecting a few nodes, we can try to create a few of our own and turn them into code:
import escodegen from 'escodegen'
const result = escodegen.generate({
type: 'BinaryExpression',
operator: '+',
left: { type: 'Literal', value: 40 },
right: { type: 'Literal', value: 2 }
})
console.log(result)
40 + 2
How can we count how often functions are executed?
Our tool will find all the function declaration nodes in the program
and insert a node to increment an entry in a global variable called __counters
.
Our test case is:
const TEXT = `
const funcOuter = (param) => {
return param + 1
}
const funcInner = (param) => {
return param + 1
}
for (const i of [1, 3, 5]) {
funcOuter(funcInner(i) + funcInner(i))
}
`
and the main function of our program is:
const main = () => {
const ast = acorn.parse(TEXT, { sourceType: 'module' })
const allNodes = []
walk.simple(ast, {
VariableDeclarator: (node, state) => {
if (node.init && (node.init.type === 'ArrowFunctionExpression')) {
state.push(node)
}
}
}, null, allNodes)
const names = {}
allNodes.forEach(node => insertCounter(names, node))
console.log(initializeCounters(names))
console.log(escodegen.generate(ast))
console.log(reportCounters())
}
To insert a count we call insertCounter
,
which records the function's name and modifies the node:
const insertCounter = (names, node) => {
const name = node.id.name
names[name] = 0
const body = node.init.body.body
const increment =
acorn.parse(`__counters['${name}'] += 1`, { sourceType: 'module' })
body.unshift(increment)
}
Notice how we don't try to build the nodes by hand, but instead construct the string we need, use Acorn to parse that, and use the result. Doing this saves us from embedding multiple lines of JSON in our program and also ensures that if the AST for our code ever changes, the program will do the right thing automatically.
Finally, we need to add a couple of helper functions:
const initializeCounters = (names) => {
const body = Object.keys(names).map(n => `'${n}': 0`).join(',\n')
return 'const __counters = {\n' + body + '\n}'
}
const reportCounters = () => {
return 'console.log(__counters)'
}
and to run it to make sure it all works:
const __counters = {
'funcOuter': 0,
'funcInner': 0
}
const funcOuter = param => {
__counters['funcOuter'] += 1;
return param + 1;
};
const funcInner = param => {
__counters['funcInner'] += 1;
return param + 1;
};
for (const i of [
1,
3,
5
]) {
funcOuter(funcInner(i) + funcInner(i));
}
console.log(__counters)
Too simple to be safe
Our simple approach doesn't work if functions can have the same names,
which they can if we use modules or
How can we time function execution?
We can use this same strategy to do many other things. For example, we can find out how long it takes functions to run by wrapping them up in code that records the start and end time of each call. As before, we find the nodes of interest and decorate them, then stitch the result together with a bit of administrative code:
const timeFunc = (text) => {
const ast = acorn.parse(text, { sourceType: 'module' })
const allNodes = gatherNodes(ast)
allNodes.forEach(node => wrapFuncDef(node))
return [
initializeCounters(allNodes),
escodegen.generate(ast),
reportCounters()
].join('\n')
}
Gathering nodes is straightforward:
const gatherNodes = (ast) => {
const allNodes = []
walk.simple(ast, {
VariableDeclarator: (node, state) => {
if (node.init && (node.init.type === 'ArrowFunctionExpression')) {
state.push(node)
}
}
}, null, allNodes)
return allNodes
}
as is wrapping the function definition:
const wrapFuncDef = (originalAst) => {
const name = originalAst.id.name
const wrapperAst = makeWrapperAst(name)
wrapperAst.init.body.body[0].declarations[0].init = originalAst.init
originalAst.init = wrapperAst.init
}
The only big difference is how we make the wrapper function. We create it with a placeholder for the original function so that we have a spot in the AST to insert the actual code:
const timeFunc = (text) => {
const ast = acorn.parse(text, { sourceType: 'module' })
const allNodes = gatherNodes(ast)
allNodes.forEach(node => wrapFuncDef(node))
return [
initializeCounters(allNodes),
escodegen.generate(ast),
reportCounters()
].join('\n')
}
One more test:
const __counters = {
'assignment': 0,
'readFile': 0
}
const assignment = (...originalArgs) => {
const originalFunc = range => {
let j = 0;
for (let i = 0; i < range; i += 1) {
j = i;
}
};
const startTime = Date.now();
try {
const result = originalFunc(...originalArgs);
const endTime = Date.now();
__counters['assignment'] += endTime - startTime;
return result;
} catch (error) {
const endTime = Date.now();
__counters['assignment'] += endTime - startTime;
throw error;
}
};
const readFile = (...originalArgs) => {
const originalFunc = (range, filename) => {
for (let i = 0; i < range; i += 1) {
fs.readFileSync(filename, 'utf-8');
}
};
const startTime = Date.now();
try {
const result = originalFunc(...originalArgs);
const endTime = Date.now();
__counters['readFile'] += endTime - startTime;
return result;
} catch (error) {
const endTime = Date.now();
__counters['readFile'] += endTime - startTime;
throw error;
}
};
const numLoops = 100000;
assignment(numLoops);
readFile(numLoops, 'index.md');
console.log(__counters)
OUTPUT
{ assignment: 1, readFile: 2807 }
Source-to-source translation is widely used in JavaScript:
tools like Babel use it to turn modern features like async
and await
(
Exercises
JSON to JavaScript
Write a tool that uses Escodegen to translate simple expressions written in JSON into runnable JavaScript. For example, the tool should translate:
['+', 3, ['*', 5, 'a']]
into:
3 + (5 * a)
JavaScript to HTML
Write a function that takes nested JavaScript function calls for generating HTML like this:
div(h1('title'), p('explanation'))
and turns them into HTML like this:
<div><h1>title</h1><p>explanation</p></div>
Handling modules
Modify the code that counts the number of times a function is called to handle functions with the same name from different modules.
Tracking calls
Write a decorator that takes a function as its argument and returns a new function that behaves exactly the same way except that it keeps track of who called it.
-
The program contains a stack where decorated functions push and pop their names as they are called and as they exit.
-
Each time a function is called it adds a record to an array to record its name and the name at the top of the stack (i.e., the most-recently-called decorated function).
Counting classical function definitions
Modify the code generator to handle functions declared with the function
keyword
as well as functions declared using =>
.
Recording input file size
-
Write a program that replaces all calls to
fs.readFileSync
with calls toreadFileSyncCount
. -
Write the function
readFileSyncCount
to read and return a file usingfs.readFileSync
but to also record the file's name and size in bytes. -
Write a third function
reportInputFileSizes
that reports what files were read and how large they were. -
Write tests for these functions using Mocha and
mock-fs
.
Checking argument types
Write a tool that modifies functions to check the types of their arguments at run-time.
-
Each function is replaced by a function that passes all of its arguments to
checkArgs
along with the function's name, then continues with the function's original operation. -
The first time
checkArgs
is called for a particular function it records the actual types of the arguments. -
On subsequent calls, it checks that the argument types match those of the first call and throws an exception if they do not.
Two-dimensional arrays
The function make2D
takes a row length and one or more values
and creates a two-dimensional array from those values:
make2D(2, 'a', 'b', 'c', 'd')
// produces [['a', 'b'], ['c', 'd']]
Write a function that searches code to find calls to make2D
and replaces them with inline arrays-of-arrays.
This function only has to work for calls with a fixed row length,
i.e., it does not have to handle make2D(N, 'a', 'b')
.
From require to import
Write a function that searches code for simple calls to require
and replaces them with calls to import
.
This function only needs to work for the simplest case;
for example, if the input is:
const name = require('module')
then the output is:
import name from 'module'
Removing empty constructors
Write a function that removes empty constructors from class definitions. For example, if the input is:
class Example {
constructor () {
}
someMethod () {
console.log('some method')
}
}
then the output should be:
class Example {
someMethod () {
console.log('some method')
}
}