Page Templates
Generating HTML pages from templates
Every program needs documentation in order to be usable, and the best place to put that documentation is on the web. Writing and updating pages by hand is time-consuming and error-prone, particularly when so many of their parts are the same, so most websites use some kind of tool to create HTML from templates.
Thousands of page templating systems have been written in the last thirty years
in every popular programming language
(and in fact one language, PHP, was created for this purpose).
Most of these systems use one of three designs:
- Mix commands in a language such as JavaScript with the HTML or Markdown using some kind of marker to indicate which parts are commands and which parts are to be taken as-is. This approach is taken by EJS, which we have used to write these lessons.
- Create a mini-language with its own commands, as Jekyll (the templating system used by GitHub Pages) does. Mini-languages are appealing because they are smaller and safer than general-purpose languages, but experience shows that they quickly grow many of the features of a general-purpose language. Again, some kind of marker must be used to show which parts of the page are code and which are ordinary text.
- Use specially-named attributes in the HTML. This approach has been the least popular, but eliminates the need for a special parser (since pages are valid HTML).
In this chapter we will build a simple page templating system using the third option.
We will process each page independently by parsing the HTML
and walking the resulting DOM tree.
What will our system look like?
Let's start by deciding what "done" looks like. Suppose we want to turn an array of strings into an HTML list. Our page will look like this:
<html>
<body>
<p>Expect three items</p>
<ul q-loop="item:names">
<li><span q-var="item"/></li>
</ul>
</body>
</html>
The attribute q-loop
tells the tool to repeat that node;
the loop variable and the collection being looped over
are the attribute's value, separated by a colon.
The attribute q-var
tells the tool to fill in the node with the value of the variable.
The output will look like HTML without any traces of how it was created:
<html>
<body>
<p>Expect three items</p>
<ul>
<li><span>Johnson</span></li>
<li><span>Vaughan</span></li>
<li><span>Jackson</span></li>
</ul>
</body>
</html>
Human-readable vs. machine-readable
The introduction said that mini-languages for page templating quickly start to accumulate extra features. We have already started down that road by putting the loop variable and loop target in a single attribute and parsing that attribute to get them out. Doing that makes loop elements easier for people to type, but means that important information is hidden from standard HTML processing tools, which can't know that this particular attribute of these particular elements contains multiple values or that those values should be extracted by splitting a string on a colon. We could instead require people to use two attributes, as in:
<ul q-loop="names" q-loop-var="item">
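To make the trade-off concrete, here is a minimal sketch of how a tool might read either notation. The helper name loopSpec is hypothetical and is not part of this chapter's code; it assumes it is given an object of attribute names and values like the one an HTML parser produces for a node.

```javascript
// Hypothetical helper: get the loop variable and the collection name
// from a node's attributes, accepting either notation.
const loopSpec = (attribs) => {
  if ('q-loop-var' in attribs) {
    // Two-attribute form: each value is stored separately.
    return { variable: attribs['q-loop-var'], collection: attribs['q-loop'] }
  }
  // Single-attribute form: split "variable:collection" on the colon.
  const [variable, collection] = attribs['q-loop'].split(':')
  return { variable, collection }
}

console.log(loopSpec({ 'q-loop': 'item:names' }))
console.log(loopSpec({ 'q-loop': 'names', 'q-loop-var': 'item' }))
```

Both calls describe the same loop. The difference is that only the second form can be understood by a tool that knows nothing about splitting on colons.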
What about processing templates? Our tool needs the template itself, somewhere to write its output, and some variables to use in the expansion. These variables might come from a configuration file, from a YAML header in the file itself, or from some mix of the two; for the moment, all we need to know is that we will pass them into the expansion function as an object:
const variables = {
  names: ['Johnson', 'Vaughan', 'Jackson']
}
const dom = readHtml('template.html')
const expander = new Expander(dom, variables)
expander.walk()
// getResult joins the accumulated pieces of output into a single string
console.log(expander.getResult())
How can we keep track of values?
Speaking of variables, we need a way to keep track of their current values: "current", because the value of a loop variable changes each time we go around the loop. We also need to maintain multiple sets of variables so that we can nest loops.
The standard solution is to create a stack of lookup tables.
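Each stack frame is an object that maps names to values; to find a variable, we search the frames from the top of the stack down and use the first definition we find.
Scoping rules
Searching the stack frame by frame is called dynamic scoping, since variables are found while the program is running.
The values in a running program are sometimes called an environment, so we have named our stack-handling class Env.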
Its methods let us push and pop new stack frames
and find a variable given its name;
if the variable can't be found,
Env.find
returns undefined
instead of throwing an exception:
class Env {
  constructor (initial) {
    this.stack = []
    this.push(Object.assign({}, initial))
  }

  push (frame) {
    this.stack.push(frame)
  }

  pop () {
    this.stack.pop()
  }

  find (name) {
    for (let i = this.stack.length - 1; i >= 0; i--) {
      if (name in this.stack[i]) {
        return this.stack[i][name]
      }
    }
    return undefined
  }

  toString () {
    return JSON.stringify(this.stack)
  }
}

export default Env
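The short sketch below shows how lookups behave when a loop pushes a frame that reuses a name. It repeats the class in condensed form only so that the example runs on its own; the variable values are made up for illustration.

```javascript
// Condensed copy of the Env class so that this example is self-contained.
class Env {
  constructor (initial) { this.stack = [Object.assign({}, initial)] }
  push (frame) { this.stack.push(frame) }
  pop () { this.stack.pop() }
  find (name) {
    for (let i = this.stack.length - 1; i >= 0; i--) {
      if (name in this.stack[i]) return this.stack[i][name]
    }
    return undefined
  }
}

const env = new Env({ item: 'outer', names: ['Johnson'] })
const before = env.find('item')     // found in the bottom frame
env.push({ item: 'inner' })         // a loop body shadows `item`
const during = env.find('item')     // the top frame wins
const inherited = env.find('names') // falls through to the bottom frame
env.pop()
const after = env.find('item')      // the outer value is visible again
const missing = env.find('missing') // undefined rather than an exception

console.log(before, during, inherited, after, missing)
```

Because find searches from the top of the stack down, the inner definition hides the outer one only until its frame is popped.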
How do we handle nodes?
HTML pages have a nested structure,
so we will process them using the Visitor design pattern.
The Visitor class's constructor takes the root node of the DOM tree as an argument and saves it.
When we call Visitor.walk
without a value,
it starts recursing from that saved root;
if .walk
is given a value (as it is during recursive calls),
it uses that instead.
import assert from 'assert'

class Visitor {
  constructor (root) {
    this.root = root
  }

  walk (node = null) {
    if (node === null) {
      node = this.root
    }
    // Only descend into the children if open says to do so.
    if (this.open(node)) {
      node.children.forEach(child => {
        this.walk(child)
      })
    }
    this.close(node)
  }

  open (node) {
    assert(false,
      'Must implement "open"')
  }

  close (node) {
    assert(false,
      'Must implement "close"')
  }
}

export default Visitor
Visitor
defines two methods called open
and close
that are called
when we first arrive at a node and when we are finished with it.
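As an illustration of how a derived class uses these two methods, the sketch below records an indented outline of the tags in a tree. The Outline class and the hand-built tree are hypothetical and are not part of the templating system; Visitor is repeated in condensed form so that the example runs on its own.

```javascript
// Condensed copy of Visitor so that this example is self-contained.
class Visitor {
  constructor (root) { this.root = root }
  walk (node = null) {
    if (node === null) node = this.root
    if (this.open(node)) {
      node.children.forEach(child => this.walk(child))
    }
    this.close(node)
  }
}

// Hypothetical subclass: build an indented outline of the element tags.
class Outline extends Visitor {
  constructor (root) {
    super(root)
    this.depth = 0
    this.lines = []
  }

  open (node) {
    if (node.type === 'text') return false // nothing to descend into
    this.lines.push('  '.repeat(this.depth) + node.name)
    this.depth += 1
    return true
  }

  close (node) {
    if (node.type !== 'text') this.depth -= 1
  }
}

// Hand-built stand-in for a parsed page.
const tree = {
  type: 'tag',
  name: 'body',
  children: [
    { type: 'tag', name: 'p', children: [{ type: 'text', data: 'hello' }] },
    { type: 'tag', name: 'ul', children: [] }
  ]
}

const outline = new Outline(tree)
outline.walk()
console.log(outline.lines.join('\n'))
```

Returning false from open is what stops walk from trying to visit the children of a text node.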
The Expander class is a Visitor and uses an Env.
It loads a handler for each type of special node we support (we will write these in a moment) and
uses them to process each type of node:
- If the node is plain text, copy it to the output.
- If there is a handler for the node, call the handler's open or close method.
- Otherwise, open or close a regular tag.
import assert from 'assert'

import Visitor from './visitor.js'
import Env from './env.js'

import q_if from './q-if.js'
import q_loop from './q-loop.js'
import q_num from './q-num.js'
import q_var from './q-var.js'

const HANDLERS = {
  'q-if': q_if,
  'q-loop': q_loop,
  'q-num': q_num,
  'q-var': q_var
}

class Expander extends Visitor {
  constructor (root, vars) {
    super(root)
    this.env = new Env(vars)
    this.handlers = HANDLERS
    this.result = []
  }

  open (node) {
    if (node.type === 'text') {
      this.output(node.data)
      return false
    } else if (this.hasHandler(node)) {
      return this.getHandler(node).open(this, node)
    } else {
      this.showTag(node, false)
      return true
    }
  }

  close (node) {
    if (node.type === 'text') {
      return
    }
    if (this.hasHandler(node)) {
      this.getHandler(node).close(this, node)
    } else {
      this.showTag(node, true)
    }
  }

  ...
}

export default Expander
Checking to see if there is a handler for a particular node and getting that handler are straightforward:
  hasHandler (node) {
    for (const name in node.attribs) {
      if (name in this.handlers) {
        return true
      }
    }
    return false
  }

  getHandler (node) {
    const possible = Object.keys(node.attribs)
      .filter(name => name in this.handlers)
    assert(possible.length === 1,
      'Should be exactly one handler')
    return this.handlers[possible[0]]
  }
Finally, we need a few helper methods to show tags and generate output:
  showTag (node, closing) {
    if (closing) {
      this.output(`</${node.name}>`)
      return
    }

    this.output(`<${node.name}`)
    for (const name in node.attribs) {
      if (!name.startsWith('q-')) {
        this.output(` ${name}="${node.attribs[name]}"`)
      }
    }
    this.output('>')
  }

  output (text) {
    this.result.push((text === undefined) ? 'UNDEF' : text)
  }

  getResult () {
    return this.result.join('')
  }
Notice that this class adds strings to an array and then joins them all right at the end rather than concatenating strings repeatedly. Doing this can be more efficient and also helps with debugging, since each string in the array corresponds to a single method call.
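The sketch below mimics that output logic outside the class to show what the array holds before it is joined. The standalone pieces array and output function are stand-ins for the Expander's result and output members, not additional parts of the system.

```javascript
// Stand-ins for Expander's `result` array and `output` method.
const pieces = []
const output = (text) => {
  pieces.push((text === undefined) ? 'UNDEF' : text)
}

output('<p>')
output(undefined) // for example, a variable lookup that found nothing
output('</p>')

console.log(pieces)     // one entry per call, which is handy when debugging
const page = pieces.join('')
console.log(page)
```

Inspecting the array shows exactly which call produced the UNDEF marker, which is harder to see once everything has been merged into one string.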
How do we implement node handlers?
So far we have built a lot of infrastructure but haven't actually processed a single special node. To do that, let's start with a handler that copies a constant number into the output:
export default {
  open: (expander, node) => {
    expander.showTag(node, false)
    expander.output(node.attribs['q-num'])
  },

  close: (expander, node) => {
    expander.showTag(node, true)
  }
}
When we enter a node like <span q-num="123"/>,
this handler prints an opening tag
and then copies the value of the q-num attribute to the output.
When we are exiting the node,
the handler closes the tag.
Note that this is not a class,
but instead an object with two functions stored under the keys open and close.
We could (and probably should) use a class for each handler
so that handlers can store any extra state they need.
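For comparison, a class-based handler might look like the sketch below. The class name NumHandler, its count field, and the stub object standing in for the Expander are all hypothetical; the chapter's own handlers remain plain objects.

```javascript
// Hypothetical class-based version of the q-num handler that also
// keeps some state: how many nodes it has expanded so far.
class NumHandler {
  constructor () {
    this.count = 0
  }

  open (expander, node) {
    this.count += 1
    expander.showTag(node, false)
    expander.output(node.attribs['q-num'])
    return false // the node's value comes from the attribute, not its children
  }

  close (expander, node) {
    expander.showTag(node, true)
  }
}

// Minimal stand-in for Expander with only the two methods the handler uses.
const stub = {
  result: [],
  output (text) { this.result.push(text) },
  showTag (node, closing) {
    this.output(closing ? `</${node.name}>` : `<${node.name}>`)
  }
}

const handler = new NumHandler()
const node = { name: 'span', attribs: { 'q-num': '123' } }
handler.open(stub, node)
handler.close(stub, node)
console.log(stub.result.join(''), handler.count)
```

The handler is called with the same arguments as before, so the Expander would only need to store an instance of the class in its table instead of a bare object.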
So much for constants; what about variables?
export default {
  open: (expander, node) => {
    expander.showTag(node, false)
    expander.output(expander.env.find(node.attribs['q-var']))
  },

  close: (expander, node) => {
    expander.showTag(node, true)
  }
}
This code is almost the same as the previous example; the only difference is that instead of copying the attribute value directly to the output, we use the attribute value as a key to look up a value in the environment.
These two pairs of handlers look plausible, but do they work? To find out, we can build a program that loads variable definitions from a JSON file, reads an HTML template, and does the expansion:
import fs from 'fs'
import htmlparser2 from 'htmlparser2'

import Expander from './expander.js'

const main = () => {
  const vars = readJSON(process.argv[2])
  const doc = readHtml(process.argv[3])
  const expander = new Expander(doc, vars)
  expander.walk()
  console.log(expander.getResult())
}

const readJSON = (filename) => {
  const text = fs.readFileSync(filename, 'utf-8')
  return JSON.parse(text)
}

const readHtml = (filename) => {
  const text = fs.readFileSync(filename, 'utf-8')
  return htmlparser2.parseDOM(text)[0]
}

main()
As we were writing this chapter, we added new variables for our test cases one by one. To avoid repetition, we show the entire set once:
{
  "firstVariable": "firstValue",
  "secondVariable": "secondValue",
  "variableName": "variableValue",
  "showThis": true,
  "doNotShowThis": false,
  "names": ["Johnson", "Vaughan", "Jackson"]
}
Our first test: is static text copied over as-is?
<html>
<body>
<h1>Only Static Text</h1>
<p>This document only contains:</p>
<ul>
<li>static</li>
<li>text</li>
</ul>
</body>
</html>
node template.js vars.json input-static-text.html
<html>
<body>
<h1>Only Static Text</h1>
<p>This document only contains:</p>
<ul>
<li>static</li>
<li>text</li>
</ul>
</body>
</html>
Only Static Text
This document only contains:
- static
- text
Good. Now, does the expander handle constants?
<html>
<body>
<p><span q-num="123"/></p>
</body>
</html>
<html>
<body>
<p><span>123</span></p>
</body>
</html>
123
What about a single variable?
<html>
<body>
<p><span q-var="variableName"/></p>
</body>
</html>
<html>
<body>
<p><span>variableValue</span></p>
</body>
</html>
variableValue
What about a page containing multiple variables? There's no reason it should fail if the single-variable case works, but variable lookup is one of the more complicated parts of our processing, so we should check:
<html>
<body>
<p><span q-var="firstVariable" /></p>
<p><span q-var="secondVariable" /></p>
</body>
</html>
<html>
<body>
<p><span>firstValue</span></p>
<p><span>secondValue</span></p>
</body>
</html>
firstValue
secondValue
How can we implement control flow?
Our tool supports two types of control flow:
conditional expressions and loops.
Since we don't support Boolean operators like "and" and "or",
implementing a conditional is as simple as looking up a variable
(which we know how to do)
and then expanding the node if the value is true:
export default {
  open: (expander, node) => {
    const doRest = expander.env.find(node.attribs['q-if'])
    if (doRest) {
      expander.showTag(node, false)
    }
    return doRest
  },

  close: (expander, node) => {
    if (expander.env.find(node.attribs['q-if'])) {
      expander.showTag(node, true)
    }
  }
}
Let's test it:
<html>
<body>
<p q-if="showThis">This should be shown.</p>
<p q-if="doNotShowThis">This should <em>not</em> be shown.</p>
</body>
</html>
<html>
<body>
<p>This should be shown.</p>
</body>
</html>
This should be shown.
And finally we come to loops. For these, we need to get the array we're looping over from the environment and do something once for each of its elements. That "something" is:
- Create a new stack frame holding the current value of the loop variable.
- Expand all of the node's children with that stack frame in place.
- Pop the stack frame to get rid of the temporary variable.
export default {
  open: (expander, node) => {
    const [indexName, targetName] = node.attribs['q-loop'].split(':')
    // showTag skips attributes whose names start with 'q-', so the node is
    // left unmodified and can be expanded again inside an enclosing loop.
    expander.showTag(node, false)
    const target = expander.env.find(targetName)
    for (const index of target) {
      expander.env.push({ [indexName]: index })
      node.children.forEach(child => expander.walk(child))
      expander.env.pop()
    }
    return false
  },

  close: (expander, node) => {
    expander.showTag(node, true)
  }
}
Once again, it's not done until we test it:
<html>
<body>
<p>Expect three items</p>
<ul q-loop="item:names">
<li><span q-var="item"/></li>
</ul>
</body>
</html>
<html>
<body>
<p>Expect three items</p>
<ul>
<li><span>Johnson</span></li>
<li><span>Vaughan</span></li>
<li><span>Jackson</span></li>
</ul>
</body>
</html>
Expect three items
- Johnson
- Vaughan
- Jackson
Notice how we create the new stack frame using:
{ [indexName]: index }
This is an ugly but useful trick. We can't write:
{ indexName: index }
because that would create an object with the string indexName
as a key,
rather than one with the value of the variable indexName
as its key.
We can't do this either:
{ `${indexName}`: index }
though it seems like we should be able to. Instead, we use a computed property name: the square brackets tell JavaScript to evaluate the expression inside them and use the result as the key. Our expression is therefore a quick way to get the same effect as:
const temp = {}
temp[indexName] = index
expander.env.push(temp)
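The following sketch compares the different ways of writing the key. The variable names other than indexName and index are made up for this example.

```javascript
const indexName = 'item'
const index = 'Johnson'

// A plain identifier is taken literally: the key is the text "indexName".
const literalKey = { indexName: index }

// Square brackets evaluate the expression: the key is the value "item".
const computedKey = { [indexName]: index }

// Any expression is allowed inside the brackets, including a template string.
const templateKey = { [`${indexName}-1`]: index }

// The long-hand equivalent of the computed key.
const temp = {}
temp[indexName] = index

console.log(literalKey, computedKey, templateKey, temp)
```

The second and fourth objects have identical contents, which is why the loop handler can build its stack frame in a single expression.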
How did we know how to do all of this?
We have just implemented a simple programming language. It can't do arithmetic, but if we wanted to add tags like:
<span q-math="+"><span q-var="width"/><span q-num="1"/></span>
we could.
It's unlikely anyone would use the result, since typing all of that
is so much clumsier than typing width+1
that people wouldn't use it
unless they had no other choice, but the basic design is there.
We didn't invent any of this from scratch,
any more than we invented the parsing algorithm of
Exercises
Tracing execution
Add a directive <span q-trace="variable"/>
that prints the current value of a variable using console.error
for debugging.
Unit tests
Write unit tests for template expansion using Mocha.
Trimming text
Modify all of the directives to take an extra optional attribute q-trim="true".
If this attribute is set,
leading and trailing whitespace is trimmed from the directive's expansion.
Literal text
Add a directive <div q-literal="true">…</div>
that copies the enclosed text as-is
without interpreting or expanding any contained directives.
(A directive like this would be needed when writing documentation for the template expander.)
Including other files
- Add a directive <div q-include="filename.html"/> that includes another file in the file being processed.
- Should included files be processed and the result copied into the including file, or should the text be copied in and then processed? What difference does it make to the way variables are evaluated?
HTML snippets
Add a directive <div q-snippet="variable">…</div>
that saves some text in a variable
so that it can be displayed later.
For example:
<html>
<body>
<div q-snippet="prefix"><strong>Important:</strong></div>
<p>Expect three items</p>
<ul>
<li q-loop="item:names">
<span q-var="prefix"/><span q-var="item"/>
</li>
</ul>
</body>
</html>
would print the word "Important:" in bold before each item in the list.
YAML headers
Modify the template expander to handle variables defined in a YAML header in the page being processed. For example, if the page is:
---
name: "Dorothy Johnson Vaughan"
---
<html>
<body>
<p><span q-var="name"/></p>
</body>
</html>
then expanding it will create a paragraph containing the given name.
Expanding all files
Write a program expand-all.js
that takes two directory names as command-line arguments
and builds a website in the second directory by expanding all of the HTML files found in the first
or in sub-directories of the first.
Counting loops
Add a directive <div q-index="indexName" q-limit="limitName">…</div>
that loops from zero to the value in the variable limitName,
putting the current iteration index in indexName.
Auxiliary functions
- Modify Expander so that it takes an extra argument auxiliaries containing zero or more named functions:
const expander = new Expander(root, vars, {
  max: Math.max,
  trim: (x) => x.trim()
})
- Add a directive <span q-call="functionName" q-args="var,var"/> that looks up a function in auxiliaries and calls it with the given variables as arguments.