Package Manager
Getting and installing packages
Terms defined:
There is no point building software if you can't install it. Inspired by the Comprehensive TeX Archive Network CTAN, most languages now have an online archive from which developers can download packages. Each package typically has a name and one or more version(s), each of which may have a list of dependencies, and the package may specify a version or range of versions for each of those dependencies.
Downloading files requires some web programming that is out of scope for this book,
while installing those files uses the systems programming skills of
This chapter explores how to find a workable installation or prove that there isn't one. It is based in part on this tutorial by Maël Nison.
Satisfiability
What we are trying to do is find a version for each package
that makes the assertion "P is compatible with all its dependencies" true
for every package P.
The general-purpose tools for doing this are called
What is semantic versioning?
Most software projects use
A package's authors increment its major version number
every time something changes in a way that makes the package incompatible with previous versions
(e.g., add a required parameter to a function).
They increment the minor version number when they new functionality
in a >= 1.2.3
means "any version after 1.2.3",
< 4
means "any version before 4.anything",
and 1.0 - 3.1
means "any version in the specified range (including patches)".
Note that version 2.1 is greater than version 1.99.
The semver
module provides useful functions for working with semantic version identifier.
semver.valid('1.2.3')
checks that 1.2.3
is a valid version identifier,
while semver.satisfies('2.2', '1.0 - 3.1')
checks that its first argument
is compatible with the range specified in its second.
How can we find a consistent set of packages?
Imagine that each package we need is represented as an axis on a graph,
with its versions as the tick marks
(
For example,
suppose we have the set of requirements shown in
Package | Requires |
---|---|
X/1 | Y/1-2 |
X/1 | Z/1 |
X/2 | Y/2-3 |
X/2 | Z/1-2 |
Y/1 | Z/2 |
Y/2 | Z/2-3 |
Y/3 | Z/3 |
Z/1 | |
Z/2 | |
Z/3 |
X | Y | Z | Excluded |
---|---|---|---|
1 | 1 | 1 | Y/1 - Z/1 |
1 | 1 | 2 | X/1 - Z/2 |
1 | 1 | 3 | X/1 - Z/3 |
1 | 2 | 1 | Y/2 - Z/1 |
1 | 2 | 2 | X/1 - Z/2 |
1 | 2 | 3 | X/1 - Z/3 |
1 | 3 | 1 | X/1 - Y/3 |
1 | 3 | 2 | X/1 - Y/3 |
1 | 3 | 3 | X/1 - Y/3 |
2 | 1 | 1 | X/2 - Y/1 |
2 | 1 | 2 | X/2 - Y/1 |
2 | 1 | 3 | X/2 - Y/1 |
2 | 2 | 1 | Y/2 - Z/1 |
2 | 2 | 2 | |
2 | 2 | 3 | X/2 - Z/3 |
2 | 3 | 1 | Y/3 - Z/1 |
2 | 3 | 2 | Y/3 - Z/2 |
2 | 3 | 3 | X/2 - Z/3 |
To construct this table we found the transitive closure of all packages plus all of their dependencies. We then picked two and created a list of valid pairs. Choosing a third, we cross off pairs that can't be satisfied to leave triples of legal combinations. We repeat this until all packages are included in our table.
In the worst case this will create a
How can we implement constraint satisfaction?
To avoid messing around with parsers,
our programs reads a JSON data structure describing the problem.
A real package manager would read the
{
"X": {
"1": {
"Y": ["1"]
},
"2": {
"Y": ["2"]
}
},
"Y": {
"1": {},
"2": {}
}
}
Comments
If you ever design a data format, please include a standard way for people to add comments, because they will always want to. YAML has this, but JSON CSV don't.
To check if a configuration of specific versions of all packages is compatible with a manifest, we add each package to our active list in turn and look for violations. If there aren't any more packages on the active list and we haven't found a violation, then what we have must be a legal configuration.
import configStr from './config-str.js'
const sweep = (manifest) => {
const names = Object.keys(manifest)
const result = []
recurse(manifest, names, {}, result)
}
const recurse = (manifest, names, config, result) => {
if (names.length === 0) {
if (allows(manifest, config)) {
result.push({ ...config })
}
} else {
const next = names[0]
const rest = names.slice(1)
for (const version in manifest[next]) {
config[next] = version
recurse(manifest, rest, config, result)
}
}
}
...
export default sweep
The simplest way to find configurations is to sweep over all possibilities. For debugging purposes, our function prints possibilities as it goes:
const allows = (manifest, config) => {
for (const [leftN, leftV] of Object.entries(config)) {
const requirements = manifest[leftN][leftV]
for (const [rightN, rightVAll] of Object.entries(requirements)) {
if (!rightVAll.includes(config[rightN])) {
const title = configStr(config)
const missing = config[rightN]
console.log(`${title} @ ${leftN}/${leftV} ${rightN}/${missing}`)
return false
}
}
}
console.log(configStr(config))
return true
}
If we run this program on the two-package example shown earlier we get this output:
node driver.js ./sweep.js double-chained.json
{X:1 Y:1}
{X:1 Y:2} @ X/1 Y/2
{X:2 Y:1} @ X/2 Y/1
{X:2 Y:2}
When we run it on our triple-package example we get this:
node driver.js ./sweep.js triple.json
{X:1 Y:1 Z:1} @ Y/1 Z/1
{X:1 Y:1 Z:2} @ X/1 Z/2
{X:1 Y:1 Z:3} @ X/1 Z/3
{X:1 Y:2 Z:1} @ Y/2 Z/1
{X:1 Y:2 Z:2} @ X/1 Z/2
{X:1 Y:2 Z:3} @ X/1 Z/3
{X:1 Y:3 Z:1} @ X/1 Y/3
{X:1 Y:3 Z:2} @ X/1 Y/3
{X:1 Y:3 Z:3} @ X/1 Y/3
{X:2 Y:1 Z:1} @ X/2 Y/1
{X:2 Y:1 Z:2} @ X/2 Y/1
{X:2 Y:1 Z:3} @ X/2 Y/1
{X:2 Y:2 Z:1} @ Y/2 Z/1
{X:2 Y:2 Z:2}
{X:2 Y:2 Z:3} @ X/2 Z/3
{X:2 Y:3 Z:1} @ Y/3 Z/1
{X:2 Y:3 Z:2} @ Y/3 Z/2
{X:2 Y:3 Z:3} @ X/2 Z/3
This works, but it is doing a lot of unnecessary work. If we sort the output by the case that caught the exclusion it turns out that 9 of the 17 exclusions are redundant rediscovery of a previous-known problem:
Excluded | X | Y | Z |
---|---|---|---|
X/1 - Y/3 | 1 | 3 | 1 |
… | 1 | 3 | 2 |
… | 1 | 3 | 3 |
X/1 - Z/2 | 1 | 1 | 2 |
… | 1 | 2 | 2 |
X/1 - Z/3 | 1 | 1 | 3 |
… | 1 | 2 | 3 |
X/2 - Y/1 | 2 | 1 | 1 |
… | 2 | 1 | 2 |
… | 2 | 1 | 3 |
X/2 - Z/3 | 2 | 2 | 3 |
… | 2 | 3 | 3 |
Y/1 - Z/1 | 1 | 1 | 1 |
Y/2 - Z/1 | 1 | 2 | 1 |
… | 2 | 2 | 1 |
Y/3 - Z/1 | 2 | 3 | 1 |
… | 2 | 3 | 2 |
2 | 2 | 2 |
How can we do less work?
In order to make this more efficient we need to
This version of the program collects possible solutions and displays them at the end. It only keeps checking a partial solution if what it has found so far looks good:
import configStr from './config-str.js'
const prune = (manifest) => {
const names = Object.keys(manifest)
const result = []
recurse(manifest, names, {}, result)
for (const config of result) {
console.log(configStr(config))
}
}
const recurse = (manifest, names, config, result) => {
if (names.length === 0) {
result.push({ ...config })
} else {
const next = names[0]
const rest = names.slice(1)
for (const version in manifest[next]) {
config[next] = version
if (compatible(manifest, config)) {
recurse(manifest, rest, config, result)
}
delete config[next]
}
}
}
...
const report = (config, leftN, leftV, rightN, rightV) => {
const title = configStr(config)
console.log(`${title} @ ${leftN}/${leftV} ${rightN}/${rightV}`)
}
export default prune
The compatible
function checks to see if adding something will leave us with a consistent configuration:
const compatible = (manifest, config) => {
for (const [leftN, leftV] of Object.entries(config)) {
const leftR = manifest[leftN][leftV]
for (const [rightN, rightV] of Object.entries(config)) {
if ((rightN in leftR) && (!leftR[rightN].includes(rightV))) {
report(config, leftN, leftV, rightN, rightV)
return false
}
const rightR = manifest[rightN][rightV]
if ((leftN in rightR) && (!rightR[leftN].includes(leftV))) {
report(config, leftN, leftV, rightN, rightV)
return false
}
}
}
return true
}
Checking as we go gets us from 18 complete solutions to 11. One is workable and two are incomplete (representing 6 possible solutions that we didn't need to finish):
{X:1 Y:1 Z:1} @ Y/1 Z/1
{X:1 Y:1 Z:2} @ X/1 Z/2
{X:1 Y:1 Z:3} @ X/1 Z/3
{X:1 Y:2 Z:1} @ Y/2 Z/1
{X:1 Y:2 Z:2} @ X/1 Z/2
{X:1 Y:2 Z:3} @ X/1 Z/3
{X:1 Y:3} @ X/1 Y/3
{X:2 Y:1} @ X/2 Y/1
{X:2 Y:2 Z:1} @ Y/2 Z/1
{X:2 Y:2 Z:3} @ X/2 Z/3
{X:2 Y:3 Z:1} @ Y/3 Z/1
{X:2 Y:3 Z:2} @ Y/3 Z/2
{X:2 Y:3 Z:3} @ X/2 Z/3
{X:2 Y:2 Z:2}
Another way to look at the work is the number of steps in the search. The full search had 18×3 = 54 steps. Pruning leaves us with (12×3) + (2×2) = 40 steps so we have eliminated roughly 1/4 of the work.
What if we searched in the reverse order?
import configStr from './config-str.js'
// <reverse>
const reverse = (manifest) => {
const names = Object.keys(manifest)
names.reverse()
const result = []
recurse(manifest, names, {}, result)
for (const config of result) {
console.log(configStr(config))
}
}
// </reverse>
const recurse = (manifest, names, config, result) => {
if (names.length === 0) {
result.push({ ...config })
} else {
const next = names[0]
const rest = names.slice(1)
for (const version in manifest[next]) {
config[next] = version
if (compatible(manifest, config)) {
recurse(manifest, rest, config, result)
}
delete config[next]
}
}
}
const compatible = (manifest, config) => {
for (const [leftN, leftV] of Object.entries(config)) {
const leftR = manifest[leftN][leftV]
for (const [rightN, rightV] of Object.entries(config)) {
if ((rightN in leftR) && (!leftR[rightN].includes(rightV))) {
report(config, leftN, leftV, rightN, rightV)
return false
}
const rightR = manifest[rightN][rightV]
if ((leftN in rightR) && (!rightR[leftN].includes(leftV))) {
report(config, leftN, leftV, rightN, rightV)
return false
}
}
}
return true
}
const report = (config, leftN, leftV, rightN, rightV) => {
const title = configStr(config)
console.log(`${title} @ ${leftN}/${leftV} ${rightN}/${rightV}`)
}
export default reverse
{Z:1 Y:1} @ Z/1 Y/1
{Z:1 Y:2} @ Z/1 Y/2
{Z:1 Y:3} @ Z/1 Y/3
{Z:2 Y:1 X:1} @ Z/2 X/1
{Z:2 Y:1 X:2} @ Y/1 X/2
{Z:2 Y:2 X:1} @ Z/2 X/1
{Z:2 Y:3} @ Z/2 Y/3
{Z:3 Y:1} @ Z/3 Y/1
{Z:3 Y:2 X:1} @ Z/3 X/1
{Z:3 Y:2 X:2} @ Z/3 X/2
{Z:3 Y:3 X:1} @ Z/3 X/1
{Z:3 Y:3 X:2} @ Z/3 X/2
{Z:2 Y:2 X:2}
Now we have (8×3) + (5×2) = 34 steps,
i.e.,
we have eliminated roughly 1/3 of the work.
That may not seem like a big difference,
but if we go five levels deep at the same rate
it cuts the work in half.
There are lots of
Exercises
Comparing semantic versions
Write a function that takes an array of semantic version specifiers
and sorts them in ascending order.
Remember that 2.1
is greater than 1.99
.
Parsing semantic versions
Using the techniques of
Using scoring functions
Many different combinations of package versions can be mutually compatible.
One way to decide which actual combination to install
is to create a
const score (X, Y) => {
if (X.major !== Y.major) {
return 100 * abs(X.major - Y.major)
} else if (X.minor !== Y.minor) {
return 10 * abs(X.minor - Y.minor)
} else {
return abs(X.patch - Y.patch)
}
}
-
Implement a working version of this function and use it to measure the total distance between the set of packages found by the solver and the set containing the most recent version of each package.
-
Explain why this doesn't actually solve the original problem.
Using full semantic versions
Modify the constraint solver to use full semantic versions instead of single digits.
Regular releases
Some packages release new versions on a regular cycle, e.g., Version 2021.1 is released on March 1 of 2021, Version 2021.2 is released on September 1 of that year, version 2022.1 is released on March 1 of the following year, and so on.
-
How does this make package management easier?
-
How does it make it more difficult?
Writing unit tests
Write unit tests for the constraint solver using Mocha.
Generating test fixtures
Write a function that creates fixtures for testing the constraint solver:
-
Its first argument is an object whose keys are (fake) package names and whose values are integers indicating the number of versions of that package to include in the test set, such as
{'left': 3, 'middle': 2, 'right': 15}
. Its second argument is aseed for random number generation. -
It generates one valid configuration, such as
{'left': 2, 'middle': 2, 'right': 9}
. (This is to ensure that there is at least one installable set of packages.) -
It then generates random constraints between the packages. (These may or may not result in other installable combinations.) When this is done, it adds constraints so that the valid configuration from the previous step is included.
Searching least first
Rewrite the constraint solver so that it searches packages by looking at those with the fewest available versions first. Does this reduce the amount of work done for the small examples in this chapter? Does it reduce the amount of work done for larger examples?
Using generators
Rewrite the constraint solver to use generators.
Using exclusions
-
Modify the constraint solver so that it uses a list of package exclusions instead of a list of package requirements, i.e., its input tells it that version 1.2 of package Red can not work with versions 3.1 and 3.2 of package Green (which implies that Red 1.2 can work with any other versions of Green).
-
Explain why package managers aren't built this way.