We propose to use the HTML nav element to bind together web resources into a web publication. Such a publication is characterized by an ordered list of links, and HTML is well suited to create such a list that is understandable by both human readers and machines. Basing web publications on a table of contents makes authoring easy, offers a natural upgrade path from existing web books and EPUBs, avoids duplication, helps meet accessibility and usability requirements, and works on today’s web.
A Web Publication is a collection of one or more primary resources, organized together through a manifest into a single logical work with a default reading order. The Web Publication is uniquely identifiable and presentable using Open Web Platform technologies.
Note that a “manifest” is just the set of information necessary for a user agent to process and present a web publication:
Title. This identifies the title of the web publication, which can be distinct from any titles associated with the constituent resources.
Identifier. A unique identifier for the web publication. The identifier chosen will likely differ across use cases; for example, book publishers would likely use urn:isbn.
Language. What is the language of the collection of resources?
List of Primary Resources. We must provide a list of primary resources, and the default ordering of those resources.
Identity as a Web Publication. We must indicate to a user agent that this collection of resources represents a web publication.
Metadata. We should provide a way to associate metadata with a web publication. Any particular piece of metadata is, however, optional.
How do we bind this collection of resources together? We need a list of the primary resources, with a default order. That’s an ordered list of URLs, which can be semantically represented by the HTML nav element. Define the URL of a web publication to be the URL of this “index” resource which contains the nav.
<nav role="doc-toc">
<ol>
<li><a href="html/c001.html">Loomings</a></li>
<li><a href="html/c002.html">The Carpet-Bag</a></li>
<li><a href="html/c003.html">The Spouter-Inn</a></li>
<li><a href="html/c004.html">The Counterpane</a></li>
</ol>
</nav>
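To illustrate how a machine could read the same list a human sees, here is a minimal sketch that recovers the reading order from an index document. The function name `readingOrder` and the example.org URL are illustrative assumptions, and the regular expressions stand in for real HTML parsing only to keep the example dependency-free; a user agent would walk the DOM instead.

```javascript
// Sketch: recover the default reading order from the first <nav> of an index
// document. The regex scan is NOT a general HTML parser; it only handles
// simple markup like the example above (double-quoted href attributes).
function readingOrder(indexHtml, indexUrl) {
  // Only the first <nav> in document order is considered.
  const nav = /<nav\b[^>]*>([\s\S]*?)<\/nav>/i.exec(indexHtml);
  if (!nav) return [];

  const entries = [];
  const link = /<a\b[^>]*\bhref="([^"]*)"[^>]*>([\s\S]*?)<\/a>/gi;
  let match;
  while ((match = link.exec(nav[1])) !== null) {
    entries.push({
      // Resolve relative links against the URL of the index resource.
      url: new URL(match[1], indexUrl).href,
      // Strip any inline markup to keep just the human-readable label.
      title: match[2].replace(/<[^>]*>/g, "").trim(),
    });
  }
  return entries;
}

// Hypothetical usage with the table of contents shown above:
const exampleToc = readingOrder(
  '<nav role="doc-toc"><ol>' +
    '<li><a href="html/c001.html">Loomings</a></li>' +
    '<li><a href="html/c002.html">The Carpet-Bag</a></li>' +
    "</ol></nav>",
  "https://example.org/moby-dick/"
);
console.log(exampleToc.map((entry) => `${entry.title}: ${entry.url}`).join("\n"));
```

Because links are collected in document order, a nested (hierarchical) table of contents still yields a single flat reading order.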
But why HTML? Why nav?
Design for humans. User agents need to know the primary resources and their default ordering, but so do human readers. Just seeing a list of URLs is not enough; one needs human-readable text describing those URLs. This is the heart of a table of contents—describe the contents and the location both.
Make authoring easy. HTML is the lingua franca of the web, a language we already know. It’s easy to find tools and syntax checkers. It’s easy to see what you’re doing, even with tricky things like nesting lists (and yes, good tables of contents are often hierarchical).
Don’t repeat yourself. “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system”. Separating the human-readable table of contents from a machine-readable list means maintaining two lists of resources.
Web publications for all. One reason for focusing on nav as the glue that holds a publication together is the need for a table of contents that is available to humans and assistive technology, and that supports the visual nuance provided by CSS and the internationalization features of HTML. WCAG requires multiple ways to navigate multi-document web sites. An HTML table of contents is a primary way to provide such navigation, and is available to assistive technology. CSS can help clarify the document structure, or help personalize for users (for example, providing high- or low-contrast options). WCAG also requires title and language information, which fit naturally in HTML.
Web publications everywhere. HTML and CSS can express most of the world’s scripts and languages.
Progressive enhancement. Basing web publications on nav allows existing user agents to make web publications functional, and provides an easy path from existing content that has tables of contents (for example, every EPUB3 in the world). A user can point their browser at an HTML file, and the browser can render it. Even if no new features of web publications are implemented in that browser, or shimmed, the user can read the publication.
The HTML nav element is a good way to describe a default reading order, but is not suited for listing secondary resources. Having an external JSON file is convenient for machine processing of this information. This could also provide an alternate location for publication-wide metadata.
[…] nav element, unless there’s only a single primary resource.
[…] nav element, in the default order.
[…] nav elements in the index document. This proposal only covers the first, in document order. The document author may decide not to display the first nav, if they wish to present a different navigation structure to the user.
[…] rel=contents link to the index resource.
[…] rel=prev and rel=next links as appropriate.
[…] html element, to identify it as a web publication. The attribute may also be serialized as the text string book.
[…] nav file
<!DOCTYPE html>
<html 📖 lang="en">
<head>
<title>Moby-Dick</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width">
<link rel="next prefetch" href="html/c001.html">
<link rel="publication" href="manifest.json">
</head>
<body>
<main>
<nav role="doc-toc">
<ol>
<li><a href="html/c001.html">Loomings</a></li>
<li><a href="html/c002.html">The Carpet-Bag</a></li>
<li><a href="html/c003.html">The Spouter-Inn</a></li>
<li><a href="html/c004.html">The Counterpane</a></li>
</ol>
</nav>
</main>
</body>
</html>
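The rel=contents, rel=prev, and rel=next links mentioned in this proposal follow mechanically from the reading order in the nav. The sketch below shows one way to derive them; the function names are hypothetical, the URLs are assumed to be absolute or root-relative so they mean the same thing from every resource, and leaving the first resource without a rel=prev link is an assumption rather than something the proposal specifies.

```javascript
// Sketch: derive link relations for every primary resource from the ordered
// list of URLs found in the index document's nav.
function linkRelations(indexUrl, order) {
  return order.map((url, i) => ({
    url,
    contents: indexUrl, // every primary resource points back to the index
    prev: i > 0 ? order[i - 1] : null, // assumption: no prev for the first
    next: i < order.length - 1 ? order[i + 1] : null,
  }));
}

// Render one resource's relations as <link> elements for its <head>.
function toLinkElements(rel) {
  const lines = [`<link rel="contents" href="${rel.contents}">`];
  if (rel.prev) lines.push(`<link rel="prev" href="${rel.prev}">`);
  if (rel.next) lines.push(`<link rel="next" href="${rel.next}">`);
  return lines.join("\n");
}

// Hypothetical usage with root-relative URLs:
const relations = linkRelations("/index.html", [
  "/html/c001.html",
  "/html/c002.html",
  "/html/c003.html",
]);
console.log(toLinkElements(relations[1]));
```

Generating these links from the nav keeps the table of contents as the single source of truth for ordering.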
{
"metadata": {
"creator": "Herman Melville",
"publisher": "Harper and Row"
},
"resources": [
{
"href": "html/c001.html",
"type": "text/html"
},
{
"href": "html/c002.html",
"type": "text/html"
},
{
"href": "html/c003.html",
"type": "text/html"
},
{
"href": "html/c004.html",
"type": "text/html"
},
{
"href": "css/MobyDick.css",
"type": "text/css"
},
{
"href": "Moby_Dick_final_chase.jpg",
"type": "image/jpeg"
},
{
"href": "Moby_Dick_p510_illustration.jpg",
"type": "image/jpeg"
},
{
"href": "moby-dick-book-cover.jpg",
"type": "image/jpeg"
}
]
}
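Since the nav already lists the primary resources, a consumer of the JSON manifest can work out which entries are secondary by comparing the two. The following is a minimal sketch of that comparison; `classifyResources` is a hypothetical helper, and it assumes hrefs are written identically in the nav and in the manifest, as they are in the examples above.

```javascript
// Sketch: split a manifest's "resources" into primary resources (linked from
// the nav, kept in reading order) and secondary resources (everything else).
function classifyResources(manifest, navHrefs) {
  const listed = new Map(manifest.resources.map((r) => [r.href, r]));
  const inNav = new Set(navHrefs);
  return {
    // The nav, not the manifest, decides the order of primary resources.
    primary: navHrefs.filter((h) => listed.has(h)).map((h) => listed.get(h)),
    secondary: manifest.resources.filter((r) => !inNav.has(r.href)),
    // Primary resources that the manifest does not mention at all.
    missing: navHrefs.filter((h) => !listed.has(h)),
  };
}

// Hypothetical usage with an abbreviated version of the manifest above:
const sampleManifest = {
  resources: [
    { href: "html/c002.html", type: "text/html" },
    { href: "html/c001.html", type: "text/html" },
    { href: "css/MobyDick.css", type: "text/css" },
  ],
};
const classified = classifyResources(sampleManifest, [
  "html/c001.html",
  "html/c002.html",
  "html/c003.html",
]);
console.log(classified.secondary.map((r) => r.href));
```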
The Theory of Heat Radiation. This book includes MathML, and has a nested table of contents to indicate the Part/Chapter structure. Note that this poses no problems for the TOC model of declaring primary resources and their order. Thanks to Infogrid Pacific for the files.
In these current demos, a Service Worker (SW) enables offline reading. It caches secondary resources (CSS, scripts, images, fonts referenced from CSS) without having an explicit list! This is done by loading each primary resource referenced in the ToC into a hidden iframe, which lets the SW cache the requests as they come in.
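The hidden-iframe technique can be sketched roughly as follows. This is not the demos' actual code: the function name `warmCache`, the default selector, the use of the `hidden` attribute, and the removal of each frame after its load event are all assumptions made for illustration. The document is passed in as a parameter so the function can also be exercised outside a browser.

```javascript
// Sketch: load each primary resource linked from the nav into a hidden
// iframe, so that an already-registered Service Worker sees (and can cache)
// the requests those pages trigger.
function warmCache(doc, selector = "nav a[href]") {
  const seen = new Set();
  const frames = [];
  for (const anchor of doc.querySelectorAll(selector)) {
    // In a browser, anchor.href is already an absolute URL.
    const url = anchor.href.split("#")[0]; // ignore fragment-only differences
    if (seen.has(url)) continue;
    seen.add(url);

    const frame = doc.createElement("iframe");
    frame.hidden = true; // assumption: hide via the hidden attribute
    // Assumption: drop the frame once its load event has fired.
    frame.addEventListener("load", () => frame.remove(), { once: true });
    frame.src = url;
    doc.body.appendChild(frame);
    frames.push(frame);
  }
  return frames;
}

// In the index page, once the Service Worker controls the page, one might
// call: warmCache(document);
```

Which subresources are actually requested inside a non-rendered frame can vary by browser, so a production version would need testing against the resources it expects to cache.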
The current Service Worker code uses the stale-while-revalidate pattern, meaning every request is answered from the cache first (if available) and then the network is checked for updates. This provides the “webby-ist” experience for the author (readers eventually get updates as they browse/page-through the text). However, there are certainly other caching and updating patterns to consider and explore.
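For readers unfamiliar with the pattern, here is a minimal sketch of stale-while-revalidate. It is not the demos' Service Worker: the cache name `web-publication-v1`, the function name, and the choice to cache only successful (`ok`) GET responses are assumptions. The core logic takes the cache and fetch function as parameters so it can run outside a Service Worker.

```javascript
// Sketch of stale-while-revalidate: answer from the cache when possible and
// refresh the cached copy from the network in the background.
const CACHE_NAME = "web-publication-v1"; // hypothetical cache name

async function staleWhileRevalidate(request, cache, fetchFn, keepAlive = () => {}) {
  const cached = await cache.match(request);
  // Always go to the network so the cache is fresh for the next visit.
  const refreshed = fetchFn(request).then(async (response) => {
    if (response.ok) await cache.put(request, response.clone());
    return response;
  });
  if (cached) {
    // Serve the stale copy now; ignore network failures (e.g. when offline).
    keepAlive(refreshed.catch(() => undefined));
    return cached;
  }
  return refreshed; // nothing cached yet, so wait for the network
}

// Register the fetch handler only when running inside a Service Worker.
if (
  typeof ServiceWorkerGlobalScope !== "undefined" &&
  self instanceof ServiceWorkerGlobalScope
) {
  self.addEventListener("fetch", (event) => {
    if (event.request.method !== "GET") return;
    event.respondWith(
      caches.open(CACHE_NAME).then((cache) =>
        staleWhileRevalidate(event.request, cache, fetch, (p) => event.waitUntil(p))
      )
    );
  });
}
```

The trade-off is the one described above: a reader may see the previous version of a page once, and picks up the update on the next request.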
To run the demos locally, we recommend http-server with server-side caching shut off:
$ npm i -g http-server
$ http-server . -c-1
We’ve drawn inspiration from Jeremy Keith’s Resilient Web Design. This book is a beautiful example of a “bookish” experience on the web now, with good use of link relations, simple design, and beautifully clear, semantic HTML. The subject of the book is also on-topic.
What about Web Application Manifest?
We consider this to be orthogonal to Web App Manifests. If you wish to have “save to homescreen” functionality, by all means use it.
What about metadata?
The web already has numerous ways of including metadata in HTML documents. All we’re proposing is a convention that metadata in this “index” document applies to the entire web publication. The PWG can decide which method(s) will be used by WPs, or leave it to the web.
What about secondary resources?
With both of our demo books, we’ve been able to cache secondary resources (including fonts referenced from CSS) without having a list of such resources. Such a list is a burden on authors. But many machine-processing scenarios, as well as use cases outside the browser (such as reading systems), may benefit from a machine-optimized list of such resources. Hence we’ve added an optional JSON manifest.
HTML requires a non-empty title element.
Feature, not bug.
Web Publications are being discussed by the W3C Publishing Working Group.