Introduction
Note: This specification is only a thought experiment, inspired by the EPUB+WEB White Paper (by the W3C Digital Publishing Interest Group and the IDPF) and by concerns about the complexity of EPUB3. Please join the Digital Publishing Interest Group and IDPF to help bring about a better future for ebooks.
A book, in the digital world, is a reading mode, not a file format. The Open Web Platform provides us with most of what we need for the content, style, and behavior of a publication. EPUB Zero aims to make publications just another part of the web.
EPUB3 is complex. It requires numerous custom XML vocabularies, and massive repetition of content. Developing a reading system for EPUB3 is a herculean task, as evidenced by the slow progress of Readium. Our goal is to create a simpler publication format, easier to author, and easy to read in a browser with as little additional technology as possible.
Note: EPUB Zero is not just for books, but any packaged web content. We will use the term publication to describe books, magazines, journals, manuals, reference documents, corporate documents, articles, etc.
Note: Any discussion of the definitions of packaged, web, content, book, or document will not be tolerated :)
1. Design goals and rationale
EPUB Zero satisfies the following design goals:
- Simplicity
- “Webbiness”
- Built from HTML and CSS and JSON as much as possible
- Uses existing, forward-looking standards whenever possible
- Don’t repeat yourself
- Built for the convenience of the author and reader rather than the implementor
- Aims to use the browser as reading system
Note: The ultimate test of this specification is whether we can build a reading system on top of an ordinary browser with only a bit of JS.
2. Overview
EPUB Zero is yet another ebook format, which isn’t just “based on” HTML like EPUB, but is HTML. Where possible, we use HTML solutions to achieve book-like functionality. It appears to be possible to do this using mostly-existing web technology:
- Online/offline reading via Service Workers
- Pagination via prollyfills
- Installable via application manifest
- Access to navigation document via sidebar link relation
But each of these features is problematic. As this document evolves, we hope to propose alternatives.
2.1. Comparison of EPUB 3 and EPUB Zero
Feature | EPUB 3 | EPUB+WEB | EPUB Zero |
---|---|---|---|
Package Document | manifest, reading order, metadata in XML vocabulary | simplified JSON package file | unnecessary? |
Reading Order | spine element in package file | JSON spine | link rel=next in HTML |
Navigation Document | required nav, optional ncx | required nav | required nav |
Linking | EPUB CFI | EPUB CFI? | TK |
Publication Metadata | custom XML format inside package file | JSON package + external file in any vocabulary | external file in any vocabulary |
Content Documents | XHTML5, SVG | HTML5, SVG | HTML5, SVG |
Fixed Layout | viewport in HTML + configuration in package | ? | just don’t even |
Style | subset of CSS | any CSS | any CSS |
Multimedia | html audio and video, mp3 and mp4 are core media types | ? | Anything supported by the browser |
Fonts | OTF and WOFF | ? | Anything supported by the browser |
Scripting | optional, recommend container constrained | ? | required for offline reading |
Text-to-speech | Media overlays/SMIL | TK | TK |
Container | OCF | ? | ? |
Manifest | OCF manifest | JSON manifest | nav + web app manifest? |
Offline reading | dependent on reading system | dependent on reading system | Service Workers |
3. EPUB Zero documents
3.1. Content documents
An EPUB Zero content document is an HTML5 document.
Can SVG or raster images be a content document? Does [HTML5] define required media types? Issue: Is the question whether SVG can be embedded inline in HTML (yes, see http://www.w3.org/html/wg/drafts/html/master/semantics.html#svg-0), or whether SVG can be a first-class citizen, that is, a content document in its own right?
3.2. Style
There are no restrictions on the use of CSS.
3.2.1. Pagination
Pagination is essential for an optimum long-form reading experience. Several approaches to pagination may be possible:
- Native support exists in Opera 12.16, via overflow: paged.
- Polyfills exist, based on either multicol or regions.
- Project Houdini may expose primitives making prollyfills easier.
Note: Reading long-form content in paginated form often offers a better experience for readers. We encourage document authors to support pagination via CSS, polyfills, prollyfills, reading systems, and/or political action.
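To illustrate the multicol approach, here is a minimal sketch that lays a container out in columns as wide as the container itself and treats each column as a page. The element id book, the function names, the PageUp/PageDown key bindings, and the zero column gap are all assumptions made for this example. It also assumes the container has no padding. It does not describe any existing polyfill.

```javascript
// Number of pages when each page is one column of width `pageWidth`
// and neighbouring columns are separated by `gap` pixels.
function pageCount(scrollWidth, pageWidth, gap = 0) {
  if (pageWidth <= 0) return 0;
  return Math.max(1, Math.round((scrollWidth + gap) / (pageWidth + gap)));
}

// Horizontal scroll offset that brings page `index` (0-based) into view.
function pageOffset(index, pageWidth, gap = 0) {
  return index * (pageWidth + gap);
}

// Browser-only wiring: paginate a hypothetical <main id="book"> element.
if (typeof document !== 'undefined') {
  const book = document.getElementById('book');
  if (book) {
    const width = book.clientWidth;
    Object.assign(book.style, {
      height: '100vh',
      columnWidth: width + 'px', // one column per "page"
      columnGap: '0',
      overflow: 'hidden',        // hide the other pages; we scroll by script
    });
    let page = 0;
    document.addEventListener('keydown', (event) => {
      const last = pageCount(book.scrollWidth, width) - 1;
      if (event.key === 'PageDown') page = Math.min(page + 1, last);
      else if (event.key === 'PageUp') page = Math.max(page - 1, 0);
      else return;
      book.scrollLeft = pageOffset(page, width);
    });
  }
}
```

A real polyfill would also need to handle resizing, images that straddle columns, and restoring the reading position. None of that is attempted here.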
3.2.2. Page Transitions
A book may consist of several HTML files. A user must be able to move from ch1.html to ch2.html as easily as moving from page 1 to page 2, and with the same action.
Note: discussed by CSSWG in thread starting at https://lists.w3.org/Archives/Public/www-style/2014Jan/0093.html
HTML5 link relations support describing previous and next files.
Opera and Firefox have UI for link rel=prev|next, but Safari, Chrome, and IE do not.
Using link relations introduces a burden on authoring that does not currently exist in EPUB3.
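Where the browser offers no UI for these relations, a few lines of script shipped with each content document could fill the gap. The sketch below assumes every content document declares its neighbours with link rel=prev and link rel=next. The function name relTarget and the choice of arrow keys are illustrative only, not part of any specification.

```javascript
// Return the href of the first link whose rel attribute contains `rel`,
// or null if no such link is declared.
// `links` is any array-like of objects with `rel` and `href` properties,
// for example the result of document.querySelectorAll('link[rel]').
function relTarget(links, rel) {
  for (const link of Array.from(links)) {
    const tokens = String(link.rel || '').toLowerCase().split(/\s+/);
    if (tokens.includes(rel)) return link.href;
  }
  return null;
}

// Browser-only wiring: arrow keys move between the files of the book.
if (typeof document !== 'undefined') {
  document.addEventListener('keydown', (event) => {
    const rel = event.key === 'ArrowRight' ? 'next'
              : event.key === 'ArrowLeft' ? 'prev'
              : null;
    if (!rel) return;
    const href = relTarget(document.querySelectorAll('link[rel]'), rel);
    if (href) window.location.href = href;
  });
}
```

In a paginated layout the same handler would first turn pages within the file and only follow rel=next from the last page, so that one action serves both cases.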
3.3. Interactivity
Security for downloaded publications. How is this handled with service workers?
3.4. Media Overlays
Browsers (as far as we know) do not support SMIL.
Note: See https://github.com/timesheets/timesheets.js for a JS implementation of SMIL
Are there polyfills that are “good enough?” Is there a better approach for synchronizing [HTML5] with multimedia?
3.5. Global Language Support
3.6. Navigation
Most reading systems provide a link to the navigation document as part of the reading system user interface.
Note: In Opera and Firefox, opening a link with rel=sidebar can open a navigation document in the "secondary browsing context", aka sidebar. This does not work in Safari, Chrome, or IE.
3.7. Fixed Layout
3.8. Accessibility
Note: Compliance with WCAG 2.0 and integration of ARIA 1.1 and the Digital Publishing module of ARIA will aid in creating accessible content.
How do we make EPUB Zero documents “born accessible?”
EPUB 3.0 requires a nav document. EDUPUB requires the section element, and proper use of [HTML5] heading elements.
What’s the state of text-to-speech support in browsers?
4. Packaging
5. Installing EPUB Zero publications
EPUB Zero uses the web manifest specification https://w3c.github.io/manifest/ to facilitate installation as a webapp on user devices.
{ "name": "Moby-Dick", "short_name": "Moby-Dick", "icons": [{ "src": "icons/moby-dick-icon.webp", "sizes": "64x64", "type": "image/webp" }], "start_url": "title-page.html", "display": "minimal-ui" }
The display property is interesting. Adding an additional value, "display": "book", might be a good way of indicating to the browser that it should display the content with a UI optimized for long-form content.
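As a thought experiment, a browser could slot such a value into the display-mode fallback chain that the manifest specification already defines (fullscreen, standalone, minimal-ui, browser). In the sketch below, the value book and its fallback to minimal-ui are assumptions made for illustration, as is the function name effectiveDisplay. Nothing here is defined by the manifest specification.

```javascript
// Fallback chain for display modes. The first entry is hypothetical;
// the remaining entries follow the chain in the web manifest specification.
const DISPLAY_FALLBACK = new Map([
  ['book', 'minimal-ui'],        // hypothetical value discussed above
  ['fullscreen', 'standalone'],
  ['standalone', 'minimal-ui'],
  ['minimal-ui', 'browser'],
]);

// Pick the display mode a user agent would actually apply, given the value
// requested in the manifest and the list of modes the user agent supports.
function effectiveDisplay(requested, supported) {
  // Unknown values are treated as the default, "browser".
  let mode = DISPLAY_FALLBACK.has(requested) ? requested : 'browser';
  while (mode !== 'browser' && !supported.includes(mode)) {
    mode = DISPLAY_FALLBACK.get(mode);
  }
  return mode;
}
```

With this arrangement a publication that asks for book still opens sensibly in a browser that has never heard of it.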
There seems to be some pushback against the web manifest specification.
How might a web manifest fulfill the function of an EPUB manifest?
6. Offline reading
Books must be readable offline and online.
Is offline reading really the same question as packaging/archiving?
Browsers currently offer ways of accessing web content while offline:
6.1. AppCache
Cache manifests allow offline access to web content: https://html.spec.whatwg.org/multipage/browsers.html#offline
AppCache will likely be removed from browsers in favor of Service Workers.
6.2. Service Workers
Service Workers are the preferred way of implementing offline viewing for web content.
Note: A service worker should be declared in start_url.
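A minimal cache-first service worker for a publication might look like the sketch below. The cache name, the file list, and the helper precacheUrls are hypothetical. A real publication would more likely derive the file list from its navigation document or manifest than hard-code it.

```javascript
// sw.js: cache-first service worker sketch for a single publication.
const CACHE_NAME = 'moby-dick-v1';   // hypothetical cache name
const PUBLICATION_FILES = [          // hypothetical file list
  'title-page.html',
  'ch1.html',
  'ch2.html',
  'book.css',
];

// Resolve the relative file names against the worker's scope URL.
function precacheUrls(files, scope) {
  return files.map((file) => new URL(file, scope).href);
}

// Only wire up events when actually running as a service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined' &&
    self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('install', (event) => {
    // Fetch and store every file of the publication at install time.
    event.waitUntil(
      caches.open(CACHE_NAME).then((cache) =>
        cache.addAll(precacheUrls(PUBLICATION_FILES, self.registration.scope)))
    );
  });

  self.addEventListener('fetch', (event) => {
    // Answer from the cache when possible; otherwise use the network.
    event.respondWith(
      caches.match(event.request).then((cached) => cached || fetch(event.request))
    );
  });
}
```

The document named in start_url (title-page.html in the manifest example) would then register the worker with a call such as navigator.serviceWorker.register('sw.js').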
7. Publication structure and packaging
The EPUB+WEB White Paper discusses offline access together with packaging. Are these in fact the same issue?
An EPUB Zero publication is a collection of files, which should be collected inside a folder or directory. The top level of the directory should contain the package file. No other restrictions exist on the directory structure.
Compressing or otherwise packaging this directory may be required for many reasons, including
- reduction of file size
- creating a single “blob” that can be easily transmitted
- allowing for a digital signature
- allowing for digital rights management
- allowing for streaming of the publication
In simple cases, using ZIP may be sufficient. The W3C WebApps Packaging format may prove useful as well: http://w3ctag.github.io/packaging-on-the-web/
How about using the presence of package.json as a trigger, as the thing that distinguishes a publication from an ordinary bundle of web content? Properties in manifest.json can act as hints to the UA/browsing context that the thing is “bookish”.
8. Metadata
Publication-level metadata can be stored in the publication folder. We recommend the use of JSON-LD as a metadata format, but different communities may use other formats such as RDF, Turtle, or ONIX.
{ "@context": "http://schema.org", "@type": "Book", "accessibilityAPI": "ARIA", "accessibilityControl": [ "fullKeyboardControl", "fullMouseControl" ], "accessibilityFeature": [ "largePrint/CSSEnabled", "highContrast/CSSEnabled", "resizeText/CSSEnabled" ], "accessibilityHazard": [ "noFlashing", "noMotionSimulation", "noSound" ], "aggregateRating": { "@type": "AggregateRating", "reviewCount": "0" }, "bookFormat": "EBook/e0", "copyrightHolder": { "@type": "Organization", "name": "Harper & Row" }, "author": "Herman Melville", "datePublished": "1851-10-19", "image": "moby-dick-book-cover.jpg", "offers": { "@type": "Offer", "availability": "https://example.com/BuyMe?isbn=9780316123456", "price": "6.99", "priceCurrency": "USD" }, "copyrightYear": "1851", "description": "Project Gutenberg edition of Moby-Dick", "genre": "Literary Fiction", "inLanguage": "en-US", "isFamilyFriendly": "true", "isbn": "9780000000000", "name": "Moby-Dick", "numberOfPages": "777", "publisher": { "@type": "Organization", "name": "Harper & Row" } }
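Because the metadata is plain JSON, a reading system can pull out what it needs with very little code. The sketch below assumes the JSON-LD has already been loaded as text. The function name describePublication and the shape of its return value are illustrative.

```javascript
// Extract a few display fields from a schema.org Book description.
// Returns null when the metadata does not describe a Book.
function describePublication(jsonLdText) {
  const data = JSON.parse(jsonLdText);
  if (data['@type'] !== 'Book') return null;
  // schema.org allows `author` to be plain text or a Person/Organization.
  const author = typeof data.author === 'string'
    ? data.author
    : (data.author && data.author.name) || null;
  return {
    title: data.name || null,
    author: author,
    language: data.inLanguage || null,
  };
}
```

A reading system could use the result to label a bookshelf entry or a window title without understanding the rest of the vocabulary.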
Acknowledgments
EPUB Zero was originally inspired by a series of posts on EPUB3 by Daniel Glazman.
Many thanks to Hadrien Gardeur for the structure of the JSON Package file.
Reading system behaviour
- A user agent receives an HTTP request:
  GET /urn:isbn:9780316123456
  Host: www.hachettebookgroup.com
  Accept: ????
  What if there’s a zip (or any compressed file) or .e0 there?
- If manifest.json exists, read manifest.json; open start_url.
- If manifest.json does not exist:
  a. if index.html exists, open that
  b. if index.html doesn’t exist, open the first HTML file
  c. if there are no HTML files, open the first supported format
  d. otherwise, error: "there is not a book here"
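The fallback rules above can be sketched as a single function. Several details are assumptions, because the text leaves them open: files is taken to be the list of file names at the top level of the publication, in whatever order the reading system considers "first". The list of supported non-HTML formats is invented for the example. A manifest without a start_url is treated like a missing manifest.

```javascript
// Formats the reading system can open directly (assumed for this sketch).
const SUPPORTED_EXTENSIONS = ['.svg', '.png', '.jpg', '.txt'];

// Decide which file to open first.
// `files`: ordered array of file names in the publication folder.
// `manifest`: parsed manifest.json, or undefined if the file is absent.
function resolveEntryPoint(files, manifest) {
  // manifest.json exists: open its start_url.
  if (manifest && typeof manifest.start_url === 'string') {
    return manifest.start_url;
  }
  // a. index.html exists: open that.
  if (files.includes('index.html')) return 'index.html';
  // b. otherwise open the first HTML file.
  const firstHtml = files.find((name) => /\.x?html?$/i.test(name));
  if (firstHtml) return firstHtml;
  // c. no HTML files: open the first file in a supported format.
  const firstSupported = files.find((name) =>
    SUPPORTED_EXTENSIONS.some((ext) => name.toLowerCase().endsWith(ext)));
  if (firstSupported) return firstSupported;
  // d. nothing usable.
  throw new Error('there is not a book here');
}
```

For the Moby-Dick manifest shown earlier, the function returns title-page.html without looking at the file list at all.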
References and Further Reading
- http://manu.sporny.org/2014/json-ld-origins-2/
- https://infrequently.org/2015/06/progressive-apps-escaping-tabs-without-losing-our-soul/