Define “Audiobook:” The Precision of Spec Language

Wendy Reid is a Senior QA Analyst at Rakuten Kobo Inc. and has spent the last few years on the other side of EPUB mysteries and reading system technology. She is currently one of the co-chairs of the Publishing Working Group of the W3C and leads the Audiobooks Task Force. In her abundant spare time she likes to learn about new technologies and read as many ebooks as she can. She’ll be at Tech Forum this year delivering a session called Speccing the Void: Adventures in the Audiobooks Abyss and she’ll also be participating on an ebookcraft panel called Cybernetic Ebooks: A Panel on Machine Learning and AI in Book Production.

When the decision was made to start an audiobooks task force within the Publishing Working Group (PWG), I leapt at the challenge to tackle a technical problem I faced on a daily basis. Little did I know that I would spend hours debating the finer points of what, exactly, everything to do with audiobooks would be.

My background is in English Literature, so language is always on my mind in some capacity or another. I’ve also been a long-time fan of specifications and their precise, clarifying nature, but if you’ve never read a specification, looking at your first can be a bit intimidating. My introduction was the HTML 3.0 spec, which I turned to as a young web-obsessed teenager when hunting for answers on blogs failed me (this was before Google and Stack Overflow!). I understood about 10% of it, but it felt like the Rosetta Stone for my problems. When I encountered EPUB years later, I was prepared.

The first meeting of the Audiobooks Task Force was spent discussing what an audiobook was. This sounds like a five-minute conversation (and that’s what I had planned), but it took us nearly the full hour, and the definition was rehashed several times in later conversations. The art of specifications is in the writing. It’s the paragraph at the beginning defining the meaning of seemingly simple words like “should,” “must,” or “may.” It’s the difference between normative and informative.

We settled on a definition for an audiobook: “A publication comprised primarily of audio resources with a structure defining reading order and metadata that MAY also include text or image resources as supplemental content.”

The definition is verbose, and possibly a bit awkward, but it summarizes audiobooks as well as possible in one sentence, for now. I don’t doubt further meetings will call into question some or all of the parts of our definition as we work towards building something for the industry.
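
For readers who think in code, here is one way to picture the parts of that sentence as a data structure. This is only an illustrative sketch and not anything from the task force: the names (`Resource`, `Audiobook`, `reading_order`, `supplemental`, `is_audiobook`) are hypothetical, and reading “primarily” as “every item in the reading order is audio” is an assumption made to keep the example small.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Resource:
    url: str         # hypothetical location of the file
    media_type: str  # e.g. "audio/mpeg", "image/jpeg", "text/html"


@dataclass
class Audiobook:
    metadata: Dict[str, str]        # title, author, narrator, and so on
    reading_order: List[Resource]   # the structure defining reading order
    # Supplemental content MAY be present, so it defaults to empty.
    supplemental: List[Resource] = field(default_factory=list)


def is_audiobook(pub: Audiobook) -> bool:
    """Check a publication against one assumed reading of the definition."""
    has_metadata = bool(pub.metadata)
    has_order = len(pub.reading_order) > 0
    # Assumption: "primarily audio" means every reading-order item is audio.
    audio_only = all(
        r.media_type.startswith("audio/") for r in pub.reading_order
    )
    # Supplemental resources are optional, but must be text or images.
    extras_ok = all(
        r.media_type.startswith(("text/", "image/")) for r in pub.supplemental
    )
    return has_metadata and has_order and audio_only and extras_ok


book = Audiobook(
    metadata={"title": "Example Title"},
    reading_order=[Resource("chapter1.mp3", "audio/mpeg")],
    supplemental=[Resource("cover.jpg", "image/jpeg")],
)
print(is_audiobook(book))
```

Even this toy version forces choices the one-sentence definition leaves open, such as whether a publication with a single text chapter in its reading order still counts, which is the kind of question that fills an hour-long meeting.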

As of the writing of this post, the audiobooks spec doesn’t exist yet. Too many questions hang in the air awaiting definition before we even dare to put words on the page. There will be hours more discussion on the finer points of metadata, packaging, and what an audiobook MUST and SHOULD be. I, for one, look forward to it.

If you’re curious about the work of the PWG or the Audiobooks Task Force, check out our public repo on GitHub, or feel free to reach out to me via email or Twitter.

If you’d like to hear more from Wendy Reid and her adventures in audiobooks, register for Tech Forum on March 20, 2019 in Toronto. You can find more details about the conference here, or sign up for the mailing list to get all of the conference updates.