A couple of Twitters floating around about AZARDI. Took note of the comment about "allegedly will only read conformant ePubs (good luck with that)". Thanks for the good wishes.
Allegedly is probably about the right word in this release. We will be working hard to turn allegedly into an em dash as soon as possible. After all, is it an ePub if it isn't conformant?
To clarify further, we are not trying to make a commercial ePub reader or compete with other readers. It's an open reader, similar to the Bookworm Online ePub reader approach, but for the desktop, when you are not online. Somewhere the specification has to matter, and implementation of the content presentation (and extension) aspects of the specification has to be discussed.
The objective is trust, reliability and completeness. Three huge targets. We have been forced to use ADE on the desktop and it does a number of not-nice things. Other open source implementations are not even close to standards compliance with the exception of Bookworm, but that is an Online strategy.
What to do? Make your ePubs fit ADE's quirky behaviours (so all future value is lost), or go through the learning curve and put the implementation problems out for discussion, if anyone is interested.
Our ePub focus is advanced content interaction and presentation issues. That's academic, training, textbook and children's content. That means pictures, interaction (not Flash), maths, formulas, massive linking, large navigable images, archive collections (no silly file size limits thank you), inline/Out-Of-Line XML islands and seeing where ePub can deliver a better content world.
We are starting off with a run on the high ground, i.e. standards conformance. It's actually very easy to make a dumbed-down ePub reader that isn't strict and just displays text. Just use XHTML in cloud-of-tags HTML mode, ignore the manifest, whip up the TOC and display away. Later we may make a "relaxed" version, or better, a relaxed mode.
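To make the strict-versus-relaxed difference concrete, here is a rough sketch of the kind of packaging checks a strict reader might run before displaying anything. It is not AZARDI code and covers only a small part of what epubcheck tests. The function name `check_packaging` and the message wording are made up for illustration. The checks themselves (the `mimetype` entry, `META-INF/container.xml`, and manifest items that actually exist in the archive) follow the ePub packaging rules.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET
from urllib.parse import unquote

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"
OPF_NS = "http://www.idpf.org/2007/opf"


def check_packaging(epub):
    """Return a list of packaging problems found in an ePub (hypothetical helper).

    `epub` is a path or a binary file-like object. An empty list only means
    these few checks passed, not that the ePub is conformant.
    """
    problems = []
    with zipfile.ZipFile(epub) as zf:
        infos = zf.infolist()
        names = set(zf.namelist())

        # The mimetype entry must come first, be stored uncompressed,
        # and contain exactly the ePub media type.
        if not infos or infos[0].filename != "mimetype":
            problems.append("mimetype is not the first entry in the archive")
        elif infos[0].compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype entry is compressed")
        elif zf.read("mimetype") != b"application/epub+zip":
            problems.append("mimetype content is not application/epub+zip")

        # container.xml tells the reader where the OPF package file lives.
        if "META-INF/container.xml" not in names:
            problems.append("META-INF/container.xml is missing")
            return problems
        try:
            container = ET.fromstring(zf.read("META-INF/container.xml"))
        except ET.ParseError:
            problems.append("META-INF/container.xml is not well-formed XML")
            return problems
        rootfile = container.find(f".//{{{CONTAINER_NS}}}rootfile")
        opf_path = rootfile.get("full-path") if rootfile is not None else None
        if not opf_path or opf_path not in names:
            problems.append("container.xml does not point at an existing OPF file")
            return problems

        # A relaxed reader would skip this: every manifest item must exist.
        try:
            package = ET.fromstring(zf.read(opf_path))
        except ET.ParseError:
            problems.append("OPF package file is not well-formed XML")
            return problems
        base = posixpath.dirname(opf_path)
        for item in package.iter(f"{{{OPF_NS}}}item"):
            href = unquote(item.get("href", ""))
            target = posixpath.normpath(posixpath.join(base, href))
            if target not in names:
                problems.append(f"manifest item not found in archive: {href}")
    return problems
```

Calling `check_packaging("book.epub")` (the file name is a placeholder) gives back a list of messages. A relaxed mode could show the same list as warnings and carry on, where a strict mode would refuse to open the book.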
Let's see where it goes. We have a pretty interesting roadmap including transparent custom application features that do lots more, but still remain conformant to the specification.
As the maintainer of Bookworm, I get a window into the ePub ecosystem, and I do have to say it's pretty grim if you're a fan of standards.
I don't have hard numbers (although I may compile some), but my ballpark guess is that truly conforming ePubs on the site make up no more than 25% of the whole. By conforming I mean those that pass epubcheck with zero errors and warnings, which is still no guarantee.
Of those that are conforming, I am pretty sure they are overwhelmingly from O'Reilly and Feedbooks.
Part of the reason is that a significant fraction of the content is composed of "home-grown" ePubs that have been converted from other formats via Calibre. Although a lot of those ebooks are of dodgy quality, I think it's great that there's a groundswell of interest in the format from ordinary readers. Since I think of Bookworm as a site primarily for readers, I haven't minded accommodating a range of conformance.
I do genuinely wish you the best of luck, though. More strict ePubs are good for everyone, and I look forward to some of the applications in education.
Posted by: Liza Daly | 01/27/2009 at 05:24 PM
I truly standards-compliant reader will be great to have, both as a benchmark for other readers and also as a kind of wysiwyg validator for epubs. There's never enough emphasis on that old truism, which I first heard said of databases, that you only get out what you put in (rubbish in, rubbish out, etc.) Thank you for the hard work keeping the Great Database that is the Internet and its ebooks tidy. The tidiers will never rule (they're too busy tidying), but they keep a place respectable.
Posted by: Arthur Attwell | 01/27/2009 at 11:40 PM
I beg your pardon: "A truly standards-compliant reader..." I mean. Not very tidy there.
Posted by: Arthur Attwell | 01/27/2009 at 11:41 PM
I don't think we are under any illusion: standards compliance is a journey. So Arthur, "truly" is probably a holy grail destined never to be achieved. We are looking for the specification "grayholes" that need to be filled with experience and common best practice before they get proprietized, and to stretch the format. Arthur, the objective is that AZARDI gives a list of messages that test various aspects of the packaging as it loads. We are looking at incorporating a "repair mode" so someone competent in XHTML-type technologies can carry out spot repairs and repackage the ePub. Liza is on a similar track at Bookworm.
Ultimately it's nice if you gently point out a non-conformance, and offer an option to dispose of the problem.
I think you're right, Liza. In what is nearly a couple of decades of the Internet, there is only now a trend towards compliance, and ePub, being basically core W3C technologies, is going to be a victim of that history. We have an open-source desktop app on the drawing board to convert ODT/DOC to ePub for reasonably simple content, but like all of these things, how far can or should it go? We probably need to stop at the illustrated trade book with no notes, references or indexes.
Bookworm is a stunning resource for the ePub publishing world, and the integration extensions to Calibre and Stanza show the fluidity of the format. I especially like the web browse, download to Stanza/Sony features, which really demonstrate the possibilities of a ubiquitous, open format.
Posted by: Richard Pipe | 01/28/2009 at 05:29 AM