DjVu
DjVu (/ˌdeɪʒɑːˈvuː/ DAY-zhah-VOO, like French "déjà vu"[2]) is a computer file format designed primarily to store scanned documents, especially those containing a combination of text, line drawings, indexed color images, and photographs. It uses technologies such as image layer separation of text and background/images, progressive loading, arithmetic coding, and lossy compression for bitonal (monochrome) images. This allows high-quality, readable images to be stored in a minimum of space, so that they can be made available on the web.
DjVu has been promoted as providing smaller files than PDF for most scanned documents.[3] The DjVu developers report that color magazine pages compress to 40–70 kB, black-and-white technical papers compress to 15–40 kB, and ancient manuscripts compress to around 100 kB; a satisfactory JPEG image typically requires 500 kB.[4] Like PDF, DjVu can contain an OCR text layer, making it easy to perform copy and paste and text search operations.
Free creators, manipulators, converters, web browser plug-ins, and desktop viewers are available.[2] DjVu is supported by a number of multi-format document viewers and e-book reader software on Linux (Okular, Evince, Zathura), Windows (Okular, SumatraPDF), and Android (Document Viewer,[5] FBReader, EBookDroid, PocketBook).
Technical overview
File structure
The DjVu file format is based on the Interchange File Format (IFF) and is composed of hierarchically organized chunks. The IFF structure is preceded by a 4-byte AT&T magic number. Following is a single FORM chunk with a secondary identifier of either DJVU or DJVM for a single-page or a multi-page document, respectively.
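The header layout described above can be checked with a few lines of code. The following is a minimal sketch, not a reference implementation: the function name read_djvu_header is hypothetical, and the 4-byte big-endian length field after FORM is assumed from the usual IFF chunk convention rather than stated in the text above.

```python
import struct

MAGIC = b"AT&T"  # 4-byte magic number that precedes the IFF structure


def read_djvu_header(data: bytes):
    """Return (form_type, form_length) for the start of a DjVu byte string.

    Hypothetical helper for illustration. Raises ValueError if the bytes
    do not look like the header described in the text.
    """
    if len(data) < 16:
        raise ValueError("too short to hold a DjVu header")
    if data[0:4] != MAGIC:
        raise ValueError("missing AT&T magic number")
    if data[4:8] != b"FORM":
        raise ValueError("missing top-level FORM chunk")
    # Assumption: IFF stores the chunk length as a 32-bit big-endian integer.
    (form_length,) = struct.unpack(">I", data[8:12])
    form_type = data[12:16]
    if form_type == b"DJVU":
        return "single-page", form_length
    if form_type == b"DJVM":
        return "multi-page", form_length
    raise ValueError("unexpected secondary identifier: %r" % form_type)
```

Passing the first 16 bytes of a file is enough for this check, for example `read_djvu_header(open(path, "rb").read(16))`, where `path` is whatever file is being inspected.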
All the chunks can be contained in a single file, as in so-called bundled documents, or spread across several files: one file for every page plus some files with shared chunks.
Format licensing
DjVu is an open file format with patents.[3] The file format specification is published, as well as source code for the reference library.[3] The original authors distribute an open-source implementation named "DjVuLibre" under the GNU General Public License. The rights to the commercial development of the encoding software have been transferred to different companies over the years, including AT&T Corporation, LizardTech,[22] Celartem[23] and Cuminas.[24]
Celartem acquired LizardTech and Extensis.[25][26][23][27][28]
Support
The selection of downloadable DjVu viewers is wider on Linux distributions than it is on Windows or Mac OS. Additionally, the format is rarely supported by proprietary scanning software.
In 2002, the DjVu file format was chosen by the Internet Archive as a format in which its Million Book Project provides scanned public-domain books online (along with TIFF and PDF).[29] In February 2016, the Internet Archive announced that DjVu would no longer be used for new uploads, citing among other reasons the format's declining use and the difficulty of maintaining its Java applet-based viewer for the format.[17]
Wikimedia Commons, a media repository used by Wikipedia among others, conditionally permits PDF and DjVu media files.[30]