The veraPDF consortium today announced the release of veraPDF 1.0, an open-source, industry-supported PDF/A validator. Led by the Open Preservation Foundation and the PDF Association, veraPDF validates all parts and conformance levels of ISO 19005 (PDF/A). The software is available under an MPLv2+/GPLv3+ license. Carl Wilson, Technical Lead, Open Preservation Foundation, said: “Identifying a file’s […]

The latest version of veraPDF is now available to download. This is the final version before our version 1.0 release on January 9th 2017, making it an effective release candidate. New features include the Schematron-based policy checker and a fully functional greenfield implementation. For the upcoming version 1.0 we’ll be: performing software fixes in […]

The latest version of veraPDF includes the first beta release of our purpose-built PDF parser and validation model. It can be used alongside the PDFBox parser to compare the results. veraPDF 0.26 has a new batch processor that produces multi-item reports. Highlights from the new release are below: Conformance checker added the new rule for […]


Many libraries, archives and museums (LAMs) have been creating disk images of media in their care, but they’ve previously had few options for providing access to the materials on those disk images. The BitCurator Access Webtools, created through a grant from the Andrew W. Mellon Foundation, allow users to browse file systems contained within disk […]

The tech clinic offers OPF members the opportunity to book one-to-one online sessions to discuss any technical aspects of their work. This might include: getting started with OPF tools, e.g. installation and basic usage; help with integrating open source tools into local automated workflows and systems; investigating problems and issues with open source tools; assistance […]

“Upstream, Downstream: embedding digital curation workflows for data science, scholarship and society” The conference brings together digital curation professionals and educators with data producers and consumers to consider digital curation in a multi-disciplinary context. There will be a programme of workshops on Monday 20 February and Thursday 23 February and the main conference will run […]


The research question: I have never doubted the JHOVE TIFF module. The JHOVE TIFF module is always right. Everybody says so. That’s why nobody uses the myriad alternatives to it, although it’s so easy to write a TIFF validator, I could almost do it myself. But while my colleague Michelle and I are drafting […]

In my previous blog post I addressed the detection of broken audio files in an automated workflow for ripping audio CDs. For (data) CD-ROMs and DVDs that are imaged to an ISO image, a similar problem exists: how can we be reasonably sure that the created image is complete? In this blog post I will […]

At the KB we have a large collection of offline optical media. Most of these are CD-ROMs, but we also have a sizeable proportion of audio CDs. We’re currently in the process of designing a workflow for stabilising the contents of these materials using disk imaging. For audio CDs this involves ‘ripping’ the tracks to […]