The Open Preservation Foundation sustains technology and knowledge for the long-term management of digital cultural heritage. We provide our members with reliable solutions to the challenges of digital preservation.


The Bodleian Libraries, University of Oxford, are looking for the third Polonsky Fellow (Technical Officer/Research Software Engineer) to join their team! The successful candidate will join the team in Oxford and undertake research and training to build upon their expertise in the technical issues surrounding digital preservation and their awareness of the tools, systems and […]

The latest version of veraPDF is now available. Version 0.24 includes a prototype of batch validation in both the CLI and GUI. Other enhancements include: Conformance checker: added extraction of the AFRelationship key for embedded files as part of veraPDF feature extraction. Application enhancements: implemented a prototype of batch validation from the CLI and GUI; […]

The Open Preservation Foundation and nestor have signed a Memorandum of Understanding with the aim of facilitating discussion between the two organisations and co-operating on activities to promote digital preservation. Sabrina Kistner Hidalgo, Head of the nestor Office, said: “nestor and the OPF share central goals such as digital preservation advocacy, knowledge exchange and […]


26 Oct to 28 Oct

The Preservation and Archiving Special Interest Group (PASIG), among the preeminent international conferences on digital preservation, is coming to NYC. Now in its ninth year, PASIG is a practical, solutions-focused conference that places a strong emphasis on the following: comparison of high-level OAIS architectures, services-oriented architecture work, and use cases; cooperation on standard-based, open-source […]

02 Nov

The first part of the webinar is dedicated to the E-ARK “Common Specification for Information Packages”. This specification defines a standardised set of rules for the structure and use of metadata within any Information Package, regardless of its size, the data it includes or the type of organisations which shall deliver, preserve or reuse it. […]

09 Nov

The webinar is dedicated to the practical preservation of relational databases. The first part of the webinar introduces the updates made to the original SIARD format by the E-ARK project in collaboration with the Swiss Federal Archives. Most notably, the SIARD 2.0 format adds scalability and support for newer SQL methods […]


On 11th October we held our first JHOVE online hack day. Our aim was to catalogue error messages produced by JHOVE to get a better understanding of their meaning and potential preservation impact.

Background: organising an online hack day

We have been considering running online hackathons because attending face-to-face events has become more difficult as […]

BACKGROUND

Nearly two and a half years ago, I started an effort for Apache Tika™ to help improve its robustness via TIKA-1302. Apache Tika™ is an umbrella/wrapper project that “detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).” I documented some of the early work […]

This is a relatively long post, so to summarise before delving into the details: We’re exploring Wikidata, the (relatively new) Wikipedia for data, as a knowledge base for digital preservation information and would appreciate feedback and involvement. At Yale University Library we are beginning a new programme of work (with funding from both CLIR and […]