Blogs

Our blogs are written by contributors from the international digital preservation community. You can find information on a wide range of topics covering tools, project news, case studies and best practice. Everyone is welcome to post a blog and join in the discussion.

I spent 22nd and 23rd of May at the GitHub Satellite conference in London. The aim of the event was to: provide a showcase for users of and service providers for GitHub; unveil some new GitHub features; and offer training workshops covering particular GitHub technology. Given that I use GitHub and associated tools like Travis-CI, […]

By Carl Wilson, posted in Carl Wilson's Blog

19th Jun 2017  12:49 PM  1066 Reads  No comments

On June 7 and 8, 2017, the General Annual Meeting of the Open Preservation Foundation was held at the National Library of France in Paris. While listening to the presentations, talking with the participants and making an inventory of tools, services, workflow software and repository systems, an idea started to grow in my mind. […]

By RvanVeenendaal, posted in RvanVeenendaal's Blog

13th Jun 2017  9:43 AM  1391 Reads  No comments

Some four years ago I wrote a blog post that demonstrated how Apache Preflight (the PDF/A validator tool that is part of Apache PDFBox) can be used to detect features in a PDF that are potential preservation risks. A follow-up blog applied Schematron rules to the Preflight output in an attempt at doing policy-based assessments. […]

By johan, posted in johan's Blog

1st Jun 2017  1:53 PM  1731 Reads  No comments

Much of the inspiration for this blog came from this source here. According to UNESCO, the authenticity of a record can be jeopardized by: Threats to integrity. Changes to the content of the object itself also potentially damage authenticity. Most such changes stem from threats to the object at a data level. A hyperlink is data. […]

By ross-spencer, posted in ross-spencer's Blog

19th May 2017  3:41 AM  2906 Reads  No comments

Introduction: Government departments are connecting their information systems to the e-Depot of the National Archives of the Netherlands (NANETH). The digital archival materials (information objects) coming from these systems (closed cases or other process-bound information) are subsequently preserved in the e-Depot. NANETH’s Service Organization supports the more complex connection projects. These projects are always preceded […]

By RvanVeenendaal, posted in RvanVeenendaal's Blog

17th May 2017  9:06 AM  1561 Reads  No comments

The second JHOVE online hack day took place on 27 April. Once again, we were delighted to be joined by volunteers from around the world to help improve both the software and supporting documentation. During the hack day, participants: continued to add explanations to the error messages spreadsheet; reviewed and extended the documentation; fixed a […]

By Carl Wilson, posted in Carl Wilson's Blog

9th May 2017  1:25 PM  2091 Reads  No comments

The development work on an imaging/ripping workflow for optical media is shaping up steadily, and you can expect a write-up with more information about our software and hardware setup here in the near future (you can get a sneak peek here). However, this blog is about a very specific problem that we ran into while […]

By johan, posted in johan's Blog

25th Apr 2017  2:07 PM  3051 Reads  3 Comments

On April 11, 2017, the National Archives of the Netherlands (NANETH) and Het Utrechts Archief (HUA) held a workshop about the preservation tools PRONOM, DROID and JHOVE. Annelot Vijn (Application Manager Department of Archives at HUA) and I, Remco van Veenendaal (Preservation Advisor at NANETH), prepared the programme, and I led the workshop. Starting with […]

By RvanVeenendaal, posted in RvanVeenendaal's Blog

14th Apr 2017  12:13 PM  1506 Reads  No comments

We were asked recently to write up a tactical fix for addressing “Tag out of sequence” errors in image files. It seems like the sort of thing that others might find useful, and so this blog is a record of that method. We are sharing it for comment and discussion. At the end of the […]

By jaygattuso, posted in jaygattuso's Blog

9th Apr 2017  11:53 PM  2571 Reads  1 Comment

Organisations that use JHOVE for PDF validation will already be familiar with the number of error messages it reports. The recently released JHOVE v1.16 Release Candidate (RC) includes a couple of my bug fixes for the PDF module which appear to reduce this number significantly. These fixes were the result of investigating “Invalid Page Dictionary Object” errors and […]

By Peter May, posted in Peter May's Blog

10th Mar 2017  2:23 PM  2849 Reads  No comments