johan's Blog

Digital Preservation Researcher at KB / National Library of the Netherlands

In a previous blog post I showed how we resurrected NL-menu, the first Dutch web index. That post explained how we recovered the site’s data from an old CD-ROM, and how we then created a local copy of the site by serving the CD-ROM’s contents with the Apache web server. This follow-up post covers the final […]

By johan, posted in johan's Blog

11th Jul 2018  3:47 PM  362 Reads  No comments

NL-menu was the first Dutch web index. The site was originally founded by a consortium of SURFnet, Dutch universities and the KB. From the mid-nineties onwards it was maintained solely by the KB. NL-menu was discontinued in 2004, after which the site was taken offline. In 2006 the domain name was sold to a private […]

24th Apr 2018  5:01 PM  845 Reads  No comments

Earlier this year I blogged about Isolyzer, a tool designed to help detect broken ISO images. Today I released a shiny new beta version that adds a significant amount of new functionality. Below is an overview of the main changes, followed by some warnings and caveats. Support of more file systems Where previous […]

12th Jul 2017  3:06 PM  1320 Reads  No comments

Over the past few months we’ve been working on a provisional workflow for preserving the content of optical media in our collection. The main result thus far is Iromlab, a custom workflow application that streamlines the imaging and ripping process. This blog post gives an overview of Iromlab, as well as the reasons why […]

19th Jun 2017  1:56 PM  2262 Reads  1 Comment

Some four years ago I wrote a blog post that demonstrated how Apache Preflight (the PDF/A validator tool that is part of Apache PDFBox) can be used to detect features in a PDF that are potential preservation risks. A follow-up post applied Schematron rules to the Preflight output in an attempt to do policy-based assessments. […]

1st Jun 2017  1:53 PM  1216 Reads  No comments

The development work on an imaging/ripping workflow for optical media is shaping up steadily, and you can expect a write-up with more information about our software and hardware setup here in the near future (you can get a sneak peek here). However, this blog post is about a very specific problem that we ran into while […]

25th Apr 2017  2:07 PM  1890 Reads  3 Comments

In my previous blog post I addressed the detection of broken audio files in an automated workflow for ripping audio CDs. For (data) CD-ROMs and DVDs that are imaged to an ISO image, a similar problem exists: how can we be reasonably sure that the created image is complete? In this blog post I will […]

13th Jan 2017  3:30 PM  7118 Reads  5 Comments

At the KB we have a large collection of offline optical media. Most of these are CD-ROMs, but we also have a sizeable proportion of audio CDs. We’re currently in the process of designing a workflow for stabilising the contents of these materials using disk imaging. For audio CDs this involves ‘ripping’ the tracks to […]

4th Jan 2017  2:38 PM  4247 Reads  3 Comments

Earlier this week the National Archives of the Netherlands (NANeth) published a report on preferred file formats. It gives an overview of NANeth’s ‘preferred’ and ‘acceptable’ formats for 9 content categories, and also explains the reasoning behind the selected formats. Even though it is available in Dutch only, the report is well worth a look. However, I […]

9th Dec 2016  3:41 PM  3546 Reads  1 Comment

This is the second and final instalment of a two-part blog post on the use of PDF/A validators for identifying preservation risks in PDF files. You can read the first part here. In Part 1 I showed how PDF/A validators can be used to identify preservation risks in a PDF. I illustrated this with an example that […]

8th Jul 2015  1:32 PM  3250 Reads  No comments