johan's Blog

Digital Preservation Researcher at KB / National Library of the Netherlands

The development work on an imaging/ripping workflow for optical media is shaping up steadily, and you can expect a write-up with more information about our software and hardware setup here in the near future (you can get a sneak peek here). However, this blog post is about a very specific problem that we ran into while […]

By johan, posted in johan's Blog

25th Apr 2017  2:07 PM  512 Reads  3 Comments

In my previous blog post I addressed the detection of broken audio files in an automated workflow for ripping audio CDs. For (data) CD-ROMs and DVDs that are imaged to an ISO image, a similar problem exists: how can we be reasonably sure that the created image is complete? In this blog post I will […]

13th Jan 2017  3:30 PM  4409 Reads  5 Comments

At the KB we have a large collection of offline optical media. Most of these are CD-ROMs, but we also have a sizeable proportion of audio CDs. We’re currently in the process of designing a workflow for stabilising the contents of these materials using disk imaging. For audio CDs this involves ‘ripping’ the tracks to […]

4th Jan 2017  2:38 PM  1134 Reads  3 Comments

Earlier this week the National Archives of the Netherlands (NANeth) published a report on preferred file formats. It gives an overview of NANeth’s ‘preferred’ and ‘acceptable’ formats for 9 content categories, and also explains the reasoning behind the selected formats. Even though it is available in Dutch only, the report is well worth a look. However, I […]

9th Dec 2016  3:41 PM  1827 Reads  1 Comment

This is the second and final instalment of a 2-part blog on the use of PDF/A validators for identifying preservation risks in PDF. You can read the first part here. In Part 1 I showed how PDF/A validators can be used to identify preservation risks in a PDF. I illustrated this with an example that […]

8th Jul 2015  1:32 PM  2078 Reads  No comments

This is the first instalment of a 2-part blog. It was prompted by the upcoming Digital Preservation Coalition briefing When is a PDF not a PDF?, for which I was asked to prepare a presentation. My initial idea was to give an overview of the work we did on PDF preservation risk assessment using a […]

7th Jul 2015  12:45 PM  2444 Reads  No comments

While browsing ArchiveTeam's File Formats Wiki earlier this week, I came across some entries I created there on Quattro Pro spreadsheets two years ago. At the time I had also contributed some old Quattro Pro for DOS spreadsheets (here and here) from my personal archives to the OPF format corpus. Seeing those files again, I […]

29th Oct 2014  2:59 PM  18452 Reads  2 Comments

Earlier this week I had a discussion with some colleagues about the archiving of mobile phone and tablet apps (iPhone/Android), and, equally important, ways to provide long-term access. The immediate incentive for this was an announcement by a Dutch publisher, who recently published a children's book that is accompanied by its own app. Also, there […]

23rd Oct 2014  11:33 AM  11254 Reads  No comments

Some time ago Will Palmer, Peter May and Peter Cliff of the British Library published a really interesting paper that investigated three different JPEG 2000 codecs, and their effects on image quality in response to lossy compression. Most remarkably, their analysis revealed differences not only in the way these codecs encode (compress) an image, but […]

26th Sep 2014  1:06 PM  14330 Reads  3 Comments

It is well known that PDF documents can contain features that are preservation risks (e.g. see here and here). Migration of existing PDFs to PDF/A is sometimes advocated as a strategy for mitigating these risks. However, the benefits of this approach are often questionable, and the migration process can also be quite risky in itself. As […]

27th Aug 2014  3:47 PM  16452 Reads  9 Comments