ross-spencer's Blog

Digital Preservation Analyst, Archives New Zealand, formerly Digital Preservation Researcher at The National Archives, UK. My interests are format identification, migration, and the creation of tools for the characterization of file formats. I'm a coder. I began in industry developing VoIP solutions in C++. More recently I code in Python. You will find the majority of my recent work on GitHub.com. In the past I have blogged about digital preservation for The National Archives, UK, and I maintain a personal blog at exponentialdecay.co.uk. I am interested in any correspondence relating to digital preservation, either via the form here, or via Twitter: @beet_keeper. Please note, with the exception of matter-of-fact commentary on the work I may be conducting at Archives New Zealand, all opinions and thoughts are my own.

We’ve been doing legacy disk extracts at Archives New Zealand for a number of years, with much of the enabling work done by colleague Mick Crouch and former Archives New Zealand colleague Euan Cochrane. Earlier this year, we received some disks from New Zealand’s Department of Conservation (DoC) which we successfully imaged and […]

By ross-spencer, posted in ross-spencer's Blog

23rd Sep 2014  8:14 AM  12986 Reads  4 Comments

During my time at The National Archives UK, my colleague Adam Retter developed a methodology for the reversible pre-conditioning of complex binary objects. The technique was required to avoid the doubling of storage for malformed JPEG2000 objects numbering in the hundreds of thousands. The difference between a malformed JPEG2000 file and a corrected, well-formed JPEG2000 file, in […]

By ross-spencer, posted in ross-spencer's Blog

9th Jul 2014  12:31 AM  12224 Reads  1 Comment

I have been working on some code to ensure the accurate and consistent output of any file format analysis based on the DROID CSV export, example here. One way of looking at it is an executive summary of a DROID analysis, except I don't think executives, as such, will be its primary user-base. The reason for pushing […]

By ross-spencer, posted in ross-spencer's Blog

3rd Jun 2014  7:20 AM  11527 Reads  1 Comment

Fifteen days was the estimate I gave for completing an analysis on roughly 450,000 files we were holding at Archives New Zealand. Approximately three seconds per file for each round of analysis:

3 x 450,000 = 1,350,000 seconds

1,350,000 seconds = 15.625 days

My bash script included calls to three Java applications, Apache Tika, 1.3 […]
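The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not code from the original post; the function name `estimate_days` and the constant `SECONDS_PER_DAY` are hypothetical names chosen for illustration.

```python
# Sketch of the run-time estimate: files x seconds per file, converted to days.
# estimate_days and SECONDS_PER_DAY are illustrative names, not from the post.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def estimate_days(file_count: int, seconds_per_file: float) -> float:
    """Return the estimated duration, in days, of one round of analysis."""
    total_seconds = file_count * seconds_per_file
    return total_seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    files = 450_000   # approximate number of files, as stated in the post
    per_file = 3      # approximate seconds per file, as stated in the post
    print(f"{files * per_file:,} seconds = {estimate_days(files, per_file)} days")
```

Run as a script, this prints `1,350,000 seconds = 15.625 days`, matching the figures quoted in the post.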

By ross-spencer, posted in ross-spencer's Blog

24th Feb 2014  2:17 AM  17727 Reads  5 Comments

Conducting some research into the chaining of digital preservation tools using a Linux shell script, I once again found it difficult to source a set of files that I could use as a stake in the ground and allow my work to be in some way replicated by others wishing to confirm results and find […]

By ross-spencer, posted in ross-spencer's Blog

20th Feb 2014  6:57 AM  10514 Reads  No comments

A while back I wrote a blog post, MIA: Metadata. I highlighted how difficult it was to capture certain metadata without a managed system – without an Electronic Document and Records Management System (EDRMS). I also questioned if we were doing enough with EDRMS by way of collecting data. Following that blog we sought out […]

By ross-spencer, posted in ross-spencer's Blog

4th Feb 2014  5:21 AM  12172 Reads  1 Comment

"Digital preservation is more than the technical preservation of a file … it is also about providing readers with the context surrounding it to promote authenticity."

Principle 2, Requirement 8 of the Archives New Zealand Electronic Recordkeeping Metadata Standard asks for seven mandatory elements to be captured: a unique identifier; a name; date […]

By ross-spencer, posted in ross-spencer's Blog

12th Jun 2013  4:01 AM  24952 Reads  17 Comments

“#Migration: No one does it for the future; they do it (need to do it) for the now.” – https://twitter.com/beet_keeper/status/327968228276060160

Recently I was asked by a colleague to look at some files he’d been sent by Hutt City Council in New Zealand; an unknown format from a 1995 vintage IBM operating system – […]

By ross-spencer, posted in ross-spencer's Blog

13th May 2013  12:25 AM  13583 Reads  3 Comments