Why can’t we have digital preservation tools that just work?

One of my first blog posts here covered an evaluation of a number of format identification tools. One of the more surprising results of that work was that, out of the five tools tested, no fewer than four (FITS, DROID, Fido and JHOVE2) failed to even run when executed with their associated launcher script. In many cases the Windows launcher scripts (batch files) only worked when executed from the installation folder. Apart from making things unnecessarily difficult for the user, this also flies in the face of established conventions on command-line interface design. Around the time of this work (summer 2011) I had been in contact with the developers of all the evaluated tools, and until last week I thought those issues were a thing of the past. Well, was I wrong!

FITS 0.8

Fast-forward 2.5 years: this week I saw the announcement of the latest FITS release. This got me curious, also because of the recent work on this tool as part of the FITS Blitz. So I downloaded FITS 0.8, installed it in a directory called c:\fits\ on my Windows PC, and then typed (from the directory f:\myData\):

f:\myData>c:\fits\fits

Instead of the expected help message I ended up with this:

The system cannot find the path specified.
Error: Could not find or load main class edu.harvard.hul.ois.fits.Fits

Hang on, I've seen this before… don't tell me this is the same bug that I already reported 2.5 years ago? Well, it turns out it is!
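The error above is consistent with a launcher that builds its classpath relative to the current working directory instead of relative to its own location. As a rough illustration of the alternative (not the actual FITS launcher), here is a minimal Python sketch that anchors everything at the folder containing the launcher itself. The lib folder layout, the main class name and the function name are all assumptions made up for this example. In a Windows batch file the same idea is usually expressed with %~dp0, which expands to the drive and folder of the running script.

```python
import sys
from pathlib import Path


def build_java_command(launcher_file, main_class, args):
    """Build a java command line whose classpath is anchored at the install folder.

    launcher_file is the path of the launcher script itself. The lib/ layout
    and the main class are illustrative assumptions, not any real tool's layout.
    """
    # Resolve the folder that holds the launcher, NOT the caller's working directory.
    install_dir = Path(launcher_file).resolve().parent
    # Java expands a trailing "*" classpath entry to every jar in that folder.
    classpath = str(install_dir / "lib" / "*")
    return ["java", "-cp", classpath, main_class, *args]


if __name__ == "__main__":
    # Hypothetical main class; a real launcher would hand this list to subprocess.call().
    print(build_java_command(sys.argv[0], "org.example.Main", sys.argv[1:]))
```

With a launcher written along these lines, the classpath stays the same whether the script is started from its own folder or from somewhere like f:\myData\.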

This got me curious about the status of the other tools that had similar problems in 2011, so I started downloading the latest versions of DROID, JHOVE2 and Fido. As I was on a roll anyway, I gave JHOVE a try as well (even though it was not part of the 2011 evaluation). The objective of the test was simply to run each tool and get some screen output (e.g. a help message), nothing more. I did these tests on a PC running Windows 7 with Java version 1.7.0_25. Here are the results.

DROID 6.1.3

First I installed DROID in a directory C:\droid\. Then I executed it using:

f:\myData>c:\droid\droid

This started up a Java Virtual Machine Launcher that showed a message box (screenshot not reproduced here).

The Running DROID text document that comes with DROID says:

To run DROID on Windows, use the "droid.bat" file. You can either double-click on this file, or run it from the command-line console, by typing "droid" when you are in the droid installation folder.

So, no progress on this for DROID either, then. I was able to get DROID running by circumventing the launcher script like this:

java -jar c:\droid\droid-command-line-6.1.3.jar

This resulted in the following output:

No command line options specified

This isn't particularly helpful. There is a help message, but you only get to see it if you pass the -h flag on the command line, and nothing in the output above tells you that this flag exists. Catch-22, anyone?
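For comparison, the convention I have in mind is that a tool started without any arguments prints its usage text. A minimal sketch of that behaviour in Python, using the standard argparse module, might look as follows. The tool name "mytool", its single positional argument and the exit code are hypothetical choices for this example and have nothing to do with how DROID is implemented.

```python
import argparse


def build_parser():
    # "mytool" and its single positional argument are hypothetical.
    parser = argparse.ArgumentParser(
        prog="mytool",
        description="Identify the format of one or more files.",
    )
    parser.add_argument("files", nargs="*", help="files to analyse")
    return parser


def main(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.files:
        # No arguments given: print the full help text rather than a bare one-line error.
        parser.print_help()
        return 2
    for name in args.files:
        print("would analyse:", name)
    return 0
```

A script built around this would typically end with sys.exit(main()), so that an empty command line yields the help text plus a non-zero exit status.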

JHOVE2 2.1.0

After installing JHOVE2 in c:\jhove2\, I typed:

f:\myData>c:\jhove2\jhove2

This gave me 1393 (yes, you read that right: 1393!) Java deprecation warnings, each along the lines of:

16:51:02,702 [main] WARN  TypeConverterDelegate : PropertyEditor [com.sun.beans.editors.EnumEditor]
found through deprecated global PropertyEditorManager fallback - consider using a more isolated
form of registration, e.g. on the BeanWrapper/BeanFactory!

This was eventually followed by the (expected) JHOVE2 help message, and a quick test on some actual files confirmed that JHOVE2 does actually work. Nevertheless, by the time the tsunami of warning messages is over, many first-time users will have started running for the bunkers!

Fido 1.3.1

Fido doesn't make use of any launcher scripts any more, and the default way to run it is to use the Python script directly. After installing in c:\fido\ I typed:

f:\myData>c:\fido\fido.py

Which resulted in … (drum roll) … a nicely formatted Fido help message, which is exactly what I was hoping for. Beautiful!

JHOVE 1.11

I installed JHOVE in c:\jhove\ and then typed:

f:\myData>c:\jhove\jhove

Which resulted in this:

Exception in thread "main" java.lang.NoClassDefFoundError: edu/harvard/hul/ois/jhove/viewer/ConfigWindow
        at edu.harvard.hul.ois.jhove.DefaultConfigurationBuilder.writeDefaultConfigFile(Unknown Source)
        at edu.harvard.hul.ois.jhove.JhoveBase.init(Unknown Source)
        at Jhove.main(Unknown Source)
Caused by: java.lang.ClassNotFoundException: edu.harvard.hul.ois.jhove.viewer.ConfigWindow
        at java.net.URLClassLoader$1.run(Unknown Source)
        at java.net.URLClassLoader$1.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        ... 3 more

Ouch!

Final remarks

I limited my tests to a Windows environment only, and results may well be better under Linux for some of these tools. Nevertheless, I find it nothing less than astounding that so many of these (often widely cited) preservation tools fail to even execute on today's most widespread operating system. Granted, in some cases there are workarounds, such as tweaking the launcher scripts or circumventing them altogether. However, this is not an option for less tech-savvy users, who will simply conclude "Hey, this tool doesn't work", give up, and move on to other things. This means that much of the (often huge) development effort that went into these tools will simply fail to reach its potential audience, and I think this is a tremendous waste. I'm also wondering why there's been so little progress on this over the past 2.5 years. Is it really that difficult to develop preservation tools with command-line interfaces that follow basic design conventions that have been ubiquitous elsewhere for more than 30 years? Tools that just work?
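The manual check used throughout this post (start the tool from a folder other than its installation folder and see whether it prints anything sensible) is simple enough to automate. Below is a minimal sketch in Python, assuming the tool can be started as a plain command list. The function name and the stand-in command are made up for this example, and how a given launcher has to be invoked on a particular platform is deliberately left out.

```python
import subprocess
import sys
import tempfile


def smoke_test(command, timeout=60):
    """Run `command` from an unrelated working directory and report what happened."""
    # A fresh temporary folder stands in for "anywhere but the installation folder".
    with tempfile.TemporaryDirectory() as workdir:
        result = subprocess.run(
            command,
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Stand-in command: the Python interpreter's own help text.
    # Replace it with the absolute path of a tool's launcher plus its help flag.
    code, out, err = smoke_test([sys.executable, "-h"])
    print("exit code:", code)
    print("produced output:", bool(out.strip() or err.strip()))
```

A non-zero exit code or a stack trace on stderr at this stage would flag exactly the kind of problem described above, before any user runs into it.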

5 Comments

  1. yfriese
    February 5, 2015 @ 3:15 pm CET

    Hi,
    I know this is a year old, but as some of these tools are still in the same shape as they used to be, I just want to state that it is not easy (even for a not-so-not-tech-savvy-person) to get to run JHOVE2.
    OK, the command-line tool works after eliminating all the spaces and quotation marks (") in the path (that is, moving JHOVE2 to a directory without spaces or quotation marks), but the library is really difficult to use, or the documentation just is not as good as Gary’s “JHOVE tips for developers” for standard JHOVE (should I call it JHOVE 1?).
    I would like to look more deeply into the talkative error messages about tiff tags, but it’s not very encouraging so far.
    Well, I have not given up yet.
    Thanks for giving me a fresh feeling revisiting this blog.
    And, yeah, JHOVE2 still gives a zillion deprecation warnings which are filling my monitor with black desperation… 🙂
    Best, Yvonne

  2. Jan Hutar
    February 12, 2014 @ 12:18 am CET

    I think this goes back to the problem of governance. All the mentioned tools have different owners; some of them are supported and continuously developed by an institution, some of them by an individual person (!). In short, the majority of them are permanently in danger of being abandoned and no longer enhanced, which is scary when we realize that these are tools we all rely on.

    Of course you can fix, develop and enhance these tools within your own organisation, but that is a rare and unwanted scenario. So to get to the "nirvana" 😉 status where we have tools that are continuously enhanced, developed, debugged and supported, we would need some official body that would take responsibility for them, including funding (probably mainly funding).

    I like the idea of raising this topic at iPRES; it's probably the right platform to use.

  3. andrea_goethals
    February 10, 2014 @ 4:24 pm CET

    Beyond the particular issues you point out, I think the bigger question you raise is "why can't the tools we all rely on be more stable and provide the functions we need?" All of these tools "work" for some use cases (usually those of the core organizations maintaining them), but they don't work for all use cases, and they all certainly have bugs that should be fixed. If you think it would be useful, I would love for us to brainstorm on how we as a community can put better processes in place for improving our tools. We have started this discussion among some FITS users/developers, but we could use a lot more input. Maybe interested parties could meet and discuss at a conference such as iPRES?

  4. johan
    January 31, 2014 @ 3:29 pm CET

    Hi David,

    Thanks, I just got your email & a reply is on its way.

    Cheers,

    Johan

  5. Dclipsham
    January 31, 2014 @ 2:56 pm CET

    Hi Johan,

    Thanks for raising this. I've never encountered the problem you've described with DROID before, so it's obviously one we'd like to work through and understand and fix. I'll contact you via email to prevent my hijacking the blogpost for tech support delivery – I hope that's okay.

    David
