Question: Who is/isn’t retaining technical provenance notes?


If you are, what are you retaining and why?

If not, why not?

There is more to come from us on this topic – but for now I'd love to hear any opinions or thoughts.

And what do I mean by technical provenance?

Good question. I mean a record of any filename sanitisation, QA changes to (meta)data, file structure moves, normalisation data, or details of any other technical process that has touched the original bitstream as it was found (at rest, if applicable) on its source medium.
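As a rough illustration of what one such note might look like, here is a minimal Python sketch that records a filename sanitisation event alongside a checksum of the untouched bitstream. Everything in it is an assumption made for the example: the field names, the `hypothetical-ingest-tool` agent string, and the set of characters treated as illegal are not taken from any particular repository system or standard.

```python
import hashlib
import json
import re
from datetime import datetime, timezone

# Characters treated as illegal in a filename for this example.
# This is an illustrative set, not an authoritative list for any file system.
ILLEGAL_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of the bitstream."""
    return hashlib.sha256(data).hexdigest()


def sanitise_filename(name: str) -> str:
    """Replace each illegal character with an underscore."""
    return ILLEGAL_CHARS.sub("_", name)


def provenance_note(original_name: str, data: bytes,
                    agent: str = "hypothetical-ingest-tool 0.1") -> dict:
    """Build a provenance note for a filename sanitisation event.

    The checksum is taken over the bitstream only, so it can later be used
    to show that the content did not change even though the filename did.
    """
    new_name = sanitise_filename(original_name)
    return {
        "event": "filename sanitisation",
        "agent": agent,  # hypothetical tool name and version
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "original_filename": original_name,
        "new_filename": new_name,
        "filename_changed": new_name != original_name,
        "sha256": sha256_of(data),
    }


# Example: a legacy filename containing characters some current systems reject.
note = provenance_note("report: draft?.doc", b"example bitstream")
print(json.dumps(note, indent=2))
# new_filename becomes "report_ draft_.doc"
```

Keeping the original filename and the checksum in the same note means the change is documented rather than hidden, and the bitstream can still be verified against the value recorded before anything was touched.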

2 Comments

  1. Jay Gattuso
    February 27, 2013 @ 1:57 am CET

    "I'm yet to understand why fundamental parts of digital objects such as filenames and metadata need to be changed ever."

    On-the-ground experience tells us there are at least 3 related reasons:

    1. some of the tools we have are imperfect
    2. some of the objects we encounter are imperfect/substandard (either by poor design, poor construction or simply through being old and not pinned to a known standard).
    3. legacy O/Ss and file systems don't behave in the same way as current O/Ss and file systems, resulting in (for example) filenames from legacy systems that are simply illegal in du jour systems

    When it comes to running many processes on many real-world objects found 'in the wild', we often find that the theoretical 'perfection' we expect from academic/design exercises is not observed in real-world data.

    Given that we know that at times we have to make changes, a system that is predicated on not making changes as its primary audit method is not a viable or desirable option.

    So in reply to:

    "You can do this by showing that it could not have changed. This can be done by documenting policies and processes and being able to demonstrate that they are always followed."

    If your processes are built on the premise that no changes have been made, you've got 2 different problems…

    1. Firstly, you have no mechanism to deal with change when the real world kicks in and things (objects/tools) don't do what you expect them to, and
    2. What happens if there is a breach of trust in the handling chain? Everything that is touched by the untrusted system (person/tool/process/policy) becomes untrusted by default.

  2. ecochrane
    February 25, 2013 @ 5:40 am CET

    I'm yet to understand why fundamental parts of digital objects such as filenames and metadata need to be changed ever. However, despite that, I do acknowledge that you need to be able to show that what you manage has not changed since it came under your management. If your technical provenance information can be used to do this, by showing that the content hasn't changed even though all these other things have changed or happened, then that would be an argument for capturing and keeping it. However, there are other ways of showing that the content hasn't changed. You can do this by showing that it could not have changed. This can be done by documenting policies and processes and being able to demonstrate that they are always followed. This enables auditors to have confidence that the content could not have changed (assuming the policies and procedures ensure that).

    I'm also interested to hear what practitioners are doing. Good question, Jay.
