Update: Digital Archaeology and Forensics

At the beginning of this year we reported on the first results of a joint Archives New Zealand and University of Freiburg project to recover data from a set of 5.25 inch floppy disks from the early 1990s. After the raw bitstreams had been recovered from the floppy disks with a special hardware device, the resulting image files were sent to Freiburg for further analysis. Now that the file list of each floppy has been established, it is possible to extract single files.

Filesystem Interpreter

Of course it was possible before to peek at the probable file contents by opening the floppy image file in a hex editor. But this makes it very hard to tell where one file ends and the next begins, especially for non-text files. Depending on the filesystem used, a file is not necessarily stored in consecutive blocks on the storage medium.
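To illustrate why consecutive bytes in a hex editor do not necessarily belong to the same file, here is a minimal sketch of reassembling a file from scattered blocks. It is not the CTOS on-disk layout: the sector size of 512 bytes, the function name read_file_from_runs and the idea of a list of (start_sector, sector_count) runs are assumptions made purely for illustration.

```python
SECTOR_SIZE = 512  # assumption for illustration; not taken from the CTOS format


def read_file_from_runs(image: bytes, runs, length: int) -> bytes:
    """Rebuild one file from a disk image.

    `runs` is a hypothetical list of (start_sector, sector_count) pairs in
    the order the file's data is stored; `length` is the file size in bytes.
    """
    data = bytearray()
    for start_sector, sector_count in runs:
        begin = start_sector * SECTOR_SIZE
        end = begin + sector_count * SECTOR_SIZE
        if end > len(image):
            raise ValueError("run extends past the end of the image")
        data += image[begin:end]
    # The last sector is usually only partly used, so trim to the file length.
    return bytes(data[:length])
```

With the runs in hand the file comes out in the right order even if its second part sits earlier on the disk than its first; without them, a reader of the raw image has no way of knowing where to jump.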

For the purposes of the Archive and the public institution donating the data, it is not necessary to re-implement the filesystem driver of the old platform for a recent one, as most probably nobody wants to write files to floppy disks for this architecture again. Nevertheless, a thorough understanding of the old filesystem is required to write a tool that can at least perform basic filesystem functions such as listing the content of a directory and reading a specific file. For fast prototyping, and because processing speed and efficiency are not an issue here, the student undertaking this task in his thesis chose the Python scripting language. After the first implementation step, reading the directory content, the second step, reading actual files, has now been achieved.

Fortunately the project was started early enough that all relevant information, which came from one specific site on the net (www.ctosfaq.com), was retrieved in time. This site has since gone down and left no relevant traces either in the Internet Archive or in the publicly accessible caches of the search engines. This is a nice example of the challenges digital archaeologists face. It suggests a recommendation for the future: memory institutions should store all relevant information on a past computer architecture themselves and not rely too much on the net.

Preliminary Results

The recovery experiment was run on 62 disk images created by the team in New Zealand. In three of those 62 images the File Header Block was unreadable. Two of the failing images were only half the size of the rest, 320 KByte instead of 640 KByte. This issue led to unavailable file information such as the file address on the image and the file length. For the third failing case it is still unclear why the File Header Block is unreadable. This leaves a total of 59 readable images with a total of 1332 identifiable files in them. The text content of the failing disk images was transferred to a single text file per image. At the moment the issues are being investigated together with the manufacturer of the reading device. It might be possible to tweak the reading process and extract more information that way, adding the missing pieces for the failing images. This might lead to some deeper insight into the procedure and to some best practice recommendations.
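The fallback used for the failing images, saving their text content to one text file per image, can be approximated along the following lines. This is a minimal sketch and not the project's actual script: the function names, the restriction to printable ASCII and the minimum run length of four characters are assumptions, chosen to mimic what a strings-style tool does.

```python
import re

# Runs of at least four printable ASCII characters (assumed threshold).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def extract_text_runs(image: bytes):
    """Return every printable ASCII run found in the raw image, in order."""
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(image)]


def dump_text(image_path: str, text_path: str) -> None:
    """Write the text runs of one disk image to one text file."""
    with open(image_path, "rb") as f:
        image = f.read()
    with open(text_path, "w", encoding="ascii") as out:
        for run in extract_text_runs(image):
            out.write(run + "\n")
```

Such a dump loses file names and file boundaries, which is exactly the information the unreadable File Header Block would have provided, but it at least preserves the readable text.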

9 Comments

  1. David Schmidt
    May 8, 2012 @ 3:00 am CEST

    Dirk –

    No, I have not had the occasion to use the FC5025 on DEC floppies.  So far, just various flavors of CP/M, a few DOS variants (160k, 180k, 360k) and now CTOS.  It is capable of imaging Commodore GCR and Apple II disks as well, but I have native hardware with other communications solutions that ultimately work better for those.

  2. Dirk von Suchodoletz
    May 4, 2012 @ 7:37 am CEST

    Thank you very much for pointing out a Kryoflux alternative. The Device Side USB floppy disk controller is very decently priced and comes with the source code of the driver. This is definitely a big plus for long-term access. The driver page gives a list of supported host operating systems and original environments. Btw. have you tried to use this device for e.g. DEC formatted 5.25″ floppies?

  3. David Schmidt
    May 1, 2012 @ 3:01 pm CEST

    This is encouraging work! I am also engaged in recovering some CTOS disks. I used the FC5025 floppy controller, as it comes with source code. I was then able to write the driver to decode the CTOS low-level format, and will contribute that back to the vendor for inclusion in future versions of their extraction software.

    Also of note: the CTOS FAQ has been preserved by OoCities.org – and is available once again here:
    http://www.oocities.org/siliconvalley/pines/4011/

  4. andy jackson
    April 14, 2012 @ 1:41 pm CEST

    That’s a great-looking resource. Unfortunately, they’ve not made the licensing or terms of use of the data clear, which is the kind of thing that gets in the way of archiving/exploiting/re-publishing etc. I’ve dropped them an email through their comments form, so hopefully they will clarify.
