xcorrSound

Improve Your Digital Audio Recordings

What is xcorrSound?

xcorrSound consists of four tools:

  • overlap-analysis detects overlap in two audio files
  • waveform-compare compares two audio files and outputs the similarity
  • sound-match detects occurrences of a smaller audio file (e.g. a jingle) within a larger audio file or an index of audio files
  • sound-index builds an index for sound-match to work within

xcorrSound Demo Site

http://scape.opf-labs.org/xcorrsound/index.html

What Can xcorrSound Do For Me?

Automating manual processes offers a significant performance improvement.
xcorrSound brings the following benefits:

  • precision in the overlap analysis
  • automated processes
  • resource efficiency
  • open source: freely available
  • easy to install and integrate into a workflow (command line tool)
  • leads to an improved and optimised end user experience

xcorrSound Can Be Used By

  • Institutions disseminating audio content
  • Institutions preserving audio collections

Examples

The State and University Library in Denmark holds a large collection of digitised audio recordings, originally recorded
on two-hour tapes, with overlaps from tape to tape. To enhance the user experience, the library wanted to eliminate the
overlaps and make the broadcast a continuous stream. This was done by using xcorrSound overlap-analysis.

xcorrSound overlap-analysis uses cross correlation to compare the sound waves of two recordings. With it, the library ran an
automated overlap analysis of the audio recordings, then cut the files and joined the trimmed results into 24-hour blocks,
which improved the end users' listening experience.

Algorithms

All the implemented algorithms rely on the Cross Correlation procedure.
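
As an illustration of the idea, here is a minimal Python sketch of normalised cross correlation on two short lists of samples. It is not xcorrSound's implementation: the function name `cross_correlate`, the `max_lag` parameter and the normalisation by the total energy of both signals are assumptions of this sketch. The direct summation used here is also much slower than the FFT-based computation typically used on real audio.

```python
import math

def cross_correlate(f, g, max_lag):
    """Return (lag, peak) for the best alignment of g against f.

    A lag of k means that f[i] lines up best with g[i + k].
    The peak is normalised by the energy of both signals, so it is at most 1.
    """
    norm = math.sqrt(sum(a * a for a in f) * sum(b * b for b in g))
    best_lag, best_value = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        # Sum the products of the samples that overlap at this lag.
        dot = sum(f[i] * g[i + lag]
                  for i in range(len(f)) if 0 <= i + lag < len(g))
        value = dot / norm if norm > 0 else 0.0
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag, best_value

f = [0, 0, 1, 2, 3, 2, 1, 0, 0, 0]
g = [0, 0, 0, 0, 1, 2, 3, 2, 1, 0]   # f delayed by two samples
print(cross_correlate(f, g, max_lag=4))   # prints (2, 1.0)
```

The position of the peak gives the time shift between the two signals, and the height of the peak says how well they match once shifted.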

Waveform compare

The input is two wav files of the same length (n), sample-rate, bit-rate
and so on. The output is a real value between 0 and 1 indicating how
similar the two files are (content-wise) where 0 indicates no
similarity and 1 indicates they are identical.

The algorithm splits the two files f and g into blocks f_1, f_2, …,
f_{n/B} and g_1, g_2, …, g_{n/B} all with the same length B. That
is, the first block consists of the first B samples, the second block
of the following B samples from the respective files, and so on. Then
cross correlation is applied on all corresponding blocks, f_i and
g_i. The position of the peak of the cross correlation tells how much to
shift one block in time to achieve the best match value; we call this the
offset of the block. If there is a block where the offset is more than
500 samples away from the offset of the first block, then an error is
reported and f and g are deemed different. Otherwise the minimum
match value among the blocks is reported, as well as the offset in that
block.

This algorithm has a low memory use that is proportional to the size
of the blocks.
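
The block-wise procedure described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the tool's actual code: the names `best_offset` and `waveform_compare`, the `max_lag` search window, returning `None` for "different", and ignoring a trailing partial block are all choices made for this sketch. Only the 500-sample drift limit comes from the description above.

```python
import math

def best_offset(f, g, max_lag):
    """Return (offset, match value) of the cross correlation peak of two blocks."""
    norm = math.sqrt(sum(a * a for a in f) * sum(b * b for b in g))
    best_lag, best_value = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        dot = sum(f[i] * g[i + lag]
                  for i in range(len(f)) if 0 <= i + lag < len(g))
        value = dot / norm if norm > 0 else 0.0
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag, best_value

def waveform_compare(f, g, block_size, max_lag, max_drift=500):
    """Return (minimum match value, its offset), or None if the offsets drift."""
    if len(f) != len(g):
        raise ValueError("inputs must have the same length")
    if block_size <= 0 or block_size > len(f):
        raise ValueError("block_size must be between 1 and the input length")
    first_offset = None
    worst = None  # (match value, offset) of the weakest block seen so far
    # Only one pair of blocks is examined at a time (a trailing partial block is skipped).
    for start in range(0, len(f) - block_size + 1, block_size):
        offset, value = best_offset(f[start:start + block_size],
                                    g[start:start + block_size], max_lag)
        if first_offset is None:
            first_offset = offset
        elif abs(offset - first_offset) > max_drift:
            return None  # offsets disagree: the files are deemed different
        if worst is None or value < worst[0]:
            worst = (value, offset)
    return worst
```

Because each iteration touches only one block from each file, the working memory of the comparison grows with the block size rather than with the file length. The sketch itself still takes whole lists as input; a streaming reader would be needed to get that benefit on real files.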

Overlap analysis

The input is two wav files such that the last part (unknown how much)
of the first appears as the first part (also unknown how much) of the
second file, content-wise. Both files must have the same
length (n), sample-rate, bit-rate and so on. The output is a length
and a real value between 0 and 1 indicating how good the match is.

The algorithm does one cross correlation computation of the two input
files and outputs the peak position and value. This means the memory
usage is proportional to the size of the input files, which makes the
tool quite memory intensive for large inputs. The use case for this tool
is to find small overlaps, e.g. a few minutes.
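
A minimal Python sketch of this kind of overlap search is shown below. It is an illustration only and makes several assumptions that the description above does not state: the function name `overlap_analysis`, the `min_overlap` parameter (used to avoid spurious perfect matches on very short overlaps), and normalising each candidate by the energy of the overlapping samples so that the reported value lies between 0 and 1.

```python
import math

def overlap_analysis(first, second, min_overlap=1):
    """Return (overlap length, match value in [0, 1]).

    For every candidate length, the tail of `first` is correlated with the
    head of `second`; the candidate with the highest value is the peak.
    """
    best_length, best_value = 0, 0.0
    for length in range(min_overlap, min(len(first), len(second)) + 1):
        tail = first[len(first) - length:]
        head = second[:length]
        dot = sum(a * b for a, b in zip(tail, head))
        norm = math.sqrt(sum(a * a for a in tail) * sum(b * b for b in head))
        value = dot / norm if norm > 0 else 0.0
        if value > best_value:
            best_length, best_value = length, value
    return best_length, best_value

first = [5, 1, -4, 2, 3, -1, 4]
second = [3, -1, 4, -2, 6, 1, -3]    # starts with the last 3 samples of `first`
length, value = overlap_analysis(first, second, min_overlap=2)
print(length, value)                 # prints 3 1.0
```

Like the tool as described, the sketch keeps both inputs in memory at once, so its memory use grows with the size of the files.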

Publications

Leaflet

Conference paper

Blog posts

SlideShare

Vimeo

Components

Credits

  • This work was partially supported by the SCAPE project. The SCAPE project is co-funded
    by the European Union under FP7 ICT-2009.4.1 (Grant Agreement number 270137)
  • XCORRSOUND is copyright 2012 State and University Library, Denmark, and is released under GPLv2; see ./COPYING or http://www.gnu.org/licenses/gpl-2.0.html