@node Available tools
@chapter Available tools
@c TODO: I don't like this title. -gp
The main goal of Marsyas is to provide an extensible framework that
can be used to quickly design and experiment with audio analysis and
synthesis applications. The tools provided with the distribution,
although useful, are only representative examples of what can be
achieved using the provided components. Their primary purpose is to
provide source code of working applications.
The executable files may be found in @file{bin/release/}, while the
source code for those files is in @file{apps/@{DIR@}/}.
@WANTED{descriptions of all programs; I think we cover about 30% of them}
@menu
* Collections and input files::
* Simple Soundfile Interaction::
* Feature Extraction::
* Synthesis::
* Marsystem Interaction::
* All of the above::
* Regression tests::
@end menu
@node Collections and input files
@section Collections and input files
Many Marsyas tools can operate on individual soundfiles or
collections of soundfiles. A collection is a simple text
file that contains a list of soundfiles.
@menu
* Creating collections manually::
* mkcollection::
@end menu
@node Creating collections manually
@subsection Creating collections manually
A simple way to create a collection is to use the Unix @code{ls} command.
For example:
@example
ls /home/gtzan/data/sound/reggae/*.wav > reggae.mf
@end example
@noindent
@code{reggae.mf} will look like this:
@example
/home/gtzan/data/sound/reggae/foo.wav
/home/gtzan/data/sound/reggae/bar.wav
@end example
Any text editor can be used to create collection files. The only
constraint is that the name of the collection file must have a
@code{.mf} extension, such as @code{reggae.mf}. In addition, any
line starting with the @code{#} character is ignored. On Windows
(Visual Studio builds), change the slash character separating
directories appropriately.
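The rules above can be illustrated with a minimal Python sketch of
a collection reader. It is not the parser used by Marsyas itself; the
function name @code{read_collection} is hypothetical, and skipping blank
lines is an assumption that the text above does not state.

```python
import os
import tempfile


def read_collection(path):
    """Return the soundfile entries listed in a .mf collection file."""
    if not path.endswith(".mf"):
        raise ValueError("collection files must use the .mf extension")
    entries = []
    with open(path) as handle:
        for raw in handle:
            # Lines starting with '#' are ignored, as described in the manual.
            if raw.startswith("#"):
                continue
            entry = raw.strip()
            # Assumption: blank lines carry no entry and are skipped.
            if not entry:
                continue
            entries.append(entry)
    return entries


if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "reggae.mf")
    with open(demo, "w") as handle:
        handle.write("# reggae clips\n")
        handle.write("/home/gtzan/data/sound/reggae/foo.wav\n")
        handle.write("/home/gtzan/data/sound/reggae/bar.wav\n")
    for entry in read_collection(demo):
        print(entry)
```

Run as a script, the sketch writes a small @file{reggae.mf} into a
temporary directory and prints the two soundfile paths it contains.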
@subsection Labels
Labels may be added to collections by appending tab-separated labels
after each sound file:
@example
/home/gtzan/data/sound/reggae/foo.wav \t music
/home/gtzan/data/sound/reggae/bar.wav \t speech
@end example
This allows you to create a @qq{master} collection which includes
different kinds of labelled sound files:
@example
cat music.mf speech.mf > all.mf
@end example
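As a rough illustration of the labelled format, the following Python
sketch appends a tab-separated label to each path and splits a line back
into its path and label. The helper names @code{label_entries} and
@code{parse_entry} are hypothetical, and stripping the spaces around the
tab is an assumption based on the example above.

```python
def label_entries(paths, label):
    """Append a tab-separated label to each soundfile path."""
    return [path + "\t" + label for path in paths]


def parse_entry(line):
    """Split one collection line into (path, label); label may be None."""
    path, tab, label = line.rstrip("\n").partition("\t")
    # Assumption: spaces around the tab are not significant.
    return path.strip(), (label.strip() if tab else None)


if __name__ == "__main__":
    music = label_entries(["/home/gtzan/data/sound/reggae/foo.wav"], "music")
    speech = label_entries(["/home/gtzan/data/sound/reggae/bar.wav"], "speech")
    # Concatenating the labelled lists mirrors `cat music.mf speech.mf`.
    for line in music + speech:
        print(parse_entry(line))
```

Unlabelled lines yield @code{None} as the label, so the same function
can read both plain and labelled collections.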
@node mkcollection
@subsection @code{mkcollection}
@cindex mkcollection
@code{mkcollection} is a simple utility for creating collection
files. To create a collection of all the audio files residing
in a directory, the following command can be used:
@example
mkcollection -c reggae.mf -l music /home/gtzan/data/sound/
@end example
This also labels the data as @samp{music}.
All the soundfiles residing in that directory or any of its
subdirectories will be added to the collection. @code{mkcollection}
will only add files with @code{.wav} and @code{.au} extensions, but
does not check that they are valid soundfiles. In general,
collection files should contain soundfiles with the same sampling rate,
as Marsyas does not perform automatic sampling rate conversion. The
exception to this rule is collections that mix files at 22050Hz and
44100Hz sampling rates, in which case the 44100Hz files are downsampled
to 22050Hz. No implicit downsampling is performed on collections that
contain only 44100Hz files.
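Since sampling rates are not converted automatically in general, it can
be useful to check a set of files before putting them in a collection.
The following sketch uses only the Python standard library @code{wave}
module, so it handles @code{.wav} files only. The function names are
hypothetical, and @code{write_silence} exists purely to create demo
input; none of this is part of Marsyas.

```python
import os
import tempfile
import wave


def write_silence(path, rate, frames=100):
    """Write a short mono 16-bit silent WAV file (demo helper only)."""
    with wave.open(path, "wb") as sound:
        sound.setnchannels(1)
        sound.setsampwidth(2)
        sound.setframerate(rate)
        sound.writeframes(b"\x00\x00" * frames)


def rates_by_file(wav_paths):
    """Map each WAV path to the sampling rate stored in its header."""
    rates = dict()
    for path in wav_paths:
        with wave.open(path, "rb") as sound:
            rates[path] = sound.getframerate()
    return rates


def distinct_rates(wav_paths):
    """Return the sorted list of sampling rates found in the files."""
    return sorted(set(rates_by_file(wav_paths).values()))


if __name__ == "__main__":
    folder = tempfile.mkdtemp()
    low = os.path.join(folder, "low.wav")
    high = os.path.join(folder, "high.wav")
    write_silence(low, 22050)
    write_silence(high, 44100)
    # More than one rate means the collection mixes sampling rates.
    print(distinct_rates([low, high]))
```

A result with more than one entry indicates a mixed collection; only
the 22050Hz and 44100Hz mix is described above as being handled.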
@node Simple Soundfile Interaction
@section Simple Soundfile Interaction
@menu
* sfinfo::
* sfplay::
@end menu
@node sfinfo
@subsection @code{sfinfo}
@cindex sfinfo
@code{sfinfo} is a simple command-line utility for displaying
information about a soundfile. It is also a simple
example of how printing out the controls can show
information such as the number of channels and the sampling rate.
@example
sfinfo foo.wav
@end example
@node sfplay
@subsection @code{sfplay}
@cindex sfplay
@code{sfplay} is a flexible command-line soundfile player that allows
playback of multiple soundfiles in various formats with either
real-time audio output or soundfile output. The first two examples
below show two extremes of using @code{sfplay}. The first simply plays
@file{foo.wav}. The second plays 3.2-second clips (@code{-l}) starting
10.0 seconds (@code{-s}) into each file, repeats each clip 2.5 times
(@code{-r}) at half volume (@code{-g}), and writes the output to
@file{output.wav} (@code{-f}). The third plays a clip of each file in
the collection @file{reggae.mf}. The last command stores the MarSystem
dataflow network used in @code{sfplay} as a plugin in
@file{playback.mpl}. The plugin is essentially a textual description of
the created network. Because MarSystems can be created at run time, the
network can be loaded into @code{sfplugin}, a generic executable that
flows audio data through any given network. Running
@code{sfplugin -p playback.mpl bar.wav} plays the file @file{bar.wav}
using the created plugin. It is important to note that although
@code{sfplay} and @code{sfplugin} behave the same in this case, they
achieve it very differently: in @code{sfplay} the network is created at
compile time, whereas in @code{sfplugin} it is created at run time.
@example
sfplay foo.wav
sfplay -s 10.0 -l 3.2 -r 2.5 -g 0.5 foo.wav bar.au -f output.wav
sfplay -l 3.0 reggae.mf
sfplay foo.wav -p playback.mpl
@end example
@node Feature Extraction
@section Feature Extraction
@menu
* pitchextract::
* extract::
* bextract::
@end menu
@node pitchextract
@subsection @code{pitchextract}
@cindex pitchextract
@code{pitchextract} is used to extract the fundamental frequency
contour from monophonic audio signals. A simple sinusoidal
synthesis is provided for playback of the resulting contour.
@node extract
@subsection @code{extract}
@cindex extract
@code{extract} is a single-file executable for feature extraction.
It can be used as part of external systems for feature
extraction, so it outputs the results in a simple
tab-separated text file. For more serious feature extraction
over multiple files, see @code{bextract}, which is what I use most
of the time. @code{extract} also serves as an example of a network
of MarSystems with a relatively complicated structure.
The following commands extract a single vector of features
based on the first 30 seconds of the provided
soundfile. By default, the features are computed from the
magnitude of the Short Time Fourier Transform (STFT) (i.e., means
and variances of Spectral Centroid, Rolloff, and Flux). The second
command extracts the means and variances of Mel-Frequency Cepstral
Coefficients.
@example
extract foo.wav
extract -e SVMFCC foo.wav
@end example
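The idea of summarizing per-frame features as a single vector of means
and variances can be sketched in a few lines of Python. This is only an
illustration and not the code used by @code{extract}: the function name
@code{single_vector}, the ordering of the output (all means, then all
variances), the use of the population variance, and the demo values are
all assumptions.

```python
import statistics


def single_vector(frames):
    """Collapse per-frame feature rows into [means..., variances...]."""
    # Transpose so that each column holds one feature across all frames.
    columns = list(zip(*frames))
    means = [statistics.fmean(column) for column in columns]
    # Assumption: population variance; the manual does not say which is used.
    variances = [statistics.pvariance(column) for column in columns]
    return means + variances


if __name__ == "__main__":
    # Two made-up frames with two features each (illustrative values only).
    frames = [[1.0, 2.0], [3.0, 6.0]]
    print(single_vector(frames))
```

However many frames are analysed, the result has a fixed length of
twice the number of features, which is what makes it usable as one
vector per soundfile.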
@node bextract
@subsection @code{bextract}
@cindex bextract
bextract is one of the most powerful executables provided by
Marsyas. It can be used for complete feature extraction and
classification experiments with multiple files. It serves as a
canonical example of how audio analysis algorithms can be expressed in
the framework.
Suppose that you want to build a real-time music/speech discriminator
based on a collection of music files named @file{music.mf} and a
collection of speech files named @file{speech.mf}. These collections
can either be created manually or using the @code{mkcollection}
utility. The following command line will extract means and variances of
Mel-Frequency Cepstral Coefficients (MFCC) over a texture window of 1
second. The results are stored in @file{wekaOut.arff}, a text file of
feature values that can be used in the Weka machine learning
environment for experimentation with different classifiers. While the
features are being extracted, a simple Gaussian classifier is trained.
When feature extraction is completed, the whole network of feature
extraction and classification is stored and can be used for real-time
audio classification directly as a Marsyas plugin. The plugin makes a
classification decision every 20ms but aggregates the results by
majority voting to display output approximately every 1 second. The
whole network is stored in @file{musp_classify.mpl}, which is loaded
into @code{sfplugin}, and a new file named @file{new.wav} is passed
through. The screen output shows the classification results and
confidence.
Users familiar with Marsyas 0.1 will notice that currently the machine
learning part of Marsyas 0.2 is not as sophisticated as that of
0.1. For example, there is no evaluate executable for performing
cross-validation experiments, and the only classifier currently
implemented is a simple multidimensional Gaussian classifier. For my
own research I have been increasingly using Weka for all the machine
learning experiments, so porting this functionality to the new version
is not a high priority. On the other hand, I have a clear notion of how
they can be integrated, and most of the necessary components and APIs
are already in place. Eventually I would like to port most of Weka
into Marsyas, but it will be some time until that happens.
@example
bextract -e STFTMFCC all.mf -p classify.mpl -w wekaOut.arff
sfplugin -p musp_classify.mpl new.wav
@end example
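The aggregation of 20ms decisions into roughly one result per second
can be illustrated with a small Python sketch. It is not the code used
by the plugin: the function names are hypothetical, the window of 50
decisions is derived from 1 second divided by 20ms, and the way ties are
broken is an assumption.

```python
from collections import Counter


def majority_vote(decisions):
    """Return the most frequent label in a non-empty list of decisions."""
    # Assumption: on a tie, the label that appeared first wins.
    return Counter(decisions).most_common(1)[0][0]


def aggregate(decisions, window=50):
    """Reduce per-frame decisions to one label per window of frames."""
    # 50 decisions of 20 ms each cover about 1 second of audio.
    return [
        majority_vote(decisions[start:start + window])
        for start in range(0, len(decisions), window)
    ]


if __name__ == "__main__":
    first_second = ["music"] * 30 + ["speech"] * 20
    next_second = ["speech"] * 50
    print(aggregate(first_second + next_second))
```

Voting over a window smooths out isolated misclassifications, at the
cost of reporting a change of class up to a second late.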
Feature extractors that start with SV produce one feature vector
for each file and can be used for non-realtime classification
such as genre classification. The following command can
be used to generate a Weka file for genre classification.
@example
bextract -e SVSTFT classical.mf jazz.mf rock.mf -w genre.arff
@end example
Currently no classifier is generated for the SV feature extractors,
but it's only a matter of time before this feature is added.
The generated file @file{genre.arff} can then be loaded into Weka,
where classification experiments can be conducted.
@node Synthesis
@section Synthesis
@menu
* phasevocoder::
* sftransform::
@end menu
@node phasevocoder
@subsection @code{phasevocoder}
@cindex phasevocoder
@code{phasevocoder} is probably the most powerful and canonical example
of sound synthesis currently provided by Marsyas. It is based on the
phase vocoder implementation described by F. R. Moore in his book
@qq{Elements of Computer Music}. It is broken into individual
MarSystems in a modular way and can be used for real-time
pitch-shifting and time-scaling.
@example
phasevocoder -p 1.4 -s 100
@end example
@node sftransform
@subsection @code{sftransform}
@cindex sftransform
@code{sftransform} is an example of a doubly nested
network with two FFT/inverse FFT identity transformations.
It's not particularly useful, but it shows how
nested networks can be created.
@node Marsystem Interaction
@section Marsystem Interaction
@menu
* sfplugin::
* msl::
@end menu
@node sfplugin
@subsection @code{sfplugin}
@cindex sfplugin
@code{sfplugin} is the universal executable. Any network of MarSystems
stored as a plugin can be loaded at run time, and sound can flow
through the network. With appropriate plugins, the following examples
will perform playback of @file{foo.wav} and playback with real-time
music/speech classification of @file{foo.wav}.
@example
sfplugin -p plugins/playback.mpl foo.wav
sfplugin -p musp_classify.mpl foo.wav
@end example
@node msl
@subsection @code{msl}
@cindex msl
One of the most useful and powerful characteristics of Marsyas
is the ability to create and combine MarSystems at run time.
@code{msl} (Marsyas scripting language) is a simple interpreter
that can be used to create dataflow networks, adjust controls,
and run sound through the network. It's used as a backend for
user interfaces, so it has limited (or more accurately,
non-existent) editing functionality. The syntax
is being revised, so currently it's more of a proof of concept.
Here is an example of creating a simple network in msl and
playing a sound file:
@example
msl
[ msl ] create Series playbacknet
[ msl ] create SoundFileSource src
[ msl ] create Gain g
[ msl ] create AudioSink dest
[ msl ] add src > playbacknet
[ msl ] add g > playbacknet
[ msl ] add dest > playbacknet
[ msl ] updctrl playbacknet SoundFileSource/src/string/filename technomusic.au
[ msl ] run playbacknet
@end example
The important thing to notice is that both the creation of MarSystems
and their assembly into networks can be done at run time without
having to recompile any code. If anyone would like to pick
a project to do for Marsyas, it would be to use the GNU readline
library for its command-line editing capabilities and try
to come up with some alternative syntax (I have some ideas
in that direction).
@node All of the above
@section All of the above
@subsection Mudbox
@WANTED{funny description of mudbox -- I'm looking at you, George}
Developers interested in contributing to this mess should read
@develref{Playing in the mudbox}.
@node Regression tests
@section Regression tests
Marsyas contains a suite of tests which attempt@footnote{Somewhat
unsuccessfully, unfortunately.} to make sure that new functionality (or
bug fixes) do not cause bugs in existing (working) code. The tests do
not require the Marsyas executables to be installed (by default they
execute @file{bin/release/PROGNAME}), but Marsyas must be
compiled.
The tests also require Python to be installed. Python is installed by
default on Linux and MacOS X machines; Windows users may install it from
@uref{http://python.org}.
Developers interested in the inner workings of these tests may find more
information in @develref{Regression tests (devel)}.
@menu
* Sanity tests::
* Coffee tests::
@end menu
@node Sanity tests
@subsection Sanity tests
These are short, quick tests; on modern hardware they will take less
than 10 seconds to run.
To test, simply run
@example
scripts/regtests_sanity.py
@end example
Results will be printed on the command line, with more information in
the file @file{tests/results.log}.
These tests are not complete by any means, nor are they intended to be.
They are simply a quick test to see if anything fundamental is broken.
Since they @emph{are} so quick and easy to run, please run these tests
often.
@node Coffee tests
@subsection Coffee tests
These are a longer set of tests; they may take up to 5 minutes to run.
They also require files that are not part of the Marsyas svn repository or
tarballs. These files are available on the web; see @ref{Optional programs and
datasets}. The location of these files must be passed to the script,
for example:
@example
scripts/commit_tests.py ../../marsyas-coffee/
@end example
Results will be printed on the command line, with more information in
the file @file{tests/results.log}.
These tests are a useful excuse for taking a coffee break.