ref: 5ac69af1fada28257ab8a5118c46d6680902fb8e
parent: 00c5f228e698c24e9b98f169ef05c4e57449e3b3
author: cbagwell <cbagwell>
date: Sun Apr 30 22:39:11 EDT 2000
Document updates.
--- a/libst.txt
+++ b/libst.txt
@@ -72,8 +72,8 @@
The format structure contains a list of control parameters
for the sample: sampling rate, data size (bytes, words,
- floats, etc.), style (unsigned, signed, logarithmic), num-
- ber of sound channels. It also contains other state
+ floats, etc.), encoding (unsigned, signed, logarithmic),
+ number of sound channels. It also contains other state
information: whether the sample file needs to be byte-
swapped, whether fseek() will work, its suffix, its file
stream pointer, its format pointer, and the private struc-
--- a/sox.txt
+++ b/sox.txt
@@ -23,9 +23,11 @@
Effects:
avg [ -l | -r ]
band [ -n ] center [ width ]
+ bandpass frequency bandwidth
+ bandreject frequency bandwidth
check
chorus gain-in gain-out delay decay speed depth
- -s | -t [ delay decay speed depth -s | -fI-t ]
+ -s | -t [ delay decay speed depth -s | -t ]
compand attack1,decay1[,attack2,decay2...]
in-dB1,out-dB1[,in-dB2,out-dB2...]
[gain] [initial-volume]
@@ -35,13 +37,17 @@
echo gain-in gain-out delay decay [ delay decay ...]
echos gain-in gain-out delay decay [ delay decay ...]
filter [ low ]-[ high ] [ window-len [ beta ]]
- flanger gain-in gain-out delay decay speed -s | -fI-t
+ flanger gain-in gain-out delay decay speed -s | -t
highp center
+ highpass frequency
lowp center
+ lowpass frequency
map
mask
+ pan direction
phaser gain-in gain-out delay decay speed -s | -t
pick
+ pitch shift [ width interpole fade ]
polyphase [ -w < nut / ham > ]
[ -width < long / short / # > ]
[ -cutoff # ]
@@ -49,18 +55,12 @@
resample
reverb gain-out reverb-time delay [ delay ... ]
reverse
+ speed factor
split
stat [ debug | -v ]
- swap [ 1 2 3 4 ]
- vibro speed [ depth ]
-DESCRIPTION
- SoX is a command line program that can convert most popu-
- lar audio files to most other popular audio file formats.
- It can optionally apply a sound effect to the file during
-
December 10, 1999 1
@@ -70,22 +70,31 @@
SoX(1) SoX(1)
+ stretch [ factor [ window fade shift fading ] ]
+ swap [ 1 2 3 4 ]
+ vibro speed [ depth ]
+ vol gain [ type ]
+
+DESCRIPTION
+ SoX is a command line program that can convert most popu-
+ lar audio files to most other popular audio file formats.
+ It can optionally apply a sound effect to the file during
this translation.
- There are two types of audio files formats that SoX can
- work with. The first are self-describing file formats.
- These contain a header that completely describe the char-
+ There are two types of audio file formats that SoX can
+ work with. The first are self-describing file formats.
+ These contain a header that completely describes the char-
acteristics of the audio data that follows.
- The second type are headerless data, or sometimes called
- raw data. A user must pass enough information to SoX on
- the command line so that it knows what type of data it
+ The second type are headerless data, or sometimes called
+ raw data. A user must pass enough information to SoX on
+ the command line so that it knows what type of data it
contains.
- Audio data can usually be totally described by four char-
+ Audio data can usually be totally described by four char-
acteristics:
- rate The sample rate is in samples per second. For
+ rate The sample rate is in samples per second. For
example, CD sample rates are at 44100.
data type What format the data is stored in. Most popular
@@ -92,14 +101,14 @@
are 8-bit or 16-bit words.
data format
- What encoding the data type uses. Examples are
+ What encoding the data type uses. Examples are
u-law, ADPCM, or signed linear data.
- channels How many channels are contained in the audio
- data. Mono and Stereo are the two most common.
+ channels How many channels are contained in the audio
+ data. Mono and Stereo are the two most common.
- Please refer to the soxexam(1) manual page for a long
- description with examples on how to use sox with various
+ Please refer to the soxexam(1) manual page for a long
+ description with examples on how to use sox with various
types of file formats.
OPTIONS
@@ -107,26 +116,17 @@
sox file.au file.voc
- translates a sound file in SUN Sparc .AU format into a
+ translates a sound file in SUN Sparc .AU format into a
SoundBlaster .VOC file, while
sox -v 0.5 file.au -r 12000 file.voc rate
- does the same format translation but also lowers the
- amplitude by 1/2 and changes the sampling rate from 8000
+ does the same format translation but also lowers the
+ amplitude by 1/2 and changes the sampling rate from 8000
hertz to 12000 hertz via the rate sound effect loop.
- Format options:
- Format options effect the audio samples that they immedi-
- ately percede. If they are placed before the input file
- name then they effect the input data. If they are placed
- before the output file name then they will effect the out-
- put data. By taking advantage of this, you can override a
- input file's currupted header or produce an output file
-
-
December 10, 1999 2
@@ -136,6 +136,14 @@
SoX(1) SoX(1)
+ Format options:
+
+ Format options affect the audio samples that they immedi-
+ ately precede. If they are placed before the input file
+ name then they affect the input data. If they are placed
+ before the output file name then they will affect the out-
+ put data. By taking advantage of this, you can override an
+ input file's corrupted header or produce an output file
that is a totally different style than the input file.
-t filetype
@@ -143,53 +151,45 @@
-r rate Give sample rate in Hertz of file. To cause the
output file to have a different sample rate than
- the input file, include this option with the
- appropriate rate value along with the output
- options. If the input and output files have
+ the input file, include this option with the
+ appropriate rate value along with the output
+ options. If the input and output files have
different rates then a sample rate change effect
- must be ran. If a sample rate changing effect
+ must be run. If a sample rate changing effect
is not specified then a default one will be used
with its default parameters.
-s/-u/-U/-A/-a/-i/-g
- The sample data format is signed linear (2's
- complement), unsigned linear, U-law (logarith-
- mic), A-law (logarithmic), ADPCM, IMA_ADPCM, or
- GSM. U-law and A-law are the U.S. and interna-
+ The sample data format is signed linear (2's
+ complement), unsigned linear, U-law (logarith-
+ mic), A-law (logarithmic), ADPCM, IMA_ADPCM, or
+ GSM. U-law and A-law are the U.S. and interna-
tional standards for logarithmic telephone sound
compression. ADPCM is a form of sound compression
- that has a good compromise between good sound
+ that has a good compromise between good sound
quality and fast encoding/decoding time.
- IMA_ADPCM is also a form of adpcm compression,
- slightly simpler and slightly lower fidelity
- than Microsoft's flavor of ADPCM. IMA_ADPCM is
- also called DVI_ADPCM. GSM is a standard used
+ IMA_ADPCM is also a form of adpcm compression,
+ slightly simpler and slightly lower fidelity
+ than Microsoft's flavor of ADPCM. IMA_ADPCM is
+ also called DVI_ADPCM. GSM is a standard used
for telephone sound compression in European
- countries and its gaining popularity because of
+ countries and it's gaining popularity because of
its quality.
-b/-w/-l/-f/-d/-D
- The sample data type is in bytes, 16-bit words,
- 32-bit longwords, 32-bit floats, 64-bit double
- floats, or 80-bit IEEE floats. Floats and dou-
+ The sample data type is in bytes, 16-bit words,
+ 32-bit longwords, 32-bit floats, 64-bit double
+ floats, or 80-bit IEEE floats. Floats and dou-
ble floats are in native machine format.
- -x The sample data is in XINU format; that is, it
- comes from a machine with the opposite word
- order than yours and must be swapped according
- to the word-size given above. Only 16-bit and
- 32-bit integer data may be swapped. Machine-
+ -x The sample data is in XINU format; that is, it
+ comes from a machine with the opposite word
+ order than yours and must be swapped according
+ to the word-size given above. Only 16-bit and
+ 32-bit integer data may be swapped. Machine-
format floating-point data is not portable.
IEEE floats are a fixed, portable format.
- -c channels
- The number of sound channels in the data file.
- This may be 1, 2, or 4; for mono, stereo, or
- quad sound data. To cause the output file to
- have a different number of channels than the
- input file, include this option with the appro-
- raite value with the output file options. If
- the input and output file have a different
@@ -202,16 +202,24 @@
SoX(1) SoX(1)
- number of channels then the avg effect must be
+ -c channels
+ The number of sound channels in the data file.
+ This may be 1, 2, or 4; for mono, stereo, or
+ quad sound data. To cause the output file to
+ have a different number of channels than the
+ input file, include this option with the appro-
+ priate value with the output file options. If
+ the input and output file have a different num-
+ ber of channels then the avg effect must be
used. If the avg effect is not specified on the
- command line it will be invoked with default
+ command line it will be invoked with default
parameters.
General options:
- -e When used after the input file (so that it
- applies to the output file) it allows you to
- avoid giving an output filename and will not
+ -e When used after the input file (so that it
+ applies to the output file) it allows you to
+ avoid giving an output filename and will not
produce an output file. It will apply any spec-
ified effects to the input file. This is mainly
useful with the stat effect but can be used with
@@ -219,109 +227,101 @@
-h Print version number and usage information.
- -p Run in preview mode and run fast. This will
+ -p Run in preview mode and run fast. This will
somewhat speed up sox when the output format has
- a different number of channels and a different
- rate than the input file. The order that the
- effects are run in will be arranged for maximum
+ a different number of channels and a different
+ rate than the input file. The order that the
+ effects are run in will be arranged for maximum
speed and not quality.
-v volume Change amplitude (floating point); less than 1.0
decreases, greater than 1.0 increases. Note: we
- perceive volume logarithmically, not linearly.
+ perceive volume logarithmically, not linearly.
Note: see the stat effect.
- -V Print a description of processing phases. Use-
+ -V Print a description of processing phases. Use-
ful for figuring out exactly how sox is mangling
your sound samples.
FILE TYPES
- SoX uses the file extension of the input and output file
+ SoX uses the file extension of the input and output file
to determine what type of file format to use. This can be
- overriden by specifying the "-t" option on the command
+ overridden by specifying the "-t" option on the command
line.
- The input and output files may be read from standard in
+ The input and output files may be read from standard in
and out. This is done by specifying '-' as the filename.
- File formats which have headers are checked, if that
- header doesn't seem right, the program exits with an
+ File formats which have headers are checked; if that
+ header doesn't seem right, the program exits with an
appropriate message.
- The following file formats are supported:
- .8svx Amiga 8SVX musical instrument description for-
- mat.
- .aiff AIFF files used on Apple IIc/IIgs and SGI.
- Note: the AIFF format supports only one SSND
+ December 10, 1999 4
- December 10, 1999 4
+SoX(1) SoX(1)
+ The following file formats are supported:
-SoX(1) SoX(1)
+ .8svx Amiga 8SVX musical instrument description for-
+ mat.
+ .aiff AIFF files used on Apple IIc/IIgs and SGI.
+ Note: the AIFF format supports only one SSND
chunk. It does not support multiple sound
- chunks, or the 8SVX musical instrument descrip-
+ chunks, or the 8SVX musical instrument descrip-
tion format. AIFF files are multimedia archives
- and and can have multiple audio and picture
- chunks. You may need a separate archiver to
+ and can have multiple audio and picture
+ chunks. You may need a separate archiver to
work with them.
.au SUN Microsystems AU files. There are apparently
- many types of .au files; DEC has invented its
- own with a different magic number and word
+ many types of .au files; DEC has invented its
+ own with a different magic number and word
order. The .au handler can read these files but
- will not write them. Some .au files have valid
- AU headers and some do not. The latter are
- probably original SUN u-law 8000 hz samples.
- These can be dealt with using the .ul format
+ will not write them. Some .au files have valid
+ AU headers and some do not. The latter are
+ probably original SUN u-law 8000 hz samples.
+ These can be dealt with using the .ul format
(see below).
.avr Audio Visual Research
- The AVR format is produced by a number of com-
+ The AVR format is produced by a number of com-
mercial packages on the Mac.
.cdr CD-R
- CD-R files are used in mastering music Compact
+ CD-R files are used in mastering music Compact
Disks. The file format is, as you might expect,
- raw stereo raw unsigned samples at 44khz. But,
+ raw stereo raw unsigned samples at 44khz. But,
there's some blocking/padding oddity in the for-
mat, so it needs its own handler.
.cvs Continuously Variable Slope Delta modulation
- Used to compress speech audio for applications
+ Used to compress speech audio for applications
such as voice mail.
.dat Text Data files
- These files contain a textual representation of
- the sample data. There is one line at the
+ These files contain a textual representation of
+ the sample data. There is one line at the
beginning that contains the sample rate. Subse-
- quent lines contain two numeric data items: the
- time since the beginning of the sample and the
+ quent lines contain two numeric data items: the
+ time since the beginning of the sample and the
sample value. Values are normalized so that the
- maximum and minimum are 1.00 and -1.00. This
+ maximum and minimum are 1.00 and -1.00. This
file format can be used to create data files for
external programs such as FFT analyzers or graph
- routines. SoX can also convert a file in this
- format back into one of the other file formats.
+ routines. SoX can also convert a file in this
+ format back into one of the other file formats.
.gsm GSM 06.10 Lossy Speech Compression
- A standard for compressing speech which is used
- in the Global Standard for Mobil telecommunica-
- tions (GSM). Its good for its purpose, shrink-
- ing audio data size, but it will introduce lots
- of noise when a given sound sample is encoded
- and decoded multiple times. This format is used
- by some voice mail applications. It is rather
- CPU intensive. GSM in sox is optional and
@@ -334,63 +334,63 @@
SoX(1) SoX(1)
- requires access to an external GSM library. To
- see if there is support for gsm run sox -h and
- look for it under the list of supported file
+ A standard for compressing speech which is used
+ in the Global Standard for Mobile telecommunica-
+ tions (GSM). It's good for its purpose, shrink-
+ ing audio data size, but it will introduce lots
+ of noise when a given sound sample is encoded
+ and decoded multiple times. This format is used
+ by some voice mail applications. It is rather
+ CPU intensive. GSM in sox is optional and
+ requires access to an external GSM library. To
+ see if there is support for gsm run sox -h and
+ look for it under the list of supported file
formats.
- .hcom Macintosh HCOM files. These are (apparently)
+ .hcom Macintosh HCOM files. These are (apparently)
Mac FSSD files with some variant of Huffman com-
- pression. The Macintosh has wacky file formats
- and this format handler apparently doesn't han-
+ pression. The Macintosh has wacky file formats
+ and this format handler apparently doesn't han-
dle all the ones it should. Mac users will need
+ their usual arsenal of file converters to deal
+ your usual arsenal of file converters to deal
with an HCOM file under Unix or DOS.
.maud An Amiga format
An IFF-conform sound file type, registered by MS
- MacroSystem Computer GmbH, published along with
- the "Toccata" sound-card on the Amiga. Allows
- 8bit linear, 16bit linear, A-Law, u-law in mono
+ MacroSystem Computer GmbH, published along with
+ the "Toccata" sound-card on the Amiga. Allows
+ 8bit linear, 16bit linear, A-Law, u-law in mono
and stereo.
ossdsp OSS /dev/dsp device driver
This is a pseudo-file type and can be optionally
- compiled into Sox. Run sox -h to see if you
- have support for this file type. When this
- driver is used it allows you to open up the OSS
- /dev/dsp file and configure it to use the same
- data type as passed in to Sox. It works for
- both playing and recording sound samples. When
- playing sound files it attempts to set up the
- OSS driver to use the same format as the input
- file. It is suggested to always override the
+ compiled into Sox. Run sox -h to see if you
+ have support for this file type. When this
+ driver is used it allows you to open up the OSS
+ /dev/dsp file and configure it to use the same
+ data type as passed in to Sox. It works for
+ both playing and recording sound samples. When
+ playing sound files it attempts to set up the
+ OSS driver to use the same format as the input
+ file. It is suggested to always override the
output values to use the highest quality samples
- your sound card can handle. Example: -t ossdsp
+ your sound card can handle. Example: -t ossdsp
-w -s /dev/dsp
.sf IRCAM Sound Files.
- SoundFiles are used by academic music software
- such as the CSound package, and the MixView
+ SoundFiles are used by academic music software
+ such as the CSound package, and the MixView
sound sample editor.
.smp Turtle Beach SampleVision files.
- SMP files are for use with the PC-DOS package
- SampleVision by Turtle Beach Softworks. This
- package is for communication to several MIDI
- samplers. All sample rates are supported by the
- package, although not all are supported by the
- samplers themselves. Currently loop points are
- ignored.
+ SMP files are for use with the PC-DOS package
+ SampleVision by Turtle Beach Softworks. This
+ package is for communication to several MIDI
+ samplers. All sample rates are supported by the
+ package, although not all are supported by the
- sunau Sun /dev/audio device driver
- This is a pseudo-file type and can be optionally
- compiled into Sox. Run sox -h to see if you
- have support for this file type. When this
- driver is used it allows you to open up a Sun
-
December 10, 1999 6
@@ -400,63 +400,63 @@
SoX(1) SoX(1)
+ samplers themselves. Currently loop points are
+ ignored.
+
+ sunau Sun /dev/audio device driver
+ This is a pseudo-file type and can be optionally
+ compiled into Sox. Run sox -h to see if you
+ have support for this file type. When this
+ driver is used it allows you to open up a Sun
/dev/audio file and configure it to use the same
- data type as passed in to Sox. It works for
- both playing and recording sound samples. When
- playing sound files it attempts to set up the
+ data type as passed in to Sox. It works for
+ both playing and recording sound samples. When
+ playing sound files it attempts to set up the
audio driver to use the same format as the input
- file. It is suggested to always override the
+ file. It is suggested to always override the
output values to use the highest quality samples
- your hardware can handle. Example: -t sunau -w
+ your hardware can handle. Example: -t sunau -w
-s /dev/audio or -t sunau -U -c 1 /dev/audio for
older sun equipment.
.txw Yamaha TX-16W sampler.
- A file format from a Yamaha sampling keyboard
- which wrote IBM-PC format 3.5" floppies. Han-
+ A file format from a Yamaha sampling keyboard
+ which wrote IBM-PC format 3.5" floppies. Han-
dles reading of files which do not have the sam-
- ple rate field set to one of the expected by
- looking at some other bytes in the attack/loop
- length fields, and defaulting to 33kHz if the
+ ple rate field set to one of the expected values by
+ looking at some other bytes in the attack/loop
+ length fields, and defaulting to 33kHz if the
sample rate is still unknown.
.vms More info to come.
- Used to compress speech audio for applications
+ Used to compress speech audio for applications
such as voice mail.
.voc Sound Blaster VOC files.
- VOC files are multi-part and contain silence
- parts, looping, and different sample rates for
- different chunks. On input, the silence parts
- are filled out, loops are rejected, and sample
- data with a new sample rate is rejected.
- Silence with a different sample rate is gener-
- ated appropriately. On output, silence is not
+ VOC files are multi-part and contain silence
+ parts, looping, and different sample rates for
+ different chunks. On input, the silence parts
+ are filled out, loops are rejected, and sample
+ data with a new sample rate is rejected.
+ Silence with a different sample rate is gener-
+ ated appropriately. On output, silence is not
detected, nor are impossible sample rates.
.wav Microsoft .WAV RIFF files.
- These appear to be very similar to IFF files,
- but not the same. They are the native sound
+ These appear to be very similar to IFF files,
+ but not the same. They are the native sound
file format of Windows. (Obviously, Windows was
- of such incredible importance to the computer
- industry that it just had to have its own sound
+ of such incredible importance to the computer
+ industry that it just had to have its own sound
file format.) Normally .wav files have all for-
- matting information in their headers, and so do
- not need any format options specified for an
- input file. If any are, they will override the
- file header, and you will be warned to this
+ matting information in their headers, and so do
+ not need any format options specified for an
+ input file. If any are, they will override the
+ file header, and you will be warned to this
effect. You had better know what you are doing!
- Output format options will cause a format con-
- version, and the .wav will written appropri-
- ately. Sox currently can read PCM, ULAW, ALAW,
- MS ADPCM, and IMA (or DVI) ADPCM. It can write
- all of these formats including (NEW!) the ADPCM
- styles.
- .wve Psion 8-bit alaw
-
December 10, 1999 7
@@ -466,63 +466,63 @@
SoX(1) SoX(1)
- These are 8-bit a-law 8khz sound files used on
+ Output format options will cause a format con-
+ version, and the .wav will be written appropri-
+ ately. Sox currently can read PCM, ULAW, ALAW,
+ MS ADPCM, and IMA (or DVI) ADPCM. It can write
+ all of these formats including (NEW!) the ADPCM
+ encoding.
+
+ .wve Psion 8-bit alaw
+ These are 8-bit a-law 8khz sound files used on
the Psion palmtop portable computer.
.raw Raw files (no header).
- The sample rate, size (byte, word, etc), and
- style (signed, unsigned, etc.) of the sample
- file must be given. The number of channels
+ The sample rate, size (byte, word, etc), and
+ encoding (signed, unsigned, etc.) of the sample
+ file must be given. The number of channels
defaults to 1.
.ub, .sb, .uw, .sw, .ul, .sl
- These are several suffices which serve as a
- shorthand for raw files with a given size and
- style. Thus, ub, sb, uw, sw, ul and sl corre-
- spond to "unsigned byte", "signed byte",
- "unsigned word", "signed word", "ulaw" (byte),
- and "signed long". The sample rate defaults to
+ These are several suffixes which serve as a
+ shorthand for raw files with a given size and
+ encoding. Thus, ub, sb, uw, sw, ul and sl cor-
+ respond to "unsigned byte", "signed byte",
+ "unsigned word", "signed word", "ulaw" (byte),
+ and "signed long". The sample rate defaults to
8000 hz if not explicitly set, and the number of
- channels (as always) defaults to 1. There are
- lots of Sparc samples floating around in u-law
+ channels (as always) defaults to 1. There are
+ lots of Sparc samples floating around in u-law
format with no header and fixed at a sample rate
- of 8000 hz. (Certain sound management software
+ of 8000 hz. (Certain sound management software
cheerfully ignores the headers.) Similarly,
most Mac sound files are in unsigned byte format
with a sample rate of 11025 or 22050 hz.
- .auto This is a ``meta-type'': specifying this type
- for an input file triggers some code that tries
- to guess the real type by looking for magic
- words in the header. If the type can't be
- guessed, the program exits with an error mes-
- sage. The input must be a plain file, not a
+ .auto This is a ``meta-type'': specifying this type
+ for an input file triggers some code that tries
+ to guess the real type by looking for magic
+ words in the header. If the type can't be
+ guessed, the program exits with an error mes-
+ sage. The input must be a plain file, not a
pipe. This type can't be used for output files.
EFFECTS
Only one effect from the palette may be applied to a sound
- sample. To do multiple effects you'll need to run sox in
+ sample. To do multiple effects you'll need to run sox in
a pipeline.
avg [ -l | -r ]
- Reduce the number of channels by averaging the
- samples, or duplicate channels to increase the
- number of channels. This effect is automati-
- cally used when the number of input samples dif-
- fer from the number of output channels. When
- reducing the number of channels it is possible
- to manually specify the avg effect and use the
- -l and -r options to select only the left or
- right channel for the output instead of averag-
- ing the two channels.
+ Reduce the number of channels by averaging the
+ samples, or duplicate channels to increase the
+ number of channels. This effect is automati-
+ cally used when the number of input channels
+ differs from the number of output channels. When
+ reducing the number of channels it is possible
+ to manually specify the avg effect and use the
- band [ -n ] center [ width ]
- Apply a band-pass filter. The frequency
- response drops logarithmically around the center
- frequency. The width gives the slope of the
-
December 10, 1999 8
@@ -532,23 +532,32 @@
SoX(1) SoX(1)
- drop. The frequencies at center + width and
- center - width will be half of their original
+ -l and -r options to select only the left or
+ right channel for the output instead of averag-
+ ing the two channels.
+
+ band [ -n ] center [ width ]
+ Apply a band-pass filter. The frequency
+ response drops logarithmically around the center
+ frequency. The width gives the slope of the
+ drop. The frequencies at center + width and
+ center - width will be half of their original
amplitudes. Band defaults to a mode oriented to
pitched signals, i.e. voice, singing, or instru-
- mental music. The -n (for noise) option uses
- the alternate mode for un-pitched signals.
- Warning: -n introduces a power-gain of about
- 11dB in the filter, so beware of output clip-
+ mental music. The -n (for noise) option uses
+ the alternate mode for un-pitched signals.
+ Warning: -n introduces a power-gain of about
+ 11dB in the filter, so beware of output clip-
ping. Band introduces noise in the shape of the
filter, i.e. peaking at the center frequency and
- settling around it. See filter for a bandpass
+ settling around it. See filter for a bandpass
effect with steeper shoulders.
- bandpass Butterworth bandpass filter. Description coming
+ bandpass frequency bandwidth
+ Butterworth bandpass filter. Description coming
soon!
- bandreject
+ bandreject frequency bandwidth
Butterworth bandreject filter. Description com-
ing soon!
@@ -555,10 +564,10 @@
chorus gain-in gain-out delay decay speed depth
-s | -t [ delay decay speed depth -s | -t ... ]
- Add a chorus to a sound sample. Each quadtuple
- delay/decay/speed/depth gives the delay in mil-
- liseconds and the decay (relative to gain-in)
- with a modulation speed in Hz using depth in
+ Add a chorus to a sound sample. Each quadruple
+ delay/decay/speed/depth gives the delay in mil-
+ liseconds and the decay (relative to gain-in)
+ with a modulation speed in Hz using depth in
milliseconds. The modulation is either sinusoidal
(-s) or triangular (-t). Gain-out is the volume
of the output.
@@ -568,24 +577,15 @@
in-dB1,out-dB1[,in-dB2,out-dB2...]
[gain] [initial-volume]
- Compand (compress or expand) the dynamic range
- of a sample. The attack and decay time specify
- the integration time over which the absolute
- value of the input signal is integrated to
- determine its volume. Where more than one pair
- of attack/decay parameters are specified, each
- channel is treated separately and the number of
- pairs must agree with the number of input chan-
- nels. The second parameter is a list of points
- on the compander's transfer function specified
- in dB relative to the maximum possible signal
- amplitude. The input values must be in a
- strictly increasing order but the transfer func-
- tion does not have to be monotonically rising.
- The special value -inf may be used to indicate
- that the input volume should be associated out-
- put volume. The points -inf,-inf and 0,0 are
- assumed; the latter may be overridden, but the
+ Compand (compress or expand) the dynamic range
+ of a sample. The attack and decay time specify
+ the integration time over which the absolute
+ value of the input signal is integrated to
+ determine its volume. Where more than one pair
+ of attack/decay parameters are specified, each
+ channel is treated separately and the number of
+ pairs must agree with the number of input chan-
+ nels. The second parameter is a list of points
@@ -598,36 +598,45 @@
SoX(1) SoX(1)
- former may not. The third (optional) parameter
- is a postprocessing gain in dB which is applied
+ on the compander's transfer function specified
+ in dB relative to the maximum possible signal
+ amplitude. The input values must be in a
+ strictly increasing order but the transfer func-
+ tion does not have to be monotonically rising.
+ The special value -inf may be used to indicate
+ that the input volume should be associated out-
+ put volume. The points -inf,-inf and 0,0 are
+ assumed; the latter may be overridden, but the
+ former may not. The third (optional) parameter
+ is a postprocessing gain in dB which is applied
after the compression has taken place; the
fourth (optional) parameter is an initial volume
- to be assumed for each channel when the effect
+ to be assumed for each channel when the effect
starts. This permits the user to supply a nomi-
- nal level initially, so that, for example, a
+ nal level initially, so that, for example, a
very large gain is not applied to initial signal
levels before the companding action has begun to
- operate: it is quite probable that in such an
- event, the output would be severely clipped
- while the compander gain properly adjusts
+ operate: it is quite probable that in such an
+ event, the output would be severely clipped
+ while the compander gain properly adjusts
itself.
copy Copy the input file to the output file. This is
- the default effect if both files have the same
+ the default effect if both files have the same
sampling rate.
cut loopnumber
Extract loop #N from a sample.
- deemph Apply a treble attenuation shelving filter to
+ deemph Apply a treble attenuation shelving filter to
samples in audio cd format. The frequency
- response of pre-emphasized recordings is recti-
- fied. The filtering is defined in the standard
+ response of pre-emphasized recordings is recti-
+ fied. The filtering is defined in the standard
document ISO 908.
echo gain-in gain-out delay decay [ delay decay ... ]
Add echoing to a sound sample. Each delay/decay
- part gives the delay in milliseconds and the
+ part gives the delay in milliseconds and the
decay (relative to gain-in) of that echo. Gain-
out is the volume of the output.
@@ -634,27 +643,18 @@
echos gain-in gain-out delay decay [ delay decay ... ]
Add a sequence of echos to a sound sample. Each
delay/decay part gives the delay in milliseconds
- and the decay (relative to gain-in) of that
+ and the decay (relative to gain-in) of that
echo. Gain-out is the volume of the output.
filter [ low ]-[ high ] [ window-len [ beta ] ]
Apply a Sinc-windowed lowpass, highpass, or
- bandpass filter of given window length to the
- signal. low refers to the frequency of the
- lower 6dB corner of the filter. high refers to
- the frequency of the upper 6dB corner of the
- filter.
+ bandpass filter of given window length to the
+ signal. low refers to the frequency of the
+ lower 6dB corner of the filter. high refers to
+ the frequency of the upper 6dB corner of the
- A lowpass filter is obtained by leaving low
- unspecified, or 0. A highpass filter is
- obtained by leaving high unspecified, or 0, or
- greater than or equal to the Nyquist frequency.
- The window-len, if unspecified, defaults to 128.
- Longer windows give a sharper cutoff, smaller
-
-
December 10, 1999 10
@@ -664,63 +664,63 @@
SoX(1) SoX(1)
+ filter.
+
+ A lowpass filter is obtained by leaving low
+ unspecified, or 0. A highpass filter is
+ obtained by leaving high unspecified, or 0, or
+ greater than or equal to the Nyquist frequency.
+
+ The window-len, if unspecified, defaults to 128.
+ Longer windows give a sharper cutoff, smaller
windows a more gradual cutoff.
- The beta, if unspecified, defaults to 16. This
- selects a Kaiser window. You can select a Nut-
- tall window by specifying anything <= 2.0 here.
- For more discussion of beta, look under the
+ The beta, if unspecified, defaults to 16. This
+ selects a Kaiser window. You can select a Nut-
+ tall window by specifying anything <= 2.0 here.
+ For more discussion of beta, look under the
resample effect.
flanger gain-in gain-out delay decay speed -s | -t
- Add a flanger to a sound sample. Each triple
- delay/decay/speed gives the delay in millisec-
- onds and the decay (relative to gain-in) with a
+ Add a flanger to a sound sample. Each triple
+ delay/decay/speed gives the delay in millisec-
+ onds and the decay (relative to gain-in) with a
modulation speed in Hz. The modulation is
- either sinodial (-s) or triangular (-t). Gain-
+ either sinodial (-s) or triangular (-t). Gain-
out is the volume of the output.
highp center
- Apply a high-pass filter. The frequency
- response drops logarithmically with center fre-
- quency in the middle of the drop. The slope of
- the filter is quite gentle. See filter for a
+ Apply a high-pass filter. The frequency
+ response drops logarithmically with center fre-
+ quency in the middle of the drop. The slope of
+ the filter is quite gentle. See filter for a
highpass effect with sharper cutoff.
- highpass Butterworth highpass filter. Description com-
+ highpass frequency
+ Butterworth highpass filter. Description com-
ming soon!
lowp center
Apply a low-pass filter. The frequency response
- drops logarithmically with center frequency in
+ drops logarithmically with center frequency in
the middle of the drop. The slope of the filter
- is quite gentle. See filter for a lowpass
+ is quite gentle. See filter for a lowpass
effect with sharper cutoff.
- lowpass Butterworth lowpass filter. Description coming
+ lowpass frequency
+ Butterworth lowpass filter. Description coming
soon!
map Display a list of loops in a sample, and miscel-
laneous loop info.
- mask Add "masking noise" to signal. This effect
- deliberately adds white noise to a sound in
- order to mask quantization effects, created by
- the process of playing a sound digitally. It
- tends to mask buzzing voices, for example. It
- adds 1/2 bit of noise to the sound file at the
- output bit depth.
+ mask Add "masking noise" to signal. This effect
+ deliberately adds white noise to a sound in
+ order to mask quantization effects, created by
- phaser gain-in gain-out delay decay speed -s | -t
- Add a phaser to a sound sample. Each triple
- delay/decay/speed gives the delay in millisec-
- onds and the decay (relative to gain-in) with a
- modulation speed in Hz. The modulation is
- either sinodial (-s) or triangular (-t). The
-
December 10, 1999 11
@@ -730,6 +730,33 @@
SoX(1) SoX(1)
+ the process of playing a sound digitally. It
+ tends to mask buzzing voices, for example. It
+ adds 1/2 bit of noise to the sound file at the
+ output bit depth.
+
+ pan direction
+ Pan the sound of an audio file from one channel
+ to another. This is done by changing the volume
+ of the input channels so that it fades out on
+ one channel and fades in on another. If the
+ number of input channels is different from the
+ number of output channels then this effect tries
+ to intelligently handle this. For instance, if
+ the input contains 1 channel and the output con-
+ tains 2 channels, then it will create the miss-
+ ing channel itself. The direction is a value
+ from -1.0 to 1.0. -1.0 represents far left and
+ 1.0 represents far right. Numbers in between
+ will start the pan effect without totally muting
+ the opposite channel.
+
+ phaser gain-in gain-out delay decay speed -s | -t
+ Add a phaser to a sound sample. Each triple
+ delay/decay/speed gives the delay in millisec-
+ onds and the decay (relative to gain-in) with a
+ modulation speed in Hz. The modulation is
+ either sinodial (-s) or triangular (-t). The
decay should be less than 0.5 to avoid feedback.
Gain-out is the volume of the output.
@@ -737,6 +764,18 @@
sample, or one of four channels in a quadro-
phonic sample.
+ pitch shift [ width interpole fade ]
+ Change the pitch of a file without affecting its
+ duration by cross-fading shifted samples. shift
+ is given in cents. Use a positive value to shift
+ to treble, a negative value to shift to bass.
+ Default shift is 0. width of window is in ms.
+ Default width is 20ms. Try 30ms to lower pitch,
+ and 10ms to raise pitch. The interpole option
+ can be "cubic" or "linear". Default is "cubic".
+ The fade option can be "cos", "hamming",
+ "linear" or "trapezoid". Default is "cos".
+
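Since shift is given in cents, it may help to see what frequency ratio a given value corresponds to. The sketch below uses the standard definition of a cent (1200 cents per octave). The function name cents_to_ratio is hypothetical and is not part of SoX; it only illustrates the unit, not how the pitch effect is implemented.

```python
def cents_to_ratio(cents: float) -> float:
    """Frequency ratio for a pitch shift in cents (1200 cents = one octave).

    Hypothetical helper for illustration only; not a SoX function.
    """
    return 2.0 ** (cents / 1200.0)


if __name__ == "__main__":
    # Positive values shift toward treble, negative values toward bass.
    for cents in (-1200, -100, 0, 100, 1200):
        print(f"{cents:+5d} cents -> ratio {cents_to_ratio(cents):.5f}")
```

A shift of 100 cents is one equal-tempered semitone, so a value of 1200 raises the pitch by one octave and -1200 lowers it by one octave.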
polyphase [ -w < nut / ham > ]
[ -width < long / short / # > ]
@@ -743,51 +782,78 @@
[ -cutoff # ]
Translate input sampling rate to output sampling
- rate via polyphase interpolation, a DSP algo-
- rithm. This method is slow and uses lots of
+ rate via polyphase interpolation, a DSP algo-
+ rithm. This method is slow and uses lots of
+
+
+
+ December 10, 1999 12
+
+
+
+
+
+SoX(1) SoX(1)
+
+
RAM, but gives much better results than rate.
- -w < nut / ham > : select either a Nuttal (~90
- dB stopband) or Hamming (~43 dB stopband) win-
+ -w < nut / ham > : select either a Nuttal (~90
+ dB stopband) or Hamming (~43 dB stopband) win-
dow. Default is nut.
- -width long / short / # : specify the (approxi-
- mate) width of the filter. long is 1024 sam-
- ples; short is 128 samples. Alternatively, an
+ -width long / short / # : specify the (approxi-
+ mate) width of the filter. long is 1024 sam-
+ ples; short is 128 samples. Alternatively, an
exact number can be used. Default is long. The
- short option is not recommended, as it produces
+ short option is not recommended, as it produces
poor quality results.
- -cutoff # : specify the filter cutoff frequency
- in terms of fraction of bandwidth. If upsam-
+ -cutoff # : specify the filter cutoff frequency
+ in terms of fraction of bandwidth. If upsam-
pling, then this is the fraction of the original
signal that should go through. If downsampling,
- this is the fraction of the signal left after
- downsampling. Default is 0.95. Remember that
+ this is the fraction of the signal left after
+ downsampling. Default is 0.95. Remember that
this is a float.
rate Translate input sampling rate to output sampling
- rate via linear interpolation to the Least Com-
+ rate via linear interpolation to the Least Com-
mon Multiple of the two sampling rates. This is
the default effect if the two files have differ-
- ent sampling rates and the preview options was
+ ent sampling rates and the preview options was
specified. This is fast but noisy: the spectrum
- of the original sound will be shifted upwards
- and duplicated faintly when up-translating by a
+ of the original sound will be shifted upwards
+ and duplicated faintly when up-translating by a
multiple. Lerp-ing is acceptable for cheap
- 8-bit sound hardware, but for CD-quality sound
- you should instead use either resample or
- polyphase. If you are wondering which of SoX's
- rate changing effects to use, you will want to
- read a detailed analysis of all of them at
- http://eakaw2.et.tu-dresden.de/~andreas/resam-
+ 8-bit sound hardware, but for CD-quality sound
+ you should instead use either resample or
+ polyphase. If you are wondering which of SoX's
+ rate changing effects to use, you will want to
+ read a detailed analysis of all of them at
+ http://eakaw2.et.tu-dresden.de/~wilde/resam-
ple/resample.html [Nov,1999: These tests need to
- be updated for sox-12.17, which has bugfixes to
+ be updated for sox-12.17, which has bugfixes to
the resample and polyphase code.]
+ resample [ -qs | -q | -ql ] [ rolloff [ beta ] ]
+ Translate input sampling rate to output sampling
+ rate via simulated analog filtration. This
+ method is slower than rate, but gives much bet-
+ ter results.
+ The -qs, -q, or -ql options specify increased
+ accuracy at the cost of lower execution speed.
+ By default, linear interpolation is used, with a
+ window width about 45 samples at the lower rate.
+ This gives an accuracy of about 16 bits, but
+ insufficient stopband rejection in the case that
+ you want to have rolloff greater than about 0.80
+ of the Nyquist frequency. The -q* options use
+ quadratic interpolation of filter coefficients,
+ resulting in about 24 bits precision.
- December 10, 1999 12
+ December 10, 1999 13
@@ -796,23 +862,7 @@
SoX(1) SoX(1)
- resample [ -qs | -q | -ql ] [ rolloff [ beta ] ]
- Translate input sampling rate to output sampling
- rate via simulated analog filtration. This
- method is slower than rate, but gives much bet-
- ter results.
-
- The -qs, -q, or -ql options specify increased
- accuracy at the cost of lower execution speed.
- By default, linear interpolation is used, with a
- window width about 45 samples at the lower rate.
- This gives an accuracy of about 16 bits, but
- insufficient stopband rejection in the case that
- you want to have rolloff greater than about 0.80
- of the Nyquist frequency. The -q* options use
- quadratic interpolation of filter coefficients,
- resulting in about 24 bits precision.
- Following is a table of the reasonable defaults
+ Following is a table of the reasonable defaults
which are built-in to sox:
Option Window rolloff beta interpolation
------ ------ ------- ---- -------------
@@ -822,63 +872,62 @@
-ql 149 0.94 16 quadratic
------ ------ ------- ---- -------------
-qs, -q, or -ql use window lengths of 45, 75, or
- 149 samples, respectively, at the lower sample-
+ 149 samples, respectively, at the lower sample-
rate of the two files. This means progressively
- sharper stop-band rejection, at proportionally
+ sharper stop-band rejection, at proportionally
slower execution times.
- rolloff refers to the cut-off frequency of the
- low pass filter and is given in terms of the
- Nyquist frequency for the lower sample rate.
+ rolloff refers to the cut-off frequency of the
+ low pass filter and is given in terms of the
+ Nyquist frequency for the lower sample rate.
rolloff therefore should be something between 0.
- and 1., in practice 0.8-0.95. The defaults are
+ and 1., in practice 0.8-0.95. The defaults are
indicated above.
The beta parameter determines the type of filter
- window used. Any value greater than 2.0 is the
+ window used. Any value greater than 2.0 is the
beta for a Kaiser window. Beta <= 2.0 selects a
- Nuttall window. If unspecified, the default is
+ Nuttall window. If unspecified, the default is
a Kaiser window with beta 16.
In the case of Kaiser window (beta > 2.0), lower
- betas produce a somewhat faster transition from
- passband to stopband, at the cost of noticeable
- artifacts. A beta of 16 is the default, beta
- less than 10 is not recommended. If you want a
- sharper cutoff, don't use low beta's, use a
+ betas produce a somewhat faster transition from
+ passband to stopband, at the cost of noticeable
+ artifacts. A beta of 16 is the default, beta
+ less than 10 is not recommended. If you want a
+ sharper cutoff, don't use low beta's, use a
longer sample window. A Nuttall window is
- selected by specifying any 'beta' <= 2, and the
- Nuttall window has somewhat steeper cutoff than
- the default Kaiser window. You will probably
+ selected by specifying any 'beta' <= 2, and the
+ Nuttall window has somewhat steeper cutoff than
+ the default Kaiser window. You will probably
+ not need to use the beta parameter at all,
+ unless you are just curious about comparing the
+ effects of Nuttall vs. Kaiser windows.
+ This is the default effect if the two files have
+ different sampling rates. Default parameters
+ are, as indicated above, Kaiser window of length
+ 45, rolloff 0.80, beta 16, linear interpolation.
+ NOTE: -qs is only slightly slower, but more
+ accurate for 16-bit or higher precision.
- December 10, 1999 13
+ NOTE: In many cases of up-sampling, no interpo-
+ lation is needed, as exact filter coefficients
+ can be computed in a reasonable amount of space.
+ To be precise, this is done when
+ December 10, 1999 14
-SoX(1) SoX(1)
- not need to use the beta parameter at all,
- unless you are just curious about comparing the
- effects of Nuttall vs. Kaiser windows.
- This is the default effect if the two files have
- different sampling rates. Default parameters
- are, as indicated above, Kaiser window of length
- 45, rolloff 0.80, beta 16, linear interpolation.
+SoX(1) SoX(1)
- NOTE: -qs is only slightly slower, but more
- accurate for 16-bit or higher precision.
- NOTE: In many cases of up-sampling, no interpo-
- lation is needed, as exact filter coefficients
- can be computed in a reasonable amount of space.
- To be precise, this is done when
-
input_rate < output_rate
&&
output_rate/gcd(input_rate,output_rate) <= 511
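The condition quoted above can be checked for a given pair of rates with a few lines of Python. This is only a sketch of the stated inequality: the function name uses_exact_coefficients is hypothetical, and integer sample rates in Hz are assumed.

```python
from math import gcd


def uses_exact_coefficients(input_rate: int, output_rate: int) -> bool:
    """Evaluate the up-sampling condition quoted in the manual text.

    Hypothetical helper; it mirrors the inequality only, not SoX's code.
    """
    if input_rate <= 0 or output_rate <= 0:
        raise ValueError("sample rates must be positive integers")
    reduced = output_rate // gcd(input_rate, output_rate)
    return input_rate < output_rate and reduced <= 511


if __name__ == "__main__":
    for rates in ((22050, 44100), (44100, 48000), (11025, 48000), (44100, 22050)):
        print(rates, uses_exact_coefficients(*rates))
```

For example, 44100 to 48000 reduces to 160 and satisfies the condition, while 11025 to 48000 reduces to 640 and does not.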
@@ -885,63 +934,104 @@
reverb gain-out delay [ delay ... ]
Add reverberation to a sound sample. Each delay
- is given in milliseconds and its feedback is
- depending on the reverb-time in milliseconds.
- Each delay should be in the range of half to
+ is given in milliseconds and its feedback is
+ depending on the reverb-time in milliseconds.
+ Each delay should be in the range of half to
quarter of reverb-time to get a realistic rever-
beration. Gain-out is the volume of the output.
- reverse Reverse the sound sample completely. Included
+ reverse Reverse the sound sample completely. Included
for finding Satanic subliminals.
+ speed factor
+ Speed the sound up or down, like a magnetic tape
+ with a speed control. It affects both pitch and
+ time. A factor of 1.0 means no change, and is
+ the default. 2.0 doubles speed, thus time
+ length is cut by a half and pitch is one octave
+ higher. 0.5 halves speed thus time length dou-
+ bles and pitch is one octave lower.
+
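The relationship between the speed factor, the resulting duration and the pitch change described above can be worked through numerically. The following is a minimal sketch; speed_result is a hypothetical name, and the octave figure simply follows from the statement that doubling the speed raises the pitch by one octave.

```python
import math


def speed_result(duration_s: float, factor: float) -> tuple[float, float]:
    """Return (new duration in seconds, pitch change in octaves).

    Hypothetical helper for illustration; not part of SoX.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return duration_s / factor, math.log2(factor)


if __name__ == "__main__":
    for factor in (0.5, 1.0, 2.0):
        duration, octaves = speed_result(10.0, factor)
        print(f"factor {factor}: {duration:.1f} s, {octaves:+.1f} octave(s)")
```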
split Turn a mono sample into a stereo sample by copy-
- ing the input channel to the left and right
+ ing the input channel to the left and right
channels.
stat [ debug | -v ]
- Do a statistical check on the input file, and
- print results on the standard error file. stat
- may copy the file untouched from input to out-
- put, if you select an output file. The "Volume
- Adjustment:" field in the statistics gives you
- the argument to the -v number which will make
+ Do a statistical check on the input file, and
+ print results on the standard error file. stat
+ may copy the file untouched from input to out-
+ put, if you select an output file. The "Volume
+ Adjustment:" field in the statistics gives you
+ the argument to the -v number which will make
the sample as loud as possible without clipping.
- There is an optional parameter -v that will
+ There is an optional parameter -v that will
print out the "Volume Adjustment:" field's value
- and return. This could be of use in scripts to
- auto convert the volume. There is an also an
- optional parameter debug that will place sox
- into debug mode and print out a hex dump of the
- sound file from the internal buffer that is in
- 32-bit signed PCM data. This is mainly only of
- use in tracking down endian problems that creep
+ and return. This could be of use in scripts to
+ auto convert the volume. There is an also an
+ optional parameter debug that will place sox
+ into debug mode and print out a hex dump of the
+ sound file from the internal buffer that is in
+ 32-bit signed PCM data. This is mainly only of
+ use in tracking down endian problems that creep
in to sox on cross-platform versions.
+ stretch factor [window fade shift fading]
+ Time stretch a file by a given factor, changing
+ its duration without affecting the pitch. A
+ factor > 1.0 lengthens and < 1.0 shortens the
+ duration. window size is in ms. Default is 20ms.
+ The fade option can be "lin". shift is a ratio
+ in [0.0 1.0]. Default depends on stretch factor:
- December 10, 1999 14
+ December 10, 1999 15
+
SoX(1) SoX(1)
+ 1.0 to shorten, 0.8 to lengthen. fading is a
+ ratio in [0.0 0.5]. Its default depends on
+ factor and shift.
+
swap [ 1 2 3 4 ]
- Swap channels in multi-channel sound files. In
- files with more than 2 channels you may specify
+ Swap channels in multi-channel sound files. In
+ files with more than 2 channels you may specify
the order that the channels should be rearranged
in.
vibro speed [ depth ]
- Add the world-famous Fender Vibro-Champ sound
+ Add the world-famous Fender Vibro-Champ sound
effect to a sound sample by using a sine wave as
the volume knob. Speed gives the Hertz value of
- the wave. This must be under 30. Depth gives
- the amount the volume is cut into by the sine
- wave, ranging 0.0 to 1.0 and defaulting to 0.5.
+ the wave. This must be under 30. Depth gives
+ the amount the volume is cut into by the sine
+ wave, ranging 0.0 to 1.0 and defaulting to 0.5.
+ vol gain [ type ]
+ The vol effect is much like the command line
+ option -v. It allows you to adjust the volume
+ of an input file and allows you to specify the
+ adjustment in relation to amplitude, power, or
+ dB. When type is amplitude then a linear change
+ of the amplitude is performed based on the gain.
+ Therefore, a value of 1.0 will keep the volume
+ the same, 0.0 to < 1.0 will cause the volume to
+ decrease and values of > 1.0 will cause the vol-
+ ume to increase. Beware of clipping audio data
+ when the gain is greater than 1.0. A negative
+ value performs the same adjustment while also
+ changing the phase.
+ When type is power then a value of 1.0 also
+ means no change in volume.
+ When type is dB the amplitude is changed loga-
+ rithmically. 0.0 leaves the volume unchanged
+ while +6 roughly doubles the amplitude.
+
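To see why a gain of +6 dB roughly doubles the amplitude, the dB value can be converted to a linear amplitude multiplier. The sketch below assumes the usual amplitude convention of 20*log10, which is consistent with the "+6 doubles" statement above; db_to_amplitude is a hypothetical name, not a SoX function.

```python
def db_to_amplitude(gain_db: float) -> float:
    """Linear amplitude multiplier for a gain in dB.

    Assumes the 20*log10 amplitude convention; hypothetical helper only.
    """
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    for gain_db in (-6.0, 0.0, 6.0):
        print(f"{gain_db:+.1f} dB -> amplitude x{db_to_amplitude(gain_db):.4f}")
```

Under this convention 0 dB maps to a multiplier of exactly 1.0, and +6 dB maps to a value just under 2.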
Sox enforces certain effects. If the two files have dif-
ferent sampling rates, the requested effect must be one of
copy, or rate, If the two files have different numbers of
@@ -958,6 +1048,18 @@
SEE ALSO
play(1), rec(1), soxexam(1)
+
+
+
+ December 10, 1999 16
+
+
+
+
+
+SoX(1) SoX(1)
+
+
NOTICES
The version of Sox that accompanies this manual page is
support by Chris Bagwell (cbagwell@sprynet.com). Please
@@ -985,6 +1087,36 @@
- December 10, 1999 15
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ December 10, 1999 17