SoX(1)							   SoX(1)


NAME
       sox - Sound eXchange : universal sound sample translator

SYNOPSIS
       sox infile outfile

       sox [ general options ] [ format options ] infile
	   -e effect [ effect options ]

       sox [ general options ] [ format options ] infile
	   [ format options ] outfile
	   [ effect [ effect options ] ... ]

       General options:
	   [ -h ] [ -p ] [ -v volume ] [ -V ]

       Format options:
	   [ -t filetype ] [ -r rate ] [ -s/-u/-U/-A/-a/-i/-g ]
	   [ -b/-w/-l/-f/-d/-D ]
	   [ -c channels ] [ -x ] [ -e ]

       Effects:
	   avg [ -l | -r ]
	   band [ -n ] center [ width ]
	   bandpass frequency bandwidth
	   bandreject frequency bandwidth
	   chorus gain-in gain-out delay decay speed depth
		  -s | -t [ delay decay speed depth -s | -t ]
	   compand attack1,decay1[,attack2,decay2...]
		   in-dB1,out-dB1[,in-dB2,out-dB2...]
		   [ gain ] [ initial-volume ]
	   copy
	   cut
	   deemph
	   earwax
	   echo gain-in gain-out delay decay [ delay decay ... ]
	   echos gain-in gain-out delay decay [ delay decay ... ]
	   fade [ type ] fade-in-length
		[ stop-time [ fade-out-length ] ]
	   filter [ low ]-[ high ] [ window-len [ beta ]]
	   flanger gain-in gain-out delay decay speed < -s | -t >
	   highp frequency
	   highpass frequency
	   lowp frequency
	   lowpass frequency
	   map
	   mask
	   pan direction
	   phaser gain-in gain-out delay decay speed < -s | -t >
	   pick [ -1 | -2 | -3 | -4 | -l | -r ]
	   pitch shift [ width interpole fade ]
	   polyphase [ -w < nut / ham > ]
		     [	-width < long / short / # > ]
		     [ -cutoff # ]



			  July 24, 2000


	   rate
	   resample [ -qs | -q | -ql ] [ rolloff [ beta ] ]
	   reverb gain-out reverb-time delay [ delay ... ]
	   reverse
	   speed factor
	   split
	   stat [ -s n ] [ -rms ] [ -v ] [ -d ]
	   stretch [ factor [ window fade shift fading ] ]
	   swap [ 1 2 | 1 2 3 4 ]
	   trim start [ length ]
	   vibro speed [ depth ]
	   vol gain [ type [ limitergain ] ]

DESCRIPTION
       SoX is a command line program that can convert most popular
       audio files to most other popular audio file formats.  It
       can optionally change the audio sample data type and apply
       one or more sound effects to the file during this
       translation.

       There are two types of audio file formats that SoX can work
       with.  The first are self-describing file formats.  These
       contain a header that completely describes the
       characteristics of the audio data that follows.

       The second type is headerless data, sometimes called raw
       data.  A user must pass enough information to SoX on the
       command line so that it knows what type of data the file
       contains.

       Audio data can usually be totally described by four
       characteristics:

       rate	 The sample rate is in samples per  second.   For
		 example, CD sample rates are at 44100.

       data size The precision the data is stored in.  Most
		 popular are 8-bit bytes or 16-bit words.

       data encoding
		 What encoding the data type uses.  Examples  are
		 u-law, ADPCM, or signed linear data.

       channels	 How  many  channels  are  contained in the audio
		 data.	Mono and Stereo are the two most  common.

       Please  refer  to  the  soxexam(1)  manual page for a long
       description with examples on how to use sox  with  various
       types of file formats.

OPTIONS
       The option syntax is a little grotty, but in essence:

	    sox file.au file.wav

       translates  a  sound  file  in SUN Sparc .AU format into a
       Microsoft .WAV file, while

	    sox -v 0.5 file.au -r 12000 file.wav mask

       does the same  format  translation  but	also  lowers  the
       amplitude  by  1/2,  changes  the  sampling  rate to 12000
       hertz, and applies the mask  sound  effect  to  the  audio
       data.

       Format options:

       Format options affect the audio samples that they
       immediately precede.  If they are placed before the input
       file name then they affect the input data.  If they are
       placed before the output file name then they affect the
       output data.  By taking advantage of this, you can override
       an input file's corrupted header or produce an output file
       in a totally different style than the input file.  It is
       also how sox is informed about the format of raw input
       data.

       -t filetype
		 Gives the type of the sound sample file.  Useful
		 when the file extension is not standard or for
		 specifying the .auto file type.

       -r rate	 Gives the sample rate in Hertz of the file.  To
		 cause the output file to have a different sample
		 rate than the input file, include this option as
		 a part of the output options.
		 If the input and output files have different
		 rates then a sample rate change effect must be
		 run.  If a sample rate changing effect is not
		 specified then sox will run a default one
		 internally using its default parameters.

       -s/-u/-U/-A/-a/-i/-g
		 The sample data encoding is signed linear (2's
		 complement), unsigned linear, U-law
		 (logarithmic), A-law (logarithmic), ADPCM,
		 IMA_ADPCM, or GSM.
		 U-law (actually shorthand for mu-law) and A-law
		 are the U.S. and international standards for
		 logarithmic telephone sound compression.  When
		 uncompressed they have roughly the precision of
		 12-bit PCM audio.
		 ADPCM is a form of sound compression that offers
		 a good compromise between sound quality and fast
		 encoding/decoding time.  It is used for
		 telephone sound compression and in places where
		 full fidelity is not as important.  When
		 uncompressed it has roughly the precision of
		 16-bit PCM audio.  Popular versions of ADPCM
		 include G.726, MS ADPCM, and IMA ADPCM.  The -a
		 flag has different meanings in different file
		 handlers.  In .wav files it represents MS ADPCM
		 files; in all others it means G.726 ADPCM.  IMA
		 ADPCM is a specific form of ADPCM compression,
		 slightly simpler and slightly lower fidelity
		 than Microsoft's flavor of ADPCM.  IMA ADPCM is
		 also called DVI ADPCM.
		 GSM is a standard used for telephone sound
		 compression in European countries and it's
		 gaining popularity because of its quality.  It
		 is usually CPU intensive to work with GSM audio
		 data.

       -b/-w/-l/-f/-d/-D
		 The sample data size is in bytes, 16-bit words,
		 32-bit longwords, 32-bit floats, 64-bit double
		 floats, or 80-bit IEEE floats.  Floats and
		 double floats are in native machine format.

       -x	 The  sample  data is in XINU format; that is, it
		 comes from a  machine	with  the  opposite  word
		 order	than  yours and must be swapped according
		 to the word-size given above.	Only  16-bit  and
		 32-bit	 integer  data	may be swapped.	 Machine-
		 format	 floating-point	 data  is  not	portable.
		 IEEE floats are a fixed, portable format.
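
		 The swap described for -x can be pictured with a
		 short Python sketch.  This is an illustration of
		 per-word byte reversal only, not SoX's own code;
		 the function name swap_words is hypothetical.

```python
import struct


def swap_words(data: bytes, word_size: int = 2) -> bytes:
    """Reverse the byte order of every word_size-byte sample in data."""
    # The text above says only 16-bit and 32-bit integer data may be swapped.
    if word_size not in (2, 4):
        raise ValueError("only 16-bit and 32-bit words can be swapped")
    if len(data) % word_size:
        raise ValueError("data length is not a multiple of the word size")
    return b"".join(
        data[i:i + word_size][::-1] for i in range(0, len(data), word_size)
    )


# Two 16-bit samples written big-endian, then read back as little-endian.
big_endian = struct.pack(">2h", 1, -2)
little_endian = swap_words(big_endian, 2)
print(struct.unpack("<2h", little_endian))
```

		 The word size has to match the -w or -l size
		 option, since swapping with the wrong width
		 scrambles the samples instead of fixing them.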

       -c channels
		 The  number  of sound channels in the data file.
		 This may be 1, 2, or 4;  for  mono,  stereo,  or
		 quad  sound  data.   To cause the output file to
		 have a different number  of  channels	than  the
		 input	file, include this option with the output
		 file options.  If the input and output files
		 have a different number of channels then the avg
		 effect must be used.  If the avg effect is not
		 specified on the command line it will be invoked
		 internally with default parameters.

       -e	 When used after the input filename (so that it
		 applies to the output file) it allows you to
		 avoid giving an output filename and will not
		 produce an output file.  It will apply any
		 specified effects to the input file.  This is
		 mainly useful with the stat effect but can be
		 used with others.

       General options:

       -h	 Print version number and usage information.

       -p	 Run in preview mode and run fast.  This will
		 somewhat speed up sox when the output format has
		 a different number of channels and a different
		 rate than the input file.  Currently, this
		 defaults to using the rate effect instead of the
		 resample effect for sample rate changes.

       -v volume Change amplitude (floating point); less than 1.0
		 decreases, greater than 1.0 increases.  You may
		 use a negative number to invert the phase of the
		 audio data.  It is interesting to note that we
		 perceive volume logarithmically but this adjusts
		 the amplitude linearly.
		 Note: see the stat effect for information on
		 finding the maximum value that can be used with
		 this option without causing audio data to be
		 clipped.
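
		 To relate the linear factor given to -v to the
		 logarithmic scale mentioned above, the standard
		 amplitude-to-decibel formula 20*log10(factor)
		 can be used.  The helper below is a hypothetical
		 illustration and is not part of SoX.

```python
import math


def amplitude_factor_to_db(factor: float) -> float:
    """Convert a linear amplitude factor (as given to -v) to decibels."""
    if factor == 0:
        return float("-inf")  # silence: no finite dB value
    # A negative factor only inverts the phase, so its magnitude is used.
    return 20.0 * math.log10(abs(factor))


for factor in (0.5, 1.0, 2.0):
    print(f"-v {factor}: {amplitude_factor_to_db(factor):+.2f} dB")
```

		 Halving the amplitude therefore corresponds to a
		 change of roughly -6 dB, and doubling it to
		 roughly +6 dB.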

       -V	 Print a description of processing phases.
		 Useful for figuring out exactly how sox is
		 mangling your sound samples.

FILE TYPES
       SoX  uses  the file extension of the input and output file
       to determine what type of file format to use.  This can be
       overridden  by  specifying  the "-t" option on the command
       line.

       The input and output files may be read  from  standard  in
       and  out.  This is done by specifying '-' as the filename.

       File formats which have headers are checked; if a header
       doesn't seem right, the program exits with an appropriate
       message.

       The following file formats are supported:


       .8svx	 Amiga 8SVX musical instrument description
		 format.

       .aiff	 AIFF files used on Apple IIc/IIgs and SGI.
		 Note: the AIFF format supports only one SSND
		 chunk.  It does not support multiple sound
		 chunks, or the 8SVX musical instrument
		 description format.  AIFF files are multimedia
		 archives and can have multiple audio and picture
		 chunks.  You may need a separate archiver to
		 work with them.

       .au	 SUN Microsystems AU files.  There are apparently
		 many types of .au files; DEC has invented its
		 own with a different magic number and word
		 order.  The .au handler can read these files but
		 will not write them.  Some .au files have valid
		 AU headers and some do not.  The latter are
		 probably original SUN u-law 8000 Hz samples.
		 These can be dealt with using the .ul format
		 (see below).

       .avr	 Audio Visual Research
		 The AVR format is produced by a number of
		 commercial packages on the Mac.

       .cdr	 CD-R
		 CD-R files are used in mastering music on
		 Compact Disks.  The audio data on a CD-R disk is
		 a raw audio file with a format of stereo 16-bit
		 signed samples at a 44.1 kHz sample rate.  There
		 is a special blocking/padding oddity at the end
		 of the audio file, which is why it needs its own
		 handler.

       .cvs	 Continuously Variable Slope Delta modulation
		 Used  to  compress speech audio for applications
		 such as voice mail.

       .dat	 Text Data files
		 These files contain a textual representation of
		 the sample data.  There is one line at the
		 beginning that contains the sample rate.
		 Subsequent lines contain two numeric data items:
		 the time since the beginning of the first sample
		 and the sample value.  Values are normalized so
		 that the maximum and minimum are 1.00 and -1.00.
		 This file format can be used to create data
		 files for external programs such as FFT
		 analyzers or graph routines.  SoX can also
		 convert a file in this format back into one of
		 the other file formats.
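
		 The layout described for .dat files can be
		 sketched in Python as below.  The exact wording
		 of the rate line ("; Sample Rate N"), the number
		 of decimals, and the 16-bit full scale are
		 assumptions made for illustration; only the
		 overall structure (one rate line, then time and
		 value pairs) comes from the description above.

```python
def to_dat_lines(samples, rate, full_scale=32768):
    """Render integer samples as text: a rate line, then 'time value' rows."""
    lines = [f"; Sample Rate {rate}"]  # header wording is an assumption
    for n, sample in enumerate(samples):
        time = n / rate              # seconds since the first sample
        value = sample / full_scale  # normalized to roughly -1.0 .. 1.0
        lines.append(f"{time:.6f} {value:.6f}")
    return lines


def from_dat_lines(lines, full_scale=32768):
    """Parse the text form back into (rate, list of integer samples)."""
    rate = int(lines[0].split()[-1])
    samples = [round(float(line.split()[1]) * full_scale) for line in lines[1:]]
    return rate, samples


text = to_dat_lines([0, 16384, -32768, 32767], rate=8000)
print("\n".join(text))
print(from_dat_lines(text))
```

		 Because each row carries its own timestamp, the
		 text can be fed directly to plotting or analysis
		 tools that expect two-column input.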

       .gsm	 GSM 06.10 Lossy Speech Compression
		 A standard for compressing speech which is used
		 in the Global System for Mobile Communications
		 (GSM).  It's good for its purpose, shrinking
		 audio data size, but it will introduce lots of
		 noise when a given sound sample is encoded and
		 decoded multiple times.  This format is used by
		 some voice mail applications.  It is rather CPU
		 intensive.
		 GSM in sox is optional and requires access to an
		 external GSM library.  To see if there is
		 support for gsm run sox -h and look for it under
		 the list of supported file formats.

       .hcom	 Macintosh HCOM files.  These are (apparently)
		 Mac FSSD files with some variant of Huffman
		 compression.  The Macintosh has wacky file
		 formats and this format handler apparently
		 doesn't handle all the ones it should.  Mac
		 users will need their usual arsenal of file
		 converters to deal with an HCOM file under Unix
		 or DOS.

       .maud	 An Amiga format
		 An IFF-conforming sound file type, registered by
		 MS MacroSystem Computer GmbH, published along
		 with the "Toccata" sound-card on the Amiga.
		 Allows 8-bit linear, 16-bit linear, A-law, and
		 u-law in mono and stereo.

       ossdsp	 OSS /dev/dsp device driver
		 This is a pseudo-file type and can be optionally
		 compiled into Sox.  Run sox -h	 to  see  if  you
		 have  support	for  this  file	 type.	When this
		 driver is used it allows you to open up the OSS
		 /dev/dsp file and configure it to use the same
		 data format as passed in to SoX.  It works for
		 both playing and recording sound samples.
		 When playing sound files it attempts to  set  up
		 the  OSS  driver  to  use the same format as the
		 input file.  It is suggested to always	 override
		 the  output  values  to  use the highest quality
		 samples your sound card can handle.  Example: -t
		 ossdsp -w -s /dev/dsp

       .sf	 IRCAM Sound Files.
		 Sound	Files are used by academic music software
		 such as the  CSound  package,	and  the  MixView
		 sound sample editor.

       .sph
		 SPHERE (SPeech HEader Resources) is a file
		 format defined by NIST (National Institute of
		 Standards and Technology) and is used with
		 speech audio.  SoX can read these files when
		 they contain ulaw and PCM data.  It will ignore
		 any header information that says the data is
		 compressed using shorten compression and will
		 treat the data as either ulaw or PCM.  This
		 allows SoX and the command line shorten program
		 to be run together using pipes to uncompress the
		 data and then pass the result to SoX for
		 processing.

       .smp	 Turtle Beach SampleVision files.
		 SMP files are for use with  the  PC-DOS  package
		 SampleVision  by  Turtle  Beach  Softworks. This
		 package is for	 communication	to  several  MIDI
		 samplers.  All sample rates are supported by the
		 package, although not all are supported  by  the
		 samplers  themselves.	Currently loop points are
		 ignored.

       .snd
		 Under DOS this file format is the same as the
		 .sndt format.  Under all other platforms it is
		 the same as the .au format.

       .sndt	 SoundTool files.
		 This is an older DOS file format.

       sunau	 Sun /dev/audio device driver
		 This is a pseudo-file type and can be optionally
		 compiled  into	 Sox.	Run  sox -h to see if you
		 have support for  this	 file  type.   When  this
		 driver	 is  used  it allows you to open up a Sun
		 /dev/audio file and configure it to use the same
		 data  type  as	 passed	 in to Sox.  It works for
		 both playing and recording sound samples.   When
		 playing  sound	 files	it attempts to set up the
		 audio driver to use the same format as the input
		 file.	 It  is	 suggested to always override the
		 output values to use the highest quality samples
		 your  hardware can handle.  Example: -t sunau -w
		 -s /dev/audio or -t sunau -U -c 1 /dev/audio for
		 older sun equipment.

       .txw	 Yamaha TX-16W sampler.
		 A file format from a Yamaha sampling keyboard
		 which wrote IBM-PC format 3.5" floppies.  The
		 handler reads files which do not have the sample
		 rate field set to one of the expected values by
		 looking at some other bytes in the attack/loop
		 length fields, and defaulting to 33kHz if the
		 sample rate is still unknown.

       .vms	 More info to come.
		 Used  to  compress speech audio for applications
		 such as voice mail.

       .voc	 Sound Blaster VOC files.
		 VOC files are multi-part and contain silence
		 parts, looping, and different sample rates for
		 different chunks.  On input, the silence parts
		 are filled out, loops are rejected, and sample
		 data with a new sample rate is rejected.
		 Silence with a different sample rate is
		 generated appropriately.  On output, silence is
		 not detected, nor are impossible sample rates.

       .wav	 Microsoft .WAV RIFF files.
		 These appear to be very similar to IFF files,
		 but not the same.  They are the native sound
		 file format of Windows.  (Obviously, Windows was
		 of such incredible importance to the computer
		 industry that it just had to have its own sound
		 file format.)  Normally .wav files have all
		 formatting information in their headers, and so
		 do not need any format options specified for an
		 input file.  If any are, they will override the
		 file header, and you will be warned to this
		 effect.  You had better know what you are doing!
		 Output format options will cause a format
		 conversion, and the .wav will be written
		 appropriately.  Sox currently can read PCM,
		 ULAW, ALAW, MS ADPCM, and IMA (or DVI) ADPCM.
		 It can write all of these formats including
		 (NEW!) the ADPCM encoding.

       .wve	 Psion 8-bit alaw
		 These	are  8-bit a-law 8khz sound files used on
		 the Psion palmtop portable computer.

       .raw	 Raw files (no header).
		 The sample rate, size	(byte,	word,  etc),  and
		 encoding (signed, unsigned, etc.)  of the sample
		 file must be  given.	The  number  of	 channels
		 defaults to 1.

       .ub, .sb, .uw, .sw, .ul, .al, .sl
		 These are several suffixes which serve as a
		 shorthand for raw files with a given size and
		 encoding.  Thus, ub, sb, uw, sw, ul, al, and sl
		 correspond to "unsigned byte", "signed byte",
		 "unsigned word", "signed word", "ulaw" (byte),
		 "alaw" (byte), and "signed long".  The sample
		 rate defaults to 8000 Hz if not explicitly set,
		 and the number of channels (as always) defaults
		 to 1.  There are lots of Sparc samples floating
		 around in u-law format with no header and fixed
		 at a sample rate of 8000 Hz.  (Certain sound
		 management software cheerfully ignores the
		 headers.)  Similarly, most Mac sound files are
		 in unsigned byte format with a sample rate of
		 11025 or 22050 Hz.
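
		 The suffix shorthand above amounts to a small
		 lookup table.  The Python sketch below restates
		 that table; the names RAW_SUFFIXES and
		 describe_raw_suffix are hypothetical and do not
		 correspond to anything inside SoX.

```python
# Suffix -> (sample size, encoding), as listed in the text above.
RAW_SUFFIXES = {
    "ub": ("byte", "unsigned"),
    "sb": ("byte", "signed"),
    "uw": ("word", "unsigned"),
    "sw": ("word", "signed"),
    "ul": ("byte", "u-law"),
    "al": ("byte", "A-law"),
    "sl": ("long", "signed"),
}


def describe_raw_suffix(filename, rate=8000, channels=1):
    """Return the format implied by a raw-file suffix.

    The rate and channel defaults (8000 Hz, 1 channel) follow the text.
    An unknown suffix raises KeyError.
    """
    suffix = filename.rsplit(".", 1)[-1].lower()
    size, encoding = RAW_SUFFIXES[suffix]
    return {"size": size, "encoding": encoding,
            "rate": rate, "channels": channels}


print(describe_raw_suffix("speech.ul"))
print(describe_raw_suffix("macsound.ub", rate=22050))
```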

       .auto	 This is a ``meta-type'': specifying this type
		 for an input file triggers some code that tries
		 to guess the real type by looking for magic
		 words in the header.  If the type can't be
		 guessed, the program exits with an error
		 message.  The input must be a plain file, not a
		 pipe.  This type can't be used for output files.

EFFECTS
       Multiple effects may be applied to the audio data by
       specifying them one after another at the end of the
       command line.

       avg [ -l | -r ]
		 Reduce the number of channels by averaging the
		 samples, or duplicate channels to increase the
		 number of channels.  This effect is
		 automatically used when the number of input
		 channels differs from the number of output
		 channels.  When reducing the number of channels
		 it is possible to manually specify the avg
		 effect and use the -l and -r options to select
		 only the left or right channel for the output
		 instead of averaging the two channels.
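
		 The three behaviours described for avg can be
		 sketched in Python on lists of integer samples.
		 This is a conceptual illustration, not SoX's
		 implementation; the function names are
		 hypothetical and the rounding (floor division)
		 is an assumption.

```python
def average_to_mono(frames):
    """Average each (left, right) frame into one mono sample."""
    return [(left + right) // 2 for left, right in frames]


def pick_channel(frames, index):
    """Keep a single channel, like -l (index 0) or -r (index 1)."""
    return [frame[index] for frame in frames]


def duplicate_to_stereo(mono):
    """Copy each mono sample into both output channels."""
    return [(sample, sample) for sample in mono]


stereo = [(100, 200), (-50, 50), (30, 10)]
print(average_to_mono(stereo))
print(pick_channel(stereo, 0))
print(duplicate_to_stereo([7, 8]))
```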

       band [ -n ] center [ width ]
		 Apply a band-pass filter.  The frequency
		 response drops logarithmically around the center
		 frequency.  The width gives the slope of the
		 drop.  The frequencies at center + width and
		 center - width will be half of their original
		 amplitudes.  Band defaults to a mode oriented to
		 pitched signals, i.e. voice, singing, or
		 instrumental music.  The -n (for noise) option
		 uses the alternate mode for un-pitched signals.
		 Warning: -n introduces a power-gain of about
		 11dB in the filter, so beware of output
		 clipping.  Band introduces noise in the shape of
		 the filter, i.e. peaking at the center frequency
		 and settling around it.  See filter for a
		 bandpass effect with steeper shoulders.

       bandpass frequency bandwidth
		 Butterworth  bandpass filter. Description coming
		 soon!

       bandreject frequency bandwidth
		 Butterworth bandreject filter.  Description
		 coming soon!

       chorus gain-in gain-out delay decay speed depth

	      -s | -t [ delay decay speed depth -s | -t ... ]
		 Add a chorus to a sound sample.  Each quadruple
		 delay/decay/speed/depth gives the delay in
		 milliseconds and the decay (relative to gain-in)
		 with a modulation speed in Hz using depth in
		 milliseconds.  The modulation is either
		 sinusoidal (-s) or triangular (-t).  Gain-out is
		 the volume of the output.

       compand attack1,decay1[,attack2,decay2...]

	       in-dB1,out-dB1[,in-dB2,out-dB2...]

	       [gain] [initial-volume]
		 Compand (compress or expand) the dynamic range
		 of a sample.  The attack and decay times specify
		 the integration time over which the absolute
		 value of the input signal is integrated to
		 determine its volume.  Where more than one pair
		 of attack/decay parameters is specified, each
		 channel is treated separately and the number of
		 pairs must agree with the number of input
		 channels.  The second parameter is a list of
		 points on the compander's transfer function
		 specified in dB relative to the maximum possible
		 signal amplitude.  The input values must be in
		 strictly increasing order but the transfer
		 function does not have to be monotonically
		 rising.  The special value -inf may be used to
		 indicate that the input volume should be
		 associated with the output volume.  The points
		 -inf,-inf and 0,0 are assumed; the latter may be
		 overridden, but the former may not.  The third
		 (optional) parameter is a postprocessing gain in
		 dB which is applied after the compression has
		 taken place; the fourth (optional) parameter is
		 an initial volume to be assumed for each channel
		 when the effect starts.  This permits the user
		 to supply a nominal level initially, so that,
		 for example, a very large gain is not applied to
		 initial signal levels before the companding
		 action has begun to operate: it is quite
		 probable that in such an event, the output would
		 be severely clipped while the compander gain
		 properly adjusts itself.

       copy	 Copy the input file to the output file.  This is
		 the default effect if both files have	the  same
		 sampling rate.

       cut loopnumber
		 Extract loop #N from a sample.

       deemph	 Apply a treble attenuation shelving filter to
		 samples in audio cd format.  The frequency
		 response of pre-emphasized recordings is
		 rectified.  The filtering is defined in the
		 standard document ISO 908.

       earwax	 Makes	sound  easier to listen to on headphones.
		 Adds audio-cues to samples in audio cd format so
		 that  when  listened to on headphones the stereo
		 image is moved from inside your  head	(standard
		 for  headphones)  to outside and in front of the
		 listener (standard for speakers).  See
		 www.geocities.com/beinges for a full
		 explanation.

       echo gain-in gain-out delay decay [ delay decay ... ]
		 Add echoing to a sound sample.	 Each delay/decay
		 part gives the delay  in  milliseconds	 and  the
		 decay (relative to gain-in) of that echo.
		 Gain-out is the volume of the output.

       echos gain-in gain-out delay decay [ delay decay ... ]
		 Add a sequence of echos to a sound sample.  Each
		 delay/decay part gives the delay in milliseconds
		 and the decay	(relative  to  gain-in)	 of  that
		 echo.	Gain-out is the volume of the output.

       fade [ type ] fade-in-length

	    [ stop-time [ fade-out-length ] ]
		 Add a fade effect to the beginning, end, or both
		 of the audio data.

		 For fade-ins, this starts from the first  sample
		 and ramps the volume of the audio from 0 to full
		 volume over fade-in-length seconds.   Specify	0
		 seconds if no fade-in is wanted.

		 For fade-outs, the audio data will be truncated
		 at the stop-time and the volume will be ramped
		 from full volume down to 0 starting at
		 fade-out-length seconds before the stop-time.
		 No fade-out is performed if these options are
		 not specified.

		 An optional type can be specified to change  the
		 type  of envelope.  Choices are q for quarter of
		 a sinewave, h for half a sinewave, t for  linear
		 slope,	 l  for	 logarithmic,  and p for inverted
		 parabola.  The default is a linear slope.

       filter [ low ]-[ high ] [ window-len [ beta ] ]
		 Apply	a  Sinc-windowed  lowpass,  highpass,  or
		 bandpass  filter  of  given window length to the
		 signal.  low refers  to  the  frequency  of  the
		 lower	6dB corner of the filter.  high refers to
		 the frequency of the upper  6dB  corner  of  the
		 filter.

		 A  lowpass  filter  is	 obtained  by leaving low
		 unspecified,  or  0.	A  highpass   filter   is
		 obtained  by  leaving high unspecified, or 0, or
		 greater than or equal to the Nyquist  frequency.

		 The window-len, if unspecified, defaults to 128.
		 Longer windows give a	sharper	 cutoff,  smaller
		 windows a more gradual cutoff.

		 The beta, if unspecified, defaults to 16.  This
		 selects a Kaiser window.  You can select a
		 Nuttall window by specifying anything <= 2.0
		 here.  For more discussion of beta, look under
		 the resample effect.

       flanger gain-in gain-out delay decay speed < -s | -t >
		 Add a flanger to a sound sample.  Each triple
		 delay/decay/speed gives the delay in
		 milliseconds and the decay (relative to gain-in)
		 with a modulation speed in Hz.  The modulation
		 is either sinusoidal (-s) or triangular (-t).
		 Gain-out is the volume of the output.

       highp frequency
		 Apply a single pole recursive high-pass filter.
		 The frequency response drops logarithmically
		 with frequency in the middle of the drop.  The
		 slope of the filter is quite gentle.  See filter
		 for a highpass effect with sharper cutoff.

       highpass frequency
		 Butterworth highpass filter.  Description coming
		 soon!

       lowp frequency
		 Apply	a  single pole recursive low-pass filter.
		 The  frequency	 response  drops  logarithmically
		 with  frequency  in the middle of the drop.  The
		 slope of the filter is quite gentle.  See filter
		 for a lowpass effect with sharper cutoff.

       lowpass frequency
		 Butterworth  lowpass filter.  Description coming
		 soon!

       map	 Display a list of loops in a sample, and
		 miscellaneous loop info.

       mask	 Add  "masking	noise"	to  signal.   This effect
		 deliberately adds white  noise	 to  a	sound  in
		 order	to  mask quantization effects, created by
		 the process of playing a  sound  digitally.   It
		 tends	to  mask buzzing voices, for example.  It
		 adds 1/2 bit of noise to the sound file  at  the
		 output bit depth.

       pan direction
		 Pan the sound of an audio file from one channel
		 to another.  This is done by changing the volume
		 of the input channels so that it fades out on
		 one channel and fades in on another.  If the
		 number of input channels is different than the
		 number of output channels then this effect tries
		 to intelligently handle this.  For instance, if
		 the input contains 1 channel and the output
		 contains 2 channels, then it will create the
		 missing channel itself.  The direction is a
		 value from -1.0 to 1.0.  -1.0 represents far
		 left and 1.0 represents far right.  Numbers in
		 between will start the pan effect without
		 totally muting the opposite channel.

       phaser gain-in gain-out delay decay speed < -s | -t >
		 Add a phaser to a sound sample.  Each triple
		 delay/decay/speed gives the delay in
		 milliseconds and the decay (relative to gain-in)
		 with a modulation speed in Hz.  The modulation
		 is either sinusoidal (-s) or triangular (-t).
		 The decay should be less than 0.5 to avoid
		 feedback.  Gain-out is the volume of the output.

       pick [ -1 | -2 | -3 | -4 | -l | -r ]
		 Select the left or right channel of a stereo
		 sample, or one of four channels in a
		 quadrophonic sample.  The -l and -r options
		 represent either the left or right channel.  It
		 is required that you use the -c 1 command line
		 option in order to force the output file to
		 contain only 1 channel.

       pitch shift [ width interpole fade ]
		 Change the pitch of a file without affecting its
		 duration by cross-fading shifted samples.  shift
		 is given in cents.  Use a positive value to
		 shift to treble, a negative value to shift to
		 bass.  Default shift is 0.  width of window is
		 in ms.  Default width is 20ms.  Try 30ms to
		 lower pitch, and 10ms to raise pitch.  The
		 interpole option can be "cubic" or "linear".
		 Default is "cubic".  The fade option can be
		 "cos", "hamming", "linear" or "trapezoid".
		 Default is "cos".

       polyphase [ -w < nut / ham > ]

		 [  -width <  long  / short  / # > ]

		 [ -cutoff #  ]
		 Translate input sampling rate to output sampling
		 rate via polyphase interpolation, a DSP
		 algorithm.  This method is slow and uses lots of
		 RAM, but gives much better results than rate.

		 -w < nut / ham > : select either a Nuttall (~90
		 dB stopband) or Hamming (~43 dB stopband)
		 window.  Default is nut.

		 -width long / short / # : specify the
		 (approximate) width of the filter.  long is 1024
		 samples; short is 128 samples.  Alternatively,
		 an exact number can be used.  Default is long.
		 The short option is not recommended, as it
		 produces poor quality results.

		 -cutoff # : specify the filter cutoff frequency
		 in terms of fraction of frequency bandwidth,
		 also known as the Nyquist frequency.  Please see
		 the resample effect for further information on
		 Nyquist frequency.  If upsampling, then this is
		 the fraction of the original signal that should
		 go through.  If downsampling, this is the
		 fraction of the signal left after downsampling.
		 Default is 0.95.  Remember that this is a float.


       rate	 Translate input sampling rate to output sampling
		 rate  via linear interpolation to the Least Com�
		 mon Multiple of the two sampling rates.  This is
		 the default effect if the two files have differ�
		 ent sampling rates and the preview  options  was
		 specified.  This is fast but noisy: the spectrum
		 of the original sound will  be	 shifted  upwards
		 and  duplicated faintly when up-translating by a
		 multiple.

		 Lerp-ing is acceptable for cheap 8-bit sound
		 hardware, but for CD-quality sound you should
		 instead use either resample or polyphase.  If
		 you are wondering which rate-changing effect to
		 use, you will want to read a detailed analysis
		 of all of them at
		 http://eakaw2.et.tu-dresden.de/~wilde/resample/resample.html
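
		 As a rough illustration of the Least Common
		 Multiple mentioned above, the sketch below
		 computes the common rate for two sampling rates.
		 It is not taken from the sox sources; the
		 function name common_rate is made up for this
		 example.

```python
from math import gcd


def common_rate(in_rate: int, out_rate: int) -> int:
    """Return the least common multiple of two sampling rates (in Hz)."""
    if in_rate <= 0 or out_rate <= 0:
        raise ValueError("sampling rates must be positive")
    # lcm(a, b) = a / gcd(a, b) * b, computed with exact integers
    return in_rate // gcd(in_rate, out_rate) * out_rate


# Hypothetical usage: CD rate to DAT rate, and a simple doubling.
print(common_rate(44100, 48000))
print(common_rate(8000, 16000))
```

		 Rates that share few common factors lead to a
		 very large common rate, which gives a feel for
		 the amount of interpolation involved.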

       resample [ -qs | -q | -ql ] [ rolloff [ beta ] ]
		 Translate input sampling rate to output sampling
		 rate via simulated analog filtration.  This
		 method is slower than rate, but gives much
		 better results.

		 By default, linear interpolation is used, with a
		 window width of about 45 samples at the lower of
		 the two rates.  This gives an accuracy of about
		 16 bits, but insufficient stopband rejection in
		 the case that you want to have rolloff greater
		 than about 0.80 of the Nyquist frequency.

		 The -q* options will change the  default  values
		 for  rolloff  and  beta as well as use quadratic
		 interpolation of filter coefficients,	resulting
		 in about 24 bits precision.  The -qs, -q, or -ql
		 options specify increased accuracy at	the  cost
		 of  lower  execution  speed.	It is optional to
		 specify rolloff and beta parameters  when  using
		 the -q* options.

		 Following  is a table of the reasonable defaults
		 which are built-in to sox:

		    Option  Window rolloff beta interpolation
		    ------  ------ ------- ---- -------------
		    (none)    45    0.80    16	   linear
		      -qs     45    0.80    16	  quadratic
		      -q      75    0.875   16	  quadratic
		      -ql    149    0.94    16	  quadratic
		    ------  ------ ------- ---- -------------

		 -qs, -q, or -ql use window lengths of 45, 75, or
		 149  samples, respectively, at the lower sample-
		 rate of the two files.	 This means progressively
		 sharper  stop-band  rejection, at proportionally
		 slower execution times.

		 rolloff refers to the cut-off frequency  of  the
		 low  pass  filter  and	 is given in terms of the
		 Nyquist frequency for	the  lower  sample  rate.
		 rolloff  therefore  should  be something between
		 0.0 and 1.0, in practice 0.8-0.95.  The defaults
		 are indicated above.
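
		 To make the meaning of rolloff concrete, the
		 sketch below converts a rolloff value into a
		 cut-off frequency in Hz, using the Nyquist
		 frequency (half the sample rate) of the lower
		 of the two rates.  The function name is
		 hypothetical and the code only restates the
		 arithmetic described here; it is not sox's
		 implementation.

```python
def rolloff_cutoff_hz(in_rate: float, out_rate: float,
                      rolloff: float = 0.80) -> float:
    """Cut-off frequency in Hz for a rolloff expressed as a fraction
    of the Nyquist frequency of the lower of the two sample rates."""
    if not 0.0 < rolloff <= 1.0:
        raise ValueError("rolloff should lie between 0.0 and 1.0")
    nyquist = min(in_rate, out_rate) / 2.0
    return rolloff * nyquist


# Hypothetical usage: default rolloff, then the rolloff listed for -ql.
print(rolloff_cutoff_hz(44100, 22050))
print(rolloff_cutoff_hz(44100, 48000, 0.94))
```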

		 The Nyquist frequency is equal to (sample rate /
		 2).  Logically, this is because the A/D
		 converter needs at least 2 samples to detect 1
		 cycle at the Nyquist frequency.  Frequencies
		 higher than the Nyquist will actually appear as
		 lower frequencies to the A/D converter; this is
		 called aliasing.  Normally, A/D converters run
		 the signal through a lowpass filter first to
		 avoid these problems.

		 Similar problems will happen in software when
		 reducing the sample rate of an audio file
		 (frequencies above the new Nyquist frequency can
		 be aliased to lower frequencies).  Therefore, a
		 good resample effect will remove all frequency
		 information above the new Nyquist frequency.
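
		 The aliasing described above can be sketched
		 numerically.  The function below (a made-up
		 helper, assuming ideal sampling of a pure tone
		 with no filtering beforehand) returns the
		 frequency at which a tone appears after
		 sampling.

```python
def alias_frequency(freq_hz: float, sample_rate: float) -> float:
    """Apparent frequency of a pure tone after ideal sampling
    without any anti-aliasing filter."""
    nyquist = sample_rate / 2.0
    # The sampled spectrum repeats every sample_rate Hz.
    folded = freq_hz % sample_rate
    # The upper half of each repetition mirrors onto the lower half.
    if folded > nyquist:
        folded = sample_rate - folded
    return folded


# Hypothetical usage: a 5000 Hz tone sampled at 8000 Hz.
print(alias_frequency(5000, 8000))
```

		 A tone below the Nyquist frequency is returned
		 unchanged, while a tone above it folds back
		 into the lower band.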

		 The rolloff refers to how close to the Nyquist
		 frequency this cutoff is, with closer being
		 better.  When increasing the sample rate of an
		 audio file you would not expect to have any
		 frequencies that are past the original
		 Nyquist frequency.  Because of resampling
		 properties, it is common to have aliasing data
		 created that is above the old Nyquist frequency.
		 In that case the rolloff refers to how close to
		 the original Nyquist frequency to use a lowpass
		 filter to remove this false data, with closer
		 also being better.

		 The beta parameter determines the type of filter
		 window used.  Any value greater than 2.0 is  the
		 beta for a Kaiser window.  Beta <= 2.0 selects a
		 Nuttall window.  If unspecified, the default  is
		 a Kaiser window with beta 16.

		 In the case of a Kaiser window (beta > 2.0),
		 lower betas produce a somewhat faster transition
		 from passband to stopband, at the cost of
		 noticeable artifacts.  A beta of 16 is the
		 default; beta less than 10 is not recommended.
		 If you want a sharper cutoff, don't use low
		 betas, use a longer sample window.  A Nuttall
		 window is selected by specifying any beta <= 2,
		 and the Nuttall window has a somewhat steeper
		 cutoff than the default Kaiser window.  You will
		 probably not need to use the beta parameter at
		 all, unless you are just curious about comparing
		 the effects of Nuttall vs. Kaiser windows.

		 This is the default effect if the two files have
		 different sampling  rates.   Default  parameters
		 are, as indicated above, Kaiser window of length
		 45, rolloff 0.80, beta 16, linear interpolation.

		 NOTE:	-qs  is	 only  slightly	 slower, but more
		 accurate for 16-bit or higher precision.

		 NOTE: In many cases of up-sampling, no
		 interpolation is needed, as exact filter
		 coefficients can be computed in a reasonable
		 amount of space.  To be precise, this is done
		 when

			    input_rate < output_rate
				       &&
		   output_rate/gcd(input_rate,output_rate) <= 511
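
		 The condition above can be checked for a given
		 pair of rates with a few lines of code.  This
		 is only a restatement of the formula; the
		 function name is hypothetical.

```python
from math import gcd


def uses_exact_coefficients(input_rate: int, output_rate: int) -> bool:
    """Evaluate the up-sampling condition quoted in the NOTE:
    input_rate < output_rate and
    output_rate / gcd(input_rate, output_rate) <= 511."""
    return (input_rate < output_rate
            and output_rate // gcd(input_rate, output_rate) <= 511)


# Hypothetical usage with a few common rate pairs.
for pair in [(22050, 44100), (44100, 48000), (11025, 48000), (44100, 22050)]:
    print(pair, uses_exact_coefficients(*pair))
```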

       reverb gain-out reverb-time delay [ delay ... ]
		 Add reverberation to a sound sample.  Each delay
		 is given in milliseconds and its feedback
		 depends on the reverb-time, also given in
		 milliseconds.  Each delay should be between one
		 quarter and one half of reverb-time to get a
		 realistic reverberation.  Gain-out is the volume
		 of the output.
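
		 The recommended range for each delay can be
		 checked before building a command line.  The
		 helper below is a made-up convenience, not part
		 of sox; it only encodes the "quarter to half of
		 reverb-time" guideline.

```python
def delay_is_realistic(delay_ms: float, reverb_time_ms: float) -> bool:
    """True if the delay lies between one quarter and one half of
    reverb-time (both in milliseconds)."""
    return reverb_time_ms / 4.0 <= delay_ms <= reverb_time_ms / 2.0


# Hypothetical usage: candidate delays for a 400 ms reverb-time.
for delay in (50.0, 100.0, 150.0, 200.0, 250.0):
    print(delay, delay_is_realistic(delay, 400.0))
```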

       reverse	 Reverse the sound sample  completely.	 Included
		 for finding Satanic subliminals.

       speed factor
		 Speed the sound up or down, like a magnetic tape
		 with a speed control.  It affects both pitch and
		 time.  A factor of 1.0 means no change, and is
		 the default.  2.0 doubles speed, thus time
		 length is cut by a half and pitch is one octave
		 higher.  0.5 halves speed, thus time length
		 doubles and pitch is one octave lower.
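
		 The relation between factor, duration and pitch
		 can be written out as a small calculation.
		 This is a sketch of the arithmetic only; the
		 function name is hypothetical.

```python
from math import log2


def speed_effect(duration_s: float, factor: float):
    """Return (new duration in seconds, pitch shift in octaves)
    for a tape-style speed change by the given factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    # Playing faster shortens the sound and raises its pitch together.
    return duration_s / factor, log2(factor)


# Hypothetical usage: a 10 second sound at double and half speed.
print(speed_effect(10.0, 2.0))
print(speed_effect(10.0, 0.5))
```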

       split	 Turn a mono sample into a stereo sample by
		 copying the input channel to the left and right
		 channels.

       stat [ -s n ] [ -rms ] [ -v ] [ -d ]
		 Do  a	statistical  check on the input file, and
		 print results on the standard error file.  Audio
		 data  is  passed unmodified from input to output
		 file unless used along with the -e option.

		 The "Volume Adjustment:" field in the statistics
		 gives you the argument to the -v option that
		 will make the sample as loud as possible without
		 clipping.

		 The option -v will print out the "Volume
		 Adjustment:" field's value only and return.
		 This could be of use in scripts to automatically
		 adjust the volume.
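
		 This page does not spell out how the "Volume
		 Adjustment:" value is derived.  The sketch
		 below assumes it is the ratio of full scale
		 (0x7fffffff, the signed long maximum) to the
		 largest absolute sample value, which is the
		 usual way to find the largest gain that avoids
		 clipping.  The names are made up for this
		 example.

```python
FULL_SCALE = 0x7fffffff  # largest value of a signed 32-bit sample


def volume_adjustment(samples) -> float:
    """Largest gain that keeps every sample within full scale.

    Assumption: gain = FULL_SCALE / peak absolute sample value.
    """
    peak = max(abs(s) for s in samples)
    if peak == 0:
        raise ValueError("silent input has no meaningful adjustment")
    return FULL_SCALE / peak


# Hypothetical usage: a quiet signal peaking near quarter scale.
quiet = [0, 400000000, -536870912, 123456789]
print(volume_adjustment(quiet))
```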

		 The -s n option is used to scale the input  data
		 by  a	given  factor.	The default value of n is
		 the  max  value  of  a	 signed	  long	 variable
		 (0x7fffffff).	Internal effects always work with
		 signed long PCM data and  so  the  value  should
		 relate to this fact.

		 The  -rms option will convert all output average
		 values to root mean square format.

		 There is also an optional parameter -d that will
		 print out a hex dump of the sound file from the
		 internal buffer, which holds 32-bit signed PCM
		 data.  This is mainly of use in tracking down
		 endian problems that creep into sox on
		 cross-platform versions.


       stretch factor [ window fade shift fading ]
		 Time stretch a file by a given factor, changing
		 its duration without affecting the pitch.  A
		 factor >1.0 lengthens and <1.0 shortens the
		 duration.  The window size is in ms; the default
		 is 20ms.  The fade option can be "lin".  The
		 shift ratio is in [0.0 1.0]; its default depends
		 on the stretch factor: 1.0 to shorten, 0.8 to
		 lengthen.  The fading ratio is in [0.0 0.5]; its
		 default depends on factor and shift.

       swap [ 1 2 | 1 2 3 4 ]
		 Swap  channels	 in  multi-channel  sound  files.
		 Optionally, you may specify  the  channel  order
		 you  would like the output in.	 This defaults to
		 output channel 2 and then 1 for stereo and 2, 1,
		 4, 3 for quad-channels.  An interesting feature
		 is that you may duplicate a given channel by
		 overwriting another.  This is done by repeating
		 an output channel on the command line.  For
		 example, swap 2 2 will overwrite channel 1 with
		 channel 2's data, creating a stereo file with
		 both channels containing the same audio data.
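
		 The channel ordering can be modelled on a
		 single frame of samples.  The function below is
		 a hypothetical illustration of the reordering
		 rule, not sox code: each entry of order names
		 the 1-based input channel copied to that output
		 position.

```python
def swap_channels(frame, order):
    """Build one output frame by picking input channels in the
    requested order (channel numbers start at 1)."""
    if any(ch < 1 or ch > len(frame) for ch in order):
        raise ValueError("channel number out of range")
    return [frame[ch - 1] for ch in order]


# Hypothetical usage mirroring "swap", "swap 2 2" and the quad default.
print(swap_channels(["L", "R"], [2, 1]))
print(swap_channels(["L", "R"], [2, 2]))
print(swap_channels(["c1", "c2", "c3", "c4"], [2, 1, 4, 3]))
```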

       trim start [ length ]
		 Trim can trim off unwanted audio data from the
		 beginning and end of the audio file.  Audio
		 samples are not sent to the output stream until
		 the start location is reached.  start is a
		 floating point number that tells the number of
		 seconds to wait before starting.  If you know
		 the sample number you would like to start at
		 then the seconds can be obtained by dividing
		 (sample # / sample rate).
		 The optional length parameter tells the number
		 of samples to output after the start sample and
		 is used to trim off the back side of the audio
		 data.  Using a value of 0 for the start
		 parameter will allow trimming off the back side
		 only.
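
		 Converting a sample number into the seconds
		 value expected for start is a single division
		 by the sample rate.  A minimal sketch, with a
		 made-up function name:

```python
def sample_to_seconds(sample_number: int, sample_rate: int) -> float:
    """Seconds elapsed at a given sample number."""
    if sample_rate <= 0:
        raise ValueError("sample rate must be positive")
    return sample_number / sample_rate


# Hypothetical usage: sample 66150 of a 44100 Hz file.
print(sample_to_seconds(66150, 44100))
```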

       vibro speed  [ depth ]
		 Add the world-famous  Fender  Vibro-Champ  sound
		 effect to a sound sample by using a sine wave as
		 the volume knob.  Speed gives the Hertz value of
		 the  wave.   This must be under 30.  Depth gives
		 the amount the volume is cut into  by	the  sine
		 wave,	ranging 0.0 to 1.0 and defaulting to 0.5.

       vol gain [ type [ limitergain ] ]
		 The vol effect is much	 like  the  command  line
		 option	 -v.   It allows you to adjust the volume
		 of an input file and allows you to  specify  the
		 adjustment  in	 relation to amplitude, power, or
		 dB.  If type is not specified then  it	 defaults
		 to amplitude.
		 When type is amplitude then a linear change of
		 the amplitude is performed based on the gain.
		 Therefore, a value of 1.0 will keep the volume
		 the same, 0.0 to < 1.0 will cause the volume to
		 decrease and values of > 1.0 will cause the
		 volume to increase.  Beware of clipping audio
		 data when the gain is greater than 1.0.  A
		 negative value performs the same adjustment
		 while also changing the phase.
		 When type is power then a value of 1.0 also
		 means no change in volume.
		 When type is dB the amplitude is changed
		 logarithmically.  0.0 means no change in volume
		 while +6 approximately doubles the amplitude.
		 An optional limitergain value can  be	specified
		 and should be a value much less than 1.0 (e.g.
		 0.05 or 0.02) and is used only on peaks to
		 prevent clipping.  Not specifying this parameter
		 will cause no limiter to be used.  In verbose
		 mode, this effect will display the percentage of
		 audio data that needed to be limited.
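
		 The dB figures quoted for vol follow the usual
		 amplitude convention of 20 dB per factor of
		 ten.  The sketch below assumes that convention
		 (this page does not give the exact formula sox
		 uses) and shows why 0 dB means no change and +6
		 dB roughly doubles the amplitude.  The function
		 name is hypothetical.

```python
def db_to_amplitude(gain_db: float) -> float:
    """Linear amplitude factor for a gain in dB, assuming the
    standard relation factor = 10 ** (dB / 20)."""
    return 10.0 ** (gain_db / 20.0)


# Hypothetical usage: no change, roughly double, roughly half.
for db in (0.0, 6.0, -6.0):
    print(db, db_to_amplitude(db))
```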

BUGS
       The syntax is horrific.  That's the breaks when trying to
       handle all things from the command line.

       Please  report  any  bugs  found in this version of sox to
       Chris Bagwell (cbagwell@sprynet.com)

FILES
SEE ALSO
       play(1), rec(1), soxexam(1)

NOTICES
       The version of Sox that accompanies this manual page is
       supported by Chris Bagwell (cbagwell@sprynet.com).  Please
       refer any questions regarding it to this address.  You may
       obtain the latest version at the web site
       http://home.sprynet.com/~cbagwell/sox.html

			  July 24, 2000			       20