shithub: sox

ref: f5156b42cb38830a77cee0c5fc23a643107b994a
parent: 1365bf7bff6df0b526caa443e9baa2713546b36d
author: rrt <rrt>
date: Sat Nov 11 17:56:59 EST 2006

Document new "rabbit" effect.

Tidy up formatting to make it more consistent and remove some problems.

Document Psion file types slightly more clearly.

--- a/sox.1
+++ b/sox.1
@@ -197,7 +197,7 @@
 number to invert the phase of the audio data.  It is interesting to note
 that we perceive volume
 logarithmically but this adjusts the amplitude linearly.
-.br
+
 As with other format options, the volume option effects the
 file its specified with.  This is useful whe processing mutiple
 input files as the volume adjustment can be specified for each
@@ -204,7 +204,7 @@
 input file or just once to adjust the output file.  This can be
 compared to an audio mixer were you can control the volume of
 each input as well as a master volume (output side).
-.br
+
 \fIsoxmix\fR defaults the value of the -v option for each input
 file to 1/input_file_count.  This means if your mixing two
 input files together then each input file's volume is adjusted
@@ -212,7 +212,7 @@
 the mixing operation. 
 Users will most likely not be happy with this large of a volume adjustment
 and can specify the -v option to override this default value.
-.br
+
 Note: For the non-mixing case, see the \fBstat\fR effect for information on 
 finding the maximum volume adjustment that can be done with this option 
 without causing audio data to be clipped.
@@ -229,12 +229,12 @@
 The sample data encoding is signed linear (2's complement),
 unsigned linear, u-law (logarithmic), A-law (logarithmic),
 ADPCM, IMA_ADPCM, GSM, or Floating-point.
-.br
+
 U-law (actually shorthand for mu-law) and A-law are the U.S. and
 international standards for logarithmic telephone sound compression.
 When uncompressed u-law has roughly the precision of 14-bit PCM audio
 and A-law has roughly the precision of 13-bit PCM audio.
-.br
+
 A-law and u-law data is sometimes encoded using a reversed bit-ordering
 (ie. MSB becomes LSB).  Internally, SoX understands how to work with
 this encoding but there is currently no command line option to
@@ -241,7 +241,7 @@
 specify it.  If you need this support then you can use the psuedo
 file types of ".la" and ".lu" to inform sox of the encoding.  See
 supported file types for more information.
-.br
+
 ADPCM is a form of sound compression that has a good
 compromise between good sound quality and fast encoding/decoding
 time.  It is used for telephone sound compression and places were
@@ -253,7 +253,7 @@
 IMA ADPCM is a specific form of ADPCM compression, slightly simpler
 and slightly lower fidelity than Microsoft's flavor of ADPCM.
 IMA ADPCM is also called DVI ADPCM.
-.br
+
 GSM is a standard used for telephone sound compression in
 European countries and its gaining popularity because of its
 quality.  It usually is CPU intensive to work with GSM audio data.
@@ -293,8 +293,7 @@
 You may need a separate archiver to work with them.
 .TP 10
 .B .alsa
-ALSA /dev/snd/pcmCxDxp device driver
-.br
+ALSA /dev/snd/pcmCxDxp device driver.
 This is a pseudo-file type and can be optionally compiled into SoX.  Run
 .B sox -h
 to see if you have support for this file type.  When this driver is used
@@ -319,15 +318,12 @@
 format (see below).
 .TP 10
 .B .avr
-Audio Visual Research
-.br
+Audio Visual Research.
 The AVR format is produced by a number of commercial packages
 on the Mac.
 .TP 10
 .B .cdr
-CD-R
-.br
-CD-R files are used in mastering music on Compact Disks.
+CD-R. CD-R files are used in mastering music on Compact Disks.
 The audio data on a CD-R disk is a raw audio file
 with a format of stereo 16-bit signed samples at a 44khz sample
 rate.  There is a special blocking/padding oddity at the end
@@ -334,13 +330,11 @@
 of the audio file and is why it needs its own handler.
 .TP 10
 .B .cvs
-Continuously Variable Slope Delta modulation
-.br
+Continuously Variable Slope Delta modulation. 
 Used to compress speech audio for applications such as voice mail.
 .TP 10
 .B .dat      
-Text Data files
-.br
+Text Data files. 
 These files contain a textual representation of the
 sample data.  There is one line at the beginning
 that contains the sample rate.  Subsequent lines
@@ -354,8 +348,7 @@
 formats.
 .TP 10
 .B .gsm
-GSM 06.10 Lossy Speech Compression
-.br
+GSM 06.10 Lossy Speech Compression. 
 A standard for compressing speech which is used in the
 Global Standard for Mobil telecommunications (GSM).  Its good
 for its purpose, shrinking audio data size, but it will introduce
@@ -379,9 +372,7 @@
 to deal with an HCOM file under Unix or DOS.
 .TP 10
 .B .maud
-An Amiga format
-.br
-An IFF-conform sound file type, registered by
+An IFF-conformant sound file type, registered by
 MS MacroSystem Computer GmbH, published along
 with the "Toccata" sound-card on the Amiga.
 Allows 8bit linear, 16bit linear, A-Law, u-law
@@ -388,9 +379,7 @@
 in mono and stereo.
 .TP 10
 .B .mp3
-MP3 Compressed Audio
-.br
-MP3 audio files come from the MPEG standards for audio and video compression.  They are a lossy compression format that achieves good compression rates with a minimum amount of quality loss.  Also see Ogg Vorbis for a similar format.
+MP3 Compressed Audio. MP3 audio files come from the MPEG standards for audio and video compression.  They are a lossy compression format that achieves good compression rates with a minimum amount of quality loss.  Also see Ogg Vorbis for a similar format.
 MP3 support in
 .B SoX
 is optional and requires access to either or both the external 
@@ -407,8 +396,7 @@
 but would like to specify a filename for consistency.
 .TP 10
 .B .ogg
-Ogg Vorbis Compressed Audio.
-.br
+Ogg Vorbis Compressed Audio. 
 Ogg Vorbis is a open, patent-free CODEC designed for compressing music
 and streaming audio.  It is similar to MP3, VQF, AAC, and other lossy
 formats.  
@@ -423,8 +411,7 @@
 and look for it under the list of supported file formats as "vorbis".
 .TP 10
 .B ossdsp
-OSS /dev/dsp device driver
-.br
+OSS /dev/dsp device driver.
 This is a pseudo-file type and can be optionally compiled into SoX.  Run
 .B sox -h
 to see if you have support for this file type.  When this driver is used
@@ -437,15 +424,11 @@
 .I sox infile -t ossdsp -w -s /dev/dsp
 .TP 10
 .B .prc
-Psion record.app
-.br
-Used in some Psion devices for System alarms.  This format is newer then
+Psion Record. Used in some Psion devices for System alarms and recordings made by the built-in Record application.  This format is newer than
 the .wve format that is used in some Psion devices.
 .TP 10
 .B .sf
-IRCAM Sound Files.
-.br
-Sound Files are used by academic music software 
+IRCAM Sound Files. Sound Files are used by academic music software 
 such as the CSound package, and the MixView sound sample editor.
 .TP 10
 .B .sph
@@ -461,7 +444,6 @@
 .TP 10
 .B .smp
 Turtle Beach SampleVision files.
-.br
 SMP files are for use with the PC-DOS package SampleVision by Turtle Beach
 Softworks. This package is for communication to several MIDI samplers. All
 sample rates are supported by the package, although not all are supported by
@@ -468,18 +450,15 @@
 the samplers themselves. Currently loop points are ignored.
 .TP 10
 .B .snd
-.br
 Under DOS this file format is the same as the \fB.sndt\fR format.  Under all
 other platforms it is the same as the \fB.au\fR format.
 .TP 10
 .B .sndt
 SoundTool files.
-.br
 This is an older DOS file format.
 .TP 10
 .B sunau
-Sun /dev/audio device driver
-.br
+Sun /dev/audio device driver.
 This is a pseudo-file type and can be optionally compiled into SoX.  Run
 .B sox -h
 to see if you have support for this file type.  When this driver is used
@@ -497,21 +476,18 @@
 .TP 10
 .B .txw
 Yamaha TX-16W sampler.
-.br
 A file format from a Yamaha sampling keyboard which wrote IBM-PC
-format 3.5" floppies.  Handles reading of files which do not have
+format 3.5\" floppies.  Handles reading of files which do not have
 the sample rate field set to one of the expected by looking at some
 other bytes in the attack/loop length fields, and defaulting to
 33kHz if the sample rate is still unknown.
 .TP 10
 .B .vms
-More info to come.
-.br
+(More info to come.)
 Used to compress speech audio for applications such as voice mail.
 .TP 10
 .B .voc
 Sound Blaster VOC files.
-.br
 VOC files are multi-part and contain silence parts, looping, and
 different sample rates for different chunks.
 On input, the silence parts are filled out, loops are rejected,
@@ -532,14 +508,8 @@
 .TP 10
 .B .wav
 Microsoft .WAV RIFF files.
-.br
-These appear to be very similar to IFF files,
-but not the same.  
-They are the native sound file format of Windows.
-(Obviously, Windows was of such incredible importance
-to the computer industry that it just had to have its own 
-sound file format.)
-.br
+They are the native sound file format of Windows and are widely used for uncompressed sound.
+
 Normally \fB.wav\fR files have all formatting information
 in their headers, and so do not need any format options
 specified for an input file. If any are, they will
@@ -547,7 +517,7 @@
 You had better know what you are doing! Output format
 options will cause a format conversion, and the \fB.wav\fR
 will written appropriately.
-.br
+
 SoX currently can read PCM, ULAW, ALAW, MS ADPCM, and IMA (or DVI) ADPCM.
 It can write all of these formats including the ADPCM encoding.
 Big endian versions of RIFF files, called RIFX, can also be read
@@ -556,14 +526,10 @@
 option with the output file options.
 .TP 10
 .B .wve
-Psion 8-bit A-law
-.br
-These are 8-bit A-law 8khz sound files used on the
-Psion palmtop portable computer.
+Psion 8-bit A-law. Used on older Psion PDAs.
 .TP 10
 .B .raw
 Raw files (no header).
-.br
 The sample rate, size (byte, word, etc), 
 and encoding (signed, unsigned, etc.)
 of the sample file must be given.
@@ -585,7 +551,7 @@
 a sample rate of 11025 or 22050 hz.
 .TP 10
 .B .auto
-This is a ``meta-type'' and is the default file type if the user does not specify one. This file type attempts to guess the real type by looking for magic words in the header. If the type can't be guessed, the program
+This is a "meta-type" and is the default file type if the user does not specify one. This file type attempts to guess the real type by looking for magic words in the header. If the type can't be guessed, the program
 exits with an error message.  The input must be a plain file, not a
 pipe.  This type can't be used for output files.
 .SH EFFECTS
@@ -736,10 +702,10 @@
 This is most useful if your audio data tends to not be centered around
 a value of 0.  Shifting it back will allow you to get the most volume
 adjustments without clipping audio data.
-.br
+
 The first option is the \fIdcshift\fR value.  It is a floating point number that
 indicates the amount to shift.
-.br
+
 An option limtergain value can be specified as well.  It should have a value much less then 1.0 and is used only on peaks to prevent clipping.
 .TP 10
 deemph
@@ -755,7 +721,6 @@
 moved from inside
 your head (standard for headphones) to outside and in front of the
 listener (standard for speakers). See 
-.br
 www.geocities.com/beinges
 for a full explanation.
 .TP 10
@@ -783,12 +748,12 @@
 \fIfade-out-length\fR seconds before the \fIstop-time\fR.  If fade-out-length
 is not specified, it defaults to the same value as fade-in-length.
 No fade-out is performed if the stop-time is not specified.
-.br
+
 All times can be specified in either periods of time or sample counts.
 To specify time periods use the format hh:mm:ss.frac format.  To specify
 using sample counts, specify the number of samples and append the letter 's'
 to the sample count (for example 8000s).
-.br
+
 An optional \fItype\fR can be specified to change the type of envelope.  Choices are q for quarter of a sinewave, h for half a sinewave, t for linear slope, l for logarithmic, and p for inverted parabola.  The default is a linear slope.
 .TP 10
 filter [ \fIlow\fR ]-[ \fIhigh\fR ] [ \fIwindow-len\fR [ \fIbeta\fR ] ]
@@ -814,7 +779,7 @@
 delay/decay/speed gives the delay in milliseconds
 and the decay (relative to gain-in) with a modulation
 speed in Hz.
-The modulation is either sinodial (-s) or triangular
+The modulation is either sinusoidal (-s) or triangular
 (-t).  Gain-out is the volume of the output.
 .TP 10
 highp \fIfrequency\fR
@@ -957,6 +922,17 @@
 this is a float.
 
 .TP 10
+rabbit [ \fI-c0\fR | \fI-c1\fR | \fI-c2\fR | \fI-c3\fR | \fI-c4\fR ]
+Resample using libsamplerate, aka Secret Rabbit Code. See
+http://www.mega-nerd.com/SRC/ for details of the algorithm. Algorithms
+0 through 2 are progressively faster and lower quality versions of the
+sinc algorithm; the default is \fI-c0\fR, which is probably the best
+quality algorithm for general use currently available in sox.
+Algorithm 3 is zero-order hold, and 4 is linear interpolation, which
+is only included for completeness. See the \fIresample\fR effect for
+more discussion of resampling.
+
+.TP 10
 rate
 Translate input sampling rate to output sampling rate
 via linear interpolation to the Least Common Multiple
@@ -970,7 +946,8 @@
 
 Lerp-ing is acceptable for cheap 8-bit sound hardware,
 but for CD-quality sound you should instead use either
-.B resample
+.B resample,
+.B rabbit
 or
 .B polyphase.
 If you are wondering which rate changing effects to use, you will want to read a
@@ -979,7 +956,7 @@
 repeat \fIcount\fR
 Repeats the audio data \fIcount\fR times.  Requires disk space to store the data to be repeated.
 .TP 10
-resample [ \fI-qs\fB | \fI-q\fB | \fI-ql\fB ] [ \fIrolloff\fB [ \fIbeta\fB ] ]\fR
+resample [ \fI-qs\fR | \fI-q\fR | \fI-ql\fR ] [ \fIrolloff\fR [ \fIbeta\fR ] ]
 Translate input sampling rate to output sampling rate
 via simulated analog filtration.
 This method is slower than 
@@ -1081,7 +1058,7 @@
   output_rate/gcd(input_rate,output_rate) <= 511
 .br
 .TP 10
-reverb \fIgain-out reverbe-time delay \fR[ \fIdelay ... \fR]
+reverb \fIgain-out reverb-time delay \fR[ \fIdelay ... \fR]
 Add reverberation to a sound sample.  Each delay is given 
 in milliseconds and its feedback is depending on the
 reverb-time in milliseconds.  Each delay should be in 
@@ -1092,12 +1069,9 @@
 reverse 
 Reverse the sound sample completely.
 Included for finding Satanic subliminals.
-.TP
-\fBsilence\fR \fIabove_periods\fR [ \fIduration threshold\fR[ \fId\fR | \fI%\fR ]
-.TP
-        [ \fIbelow_periods duration 
 .TP 10
-          threshold\fR[ \fId\fR | \fI%\fR ]]
+silence \fIabove_periods\fR [ \fIduration threshold\fR[ \fId\fR | \fI%\fR ] [ \fIbelow_periods duration threshold\fR[ \fId\fR | \fI%\fR ]]
+
 Removes silence from the beginning, middle, or end of a sound file.  Silence is anything below a specified threshold.
 
 The \fIabove_periods\fR value is used to indicate if sound should be trimmed at 
@@ -1136,7 +1110,7 @@
 in the middle and 2 seconds of silence at the end, a duration of 2
 seconds could be used to skip over the middle silence.
 
-Unfortunetly, you must know the length of the silence at the 
+Unfortunately, you must know the length of the silence at the 
 end of your audio file to trim off silence reliably.  A work around is
 to use the \fIsilence\fR effect in combination with the \fIreverse\fR effect.
 By first reversing the audio, you can use the \fIabove_periods\fR
@@ -1224,49 +1198,53 @@
 This is done by repeating an output channel on the command line.  For example,
 swap 2 2 will overwrite channel 1 with channel 2's data; creating a stereo
 file with both channels containing the same audio data.
-.TP
-synth [ \fIlength\fR ] \fItype mix\fR [ \fIfreq\fR [ \fI-freq2\fR ]
 .TP 10
-      [ \fIoff\fR ] [ \fIph\fR ] [ \fIp1\fR ] [ \fIp2\fR ] [ \fIp3\fR ]
+synth [ \fIlength\fR ] \fItype mix\fR [ \fIfreq\fR [ \fI-freq2\fR ] [ \fIoff\fR ] [ \fIph\fR ] [ \fIp1\fR ] [ \fIp2\fR ] [ \fIp3\fR ]
 The synth effect will generate various types of audio data.  Although
 this effect is used to generate audio data, an input file must be specified.
 The length of the input audio file determines the length of the output
 audio file.
-.br
-<length> length in sec or hh:mm:ss.frac, 0=inputlength, default=0
-.br
-<type> is sine, square, triangle, sawtooth, trapetz, exp,
+
+\fIlength\fR length in sec or hh:mm:ss.frac, 0=inputlength, default=0
+
+\fItype\fR is sine, square, triangle, sawtooth, trapetz, exp,
 whitenoise, pinknoise, brownnoise, default=sine
-.br
-<mix> is create, mix, amod, default=create
-.br
-<freq> frequency at beginning in Hz, not used  for noise..
-.br
-<freq2> frequency at end in Hz, not used for noise..
+
+\fImix\fR is create, mix, amod, default=create
+
+\fIfreq\fR frequency at beginning in Hz, not used for noise.
+
+\fIfreq2\fR frequency at end in Hz, not used for noise.
 <freq/2> can be given as %%n, where 'n' is the number of
 half notes in respect to A (440Hz)
-.br
-<off> Bias (DC-offset)  of signal in percent, default=0
-.br
-<ph> phase shift 0..100 shift phase 0..2*Pi, not used for noise..
-.br
-<p1> square: Ton/Toff, triangle+trapetz: rising slope time (0..100)
-.br
-<p2> trapetz: ON time (0..100)
-.br
-<p3> trapetz: falling slope position (0..100)
+
+\fIoff\fR Bias (DC-offset) of signal in percent, default=0
+
+\fIph\fR phase shift 0..100 shift phase 0..2*Pi, not used for noise.
+
+\fIp1\fR square: Ton/Toff, triangle+trapetz: rising slope time (0..100)
+
+\fIp2\fR trapezium: ON time (0..100)
+
+\fIp3\fR trapezium: falling slope position (0..100)
 .TP 10
 trim \fIstart\fR [ \fIlength\fR ]
 Trim can trim off unwanted audio data from the beginning and end of the
 audio file.  Audio samples are not sent to the output stream until
 the \fIstart\fR location is reached.
-.br
+
 The optional \fIlength\fR parameter tells the number of samples to output
 after the \fIstart\fR sample and is used to trim off the back side of the
 audio data.  Using a value of 0 for the \fIstart\fR parameter will allow
 trimming off the back side only.
-.br
-Both options can be specified using either an amount of time and an exact count of samples.  The format for specifying lengths in time is hh:mm:ss.frac.  A start value of 1:30.5 will not start until 1 minute, thirty and 1/2 seconds into the audio data.  The format for specifying sample counts is the number of samples with the letter 's' appended to it.  A value of 8000s will wait until 8000 samples are read before starting to process audio data.
+
+Both options can be specified using either an amount of time or an
+exact count of samples. The format for specifying lengths in time is
+hh:mm:ss.frac. A start value of 1:30.5 will not start until 1 minute,
+thirty and 1/2 seconds into the audio data. The format for specifying
+sample counts is the number of samples with the letter 's' appended to
+it. A value of 8000s will wait until 8000 samples are read before
+starting to process audio data.
 .TP 10
 vibro \fIspeed \fB [ \fIdepth\fB ]
 Add the world-famous Fender Vibro-Champ sound
@@ -1285,7 +1263,7 @@
 adjust the volume of an input file and allows you to specify the adjustment
 in relation to amplitude, power, or dB.  If \fItype\fR is not specified then
 it defaults to \fIamplitude\fR.
-.br 
+ 
 When type is 
 .I amplitude
 then a linear change of the amplitude is performed based on the gain.  Therefore,
@@ -1293,16 +1271,16 @@
 volume to decrease and values of > 1.0 will cause the volume to increase.
 Beware of clipping audio data when the gain is greater then 1.0.  A negative
 value performs the same adjustment while also changing the phase.
-.br
+
 When type is 
 .I power
 then a value of 1.0 also means no change in volume.
-.br
+
 When type is 
 .I dB
 the amplitude is changed logarithmically.
 0.0 is constant while +6 doubles the amplitude.
-.br
+
 An optional \fIlimitergain\fR value can be specified and should be a
 value much less
 then 1.0 (ie 0.05 or 0.02) and is used only on peaks to prevent clipping.
@@ -1332,5 +1310,5 @@
 .SH AUTHORS
 Chris Bagwell (cbagwell@users.sourceforge.net).  
 .P
-Additional Authors and contributors are listed in the Changelog file that
+Additional authors and contributors are listed in the Changelog file that
 is distributed with the source code.
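
The trim and fade hunks above describe two ways to give a position: a time in hh:mm:ss.frac form, or a sample count with the letter 's' appended. A minimal Python sketch of that convention follows. It is not SoX's own parser: the function name `parse_position`, the `sample_rate` argument, and the rounding to the nearest sample are all assumptions made for illustration.

```python
def parse_position(spec: str, sample_rate: int) -> int:
    """Convert a position string to a sample count.

    Hypothetical helper, not taken from SoX. Accepts either a sample
    count with a trailing 's' (e.g. "8000s") or a time written as
    [[hh:]mm:]ss[.frac] (e.g. "1:30.5").
    """
    if spec.endswith("s"):
        # Sample-count form: everything before the 's' is an integer.
        return int(spec[:-1])

    parts = spec.split(":")
    if len(parts) > 3:
        raise ValueError(f"too many ':' separated fields in {spec!r}")

    # Fold hours, minutes and seconds into seconds, left to right.
    seconds = 0.0
    for part in parts:
        seconds = seconds * 60 + float(part)

    # Assumption: round to the nearest whole sample.
    return round(seconds * sample_rate)


if __name__ == "__main__":
    print(parse_position("8000s", 44100))
    print(parse_position("1:30.5", 8000))
```

With an 8000 Hz file, the start value 1:30.5 from the man page text corresponds to 90.5 seconds of audio, while 8000s is taken literally as a sample count regardless of the rate.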