ref: 6763cba31dea9a680d10d085e5e8e3f04e8cf3ce
dir: /soxformat.7/
'\" t
'\" The line above instructs most `man' programs to invoke tbl
'\"
'\" Separate paragraphs; not the same as PP which resets indent level.
.de SP
.if t .sp .5
.if n .sp
..
'\"
'\" Replacement em-dash for nroff (default is too short).
.ie n .ds m " - 
.el .ds m \(em
'\"
'\" Placeholder macro for if longer nroff arrow is needed.
.ds RA \(->
'\"
'\" Decimal point set slightly raised
.if t .ds d \v'-.15m'.\v'+.15m'
.if n .ds d .
'\"
'\" Enclosure macro for examples
.de EX
.SP
.nf
.ft CW
..
.de EE
.ft R
.SP
.fi
..
.TH SoX 7 "April 17, 2007" "soxformat" "Sound eXchange"
.SH NAME
SoX \- Sound eXchange, the Swiss Army knife of audio manipulation
.SH DESCRIPTION
File types that can be determined by a filename
extension are listed with their names preceded by a dot.
.SP
File types that require an external library, such as ffmpeg
or libsndfile, are marked e.g. `\fB(ffmpeg)\fR'.
File types that can be handled by an
external library via its pseudo file type (currently libsndfile or
ffmpeg) are marked e.g. `(also with \fB\-t sndfile\fR)'.
This might be useful if you have a file that doesn't work with SoX's
default format readers and writers, and there's an external reader or
writer for that format.
.SP
.TP
\&\fB.raw\fR (also with \fB\-t sndfile\fR), \fB.s1\fR, \fB.s2\fR, \fB.s3\fR, \fB.s4\fR, \fB.u1\fR, \fB.u2\fR, \fB.u3\fR, \fB.u4\fR, \fB.ul\fR, \fB.al\fR, \fB.lu\fR, \fB.la\fR
Raw (headerless) audio files.  For
.BR raw ,
the sample rate and the data encoding must be given using command-line
format options; for all other types, the sample rate defaults to 8\ kHz
(but may be overridden), and the data encoding is defined by the given suffix.
Thus \fBs1\fR, \fBs2\fR, \fBs3\fR, and \fBs4\fR indicate
files encoded as 1, 2, 3, and 4-byte signed integer PCM respectively;
\fBu1\fR, \fBu2\fR, \fBu3\fR, and \fBu4\fR indicate
files encoded as 1, 2, 3, and 4-byte unsigned integer PCM respectively;
\fBul\fR indicates `\(*m-law' (byte), \fBal\fR indicates `A-law' (byte),
and \fBlu\fR and \fBla\fR are inverse bit order `\(*m-law' and inverse bit
order `A-law' respectively.
\fBsb\fR, \fBsw\fR, \fBub\fR, \fBuw\fR, and \fBsl\fR are aliases for
\fBs1\fR, \fBs2\fR, \fBu1\fR, \fBu2\fR, and \fBs4\fR respectively.
For all raw formats, the number of channels defaults to 1
(but may be overridden).
.SP
Headerless audio files on a SPARC computer are likely to be of format
\fBul\fR; on a Mac, they're likely to be \fBub\fR but with a
sample rate of 11025 or 22050\ Hz.
.TP
\&\fB.8svx\fR (also with \fB\-t sndfile\fR)
Amiga 8SVX musical instrument description format.
.TP
\&\fB.aiff\fR, \fB.aif\fR (also with \fB\-t sndfile\fR)
AIFF files used on Apple IIc/IIgs and SGI.
Note: the AIFF format handler supports only one SSND chunk.
It does not support multiple audio chunks,
or the 8SVX musical instrument description format.
AIFF files are multimedia archives and
can have multiple audio and picture chunks.
You may need a separate archiver to work with them.
.TP
\&\fB.aiffc\fR, \fB.aifc\fR (also with \fB\-t sndfile\fR)
AIFF-C (not compressed, linear), defined in DAVIC 1.4 Part 9 Annex B.
This format is referenced by ARIB STD-B24, which is specified for
Japanese data broadcasting.
Private chunks are not supported.
.SP
Note: The input file is currently processed as .aiff.
.TP
.B alsa
ALSA device driver.
This is a pseudo-file type and can be optionally compiled into SoX.  Run
.EX
sox -h
.EE
to see if you have support for this file type.  When this driver is used
it allows you to open up an ALSA device and configure it to
use the same data format as passed in to SoX.
It works for both playing and recording audio files.
When playing audio files it attempts to set up the ALSA driver to use the same
format as the input file.  It is suggested to always override the output
values to use the highest quality format your ALSA system can handle.
Examples:
.EX
sox infile -t alsa
sox infile -t alsa default
sox infile -t alsa hw:0
sox -t alsa hw:1 outfile
.EE
.TP
.B .amb
Ambisonic B-Format: a specialisation of
.B .wav
with between 3 and 16 channels of audio for use with an Ambisonic decoder.
See http://www.ambisonia.com/Members/mleese/file-format-for-b-format for
details.  It is up to the user to get the channels together in the right
order and at the correct amplitude.
.TP
\&\fB.amr\-nb\fR
Adaptive Multi Rate\*mNarrow Band speech codec; a lossy format used in 3rd
generation mobile telephony and defined in 3GPP TS 26.071 et al.
.SP
AMR-NB audio has a fixed sampling rate of 8 kHz and supports encoding to the
following bit-rates (as selected by the
.B \-C
option): 0 = 4\*d75 kbit/s, 1 = 5\*d15 kbit/s, 2 = 5\*d9 kbit/s,
3 = 6\*d7 kbit/s, 4 = 7\*d4 kbit/s, 5 = 7\*d95 kbit/s,
6 = 10\*d2 kbit/s, 7 = 12\*d2 kbit/s.
.SP
This format in SoX is optional and requires access to external libraries.
To see if there is support for this format, enter
.EX
sox -h
.EE
and look for it under the list:
.IR "SUPPORTED FILE FORMATS" .
.TP
\&\fB.amr\-wb\fR
Adaptive Multi Rate\*mWide Band speech codec; a lossy format used in 3rd
generation mobile telephony and defined in 3GPP TS 26.171 et al.
.SP
AMR-WB audio has a fixed sampling rate of 16 kHz and supports encoding to the
following bit-rates (as selected by the
.B \-C
option): 0 = 6\*d6 kbit/s, 1 = 8\*d85 kbit/s, 2 = 12\*d65 kbit/s,
3 = 14\*d25 kbit/s, 4 = 15\*d85 kbit/s, 5 = 18\*d25 kbit/s,
6 = 19\*d85 kbit/s, 7 = 23\*d05 kbit/s, 8 = 23\*d85 kbit/s.
.SP
This format in SoX is optional and requires access to external libraries.
To see if there is support for this format on your system, enter
.EX
sox -h
.EE
and look for it under the list:
.IR "SUPPORTED FILE FORMATS" .
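.SP
As a rough sketch of the AMR-WB table above (not taken from the SoX
sources), the shell function below prints the bit-rate that a given
.B \-C
value selects.
The function name amr_wb_rate and the file names are placeholders chosen
for illustration; the encode step runs only if
.B sox
and the input file are present.
.EX
# Hypothetical helper (not part of SoX): print the AMR-WB bit-rate
# in kbit/s for a -C value, following the list in this entry.
amr_wb_rate() {
    case "$1" in
        0) echo 6.6 ;;
        1) echo 8.85 ;;
        2) echo 12.65 ;;
        3) echo 14.25 ;;
        4) echo 15.85 ;;
        5) echo 18.25 ;;
        6) echo 19.85 ;;
        7) echo 23.05 ;;
        8) echo 23.85 ;;
        *) echo "unknown -C value: $1" >&2; return 1 ;;
    esac
}

mode=8
echo "AMR-WB -C $mode selects $(amr_wb_rate "$mode") kbit/s"

# Encode only if SoX and the (assumed) input file are available.
if command -v sox >/dev/null 2>&1 && [ -f input.wav ]; then
    sox input.wav -C "$mode" output.amr-wb || echo "encode failed" >&2
fi
.EE
The same pattern applies to AMR-NB with its own eight values.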
.TP
.B ao
libao device driver.
This is a pseudo-file type and can be optionally compiled into SoX.  Run
.EX
sox -h
.EE
to see if you have support for this file type.
It works only for playing audio files.  It can
play to a wide range of devices and sound systems.  See its documentation
for the full range.  For the most part, SoX's use of libao cannot be
configured directly; you must use libao configuration files.
.SP
The filename specified is used to determine which libao plugin to
use.  Normally, you should specify "default" as the filename.  If that
doesn't give the desired behavior then you can specify the short name
for a given plugin (such as \fBpulse\fR for the PulseAudio plugin).
.TP
\&\fB.au\fR, \fB.snd\fR (also with \fB\-t sndfile\fR)
Sun Microsystems AU files.
There are many types of AU file;
DEC has invented its own with a different magic number and byte order.
SoX can read these files but will not write them.
Some .au files are known to have invalid AU headers; these
are probably original Sun \(*m-law 8000\ Hz files and can be dealt with using
the
.B .ul
format (see
.B .raw
above).
.SP
It is possible to override AU file header information
with the
.B \-r
and
.B \-c
options, in which case SoX will issue a warning to that effect.
.TP
.B .avr
Audio Visual Research.
The AVR format is produced by a number of commercial packages
on the Mac.
.TP
\&\fB.caf\fR (libsndfile)
Core Audio File format.
.TP
\&\fB.cdda\fR, \fB.cdr\fR
`Red Book' Compact Disc Digital Audio.
CDDA has two audio channels formatted as 16-bit
signed integers at a sample rate of 44\*d1\ kHz.  The number of (stereo)
samples in each CDDA track is always a multiple of 588, which is why it
needs its own handler.
.TP
\&\fB.cvsd\fR, \fB.cvs\fR
Continuously Variable Slope Delta modulation.
A headerless format used to compress speech audio for applications such as
voice mail.
This format is sometimes used with bit-reversed samples\*mthe
.B \-X
format option can be used to set the bit-order.
.TP
.B .dat
Text Data files.
These files contain a textual representation of the
sample data.  There is one line at the beginning
that contains the sample rate.  Subsequent lines
contain two numeric data items: the time since
the beginning of the first sample and the sample value.
Values are normalized so that the maximum and minimum
are 1 and \-1.  This file format can be used to
create data files for external programs such as
FFT analysers or graph routines.  SoX can also convert
a file in this format back into one of the other file
formats.
.TP
\&\fB.dvms\fR, \fB.vms\fR
Used in Germany to compress speech audio for voice mail.
A self-describing variant of
.BR cvsd .
.TP
\&\fB.fap\fR (libsndfile)
See
.BR .paf .
.TP
.B ffmpeg
This is a pseudo-type that forces ffmpeg to be used.  The actual file
type is deduced from the file name (it cannot be used on stdio).
This pseudo-type depends on SoX having been built with optional ffmpeg
support.  It can read a wide range of audio files, not all of which are
documented here, and also the audio track of many video files
(including AVI, WMV and MPEG).  At present only the first audio track of
a file can be read.
.TP
\&\fB.flac\fR (also with \fB\-t sndfile\fR)
Free Lossless Audio CODEC compressed audio.
FLAC is an open, patent-free CODEC designed for compressing
music.  It is similar to MP3 and Ogg Vorbis, but lossless,
meaning that audio is compressed in FLAC without any loss in
quality.
.SP
SoX can read native FLAC files (.flac) but not Ogg FLAC files (.ogg).
[But see
.B .ogg
below for information relating to support for Ogg
Vorbis files.]
.SP
SoX can write native FLAC files according to a given or default
compression level.  8 is the default compression level and gives the
best (but slowest) compression; 0 gives the least (but fastest)
compression.  The compression level is selected using the
.B \-C
option [see
.BR sox (1)]
with a whole number from 0 to 8.
.SP
FLAC support in SoX is optional and requires optional FLAC libraries.
To see if there is support for FLAC run
.EX
sox -h
.EE
and look for it under the list of supported file formats as `flac'.
.TP
.B .fssd
An alias for the
.B .ub
format.
.TP
\&\fB.gsm\fR (also with \fB\-t sndfile\fR)
GSM 06.10 Lossy Speech Compression.
A lossy format for compressing speech which is used in the
Global System for Mobile Communications (GSM).  It's good
for its purpose, shrinking audio data size, but it will introduce
lots of noise when a given audio signal is encoded and decoded
multiple times.  This format is used by some voice mail applications.
It is rather CPU intensive.
.SP
GSM in SoX is optional and requires access to an external GSM library.
To see if there is support for GSM run
.EX
sox -h
.EE
and look for it under the list of supported file formats.
.TP
.B .hcom
Macintosh HCOM files.
These are (apparently) Mac FSSD files with some variant
of Huffman compression.
The Macintosh has wacky file formats and this format
handler apparently doesn't handle all the ones it should.
Mac users will need their usual arsenal of file converters
to deal with an HCOM file on other systems.
.TP
.B .htk
Single channel 16-bit PCM format used by HTK,
a toolkit for building Hidden Markov Model speech processing tools.
.TP
\&\fB.ircam\fR (also with \fB\-t sndfile\fR)
Another name for
.BR .sf .
.TP
\&\fB.ima\fR (also with \fB\-t sndfile\fR)
A headerless file of IMA ADPCM audio data.  IMA ADPCM claims 16-bit precision
packed into only 4 bits, but in fact sounds no better than
.BR .vox .
.TP
\&\fB.lpc\fR, \fB.lpc10\fR
LPC-10 is a compression scheme for speech developed in the United
States.  See http://www.arl.wustl.edu/~jaf/lpc/ for details.  There is
no associated file format, so SoX's implementation is headerless.
.TP
\&\fB.mat\fR, \fB.mat4\fR, \fB.mat5\fR \fB(libsndfile)\fR
Matlab 4.2/5.0 (respectively GNU Octave 2.0/2.1) format (.mat is the same as .mat4).
.TP
.B .m3u
A
.I playlist
format; contains a list of audio files.
See [1] for details of this format.
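.SP
A minimal sketch, assuming the simplest form of M3U (one file path per
line, with lines starting with # treated as comments) and assuming that
.B play
accepts the playlist in place of an input file.
The file names are placeholders; playback is attempted only if
.B play
and the listed files are present.
.EX
# Write a small playlist; the audio file names are placeholders.
playlist=example.m3u
{
    echo "# example playlist"
    echo "intro.wav"
    echo "main.wav"
} > "$playlist"

# Count the entries, ignoring comment lines.
entries=$(grep -c -v "^#" "$playlist")
echo "$playlist lists $entries files"

# Play the list only if SoX's play command and the files exist.
if command -v play >/dev/null 2>&1 && [ -f intro.wav ] && [ -f main.wav ]; then
    play "$playlist"
fi
.EE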
.TP
.B .maud
An IFF-conforming audio file type, registered by
MS MacroSystem Computer GmbH, published along
with the `Toccata' sound-card on the Amiga.
Allows 8-bit linear, 16-bit linear, A-law, \(*m-law
in mono and stereo.
.TP
\&\fB.mp3\fR, \fB.mp2\fR
MP3 compressed audio.
MP3 (MPEG Layer 3) is part of the MPEG standards for
audio and video compression.  It is a lossy
compression format that achieves good compression rates with little
quality loss.  See also
.B Ogg Vorbis
for a similar format.
.SP
MP3 support in SoX is optional and requires access to either or both of the
external libmad and libmp3lame libraries.  To
see if there is support for MP3 run
.EX
sox -h
.EE
and look for it under the list of supported file formats as `mp3'.
.SP
.TP
\&\fB.mp4\fR, \fB.m4a\fR (ffmpeg)
MP4 compressed audio.
MP4 (MPEG 4) is part of the MPEG standards for
audio and video compression.  See
.B mp3
for more information.
.SP
MP4 support in SoX is optional and requires access to the external
ffmpeg libraries.
.TP
\&\fB.nist\fR (also with \fB\-t sndfile\fR)
See \fB.sph\fR.
.TP
\&\fB.ogg\fR, \fB.vorbis\fR
Ogg Vorbis compressed audio.
Ogg Vorbis is an open, patent-free CODEC designed for compressing music
and streaming audio.  It is a lossy compression format (similar to MP3,
VQF & AAC) that achieves good compression rates with a minimum amount of
quality loss.  See also
.B MP3
for a similar format.
.SP
SoX can decode all types of Ogg Vorbis files, and can encode at different
compression levels/qualities given as a number from \-1 (highest
compression/lowest quality) to 10 (lowest compression, highest quality).
By default the encoding quality level is 3 (which gives an encoded rate
of approx. 112\ kbps), but this can be changed using the
.B \-C
option [see
.BR sox (1)]
with a number from \-1 to 10; fractional numbers (e.g. 3\*d6) are also allowed.
.SP
Decoding is somewhat CPU intensive and encoding is very CPU intensive.
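.SP
To illustrate the quality range described above, the sketch below checks
that a value lies between \-1 and 10 (fractions allowed) before passing it to
.BR \-C .
The function name valid_vorbis_quality and the file names are
placeholders for illustration, not part of SoX; the encode step runs
only if
.B sox
and the input file are present.
.EX
# Hypothetical helper: succeed if $1 is a number from -1 to 10.
# awk checks the syntax and performs the numeric comparison.
valid_vorbis_quality() {
    echo "$1" | awk '/^-?[0-9]+([.][0-9]+)?$/ && $1 >= -1 && $1 <= 10 { ok = 1 } END { exit !ok }'
}

quality=3.6
if valid_vorbis_quality "$quality"; then
    echo "quality $quality is in range"
    # Encode only if SoX and the (assumed) input file are available.
    if command -v sox >/dev/null 2>&1 && [ -f input.wav ]; then
        sox input.wav -C "$quality" output.ogg || echo "encode failed" >&2
    fi
else
    echo "quality $quality is out of range" >&2
fi
.EE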
.SP
Ogg Vorbis in SoX is optional and requires access to external Ogg Vorbis
libraries.  To see if there is support for Ogg Vorbis run
.EX
sox -h
.EE
and look for it under the list of supported file formats as `vorbis'.
.TP
.B oss
OSS /dev/dsp device driver.
This is a pseudo-file type that can be optionally compiled into SoX.  Run
.EX
sox -h
.EE
to see if it is supported.  When this driver is used it allows you to
play and record sounds on supported systems.  When playing audio
files it attempts to set up the OSS driver to use the same format as the
input file.  It is suggested to always override the output
values to use the highest quality format your OSS system can handle.
Example:
.EX
sox infile -t oss -2 -s /dev/dsp
.EE
.TP
\&\fB.paf\fR, \fB.fap\fR (libsndfile)
Ensoniq PARIS file format (big and little-endian respectively).
.TP
.B .pls
A
.I playlist
format; contains a list of audio files.
See [2] for details of this format.
.SP
Note: SHOUTcast PLS relies on
.BR wget (1)
and is only partially supported: it's necessary to
specify the audio type manually, e.g.
.EX
play -t mp3 \(dqhttp://a.server/pls?rn=265&file=filename.pls\(dq
.EE
and SoX does not know about alternative servers\*mhit Ctrl-C twice in
quick succession to quit.
.TP
.B .prc
Psion Record.  Used in Psion EPOC PDAs (Series 5, Revo and similar) for
System alarms and recordings made by the built-in Record application.
When writing, SoX defaults to A-law, which is recommended; if you must
use ADPCM, then use the \fB\-i\fR switch.  The sound quality is poor
because Psion Record seems to insist on frames of 800 samples or
fewer, so that the ADPCM CODEC has to be reset every 800 samples,
which causes the sound to glitch every tenth of a second.
.TP
\&\fB.pvf\fR (libsndfile)
Portable Voice Format.
.TP
\&\fB.sd2\fR (libsndfile)
Sound Designer 2 format.
.TP
\&\fB.sds\fR (libsndfile)
MIDI Sample Dump Standard.
.TP
\&\fB.sf\fR (also with \fB\-t sndfile\fR)
IRCAM SDIF (Institut de Recherche et Coordination Acoustique/Musique
Sound Description Interchange Format).  Used by academic music software
such as the CSound package, and the MixView sound sample editor.
.TP
\&\fB.sph\fR, \fB.nist\fR (also with \fB\-t sndfile\fR)
SPHERE (SPeech HEader Resources) is a file format defined by NIST
(National Institute of Standards and Technology) and is used with
speech audio.  SoX can read these files when they contain
\(*m-law and PCM data.  It will ignore any header information that
says the data is compressed using \fIshorten\fR compression and
will treat the data as either \(*m-law or PCM.  This allows SoX
and the command line \fIshorten\fR program to be run together using
pipes: \fIshorten\fR uncompresses the data and then passes the result
to SoX for processing.
.TP
.B .smp
Turtle Beach SampleVision files.
SMP files are for use with the PC-DOS package SampleVision by Turtle Beach
Softworks.  This package is for communication to several MIDI samplers.  All
sample rates are supported by the package, although not all are supported by
the samplers themselves.  Currently loop points are ignored.
.TP
.B .snd
See
.BR .au .
.TP
.B sndfile
This is a pseudo-type that forces libsndfile to be used.  For writing files, the
actual file type is then taken from the output file name; for reading
them, it is deduced from the file.  This pseudo-type depends on SoX
having been built with optional libsndfile support.
.TP
.B .sndt
Sndtool files.
This format dates from the MS-DOS era.  Bizarrely, this file type can also be
used to read Sounder files.
.TP
.B .sou
An alias for the
.B .ub
format.
.TP
.B sunau
Sun /dev/audio device driver.
This is a pseudo-file type and can be optionally compiled into SoX.  Run
.EX
sox -h
.EE
to see if you have support for this file type.  When this driver is used
it allows you to open up a Sun /dev/audio file and configure it to
use the same data type as passed in to SoX.
It works for both playing and recording audio files.  When playing audio
files it attempts to set up the audio driver to use the same format as the
input file.  It is suggested to always override the output
values to use the highest quality format your hardware can handle.
Example:
.EX
sox infile -t sunau -2 -s /dev/audio
.EE
or
.EX
sox infile -t sunau -U -c 1 /dev/audio
.EE
for older Sun equipment.
.TP
.B .txw
Yamaha TX-16W sampler.
A file format from a Yamaha sampling keyboard which wrote IBM-PC
format 3\*d5\(dq floppies.  Handles reading of files which do not have
the sample rate field set to one of the expected values by looking at some
other bytes in the attack/loop length fields, and defaulting to
33\ kHz if the sample rate is still unknown.
.TP
.B .vms
See
.BR .dvms .
.TP
\&\fB.voc\fR (also with \fB\-t sndfile\fR)
Sound Blaster VOC files.
VOC files are multi-part and contain silence parts, looping, and
different sample rates for different chunks.
On input, the silence parts are filled out, loops are rejected,
and sample data with a new sample rate is rejected.
Silence with a different sample rate is generated appropriately.
On output, silence is not detected, nor are impossible sample rates.
SoX supports reading (but not writing) VOC files with multiple
blocks, and files containing \(*m-law, A-law, and 2/3/4-bit ADPCM samples.
.TP
.B .vorbis
See
.BR .ogg .
.TP
\&\fB.vox\fR (also with \fB\-t sndfile\fR)
A headerless file of Dialogic/OKI ADPCM audio data commonly comes with the
extension .vox.  This ADPCM data has 12-bit precision packed into only 4 bits.
.SP
Note: some early Dialogic hardware does not always reset the ADPCM
encoder at the start of each vox file.  This can result in clipping
and/or DC offset problems when it comes to decoding the audio.
Whilst little can be done about the clipping, a DC offset can be removed by
passing the decoded audio through a high-pass filter, e.g.:
.EX
sox input.vox output.au highpass 10
.EE
.TP
\&\fB.w64\fR (libsndfile)
Sonic Foundry's 64-bit RIFF/WAV format.
.TP
\&\fB.wav\fR (also with \fB\-t sndfile\fR)
Microsoft .WAV RIFF files.
This is the native audio file format of Windows, and widely used for
uncompressed audio.
.SP
Normally \fB.wav\fR files have all formatting information
in their headers, and so do not need any format options
specified for an input file.  If any are, they will
override the file header, and you will be warned to this effect.
You had better know what you are doing!  Output format
options will cause a format conversion, and the \fB.wav\fR
will be written appropriately.
.SP
SoX can currently read PCM, \(*m-law, A-law, MS ADPCM, and IMA (or DVI) ADPCM.
It can write all of these formats including the ADPCM encoding.
Big endian versions of RIFF files, called RIFX, can also be read
and written.  To write a RIFX file, use the
.B \-B
option with the output file options.
.TP
.B .wavpcm
A non-standard variant of
.BR .wav .
Some applications cannot read a standard WAV file header for PCM-encoded
data with sample-size greater than 16-bits or with more than two
channels, but can read a non-standard
WAV header.  It is likely that such applications will eventually be
updated to support the standard header, but in the meantime, this SoX
format can be used to create files with the non-standard header that
should work with these applications.  (Note that SoX will automatically
detect and read WAV files with the non-standard header.)
.TP
\&\fB.wve\fR (also with \fB\-t sndfile\fR)
Psion 8-bit A-law.  Used on Psion SIBO PDAs (Series 3 and similar).
This format is deprecated in SoX, but will continue to be used in
libsndfile.
.TP
.B .xa
Maxis XA files.
These are 16-bit ADPCM audio files used by Maxis games.
Writing .xa files is currently not supported, although adding write support
should not be very difficult.
.TP
\&\fB.xi\fR (libsndfile)
Fasttracker 2 Extended Instrument format.
.SH SEE ALSO
.BR sox (1),
.BR soxi (1),
.BR soxeffect (7),
.BR libsox (3),
.BR octave (1),
.BR soxexam (7),
.BR wget (1)
.SP
The SoX web page at http://sox.sourceforge.net
.SS References
.TP
[1]
Wikipedia,
.IR "M3U" ,
http://en.wikipedia.org/wiki/M3U
.TP
[2]
Wikipedia,
.IR "PLS" ,
http://en.wikipedia.org/wiki/PLS_(file_format)
.SH AUTHORS
Chris Bagwell (cbagwell@users.sourceforge.net).
Other authors and contributors are listed in the AUTHORS file that
is distributed with the source code.