audio(7I)
NAME
audio - generic audio device interface
SYNOPSIS
#include <sys/audio.h>
OVERVIEW
An audio device is used to play and/or record a stream of
audio data. Since a specific audio device may not support
all functionality described below, refer to the device-
specific manual pages for a complete description of each
hardware device. An application can use the AUDIO_GETDEV
ioctl(2) to determine the current audio hardware associated
with /dev/audio.
AUDIO FORMATS
Digital audio data represents a quantized approximation of
an analog audio signal waveform. In the simplest case, these
quantized numbers represent the amplitude of the input
waveform at particular sampling intervals. To achieve the
best approximation of an input signal, the highest possible
sampling frequency and precision should be used. However,
increased accuracy comes at a cost of increased data storage
requirements. For instance, one minute of monaural audio
recorded in u-Law format (pronounced mew-law) at 8 KHz
requires nearly 0.5 megabytes of storage, while the standard
Compact Disc audio format (stereo 16-bit linear PCM data
sampled at 44.1 KHz) requires approximately 10 megabytes per
minute.
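The storage figures above follow from multiplying the sample
rate, the channel count, the bytes per sample, and 60 seconds.
The following is a minimal sketch of that arithmetic. The
function name audio_bytes_per_minute is hypothetical and is not
part of the audio interface; the sketch assumes uncompressed
samples that occupy a whole number of bytes.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of <sys/audioio.h>): number of
 * bytes needed to store one minute of audio in the given format.
 */
unsigned long
audio_bytes_per_minute(unsigned long sample_rate, unsigned int channels,
    unsigned int precision)
{
    /* This sketch only handles whole-byte sample sizes. */
    assert(precision > 0 && precision % 8 == 0);
    return (sample_rate * channels * (precision / 8) * 60UL);
}
```

For 8 KHz monaural u-Law data (8-bit samples) this gives 480000
bytes per minute; for Compact Disc audio (44100 samples per
second, 2 channels, 16-bit samples) it gives 10584000 bytes per
minute.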
Audio data may be represented in several different formats.
An audio device's current audio data format can be deter-
mined by using the AUDIO_GETINFO ioctl(2) described below.
An audio data format is characterized in the audio driver by
four parameters: Sample Rate, Encoding, Precision, and
Channels. Refer to the device-specific manual pages for a
list of the audio formats that each device supports. In
addition to the formats that the audio device supports
directly, other formats provide higher data compression.
Applications may convert audio data to and from these for-
mats when playing or recording.
Sample Rate
Sample rate is a number that represents the sampling fre-
quency (in samples per second) of the audio data.
Encodings
An encoding parameter specifies the audio data representa-
tion. u-Law encoding corresponds to CCITT G.711, and is the
standard for voice data used by telephone companies in the
United States, Canada, and Japan. A-Law encoding is also
part of CCITT G.711 and is the standard encoding for
telephony elsewhere in the world. A-Law and u-Law audio data
are sampled at a rate of 8000 samples per second with 12-bit
precision, with the data compressed to 8-bit samples. The
resulting audio data quality is equivalent to that of stan-
dard analog telephone service.
Linear Pulse Code Modulation (PCM) is an uncompressed,
signed audio format in which sample values are directly pro-
portional to audio signal voltages. Each sample is a 2's
complement number that represents a positive or negative
amplitude.
Precision
Precision indicates the number of bits used to store each
audio sample. For instance, u-Law and A-Law data are stored
with 8-bit precision. PCM data may be stored at various pre-
cisions, though 16-bit is the most common.
Channels
Multiple channels of audio may be interleaved at sample
boundaries. A sample frame consists of a single sample from
each active channel. For example, a sample frame of stereo
16-bit PCM data consists of two 16-bit samples, corresponding
to the left and right channel data.
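As an illustration of the interleaved layout, the following
sketch splits a buffer of stereo 16-bit PCM sample frames into
separate per-channel buffers. The function name is hypothetical,
and the sketch assumes that the left sample precedes the right
sample in each frame and that the samples are already in host
byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: de-interleave stereo 16-bit PCM data.
 * Frame i is assumed to occupy in[2 * i] (left) and
 * in[2 * i + 1] (right).
 */
void
deinterleave_stereo16(const int16_t *in, size_t frames,
    int16_t *left, int16_t *right)
{
    size_t i;

    assert(in != NULL && left != NULL && right != NULL);
    for (i = 0; i < frames; i++) {
        left[i] = in[2 * i];
        right[i] = in[2 * i + 1];
    }
}
```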
DESCRIPTION
The device /dev/audio is a device driver that dispatches
audio requests to the appropriate underlying audio hardware.
The audio driver is implemented as a STREAMS driver. In
order to record audio input, applications open(2) the
/dev/audio device and read data from it using the read(2)
system call. Similarly, sound data is queued to the audio
output port by using the write(2) system call. Device con-
figuration is performed using the ioctl(2) interface.
Alternatively, opening /dev/audio may open a mixing audio
driver that provides a superset of this audio interface.
The audio mixer removes the exclusive resource restriction,
allowing multiple processes to play and record audio at the
same time. See the mixer(7I) and audio_support(7I) manual
pages for more information.
Because some systems may contain more than one audio device,
application writers are encouraged to query the AUDIODEV
environment variable. If this variable is present in the
environment, its value should identify the path name of the
default audio device.
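A minimal sketch of the AUDIODEV lookup follows. The function
name audio_device_path is hypothetical. Falling back to
/dev/audio when the variable is unset, and treating an empty
value as unset, are assumptions of this sketch rather than
documented behavior.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the path named by AUDIODEV, or
 * "/dev/audio" if the variable is unset or empty. The returned
 * pointer must not be modified or freed by the caller.
 */
const char *
audio_device_path(void)
{
    const char *dev = getenv("AUDIODEV");

    if (dev == NULL || strlen(dev) == 0)
        return ("/dev/audio");
    return (dev);
}
```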
Opening the Audio Device
The audio device is treated as an exclusive resource, mean-
ing that only one process can open the device at a time.
However, if the DUPLEX bit is set in the hw_features field
of the audio information structure, two processes may simul-
taneously access the device. This allows one process to
open the device as read-only and a second process to open it
as write-only. See below for details.
When a process cannot open /dev/audio because the device is
busy:
o if either the O_NDELAY or O_NONBLOCK flags are set in
the open() oflag argument, then -1 is immediately
returned, with errno set to EBUSY.
o if neither the O_NDELAY nor the O_NONBLOCK flag is
set, then open() hangs until the device is available
or a signal is delivered to the process, in which case
a -1 is returned with errno set to EINTR. This allows
a process to block in the open call while waiting for
the audio device to become available.
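The non-blocking case above might be handled along the following
lines. This is a sketch only: the function and enumeration names
are hypothetical, and the caller decides what to do when the
device is busy (for example, retry later or report the
condition to the user).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch. */
enum audio_open_result {
    AUDIO_OPEN_OK,      /* device opened, *fd_out is valid */
    AUDIO_OPEN_BUSY,    /* open() failed with EBUSY */
    AUDIO_OPEN_FAILED   /* open() failed for another reason */
};

/*
 * Hypothetical helper: attempt a non-blocking open of the audio
 * device. access is O_RDONLY, O_WRONLY, or O_RDWR. On failure
 * *fd_out is set to -1 and errno is left as set by open().
 */
enum audio_open_result
try_open_audio(const char *path, int access, int *fd_out)
{
    assert(path != NULL && fd_out != NULL);
    *fd_out = open(path, access | O_NONBLOCK);
    if (*fd_out >= 0)
        return (AUDIO_OPEN_OK);
    return (errno == EBUSY ? AUDIO_OPEN_BUSY : AUDIO_OPEN_FAILED);
}
```

Note that O_NONBLOCK remains set on the returned descriptor, so
later read() and write() calls are also non-blocking unless the
flag is cleared with fcntl(2).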
Upon the initial open() of the audio device, the driver
resets the data format of the device to the default state of
8-bit, 8 KHz, mono u-Law data. If the device is already open
and a different audio format has been set, this reset will not
be possible on some devices. Audio applications should explicitly
set the encoding characteristics to match the audio data
requirements rather than depend on the default configura-
tion.
Since the audio device grants exclusive read or write access
to a single process at a time, long-lived audio applications
may choose to close the device when they enter an idle state
and reopen it when required. The play.waiting and
record.waiting flags in the audio information structure (see
below) provide an indication that another process has
requested access to the device. For instance, a background
audio output process may choose to relinquish the audio dev-
ice whenever another process requests write access.
Recording Audio Data
The read() system call copies data from the system's buffers
to the application. Ordinarily, read() blocks until the user
buffer is filled. The I_NREAD ioctl (see streamio(7I)) may
be used to determine the amount of data that may be read
without blocking. The device may alternatively be set to a
non-blocking mode, in which case read() completes immedi-
ately, but may return fewer bytes than requested. Refer to
the read(2) manual page for a complete description of this
behavior.
When the audio device is opened with read access, the device
driver immediately starts buffering audio input data. Since
this consumes system resources, processes that do not record
audio data should open the device write-only (O_WRONLY).
The transfer of input data to STREAMS buffers may be paused
(or resumed) by using the AUDIO_SETINFO ioctl to set (or
clear) the record.pause flag in the audio information struc-
ture (see below). All unread input data in the STREAMS
queue may be discarded by using the I_FLUSH STREAMS ioctl
(see streamio(7I)). When changing record parameters, the
input stream should be paused and flushed before the change,
and resumed afterward. Otherwise, subsequent reads may
return samples in the old format followed by samples in the
new format. This is particularly important when new parame-
ters result in a changed sample size.
Input data can accumulate in STREAMS buffers very quickly.
At a minimum, it will accumulate at 8000 bytes per second
for 8-bit, 8 KHz, mono, u-Law data. If the device is config-
ured for 16-bit linear or higher sample rates, it will accu-
mulate even faster. If the application that consumes the
data cannot keep up with this data rate, the STREAMS queue
may become full. When this occurs, the record.error flag is
set in the audio information structure and input sampling
ceases until there is room in the input queue for additional
data. In such cases, the input data stream contains a
discontinuity. For this reason, audio recording applications
should open the audio device when they are prepared to begin
reading data, rather than at the start of extensive initial-
ization.
Playing Audio Data
The write() system call copies data from an application's
buffer to the STREAMS output queue. Ordinarily, write()
blocks until the entire user buffer is transferred. The dev-
ice may alternatively be set to a non-blocking mode, in
which case write() completes immediately, but may have
transferred fewer bytes than requested (see write(2)).
Although write() returns when the data is successfully
queued, the actual completion of audio output may take con-
siderably longer. The AUDIO_DRAIN ioctl may be issued to
allow an application to block until all of the queued output
data has been played. Alternatively, a process may request
asynchronous notification of output completion by writing a
zero-length buffer (end-of-file record) to the output
stream. When such a buffer has been processed, the play.eof
flag in the audio information structure (see below) is
incremented.
The final close(2) of the file descriptor hangs until all of
the audio output has drained. If a signal interrupts the
close(), or if the process exits without closing the device,
any remaining data queued for audio output is flushed and
the device is closed immediately.
The consumption of output data may be paused (or resumed) by
using the AUDIO_SETINFO ioctl to set (or clear) the
play.pause flag in the audio information structure. Queued
output data may be discarded by using the I_FLUSH STREAMS
ioctl (see streamio(7I)).
Output data is played from the STREAMS buffers at a default
rate of at least 8000 bytes per second for u-Law, A-Law or
8-bit PCM data (faster for 16-bit linear data or higher sam-
pling rates). If the output queue becomes empty, the
play.error flag is set in the audio information structure
and output is stopped until additional data is written. If
an application attempts to write a number of bytes that is
not a multiple of the current sample frame size, an error is
generated and the bad data is thrown away. Additional writes
are allowed.
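One way to avoid the partial-frame error described above is to
round each write length down to a whole number of sample frames
and carry the remaining bytes over to the next write. The
following sketch shows the rounding step. The function name is
hypothetical, and the sketch assumes whole-byte sample sizes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: largest byte count not exceeding nbytes
 * that is a multiple of the sample frame size, where the frame
 * size is channels * (precision / 8) bytes.
 */
size_t
whole_frame_bytes(size_t nbytes, unsigned int channels,
    unsigned int precision)
{
    size_t frame;

    assert(channels > 0 && precision > 0 && precision % 8 == 0);
    frame = (size_t)channels * (precision / 8);
    return (nbytes - nbytes % frame);
}
```

For stereo 16-bit data the frame size is 4 bytes, so a 1001-byte
buffer yields 1000 bytes to write now and 1 byte to hold back.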
Asynchronous I/O
The I_SETSIG STREAMS ioctl enables asynchronous notifica-
tion, through the SIGPOLL signal, of input and output ready
condition changes. The O_NONBLOCK flag may be set using the
F_SETFL fcntl(2) to enable non-blocking read() and write()
requests. This is normally sufficient for applications to
maintain an audio stream in the background.
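Setting O_NONBLOCK on an already open descriptor can be sketched
as follows. The function name set_nonblocking is hypothetical;
the fcntl(2) calls are standard. The existing status flags are
read first so that they are preserved.

```c
#include <fcntl.h>

/*
 * Hypothetical helper: enable non-blocking I/O on fd.
 * Returns 0 on success or -1 on failure, with errno set
 * by fcntl().
 */
int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return (-1);
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return (-1);
    return (0);
}
```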
Audio Control Pseudo-Device
It is sometimes convenient to have an application, such as a
volume control panel, modify certain characteristics of the
audio device while it is being used by an unrelated process.
The /dev/audioctl pseudo-device is provided for this pur-
pose. Any number of processes may open /dev/audioctl simul-
taneously. However, read() and write() system calls are
ignored by /dev/audioctl. The AUDIO_GETINFO and
AUDIO_SETINFO ioctl commands may be issued to /dev/audioctl
to determine the status or alter the behavior of /dev/audio.
Note: In general, the audio control device name is con-
structed by appending the letters "ctl" to the path name of
the audio device.
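The naming convention in the note above can be sketched as
follows. The function name audio_ctl_path is hypothetical; it
simply appends "ctl" to the supplied device path and reports an
error if the caller's buffer is too small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the control device name for dev
 * in buf, which holds len bytes. Returns 0 on success, or -1
 * if the result (including the terminating null) does not fit.
 */
int
audio_ctl_path(const char *dev, char *buf, size_t len)
{
    assert(dev != NULL && buf != NULL);
    /* sizeof ("ctl") counts the suffix and the terminating null. */
    if (strlen(dev) + sizeof ("ctl") > len)
        return (-1);
    (void) snprintf(buf, len, "%sctl", dev);
    return (0);
}
```

With this sketch, /dev/audio maps to /dev/audioctl and
/dev/sound/0 maps to /dev/sound/0ctl.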
Audio Status Change Notification
Applications that open the audio control pseudo-device may
request asynchronous notification of changes in the state of
the audio device by setting the S_MSG flag in an I_SETSIG
STREAMS ioctl. Such processes receive a SIGPOLL signal when
any of the following events occur:
o An AUDIO_SETINFO ioctl has altered the device state.
o An input overflow or output underflow has occurred.
o An end-of-file record (zero-length buffer) has been
processed on output.
o An open() or close() of /dev/audio has altered the
device state.
o An external event (such as speakerbox's volume con-
trol) has altered the device state.
IOCTLS
Audio Information Structure
The state of the audio device may be polled or modified
using the AUDIO_GETINFO and AUDIO_SETINFO ioctl commands.
These commands operate on the audio_info structure, which is
defined in <sys/audioio.h> as follows:
/*
* This structure contains state information for audio device
* IO streams
*/
struct audio_prinfo {
/*
* The following values describe the
* audio data encoding
*/
uint_t sample_rate; /* samples per second */
uint_t channels; /* number of interleaved channels */
uint_t precision; /* number of bits per sample */
uint_t encoding; /* data encoding method */
/*
* The following values control audio device
* configuration
*/
uint_t gain; /* volume level */
uint_t port; /* selected I/O port */
uint_t buffer_size; /* I/O buffer size */
/*
* The following values describe the current device
* state
*/
uint_t samples; /* number of samples converted */
uint_t eof; /* End Of File counter (play only) */
uchar_t pause; /* non-zero if paused, zero to resume */
uchar_t error; /* non-zero if overflow/underflow */
uchar_t waiting; /* non-zero if a process wants access */
uchar_t balance; /* stereo channel balance */
/*
* The following values are read-only device state
* information
*/
uchar_t open; /* non-zero if open access granted */
uchar_t active; /* non-zero if I/O active */
uint_t avail_ports; /* available I/O ports */
uint_t mod_ports; /* modifiable I/O ports */
};
typedef struct audio_prinfo audio_prinfo_t;
/*
* This structure is used in AUDIO_GETINFO and AUDIO_SETINFO ioctl
* commands
*/
struct audio_info {
audio_prinfo_t record; /* input status info */
audio_prinfo_t play; /* output status info */
uint_t monitor_gain; /* input to output mix */
uchar_t output_muted; /* non-zero if output muted */
uint_t hw_features; /* supported H/W features */
uint_t sw_features; /* supported S/W features */
uint_t sw_features_enabled;
/* supported S/W features enabled */
};
typedef struct audio_info audio_info_t;
/* Audio encoding types */
#define AUDIO_ENCODING_ULAW (1) /* u-Law encoding */
#define AUDIO_ENCODING_ALAW (2) /* A-Law encoding */
#define AUDIO_ENCODING_LINEAR (3) /* Signed Linear PCM encoding */
/*
* These ranges apply to record, play, and
* monitor gain values
*/
#define AUDIO_MIN_GAIN (0) /* minimum gain value */
#define AUDIO_MAX_GAIN (255) /* maximum gain value */
/*
* These values apply to the balance field to adjust channel
* gain values
*/
#define AUDIO_LEFT_BALANCE (0) /* left channel only */
#define AUDIO_MID_BALANCE (32) /* equal left/right balance */
#define AUDIO_RIGHT_BALANCE (64) /* right channel only */
/*
* Define some convenient audio port names
* (for port, avail_ports and mod_ports)
*/
/* output ports (several might be enabled at once) */
#define AUDIO_SPEAKER (0x01) /* built-in speaker */
#define AUDIO_HEADPHONE (0x02) /* headphone jack */
#define AUDIO_LINE_OUT (0x04) /* line out */
#define AUDIO_SPDIF_OUT (0x08) /* SPDIF port */
#define AUDIO_AUX1_OUT (0x10) /* aux1 out */
#define AUDIO_AUX2_OUT (0x20) /* aux2 out */
/* input ports (usually only one may be
* enabled at a time)
*/
#define AUDIO_MICROPHONE (0x01) /* microphone */
#define AUDIO_LINE_IN (0x02) /* line in */
#define AUDIO_CD (0x04) /* on-board CD inputs */
#define AUDIO_SPDIF_IN (0x08) /* SPDIF port */
#define AUDIO_AUX1_IN (0x10) /* aux1 in */
#define AUDIO_AUX2_IN (0x20) /* aux2 in */
#define AUDIO_CODEC_LOOPB_IN (0x40) /* Codec inter.loopback */
/* These defines are for hardware features */
#define AUDIO_HWFEATURE_DUPLEX (0x00000001u)
/*simult. play & cap. supported */
#define AUDIO_HWFEATURE_MSCODEC (0x00000002u)
/* multi-stream Codec */
/* These defines are for software features */
#define AUDIO_SWFEATURE_MIXER (0x00000001u)
/* audio mixer audio pers. mod. */
/*
* Parameter for the AUDIO_GETDEV ioctl
* to determine current audio devices
*/
#define MAX_AUDIO_DEV_LEN (16)
struct audio_device {
char name[MAX_AUDIO_DEV_LEN];
char version[MAX_AUDIO_DEV_LEN];
char config[MAX_AUDIO_DEV_LEN];
};
typedef struct audio_device audio_device_t;
The play.gain and record.gain fields specify the output and
input volume levels. A value of AUDIO_MAX_GAIN indicates
maximum volume. Audio output may also be temporarily muted
by setting a non-zero value in the output_muted field.
Clearing this field restores audio output to the normal
state. Most audio devices allow input data to be monitored
by mixing audio input onto the output channel. The
monitor_gain field controls the level of this feedback path.
The play.port field controls the output path for the audio
device. It can be set to either AUDIO_SPEAKER (built-in
speaker), AUDIO_HEADPHONE (headphone jack), AUDIO_LINE_OUT
(line-out port), AUDIO_AUX1_OUT (auxiliary1 out), or
AUDIO_AUX2_OUT (auxiliary2 out). For some devices, it may be
set to a combination of these ports. The play.avail_ports
field returns the set of output ports that are currently
accessible. The play.mod_ports field returns the set of out-
put ports that may be turned on and off. If a port is miss-
ing from play.mod_ports then that port is assumed to always
be on.
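Because the port values are bit masks, an application can check
a requested combination against play.avail_ports before issuing
AUDIO_SETINFO. The following is a sketch of that check. The
function name is hypothetical, and the port definitions are
repeated here (with the values listed above) only so that the
example stands alone; real code would take them from
<sys/audioio.h>.

```c
/* Values as listed above; normally supplied by <sys/audioio.h>. */
#ifndef AUDIO_SPEAKER
#define AUDIO_SPEAKER   (0x01)
#define AUDIO_HEADPHONE (0x02)
#define AUDIO_LINE_OUT  (0x04)
#define AUDIO_SPDIF_OUT (0x08)
#endif

/*
 * Hypothetical helper: non-zero if every port in requested is
 * present in avail_ports and at least one port is requested.
 */
int
play_ports_available(unsigned int requested, unsigned int avail_ports)
{
    return (requested != 0 && (requested & ~avail_ports) == 0);
}
```

A device may still reject a combination of individually
available ports, so the result of the AUDIO_SETINFO ioctl should
be checked regardless.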
The record.port field controls the input path for the audio
device. It can be either AUDIO_MICROPHONE (microphone jack),
AUDIO_LINE_IN (line-in port), AUDIO_CD (internal CD-ROM),
AUDIO_AUX1_IN (auxiliary1 in), AUDIO_AUX2_IN (auxiliary2 in),
or AUDIO_CODEC_LOOPB_IN (internal loopback). The
record.avail_ports field returns the set of input ports that
are currently accessible. The record.mod_ports field returns
the set of input ports that may be turned on and off. If a
port is missing from record.mod_ports, it is assumed to
always be on. Input ports are considered to be mutually
exclusive.
The play.balance and record.balance fields are used to con-
trol the volume between the left and right channels when
manipulating stereo data. When the value is set between
AUDIO_LEFT_BALANCE and AUDIO_MID_BALANCE, the right channel
volume will be reduced in proportion to the balance value.
Conversely, when balance is set between AUDIO_MID_BALANCE
and AUDIO_RIGHT_BALANCE, the left channel will be propor-
tionally reduced.
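The proportional reduction described above might be modeled as
follows. This is a sketch only: the function name is
hypothetical, and the linear scaling is an assumption, since the
exact mapping from balance to per-channel volume is up to each
device driver. The balance definitions repeat the values listed
above so that the example stands alone.

```c
#include <assert.h>

/* Values as listed above; normally supplied by <sys/audioio.h>. */
#ifndef AUDIO_MID_BALANCE
#define AUDIO_LEFT_BALANCE  (0)
#define AUDIO_MID_BALANCE   (32)
#define AUDIO_RIGHT_BALANCE (64)
#endif

/*
 * Hypothetical model: derive left and right channel levels from
 * an overall gain and a balance value, assuming linear scaling.
 */
void
balance_to_channel_gains(unsigned int gain, unsigned int balance,
    unsigned int *left, unsigned int *right)
{
    assert(left != 0 && right != 0);
    assert(balance <= AUDIO_RIGHT_BALANCE);

    *left = gain;
    *right = gain;
    if (balance < AUDIO_MID_BALANCE) {
        /* Toward the left: attenuate the right channel. */
        *right = gain * balance / AUDIO_MID_BALANCE;
    } else if (balance > AUDIO_MID_BALANCE) {
        /* Toward the right: attenuate the left channel. */
        *left = gain * (AUDIO_RIGHT_BALANCE - balance) /
            AUDIO_MID_BALANCE;
    }
}
```

Under this model, a gain of 200 with a balance of 16 leaves the
left channel at 200 and reduces the right channel to 100.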
The play.pause and record.pause flags may be used to pause
and resume the transfer of data between the audio device and
the STREAMS buffers. The play.error and record.error flags
indicate that data underflow or overflow has occurred. The
play.active and record.active flags indicate that data
transfer is currently active in the corresponding direction.
The play.open and record.open flags indicate that the device
is currently open with the corresponding access permission.
The play.waiting and record.waiting flags provide an indica-
tion that a process may be waiting to access the device.
These flags are set automatically when a process blocks on
open(), though they may also be set using the AUDIO_SETINFO
ioctl command. They are cleared only when a process relinqu-
ishes access by closing the device.
The play.samples and record.samples fields are zeroed at
open() and are incremented each time a data sample is copied
to or from the associated STREAMS queue. Some audio drivers
may be limited to counting buffers of samples, instead of
single samples for their samples accounting. For this
reason, applications should not assume that the samples
fields contain a perfectly accurate count. The play.eof
field increments whenever a zero-length output buffer is
synchronously processed. Applications may use this field to
detect the completion of particular segments of audio out-
put.
The record.buffer_size field controls the amount of input
data that is buffered in the device driver during record
operations. Applications that have particular requirements
for low latency should set the value appropriately. Note
however that smaller input buffer sizes may result in higher
system overhead. The value of this field is specified in
bytes and drivers will constrain it to be a multiple of the
current sample frame size. Some drivers may place other
requirements on the value of this field. Refer to the audio
device-specific manual page for more details. If an applica-
tion changes the format of the audio device and does not
modify the record.buffer_size field, the device driver may
use a default value to compensate for the new data rate.
Therefore, if an application is going to modify this field,
it should modify it during or after the format change
itself, not before. When changing the record.buffer_size
parameters, the input stream should be paused and flushed
before the change, and resumed afterward. Otherwise, subse-
quent reads may return samples in the old format followed by
samples in the new format. This is particularly important
when new parameters result in a changed sample size. If you
change the record.buffer_size for the first packet, this
protocol must be followed or the first buffer will be the
default buffer size for the device, followed by packets of
the requested change size.
The record.buffer_size field may be modified only on the
/dev/audio device by processes that have it opened for read-
ing.
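An application with a latency target can derive a candidate
record.buffer_size from the current data format. The following
sketch converts a latency in milliseconds to a byte count that
is a multiple of the sample frame size. The function name is
hypothetical, the sketch assumes whole-byte sample sizes, and a
driver may still adjust or reject the value.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of bytes holding latency_ms
 * milliseconds of audio, rounded down to whole sample frames
 * (minimum one frame).
 */
unsigned int
record_buffer_bytes(unsigned int sample_rate, unsigned int channels,
    unsigned int precision, unsigned int latency_ms)
{
    unsigned int frame;
    unsigned int frames;

    assert(channels > 0 && precision > 0 && precision % 8 == 0);
    frame = channels * (precision / 8);
    /* Use a wider type so the product cannot overflow. */
    frames = (unsigned int)((unsigned long long)sample_rate *
        latency_ms / 1000);
    if (frames == 0)
        frames = 1;
    return (frames * frame);
}
```

For example, 100 milliseconds of 8 KHz mono u-Law data is 800
bytes, and 10 milliseconds of 44.1 KHz stereo 16-bit data is
1764 bytes (441 frames of 4 bytes).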
The play.buffer_size field is currently not supported.
The audio data format is indicated by the sample_rate, chan-
nels, precision, and encoding fields. The values of these
fields correspond to the descriptions in the AUDIO FORMATS
section above. Refer to the audio device-specific manual
pages for a list of supported data format combinations.
The data format fields may be modified only on the
/dev/audio device. Some audio hardware may constrain the
input and output data formats to be identical. If this is
the case, the data format may not be changed if multiple
processes have opened the audio device. As a result,
a process should check that the ioctl() does not fail when
it attempts to set the data format.
If the parameter changes requested by an AUDIO_SETINFO ioctl
cannot all be accommodated, ioctl() will return with errno
set to EINVAL and no changes will be made to the device
state.
Streamio IOCTLS
All of the streamio(7I) ioctl commands may be issued for the
/dev/audio device. Because the /dev/audioctl device has its
own STREAMS queues, most of these commands neither modify
nor report the state of /dev/audio if issued for the
/dev/audioctl device. The I_SETSIG ioctl may be issued for
/dev/audioctl to enable the notification of audio status
changes, as described above.
Audio IOCTLS
The audio device additionally supports the following ioctl
commands:
AUDIO_DRAIN
The argument is ignored. This command suspends the
calling process until the output STREAMS queue is
empty, or until a signal is delivered to the calling
process. It may not be issued for the /dev/audioctl
device. An implicit AUDIO_DRAIN is performed on the
final close() of /dev/audio.
AUDIO_GETDEV
The argument is a pointer to an audio_device_t struc-
ture. This command may be issued for either /dev/audio
or /dev/audioctl. The returned value in the name field
will be a string that will identify the current
/dev/audio hardware device, the value in version will
be a string indicating the current version of the
hardware, and config will be a device-specific string
identifying the properties of the audio stream associ-
ated with that file descriptor. Refer to the audio
device-specific manual pages to determine the actual
strings returned by the device driver.
AUDIO_GETINFO
The argument is a pointer to an audio_info_t struc-
ture. This command may be issued for either /dev/audio
or /dev/audioctl. The current state of the /dev/audio
device is returned in the structure.
AUDIO_SETINFO
The argument is a pointer to an audio_info_t struc-
ture. This command may be issued for either the
/dev/audio or the /dev/audioctl device with some res-
trictions. This command configures the audio device
according to the supplied structure and overwrites
the existing structure with the new state of the
device. Note: The play.samples, record.samples,
play.error, record.error, and play.eof fields are
modified to reflect the state of the device when the
AUDIO_SETINFO is issued. This allows programs to
atomically modify these fields while retrieving the
previous value.
Certain fields in the audio information structure, such as
the pause flags, are treated as read-only when /dev/audio is
not open with the corresponding access permission. Other
fields, such as the gain levels and encoding information,
may have a restricted set of acceptable values. Applications
that attempt to modify such fields should check the returned
values to be sure that the corresponding change took effect.
The sample_rate, channels, precision, and encoding fields are
treated as read-only for /dev/audioctl, so that applications
can be guaranteed that the existing audio format will stay
in place until they relinquish the audio device.
AUDIO_SETINFO will return EINVAL when the desired configura-
tion is not possible, or EBUSY when another process has con-
trol of the audio device.
Once set, the following values persist through subsequent
open() and close() calls of the device and automatic device
unloads: play.gain, record.gain, play.balance,
record.balance, play.port, record.port and monitor_gain. For
the dbri driver, an automatic device driver unload resets
these parameters to their default values on the next load.
All other state is reset when the corresponding I/O stream
of /dev/audio is closed.
The audio_info_t structure may be initialized through the
use of the AUDIO_INITINFO macro. This macro sets all fields
in the structure to values that are ignored by the
AUDIO_SETINFO command. For instance, the following code
switches the output port from the built-in speaker to the
headphone jack without modifying any other audio parameters:
audio_info_t info;
int err;
/* Mark every field as "do not change". */
AUDIO_INITINFO(&info);
info.play.port = AUDIO_HEADPHONE;
err = ioctl(audio_fd, AUDIO_SETINFO, &info);
This technique eliminates problems associated with using a
sequence of AUDIO_GETINFO followed by AUDIO_SETINFO.
ERRORS
An open() will fail if:
EBUSY The requested play or record access is busy and either
the O_NDELAY or O_NONBLOCK flag was set in the open()
request.
EINTR The requested play or record access is busy and a sig-
nal interrupted the open() request.
An ioctl() will fail if:
EINVAL
The parameter changes requested in the AUDIO_SETINFO
ioctl are invalid or are not supported by the device.
EBUSY The parameter changes requested in the AUDIO_SETINFO
ioctl could not be made because another process has
the device open and is using a different format.
FILES
The physical audio device names are system dependent and are
rarely used by programmers. Programmers should use the gen-
eric device names listed below.
/dev/audio
symbolic link to the system's primary audio device
/dev/audioctl
symbolic link to the control device for /dev/audio
/dev/sound/0
first audio device in the system
/dev/sound/0ctl
audio control device for /dev/sound/0
/usr/share/audio/samples
audio files
ATTRIBUTES
See attributes(5) for a description of the following attri-
butes:
____________________________________________________________
| ATTRIBUTE TYPE | ATTRIBUTE VALUE |
|____________________|______________________________________|
| Architecture | SPARC, x86 |
|____________________|______________________________________|
| Availability | SUNWcsu, SUNWcsxu, SUNWaudd,|
| | SUNWauddx, SUNWaudh |
|____________________|______________________________________|
| Stability Level | Evolving |
|____________________|______________________________________|
SEE ALSO
close(2), fcntl(2), ioctl(2), open(2), poll(2), read(2),
write(2), attributes(5), audiocs(7D), audioens(7D),
audiots(7D), dbri(7D), sbpro(7D), usb_ac(7D),
audio_support(7I), mixer(7I), streamio(7I)
BUGS
Due to a feature of the STREAMS implementation, programs
that are terminated or exit without closing the audio device
may hang for a short period while audio output drains. In
general, programs that produce audio output should catch the
SIGINT signal and flush the output stream before exiting.
On LX machines running Solaris 2.3, catting a demo audio
file to the audio device /dev/audio does not work. Use the
audioplay command on LX machines instead of cat.
FUTURE DIRECTIONS
Future audio drivers should use the mixer(7I) audio device
to gain access to new features.