What Is a Digital Audio Workstation: Analog Audio to Digital Conversion and Pulse Code Modulation

Emerson Maningo

14 years ago

Digital Audio Workstation, or DAW, is one of the most commonly used terms in home music production. Yet many people, especially beginners with no electrical or sound engineering background, are still confused about what a Digital Audio Workstation really is. The fact is that it is hard to understand what a digital audio workstation is without first giving the beginner the complete picture of how everything starts and ends in music production. That is why this lengthy post is aimed at those completely new to digital home recording, that is, recording music using computers. Let’s get started.

First, you need to understand how music gets into your computer. Music is a complex sound wave that is analog in nature: a continuous signal (a sine wave is the simplest example). A musical instrument, or any disturbance (e.g. a water droplet falling into a pail of water), causes vibrations in the air that propagate as a sound wave. When these air pressure vibrations reach your ear, you perceive them as sound if the pressure is strong enough to vibrate your eardrum and if the frequency is audible (roughly 20 Hz to 20,000 Hz). The music you hear is made up of musical notes, which can be described as combinations of sine waves, each with two properties:

a.) Amplitude (how strong the pressure vibrations are, usually measured as SPL, or sound pressure level, in decibels).
b.) Frequency (how high or low the pitch of the sound wave is, measured in Hertz).

What happens if the music is recorded?

When you put a microphone in front of a sound source, such as a guitar being played or a person singing, the microphone converts the sound wave into an electrical signal, which is also in analog form. A microphone is a transducer that converts sound (acoustic) energy into electrical energy. The electrical wave captured by your microphone mimics the amplitude and frequency characteristics of the original sound wave. How accurate the reproduction is depends on the microphone being used. Condenser microphones generally offer the widest and most accurate frequency response, which is why they are usually more expensive than dynamic microphones.

What happens to the frequency and amplitude after recording?

Once the sound wave has been converted into an electrical signal by the microphone, it is called “analog audio”. Analog audio has two properties:

a.) Voltage/Current: a measurement of amplitude. A higher voltage/current means a louder captured sound. Since analog audio is a continuous signal, the volume is represented by voltages that fluctuate over time, such as +0.2 volts, +1.0 volts, -3.0 volts, -0.86 volts, etc.

b.) Frequency – measurement of pitch (units in Hz)

In this signal chain, the gear outside your computer is analog; examples are microphones, guitars and analog mixers. In these devices, the audio signal is still in analog form. A modern computer used in home recording and music production is NOT an analog device. It is a digital device that can only accept, analyze and output information as 0s and 1s. A good example of digital information is a series of 1s and 0s such as: 10100110101001010010101010111001. These are not continuous signals, since each value starts and ends abruptly; this is called binary or digital information. A digital signal can be pictured as a square wave instead of a sinusoid, with the highs and lows representing 1 and 0.

So what happens when analog audio is sent into a computer?

This is where you need a device called an “analog to digital converter”. Its job is to convert the analog audio (in volts) into a digital signal (a series of 1s and 0s) that your computer can understand. Common home studio devices such as a sound card or an external audio interface handle this analog to digital conversion. How accurate the conversion is depends on the quality of your analog to digital converter; more expensive sound cards or audio interfaces generally give better conversion. Bear in mind that we are still not talking about the DAW yet.

Take note that analog audio cannot be reproduced “perfectly” as digital audio, even with very expensive sound cards or audio interfaces. There will always be some quantization error during the conversion, although in practice you usually cannot notice it on your studio monitors. The size of this error depends on the bit depth, while the sampling rate determines the highest frequency that can be captured. What is generally considered an “acceptable” digital reproduction of analog audio is 16 bits at 44.1 kHz. The primary reason is the sampling theorem behind pulse code modulation (explained in detail in the next section).

For the best quality in music production, though, it is good to aim higher than 16-bit/44.1 kHz. This is why you will hear audio professionals recommend recording at 24 bits and 44.1 kHz or better for optimum results. After conversion, your analog audio, which was represented as +0.2 volts, +1.0 volts, -3.0 volts, -0.86 volts, becomes a series of digital bits (also known as “digital audio”):

10100110101001010010101010111001
11001101010100010101001010110011
11100101010100010100001110101001

These series of bits become even longer as your recording bit depth and sample rate are increased. The digital audio is then saved to your computer’s hard drive, where your computer can access it. The higher the bit depth and the sampling rate, the bigger the resulting file size of the recorded digital audio.
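To get a feel for how bit depth and sampling rate drive file size, here is a rough sketch in Python. It assumes raw, uncompressed PCM with no file header, and the function name `pcm_bytes` and the stereo (2-channel) example are illustrative assumptions, not something tied to a particular DAW.

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Approximate size in bytes of raw, uncompressed PCM audio.

    Assumption: no file header and no compression, so the size is just
    samples per second x bits per sample x channels x duration.
    """
    total_bits = sample_rate_hz * bit_depth * channels * seconds
    return total_bits // 8  # 8 bits per byte


# One minute of stereo audio at two common settings (stereo is an assumption).
print(pcm_bytes(44_100, 16, 2, 60))  # 10584000 bytes, about 10.6 MB
print(pcm_bytes(44_100, 24, 2, 60))  # 15876000 bytes, about 15.9 MB
```

Going from 16 to 24 bits at the same sampling rate makes the recording one and a half times as large, which matches the idea that more bits per sample means a bigger file.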

Pulse Code Modulation in detail

The heart of analog to digital conversion is PCM (Pulse Code Modulation). PCM is a standard way of representing analog signals in the digital domain. It is a sampling technique: the signal is measured at regular intervals and each measurement is stored as a number. Using a higher-resolution sampling method results in a more accurate digital representation of the analog signal.

Now, to sample analog audio into digital, your converter needs two parameters:

a.) Bit depth
b.) Sampling rate
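As a minimal sketch of how these two parameters are used, the toy encoder below measures a signal at regular intervals (sampling rate) and rounds each measurement to a signed integer code (bit depth). The names `pcm_encode`, `to_bits` and `tone` are hypothetical, the 1 kHz test tone and 8000 Hz rate are chosen only to keep the output short, and the two's-complement bit layout is an assumption about how the codes are stored; real converters do this in hardware.

```python
import math


def pcm_encode(signal, num_samples, sample_rate_hz, bit_depth, full_scale=1.0):
    """Sample signal(t) at regular intervals and round to signed integer codes.

    signal: function mapping time in seconds to a voltage in
    [-full_scale, +full_scale] (an assumed input range).
    """
    max_code = 2 ** (bit_depth - 1) - 1
    min_code = -(2 ** (bit_depth - 1))
    codes = []
    for n in range(num_samples):
        t = n / sample_rate_hz                       # time of the n-th sample
        code = round(signal(t) / full_scale * max_code)
        codes.append(max(min_code, min(max_code, code)))  # keep code in range
    return codes


def to_bits(code, bit_depth):
    """Show a signed code as a two's-complement bit string (assumed layout)."""
    return format(code & (2 ** bit_depth - 1), f"0{bit_depth}b")


def tone(t):
    """A 1 kHz sine wave that swings between -1.0 and +1.0 volts."""
    return math.sin(2 * math.pi * 1000 * t)


codes = pcm_encode(tone, num_samples=8, sample_rate_hz=8000, bit_depth=16)
for code in codes:
    print(code, to_bits(code, 16))
```

Each printed line is one sample: the integer code and the 16 bits that would be stored for it.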

Analog is represented by continuous signals such as voltages. When the sound wave hits the microphone, it is first converted into a mic level signal (a weak signal measured in millivolts), which is then amplified by the audio interface or mixer preamp into a line level signal. Line level signals are stronger voltages, which are then fed to the analog to digital converter inside your audio interface.

Continuous signals such as voltages vary smoothly, but digital signals do not. Digital signals are square waves that have only two possible values: 0 and 1. Continuous signals, such as the voltages at the input of an audio interface, have infinitely many possible values. They can take any voltage within the input range of the converter, for example 0.324 volts, 0.345 volts, 0.232 volts, or even negative voltages such as -0.656 volts.

These voltages hold the “INFORMATION” of the audio content in the analog domain. Remember that the microphone is a transducer that converts sound pressure vibrations into voltage levels. High sound pressure results in a higher voltage at the microphone output, while low sound pressure results in a smaller voltage.

So you can imagine that a singer with dynamics (moving from low to high pitch or volume) can produce infinitely many possible voltage levels at the microphone output. These voltage levels make up the singer’s audio waveform, which is to be converted to digital (if it is to be recorded in your DAW).

Representing these analog levels in digital domain:

To represent analog levels, you need a sufficient sampling rate and bit depth to convert the waveform accurately. The sampling rate must be at least twice the highest frequency to be converted. Human hearing tops out at about 20,000 Hz, and the common choice leaves a little headroom by treating 22,050 Hz as the upper limit. That is why the most common sampling rate for music is 44.1 kHz:

Sampling rate required for accurate reproduction = 22,050 Hz x 2 = 44,100 Hz (44.1 kHz). This means that 44,100 analog voltage samples are taken every second.
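The "twice the highest frequency" rule can be checked with a small experiment. The sketch below, with the hypothetical helpers `nyquist_limit` and `sample_cosine`, samples two cosine tones at 44,100 Hz: one at 14,100 Hz (below the 22,050 Hz limit) and one at 30,000 Hz (above it). The two frequencies are picked only for illustration. The samples come out the same, so after sampling the too-high tone cannot be told apart from the lower one.

```python
import math


def nyquist_limit(sample_rate_hz):
    """Highest frequency a given sampling rate can represent."""
    return sample_rate_hz / 2


def sample_cosine(frequency_hz, sample_rate_hz, num_samples):
    """Take num_samples evenly spaced samples of a cosine tone."""
    return [
        math.cos(2 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(num_samples)
    ]


fs = 44_100
print(nyquist_limit(fs))  # 22050.0

in_band = sample_cosine(14_100, fs, 50)   # below the limit
too_high = sample_cosine(30_000, fs, 50)  # above the limit

# Compare sample by sample, allowing for tiny floating-point differences.
same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(in_band, too_high))
print(same)  # True: the 30,000 Hz tone is indistinguishable from 14,100 Hz
```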

The number of bits you use also affects the resulting representation. Suppose you sample a sine wave voltage signal using 3 bits at a low sampling rate; this is the output:

[Figure: sine wave quantized with 3 bits]

Take note that the sine wave is jagged and not a good-looking sine wave. It is not a replica of the analog sine wave because it has only been sampled using 3 bits. With 3 bits there are only 2^3 = 8 possible values to represent the analog signal voltages. These are not enough, considering that the analog signal is continuous.
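The staircase effect of a low bit depth can be reproduced with a few lines of Python. This is only a sketch: the function `quantize`, the -1.0 to +1.0 volt input range, and the choice to place each output value in the middle of its step are all assumptions made for illustration.

```python
import math


def quantize(value, bit_depth, v_min=-1.0, v_max=1.0):
    """Snap value to one of 2**bit_depth evenly spaced levels.

    Assumption: the range is split into 2**bit_depth equal steps and the
    output is the middle of the step the input falls into.
    """
    levels = 2 ** bit_depth
    step = (v_max - v_min) / levels
    index = math.floor((value - v_min) / step)
    index = max(0, min(levels - 1, index))  # keep the top edge in range
    return v_min + (index + 0.5) * step


# One cycle of a sine wave, measured at 64 points.
sine = [math.sin(2 * math.pi * n / 64) for n in range(64)]
stepped = [quantize(v, 3) for v in sine]

print(sorted(set(stepped)))  # only 8 different output values survive
```

However smoothly the input moves, the output can only ever jump between those eight values, which is where the jagged look comes from.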

However, if you use 24 bits and a reasonable sampling rate such as 44,100 Hz, the result becomes very accurate: there are 2^24 = 16,777,216 possible values with which a voltage level can be represented, and 44,100 samples are taken per second. The resulting digitized waveform now looks very smooth, much like the original analog sine wave:

[Figure: smooth sine wave with sample points]

You can no longer see those jagged corners on the sine wave. The dots are the samples, and there are enough samples taken per second at a reasonable bit depth to make the digitized representation accurate.

Always record at 24-bits for better resolution

Bit depth tells you how much resolution you have in your analog to digital conversion. If you record at only 16 bits, you have just 2^16 = 65,536 levels. Suppose the analog voltage levels to be coded range from -20,000 mV to +20,000 mV; then the resolution would be:

Resolution = [+20,000 mV - (-20,000 mV)] / 65,536 = 0.61 mV per step

If you are recording at 24-bits, this resolution would be:

Resolution = [+20,000 mV - (-20,000 mV)] / 16,777,216 = 0.0024 mV per step

Now you can represent an analog audio signal far more accurately when recording at 24 bits, because of this very high resolution. In the 16-bit example above, you cannot resolve voltage changes much smaller than the 0.61 mV step, so each input voltage is simply rounded to the nearest step. The difference between the analog input and the digitized output is what is known as “quantization error”, and it affects the quality of the resulting recording.
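The resolution arithmetic above can be reproduced with a short script. The function name `step_size_mv` is made up for this sketch, and the -20,000 mV to +20,000 mV range is the same example range used in the text. The worst-case rounding error shown is half a step, assuming each voltage is rounded to the nearest level.

```python
def step_size_mv(v_min_mv, v_max_mv, bit_depth):
    """Voltage covered by one quantization step, in millivolts."""
    return (v_max_mv - v_min_mv) / (2 ** bit_depth)


for bits in (16, 24):
    step = step_size_mv(-20_000, 20_000, bits)
    worst_error = step / 2  # rounding to the nearest level misses by at most half a step
    print(f"{bits}-bit: step = {step:.4f} mV, worst-case error = {worst_error:.4f} mV")
```

Every extra bit halves the step size, so going from 16 to 24 bits shrinks it by a factor of 256.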

To make this clear, the resolution determines the granularity, or steps, in the digitized signal. A higher resolution results in a smoother-looking digitized signal; see the screenshot:

[Figure: quantization steps]

If the resolution value is made smaller, those “steps” become smaller and the waveform looks smoother. The resulting sound is also less “digitized”, bringing it closer to the analog original. With the 24-bit example above, at 0.0024 mV resolution, even very small changes in the analog input voltage can be represented, resulting in a more accurate reproduction of the analog input. That is why 24 bits is a standard for recording in music production. You should be recording at 24 bits.

Digital audio workstation comes into play

Since the music has now become digital audio, your computer can start processing it. Your computer runs software for processing this digital audio information, and hence it is called a “digital audio workstation”, or simply a DAW. For me, a complete DAW system is not the software by itself but the combination of the audio hardware components (computer, sound monitors, etc.) and the software (Reaper, Adobe Audition, Cubase, Logic, Sonar, etc.). On the computer screen, the captured audio is visualized as a waveform, based on the digital audio information rendered by your software. As an engineer, you will then interact with and edit this digital information in three steps that make up the important processes in music production:

a.) Basic Tracking and Editing (recording instruments into your computer, removing noise and editing parts digitally)

b.) Digital Audio Mixing (combining the recorded tracks to get the best sound by adjusting EQ, compression, panning and effects on the different digital audio tracks), then rendering a “mixdown”, which is a single waveform summing up all the adjusted tracks in the multi-track session. It is called “digital mixing” because you are using your DAW to mix the tracks in the digital domain, as opposed to using an analog mixing console in the analog domain.

c.) Digital Mastering – optimizing the mixdown to sound its best on different audio monitors/players. You are still using your DAW for this step, so it is still in the digital domain.
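To give a rough idea of what “summing up all adjusted tracks” in step (b) means in the digital domain, here is a heavily simplified sketch. It assumes each track is a list of samples stored as floating-point numbers between -1.0 and +1.0, that every track has one gain (volume) setting, and that anything louder than full scale is simply clipped. The function `mixdown` and the tiny example tracks are hypothetical; a real DAW mixer does far more (EQ, compression, panning, effects).

```python
def mixdown(tracks, gains):
    """Sum equal-length tracks sample by sample after applying a gain to each.

    Assumptions: samples are floats in [-1.0, 1.0], and sums outside that
    range are hard-clipped to it.
    """
    if len(tracks) != len(gains):
        raise ValueError("need exactly one gain per track")
    length = len(tracks[0])
    if any(len(track) != length for track in tracks):
        raise ValueError("all tracks must have the same number of samples")

    mixed = []
    for i in range(length):
        total = sum(gain * track[i] for track, gain in zip(tracks, gains))
        mixed.append(max(-1.0, min(1.0, total)))  # clip to full scale
    return mixed


vocals = [0.5, -0.25, 0.75]
guitar = [0.25, 0.5, 0.5]
print(mixdown([vocals, guitar], gains=[1.0, 0.5]))  # [0.625, 0.0, 1.0]
```

The result is a single list of samples, which is the “single waveform” the mixdown refers to.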

You will learn the details on each of the above processes as you progress with your reading later on.

The ending

After mastering, the digital audio information is stored on an audio CD (which is still a digital medium) or any other digital storage medium such as a DVD. When this digital audio information is played on a CD player, the player uses a digital to analog converter to convert it back to analog form!
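The conversion back to analog can be pictured as the reverse of the encoding step: each stored integer code is scaled back to a voltage. The sketch below is only the arithmetic part, with the hypothetical name `pcm_decode` and an assumed full-scale output of 1.0 volt; a real converter also smooths the stepped output with a filter before it reaches the speakers.

```python
def pcm_decode(codes, bit_depth, full_scale_volts=1.0):
    """Scale signed integer PCM codes back to voltages.

    Assumption: the largest positive code maps to +full_scale_volts.
    """
    max_code = 2 ** (bit_depth - 1) - 1
    return [code / max_code * full_scale_volts for code in codes]


# Four 16-bit samples: silence, full positive, full negative, about half.
volts = pcm_decode([0, 32767, -32767, 16384], bit_depth=16)
print(volts)
```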

It then flows through the electrical wires as an analog electrical signal and ends up at your studio monitors/speakers. Your monitor is also a transducer; it converts electrical signals into acoustic sound pressure vibrations, creating the disturbance in the air that is the music you hear. This is how it ends.

Content last updated on October 10, 2012