This post describes an experiment on how dithering and sample rate conversion, both of which are done during mastering, affect MP3 quality. The following tools are used:
a.) Reaper digital audio workstation with LAME encoder functionality for converting to MP3 and doing test dithering.
b.) Voxengo R8brain free – for doing sample rate conversion and test dithering.
c.) Adobe Audition 1.5 – for doing spectral analysis of the converted result.
d.) The 24-bit/96 kHz WAV sweep test tone signal provided here.
Objective and Methodology of the Study
This study investigates the results of the following combinations (using free/open source tools):
Test Flow#1: MP3 quality resulting from direct conversion
24-bit/96 kHz sweep tone signal ==> Reaper LAME MP3 encoder ==> Assess spectral quality of the MP3
Test Flow#2: Sample rate conversion and dithering are done before MP3 conversion
24-bit/96 kHz sweep tone signal ==> Sample rate conversion to 44.1 kHz using Voxengo R8brain ==> Dithering and noise shaping using Reaper's built-in dither ==> 16-bit/44.1 kHz WAV input to Reaper LAME MP3 encoder ==> Assess results
Test Flow#3: Sample rate conversion and dithering are done entirely by R8brain
24-bit/96 kHz sweep test tone signal ==> Sample rate conversion and dithering by Voxengo R8brain ==> 16-bit/44.1 kHz WAV input to Reaper LAME MP3 encoder ==> Analyze results
Test Flow#4: SRC (sample rate conversion) is done first, but no dithering is applied externally. The 24-bit WAV is then fed directly to the LAME encoder.
24-bit/96 kHz sweep test tone signal ==> SRC by R8brain ==> 24-bit/44.1 kHz WAV input to Reaper LAME MP3 encoder ==> Analyze results
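To make the dithering step in Test Flows #2 and #3 more concrete, here is a minimal Python sketch of reducing samples to 16 bits with and without dither. It assumes plain TPDF (triangular) dither and no noise shaping; it is not the algorithm that Reaper or R8brain actually uses, which this post does not document. The function name `quantize_16bit` and the quiet 1 kHz test tone are illustrative choices only.

```python
import math
import random

def quantize_16bit(samples, dither=True, seed=0):
    """Reduce float samples in [-1.0, 1.0] to 16-bit integer values.

    Assumption: TPDF dither of +/-1 LSB peak, no noise shaping.
    """
    rng = random.Random(seed)
    full_scale = 32767  # largest positive 16-bit sample value
    out = []
    for x in samples:
        scaled = x * full_scale
        if dither:
            # The sum of two uniform values has a triangular distribution.
            scaled += rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        q = int(round(scaled))
        # Clip to the legal 16-bit range.
        out.append(max(-32768, min(32767, q)))
    return out

# A very quiet 1 kHz tone at 44.1 kHz, only 0.4 LSB in amplitude.
rate = 44100
tone = [0.4 / 32767 * math.sin(2 * math.pi * 1000 * n / rate)
        for n in range(rate)]

plain = quantize_16bit(tone, dither=False)
dithered = quantize_16bit(tone, dither=True)

print("distinct values without dither:", sorted(set(plain)))
print("distinct values with dither:", sorted(set(dithered)))
```

Without dither, every sample of this quiet tone rounds to zero, so the tone is lost. With dither, the output toggles among the lowest few levels, which trades the hard truncation for a low, noise-like floor.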
Spectral result of the original source audio
The spectral view in Adobe Audition 1.5 of the original, unaltered 24-bit/96 kHz WAV sweep tone is shown below:
The x-axis is time in seconds, while the y-axis is frequency. The curved blue line trends upward because the signal is a sweep tone; it shows how the frequency content changes over time. For example, reading the plot at about 4 seconds, the signal frequency is approximately 10,000 Hz. The black background indicates the absence of frequency content.
Since the sample rate is 96 kHz, the file can represent signals up to 48 kHz, which is the maximum shown in the plot. This follows from the Nyquist theorem, as covered in the post on 44.1 kHz vs. 48 kHz audio recording sample rate:
Maximum frequency content = Sample rate/2
Beyond 20 kHz, the human ear can no longer detect these ultrasonic frequencies, although bats can.
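The Nyquist relation above is simple enough to check with a few lines of Python. This sketch just applies the formula to the sample rates that appear in this experiment; the function name `nyquist_limit` is an illustrative choice.

```python
def nyquist_limit(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2

# Sample rates used in this experiment.
for rate in (96000, 48000, 44100):
    print(f"{rate} Hz sample rate -> up to {nyquist_limit(rate)} Hz")
```

The part of the sweep above 22.05 kHz therefore has no place in a 44.1 kHz file, and a sample rate converter has to filter it out before reducing the rate.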
Test Flow#1: Direct Encoding
In this test, the high-resolution 24-bit/96 kHz test tone is fed directly to the Reaper LAME encoder (File – Batch File/Item converter). In Reaper, these options are used: [sample rate=source, channels=source, re-sample if needed=best, Output format=MP3, mode=maximum bit rate/quality]. If you are new to Reaper, you can add MP3 functionality by following this guide on Reaper DAW Tutorial; see the “Using MP3 with Reaper” section.
For the above test, this is the spectral result of the MP3:
As you can see, very small artifacts are now present and the background is no longer pure black. This indicates slight aliasing distortion (the light, hazy blue lines) introduced by the sample rate conversion in the MP3 encoder. It is worth noting that the MP3 format does not support sample rates above 48 kHz, so the LAME encoder converts the 24-bit/96 kHz source to a 16-bit/48 kHz MP3. If the source is instead converted directly to a standard 16-bit/44.1 kHz MP3 using the same process, this is the result:
Now the aliasing distortion is much worse: it is plainly visible and more audible. Notice the lines crossing the ideal signal that are not present in the original test tone. These are distortions/artifacts that can make your MP3 sound bad.
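Why aliasing shows up as lines that cross the sweep can be illustrated with a small Python sketch. It assumes the worst case, where content above the Nyquist limit is not filtered out at all before the rate is reduced; a real converter filters most of it, so this only shows where leftover energy would land, not how strong it is. The function name `alias_frequency` and the chosen sweep frequencies are illustrative.

```python
import math

def alias_frequency(f_hz, sample_rate_hz):
    """Frequency at which a tone appears after sampling with no filtering.

    Assumption: ideal sampling with no anti-aliasing filter.
    """
    # Fold the tone back around the nearest multiple of the sample rate.
    k = math.floor(f_hz / sample_rate_hz + 0.5)
    return abs(f_hz - k * sample_rate_hz)

# Points on the rising sweep, folded into a 44.1 kHz file.
sweep_points = [20000, 25000, 30000, 40000]
aliases = [alias_frequency(f, 44100) for f in sweep_points]

for f, a in zip(sweep_points, aliases):
    print(f"{f} Hz in the sweep -> appears at {a} Hz")
```

Once the sweep passes 22.05 kHz, the folded frequency moves downward while the original keeps moving upward. On a spectral plot this would appear as a descending line crossing the rising sweep.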