I am using the FMOD library to extract PCM from an MP3. I understand the format: 2 channels, 16 bits per sample, and a sampling rate of 44100 Hz, meaning 44,100 samples of “sound” per second. What I don't understand is what a single 16-bit value represents. I know how to plot points on x/y axes, but what am I plotting? If the x axis represents time, what does the y axis represent? Sound level? Is that the same as amplitude? And how do I identify the different sounds that make up this value? That is, how do I get a spectrum from a 16-bit number?
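To make the question concrete, here is a minimal sketch of how the raw bytes could be turned into per-channel values between -1 and 1. It assumes interleaved 16-bit stereo PCM (left, right, left, right, ...) and a little-endian machine. `PcmDecode` and `Deinterleave` are hypothetical names, not part of FMOD.

```csharp
using System;

static class PcmDecode
{
    // Splits interleaved 16-bit stereo PCM into two arrays of values in [-1, 1).
    // Assumption: each frame is 4 bytes (left sample, then right sample), and
    // BitConverter reads little-endian, as it does on typical x86/ARM machines.
    public static double[][] Deinterleave(byte[] pcm)
    {
        int frames = pcm.Length / 4;
        var left = new double[frames];
        var right = new double[frames];
        for (int f = 0; f < frames; f++)
        {
            int i = f * 4;
            left[f] = BitConverter.ToInt16(pcm, i) / 32768.0;
            right[f] = BitConverter.ToInt16(pcm, i + 2) / 32768.0;
        }
        return new[] { left, right };
    }
}
```

Each resulting value is one signed reading of the waveform at one instant, so plotting one channel against its sample index draws the waveform over time.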
This may be a separate question, but what I actually need answered is: how do I get the amplitude every 25 milliseconds? Do I take the 44,100 values and divide by 40 (40 * 0.025 seconds = 1 second)? That gives 1102.5 samples, so would I feed 1102 values into a black box that gives me the amplitude for that moment in time?
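As a rough sketch of the arithmetic in question (not FMOD-specific): the window size in frames can be computed with integer math, and one possible "black box" is the RMS of the samples in the window. `FrameMath`, `SamplesPerWindow`, and `Rms` are hypothetical names, and the input is assumed to be a single channel of 16-bit samples.

```csharp
using System;

static class FrameMath
{
    // Whole sample frames in a window of windowMs milliseconds (rounds down).
    public static int SamplesPerWindow(int sampleRate, int windowMs)
    {
        return sampleRate * windowMs / 1000;
    }

    // RMS of `count` 16-bit samples starting at `offset`, normalized to 0..1.
    public static double Rms(short[] samples, int offset, int count)
    {
        double sumSquares = 0.0;
        for (int i = offset; i < offset + count; i++)
        {
            double v = samples[i] / 32768.0;
            sumSquares += v * v;
        }
        return Math.Sqrt(sumSquares / count);
    }
}
```

With a 25 ms window this gives 1102 frames per window. With 40 ms it gives 1764 frames, which is 7056 bytes of 16-bit stereo data.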
Edit: I've added the code I plan to test soon. (Note: I changed the frame length from 25 ms to 40 ms.)
private const int CHUNKSIZE = 7056; // 40 ms at 44100 Hz: 1764 frames * 2 channels * 2 bytes
uint bytesread = 0;
var squares = new double[CHUNKSIZE / 4]; // one entry per stereo frame
const double scale = 1.0d / 32768.0d; // maps 16-bit samples to the range -1..1
do
{
    result = sound.readData(data, CHUNKSIZE, ref read);
    Marshal.Copy(data, buffer, 0, CHUNKSIZE);
    Array.Reverse(buffer); // reverses the byte order of the whole chunk
    for (var i = 0; i < buffer.Length; i += 4)
    {
        // average the absolute left and right samples of this frame
        var avg = scale * (Math.Abs((double)BitConverter.ToInt16(buffer, i)) + Math.Abs((double)BitConverter.ToInt16(buffer, i + 2))) / 2.0d;
        squares[i >> 2] = avg * avg;
    }
    // RMS over the chunk, scaled back to 16-bit range and formatted as hex
    var rmsAmplitude = ((int)(Math.Floor(Math.Sqrt(squares.Average()) * 32768.0d))).ToString("X2");
    fs.Write(buffer, 0, (int) read);
    bytesread += read;
    statusBar.Text = "writing " + bytesread + " bytes of " + length + " to output.raw";
} while (result == FMOD.RESULT.OK && read == CHUNKSIZE);
After loading the MP3, it seems that my rmsAmplitude stays in the range 3C00 to 4900. Did I do something wrong? I expected a wider spread.