GPSYCHO - Variable Bit Rate

Suggested usage:

lame -v -V 2 -b 128 input.wav output.mp3

VBR mode automatically uses the highest quality option, so neither "-v" nor "-h" is necessary when using -V. Options:

-V n (where n=0..9): 0 = highest quality, 9 = lowest quality
-b <minimum allowed bitrate>
-B <maximum allowed bitrate>

Using -B with a value other than 320 kbps is not recommended, since even a 128 kbps CBR stream will sometimes use frames as large as 320 kbps via the bit reservoir.

Variables used in VBR code description:

sfb: Scale factor band index.
thm[sfb]: Allowed masking. thm[sfb] = How much noise is allowed in the sfb'th band, as computed by GPSYCHO.
gain[sfb]: MDCT coefficients are scaled by 2^(-.25*gain) before quantizing. Smaller (more negative) values of gain mean that more bits are required to encode the coefficients, but the quantization noise will usually be smaller.
desired_gain[sfb]: The amount of gain needed so that if gain[sfb] <= desired_gain[sfb], the quantization noise will be <= thm[sfb].

An MP3 can use the following variables to achieve a given gain[sfb]. For longblocks:

gain[sfb][i] = 2^ [ -.25 * ( global_gain -210 - ifqstep*scalefac[gr][ch].l[sfb] - ifqstep*pretab[sfb]) ]

For shortblocks (i=0..2 for the three short blocks):

gain[sfb][i] = 2^ [ -.25*( global_gain -210 - 8*subblock_gain[i] - ifqstep*scalefac.s[sfb][i]) ]

In both of the above cases, calculate ifqstep:

ifqstep = scalefac_scale==0 ? 2 : 4;
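
As a rough illustration of the two formulas, the sketch below evaluates the scaling factor for a single band in C. The function names (pow2_quarter, long_block_gain, short_block_gain) and the flat parameter lists are assumptions made for this example and are not LAME's actual interfaces; only the arithmetic follows the formulas above.

```c
/* Returns 2^(q/4) for an integer number of quarter steps q.
 * Written as a plain loop so the sketch needs no math library. */
double pow2_quarter(int q)
{
    const double root4_of_2 = 1.189207115002721; /* 2^(1/4) */
    double r = 1.0;
    int n = q < 0 ? -q : q;
    while (n-- > 0)
        r *= root4_of_2;
    return q < 0 ? 1.0 / r : r;
}

/* ifqstep as defined in the text: 2 when scalefac_scale == 0, else 4. */
int ifqstep_for(int scalefac_scale)
{
    return scalefac_scale == 0 ? 2 : 4;
}

/* Long block: scalefac_l and pretab_sfb are the values for one band sfb. */
double long_block_gain(int global_gain, int scalefac_scale,
                       int scalefac_l, int pretab_sfb)
{
    int ifqstep = ifqstep_for(scalefac_scale);
    int exponent = global_gain - 210
                 - ifqstep * scalefac_l
                 - ifqstep * pretab_sfb;
    return pow2_quarter(-exponent); /* 2^(-.25 * exponent) */
}

/* Short block i (0..2): subblock_gain_i and scalefac_s are the values
 * for that short block and band. */
double short_block_gain(int global_gain, int scalefac_scale,
                        int subblock_gain_i, int scalefac_s)
{
    int ifqstep = ifqstep_for(scalefac_scale);
    int exponent = global_gain - 210
                 - 8 * subblock_gain_i
                 - ifqstep * scalefac_s;
    return pow2_quarter(-exponent); /* 2^(-.25 * exponent) */
}
```

With global_gain = 210 and all other fields zero, the exponent is 0 and the factor is 1. Raising scalefac or subblock_gain makes the exponent more negative, so the factor grows and the band is quantized more finely.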

Algorithm

The VBR algorithm is as follows.

Step 1: psymodel.c: Computes the allowed maskings, thm[sfb]. thm[sfb] may be reduced by a few dB depending on the quality setting. The smaller thm[sfb], the more bits will be required to encode the frame.
Step 2: find_scalefac() in vbrquantize.c: Compute desired_gain[sfb] by iterating over the values of sfb from 0 to SBMAX. At each value, compute desired_gain[sfb] using a divide and conquer iteration so that quantization_noise[sfb] < thm[sfb] . This requires 7 iterations of calc_sfb_noise() which computes quantization error for the specified gain. This is the only time VBR needs to do any (expensive) quantization!
Step 3: VBR_noise_shaping() in vbrquantize.c: Find a combination of global_gain, subblock_gain, preflag, scalefac_scale, etc... so that: gain[sfb] <= desired_gain[sfb]
Step 4: VBR_quantize_granule() in vbrquantize.c: Calculate the number of bits needed to encode the frame with the values computed in step 3. Unlike CBR, VBR (usually) only has to do this expensive Huffman bit counting once!
Step 5: VBR_noise_shaping() in vbrquantize.c: If bits < minimum_bits, repeat step 3 with a smaller value of global_gain, which increases the number of bits used (but allow bits < minimum_bits for analog silence). If bits > maximum_bits, increase global_gain, keeping all other scalefactors the same. Usually step 5 is not necessary.
Step 6: VBR_quantize() in vbrquantize.c: After encoding both channels and granules, check to make sure that the total number of bits for the whole frame does not exceed the maximum allowed. If it does, lower the quality and repeat steps 2, 3 and 4 for the granules that were using lots of bits.
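
The per-granule bit-range adjustment can be sketched in C as below. This is a simplified illustration, not the code in vbrquantize.c: fit_bits, granule_result and toy_count_bits are hypothetical names, the bit counter is a made-up linear model standing in for VBR_quantize_granule(), and the recomputation of scalefactors from step 3 is left out. The direction of each adjustment follows the definition of gain, where a smaller value means more bits.

```c
#include <stddef.h>

/* Caller-supplied stand-in for the Huffman bit count of one granule. */
typedef int (*bit_counter)(int global_gain, void *ctx);

typedef struct {
    int global_gain;
    int bits;
} granule_result;

/* Toy model (assumption): bits fall linearly as global_gain rises. */
int toy_count_bits(int global_gain, void *ctx)
{
    int bits = 8 * (200 - global_gain);
    (void)ctx; /* unused in the toy model */
    return bits > 0 ? bits : 0;
}

/* Adjust global_gain until minbits <= bits <= maxbits where possible.
 * global_gain is an 8-bit field, so it is kept within 0..255. */
granule_result fit_bits(int start_gain, int minbits, int maxbits,
                        int analog_silence,
                        bit_counter count_bits, void *ctx)
{
    granule_result r;
    int gain = start_gain;
    int bits = count_bits(gain, ctx);

    /* Too few bits: lower global_gain (finer quantization, more bits),
     * except for analog silence, which may stay below minbits. */
    while (bits < minbits && !analog_silence && gain > 0) {
        gain--;
        bits = count_bits(gain, ctx);
    }

    /* Too many bits: raise global_gain, leaving everything else alone. */
    while (bits > maxbits && gain < 255) {
        gain++;
        bits = count_bits(gain, ctx);
    }

    r.global_gain = gain;
    r.bits = bits;
    return r;
}
```

With the toy model, fit_bits(180, 400, 2000, 0, toy_count_bits, NULL) starts at 160 bits and lowers global_gain to 150, where the count reaches 400. In the common case the first count already lies in range and neither loop runs, which matches the remark that step 5 is usually not needed.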

Flow

The actual flow chart looks something like this:

VBR_quantize():
    determine minbits, maxbits for each granule
    determine max_frame_bits
    adjust global quality setting based on VBR_q

    do
        frame_bits = 0
        loop over each channel, granule:
            compute thm[sfb]
            bits = VBR_noise_shaping()    (encodes each granule with minbits <= bits <= maxbits)
            frame_bits += bits
        lower the global quality setting
    while (frame_bits > max_frame_bits)

VBR_noise_shaping():
    find_scalefac()    (computes desired_gain)
    estimate largest possible value of global_gain
    do
        compute_scalefac_long/short()    (compute scalefacs, etc. so that gain <= desired_gain)
        bits = VBR_quantize_granule()
        if (bits < minbits && analog silence) break;
        if (bits >= minbits) break;
        decrease global_gain    (which increases number of bits used)
    while 1
    if bits > maxbits
        do
            increase global_gain
            bits = VBR_quantize_granule()
        while (bits > maxbits)

find_scalefac():
    Simple divide and conquer iteration which repeatedly calls
    calc_sfb_noise() with different values of desired_gain until it
    finds the largest desired_gain such that the
    quantization_noise < allowed masking.
    Requires 7 iterations.
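
The divide and conquer search can be illustrated with a short C sketch. It assumes integer gain values, a search window of 128 candidates (which is what 7 halvings cover), noise that rises monotonically with gain, and a lowest candidate that already satisfies the masking threshold. The names find_desired_gain, noise_fn and toy_noise are hypothetical, and toy_noise is a made-up model standing in for calc_sfb_noise(); the search in LAME itself may differ in its range and step sizes.

```c
#include <stddef.h>

/* Stand-in for calc_sfb_noise(): quantization noise at a given gain. */
typedef double (*noise_fn)(int gain, void *ctx);

/* Toy model (assumption): noise grows by sqrt(2) per gain step.
 * If ctx is non-NULL it points to an int counting the calls made. */
double toy_noise(int gain, void *ctx)
{
    const double sqrt2 = 1.4142135623730951;
    double noise = 1.0 / 12.0;
    int n = gain < 0 ? -gain : gain;
    if (ctx != NULL)
        ++*(int *)ctx;
    while (n-- > 0)
        noise = gain < 0 ? noise / sqrt2 : noise * sqrt2;
    return noise;
}

/* Returns the largest gain in [lowest_gain, lowest_gain + 127] whose
 * noise is below thm, using exactly 7 noise evaluations. */
int find_desired_gain(int lowest_gain, double thm,
                      noise_fn calc_noise, void *ctx)
{
    int gain = lowest_gain; /* assumed to satisfy noise < thm */
    int step = 64;
    int i;

    for (i = 0; i < 7; i++) {
        /* Keep the larger gain only if it still stays under the mask. */
        if (calc_noise(gain + step, ctx) < thm)
            gain += step;
        step /= 2;
    }
    return gain;
}
```

Each pass halves the step (64, 32, 16, 8, 4, 2, 1), so the result is pinned down to a single value after 7 noise evaluations per band. With the toy model and thm = 1.0 the search starting at 0 settles on a gain of 7.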