ref: a41a4860c0b3be7815f37b4ec833e87218307c4f
parent: c43af9a8a3adc7bd3888e746ce7b7bd581c476ae
author: Jingning Han <jingning@google.com>
date: Fri Jun 14 07:28:56 EDT 2013
Make fdct32 computation flow within 16-bit range

This commit introduces two fdct32x32 versions: one for the
rate-distortion optimization (rd) loop and one for the encoding
process. The rd-loop version requires only 16 bits of precision for
its intermediate steps. The original fdct32x32, which allows higher
intermediate precision (18 bits), is retained for the encoding
process only. This speeds up fdct32x32 in the rd loop. No performance
loss was observed.

Change-Id: I3237770e39a8f87ed17ae5513c87228533397cc3
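
To illustrate the precision trade-off described in the message, here is a minimal C sketch. It is not the code from this commit. The names `round_half`, `butterfly_stage`, and `fits_int16`, and the `low_precision` flag, are hypothetical. The sketch assumes the lower-precision variant keeps intermediates in range by halving them with rounding after an add/subtract stage, which the commit message does not spell out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: divide by two, rounding half away from zero.
 * Written with division so it does not rely on right-shifting negatives. */
static int32_t round_half(int32_t x) {
  return x >= 0 ? (x + 1) / 2 : -((-x + 1) / 2);
}

/* Hypothetical add/subtract butterfly over n values (n must be even):
 *   out[i]         = in[i] + in[n - 1 - i]
 *   out[n - 1 - i] = in[i] - in[n - 1 - i]
 * When low_precision is nonzero, each result is halved with rounding,
 * which is the assumed way of keeping intermediates within 16 bits. */
static void butterfly_stage(const int32_t *in, int32_t *out, int n,
                            int low_precision) {
  assert(n % 2 == 0);
  for (int i = 0; i < n / 2; ++i) {
    const int32_t a = in[i];
    const int32_t b = in[n - 1 - i];
    int32_t sum = a + b;
    int32_t diff = a - b;
    if (low_precision) {
      sum = round_half(sum);
      diff = round_half(diff);
    }
    out[i] = sum;
    out[n - 1 - i] = diff;
  }
}

/* Returns 1 if every value could be stored in an int16_t, 0 otherwise. */
static int fits_int16(const int32_t *v, int n) {
  for (int i = 0; i < n; ++i) {
    if (v[i] < INT16_MIN || v[i] > INT16_MAX) return 0;
  }
  return 1;
}
```

Under these assumptions, a caller in a search loop would pass a nonzero `low_precision` and could hold the stage output in 16-bit storage, while the final encoding pass would pass zero and keep the wider intermediates.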