ref: 6344c84c82f6a4f82a6a4f9f33a6d1ec85691930
parent: 449f136886e96fcf448bf9b68952977da703c614
author: Yunqing Wang <yunqingwang@google.com>
date: Fri Mar 15 07:33:10 EDT 2013
Optimize 8x8 idct function

Wrote SSE2 versions of vp9_short_idct8x8 and vp9_short_idct10_8x8.
Compared to the C version, the SSE2 version is 2X faster. The decoder
test didn't show a noticeable gain, since the 8x8 idct doesn't take
much of the decoding time (less than 1% in my test).

Change-Id: I56313e18cd481700b3b52c4eda5ca204ca6365f3