ref: 4645bd26aa506fe5dd54dc230f3d36e446261360
parent: d906dda2240b2c4b39687f7474a4d1607319681a
author: Sindre Aamås <saamas@cisco.com>
date: Tue Apr 19 15:42:17 EDT 2016
[Encoder] Add an SSE4.2 implementation of WelsGetNonZeroCount

Avoid touching some cache lines by using popcnt instead of table
lookups. Also gives a speedup of ~1.4x on Haswell compared with the
SSE2 version.
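
For illustration only, a scalar C sketch of the idea behind this change; it is not the code in the commit. It assumes the routine counts the nonzero entries in a block of 16 16-bit coefficients and that a bitmask of nonzero positions has already been formed. The names build_mask, count_nonzero_table and count_nonzero_popcnt are hypothetical, and __builtin_popcount is a GCC/Clang builtin that typically compiles to a popcnt instruction only when that instruction is enabled (for example with -mpopcnt or -msse4.2).

```c
#include <stdint.h>

/* Hypothetical helper: bit i of the result is set when level[i] != 0.
   A SIMD version would typically get this mask from a compare plus a
   movemask; a plain loop is used here to keep the sketch portable. */
static unsigned build_mask(const int16_t level[16]) {
    unsigned mask = 0;
    for (int i = 0; i < 16; ++i)
        mask |= (unsigned)(level[i] != 0) << i;
    return mask;
}

/* Table-based variant: a 256-entry table of per-byte bit counts.
   Every call reads from this table, so its cache lines get touched. */
static uint8_t nonzero_table[256];

static void init_nonzero_table(void) {
    /* bits(i) = bits(i >> 1) + lowest bit; entry 0 stays 0. */
    for (int i = 1; i < 256; ++i)
        nonzero_table[i] = (uint8_t)(nonzero_table[i >> 1] + (i & 1));
}

static int count_nonzero_table(const int16_t level[16]) {
    unsigned mask = build_mask(level);
    return nonzero_table[mask & 0xFF] + nonzero_table[mask >> 8];
}

/* Popcount variant: no table in memory, just a bit count on the mask. */
static int count_nonzero_popcnt(const int16_t level[16]) {
    return __builtin_popcount(build_mask(level));
}
```

Both variants return the same count; the second simply avoids the memory reads for the lookup table, which is the effect the commit message describes.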