
Dan Minor: Using masked writes with ARM NEON intrinsics

Friday, 08 January 2016, 16:53

I recently fixed Bug 1105513, which called for an ARM NEON optimized version of AudioBlockPanStereoToStereo for the case where “OnTheLeft” is an array. The StereoPanner node takes this path when the pan value is scheduled to change at a future time, for instance with code like the following:

panner = oac.createStereoPanner();
panner.pan.setValueAtTime(-0.1, 0.0);
panner.pan.setValueAtTime(0.2, 0.5);

The “OnTheLeft” values determine whether the sound is on the left or right of the listener at a given time, which controls the interpolation calculation performed when panning. If this changes with time, then this is passed as an array rather than as a constant.

The unoptimized version of this function checks each value of “OnTheLeft” and performs the appropriate calculation. That isn’t an option for NEON, where one instruction operates on every lane of a vector and there is no per-lane branching.
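
For reference, a branching scalar loop of this kind might look roughly like the sketch below. The function name pan_scalar, the parameter names, and the gain formulas are assumptions made for illustration; they are not copied from the Gecko source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Scalar sketch: one branch per sample.
   The gain formulas are illustrative assumptions. */
static void pan_scalar(const float *inL, const float *inR,
                       const float *gainL, const float *gainR,
                       const bool *isOnTheLeft,
                       float *outL, float *outR, size_t n)
{
    assert(inL && inR && gainL && gainR && isOnTheLeft && outL && outR);
    for (size_t i = 0; i < n; ++i) {
        if (isOnTheLeft[i]) {
            /* Source on the left: mix part of the right input into the left. */
            outL[i] = inL[i] + inR[i] * gainL[i];
            outR[i] = inR[i] * gainR[i];
        } else {
            /* Source on the right: mix part of the left input into the right. */
            outL[i] = inL[i] * gainL[i];
            outR[i] = inR[i] + inL[i] * gainR[i];
        }
    }
}
```

The branch inside the loop is the part that has no direct vector equivalent, since the condition can differ from one sample to the next.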

The bright side is that NEON does provide masked writes, where a mask vector controls which components of a vector are written. Unfortunately, the NEON documentation is sparse at best, so it took a few tries to get things right.

The first trick is to convert from a bool to a suitable mask. How a bool is represented is of course platform dependent, but in this case I had an array of eight bytes, each containing a zero or a one. The best solution I came up with was to load them as a vector of 8 unsigned bytes and then set each corresponding float lane of the mask individually:

isOnTheLeft = vld1_u8((uint8_t *)&aIsOnTheLeft[i]);
voutL0 = vsetq_lane_f32(vget_lane_u8(isOnTheLeft, 0), voutL0, 0);
voutL0 = vsetq_lane_f32(vget_lane_u8(isOnTheLeft, 1), voutL0, 1);
...

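Once loaded, they can be converted into a suitable mask with the vcgtq function, which compares its two arguments lane by lane and returns a vector in which every bit of a lane is set to 1 if that lane of the first argument is greater than the same lane of the second, and to 0 otherwise: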

voutL0 = (float32x4_t)vcgtq_f32(voutL0, zero);
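
As a point of comparison, the flag bytes could also be widened to 32-bit lanes and compared as integers, which avoids the per-lane float writes. This is a different technique from the one described above, shown only as a sketch: the helper name flags_to_masks is hypothetical, and the scalar branch is there so the example also builds on machines without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Expand n flag bytes (each 0 or 1) into 32-bit lane masks:
   0xFFFFFFFF for a nonzero flag, 0 otherwise.
   n must be a multiple of 8. */
static void flags_to_masks(const uint8_t *flags, uint32_t *masks, size_t n)
{
    assert(n % 8 == 0);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    const uint32x4_t zero = vdupq_n_u32(0);
    for (size_t i = 0; i < n; i += 8) {
        uint8x8_t b = vld1_u8(flags + i);             /* 8 flag bytes */
        uint16x8_t w = vmovl_u8(b);                   /* widen to 16 bits */
        uint32x4_t lo = vmovl_u16(vget_low_u16(w));   /* flags 0..3 */
        uint32x4_t hi = vmovl_u16(vget_high_u16(w));  /* flags 4..7 */
        /* Each lane becomes all ones where the flag is greater than zero. */
        vst1q_u32(masks + i, vcgtq_u32(lo, zero));
        vst1q_u32(masks + i + 4, vcgtq_u32(hi, zero));
    }
#else
    /* Portable fallback with the same result. */
    for (size_t i = 0; i < n; ++i) {
        masks[i] = flags[i] ? UINT32_MAX : 0u;
    }
#endif
}
```

Whether this is faster than setting lanes one at a time would need to be measured on the target hardware.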

After that, the appropriate calculations are done for both the case where “OnTheLeft” is true and the case where it is false. The two results are then combined using the vbslq function, which takes the mask as its first argument and, bit by bit, selects from the second argument where the mask bit is 1 and from the third argument where it is 0:

voutL0 = vbslq_f32((uint32x4_t)voutL0, onleft0, notonleft0);
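
To make the select step concrete, here is a minimal sketch that applies a precomputed mask to two candidate result arrays. The helper name select_f32 and its parameters are hypothetical, and the scalar branch spells out the same bitwise selection, (mask AND a) OR (NOT mask AND b), for builds without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* out[i] = a[i] where mask[i] is all ones, b[i] where mask[i] is zero.
   n must be a multiple of 4. */
static void select_f32(const uint32_t *mask, const float *a,
                       const float *b, float *out, size_t n)
{
    assert(n % 4 == 0);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    for (size_t i = 0; i < n; i += 4) {
        uint32x4_t m = vld1q_u32(mask + i);
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        /* Bitwise select: mask bit 1 picks from va, 0 picks from vb. */
        vst1q_f32(out + i, vbslq_f32(m, va, vb));
    }
#else
    for (size_t i = 0; i < n; ++i) {
        uint32_t ua, ub, r;
        memcpy(&ua, &a[i], sizeof ua);   /* reinterpret float bits */
        memcpy(&ub, &b[i], sizeof ub);
        r = (mask[i] & ua) | (~mask[i] & ub);
        memcpy(&out[i], &r, sizeof r);
    }
#endif
}
```

Both candidate results are always computed, so the cost of the unused branch is paid for every sample in exchange for removing the per-sample test.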

I evaluated these changes on a StereoPanner benchmark where I saw a small performance improvement.

http://www.lowleveldrone.com/mozilla/webaudio/2016/01/08/masked-writes-arm-intrinsics.html


 
