-Поиск по дневнику

Поиск сообщений в rss_thedaily_wtf

 -Подписка по e-mail

 

 -Постоянные читатели

 -Статистика

Статистика LiveInternet.ru: показано количество хитов и посетителей
Создан: 06.04.2008
Записей:
Комментариев:
Написано: 0


CodeSOD: Collated Performance

Среда, 09 Декабря 2015 г. 14:30 + в цитатник

Eliza had a co-worker who had a problem. There were users with names like Ren'e. Other users, using standard keyboards, wanted to search for Ren'e, but couldnt be bothered to figure out how to type that accent, so they just searched for Rene.

The co-worker came up with this solution:

SELECT * FROM table WHERE UPPER(trim(translate(first_name, ''E^E"E`EAE+^A`A"A'AC`U^U"U^O"O"I^IŸ'e^e"e`e^a`a"a'ac`u^u"u^u^o"o^i"i"y', 'EEEEEEAAAACUUUOOIIYeeeeaaaacuuuuooiiy'))) LIKE (translate('FIRSTNAMESEARCHSTRING%', ''E^E"E`EAE+^A`A"A'AC`U^U"U^O"O"I^IŸ'e^e"e`e^a`a"a'ac`u^u"u^u^o"o^i"i"y', 'EEEEEEAAAACUUUOOIIYeeeeaaaacuuuuooiiy')) AND UPPER(trim(translate(last_name, ''E^E"E`EAE+^A`A"A'AC`U^U"U^O"O"I^IŸ'e^e"e`e^a`a"a'ac`u^u"u^u^o"o^i"i"y', 'EEEEEEAAAACUUUOOIIYeeeeaaaacuuuuooiiy'))) LIKE (translate('LASTNAMESEARCHSTRING%', ''E^E"E`EAE+^A`A"A'AC`U^U"U^O"O"I^IŸ'e^e"e`e^a`a"a'ac`u^u"u^u^o"o^i"i"y', 'EEEEEEAAAACUUUOOIIYeeeeaaaacuuuuooiiy'))

Eliza noticed it because this query was extremely slow- and no surprise. translate is unavoidably a character by character operation. It was taking 30 seconds to search their database with a nine character last name that didnt even contain any of the replaced characters. And of course, in most collations and character sets, 'E and upper(''e') are going to be the same character, making half the replacements completely unnecessary.

Speaking of collations, theyre a tool thats been standardized to make this really complicated problem of determining which characters are the same, or what order is alphabetical easy for programmers to solve. Every RDBMS supports them, and by specifying a collation. You can set the collation either on the table, the column, the individual query youre running against, or for the session of the query, using the standard SQL command COLLATE.

Eliza added some code to issue a COLLATE utf8_general_ci; command before running the query, removed the translates, and watched the query execute in 1/30th of a second, instead of 30 seconds.

[Advertisement] Use NuGet or npm? Check out ProGet, the easy-to-use package repository that lets you host and manage your own personal or enterprise-wide NuGet feeds and npm repositories. It's got an impressively-featured free edition, too!

http://thedailywtf.com/articles/collated-performance

Метки:  

 

Добавить комментарий:
Текст комментария: смайлики

Проверка орфографии: (найти ошибки)

Прикрепить картинку:

 Переводить URL в ссылку
 Подписаться на комментарии
 Подписать картинку