Unicode and performance

I am migrating a large-scale web service to support international characters. The stack is Tomcat / Spring MVC / SQL Server. The migration itself was relatively straightforward: we made a few tweaks to Tomcat to force UTF-8 in the response, changed the Java code to be encoding-aware, and moved some VARCHAR columns to NVARCHAR, followed by a healthy dose of unit and functional tests.

Another person on my team wants to run load tests to make sure that none of these changes affects system performance. The individual pieces of the migration described above do not really hint at any performance changes and, frankly, based on my limited knowledge I do not think it is absolutely necessary. I plan to do it anyway, but here is my question: are there any performance effects to expect from such a migration? Is there anything specific to a different character encoding that can change system performance?

The only things I could think of were string comparison, string sorting, and the like. Any ideas?

+3
3 answers

I have only this joke:

, (ASCII) unicode . sql , , , ascii. .

+2

Note that SQL Server 2008 R2 supports Unicode Compression:

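SQL Server 2008 R2 uses an implementation of the Standard Compression Scheme for Unicode (SCSU) algorithm to compress Unicode values that are stored in row or page compressed objects. For these compressed objects, Unicode compression is automatic for nchar(n) and nvarchar(n) columns. The SQL Server Database Engine stores Unicode data as 2 bytes, regardless of locale. This is known as UCS-2 encoding. For some locales, the implementation of SCSU compression in SQL Server 2008 R2 can save up to 50% in storage space.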

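NVARCHAR has higher data type precedence than VARCHAR, so any expression that mixes the two is converted to NVARCHAR. In particular, a join between columns A and B, where A is VARCHAR, becomes CAST(A as NVARCHAR) = B (that is, B is NVARCHAR), which is not SARGable (an index on A cannot be used for a seek). The same applies to WHERE clauses.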
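To illustrate why CAST(A as NVARCHAR) = B is not SARGable, here is a minimal Python sketch. It is only an analogy, not how SQL Server is implemented: the sorted list `INDEX_ON_A` and the functions `seek` and `scan_with_conversion` are hypothetical names, and cp1252 is an assumed code page for the VARCHAR column.

```python
import bisect

# Hypothetical index on a VARCHAR column A: single-byte values in sorted order.
INDEX_ON_A = sorted(
    name.encode("cp1252") for name in ["alice", "bob", "carol", "dave", "erin"]
)


def seek(key: bytes) -> bool:
    """Predicate A = key, where key already has the column's type.

    The sorted order of the index can be used, so a binary search suffices.
    """
    pos = bisect.bisect_left(INDEX_ON_A, key)
    return pos < len(INDEX_ON_A) and INDEX_ON_A[pos] == key


def scan_with_conversion(key: str) -> tuple:
    """Predicate CAST(A AS NVARCHAR) = key.

    The conversion is applied to the stored values, so each one is decoded
    and compared in turn. Returns (found, number of values converted).
    """
    converted = 0
    for stored in INDEX_ON_A:
        converted += 1
        if stored.decode("cp1252") == key:
            return True, converted
    return False, converted


print(seek(b"erin"))
print(scan_with_conversion("erin"))
```

In this sketch the seek does a logarithmic number of comparisons, while the converted predicate has to visit the stored values one by one. Keeping both sides of a comparison in the same type avoids the conversion on the column side.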

+4

Character encodings, if handled correctly, should not be a problem. Unicode is much more complicated, but you don't have to think about that, because someone else already did. All you need to make sure of is that you are not converting arbitrary strings between encodings arbitrarily.
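As a small illustration of what an arbitrary conversion can do, here is a Python sketch. The helper names `round_trips` and `force_convert` are made up for this example, and cp1252 stands in for whatever legacy single-byte encoding might still be in the pipeline.

```python
def round_trips(text: str, encoding: str) -> bool:
    """Return True if text survives an encode/decode cycle in `encoding`."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeEncodeError:
        # The target encoding cannot represent at least one character.
        return False


def force_convert(text: str, encoding: str) -> str:
    """Push text through `encoding`, replacing whatever does not fit."""
    return text.encode(encoding, errors="replace").decode(encoding)


name = "Zoë 東京"
print(round_trips(name, "utf-8"))    # UTF-8 can represent every character
print(round_trips(name, "cp1252"))   # cp1252 has no CJK characters
print(force_convert(name, "cp1252"))  # the unrepresentable characters are lost
```

The point is only that the conversion itself succeeds quietly when replacement is allowed, so lost characters show up later as data corruption rather than as an error.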

However, you will see that the string data you moved to NVARCHAR takes up twice as much space. This affects the heuristics SQL Server uses to build execution plans, and there are subtle index issues that can change, but I wouldn't worry about it unless you have really large datasets.
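The "twice as much space" figure can be checked with a rough Python sketch. It assumes the VARCHAR column used a single-byte code page (cp1252 here, as an assumption) and that NVARCHAR stores 2 bytes per UTF-16 code unit; it ignores row overhead, length prefixes, and any compression. `storage_bytes` is a hypothetical helper, not a SQL Server API.

```python
def storage_bytes(text: str) -> dict:
    """Approximate payload size of one value in each column type."""
    return {
        # VARCHAR with a single-byte code page: one byte per character.
        "varchar_cp1252": len(text.encode("cp1252")),
        # NVARCHAR: two bytes per UTF-16 code unit (utf-16-le adds no BOM).
        "nvarchar_utf16": len(text.encode("utf-16-le")),
    }


sizes = storage_bytes("Customer name")
print(sizes)
print(sizes["nvarchar_utf16"] / sizes["varchar_cp1252"])
```

For text that fits the old code page, the ratio is exactly 2 in this model, which is the case the answer above describes.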

+1