SQL Server XML Shredding Performance

I'm working with NOAA's current observations XML (for example: Washington, DC) and shredding the files for 4000+ stations into a SQL Server 2008 R2 table. After trying many different approaches, I have one that I'm moving forward with.

This question is about the performance of the different methods and, most importantly, why the difference between them is so large.

First attempt

Working in C#, I parsed all the files with LINQ to XML and wrote the resulting records to the database with LINQ to SQL. The code for this is predictable, so I won't bore you with it.

Rewriting it with LINQ to Entities (Entity Framework) did not help.

The application ran for more than an hour and had processed only 1600 files. The slowness comes from both LINQ to SQL and LINQ to Entities issuing an insert and a select for every record.

Second attempt

Still working in C#, I tried to speed it up using the available bulk insert methods (for example: Accelerate insertions using Linq-to-SQL - Part 1).

Still slow, albeit noticeably faster than the first attempt.

At this point, I switched to a stored procedure that handles the XML shredding and inserting, with the C# code combining the files into a single XML string and adding a wrapper tag.

Third attempt

Using a SQL Server XML query similar to this (@xml is the XML file) [from memory]:

select credit = T.observation.value('credit[1]', 'varchar(256)')
       ,... -- the rest of the elements possible in the file.
from @xml.nodes('wrapper') W(station)
    cross apply W.station.nodes('current_observation') T(observation)

I let it run for 15 minutes and cancelled it with 250 records processed.

Fourth attempt

I changed the query to use OpenXML:

declare @idoc int

exec sp_xml_preparedocument @idoc output, @xml

select Credit
       ,... -- the rest of the elements
from openxml(@idoc, '/wrapper/current_observation', 2)
    with (
        Credit varchar(256) 'credit'
        ,...) -- the rest of the elements

exec sp_xml_removedocument @idoc
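
For readers who want to run the OpenXML approach end to end, here is a minimal sketch against a tiny hand-written document. The sample data, the station_id and temp_f columns, and the column types are assumptions made for illustration; they are not the full NOAA schema or the original stored procedure.

```sql
-- Hypothetical sample: two observations inside a single <wrapper> element.
-- Element names other than credit are illustrative assumptions.
declare @xml xml = N'
<wrapper>
  <current_observation>
    <credit>NOAA''s National Weather Service</credit>
    <station_id>KDCA</station_id>
    <temp_f>71.0</temp_f>
  </current_observation>
  <current_observation>
    <credit>NOAA''s National Weather Service</credit>
    <station_id>KIAD</station_id>
    <temp_f>68.0</temp_f>
  </current_observation>
</wrapper>';

declare @idoc int;

-- Parse the document once; @idoc is a handle to the parsed tree.
exec sp_xml_preparedocument @idoc output, @xml;

-- Flag 2 requests element-centric mapping: each column is read from a child element.
select Credit, StationId, TempF
from openxml(@idoc, '/wrapper/current_observation', 2)
    with (
        Credit    varchar(256) 'credit',
        StationId varchar(16)  'station_id',
        TempF     decimal(5,1) 'temp_f'
    );

-- Release the handle; otherwise the parsed document stays in memory until the session ends.
exec sp_xml_removedocument @idoc;
```

The query should return one row per current_observation element. In a real procedure the select would typically feed an insert into the target table.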

This processed all 4000+ records in 10 seconds!

Why is there such a dramatic difference in performance between the different methods?


If I understand the XML correctly, there is a single top-level <wrapper> node that contains a list of <current_observation> nodes, so the XQuery could be simplified to:

select 
    credit = T.observation.value('credit[1]', 'varchar(256)')
    ,... -- the rest of the elements possible in the file.
from 
    @xml.nodes('wrapper/current_observation') T(observation)



Do you use the parent axis ('..') anywhere in the query? Also, try addressing the text() node explicitly, like this:

select
o.c.value('(credit/text())[1]', 'varchar(max)'),
--...
from @xml.nodes('wrapper/current_observation') o(c)
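
As a self-contained illustration of the text() suggestion, the following sketch shreds a small hand-written document with a single nodes() call. The sample data and the station_id and temp_f columns are assumptions for illustration only. The @xml variable is untyped (no XML schema collection is bound to it), which is the case this technique is aimed at.

```sql
-- Hypothetical sample document; only the credit element comes from the question.
declare @xml xml = N'
<wrapper>
  <current_observation>
    <credit>NOAA''s National Weather Service</credit>
    <station_id>KDCA</station_id>
    <temp_f>71.0</temp_f>
  </current_observation>
  <current_observation>
    <credit>NOAA''s National Weather Service</credit>
    <station_id>KIAD</station_id>
    <temp_f>68.0</temp_f>
  </current_observation>
</wrapper>';

-- One nodes() call yields one row per <current_observation>.
-- (element/text())[1] addresses the first text node of each child element directly.
select
    Credit    = o.c.value('(credit/text())[1]',     'varchar(256)'),
    StationId = o.c.value('(station_id/text())[1]', 'varchar(16)'),
    TempF     = o.c.value('(temp_f/text())[1]',     'decimal(5,1)')
from @xml.nodes('/wrapper/current_observation') as o(c);
```

Whether this closes the gap with OpenXML depends on the data and the server version, so it is worth timing both variants on the real files.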

Have you tried using the text() accessor? I got a 15-20% improvement in my test with a 6 MB XML file containing 4096 entries, although this only applies to untyped XML (XML with no XSD bound to it in SQL Server).

I also found that my query runs in 10-12 seconds, so I'm still a little puzzled by your 43 seconds. Which version and service pack of SQL Server are you using? I recall that SQL 2005 had a problem with inserting into a table variable, but that was supposedly fixed.
