I am working with NOAA's current observation XML files (for example: Washington, DC) and shredding the files for 4000+ stations into a SQL Server 2008 R2 table. Having tried many different approaches, I have one with which I am moving forward.
This question is about the performance differences between the methods and, most importantly, why the difference is so stark.
First attempt
Working in C#, I parsed all the files with LINQ to XML and wrote the resulting records to the database with LINQ to SQL. The code for this is predictable, so I won't bore you with it.
Rewriting it with LINQ to Entities (Entity Framework) did not help.
The result was an application that ran for more than an hour and processed only 1600 files. The slowness comes from both LINQ to SQL and LINQ to Entities issuing an insert and a select for each record.
Second attempt
While still working in C#, I tried to speed it up using the available bulk insert methods (for example: Accelerate insertions using Linq-to-SQL - Part 1).
Still slow, albeit noticeably faster than the first attempt.
At this point, I switched to a stored procedure to handle the XML shredding and inserting, with the C# code combining the files into a single XML string and adding a wrapper tag.
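For illustration, the combining step might look roughly like the sketch below. This is not my exact code: the class and method names (ObservationBatcher, Combine) are made up for the example, and it assumes each station file is a well-formed document whose root element is current_observation. Parsing each file and keeping only its root element drops the per-file XML declaration, which would otherwise make the concatenated string malformed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ObservationBatcher
{
    // Hypothetical helper: takes the text of each station file and nests
    // every root element under a single <wrapper> element.
    public static string Combine(IEnumerable<string> xmlTexts)
    {
        // XDocument.Parse(...).Root keeps only the root element, so each
        // file's XML declaration and processing instructions are left behind.
        var wrapper = new XElement("wrapper",
            xmlTexts.Select(text => XDocument.Parse(text).Root));

        // DisableFormatting keeps the string compact (no added whitespace).
        return wrapper.ToString(SaveOptions.DisableFormatting);
    }
}
```

The resulting string is what would then be passed to the stored procedure as its @xml parameter.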
Third attempt
Using a SQL Server XML query similar to this (@xml holds the combined XML files) [from memory]:
select credit = T.observation.value('credit[1]', 'varchar(256)')
,...
from @xml.nodes('wrapper') W(station)
cross apply W.station.nodes('current_observation') T(observation)
I let it run for 15 minutes and then cancelled it with 250 records processed.
OpenXML:
declare @idoc int
exec sp_xml_preparedocument @idoc output, @xml
select Credit
,...
from openxml(@idoc, '/wrapper/current_observation', 2)
with (
Credit varchar(256) 'credit'
,...)
exec sp_xml_removedocument @idoc
4000+ 10 ! .
, , .
, ,
" ?"
, , 3 .