Search or index XML files

I work on a news website that stores all of its stories as XML. I know, not the best way to go, but it is what it is. What I'm trying to do is search those XML files from the website. Right now, our search function relies entirely on Google, so it only finds what Google has already crawled.

My first thought is to use grep, which works well but probably won't scale. Another option, which would take much more work but should perform better, is to store parts of the XML in a relational database.
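To make the relational option concrete, here is a minimal sketch using Python's standard library and SQLite. It copies the searchable parts of each story into a table and queries that table instead of the files. The element names `headline` and `body`, the `stories` table, and the functions `build_index` and `search` are all assumptions for illustration; the real story schema is not shown here.

```python
import sqlite3
import xml.etree.ElementTree as ET
from pathlib import Path


def build_index(xml_dir, conn):
    """Copy the searchable parts of each story file into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stories "
        "(path TEXT PRIMARY KEY, headline TEXT, body TEXT)"
    )
    for path in sorted(Path(xml_dir).glob("*.xml")):
        root = ET.parse(path).getroot()
        # <headline> and <body> are hypothetical element names.
        # findtext() returns only the text directly inside the element,
        # so nested markup in a real story would need extra handling.
        headline = root.findtext("headline", default="")
        body = root.findtext("body", default="")
        conn.execute(
            "INSERT OR REPLACE INTO stories VALUES (?, ?, ?)",
            (str(path), headline, body),
        )
    conn.commit()


def search(conn, term):
    """Return the paths of stories whose headline or body contains term."""
    # LIKE is case-insensitive for ASCII in SQLite; % and _ in term act as wildcards.
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT path FROM stories "
        "WHERE headline LIKE ? OR body LIKE ? ORDER BY path",
        (pattern, pattern),
    )
    return [row[0] for row in rows]
```

The XML files stay the source of truth, so the index can be rebuilt at any time without touching the existing storage model.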

Given how our backend is set up, switching to another storage model would take a lot of time, so for now we have to work with what we have. Any ideas?

+5
3 answers

Adding caching can help make the grep idea workable. Still, consider a solution that not only solves the problem today but also brings you closer to a better solution tomorrow. Designing the better solution now and implementing it piecemeal over time may do the trick.
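As a rough illustration of the caching idea, the sketch below does a grep-style scan of the raw files and memoizes results per search term with `functools.lru_cache`. The function name `cached_search` and the flat directory of `*.xml` files are assumptions. The cache is never invalidated here, so it would need clearing whenever stories are added or changed.

```python
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=256)
def cached_search(xml_dir, term):
    """Scan every XML file for term (case-insensitive) and remember the result."""
    needle = term.lower()
    hits = []
    for path in sorted(Path(xml_dir).glob("*.xml")):
        # Like grep, this matches the raw file text, including tag names.
        text = path.read_text(encoding="utf-8", errors="replace")
        if needle in text.lower():
            hits.append(path.name)
    # Return a tuple so callers cannot mutate the cached value.
    return tuple(hits)
```

Repeated searches for the same term are then answered from memory. Calling `cached_search.cache_clear()` after a publish step is one simple way to keep results fresh.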

+3

I would also suggest using an XML database system such as BaseX (basex.org), as it is very fast. I would suggest storing each article in a separate file. BaseX supports XQuery 3.0, as well as the XQuery Full Text and Update extensions ...

+1

Since the data is already XML, consider a native XML database such as Berkeley DB XML or eXist-db, which you can search with XQuery.

0
