I am looking to write (or use an existing) web service that accepts an MS Word or PDF file, extracts its contents, and returns them as text.
Does anyone know of such a service, or how to write one?
For Word-to-text, you can use antiword and pass the result to the client.
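For a PDF file, PdfTk's dump_data operation may be useful, although it reports document metadata (such as the title, bookmarks, and page count) rather than the text of the pages.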
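
As a rough sketch of how the two tools could sit behind a service, the Python below picks a command by file extension and returns whatever the tool prints to standard output. It assumes antiword and pdftk are installed and on the PATH; the names COMMANDS, build_command, and extract_text are made up for illustration, and the web layer (upload handling, HTTP response) is left out.

```python
import subprocess
from pathlib import Path

# Hypothetical mapping from file extension to a command line.
# antiword prints the text of a .doc file to stdout.
# pdftk's dump_data prints PDF metadata to stdout when no output file is given.
COMMANDS = {
    ".doc": lambda p: ["antiword", p],
    ".pdf": lambda p: ["pdftk", p, "dump_data"],
}


def build_command(path):
    """Return the argument list for the tool that handles this file type."""
    suffix = Path(path).suffix.lower()
    if suffix not in COMMANDS:
        raise ValueError(f"unsupported file type: {suffix!r}")
    return COMMANDS[suffix](str(path))


def extract_text(path, timeout=30):
    """Run the external tool and return its stdout as a string.

    Assumes the tool is installed; raises CalledProcessError on a
    non-zero exit status and TimeoutExpired if the tool hangs.
    """
    result = subprocess.run(
        build_command(path),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout
```

A request handler would save the upload to a temporary file, call extract_text on it, and send the returned string back to the client. Passing the arguments as a list (rather than a shell string) avoids shell injection through uploaded file names.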