Replace MS Outlook cid: image sources in an HTML string with a regex?

I have an application that reads the original HTML of an email and downloads all of its attachments. This works fine, except that Microsoft Outlook writes some weird src values for embedded images, like this:

<img width="163" height="39" id="Picture_x0020_1" src="cid:image001.png@01CD7F6C.70CD2320" alt="Description: Description: Description: cid:image001.png@01CC6D59.AEF6D270">

First, I would like to change the src to just Attachments\image001.png. In addition, the alt should just be image001.png, not that long, repetitive text. I'm not quite sure how to do this.

1 answer

You should use Regex (I updated the tags in your question to reflect this):

// Regex.Replace returns a new string, so assign the result back.
text = Regex.Replace(text, @"src=""cid:(?<FileName>[^@]+)@[^""]*""", @"src=""Attachments\${FileName}""",
    RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
// Collapse the long alt text down to the bare file name.
text = Regex.Replace(text, @"alt=""[^.]*cid:(?<FileName>[^@]+)@[^""]*""", @"alt=""${FileName}""",
    RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

I'm sure there are more efficient ways to do this, but this is what I could come up with.
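
As a rough sketch of one such alternative, both attributes could be handled in a single pass with a MatchEvaluator. The names OutlookHtmlCleaner and Rewrite are made up for illustration, the Attachments folder is taken from the question, and the pattern assumes attribute values are always double-quoted, so treat it as a starting point rather than a robust HTML rewriter.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the class and method names are illustrative only.
public static class OutlookHtmlCleaner
{
    // Matches src="...cid:NAME@..." or alt="...cid:NAME@..." and captures
    // the attribute name and the file name that follows "cid:".
    private static readonly Regex CidAttribute = new Regex(
        @"\b(?<attr>src|alt)=""[^""]*?cid:(?<FileName>[^@""]+)@[^""]*""",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    public static string Rewrite(string html)
    {
        return CidAttribute.Replace(html, m =>
        {
            string name = m.Groups["FileName"].Value;
            bool isSrc = m.Groups["attr"].Value
                .Equals("src", StringComparison.OrdinalIgnoreCase);

            // src points into the Attachments folder; alt keeps only the file name.
            // Note: the attribute name is emitted in lower case.
            return isSrc
                ? "src=\"Attachments\\" + name + "\""
                : "alt=\"" + name + "\"";
        });
    }
}
```

Calling OutlookHtmlCleaner.Rewrite(text) on the img tag from the question should leave width, height and id untouched and rewrite only the src and alt attributes. Compiling the pattern once into a static Regex also avoids re-parsing it on every call.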
