Create a simple RSS reader, extract content

I am trying to make a simple RSS reader using the SyndicationFeed class.

There are some standard tags such as <title>, <link>, and <description>; those are no problem.

But there are other tags. For example, this feed, created by WordPress, has a <content:encoded> tag. I assume other websites may use still other tags for their content, right?

I want to know how to find the main content of each item. Are there any standards? What tags should I look for?

(For example, one site may use <content:encoded>, others just use <description>, and others may follow a different standard. I don't know how to reliably get the main content of an item.)

PS: I use this code to test my simple RSS reader:

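        // Requires: using System.ServiceModel.Syndication; using System.Xml; using System.Xml.Linq;
        var reader = XmlReader.Create("http://feed.2barnamenevis.com/2barnamenevis");
        var feed = SyndicationFeed.Load(reader);

        string s = "";
        foreach (SyndicationItem i in feed.Items)
        {
            // Title and Summary are null when the feed omits <title> or <description>.
            string title = i.Title != null ? i.Title.Text : "";
            string summary = i.Summary != null ? i.Summary.Text : "";
            s += title + "<br />" + summary + "<br />" + i.PublishDate.ToString() + "<br />";

            // Non-standard elements such as <content:encoded> show up as element extensions.
            foreach (SyndicationElementExtension extension in i.ElementExtensions)
            {
                XElement ele = extension.GetObject<XElement>();
                s += ele.Name + " :: " + ele.Value + "<br />";
            }
            s += "<hr />";
        }
        return s;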
3 answers

I found the Argotic Syndication Framework (thanks to JoeEnos).

Argotic has many extensions that can be used to process elements that are not standard.

For example, you can use Argotic.Extensions.Core.SiteSummaryContentSyndicationExtension to extract <content:encoded>. You can see an example here. (If that example returns null for the content, just use MyRssItem.Description.)
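The same fallback idea (prefer <content:encoded>, otherwise use the description) can also be sketched with the SyndicationFeed class from the question instead of Argotic. This is only a minimal sketch: the GetMainContent helper, the class name, and the inline sample feed are made up for illustration, and it assumes the feed uses the usual content module namespace.

```csharp
using System;
using System.IO;
using System.Linq;
using System.ServiceModel.Syndication;
using System.Xml;
using System.Xml.Linq;

public static class ContentEncodedDemo
{
    // Namespace URI of the RSS "content" module that defines <content:encoded>.
    private const string ContentNs = "http://purl.org/rss/1.0/modules/content/";

    // Hypothetical helper: returns <content:encoded> when present,
    // otherwise falls back to the item's summary (<description>).
    public static string GetMainContent(SyndicationItem item)
    {
        string encoded = item.ElementExtensions
            .Where(e => e.OuterName == "encoded" && e.OuterNamespace == ContentNs)
            .Select(e => e.GetObject<XElement>().Value)
            .FirstOrDefault();

        if (!string.IsNullOrEmpty(encoded))
            return encoded;

        return item.Summary != null ? item.Summary.Text : string.Empty;
    }

    public static void Main()
    {
        // Made-up sample feed so the sketch runs without network access.
        const string xml = @"<rss version=""2.0"" xmlns:content=""http://purl.org/rss/1.0/modules/content/"">
<channel><title>Demo</title><link>http://example.com/</link><description>demo feed</description>
<item><title>A</title><description>short</description><content:encoded><![CDATA[<p>full</p>]]></content:encoded></item>
<item><title>B</title><description>only summary</description></item>
</channel></rss>";

        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            var feed = SyndicationFeed.Load(reader);
            foreach (var item in feed.Items)
                Console.WriteLine(item.Title.Text + ": " + GetMainContent(item));
        }
    }
}
```

On .NET Core and later, System.ServiceModel.Syndication comes from the NuGet package of the same name; on .NET Framework it lives in System.ServiceModel.dll.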

Some other useful extensions are WellFormedWebCommentsSyndicationExtension (to get the comments feed) and SiteSummarySlashSyndicationExtension (to get the comment count).
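For comparison, the elements behind those two extensions (<wfw:commentRss> and <slash:comments>) can also be read without Argotic, through SyndicationItem.ElementExtensions. The sketch below is illustrative only: FindExtensionValue, the class name, and the sample feed are hypothetical, and it assumes the feed declares the standard wfw and slash namespaces.

```csharp
using System;
using System.IO;
using System.Linq;
using System.ServiceModel.Syndication;
using System.Xml;
using System.Xml.Linq;

public static class CommentExtensionsDemo
{
    // Namespace URIs of the Well-Formed Web comment API and the Slash module.
    public const string WfwNs = "http://wellformedweb.org/CommentAPI/";
    public const string SlashNs = "http://purl.org/rss/1.0/modules/slash/";

    // Hypothetical helper: text of the first extension element with the
    // given local name and namespace, or null when the item has none.
    public static string FindExtensionValue(SyndicationItem item, string name, string ns)
    {
        return item.ElementExtensions
            .Where(e => e.OuterName == name && e.OuterNamespace == ns)
            .Select(e => e.GetObject<XElement>().Value)
            .FirstOrDefault();
    }

    public static void Main()
    {
        // Made-up sample feed so the sketch runs without network access.
        const string xml = @"<rss version=""2.0""
  xmlns:wfw=""http://wellformedweb.org/CommentAPI/""
  xmlns:slash=""http://purl.org/rss/1.0/modules/slash/"">
<channel><title>Demo</title><link>http://example.com/</link><description>demo feed</description>
<item><title>A</title><description>short</description>
<wfw:commentRss>http://example.com/a/feed/</wfw:commentRss>
<slash:comments>3</slash:comments></item>
</channel></rss>";

        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            var feed = SyndicationFeed.Load(reader);
            foreach (var item in feed.Items)
            {
                // Either value is null when the feed does not provide the element.
                Console.WriteLine("Comments feed: " + FindExtensionValue(item, "commentRss", WfwNs));
                Console.WriteLine("Comment count: " + FindExtensionValue(item, "comments", SlashNs));
            }
        }
    }
}
```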


Two libraries worth looking at for this are Argotic and RSS.NET.


It depends on what you want to support. The content element is not part of RSS 2.0, but it is part of Atom (RFC 4287).

Check out the RSS 2.0 spec: http://cyber.law.harvard.edu/rss/rss.html#hrelementsOfLtitemgt

Read the Atom spec: http://tools.ietf.org/html/rfc4287
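To illustrate the Atom side with the SyndicationFeed class from the question: when an Atom entry has a <content> element, it is exposed as SyndicationItem.Content, while <summary> (or an RSS <description>) ends up in SyndicationItem.Summary. The sketch below is a rough illustration under those assumptions; GetBody, the class name, and the sample feed are hypothetical, and only text content (text, html, xhtml) is handled.

```csharp
using System;
using System.IO;
using System.ServiceModel.Syndication;
using System.Xml;

public static class AtomContentDemo
{
    // Hypothetical helper: prefers the Atom <content> text and falls back
    // to the summary when the entry has no textual content.
    public static string GetBody(SyndicationItem item)
    {
        var content = item.Content as TextSyndicationContent;
        if (content != null && !string.IsNullOrEmpty(content.Text))
            return content.Text;

        return item.Summary != null ? item.Summary.Text : string.Empty;
    }

    public static void Main()
    {
        // Made-up Atom feed so the sketch runs without network access.
        const string xml = @"<feed xmlns=""http://www.w3.org/2005/Atom"">
<title>Demo</title><id>urn:demo:feed</id><updated>2020-01-01T00:00:00Z</updated>
<entry><title>A</title><id>urn:demo:a</id><updated>2020-01-01T00:00:00Z</updated>
<summary>short</summary><content type=""html"">&lt;p&gt;full&lt;/p&gt;</content></entry>
<entry><title>B</title><id>urn:demo:b</id><updated>2020-01-01T00:00:00Z</updated>
<summary>only summary</summary></entry>
</feed>";

        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            // SyndicationFeed.Load accepts both RSS 2.0 and Atom 1.0 documents.
            var feed = SyndicationFeed.Load(reader);
            foreach (var item in feed.Items)
                Console.WriteLine(item.Title.Text + ": " + GetBody(item));
        }
    }
}
```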
