Getting the first image from html using Python / Django

I pull in a chunk of HTML from a service and know a little about its structure. I am looking for a way to extract the link from the first image tag.

Something similar to this jQuery code:

var imagelink = $('img:first', feed.content).attr('src');

But, of course, using only Python / Django (the server runs on Google App Engine). I would prefer not to use any additional libraries just to get a simple link.

+3
3 answers

Once I am more familiar with HTML, I will probably consider one of the suggested libraries. For now, I have solved it like this:

    # Locate the first <img tag; str.find returns -1 when nothing matches,
    # so the check has to happen before any offset is added.
    startImgPos = post.find('<img')
    if startImgPos > -1:
        endImgPos = post.find('>', startImgPos)
        imageTag = post[startImgPos:endImgPos]
        # Look for the src attribute inside this tag only.
        startSrcPos = imageTag.find('src="')
        if startSrcPos > -1:
            startSrcPos += len('src="')
            endSrcPos = imageTag.find('"', startSrcPos)
            r['linktag'] = imageTag[startSrcPos:endSrcPos]

I will improve this later, but at the moment it does the trick. Feel free to suggest more ideas / improvements for the above code.
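One possible improvement that still avoids third-party libraries is the HTML parser in the standard library. The following is only a minimal sketch, assuming Python 3 (on Python 2 the module is named HTMLParser instead of html.parser); the names FirstImageParser and first_image_src are made up for illustration.

```python
from html.parser import HTMLParser


class FirstImageParser(HTMLParser):
    """Remembers the src of the first <img> tag that carries one."""

    def __init__(self):
        super().__init__()
        self.src = None

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; attrs is a list of
        # (name, value) pairs. Self-closing tags such as <img ... /> are
        # routed here as well by the default handle_startendtag.
        if tag == 'img' and self.src is None:
            self.src = dict(attrs).get('src')


def first_image_src(html):
    """Return the src of the first <img> with a src, or None if there is none."""
    parser = FirstImageParser()
    parser.feed(html)
    parser.close()
    return parser.src
```

Calling first_image_src(post) would then take the place of the manual position arithmetic. Unlike the string search, this also copes with single-quoted attributes and uppercase tag names.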

0

You can use BeautifulSoup for this:

http://www.crummy.com/software/BeautifulSoup/

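It is an XML/HTML parser: you pass in the raw HTML and can then search it for particular tags, attributes, and so on.

Something like this: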

from bs4 import BeautifulSoup
tree = BeautifulSoup(raw_html, 'html.parser')
img_link = tree.find('img')['src']  # find() returns the first matching tag
+7

A variant that takes the first result of find_all and reads the attribute with get:

tree = BeautifulSoup(raw_html)
img_link = tree.find_all('img')[0].get('src')

It works great! Thanks, timmy-omahony.

+3
