In Python, how do I check if two different links actually point to the same page?

2 answers

Call geturl() on the result of urllib2.urlopen(). geturl() returns the URL of the resource that was actually retrieved, which is typically used to determine whether a redirect was followed.

For instance:

#!/usr/bin/env python
# coding: utf-8

import urllib2

url1 = 'http://www.independent.co.uk/life-style/gadgets-and-tech/news/chinese-blamed-for-gmail-hacking-2292113.html'
url2 = 'http://www.independent.co.uk/life-style/gadgets-and-tech/news/2292113.html'

for url in [url1, url2]:
    # urlopen() follows HTTP redirects; geturl() reports the final URL.
    result = urllib2.urlopen(url)
    print result.geturl()

Output:

http://www.independent.co.uk/life-style/gadgets-and-tech/news/chinese-blamed-for-gmail-hacking-2292113.html
http://www.independent.co.uk/life-style/gadgets-and-tech/news/chinese-blamed-for-gmail-hacking-2292113.html
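The same idea can be wrapped in a small comparison helper. This is only a minimal sketch, assuming Python 3, where urllib2.urlopen() lives at urllib.request.urlopen(). The names final_url and same_page and the resolve parameter are hypothetical and were not part of the original answer. Note that two links resolving to different final URLs may still serve the same page, so a False result is not conclusive.

```python
from urllib.request import urlopen


def final_url(url, timeout=10):
    """Return the URL actually retrieved after any redirects are followed.

    Hypothetical helper: performs a real HTTP request.
    """
    with urlopen(url, timeout=timeout) as response:
        return response.geturl()


def same_page(url_a, url_b, resolve=final_url):
    """Return True if both links resolve to the same final URL.

    `resolve` is an assumed hook that maps a URL to its final URL; it
    defaults to final_url but can be swapped for a stub in tests.
    """
    return resolve(url_a) == resolve(url_b)
```

Calling same_page(url1, url2) with the two links from the example above would issue one request per link and compare the URLs reported by geturl().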

Obviously, it is impossible to tell this from the URLs alone.

