How to remove the scheme from a URL in Python?

I am working with an application, written using Flask, that returns URLs. I want the URL displayed to the user to be as clean as possible, so I want to remove the http:// from it. I looked around and found the urlparse library, but could not find any examples of how to do this. What would be the best way to go about it, and if urlparse is overkill, is there a simpler way? Would simply removing the "http://" substring from the URL with ordinary string tools be bad practice or cause problems?

+3
2 answers

I do not think urlparse offers a single method or function for this. Here is how I would do it:

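# py2/3 compatibility
try:
    from urllib.parse import urlparse
except ImportError:
    from urlparse import urlparse

url = 'HtTp://stackoverflow.com/questions/tagged/python?page=2'

def strip_scheme(url):
    parsed = urlparse(url)
    # urlparse lowercases the scheme and geturl() returns the normalized URL,
    # so the replace below also matches oddly cased input such as 'HtTp://'.
    scheme = "%s://" % parsed.scheme
    return parsed.geturl().replace(scheme, '', 1)

print(strip_scheme(url))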

Output:

stackoverflow.com/questions/tagged/python?page=2

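If you used (only) simple string parsing, you would have to deal with http[s], and possibly other schemes, yourself. This approach also handles unusual casing of the scheme.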
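
To illustrate the question's concern about plain substring removal, here is a small sketch comparing it with the urlparse-based function. The `naive_strip` helper and the example.com URLs are made up for this comparison, and the import is the Python 3 one.

```python
from urllib.parse import urlparse


def naive_strip(url):
    # Hypothetical helper: plain substring removal. It only matches the
    # exact lowercase "http://" text, nothing else.
    return url.replace('http://', '', 1)


def strip_scheme(url):
    # Same approach as in the answer: let urlparse find and normalize the scheme.
    parsed = urlparse(url)
    scheme = "%s://" % parsed.scheme
    return parsed.geturl().replace(scheme, '', 1)


for url in ('http://example.com/a', 'https://example.com/a', 'HtTp://example.com/a'):
    print(naive_strip(url), '|', strip_scheme(url))
```

The plain replace only cleans up the first URL; the https and mixed-case ones come back unchanged, while `strip_scheme` returns example.com/a for all three.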

+5

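If you are handling these URLs programmatically, then rather than using a replace, I suggest having urlparse rebuild the URL without the scheme.

A ParseResult is a tuple, so you can create another one that leaves out the fields you do not want.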

# py2/3 compatibility
try:
    from urllib.parse import urlparse, ParseResult
except ImportError:
    from urlparse import urlparse, ParseResult


def strip_scheme(url):
    parsed_result = urlparse(url)
    return ParseResult('', *parsed_result[1:]).geturl()

Any other part of the parsed result can be dropped the same way, by passing an empty string in its place.

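Note that this is not functionally identical to the answer from @Lukas Graf. Only the scheme is removed, so the leading "//" stays in the URL, as the comparison below shows.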

>>> Lukas_strip_scheme('https://yoman/hi?whatup')
'yoman/hi?whatup'
>>> strip_scheme('https://yoman/hi?whatup')
'//yoman/hi?whatup'
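
If the leading "//" is not wanted, one possible variant is to rebuild the URL without the scheme and then drop those two slashes when a network location is present. This is only a sketch, not part of either answer; the name `strip_scheme_and_slashes` is made up here, and the import is the Python 3 one.

```python
from urllib.parse import urlparse, ParseResult


def strip_scheme_and_slashes(url):
    parsed = urlparse(url)
    # Rebuild the URL with an empty scheme, as in strip_scheme above.
    rebuilt = ParseResult('', *parsed[1:]).geturl()
    # With a network location, the rebuilt URL starts with "//"; drop it.
    if parsed.netloc and rebuilt.startswith('//'):
        return rebuilt[2:]
    return rebuilt


print(strip_scheme_and_slashes('https://yoman/hi?whatup'))
```

For the URL used above this prints yoman/hi?whatup, the same result as Lukas_strip_scheme.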
0
