Users can enter websites as domain names. They can also enter email addresses for their contacts.
Now we need to find the customers whose website domain can be associated with those email addresses.
My idea is to extract the host from the website and from the email address and compare them.
So what is the most reliable algorithm to get the hostname from the URL?
For example, the entered website can be:
foo.com
www.foo.com
http:
https:
https:
The result should always be foo.com
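
To make the idea concrete, here is a minimal sketch in Python of what I have in mind. It is not a definitive solution. The function names `normalize_host` and `email_domain` are made up for illustration, and stripping a leading "www." is only a heuristic I am assuming is good enough for these inputs.

```python
from urllib.parse import urlsplit


def normalize_host(value: str) -> str:
    """Return the lowercased host of a URL or bare domain, without a leading 'www.'."""
    value = value.strip().lower()
    # urlsplit only recognises the host when a scheme or a leading "//" is present,
    # so bare inputs such as "www.foo.com" get a "//" prefix first.
    if "://" not in value:
        value = "//" + value
    host = urlsplit(value).hostname or ""
    # Assumption: "www." is the only prefix that should be ignored.
    if host.startswith("www."):
        host = host[len("www."):]
    return host


def email_domain(address: str) -> str:
    """Return the lowercased part after the last '@', or '' if there is none."""
    _, sep, domain = address.strip().lower().rpartition("@")
    return domain if sep else ""


def same_domain(website: str, address: str) -> bool:
    """Compare the normalized website host with the email domain."""
    host = normalize_host(website)
    return host != "" and host == email_domain(address)
```

With this, `same_domain("https://www.foo.com/contact", "alice@foo.com")` is true. It does not reduce other subdomains (for example shop.foo.com) to foo.com; as far as I understand, that would need the Public Suffix List, for instance through a third-party library such as tldextract.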