Parsing an HTTP Response with Nokogiri
Hi, I am having trouble parsing Net::HTTPResponse objects using Nokogiri.
I use this function to fetch the page:
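require 'net/http'
require 'uri'
require 'nokogiri'

# Fetches uri_str, following up to `limit` redirects, and returns the response object.
def fetch(uri_str, limit = 10)
  raise ArgumentError, 'HTTP redirect too deep' if limit == 0
  # Note: URI.encode was removed in Ruby 3.0, so this line needs an older Ruby.
  url = URI.parse(URI.encode(uri_str.strip))
  puts url
  # `headers` is not defined in this snippet; it is assumed to return a Hash of request headers.
  req = Net::HTTP::Get.new(url.path, headers)
  response = Net::HTTP.start(url.host, url.port) { |http|
    http.request(req)
  }
  case response
  when Net::HTTPSuccess
    puts "this is location" + uri_str
    puts "this is the host #{url.host}"
    puts "this is the path #{url.path}"
    return response
  when Net::HTTPRedirection
    puts "this is redirect" + response['location']
    # fetch takes only (uri_str, limit), so the stray aFile argument is dropped.
    return fetch(response['location'], limit - 1)
  else
    response.error!
  end
end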
html = fetch("http://www.somewebsite.com/hahaha/")
puts html
noko = Nokogiri::HTML(html)
When I do this, puts html prints a whole bunch of gibberish, and Nokogiri complains: "node_set should be Nokogiri::XML::NodeSet".
If anyone can offer help, I would greatly appreciate it.
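One likely cause, offered as a guess rather than a confirmed diagnosis: fetch returns the Net::HTTPResponse object itself, while Nokogiri::HTML expects a String or an IO. Below is a minimal sketch of pulling the markup out before parsing. The helper name html_source is hypothetical and is not part of Net::HTTP or Nokogiri.

```ruby
# Hypothetical helper (not from the original post): returns the markup as a
# String whether it is given a response-like object or a plain String.
def html_source(response_or_string)
  if response_or_string.respond_to?(:body)
    # Net::HTTPResponse exposes the payload through #body (nil if there is none).
    response_or_string.body.to_s
  else
    response_or_string.to_s
  end
end
```

With that in place, the last lines of the script could become noko = Nokogiri::HTML(html_source(html)), assuming nokogiri is loaded. Calling html.body directly would do the same thing when html is known to be a response.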