Handling bad UTF-8 from JSON, in Ruby

I am retrieving data from a remote JSON API at http://hndroidapi.appspot.com/news/format/json/page/?appid=test . The problem I am facing is that this API seems to generate its JSON without handling UTF-8 encoding properly (correct me if I am wrong here). For example, this is part of the result being returned right now:

{
"title":"IPad - please don€™t ding while you and I are asleep  ",
"url":"http://modern-products.tumblr.com/post/25384729998/ipad-please-dont-ding-while-you-and-i-are-asleep",
"score":"10 points",
"user":"roee",
"comments":"18 comments",
"time":"1 hour ago",
"item_id":"4128497",
"description":"10 points by roee 1 hour ago  | 18 comments"
}

Note the don€™t. And that is not the only kind of character it chokes on. Is there anything I can do to convert the data into something clean, given that I don't control the API?

Edit:

This is how I pull down the JSON:

  hn_url = "http://hndroidapi.appspot.com/news/format/json/page/?appid=test"
  url = URI.parse(hn_url)

  # Attempt to get the json
  req = Net::HTTP::Get.new(hn_url)
  req.add_field('User-Agent', 'Test')
  res = Net::HTTP.start(url.host, url.port) {|http| http.request(req) }
  response = res.body
  if response.nil?
    puts "Bad response when fetching HN json"
    return
  end

  # Attempt to parse the json
  result = JSON.parse(response)
  if result.nil?
    puts "Error parsing HN json"
    return
  end

Edit 2:

I have reported the problem to the API's maintainer on GitHub. The issue is here: https://github.com/glebpopov/Hacker-News-Droid-API/issues/4

+5
2

The JSON body appears to be labelled US-ASCII rather than UTF-8 when it comes back from Net::HTTP.

1.9.3p194 :044 > puts res.body.encoding
US-ASCII

In Ruby 1.9.3 you can force the encoding to the one you expect:

response = res.body.force_encoding('UTF-8')

The JSON is then treated as UTF-8 and can be parsed.
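To see the effect without hitting the network, here is a minimal sketch. The `raw` string is a hypothetical stand-in for `res.body` (UTF-8 bytes carrying a US-ASCII label), and the title text is made up for illustration; it is not the API's actual output.

```ruby
require 'json'

# Hypothetical stand-in for res.body: UTF-8 bytes carrying a US-ASCII label.
raw = "{\"title\":\"don\u2019t ding\"}".b.force_encoding('US-ASCII')

# Relabel a copy as UTF-8. force_encoding changes only the label, not the bytes.
fixed = raw.dup.force_encoding('UTF-8')

# Guard against bodies that are not actually UTF-8 before parsing.
raise ArgumentError, 'body is not valid UTF-8' unless fixed.valid_encoding?

result = JSON.parse(fixed)
puts result['title']
```

Because `force_encoding` only relabels, it helps when the bytes really are UTF-8. If the server has already mangled the characters, relabelling alone will not repair them.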

+4

force_encoding . , .

Net::HTTP - .

1.9.3:

  • , ASCII-8BIT. , , .
  • http.request Get, US-ASCII. .
  • http.get, .
    • , ASCII-8BIT
    • , US-ASCII

US-ASCII, , Net::HTTP , , US-ASCII. ( net/ , ruby.)

ASCII-8BIT, Get .
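Since the body can arrive labelled either ASCII-8BIT or US-ASCII, one option is to relabel it the same way in both cases. The sketch below assumes the bytes are really UTF-8 unless a charset says otherwise; `body_as` is a hypothetical helper, not part of Net::HTTP.

```ruby
# Hypothetical helper: return a copy of the body relabelled with the given
# charset name, falling back to UTF-8 when the name is missing or unknown.
def body_as(body, charset = nil)
  encoding = begin
    Encoding.find(charset || 'UTF-8')
  rescue ArgumentError
    Encoding::UTF_8
  end
  body.dup.force_encoding(encoding)
end

# Two stand-ins for a response body: same bytes, different labels.
binary = "caf\u00e9".b                             # labelled ASCII-8BIT
ascii  = "caf\u00e9".b.force_encoding('US-ASCII')  # labelled US-ASCII

from_binary = body_as(binary)
from_ascii  = body_as(ascii)

puts from_binary.encoding
puts from_ascii.encoding
```

With a real response, the charset argument could plausibly be taken from the Content-Type header, for example via `res.type_params['charset']`, though whether this API sends one is not something the thread confirms.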

2.0, , UTF-8, , . -K, . n, e, s, u -K.

+1
