How do you write a test for the `Iconv.new("UTF8//IGNORE", ...)` idiom?

Question

This Iconv idiom transcodes a string to UTF-8 and drops characters that cannot be converted:

require "iconv"

def normalize(text)
  Iconv.new('UTF-8//IGNORE', 'UTF-8').iconv(text.dup)
end

How do you actually write a test for this?

Edit: I ended up simplifying the question, because I realized that trying to test this inside a Rails file marked `# encoding: utf-8` was complicating the problem. So the bounty is a bit pointless now, but I will still award it if someone can show me a test I can work with.

+5

ruby ruby-on-rails ruby-on-rails-3.2 character-encoding

danneu Feb 12 '13 at 0:22

2 answers

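To test this you need a string whose bytes are invalid for its encoding; see String#encoding.

For example, from a URL-encoded string: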

require "iconv"
require "cgi"

def normalize(text)
  Iconv.new('UTF-8//IGNORE', 'UTF-8').iconv(text)
end

puts normalize(CGI.unescape("m%FCstring")) # => mstring

If you are on Ruby 1.9, you do not need Iconv; use encode! instead.
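As a rough sketch of the Iconv-free route (not code from the original posts): on Ruby 2.1 or later, String#scrub can drop invalid bytes directly. The method name normalize_without_iconv is made up for illustration, and the literal "m\xFCstring" carries the same bytes as CGI.unescape("m%FCstring") above.

```ruby
# Hypothetical Iconv-free variant of normalize (assumes Ruby >= 2.1).
# It treats the input as UTF-8 and removes any byte sequence that is
# not valid UTF-8, which is roughly what 'UTF-8//IGNORE' is used for.
def normalize_without_iconv(text)
  # dup so the caller's string (possibly frozen) is left untouched
  text.dup.force_encoding("UTF-8").scrub("")
end

# 0xFC is not a valid byte in UTF-8, so it is dropped.
puts normalize_without_iconv("m\xFCstring") # => mstring
```

Valid UTF-8 input passes through unchanged, so this only affects strings that actually contain broken bytes.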

+1

phoet 14 . '13 20:44

severin · Accepted Answer · 2013-02-16T15:42:45+0000

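You can build a string with invalid bytes using Array#pack, then check that the invalid characters are removed/ignored.

For example: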

describe "#normalize" do
  it "should remove/ignore invalid characters" do
    # this "string" equals "Mandados de busca do caso Megaupload considerados inv\xE1lidos - Tecnologia - Sol"
    bad_string = [77, 97, 110, 100, 97, 100, 111, 115, 32, 100, 101, 32, 98, 117, 115, 99, 97, 32, 100, 111, 32, 99, 97, 115, 111, 32, 77, 101, 103, 97, 117, 112, 108, 111, 97, 100, 32, 99, 111, 110, 115, 105, 100, 101, 114, 97, 100, 111, 115, 32, 105, 110, 118, 225, 108, 105, 100, 111, 115, 32, 45, 32, 84, 101, 99, 110, 111, 108, 111, 103, 105, 97, 32, 45, 32, 83, 111, 108].pack('c*').force_encoding('UTF-8')

    normalize(bad_string).should == 'Mandados de busca do caso Megaupload considerados invlidos - Tecnologia - Sol'
  end
end
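A condensed sketch of the same idea, using only the "inválidos" bytes from the spec above. It also checks that the input really is invalid UTF-8 before normalizing, which helps rule out source-file encoding surprises. Note the assumption: normalize here is a stand-in built on String#scrub (Ruby 2.1+) so the snippet runs without the iconv gem; swap in the Iconv-based definition to exercise the original idiom.

```ruby
# Stand-in for the Iconv-based normalize (assumption, not the original code).
def normalize(text)
  text.dup.force_encoding("UTF-8").scrub("")
end

# Bytes for "inv\xE1lidos": 225 (0xE1) is "á" in ISO-8859-1,
# but on its own it is not valid UTF-8.
bad_string = [105, 110, 118, 225, 108, 105, 100, 111, 115]
             .pack("C*").force_encoding("UTF-8")

checks = {
  "input is invalid UTF-8"  => !bad_string.valid_encoding?,
  "output is valid UTF-8"   => normalize(bad_string).valid_encoding?,
  "invalid byte is dropped" => normalize(bad_string) == "invlidos"
}

checks.each { |name, ok| puts "#{ok ? 'PASS' : 'FAIL'}: #{name}" }
```

The same three conditions translate directly into RSpec or Minitest expectations.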
