The answer is "this is not possible." At least not for the specific case of hadoop streaming over snappy-compressed files that originate outside of hadoop.
I explored two main options (thoroughly!) to come to this conclusion: (1) use hadoop's built-in snappy compression, as suggested, or (2) write my own streaming module to consume and decompress the snappy files.
For option (1), it looks like hadoop adds some markup to files when it compresses them with snappy. Since my files were compressed with snappy outside of hadoop, hadoop's built-in codec cannot decompress them.
One symptom of this problem was a heap error:
2013-04-03 20:14:49,739 FATAL org.apache.hadoop.mapred.Child (main): Error running child : java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:102)
at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:82)
at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:76)
at java.io.InputStream.read(InputStream.java:85)
...
After raising the heap size via mapred.child.java.opts, I got a different error:
java.io.IOException: IO error in map input file s3n:
Hadoop's snappy codec simply does not work with files that were compressed outside of hadoop.
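To illustrate why the built-in codec might fail this way, here is a minimal Python sketch. It rests on two assumptions that the answer itself does not confirm: that hadoop's block-compressed streams begin with two 4-byte big-endian lengths (uncompressed block size, then compressed chunk size), and that the external files use the snappy framing format, whose stream identifier starts with the bytes ff 06 00 00 "sNaPpY". The function names are made up for illustration. Under those assumptions, the first bytes of an external file are misread as an enormous chunk length, which would be consistent with the heap error above.

```python
import struct

# Stream identifier that opens a file in the snappy framing format.
# Assumption: the externally produced files use this framing.
SNAPPY_FRAMED_MAGIC = b"\xff\x06\x00\x00sNaPpY"


def read_as_hadoop_block_header(first_bytes):
    """Interpret the first 8 bytes as two big-endian 32-bit lengths.

    Assumption: this mirrors how hadoop's block decompressor reads the
    uncompressed block size followed by the compressed chunk size.
    Values are returned unsigned; Java would read them as signed ints.
    """
    if len(first_bytes) < 8:
        raise ValueError("need at least 8 bytes")
    uncompressed_len, chunk_len = struct.unpack(">II", first_bytes[:8])
    return uncompressed_len, chunk_len


def looks_plausible(header, max_block=64 * 1024 * 1024):
    """Return True if both lengths are positive and at most max_block bytes."""
    uncompressed_len, chunk_len = header
    return 0 < uncompressed_len <= max_block and 0 < chunk_len <= max_block


# A synthetic header laid out the way hadoop is assumed to write it.
hadoop_style = struct.pack(">II", 1000, 420) + b"\x00" * 420

# A synthetic file that starts with the snappy framing stream identifier.
external = SNAPPY_FRAMED_MAGIC + b"\x00" * 16

for name, data in (("hadoop-style", hadoop_style), ("external", external)):
    header = read_as_hadoop_block_header(data)
    print(name, header, "plausible" if looks_plausible(header) else "implausible")
```

For the external file, the second length is assembled from the ASCII bytes "sNaP", so a reader that trusts it would try to allocate a buffer of nearly 2 GB for a single chunk.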
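For option (2), the problem is that hadoop streaming does not distinguish between \n, \r and \r\n line breaks. Since snappy compression ends up scattering those bytes throughout the compressed files, this is fatal. Here is the error trace: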
2013-04-03 22:29:50,194 WARN org.apache.hadoop.mapred.Child (main): Error running child
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:372)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:586)
at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:135)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
...
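With some work on hadoop's Java classes, it would probably be possible to fix the \r vs \n problem. But, as I said at the start, my goal was to stay within hadoop streaming, without touching Java. Under that constraint, there does not seem to be any way to solve this problem.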
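The line-break problem in option (2) can be illustrated with a small Python sketch. It is not hadoop code: the regular expression is a stand-in for a text reader that treats \n, \r and \r\n alike, and the payload is an arbitrary byte pattern standing in for compressed data. Once the separators are stripped, the mapper can no longer tell which byte sequence was originally there, so the compressed stream cannot be rebuilt.

```python
import re

# Stand-in for a line reader that treats all three line endings alike.
LINE_BREAKS = re.compile(rb"\r\n|\r|\n")


def split_like_text_reader(payload):
    """Split binary data into 'records' at every \\r\\n, \\r or \\n."""
    return LINE_BREAKS.split(payload)


def rejoin(records):
    """Best-effort reassembly: the original separators are unknown,
    so every boundary is filled with a single \\n."""
    return b"\n".join(records)


# Arbitrary binary payload standing in for compressed bytes; it contains
# both 0x0a (\n) and 0x0d (\r) simply because every byte value occurs.
payload = bytes(range(256)) * 4

records = split_like_text_reader(payload)
restored = rejoin(records)

print("records:", len(records))
print("identical after round trip:", restored == payload)
```

Every \r in the payload comes back as \n, so the round trip corrupts the data even though no bytes appear to be missing.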
In the end, I went back to the people producing these files and got them to switch to gzip or lzo.
PS - For option (2), I also tried splitting records on a different character (e.g., textinputformat.record.delimiter=X), but that did not work either.
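PPS - Another workaround would be to download the files from S3, decompress them, and then use -copyFromLocal to load them into HDFS. Computationally there is nothing wrong with that, but as a workflow it adds a lot of hassle.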