I know this is an old thread, but I could not get this method to work, so I will share what I found.
input:
(1-2-3, abc)
(4-5-6, xyz)
desired output:
(1, abc)
(2, abc)
(3, abc)
(4, xyz)
(5, xyz)
(6, xyz)
I originally used STRSPLIT, which generates a tuple and so gives an input similar to the one mentioned above, but it failed:
output = FOREACH input GENERATE FLATTEN(TOBAG(STRSPLIT($0, '-'))), $1;
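The result was as follows. As far as I can tell, this is because STRSPLIT returns a single tuple, so TOBAG builds a bag holding just that one tuple, and FLATTEN unpacks it into one row instead of three: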
(1,2,3,abc)
(4,5,6,xyz)
However, when I used the TOKENIZE and REPLACE functions, I got the desired result:
output = FOREACH input GENERATE FLATTEN(TOKENIZE(REPLACE($0,'-', ' '))), $1;
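For anyone who wants to try this end to end, here is a minimal sketch of the same approach as a complete script. The file name `data.txt`, the tab delimiter, and the relation and field names (`rows`, `exploded`, `ids`, `label`, `id`) are my own assumptions for illustration, not part of the original problem. I also avoided `input` and `output` as relation names, since as far as I know both appear on Pig's reserved keyword list.

```pig
-- Hypothetical input file: one record per line, tab-separated,
-- e.g. "1-2-3<TAB>abc" and "4-5-6<TAB>xyz"
rows = LOAD 'data.txt' USING PigStorage('\t') AS (ids:chararray, label:chararray);

-- REPLACE turns "1-2-3" into "1 2 3".
-- TOKENIZE splits that string on whitespace and returns a bag
-- with one single-field tuple per token: {(1),(2),(3)}.
-- FLATTEN on a bag emits one output row per tuple in the bag,
-- and label is repeated on each of those rows.
exploded = FOREACH rows GENERATE
    FLATTEN(TOKENIZE(REPLACE(ids, '-', ' '))) AS id,
    label;

DUMP exploded;
```

With the two input records above, this should give the six rows listed under "desired output". I believe newer Pig versions also let you pass the delimiter straight to TOKENIZE, as in `TOKENIZE(ids, '-')`, which would make the REPLACE unnecessary, but I have not checked which version introduced that.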