How to combine / concatenate two bags in Pig Latin

I have two datasets:

A = {uid, url}; B = {uid, url};

Now I do a COGROUP:

C = COGROUP A BY uid, B BY uid;

and I want to change C to { group AS uid, DISTINCT A.url+B.url};

My question is, how do I concatenate the two bags A.url and B.url?

Or, put another way, how do I apply DISTINCT across multiple columns?

2 answers

This may not be exactly what you expect, but here is what I understood from your question:

C = JOIN A BY uid, B BY uid;
D = DISTINCT C;

The url values of each joined row are then concatenated as strings:

E = FOREACH D GENERATE CONCAT(A::url, B::url);
0
source
A = LOAD 'A' USING PigStorage() AS (uid, url);
B = LOAD 'B' USING PigStorage() AS (uid, url);
C = JOIN A BY uid, B BY uid;
D = FOREACH C GENERATE $0, CONCAT(A::url, B::url);
E = DISTINCT D;
DUMP E;
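
Note that CONCAT joins two strings for each joined row; it does not merge the two bags. If the goal is one bag of distinct urls per uid, as the question describes, one possible approach is to UNION the relations first, then GROUP and apply a nested DISTINCT. The following is only a sketch: the input paths 'A' and 'B', the chararray types, and the alias names U, G, R, all_urls and urls are assumptions for illustration.

-- Assumed input paths and field types; adjust to the real data.
A = LOAD 'A' USING PigStorage() AS (uid:chararray, url:chararray);
B = LOAD 'B' USING PigStorage() AS (uid:chararray, url:chararray);

-- Stack the rows of both relations (they share the same schema).
U = UNION A, B;

-- One group per uid; the bag U holds every (uid, url) row from A and B.
G = GROUP U BY uid;

-- Nested FOREACH: project the url column, then de-duplicate it per uid.
R = FOREACH G {
    all_urls = U.url;
    urls = DISTINCT all_urls;
    GENERATE group AS uid, urls;
};

DUMP R;

Unlike the JOIN-based versions, this keeps a uid that appears in only one of the two relations, which matches the behaviour of COGROUP.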