Map-Reduce Query to Count Tags

I have a database of documents that are tagged with keywords. I am trying to find (and then count) the unique tags that are used alongside each other. So, for any given tag, I want to know which other tags appear in the same documents.

For example, if I had one document with the tags [fruit, apple, plant], then a query for [apple] should return [fruit, plant]. If another document had the tags [apple, banana], then my query for [apple] would instead return [fruit, plant, banana].

This is my map function, which emits all tags and their neighbors:

function(doc) {
  if(doc.tags) {
    doc.tags.forEach(function(tag1) {
      doc.tags.forEach(function(tag2) {
        // Skip self-pairs so a tag is not listed as its own neighbor.
        if(tag1 !== tag2) emit(tag1, tag2);
      });
    });
  }
}

So, in my example above, it will emit

apple -- fruit
apple -- plant
apple -- banana
fruit -- apple
fruit -- plant
...

My question is: what should my reduce function be? It should essentially filter out the duplicates and group the remaining tags together.

I have tried several approaches, but my database server (CouchDB) keeps returning an error: reduce_overflow_error.


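EDIT: I found something that seems to work, but I am not sure why. The reduce function takes an optional "rereduce" parameter. If I ignore the rereduce case, CouchDB stops throwing reduce_overflow_errors. Can anyone explain why? And is it safe to just ignore these calls, or will that cause problems later?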

function(keys, values, rereduce) {
  if(rereduce) return null; // Throws error without this.

  var a = [];
  values.forEach(function(tag) {
    if(a.indexOf(tag) < 0) a.push(tag);
  });
  return a;
}

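CouchDB expects a reduce function to return something much smaller than its input, and that is the check that fails here. A list of "sibling" tags keeps growing as documents are added, so building it inside a reduce works against how CouchDB is meant to be used.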

Instead, I would have the view emit 2-element array keys. For example, take a document tagged fruit, apple, plant.

// Pseudo-code visualization of view rows (before reduce)
// Key         , Value
[apple, fruit ], 1
[apple, plant ], 1 // Basically this is every combination of 2 tags in the set.
[fruit, apple ], 1
[fruit, plant ], 1
[plant, apple ], 1
[plant, fruit ], 1

Now add a second document tagged apple, banana.

// Pseudo-code visualization of view rows (before reduce)
// Key         , Value
[apple, banana], 1 // This is from my new doc
[apple, fruit ], 1
[apple, plant ], 1
[banana, apple], 1 // This is also from my new doc
[fruit, apple ], 1
[fruit, plant ], 1
[plant, apple ], 1
[plant, fruit ], 1

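Why is the value always 1? Because it allows a very simple reduce: the built-in _sum, which gives the count for each tag pair. Query with ?group_level=2 and CouchDB returns the unique pairs together with the number of times each one occurs.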

Here is the map function:

function(doc) {
  // Emit "sibling" tags, keyed on tag pairs.
  var tags = doc.tags || []
  tags.forEach(function(tag1) {
    tags.forEach(function(tag2) {
      if(tag1 != tag2)
        emit([tag1, tag2], 1)
    })
  })
}
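
To see what this view would return, here is a minimal sketch that simulates the map step, the built-in _sum reduce, and ?group_level=2 grouping in plain JavaScript, outside CouchDB. The helper names (mapDoc, groupAndSum) and the third sample document are made up for illustration; in a real database CouchDB does the grouping and also sorts the rows by key.

```javascript
// Local simulation only: CouchDB runs the map and does the grouping itself.
// mapDoc and groupAndSum are hypothetical helper names, not CouchDB APIs.

// Same logic as the map function above, but collecting rows instead of calling emit().
function mapDoc(doc) {
  var rows = [];
  var tags = doc.tags || [];
  tags.forEach(function (tag1) {
    tags.forEach(function (tag2) {
      if (tag1 !== tag2) rows.push({ key: [tag1, tag2], value: 1 });
    });
  });
  return rows;
}

// Mimics _sum with ?group_level=2: one output row per distinct [tag1, tag2] key.
function groupAndSum(rows) {
  var totals = new Map();
  rows.forEach(function (row) {
    var id = JSON.stringify(row.key);
    totals.set(id, (totals.get(id) || 0) + row.value);
  });
  return Array.from(totals, function (entry) {
    return { key: JSON.parse(entry[0]), value: entry[1] };
  });
}

var docs = [
  { tags: ["fruit", "apple", "plant"] },
  { tags: ["apple", "banana"] },
  { tags: ["apple", "fruit"] } // extra sample doc so that one pair occurs twice
];

var rows = [];
docs.forEach(function (doc) { rows = rows.concat(mapDoc(doc)); });
var grouped = groupAndSum(rows);

// Siblings of "apple": the grouped rows whose key starts with "apple".
var appleSiblings = grouped.filter(function (row) { return row.key[0] === "apple"; });
console.log(appleSiblings);
```

Against a real view, the same restriction to one tag can be expressed as a key range, for example ?group_level=2&startkey=["apple"]&endkey=["apple",{}], since an object sorts after any string in CouchDB's view collation.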

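I found a solution that I am much happier with. The trick is to set reduce_limit = false in CouchDB, so that it stops checking how quickly the reduce output shrinks.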

You can change this in Futon at http://localhost:5984/_utils/config.html, under the query_server_config section.

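With that done, here is my new map function. It wraps each emitted tag in an array so that the reduce function receives values of the same shape on both the reduce and the "rereduce" pass: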

function(doc) {
  if(doc.tags) {
    doc.tags.forEach(function(tag1) {
      doc.tags.forEach(function(tag2) {
        // Skip self-pairs so a tag is not returned as its own neighbor.
        if(tag1 !== tag2) emit(tag1, [tag2]); // Array with single value
      });
    });
  }
}

And here is the reduce function:

function(keys, values) {
  var a = [];
  values.forEach(function(tags) {
    tags.forEach(function(tag) {
      if(a.indexOf(tag) < 0) a.push(tag);
    });
  });
  return a;
}
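
This pair of functions copes with rereduce because the values have the same shape in both passes: the map emits one-element arrays and the reduce returns an array, so a rereduce call again receives an array of arrays. Here is a rough local sketch of that, assuming CouchDB happens to split the rows for one key into two batches (the batch split and the name reduceTags are made up for illustration):

```javascript
// Local simulation only; reduceTags is the reduce function above under a hypothetical name.
function reduceTags(keys, values, rereduce) {
  var a = [];
  values.forEach(function (tags) {
    tags.forEach(function (tag) {
      if (a.indexOf(tag) < 0) a.push(tag);
    });
  });
  return a;
}

// Values emitted for the key "apple", split into two batches as CouchDB might do.
var batch1 = [["fruit"], ["plant"]];
var batch2 = [["banana"], ["fruit"]];

// First pass: reduce each batch of map output (rereduce = false).
// The function never reads keys, so null is passed here for simplicity.
var partial1 = reduceTags(null, batch1, false);
var partial2 = reduceTags(null, batch2, false);

// Second pass: rereduce the partial results (rereduce = true).
var merged = reduceTags(null, [partial1, partial2], true);
console.log(merged); // [ 'fruit', 'plant', 'banana' ]
```

Note that the result still grows with the number of distinct sibling tags, which is the situation the reduce_limit check is there to flag.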

Hope this helps someone!
