Does DistributedCache delete cached files after every job?

Question


The documentation for DistributedCache states:

"Its efficiency stems from the fact that the files are only copied once per job and the ability to cache archives which are un-archived on the slaves."

What does it mean that it can "cache archives which are un-archived on the slaves"? Are cached files deleted after every job? I would like to run the same job hundreds of times on different datasets without the overhead of redistributing the DistributedCache files for each job. Is that possible?


mapreduce hadoop

LeonardBlunderbuss Feb 05 '14 at 21:22



rVr · Accepted Answer · 2014-02-06T03:18:28+0000

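Hadoop keeps a reference count for the files in the DistributedCache. When the count drops to 0, the files become eligible for deletion. Until they are actually cleaned up, files that are already in the DistributedCache are reused and are not copied to the node again.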
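One way to picture how a per-node cache can outlive a single job is reference counting with lazy cleanup. The sketch below is a toy model only, not Hadoop's implementation: the class ToyNodeCache and all of its methods are hypothetical names invented for illustration, and it assumes that a cleanup pass removes only entries whose count is 0.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Toy model of a per-node, reference-counted file cache (hypothetical, not a Hadoop class). */
class ToyNodeCache {
    // file name -> number of running jobs currently using the local copy
    private final Map<String, Integer> refCounts = new HashMap<>();
    // how many times a file had to be copied to this node
    private int copies = 0;

    /** A job starts using a file; the file is copied only if it is not already on the node. */
    void acquire(String file) {
        if (!refCounts.containsKey(file)) {
            copies++;                // simulated copy to local disk
            refCounts.put(file, 0);
        }
        refCounts.merge(file, 1, Integer::sum);
    }

    /** A job finishes; the local copy stays on disk, only its count goes down. */
    void release(String file) {
        Integer count = refCounts.get(file);
        if (count == null || count == 0) {
            throw new IllegalStateException("file is not in use: " + file);
        }
        refCounts.put(file, count - 1);
    }

    /** Cleanup pass: removes only files whose count is 0 and returns how many were removed. */
    int cleanup() {
        int removed = 0;
        Iterator<Map.Entry<String, Integer>> it = refCounts.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() == 0) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int copies() { return copies; }

    boolean isCached(String file) { return refCounts.containsKey(file); }

    public static void main(String[] args) {
        ToyNodeCache cache = new ToyNodeCache();
        cache.acquire("lookup.dat");   // first job: file is copied
        cache.release("lookup.dat");   // first job ends: count is 0, file stays
        cache.acquire("lookup.dat");   // second job: local copy is reused
        System.out.println("copies=" + cache.copies());
    }
}
```

In this model two consecutive jobs that use the same file trigger a single copy, as long as no cleanup pass runs between them while the count is 0.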
