How to find a file from blockName in HDFS hadoop

Question

What is the easiest way to find the file associated with a block in HDFS, given the block name / ID?

+5

hadoop hdfs

Inder singh Jun 04 '12 at 12:40

2 answers

I'm not sure which version introduced it, but you can pass the block ID directly to fsck:

hdfs fsck -blockId <block_id>

hdfs fsck -blockId blk_1100790203
Connecting to namenode 
FSCK started by hdfs 

Block Id: blk_1100790203
Block belongs to: /common/FFL1447685899336.txt
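If only the path is needed (for example in a script), the "Block belongs to:" line can be filtered out of the output. This is a minimal sketch that assumes the output format shown above; `block_owner` is a hypothetical helper name, not part of Hadoop.

```shell
# block_owner: read `hdfs fsck -blockId <block_id>` output on stdin and print
# only the path of the file that owns the block.
# Assumes the "Block belongs to: <path>" line format shown above.
block_owner() {
  sed -n 's/^Block belongs to: //p'
}

# Usage on a live cluster (not run here):
#   hdfs fsck -blockId blk_1100790203 | block_owner
```

The function prints nothing if fsck does not report an owner, so an empty result can be treated as "block not found".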

+5

Abhijith Mar 6 '17 at 21:38

Chris white · Accepted Answer · 2012-06-05T01:22:47+0000

A long and painful way, assuming you have read access to all files (and execute access to directories):

hadoop fsck / -files -blocks | grep blk_520275863902385418_1002 -B 20

Then scan back up the output from the matching block to the preceding file name:

/hadoop/mapred/system/jobtracker.info 4 bytes, 1 block(s):  OK
0. blk_520275863902385418_1002 len=4 repl=1

In this case, blk_5202 ... is part of the file /hadoop/mapred/system/jobtracker.info
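The "scan back up" step can also be automated instead of relying on a fixed `-B 20` window, which misses the file name when a file has more than 20 blocks. The following is a rough sketch, assuming the fsck output format shown above (a file line starting with `/`, followed by its block lines); `find_file_for_block` is a hypothetical helper name.

```shell
# find_file_for_block BLOCK_ID: read `fsck / -files -blocks` output on stdin
# and print the path of the file whose block list contains BLOCK_ID.
find_file_for_block() {
  awk -v blk="$1" '
    # File (or directory) lines start with "/": remember the path and drop
    # the trailing " <n> bytes, ..." summary.
    /^\// { file = $0; sub(/ [0-9]+ bytes, .*$/, "", file); next }
    # Any other line mentioning the block ID belongs to the last file seen.
    index($0, blk) && file != "" { print file; exit }
  '
}

# Usage on a live cluster (not run here):
#   hadoop fsck / -files -blocks | find_file_for_block blk_520275863902385418_1002
```

This still walks the whole namespace, so it is no faster than the grep approach; it only removes the manual scanning.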

Programmatically, there is no interface to the name node that lets you look up a file by block ID, but you could examine the source of the secondary name node to see how it merges the edits, and then experiment on the saved output of the secondary name node (rather than risk working with your live name node files).

Good luck
