hive - Hadoop: FSCK result shows missing replicas


Could anyone let me know how to fix missing replicas?

============================================================================

Total size: 3447348383 B
Total dirs: 120
Total files: 98
Total blocks (validated): 133 (avg. block size 25919912 B)
Minimally replicated blocks: 133 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 21 (15.789474 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 2.3834586
Corrupt blocks: 0
Missing replicas: 147 (46.37224 %)
Number of data-nodes: 3
Number of racks: 1

============================================================================

As per Hadoop: The Definitive Guide,

Corrupt or missing blocks are the biggest cause for concern, as it means data has been lost. By default, fsck leaves files with corrupt or missing blocks, but you can tell it to perform one of the following actions on them:

• Move the affected files to the /lost+found directory in HDFS, using the -move option. Files are broken into chains of contiguous blocks to aid any salvaging efforts you may attempt.

• Delete the affected files, using the -delete option. Files cannot be recovered after being deleted.

My question is: how do I find out which files are affected? I have already worked with Hive on this cluster and got the required outputs without any issue. Will the missing replicas affect the performance/speed of query processing?

Regards,

Raj

Missing replicas should be self-healing over time. However, if you want to move the affected files to lost+found, you can use:

hadoop fsck / -move 

Or delete them with:

hadoop fsck / -delete 

If you want to identify the files with under-replicated blocks, use:

hadoop fsck / -files -blocks -locations 

That will give you lots of detail, including a list of expected/actual block replication counts.
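To narrow that detail down to just the affected paths, the fsck output can be filtered. The following is a minimal sketch, not a definitive recipe. It assumes that plain `hadoop fsck /` (run without -files) prints one line per under-replicated block in the form `<path>:  Under replicated <block>. Target Replicas is N but found M replica(s).`, so check it against the output of your Hadoop version first. The function name under_replicated_files is made up for this example.

```shell
#!/bin/sh
# Sketch only: extract the paths of under-replicated files from fsck output.
# Assumed input line format (verify on your cluster before relying on it):
#   /some/path:  Under replicated blk_123_456. Target Replicas is 10 but found 3 replica(s).

# Reads fsck output on stdin and prints each affected path once.
under_replicated_files() {
    grep 'Under replicated' |
        sed 's/:[[:space:]]*Under replicated.*$//' |   # keep only the path before the marker
        sort -u                                         # one line per file, not per block
}

# Usage (assumes the hadoop CLI is on PATH):
#   hadoop fsck / | under_replicated_files
# To set the replication factor of those files to 3:
#   hadoop fsck / | under_replicated_files |
#       while read -r f; do hadoop fs -setrep 3 "$f"; done
```

In the report above, 147 missing replicas across 21 under-replicated blocks works out to 7 per block. That would be consistent with some files requesting more replicas than the 3 data-nodes can hold, although the report alone does not confirm it. If that is the case, those blocks will not heal on their own until the target replication is lowered, for example with the setrep loop shown in the comments.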

