hive - Hadoop: FSCK result shows missing replicas
Could anyone let me know how to fix missing replicas?
============================================================================
total size: 3447348383 b
total dirs: 120
total files: 98
total blocks (validated): 133 (avg. block size 25919912 b)
minimally replicated blocks: 133 (100.0 %)
over-replicated blocks: 0 (0.0 %)
under-replicated blocks: 21 (15.789474 %)
mis-replicated blocks: 0 (0.0 %)
default replication factor: 3
average block replication: 2.3834586
corrupt blocks: 0
missing replicas: 147 (46.37224 %)
number of data-nodes: 3
number of racks: 1
============================================================================
As per the Definitive Guide,
Corrupt or missing blocks are the biggest cause for concern, as it means data has been lost. By default, fsck leaves files with corrupt or missing blocks, but you can tell it to perform one of the following actions on them:
• Move the affected files to the /lost+found directory in HDFS, using the -move option. Files are broken into chains of contiguous blocks to aid any salvaging efforts you may attempt.
• Delete the affected files, using the -delete option. Files cannot be recovered after being deleted.
My question is: how do I find out which files are affected? I have worked with Hive and got the required outputs without any issue. Will this affect the performance/speed of query processing?
Regards,
Raj
Missing replicas should be self-healing over time. However, if you want to move the affected files to lost+found, you can use:
hadoop fsck / -move
Or delete them with:
hadoop fsck / -delete
If you just want to identify the files with under-replicated blocks, use:
hadoop fsck / -files -blocks -locations
That will give you lots of detail, including a list of expected/actual block replication counts.
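Since that output can be long, here is a rough sketch of one way to cut it down to just the affected file paths. It assumes that fsck marks each under-replicated block with the text "Under replicated" on a line that starts with the file path, and that paths contain no spaces; check this against the output of your own Hadoop version. The function name list_under_replicated is made up for illustration.

```shell
# Hypothetical helper: reads fsck output on stdin and prints the unique
# paths of files that have at least one under-replicated block.
# Assumption: affected lines contain "Under replicated" and begin with
# the file path (optionally followed by a colon); paths have no spaces.
list_under_replicated() {
  awk '/Under replicated/ { sub(/:$/, "", $1); print $1 }' | sort -u
}

# Example usage (needs a running cluster, so it is left commented out):
# hadoop fsck / -files -blocks -locations | list_under_replicated
```

The same lines should also show the target and actual replica counts for each block, so it is worth reading a few of them in full to see how many replicas each file is asking for.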