Channel: StackExchange Replication Questions

Hadoop HDFS imbalance with replication factor = number of datanodes?


I've been reading the hdfs balancer questions here this weekend, but none of them answered mine.

So here it goes: "I have a Hadoop grid with replication factor = 4 and 4 DNs. I expect all of them to hold roughly the same blocks and show similar disk usage across the grid, but that's NOT what's happening. One DN is at 50% usage while the others are at around 20%."

I ran the hdfs balancer this weekend to try to even out disk usage, but nothing happened: I always end up with 5 idle tasks doing nothing, even though the job detects almost 300GB of imbalanced data on the grid. I've tried tweaking the threshold and policy parameters, to no avail.
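To make the numbers concrete, here is a rough Python sketch of how the figures above compare against a balancer-style threshold. It is only an illustration, not the balancer's actual code: the node names `dn1`..`dn4` are made up, all four DNs are assumed to have equal capacity (so the cluster average is just the mean of the percentages), and the threshold of 10 is assumed to be the value in effect.

```python
def classify_nodes(utilization_pct, threshold_pct=10.0):
    """Label each node by how far its utilization is from the cluster average.

    Assumes equal-capacity nodes, so the cluster average is the plain mean
    of the per-node percentages (an assumption made for this sketch).
    """
    average = sum(utilization_pct.values()) / len(utilization_pct)
    labels = {}
    for node, used in utilization_pct.items():
        deviation = used - average
        if deviation > threshold_pct:
            labels[node] = "over-utilized"
        elif deviation < -threshold_pct:
            labels[node] = "under-utilized"
        else:
            labels[node] = "within threshold"
    return average, labels


# Hypothetical node names; percentages taken from the description above.
usage = {"dn1": 50.0, "dn2": 20.0, "dn3": 20.0, "dn4": 20.0}
average, labels = classify_nodes(usage)
print(f"cluster average: {average}%")
for node, label in labels.items():
    print(node, usage[node], label)
```

Under these assumptions only one DN counts as over-utilized, while the other three sit below the average but still inside the threshold.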

Any tips? Thanks!

