
How do the "dfs.replication" and "dfs.datanode.data.dir" configurations work in a cluster?


I've followed the Apache "Single Node Setup" instructions, which set dfs.replication on the single node.

I then followed the "Cluster Setup" guide, but it doesn't mention this property, so I don't know whether it should be set on the NameNode, or also (or only) on the DataNodes.

I have also read that setting multiple (comma-separated) paths in dfs.datanode.data.dir on DataNodes will replicate data across all of those paths.
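To make the setup concrete, here is a rough sketch of the kind of hdfs-site.xml I have in mind. The replication value and the two directory paths are placeholders I made up for illustration, not values from the Apache guides or from my actual cluster.

```xml
<!-- hdfs-site.xml: illustrative sketch only; values and paths are placeholders -->
<configuration>
  <property>
    <!-- the replication factor this question is about -->
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <!-- hypothetical local directories on a DataNode, comma-separated -->
    <name>dfs.datanode.data.dir</name>
    <value>/data/1/dfs/dn,/data/2/dfs/dn</value>
  </property>
</configuration>
```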

So my question is: on which node(s) does dfs.replication have an effect? And if multiple paths are set for dfs.datanode.data.dir, are these extra copies independent and local to each DataNode, or are they also tied in some way to the dfs.replication factor?

Also, what is the use of this extra local replication on DataNodes when the data is already replicated on other nodes?

