In Hadoop you can control how many replicas of a file get created. I wanted to try that out, so I experimented with the different options.
The first option is to set the replication factor in hdfs-site.xml; the settings in hdfs-site.xml apply to all files (global settings). The relevant properties are listed below, with a sample snippet after the list.
- dfs.replication: Default block replication. The actual number of replicas can be specified when the file is created; the default is used if replication is not specified at create time.
- dfs.replication.max: Maximal block replication.
- dfs.namenode.replication.min: Minimal block replication.
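As a rough sketch, the corresponding hdfs-site.xml entry might look like this (the value 2 is only an example; use whatever default suits your cluster):

<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>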
After configuring the global replication settings, I restarted the HDFS daemons and executed hdfs fsck on one of the files to see the effect of the replication setting, and this is the output I got.
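For reference, the check itself is a one-liner; assuming the file in question is the same aesop.txt used later in this post, it would look something like this:

hdfs fsck /user/user/aesop.txt -files -blocks

The -files and -blocks flags make fsck print per-file and per-block details, including the replication of each block.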
You could also use the command-line tools to set a particular replication factor for a file by executing the hdfs dfs -setrep command, like this:
hdfs dfs -setrep 5 /user/user/aesop.txt
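If you want the command to block until the re-replication actually finishes, setrep also accepts a -w flag, e.g. hdfs dfs -setrep -w 5 /user/user/aesop.txt.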
Then I could verify the effect of the replication settings like this.
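One simple way to verify (a sketch, using the same file as above) is hdfs dfs -ls, whose second column shows the current replication factor of each file, or another run of fsck on the same path:

hdfs dfs -ls /user/user/aesop.txt
hdfs fsck /user/user/aesop.txt -files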
The last option is to set the replication factor programmatically.
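As a rough sketch, setting the replication factor through the standard FileSystem API looks something like this (the path and the replication value of 5 are just placeholders; the program picks up core-site.xml and hdfs-site.xml from the classpath):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SetReplication {
    public static void main(String[] args) throws Exception {
        // Load the cluster configuration from the classpath
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Change the replication factor of an existing file
        Path file = new Path("/user/user/aesop.txt");
        boolean changed = fs.setReplication(file, (short) 5);
        System.out.println("Replication change requested: " + changed);

        fs.close();
    }
}

FileSystem.create() also takes a replication argument, so a per-file replication factor can be set at create time instead of changing it afterwards.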