Pattern - Hadoop Cluster Setup with Hadoop files in a shared filesystem
Features
- A shared filesystem holds the Hadoop binaries.
- A shared filesystem holds hadoop-site.xml, log4j.properties, and any other configuration files in use.
- Every node in the cluster mounts this filesystem over the network.
Advantages
- Overwriting the files in the shared filesystem updates every machine in the cluster with the new binaries or configuration files.
- Maintenance costs become O(1) in the number of nodes, although initial installation costs can still be O(N).
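Because one overwrite reaches every node, it can be worth checking a new configuration file before it replaces the shared copy. The following is a minimal sketch, not part of the pattern itself: `publish_config` and its parameter names are hypothetical, it only checks that the candidate is well-formed XML with a `configuration` root element, and it assumes the rename performed by `os.replace` is atomic on the shared filesystem in use, which should be verified for the particular network filesystem.

```python
import os
import shutil
import tempfile
import xml.etree.ElementTree as ET


def publish_config(candidate: str, shared_target: str) -> None:
    """Validate a candidate hadoop-site.xml, then swap it into place.

    Hypothetical helper: both arguments are file paths chosen by the caller.
    """
    # Raises xml.etree.ElementTree.ParseError if the file is not well-formed.
    root = ET.parse(candidate).getroot()
    if root.tag != "configuration":
        raise ValueError(f"unexpected root element: {root.tag!r}")

    # Stage the copy in the target's own directory so the final rename
    # stays within one filesystem.
    target_dir = os.path.dirname(os.path.abspath(shared_target))
    fd, staged = tempfile.mkstemp(dir=target_dir, prefix=".staged-")
    os.close(fd)
    try:
        shutil.copyfile(candidate, staged)
        # mkstemp creates the file owner-only; the nodes need to read it.
        # The 0o644 mode is an assumption about the desired permissions.
        os.chmod(staged, 0o644)
        # Rename over the old file so readers see the old or the new copy.
        os.replace(staged, shared_target)
    except BaseException:
        # Leave no staging file behind if anything failed.
        if os.path.exists(staged):
            os.remove(staged)
        raise
```

A malformed candidate raises before anything is written, so the file the cluster is reading stays untouched.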
Disadvantages
- You still need a way to decommission datanodes, and to shut down and restart the entire cluster.
- It is now possible to misconfigure the entire cluster simultaneously.
- There is a race condition: the shared filesystem must be up and mounted before the nodes run their startup scripts.
- The shared filesystem can become a single point of failure.
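One way to narrow the startup race is to have each node's startup script wait until files it needs from the shared filesystem are visible before it launches any Hadoop daemon. The sketch below is illustrative only: `wait_for_shared_files` and the paths in the usage comment are hypothetical. It checks for files inside the mount rather than for the mount-point directory, because an empty mount point exists even when nothing is mounted on it.

```python
import os
import time


def wait_for_shared_files(required, timeout_s=300.0, poll_s=5.0):
    """Return True once every path in `required` exists, False on timeout.

    Hypothetical helper; the default timeout and poll interval are
    arbitrary illustration values.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        missing = [p for p in required if not os.path.exists(p)]
        if not missing:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


# Usage (paths are assumptions, not taken from any real installation):
#   ok = wait_for_shared_files(["/shared/hadoop/bin/hadoop",
#                               "/shared/hadoop/conf/hadoop-site.xml"])
#   if not ok: give up and report the node as failed to start
```

This only covers the case where the filesystem has not been mounted yet. If the mount exists but the server behind it is unresponsive, the existence check itself may block, so the timeout here is not a complete guard.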