Pattern - Hadoop Cluster Setup with Hadoop files in a shared filesystem

Features

  • A shared filesystem holds the Hadoop binaries.
  • The shared filesystem also holds the hadoop-site.xml file, log4j.properties and any other configuration files in use.
  • Every node in the cluster mounts this filesystem over the network.
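
As a minimal sketch of what "every node sees the same files" means in practice, a node could check that the shared mount exposes the files it expects before using them. The relative paths in REQUIRED_FILES and the function name are hypothetical; the real layout depends on how the Hadoop distribution is unpacked on the shared filesystem.

```python
from pathlib import Path

# Hypothetical layout under the shared mount; adjust to the real install.
REQUIRED_FILES = (
    "bin/hadoop",
    "conf/hadoop-site.xml",
    "conf/log4j.properties",
)


def missing_shared_files(shared_root, required=REQUIRED_FILES):
    """Return the required relative paths that are not regular files under shared_root."""
    root = Path(shared_root)
    return [rel for rel in required if not (root / rel).is_file()]
```

A node would call missing_shared_files with its local mount point and refuse to start Hadoop if the returned list is not empty.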

Advantages

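  • Overwriting the files in the shared filesystem can update all machines in the cluster with new binaries or configuration files in a single step.
  • Maintenance costs become O(1) in the number of nodes, although initial installation costs can still be O(N), because each of the N nodes still has to be set up individually (for example, to mount the filesystem).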
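
Because every node reads the same file, a half-written configuration file would be visible to the whole cluster at once. One way to reduce that risk, sketched here under the assumption that the shared filesystem supports atomic rename within a directory, is to write the new file under a temporary name and rename it into place. The function name publish_config is hypothetical and not part of Hadoop.

```python
import os
import tempfile


def publish_config(target_path, new_text):
    """Write new_text to a temporary file beside target_path, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(target_path))
    # Create the temporary file in the same directory so the rename stays
    # within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(new_text)
            handle.flush()
            os.fsync(handle.fileno())
        # mkstemp creates the file readable by its owner only; widen the
        # permissions so that other accounts on the nodes can read it.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, target_path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Nodes that have already loaded the old file keep using it until they re-read their configuration, so this only controls what a node sees the next time it opens the file. Client-side caching on a network filesystem may also delay when the new file becomes visible.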

Disadvantages

  • There's still a need to be able to decommission datanodes, and to shut down and restart the entire cluster.
  • It is now possible to misconfigure the entire cluster simultaneously.
  • There is a race condition: the shared filestore must be up before the nodes run their startup scripts.
  • The shared filestore can become a point of failure.
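
The race condition at startup can be worked around by having each node wait for the shared filestore before it runs anything from it. The following is a minimal sketch rather than part of any Hadoop startup script: the function name, the default timeout and the idea of polling for a known marker file (for example the hadoop-site.xml file under the node's mount point) are all assumptions.

```python
import os
import time


def wait_for_shared_store(marker_path, timeout_s=300.0, poll_s=5.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Poll until marker_path exists.

    Returns True as soon as the marker is visible, or False if it is still
    missing once timeout_s seconds have passed.
    """
    deadline = clock() + timeout_s
    while True:
        if os.path.exists(marker_path):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

A startup script would call wait_for_shared_store with a file that only exists once the mount is live, and start the Hadoop daemons only if it returns True. This narrows the startup race but does nothing about the filestore failing later, which remains a separate point of failure.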