Pattern - Manual Hadoop Cluster Setup

It is technically possible to set up a Hadoop cluster by hand.

Features

  • The system administrators install Hadoop by hand

Advantages

  • The system administrators get to learn what steps are needed to bring up a Hadoop cluster.

Disadvantages

  • The system administrators repeat the steps needed to bring up a Hadoop node for every node in the cluster.
  • Installation costs scale as O(N), where N is the number of nodes.
  • Maintenance costs scale as O(N), where N is the number of nodes.
  • If any machine is misconfigured, work run on that machine could produce a different outcome, or data could be lost.
  • There's no formal means of monitoring cluster health.
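
The misconfiguration risk listed above can be made concrete with a small check. The following is a minimal sketch, not part of any Hadoop tooling: it assumes the contents of each host's hadoop-site.xml have already been collected by some means (for example, copied back by hand), and the function name `find_config_drift` is hypothetical. It reports the hosts whose file differs from the most common version.

```python
import hashlib
from collections import Counter


def find_config_drift(configs):
    """Return the hosts whose configuration differs from the majority.

    configs maps a host name to the raw bytes of that host's
    hadoop-site.xml. How the bytes are gathered is outside this sketch.
    """
    # Reduce each file to a digest so files can be compared cheaply.
    digests = {host: hashlib.sha256(data).hexdigest()
               for host, data in configs.items()}
    if not digests:
        return []
    # Treat the most common digest as the reference configuration.
    # On a tie, the version seen first is used as the reference.
    reference, _count = Counter(digests.values()).most_common(1)[0]
    return sorted(host for host, digest in digests.items()
                  if digest != reference)


if __name__ == "__main__":
    # Placeholder host names and file contents, for illustration only.
    sample = {
        "worker1": b"<configuration>A</configuration>",
        "worker2": b"<configuration>B</configuration>",
        "worker3": b"<configuration>A</configuration>",
    }
    print(find_config_drift(sample))
```

A check like this still has to be run by hand, so it reduces the risk of silent drift without changing the O(N) cost of fixing each machine.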

Process

  1. Install the OS images; bring them up to date.
  2. Make sure that DNS is live, or edit every host's /etc/hosts file so that all hosts resolve names consistently.
  3. Install the Java runtime.
  4. Expand the Hadoop tar files.
  5. Write your hadoop-site.xml with all site-specific configuration options.
  6. Copy it out to every machine in the cluster.
  7. Decide which machines will be namenodes, datanodes, job trackers and task trackers.
  8. On the namenode, set the namenode script to run when the system boots.
  9. On each datanode, set the datanode script to run when the system boots.
  10. On the job tracker, set the job tracker script to run when the system boots.
  11. On any task tracker machines, set the task tracker script to run when the system boots.
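
Steps 5 to 7 above can be sketched as follows. This is an illustrative sketch under stated assumptions, not a recommended installer: the host names, the port numbers, and the /opt/hadoop/conf directory are placeholders, and the helper names are hypothetical. The property names fs.default.name and mapred.job.tracker are the ones used by hadoop-site.xml in Hadoop releases of this era. The code only builds the file contents and lists the copy commands; it does not run anything.

```python
import xml.etree.ElementTree as ET

# Hypothetical role assignment (step 7); host names are placeholders.
ROLES = {
    "namenode": ["master1"],
    "jobtracker": ["master2"],
    "datanode": ["worker1", "worker2"],
    "tasktracker": ["worker1", "worker2"],
}


def build_site_xml(namenode, jobtracker, fs_port=8020, jt_port=8021):
    """Return hadoop-site.xml text (step 5). Port numbers are assumptions."""
    root = ET.Element("configuration")
    props = {
        "fs.default.name": f"hdfs://{namenode}:{fs_port}",
        "mapred.job.tracker": f"{jobtracker}:{jt_port}",
    }
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


def copy_plan(roles, conf_dir="/opt/hadoop/conf"):
    """List one scp command per machine (step 6). conf_dir is a placeholder."""
    hosts = sorted({h for members in roles.values() for h in members})
    return [f"scp hadoop-site.xml {h}:{conf_dir}/hadoop-site.xml"
            for h in hosts]


def daemons_by_host(roles):
    """Invert the role map: which daemons each machine must start at boot."""
    result = {}
    for daemon, hosts in roles.items():
        for host in hosts:
            result.setdefault(host, []).append(daemon)
    return result


if __name__ == "__main__":
    print(build_site_xml(ROLES["namenode"][0], ROLES["jobtracker"][0]))
    for command in copy_plan(ROLES):
        print(command)
    for host, daemons in sorted(daemons_by_host(ROLES).items()):
        print(host, daemons)
```

The copy plan has one entry per machine, which is the O(N) cost noted under Disadvantages: every configuration change means repeating the whole list.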