Chaos - Documentation
How To: Build a supercomputer with a business card from your wallet
This document is the first in a series of HowTos that will document various facets of CHAOS, including functionality, usage and configuration information. Each document will be stored online in PDF format. It is a good read for those who have not yet constructed an openMosix cluster from the CHAOS distribution. What you will learn here is how to construct a distributed processing environment from any of the network-connected Pentium-grade computers that you already have access to.
It is beyond the scope of this document to convey details on burning CDROM images, using Unix/Linux shell commands, or constructing IP networks. This document will not go into detail regarding the individual CHAOS OS components; these details can already be found in the technical pages.
You should be familiar with the general administration and operation of an openMosix cluster. You should also be familiar with concepts such as IP networking, routing, and address management. If you are so inclined, it is also a good idea to be a little familiar with Linux packet filtering (netfilter). If you are not confident in these areas, then set aside some playtime with Red Hat and the openMosix RPMs.
It is recommended that you plan your cluster before you set out to install it. This practice will help you to avoid complicated, time-consuming and expensive surprises. While this advice seems contrary to the ad-hoc intent of CHAOS, production deployments are well served by some planning.
Initialising your first CHAOS/openMosix cluster
To get your cluster running immediately, you will need to perform the following major steps.
1. Some minor preparation work must come first:
o Download the CHAOS-1.6 ISO image from the CHAOS web site
o Burn the ISO image to CDROM media (preferably a business-card CD)
o Acquire i586 (or better) PCs, each with a CDROM drive, an Ethernet NIC and a console
2. Then, for each PC (node), perform the following:
o Ensure that this PC is capable of booting from its CDROM drive (check the BIOS configuration)
o Boot the node with the CHAOS disc
o If your LAN has dynamic IP allocation:
- Press Enter (optionally, select "n1" if this is the first node)
o If your LAN is statically allocated:
- Press F5 for help (this also removes the boot prompt timeout)
- Enter the boot command shown, but fill in the correct IP details for this node
o If the node's hardware is supported, the CDROM drive will eject the disc within 15 seconds
o As soon as that disc is ejected, you can repeat the process on the next node using the same disc
3. If your nodes are on different LANs, then join the cluster from every node except the first:
o At the command prompt of the new node, enter: tyd --master 192.168.1.12
where 192.168.1.12 is replaced with the IP address of the first node
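As a rough sketch of the substitution in step 3, the shell fragment below checks that an address looks like a dotted-quad IPv4 address before printing the tyd command to run. The helper names is_ipv4 and join_command are made up for illustration and are not part of CHAOS; only the tyd --master invocation comes from the step above. The sketch assumes a POSIX shell and awk are available.

```shell
#!/bin/sh
# Hypothetical helpers (not part of CHAOS): build the join command from step 3.

# Succeed only if $1 is four dot-separated decimal fields, each 0-255.
is_ipv4() {
    echo "$1" | awk -F. '
        NF != 4 { exit 1 }
        {
            for (i = 1; i <= 4; i++)
                if ($i !~ /^[0-9]+$/ || $i > 255) exit 1
        }'
}

# Print the tyd command for the given first-node address, or fail with a message.
join_command() {
    if is_ipv4 "$1"; then
        echo "tyd --master $1"
    else
        echo "not an IPv4 address: $1" >&2
        return 1
    fi
}

# Example with the address used in step 3; prints: tyd --master 192.168.1.12
join_command 192.168.1.12
```

The function only prints the command, so it can be checked by eye before it is typed or pasted at the prompt of the new node.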
Barring any hardware issues, your first cluster should have two nodes running within about 10-15 minutes of downloading the CHAOS ISO image.
Testing the cluster to validate its operational state
If you've followed the instructions in the previous section, then you should have a running cluster. There are two very useful applications built into CHAOS that will help to validate the cluster's operational status and to demonstrate the power of the openMosix cluster that you have at your fingertips: an openMosix performance monitoring tool and a test application.
mosmon - The openMosix performance monitoring tool "mosmon" is really useful. Take the time to play with mosmon and to experiment with its display options. Once you've started mosmon, you should see all of the nodes in your cluster listed horizontally, with their respective load graphed vertically.
o At the command prompt of any node, enter: mosmon
testapp - The test application "testapp" is a tiny program with one job: it increments a counter until it expires. Despite processing at millions of cycles per second, the testapp incrementer will run for a very long time before it completes, which makes it an ideal candidate for an openMosix cluster. As you launch testapp instances you will see a large load spike on the testapp "home" node before it manages to offload (migrate) the instances to free nodes.
o At the command prompt of any other node, enter: testapp
Repeat this until you have launched at least one testapp instance per node in your cluster
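To save typing when launching one testapp per node, the launches can be wrapped in a small shell function. This is only a sketch: launch_instances is a hypothetical helper, and it assumes that testapp takes no arguments and that a POSIX shell is available on the node.

```shell
#!/bin/sh
# Hypothetical helper: start COUNT background copies of a command
# and print one PID per line so the copies can be found again later.
launch_instances() {
    count=$1
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        # Discard output so the background jobs do not clutter the console.
        "$@" >/dev/null 2>&1 &
        echo "$!"
        i=$((i + 1))
    done
}

# Usage on a node of a four-node cluster (assumed node count):
#   launch_instances 4 testapp
```

Keeping the printed PIDs makes it easier to terminate exactly these instances from the home node once the experiment is over.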
When you've finished your experiment, you can exit mosmon by pressing 'q', and you can terminate the testapp instances from the node that they were launched from (the home node) using ps and kill, as with any regular Unix process.
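The ps and kill cleanup can be sketched as a few shell functions. The names filter_pids, pids_by_name and stop_by_name are hypothetical, and the sketch assumes a ps that accepts the POSIX -e and -o options, which may not hold on a minimal system. Run it on the home node, since that is where the testapp instances were launched.

```shell
#!/bin/sh
# Hypothetical helpers for cleaning up test processes on the home node.

# Read "PID COMMAND" lines on stdin; print the PIDs whose command name is $1.
filter_pids() {
    awk -v name="$1" '$2 == name { print $1 }'
}

# List the PIDs of all processes with the given command name.
# Assumes a ps that supports the POSIX -e and -o options.
pids_by_name() {
    ps -e -o pid= -o comm= | filter_pids "$1"
}

# Send the default signal (TERM) to every process with the given name.
stop_by_name() {
    pids=$(pids_by_name "$1")
    if [ -n "$pids" ]; then
        # Unquoted on purpose so each PID becomes a separate argument.
        kill $pids
    fi
}

# Usage on the home node:
#   stop_by_name testapp
```

Matching on the exact command name avoids killing unrelated processes whose names merely contain "testapp".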