xCAT iDataPlex Cluster Quick Start

This document describes the steps necessary to quickly set up a cluster with IBM System x rack-mounted servers. Although the examples given in this document are specific to iDataPlex hardware (because that's the most common server type used for clusters), the basic instructions apply to any x86_64, IPMI-controlled, rack-mounted servers.


xCAT Installation on an iDataplex Configuration

This document is meant to get you going as quickly as possible and therefore only goes through the most common scenario. For additional scenarios and setup tasks, see XCAT iDataPlex Advanced Setup.

Example Configuration Used in This Document

This configuration has a single dx360 management node and 167 other dx360 servers as compute nodes. The OS deployed will be Red Hat Enterprise Linux 6.2, x86_64 edition.

In our example, the management node is known as 'mgt', the node names are n1-n167, and the domain will be 'cluster'. We will use the BMCs in shared mode, so they will share the NIC on each node that the node's operating system uses to communicate with the xCAT management node. This is called the management LAN. We will use subnet 172.16.0.0 with a netmask of 255.240.0.0 (/12) for it. (This provides an IP address range of 172.16.0.1 - 172.31.255.254.) We will use the following subsets of this range for:

  • The management node: 172.20.0.1
  • The node OSes: 172.20.100+racknum.nodenuminrack
  • The node BMCs: 172.29.100+racknum.nodenuminrack
  • The management port of the switches: 172.30.50.switchnum
  • The DHCP dynamic range for unknown nodes: 172.20.255.1 - 172.20.255.254


The network is physically laid out such that the port number on a switch is equal to the U position number within a column.


Overview of Cluster Setup Process

Here is a summary of the steps required to set up the cluster and what this document will take you through:

  1. Prepare the management node - doing these things before installing the xCAT software helps the process to go more smoothly.
  2. Install the xCAT software on the management node.
  3. Configure some cluster-wide information.
  4. Define a little bit of information in the xCAT database about the ethernet switches and nodes - this is necessary to direct the node discovery process.
  5. Have xCAT configure and start several network daemons - this is necessary for both node discovery and node installation.
  6. Discover the nodes - during this phase, xCAT configures the BMCs and collects many attributes about each node and stores them in the database.
  7. Set up the OS images and install the nodes.

Distro-specific Steps

  • [RH] indicates that the step only needs to be done for RHEL and Red Hat-based distros (CentOS, Scientific Linux, and in most cases Fedora).
  • [SLES] indicates that the step only needs to be done for SLES.

Command Man Pages and Database Attribute Descriptions

Prepare the Management Node for xCAT Installation

Install the Management Node OS

Install one of the supported distros on the Management Node (MN). It is recommended to ensure that dhcp, bind (not bind-chroot), httpd, nfs-utils, and perl-XML-Parser are installed. (But if not, the process of installing the xCAT software later will pull them in, assuming you follow the steps to make the distro RPMs available.)

Hardware requirements for your xCAT management node depend on your cluster size and configuration. An xCAT Management Node or Service Node that is dedicated to running xCAT to install a small cluster (< 16 nodes) should have at least 4-6 GB of memory; a medium-size cluster, 6-8 GB; and a large cluster, 16 GB or more. Keeping swapping to a minimum should be a goal.
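A quick way to check the installed memory and current swap usage on the management node (a simple sketch using standard Linux tools, not xCAT commands):

free -g        # total and available memory, in GB
swapon -s      # swap devices and how much swap is in use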

Supported OS and Hardware

For a list of supported OS and Hardware, refer to XCAT_Features.

[RH] Ensure that SELinux is Disabled

Note: you can skip this step in xCAT 2.8.1 and above, because xCAT does it automatically when it is installed.

To disable SELinux manually:

echo 0 > /selinux/enforce
sed -i 's/^SELINUX=.*$/SELINUX=disabled/' /etc/selinux/config
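To confirm the change took effect, a quick check with the standard getenforce tool:

getenforce     # should report Permissive now, and Disabled after the next reboot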

Disable the Firewall

Note: you can skip this step in xCAT 2.8 and above, because xCAT does it automatically when it is installed.

The management node provides many services to the cluster nodes, but the firewall on the management node can interfere with this. If your cluster is on a secure network, the easiest thing to do is to disable the firewall on the Management Node:

For RH:

service iptables stop
chkconfig iptables off

If disabling the firewall completely isn't an option, configure iptables to allow the following services on the NIC that faces the cluster: DHCP, TFTP, NFS, HTTP, DNS.
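For example, a minimal sketch of such iptables rules, assuming the cluster-facing NIC is eth1 (the exact NFS ports you need may vary; mountd and statd ports may also have to be opened):

iptables -I INPUT -i eth1 -p udp --dport 67 -j ACCEPT     # DHCP
iptables -I INPUT -i eth1 -p udp --dport 69 -j ACCEPT     # TFTP
iptables -I INPUT -i eth1 -p udp --dport 53 -j ACCEPT     # DNS
iptables -I INPUT -i eth1 -p tcp --dport 53 -j ACCEPT     # DNS
iptables -I INPUT -i eth1 -p tcp --dport 80 -j ACCEPT     # HTTP
iptables -I INPUT -i eth1 -p tcp --dport 111 -j ACCEPT    # rpcbind (for NFS)
iptables -I INPUT -i eth1 -p udp --dport 111 -j ACCEPT    # rpcbind (for NFS)
iptables -I INPUT -i eth1 -p tcp --dport 2049 -j ACCEPT   # NFS
service iptables save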

For SLES:

SuSEfirewall2 stop

Set Up the Networks

The xCAT installation process will scan and populate certain settings from the running configuration. Having the networks configured ahead of time will aid in correct configuration. (After installation of xCAT, all the networks in the cluster must be defined in the xCAT networks table before starting to install cluster nodes.) When xCAT is installed on the Management Node, it will automatically run makenetworks to create an entry in the networks table for each of the networks the management node is on. Additional network configurations can be added to the xCAT networks table manually later if needed.

The networks that are typically used in a cluster are:

  • Management network - used by the management node to install and manage the OS of the nodes. The MN and the in-band NIC of the nodes are connected to this network. If you have a large cluster with service nodes, sometimes this network is segregated into separate VLANs for each service node. See Setting Up a Linux Hierarchical Cluster for details.
  • Service network - used by the management node to control the nodes out of band via the BMC. If the BMCs are configured in shared mode, then this network can be combined with the management network.
  • Application network - used by the HPC applications on the compute nodes. Usually an IB network.
  • Site (Public) network - used to access the management node and sometimes for the compute nodes to provide services to the site.

In our example, we only deal with the management network because:

  • the BMCs are in shared mode, so they don't need a separate service network
  • we are not showing how to have xCAT automatically configure the application network NICs. See Configuring Secondary Adapters if you are interested in that.
  • under normal circumstances there is no need to put the site network in the networks table

For more information, see Setting Up a Linux xCAT Mgmt Node#Appendix A: Network Table Setup Example.

Configure NICS

Configure the cluster-facing NIC(s) on the management node. For example, edit the following files:

[RH]: /etc/sysconfig/network-scripts/ifcfg-eth1

[SLES]: /etc/sysconfig/network/ifcfg-eth1

DEVICE=eth1
ONBOOT=yes
BOOTPROTO=static
IPADDR=172.20.0.1
NETMASK=255.240.0.0
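After editing the file, restart the NIC and confirm that the address is up (assuming eth1 as above):

ifdown eth1; ifup eth1
ip addr show eth1      # should show 172.20.0.1/12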

Prevent DHCP client from overwriting DNS configuration (Optional)

If the public-facing NIC on your management node is configured by DHCP, you may want to set PEERDNS=no in the NIC's config file to prevent the dhclient from rewriting /etc/resolv.conf. This would be important if you will be configuring DNS on the management node (via makedns - covered later in this doc) and want the management node itself to use that DNS. In this case, set PEERDNS=no in each /etc/sysconfig/network-scripts/ifcfg-* file that has BOOTPROTO=dhcp.

On the other hand, if you want dhclient to configure /etc/resolv.conf on your management node, then don't set PEERDNS=no in the NIC config files.

Configure hostname

The xCAT management node hostname should be configured before installing xCAT on the management node. The hostname or its resolvable IP address will be used as the default master name in the xCAT site table when xCAT is installed. This name needs to be the one that will resolve to the cluster-facing NIC. Short hostnames (no domain) are the norm for the management node and all cluster nodes. Node names should never end in "-enx" for any x.

To set the hostname, edit /etc/sysconfig/network to contain, for example:

HOSTNAME=mgt

If you run the hostname command, it should return the same:

# hostname
mgt

Setup basic hosts file

Ensure that at least the management node is in /etc/hosts:

127.0.0.1               localhost.localdomain localhost
::1                     localhost6.localdomain6 localhost6
###
172.20.0.1 mgt mgt.cluster

Setup the TimeZone

When using the management node to install compute nodes, the timezone configuration on the management node will be inherited by the compute nodes, so it is recommended to set the correct timezone on the management node. To do this on RHEL, see http://www.redhat.com/advice/tips/timezone.html. The process is similar, but not identical, for SLES.
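For reference, a minimal sketch of setting the timezone on a RHEL 6 management node (America/New_York is just a placeholder; substitute your own zone):

# in /etc/sysconfig/clock set:  ZONE="America/New_York"
cp /usr/share/zoneinfo/America/New_York /etc/localtime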

You can also optionally set up the MN as an NTP server for the cluster. See Setting up NTP in xCAT.


Create a Separate File system for /install (optional)

It is not required, but recommended, that you create a separate file system for the /install directory on the Management Node. The size should be at least 30 GB to allow space for several install images.
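For example, a minimal sketch of creating and mounting such a file system, assuming a spare logical volume /dev/vg_xcat/lv_install exists (the device name is hypothetical; use whatever disk or LV you have available):

mkfs -t ext4 /dev/vg_xcat/lv_install
mkdir -p /install
echo "/dev/vg_xcat/lv_install /install ext4 defaults 0 0" >> /etc/fstab
mount /install
df -h /install      # confirm the new file system is mounted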

Restart Management Node

Note: in xCAT 2.8 and above, you do not need to restart the management node. Simply restart the cluster-facing NIC, for example: ifdown eth1; ifup eth1

For xCAT 2.7 and below, though it is possible to restart the correct services for all settings, the simplest step would be to reboot the Management Node at this point.

Configure Ethernet Switches

It is recommended that spanning tree be set in the switches to portfast or edge-port for faster boot performance. Please see the relevant switch documentation on how to configure this.

It is also recommended that the LLDP protocol be enabled in the switches, to collect the switch and port information for the compute nodes during the discovery process.

Note: this step is necessary if you want to use xCAT's automatic switch-based discovery (described later on in this document) for IPMI-controlled rack-mounted servers (including iDataPlex) and Flex chassis. If you have a small cluster and prefer to use the sequential discovery method (described later) or manually enter the MACs for the hardware, you can skip this section. However, you may still want to set up your switches for management so you can use xCAT tools to manage them, as described in Managing Ethernet Switches.

xCAT will use the ethernet switches during node discovery to find out which switch port a particular MAC address is communicating over. This allows xCAT to match a random booting node with the proper node name in the database. To set up a switch, give it an IP address on its management port and enable basic SNMP functionality. (Typically, the SNMP agent in the switches is disabled by default.) The easiest method is to configure the switches to give the SNMP version 1 community string called "public" read access. This will allow xCAT to communicate with the switches without further customization. (xCAT will get the list of switches from the switch table.) If you want to use SNMP version 3 (e.g. for better security), see the example below. With SNMP V3 you also have to set the user/password and AuthProto (default is 'md5') in the switches table.

If for some reason you can't configure SNMP on your switches, you can use sequential discovery or the more manual method of entering the nodes' MACs into the database. See #Discover the Nodes for a description of your choices.

SNMP V3 Configuration Example:

xCAT supports many switch types, such as BNT and Cisco. Here is an example of configuring SNMP V3 on the Cisco switch 3750/3650:

1. First, switch to configure mode using the following commands:

[root@x346n01 ~]# telnet xcat3750
Trying 192.168.0.234...
Connected to xcat3750.
Escape character is '^]'.
User Access Verification
Password:
xcat3750-1>enable
Password:
xcat3750-1#configure terminal
Enter configuration commands, one per line.  End with CNTL/Z.
xcat3750-1(config)#

2. Configure the snmp-server on the switch:

Switch(config)# access-list 10 permit 192.168.0.20    # 192.168.0.20 is the IP of MN
Switch(config)# snmp-server group xcatadmin v3 auth write v1default
Switch(config)# snmp-server community public RO 10
Switch(config)# snmp-server community private RW 10
Switch(config)# snmp-server enable traps license?

3. Configure the snmp user id (assuming a user/pw of xcat/passw0rd):

Switch(config)# snmp-server user xcat xcatadmin v3 auth SHA passw0rd access 10

4. Check the snmp communication to the switch :

  • On the MN: make sure the snmp rpms have been installed. If not, install them:
yum install net-snmp net-snmp-utils
  • Run the following command to check that the SNMP communication has been set up successfully (assuming the IP of the switch is 192.168.0.234):
snmpwalk -v 3 -u xcat -a SHA -A passw0rd -X cluster -l authnoPriv 192.168.0.234 .1.3.6.1.2.1.2.2.1.2

Later in this document, it is explained how to make sure the switch and switches tables are set up correctly.

Install xCAT on the Management Node

There are two options for installation of xCAT:

  1. download the software first
  2. or install directly from the internet-hosted repository

Pick either one, but not both.

Option 1: Prepare for the Install of xCAT without Internet Access

If not able to, or not wishing to, use the live internet repository, choose this option.

Go to the Download xCAT site and download the level of xCAT tarball you desire. Go to the xCAT Dependencies Download page and download the latest snap of the xCAT dependency tarball. (The latest snap of the xCAT dependency tarball will work with any version of xCAT.)

Copy the files to the Management Node (MN) and untar them:

mkdir /root/xcat2
cd /root/xcat2
tar jxvf xcat-core-2.*.tar.bz2     # or core-rpms-snap.tar.bz2
tar jxvf xcat-dep-*.tar.bz2

Setup YUM repositories for xCAT and Dependencies

Point YUM to the local repositories for xCAT and its dependencies:

cd /root/xcat2/xcat-dep/<os>/<arch>
./mklocalrepo.sh
cd /root/xcat2/xcat-core
./mklocalrepo.sh

[SLES 11]:

 zypper ar file:///root/xcat2/xcat-dep/sles11/ xCAT-dep
 zypper ar file:///root/xcat2/xcat-core  xcat-core

You can check a zypper repository using "zypper lr -d", or remove a zypper repository using "zypper rr".

[SLES 10.2+]:

zypper sa file:///root/xcat2/xcat-dep/sles10/ xCAT-dep
zypper sa file:///root/xcat2/xcat-core xcat-core

You can check a zypper repository using "zypper sl -d", or remove a zypper repository using "zypper sd".

Option 2: Prepare to Install xCAT Directly from the Internet-hosted Repository

When using the live internet repository, you need to first make sure that name resolution on your management node is at least set up enough to resolve sourceforge.net. Then make sure the correct repo files are in /etc/yum.repos.d:

To get the current official release:

cd /etc/yum.repos.d
wget http://sourceforge.net/projects/xcat/files/yum/stable/xcat-core/xCAT-core.repo 

To get the deps package:

wget http://sourceforge.net/projects/xcat/files/yum/xcat-dep/<os>/<arch>/xCAT-dep.repo

for example:

wget http://sourceforge.net/projects/xcat/files/yum/xcat-dep/rh6/x86_64/xCAT-dep.repo 

To set up SLES with zypper:

[SLES11]:

zypper ar -t rpm-md http://sourceforge.net/projects/xcat/files/yum/stable/xcat-core xCAT-core
zypper ar -t rpm-md http://sourceforge.net/projects/xcat/files/yum/xcat-dep/<os>/<arch> xCAT-dep


[SLES10.2+]:

zypper sa http://sourceforge.net/projects/xcat/files/yum/stable/xcat-core xCAT-core
zypper sa http://sourceforge.net/projects/xcat/files/yum/xcat-dep/<os>/<arch> xCAT-dep

For Both Options: Make Required Packages From the Distro Available

xCAT relies on several packages that come from the Linux distro. Follow this section to create the repository of the OS on the Management Node.

See the following documentation:

Setting_Up_the_OS_Repository_on_the_Mgmt_Node

Install xCAT Packages

[RH]: Use yum to install xCAT and all the dependencies:

yum clean metadata
yum install xCAT

[SLES]: Use zypper to install xCAT and all the dependencies:

zypper install xCAT

Using the New Sysclone Deployment Method

Note: in xCAT 2.8.2 and above, xCAT supports cloning new nodes from a pre-installed/pre-configured node; this provisioning method is called sysclone. It leverages the open source tool systemimager. If you will be installing stateful (diskful) nodes using the sysclone provmethod, you need to install systemimager and all its dependencies (using sysclone is optional):

[RH]: Use yum to install systemimager and all the dependencies:

yum install systemimager-server

[SLES]: Use zypper to install systemimager and all the dependencies:

zypper install systemimager-server

Quick Test of xCAT Installation

Add xCAT commands to the path by running the following:

source /etc/profile.d/xcat.sh

Check to see the database is initialized:

tabdump site

The output should be similar to the following:

key,value,comments,disable
"xcatdport","3001",,
"xcatiport","3002",,
"tftpdir","/tftpboot",,
"installdir","/install",,
     .
     .
     .

If the tabdump command does not work, see Debugging xCAT Problems.

Updating xCAT Packages Later

If you need to update the xCAT RPMs later:

  • If the management node does not have access to the internet: download the new version of xCAT from Download xCAT and the dependencies from xCAT Dependencies Download and untar them in the same place as before.
  • If the management node has access to the internet, the yum command below will pull the updates directly from the xCAT site.

To update xCAT:

[RH]:

yum clean metadata
yum update '*xCAT*'

[SLES]:

zypper refresh
zypper update -t package '*xCAT*'

Note: this will not apply updates that may have been made to some of the xCAT deps packages. (If there are brand new deps packages, they will get installed.) In most cases, this is ok, but if you want to make all updates for xCAT rpms and deps, run the following command. This command will also pick up additional OS updates.

[RH]:

yum update

[SLES]:

zypper refresh
zypper update

Note: If you are updating from xCAT 2.7.x (or earlier) to xCAT 2.8 or later, there are some additional migration steps that need to be considered:

  1. Switch from xCAT IBM HPC Integration support to using Software Kits - see Switching_from_xCAT_IBM_HPC_Integration_Support_to_Using_Software_Kits for details.
  2. (Optional) Use nic attributes to replace the otherinterfaces attribute to configure secondary adapters - see otherinterfaces vs nic attributes for details.
  3. Convert non-osimage based system to osimage based system - see Convert non-osimage based system to osimage based system for details.

Configure xCAT

Networks Table

All networks in the cluster must be defined in the networks table. When xCAT was installed, it ran makenetworks, which created an entry in this table for each of the networks the management node is connected to. Now is the time to add to the networks table any other networks in the cluster, or update existing networks in the table.

For a sample Networks Setup, see the following example: Setting_Up_a_Linux_xCAT_Mgmt_Node#Appendix_A:_Network_Table_Setup_Example

passwd Table

The passwd table holds the password that will be assigned to root when the node is installed. You can modify this table using tabedit. To change the default password for root on the nodes, change the system line. To change the password to be used for the BMCs, change the ipmi line.

tabedit passwd
#key,username,password,cryptmethod,comments,disable
"system","root","cluster",,,
"ipmi","USERID","PASSW0RD",,,

Setup DNS

To get the hostname/IP pairs copied from /etc/hosts to the DNS on the MN:

  • Ensure that /etc/sysconfig/named does not have ROOTDIR set
  • Set site.forwarders to your site-wide DNS servers that can resolve site or public hostnames. The DNS on the MN will forward any requests it can't answer to these servers.
chdef -t site forwarders=1.2.3.4,1.2.5.6
  • Edit /etc/resolv.conf to point the MN to its own DNS. (Note: this won't be required in xCAT 2.8 and above.)
search cluster
nameserver 172.20.0.1
  • Run makedns
makedns -n
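As a quick check that the DNS on the MN is answering, you can query it directly (a simple sketch using the example management node address in this document):

nslookup mgt 172.20.0.1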

For more information about name resolution in an xCAT Cluster, see Cluster Name Resolution.

Setup DHCP

You usually don't want your DHCP server listening on your public (site) network, so set site.dhcpinterfaces to your MN's cluster-facing NICs. For example:

chdef -t site dhcpinterfaces=eth1

Then this will get the network stanza part of the DHCP configuration (including the dynamic range) set:

makedhcp -n

The IP/MAC mappings for the nodes will be added to DHCP automatically as the nodes are discovered.
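To verify that the DHCP server is running and that the cluster subnet made it into the generated configuration (a quick check; on RHEL 6 the generated file is /etc/dhcp/dhcpd.conf, on older releases /etc/dhcpd.conf):

service dhcpd status
grep "subnet 172.16.0.0" /etc/dhcp/dhcpd.conf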

Setup TFTP

Nothing to do here - the TFTP server is set up by xCAT during the Management Node install.

Setup conserver

makeconservercf

Node Definition and Discovery

Declare a dynamic range of addresses for discovery

If you want to run a discovery process, a dynamic range must be defined in the networks table. It's used for the nodes to get an IP address before xCAT knows their MAC addresses.

In this case, we'll designate 172.20.255.1-172.20.255.254 as a dynamic range:

chdef -t network 172_16_0_0-255_240_0_0 dynamicrange=172.20.255.1-172.20.255.254
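To confirm the attribute was set, and to push the range into the DHCP configuration if you ran makedhcp -n before defining the range, you can run:

lsdef -t network 172_16_0_0-255_240_0_0 -i dynamicrange
makedhcp -n        # regenerate the DHCP network stanzas so they include the dynamic range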

Load the e1350 Templates

Several xCAT database tables must be filled in while setting up an iDataPlex cluster. To make this process easier, xCAT provides several template files in /opt/xcat/share/xcat/templates/e1350/. These files contain regular expressions that describe the naming patterns in the cluster. With xCAT's regular expression support, one line in a table can define one or more attribute values for all the nodes in a node group. (For more information on xCAT's database regular expressions, see http://xcat.sourceforge.net/man5/xcatdb.5.html .) To load the default templates into your database:

cd /opt/xcat/share/xcat/templates/e1350/
for i in *csv; do tabrestore $i; done

These templates contain entries for a lot of different node groups, but we will be using the following node groups:

  • ipmi - the nodes controlled via IPMI.
  • idataplex - the iDataPlex nodes
  • 42perswitch - the nodes that are connected to 42 port switches
  • compute - all of the compute nodes
  • 84bmcperrack - the BMCs that are in a fully populated rack of iDataPlex
  • switch - the ethernet switches in the cluster

In our example, ipmi, idataplex, 42perswitch, and compute will all have the exact same membership because all of our iDataPlex nodes have those characteristics.

The templates automatically define the following attributes and naming conventions:

  • The iDataPlex compute nodes:
    • node names are of the form n<num>, for example n1
    • ip: 172.20.100+racknum.nodenuminrack
    • bmc: the bmc with the same number as the node
    • switch: divide the node number by 42 to get the switch number
    • switchport: the nodes are plugged into 42-port ethernet switches in order of node number
    • mgt: 'ipmi'
    • netboot: 'xnba'
    • profile: 'compute'
    • rack: node number divided by 84
    • unit: in the range of A1 - A42 for the 1st 42 nodes in each rack, and in the range of C1 - C42 for the 2nd 42 nodes in each rack
    • chain: 'runcmd=bmcsetup,shell'
    • ondiscover: 'nodediscover'
  • The BMCs:
    • node names are of the form n<num>-bmc, for example n001-bmc
    • ip: 172.29.100+racknum.nodenuminrack
  • The management connection to each ethernet switch:
    • node names are of the form switch<num>, for example switch1
    • ip: 172.30.50.switchnum

For a description of the attribute names above, see the node object definition.

If these conventions don't work for your situation, you can either:

  1. modify the regular expressions - see xCAT iDataPlex Advanced Setup#Template modification example
  2. or manually define each node - see xCAT iDataPlex Advanced Setup#Manually setup the node attributes instead of using the templates or switch discovery

Add Nodes to the nodelist Table

Now you can use the power of the templates to define the nodes quickly. By simply adding the nodes to the correct groups, they will pick up all of the attributes of that group:

nodeadd n[001-167] groups=ipmi,idataplex,42perswitch,compute,all
nodeadd n[001-167]-bmc groups=84bmcperrack
nodeadd switch1-switch4 groups=switch

To see the list of nodes you just defined:

nodels
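As a quick sanity check (a sketch based on this example's 167 nodes and 4 switches), you can also count the members of the groups you just populated:

nodels ipmi | wc -l       # should report 167
nodels switch | wc -l     # should report 4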

To see all of the attributes that the combination of the templates and your nodelist have defined for a few sample nodes:

lsdef n100,n100-bmc,switch2

This is the easiest way to verify that the regular expressions in the templates are giving you attribute values you are happy with. (Or, if you modified the regular expressions, that you did it correctly.)

Declare use of SOL

If not using a terminal server, SOL is recommended, but not required, to be configured. To instruct xCAT to configure SOL in installed operating systems on dx340 systems:

chdef -t group -o compute serialport=1 serialspeed=19200 serialflow=hard

For dx360-m2 and newer use:

chdef -t group -o compute serialport=0 serialspeed=115200 serialflow=hard

Setup /etc/hosts and DNS

Since the mapping between the xCAT node names and IP addresses has been added to the hosts table by the e1350 templates, you can run the makehosts xCAT command to create the /etc/hosts file from the xCAT hosts table. (You can skip this step if creating /etc/hosts manually.)

makehosts switch,idataplex,ipmi

Verify that the entries have been created in the file /etc/hosts. For example, your /etc/hosts should look like this:

127.0.0.1               localhost.localdomain localhost
::1                     localhost6.localdomain6 localhost6
###
172.20.0.1 mgt mgt.cluster
172.20.101.1 n1 n1.cluster
172.20.101.2 n2 n2.cluster
172.20.101.3 n3 n3.cluster
172.20.101.4 n4 n4.cluster
172.20.101.5 n5 n5.cluster
172.20.101.6 n6 n6.cluster
172.20.101.7 n7 n7.cluster
              .
              .
              .

Add the node IP mappings to the DNS:

makedns

Discover the Nodes

xCAT supports 3 approaches to discover the new physical nodes and define them in the xCAT database:

  • Option 1: Sequential Discovery

This is a simple approach in which you give xCAT a range of node names to be given to the discovered nodes, and then you power the nodes on sequentially (usually in physical order), and each node is given the next node name in the noderange.

  • Option 2: Switch Discovery

With this approach, xCAT assumes the nodes are plugged into your ethernet switches in an orderly fashion. So it uses each node's switch port number to determine where it is physically located in the racks and therefore what node name it should be given. This method requires a little more setup (configuring the switches and defining the switch table). But the advantage of this method is that you can power all of the nodes on at the same time and xCAT will sort out which node is which. This can save you a lot of time in a large cluster.

  • Option 3: Manual Discovery

If you don't want to use either of the automatic discovery processes, just follow the manual discovery process.

Choose just one of these options and follow the corresponding section below (and skip the other two).

Option 1: Sequential Discovery

Note: This feature is only supported in xCAT 2.8.1 and higher.

Sequential Discovery means the new nodes will be discovered one by one. The nodes will be given names from a 'node name pool' in the order they are powered on.

Initialize the discovery process

Specify the node name pool by giving a noderange to the nodediscoverstart command:

nodediscoverstart noderange=n[001-010]

The value of noderange should be in the xCAT noderange format.

Note: other node attributes can be given to nodediscoverstart so that xCAT will assign those attributes to the nodes as they are discovered. We aren't showing that in this document, because we already predefined the nodes, the groups they are in, and several attributes (provided by the e1350 templates). If you don't want to predefine nodes, you can give more attributes to nodediscoverstart and have it define the nodes. See the nodediscoverstart man page for details.

Power on the nodes sequentially

At this point you can physically power on the nodes one at a time, in the order you want them to receive their node names.

Display information about the discovery process

There are additional nodediscover commands you can run during the discovery process. See their man pages for more details.

  • Verify the status of discovery
nodediscoverstatus
  • Show the nodes that have been discovered so far:
nodediscoverls -t seq -l
  • Stop the current sequential discovery process:
nodediscoverstop

Note: The sequential discovery process will be stopped automatically when all of the node names in the node name pool are used up.

Option 2: Switch Discovery

This method of discovery assumes that you have the nodes plugged into your ethernet switches in an orderly fashion. So we use each node's switch port number to determine where it is physically located in the racks and therefore what node name it should be given.

To use this discovery method, you must have already configured the switches as described in #Configure Ethernet Switches

Switch-related Tables

The table templates already put group-oriented regular expression entries in the switch table. Use lsdef for a sample node to see if the switch and switchport attributes are correct. If not, use chdef or tabedit to change the values.

If you configured your switches to use SNMP V3, then you need to define several attributes in the switches table. Assuming all of your switches use the same values, you can set these attributes at the group level:

tabch switch=switch switches.snmpversion=3 switches.username=xcat switches.password=passw0rd switches.auth=sha
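To verify what was written, a quick check of the switches table contents:

tabdump switches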

Option 3: Manually Discover Nodes

Prerequisite: The dynamic dhcp range has been configured before you power on the nodes.

If you have a few nodes which were not discovered by Sequential Discovery or Switch Discovery, you can find them in the discoverydata table. The undiscovered nodes are identified by the 'undef' method in the discoverydata table.

Display the undefined nodes with nodediscoverls command:

nodediscoverls -t undef
 UUID                                    NODE                METHOD         MTM       SERIAL
 61E5F2D7-0D59-11E2-A7BC-3440B5BEDBB1    undef               undef          786310X   1052EF1
 FC5F8852-CB97-11E1-8D59-E41F13EEB1BA    undef               undef          7914B2A   06DVAC9
 96656F17-6482-E011-9954-5CF3FC317F68    undef               undef          7377D2C   99A2007

If you want to manually define an 'undefined' node as a specific free node, use the 'nodediscoverdef' command (available in 2.8.2 or higher).

For example, if you have a free node n10 and you want to define the undefined node whose uuid is '61E5F2D7-0D59-11E2-A7BC-3440B5BEDBB1' as n10, run the following command:

nodediscoverdef -u 61E5F2D7-0D59-11E2-A7BC-3440B5BEDBB1 -n n10

After the manual definition, the 'node name' and 'discovery method' attributes of the undefined node will be changed. You can display the change with the nodediscoverls command:

# nodediscoverls
 UUID                                    NODE                METHOD         MTM       SERIAL
 61E5F2D7-0D59-11E2-A7BC-3440B5BEDBB1    n10                 manual         786310X   1052EF1
 FC5F8852-CB97-11E1-8D59-E41F13EEB1BA    undef               undef          7914B2A   06DVAC9
 96656F17-6482-E011-9954-5CF3FC317F68    undef               undef          7377D2C   99A2007

And you can run 'lsdef n10' to see that the 'mac address' and 'mtm' attributes have been updated in the node definition. If the next task, such as bmcsetup, has been set in the chain table, the task chain will continue to run after the nodediscoverdef command.

Run the discovery

If you want to update node firmware when you discover the nodes, follow the steps in xCAT iDataPlex Advanced Setup#Updating Node Firmware before continuing.

If you want to automatically deploy the nodes after they are discovered, follow the steps in xCAT iDataPlex Advanced Setup#Automatically Deploying Nodes After Discovery before continuing. (But if you are new to xCAT we don't recommend this.)

To initiate any of the 3 discovery methods, walk over to the systems and press the power buttons. For the sequential discovery method, power the nodes on in the order that you want them to be given the node names. Wait a short time (about 30 seconds) between each node to ensure they contact xcatd in the correct order. For the switch and manual discovery processes, you can power on all of the nodes at the same time.

On the MN, watch the nodes being discovered with:

tail -f /var/log/messages

Look for the dhcp requests, the xCAT discovery requests, and the "<node> has been discovered" messages.

A quick summary of what is happening during the discovery process is:

  • the nodes request a DHCP IP address and PXE boot instructions
  • the DHCP server on the MN responds with a dynamic IP address and the xCAT genesis boot kernel
  • the genesis boot kernel running on the node sends the MAC and MTMS to xcatd on the MN
  • xcatd asks the switches which port this MAC is on so that it can correlate this physical node with the proper node entry in the database. (Switch Discovery only)
  • xcatd uses the specified node name pool to get the proper node entry. (Sequential Discovery only)
    • stores the node's MTMS in the db
    • puts the MAC/IP pair in the DHCP configuration
    • sends several of the node attributes to the genesis kernel on the node
  • the genesis kernel configures the BMC with the proper IP address, userid, and password, and then just drops into a shell

After a successful discovery process, the following attributes will be added to the database for each node. (You can verify this by running lsdef <node>):

  • mac - the MAC address of the in-band NIC used to manage this node
  • mtm - the hardware type (machine-model)
  • serial - the hardware serial number

If you cannot discover the nodes successfully, see the next section #Manually Discover Nodes.

If at some later time you want to force a re-discover of a node, run:

makedhcp -d <noderange>

and then reboot the node(s).

Monitoring Node Discovery

When the bmcsetup process completes on each node (about 5-10 minutes), xCAT genesis will drop into a shell and wait indefinitely (and change the node's currstate attribute to "shell"). You can monitor the progress of the nodes using:

watch -d 'nodels ipmi chain.currstate|xcoll'

Before all nodes complete, you will see output like:

====================================
n1,n10,n11,n75,n76,n77,n78,n79,n8,n80,n81,n82,n83,n84,n85,n86,n87,n88,n89,n9,n90,n91
====================================
shell

====================================
n31,n32,n33,n34,n35,n36,n37,n38,n39,n4,n40,n41,n42,n43,n44,n45,n46,n47,n48,n49,n5,n50,n51,n52,
 n53,n54,n55,n56,n57,n58,n59,n6,n60,n61,n62,n63,n64,n65,n66,n67,n68,n69,n7,n70,n71,n72,n73,n74
====================================
runcmd=bmcsetup

When all nodes have made it to the shell, xcoll will just show that the whole nodegroup "ipmi" has the output "shell":

====================================
ipmi
====================================
shell

When the nodes are in the xCAT genesis shell, you can ssh or psh to any of the nodes to check anything you want.

Verify HW Management Configuration

At this point, the BMCs should all be configured and ready for hardware management. To verify this:

# rpower ipmi stat | xcoll
====================================
ipmi
====================================
on

HW Settings Necessary for Remote Console

To get the remote console working for each node, some uEFI hardware settings must have specific values. First check the settings, and if they aren't correct, then set them properly. This can be done via the ASU utility. The settings are slightly different, depending on the hardware type:

  • For the dx360-m3 and earlier machines create a file called asu-show with contents:
show uEFI.Com1ActiveAfterBoot
show uEFI.SerialPortSharing
show uEFI.SerialPortAccessMode
show uEFI.RemoteConsoleRedirection
And create a file called asu-set with contents:
set uEFI.Com1ActiveAfterBoot Enable
set uEFI.SerialPortSharing Enable
set uEFI.SerialPortAccessMode Dedicated
set uEFI.RemoteConsoleRedirection Enable
  • For dx360-m4 and later machines create a file called asu-show with contents:
show DevicesandIOPorts.Com1ActiveAfterBoot
show DevicesandIOPorts.SerialPortSharing
show DevicesandIOPorts.SerialPortAccessMode
show DevicesandIOPorts.RemoteConsole
And create a file called asu-set with contents:
set DevicesandIOPorts.Com1ActiveAfterBoot Enable
set DevicesandIOPorts.SerialPortSharing Enable
set DevicesandIOPorts.SerialPortAccessMode Dedicated
set DevicesandIOPorts.RemoteConsole Enable

Then for both types of machines, use the pasu tool to check these settings:

pasu -b asu-show ipmi | xcoll    # Or you can check just one node and assume the rest are the same

If the settings are not correct, then set them:

pasu -b asu-set ipmi | xcoll

For alternate ways to set the ASU settings, see xCAT iDataPlex Advanced Setup#Using ASU to Update CMOS, uEFI, or BIOS Settings on the Nodes.

Now the remote console should work. Verify it on one node by running:

rcons <node>

Verify that you can see the genesis shell prompt (after hitting enter). To exit rcons type: ctrl-shift-E (all together), then "c", then ".".

You are now ready to choose an operating system and deployment method for the nodes....

Deploying Nodes

  • If you want to install your nodes as stateful (diskful) nodes, follow the next section #Installing Stateful Nodes.
  • If you want to define one or more stateless (diskless) OS images and boot the nodes with those, see section #Deploying Stateless Nodes. This method has the advantage of managing the images in a central place, and having only one image per node type.
  • If you want to have nfs-root statelite nodes, see xCAT Linux Statelite. This has the same advantage of managing the images from a central place. It has the added benefit of using less memory on the node while allowing larger images. But it has the drawback of making the nodes dependent on the management node or service nodes (i.e. if the management/service node goes down, the compute nodes booted from it go down too).
  • If you have a very large cluster (more than 500 nodes), at this point you should follow Setting Up a Linux Hierarchical Cluster to install and configure your service nodes. After that you can return here to install or diskless boot your compute nodes.

Installing Stateful Nodes

There are two options to install your nodes as stateful (diskful) nodes:

  1. use ISOs or DVDs, follow the section #Option 1: Installing Stateful Nodes Using ISOs or DVDs
  2. or clone new nodes from a pre-installed/pre-configured node, follow the section #Option 2: Installing Stateful Nodes Using Sysclone

Option 1: Installing Stateful Nodes Using ISOs or DVDs

This section describes the process for setting up xCAT to install nodes; that is how to install an OS on the disk of each node.

Create the Distro Repository on the MN

The copycds command copies the contents of the linux distro media to /install/<os>/<arch> so that it will be available to install nodes with or to create diskless images.

  • Obtain the Redhat or SLES ISOs or DVDs.
  • If using an ISO, copy it to (or NFS mount it on) the management node, and then run:
copycds /RHEL6.2-*-Server-x86_64-DVD1.iso
  • If using a DVD, put it in the DVD drive of the management node and run:
copycds /dev/dvd       # or whatever the device name of your dvd drive is

Tip: if this is the same distro version as your management node, create a .repo file in /etc/yum.repos.d with content similar to:

[local-rhels6.2-x86_64]
name=xCAT local rhels 6.2
baseurl=file:/install/rhels6.2/x86_64
enabled=1
gpgcheck=0

This way, if you need some additional RPMs on your MN at a later time, you can simply install them using yum. Or if you are installing other software on your MN that requires some additional RPMs from the distro, they will automatically be found and installed.

Select or Create an osimage Definition

The copycds command also automatically creates several osimage definitions in the database that can be used for node deployment. To see them:

lsdef -t osimage          # see the list of osimages
lsdef -t osimage <osimage-name>          # see the attributes of a particular osimage

From the list above, select the osimage for your distro, architecture, provisioning method (in this case install), and profile (compute, service, etc.). Although it is optional, we recommend you make a copy of the osimage, changing its name to a simpler name. For example:

lsdef -t osimage -z rhels6.2-x86_64-install-compute | sed 's/^[^ ]\+:/mycomputeimage:/' | mkdef -z

This displays the osimage "rhels6.2-x86_64-install-compute" in aformat that can be used as input to mkdef, but on the way there it usessed to modify the name of the object to "mycomputeimage".

Initially, this osimage object points to templates, pkglists, etc. that are shipped by default with xCAT. And some attributes, for example otherpkglist and synclists, won't have any value at all because xCAT doesn't ship a default file for them. You can now change/fill in any osimage attributes that you want. A general convention is that if you are modifying one of the default files that an osimage attribute points to, copy it into /install/custom and have your osimage point to it there. (If you modify the copy under /opt/xcat directly, it will be over-written the next time you upgrade xCAT.)

But for now, we will use the default values in the osimage definition and continue on. (If you really want to see examples of modifying/creating the pkglist, template, otherpkgs pkglist, and sync file list, see the section #Deploying Stateless Nodes. Most of the examples there can be used for stateful nodes too.)

Install a new Kernel on the nodes

Using a postinstall script (you could also use the updatenode method):

mkdir /install/postscripts/data
cp <your-kernel-rpms> /install/postscripts/data

Create the postscript updatekernel:

vi /install/postscripts/updatekernel

Add the following lines to the file

#!/bin/bash
rpm -Uvh data/kernel-*.rpm

Change the permissions on the file:

chmod 755 /install/postscripts/updatekernel

Add the script to the postscripts table and run the install:

chdef -p -t group -o compute postscripts=updatekernel
rnetboot compute

Update the Distro at a Later Time

After the initial install of the distro onto nodes, if you want to update the distro on the nodes (either with a few updates or a new SP) without reinstalling the nodes:

  • create the new repo using copycds:
copycds /RHEL6.3-*-Server-x86_64-DVD1.iso
Or, for just a few updated rpms, you can copy the updated rpms from the distributor into a directory under /install and run createrepo in that directory.
  • add the new repo to the pkgdir attribute of the osimage:
chdef -t osimage rhels6.2-x86_64-install-compute -p pkgdir=/install/rhels6.3/x86_64
Note: the above command will add a 2nd repo to the pkgdir attribute. This is only supported for xCAT 2.8.2 and above. For earlier versions of xCAT, omit the -p flag to replace the existing repo directory with the new one.
  • run the ospkgs postscript to have yum update all rpms on the nodes
updatenode compute -P ospkgs

Option 2: Installing Stateful Nodes Using Sysclone

This section describes how to install and configure a diskful node (called the golden-client) and capture an osimage from it; that osimage can then be used to clone other nodes later.

Note: this support is available in xCAT 2.8.2 and above.

Install or Configure the Golden Client

If you want to use the sysclone provisioning method, you need a golden-client. In this way, you can customize and tweak the golden-client's configuration according to your needs and verify its proper operation, so that once the image is captured and deployed, the new nodes will behave in the same way as the golden-client.

To install a golden-client, follow the section #Option 1: Installing Stateful Nodes Using ISOs or DVDs.

To install the systemimager rpms onto the golden-client:

  • Download the xcat-dep tarball which includes systemimager rpms.
Go to xcat-dep and get the latest xCAT dependency tarball. Copy the file to the management node and untar it in the appropriate sub-directory of /install/post/otherpkgs. For example:

[RH]:

mkdir -p /install/post/otherpkgs/rhels6.3/x86_64/xcat
cd /install/post/otherpkgs/rhels6.3/x86_64/xcat
tar jxvf xcat-dep-*.tar.bz2
  • Add the sysclone otherpkglist file and otherpkgdir to osimage definition and run the install. For example:

[RH]:

chdef -t osimage -o <osimage-name> otherpkglist=/opt/xcat/share/xcat/install/rh/sysclone.rhels6.x86_64.otherpkgs.pkglist
chdef -t osimage -o <osimage-name> -p otherpkgdir=/install/post/otherpkgs/rhels6.3/x86_64
rpower <golden-client> reset          # you could also use the updatenode method

[Fedora/CentOS]

  • For otherpkglist: has the same content as RedHat's otherpkglist.
  • For otherpkgdir: can use the same directory as RedHat.

Capture image from the Golden Client

Use imgcapture to capture an osimage from the golden-client:

imgcapture <golden-client> -t sysclone -o <mycomputeimage>

Tip: when imgcapture is run, it pulls an osimage from the golden-client and creates an osimage definition on the xCAT management node. Use lsdef -t osimage <mycomputeimage> to check the osimage attributes.

Begin Installation

The nodeset command tells xCAT what you want to do next with this node, rsetboot tells the node hardware to boot from the network for the next boot, and powering on the node using rpower starts the installation process:

nodeset compute osimage=mycomputeimage
rsetboot compute net
rpower compute boot

Tip: when nodeset is run, it processes the kickstart or autoyast template associated with the osimage, plugging in node-specific attributes, and creates a specific kickstart/autoyast file for each node in /install/autoinst. If you need to customize the template, make a copy of the template file that is pointed to by the osimage.template attribute and edit that file (or the files it includes).

Monitor installation

It is possible to use the wcons command to watch the installation process for a sampling of the nodes:

wcons n1,n20,n80,n100

or rcons to watch one node

rcons n1

Additionally, nodestat may be used to check the status of a node as it installs:

nodestat n20,n21
n20: installing man-pages - 2.39-10.el5 (0%)
n21: installing prep

Note: the percentage complete reported by nodestat is not necessarily reliable.

You can also watch nodelist.status until it changes to "booted" for each node:

nodels compute nodelist.status | xcoll

Once all of the nodes are installed and booted, you should be able to ssh to all of them from the MN (without a password), because xCAT should have automatically set up the ssh keys (if the postscripts ran successfully):

xdsh compute date

If there are problems, see Debugging xCAT Problems.



Deploying Stateless Nodes

Note: this section describes how to create a stateless image using the genimage command to install a list of rpms into the image. As an alternative, you can also capture an image from a running node and create a stateless image out of it. See Capture Linux Image for details.

Create the Distro Repository on the MN

The copycds command copies the contents of the linux distro media to /install/<os>/<arch> so that it will be available to install nodes with or to create diskless images.

  • Obtain the Redhat or SLES ISOs or DVDs.
  • If using an ISO, copy it to (or NFS mount it on) the management node, and then run:
copycds /RHEL6.2-Server-20080430.0-x86_64-DVD.iso
  • If using a DVD, put it in the DVD drive of the management node and run:
copycds /dev/dvd       # or whatever the device name of your dvd drive is

Tip: if this is the same distro version as your management node, create a .repo file in /etc/yum.repos.d with content similar to:

[local-rhels6.2-x86_64]
name=xCAT local rhels 6.2
baseurl=file:/install/rhels6.2/x86_64
enabled=1
gpgcheck=0

This way, if you need some additional RPMs on your MN at a later time, you can simply install them using yum. Or if you are installing other software on your MN that requires some additional RPMs from the distro, they will automatically be found and installed.

Using an osimage Definition

Note: To use an osimage as your provisioning method, you need to be running xCAT 2.6.6 or later.

The provmethod attribute of your nodes should contain the name of the osimage object definition that is being used for those nodes. The osimage object contains paths for pkgs, templates, kernels, etc. If you haven't already, run copycds to copy the distro rpms to /install. Default osimage objects are also defined when copycds is run. To view the osimages:

lsdef -t osimage          # see the list of osimages
lsdef -t osimage <osimage-name>          # see the attributes of a particular osimage

Select or Create an osimage Definition

From the list found above, select the osimage for your distro, architecture, provisioning method (install, netboot, statelite), and profile (compute, service, etc.). Although it is optional, we recommend you make a copy of the osimage, changing its name to a simpler name. For example:

lsdef -t osimage -z rhels6.3-x86_64-netboot-compute | sed 's/^[^ ]\+:/mycomputeimage:/' | mkdef -z

This displays the osimage "rhels6.3-x86_64-netboot-compute" in aformat that can be used as input to mkdef, but on the way there it usessed to modify the name of the object to "mycomputeimage".

Initially, this osimage object points to templates, pkglists, etc. that are shipped by default with xCAT. And some attributes, for example otherpkglist and synclists, won't have any value at all because xCAT doesn't ship a default file for them. You can now change/fill in any osimage attributes that you want. A general convention is that if you are modifying one of the default files that an osimage attribute points to, copy it into /install/custom and have your osimage point to it there. (If you modify the copy under /opt/xcat directly, it will be over-written the next time you upgrade xCAT.)

Set up pkglists

You likely want to customize the main pkglist for the image. This is the list of rpms or groups that will be installed from the distro. (Other rpms that they depend on will be installed automatically.) For example:

mkdir -p /install/custom/netboot/rh
cp -p /opt/xcat/share/xcat/netboot/rh/compute.rhels6.x86_64.pkglist /install/custom/netboot/rh
vi /install/custom/netboot/rh/compute.rhels6.x86_64.pkglist
chdef -t osimage mycomputeimage pkglist=/install/custom/netboot/rh/compute.rhels6.x86_64.pkglist

The goal is to install the fewest number of rpms that still provides the function and applications that you need, because the resulting ramdisk will use real memory in your nodes.

Also, check to see if the default exclude list excludes all files and directories you do not want in the image. The exclude list enables you to trim the image after the rpms are installed into the image, so that you can make the image as small as possible.

cp /opt/xcat/share/xcat/netboot/rh/compute.exlist /install/custom/netboot/rh
vi /install/custom/netboot/rh/compute.exlist
chdef -t osimage mycomputeimage exlist=/install/custom/netboot/rh/compute.exlist

Make sure nothing is excluded in the exclude list that you need on the node. For example, if you require perl on your nodes, remove the line "./usr/lib/perl5*".

Installing OS Updates by Setting linuximage.pkgdir (only supported for rhels and sles)

The linuximage.pkgdir attribute is the name of the directory where the distro packages are stored. It can be set to multiple paths, separated by ",". The first path is the value of osimage.pkgdir and must be the OS base pkg directory path, such as pkgdir=/install/rhels6.2/x86_64,/install/updates/rhels6.2/x86_64 . In the OS base pkg path there is default repository data. In the other pkg path(s), the user should make sure there is repository data. If not, use the "createrepo" command to create it.

If you have additional OS update rpms (rpms may be from the OS website, or an additional OS distro) that you also want installed, make a directory to hold them, create a list of the rpms you want installed, and add that information to the osimage definition:

  • Create a directory to hold the additional rpms:
mkdir -p /install/updates/rhels6.2/x86_64
cd /install/updates/rhels6.2/x86_64
cp /myrpms/* .

If there is no repository data in the directory, you can run "createrepo" to create it:

createrepo .

The createrepo command is in the createrepo rpm, which for RHEL is in the 1st DVD, but for SLES is in the SDK DVD.

NOTE: when the management node is rhels6.x, and the otherpkgs repository data is for rhels5.x, we should run createrepo with "-s md5", such as:

createrepo -s md5 .
  • Append the additional rpms to the corresponding pkglist. For example, in /install/custom/install/rh/compute.rhels6.x86_64.pkglist, append:
...
myrpm1
myrpm2
myrpm3
  • Add both the directory and the file to the osimage definition:
chdef -t osimage mycomputeimage pkgdir=/install/rhels6.2/x86_64,/install/updates/rhels6.2/x86_64  pkglist=/install/custom/install/rh/compute.rhels6.x86_64.pkglist

If you add more rpms at a later time, you must run createrepo again.

Note: After the above setting,

  • For diskful install, run "nodeset <noderange> osimage=mycomputeimage" to pick up the changes, and then boot up the nodes.
  • For diskless, run genimage to install the packages into the image, and then packimage and boot up the nodes.
  • If the nodes are up, run "updatenode <noderange> ospkgs" to update the packages.
  • These functions are only supported for rhels6.x and sles11.x.

Installing Additional Packages Using an Otherpkgs Pkglist

If you have additional rpms (rpms not in the distro) that you also want installed, make a directory to hold them, create a list of the rpms you want installed, and add that information to the osimage definition:

  • Create a directory to hold the additional rpms:
mkdir -p /install/post/otherpkgs/rh/x86_64
cd /install/post/otherpkgs/rh/x86_64
cp /myrpms/* .
createrepo .

NOTE: when the management node is rhels6.x, and the otherpkgs repository data is for rhels5.x, we should run createrepo with "-s md5", such as:

createrepo -s md5 .
  • Create a file that lists the additional rpms that should be installed. For example, in /install/custom/netboot/rh/compute.otherpkgs.pkglist put:
myrpm1
myrpm2
myrpm3
  • Add both the directory and the file to the osimage definition:
chdef -t osimage mycomputeimage otherpkgdir=/install/post/otherpkgs/rh/x86_64 otherpkglist=/install/custom/netboot/rh/compute.otherpkgs.pkglist

If you add more rpms at a later time, you must run createrepo again. The createrepo command is in the createrepo rpm, which for RHEL is on the 1st DVD, but for SLES is on the SDK DVD.

If you have multiple sets of rpms that you want to keep separate to keep them organized, you can put them in separate sub-directories in the otherpkgdir. If you do this, you need to do the following extra things, in addition to the steps above:

  • Run createrepo in each sub-directory
  • In your otherpkgs.pkglist, list at least 1 file from each sub-directory. (During installation, xCAT will define a yum or zypper repository for each directory you reference in your otherpkgs.pkglist.) For example:
xcat/xcat-core/xCATsn
xcat/xcat-dep/rh6/x86_64/conserver-xcat

There are some examples of otherpkgs.pkglist in /opt/xcat/share/xcat/netboot/<distro>/service.*.otherpkgs.pkglist that show the format.

Note: the otherpkgs postbootscript should by default be associated with every node. Use lsdef to check:

lsdef node1 -i postbootscripts

If it is not, you need to add it. For example, add it for all of the nodes in the "compute" group:

chdef -p -t group compute postbootscripts=otherpkgs

Set up a postinstall script (optional)

Postinstall scripts for diskless images are analogous to postscripts for diskful installation. The postinstall script is run by genimage near the end of its processing. You can use it to do anything to your image that you want done every time you generate this kind of image. In the script you can install rpms that need special flags, or tweak the image in some way. There are some examples shipped in /opt/xcat/share/xcat/netboot/. If you create a postinstall script to be used by genimage, then point to it in your osimage definition. For example:

chdef -t osimage mycomputeimage postinstall=/install/custom/netboot/rh/compute.postinstall

Set up Files to be synchronized on the nodes

Note: This is only supported for stateless nodes in xCAT 2.7 and above.

Sync lists contain a list of files that should be sync'd from the management node to the image and to the running nodes. This allows you to have 1 copy of config files for a particular type of node and make sure that all those nodes are running with those config files. The sync list should contain a line for each file you want sync'd, specifying the path it has on the MN and the path it should be given on the node. For example:

/install/custom/syncfiles/compute/etc/motd -> /etc/motd
/etc/hosts -> /etc/hosts

If you put the above contents in /install/custom/netboot/rh/compute.synclist, then:

chdef -t osimage mycomputeimage synclists=/install/custom/netboot/rh/compute.synclist

For more details, see Sync-ing_Config_Files_to_Nodes.
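For nodes that are already running, the sync list can also be pushed immediately, assuming your xCAT level supports the -F option of updatenode:

updatenode compute -F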

Configure the nodes to use your osimage

You can configure any noderange to use this osimage. In this example, we define that the whole compute group should use the image:

 chdef -t group compute provmethod=mycomputeimage

Now that you have associated an osimage with the nodes, you can list a node's attributes, including the osimage attributes, all in one command:

lsdef node1 --osimage

Generate and pack your image

There are other attributes that can be set in your osimage definition. See the osimage man page for details.

Building an Image for a Different OS or Architecture

If you are building an image for a different OS/architecture than is on the Management node, you need to follow this process: Building a Stateless Image of a Different Architecture or OS. Note: a different OS in this case means, for example, RHEL 5 vs. RHEL 6. If the difference is just an update level/service pack (e.g. RHEL 6.0 vs. RHEL 6.3), then you can build it on the MN.

Building an Image for the Same OS and Architecture as the MN

If the image you are building is for nodes that are the same OS and architecture as the management node (the most common case), then you can follow the instructions here to run genimage on the management node.

Run genimage to generate the image based on the mycomputeimage definition:

genimage mycomputeimage

Before you pack the image, you have the opportunity to change any files in the image that you want to, by cd'ing to the rootimgdir (e.g. /install/netboot/rhels6/x86_64/compute/rootimg). However, we recommend that you instead make all changes to the image via your postinstall script, so that they are repeatable.

The genimage command creates /etc/fstab in the image. If you want to, for example, limit the amount of space that can be used in /tmp and /var/tmp, you can add lines like the following to it (either by editing it by hand or via the postinstall script):

tmpfs   /tmp     tmpfs    defaults,size=50m             0 2
tmpfs   /var/tmp     tmpfs    defaults,size=50m       0 2

But probably an easier way to accomplish this is to create a postscript to be run when the node boots up, with the following lines:

logger -t xcat "$0: BEGIN"
mount -o remount,size=50m /tmp/
mount -o remount,size=50m /var/tmp/
logger -t xcat "$0: END"

Assuming you call this postscript settmpsize, you can add this to the list of postscripts that should be run for your compute nodes by:

chdef -t group compute -p postbootscripts=settmpsize

Now pack the image to create the ramdisk:

packimage mycomputeimage


Installing a New Kernel in the Stateless Image

The kerneldir attribute in the linuximage table is used to assign a directory to hold the new kernel to be installed into the stateless/statelite image. Its default value is /install/kernels. Create the directory if it does not exist, place the kernel packages in it, and genimage will pick them up from there.

The following examples assume that you have the kernel in RPM format in /tmp and that kerneldir is not set (so it takes the default value, /install/kernels).

This procedure assumes you are using xCAT 2.6.1 or later. The rpm names are an example and you can substitute your level and architecture. The kernel will be installed directly from the rpm package.


  • For RHEL:

The kernel RPM package is usually named kernel-<version>.rpm; for example, kernel-2.6.32.10-0.5.x86_64.rpm is the kernel package for 2.6.32.10-0.5.x86_64.


cp /tmp/kernel-2.6.32.10-0.5.x86_64.rpm /install/kernels/
createrepo /install/kernels/
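A subsequent genimage run then picks the kernel up from that repository. Because the RHEL rpm version and kernel version match, only -k is needed; this is a sketch in the same style as the SLES example below, and the interface, driver, and osver values are placeholders:

genimage -i eth0 -n tg3 -o rhels6.2 -p compute -k 2.6.32.10-0.5.x86_64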


  • For SLES:

Usually, the kernel files for SLES are separated into two parts, a base package and a main package (for example, kernel-ppc64-base and kernel-ppc64), and the naming of the kernel RPM packages is different. For example, there are two RPM packages in /tmp:

kernel-ppc64-base-2.6.27.19-5.1.x86_64.rpm
kernel-ppc64-2.6.27.19-5.1.x86_64.rpm

2.6.27.19-5.1.x86_64 is NOT the kernel version; 2.6.27.19-5-x86_64 is the kernel version. Follow this naming convention to determine the kernel version.

After the kernel version is determined for SLES, then:


cp /tmp/kernel-ppc64-base-2.6.27.19-5.1.x86_64.rpm /install/kernels/
cp /tmp/kernel-ppc64-2.6.27.19-5.1.x86_64.rpm /install/kernels/


Run genimage/packimage to update the image with the new kernel (using SLES as an example):


Since the kernel version is different from the rpm package version, the -g flag needs to be specified on the genimage command to give the rpm version of the kernel packages.

genimage -i eth0 -n ibmveth -o sles11.1 -p compute -k 2.6.27.19-5-x86_64 -g 2.6.27.19-5.1
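After genimage completes, repack the image so the nodes pick up the new kernel on their next boot (assuming the osimage-name form of packimage used earlier in this document):

packimage mycomputeimage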

Installing New Kernel Drivers to Stateless Initrd

The kernel drivers in the stateless initrd are used for the devices during the netboot. If you are missing one or more kernel drivers for specific devices (especially for the network device), the netboot process will fail. xCAT offers two approaches to add additional drivers to the stateless initrd during the running of genimage.

  • Use the '-n' flag to add new drivers to the stateless initrd
genimage <imagename> -n <new driver list>

Generally, the genimage command has a default driver list which will be added to the initrd. But if you specify the '-n' flag, the default driver list will be replaced with your new driver list. That means you need to include any drivers that you need from the default driver list in your new driver list.

The default driver list:

rh-x86:   tg3 bnx2 bnx2x e1000 e1000e igb mlx_en virtio_net be2net
rh-ppc:   e1000 e1000e igb ibmveth ehea
sles-x86: tg3 bnx2 bnx2x e1000 e1000e igb mlx_en be2net
sles-ppc: tg3 e1000 e1000e igb ibmveth ehea be2net

Note: With this approach, xCAT will search for the drivers in the rootimage. You need to make sure the drivers have been included in the rootimage before generating the initrd. You can install the drivers manually in an existing rootimage (using chroot) and run genimage again, or you can use a postinstall script to install drivers to the rootimage during your initial genimage run.
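For example, to keep the rh-x86 defaults and add one extra (hypothetical) driver, an invocation along the lines of the following could be used; the comma-separated list format and the added driver name are assumptions to illustrate the idea:

genimage mycomputeimage -n tg3,bnx2,bnx2x,e1000,e1000e,igb,mlx_en,virtio_net,be2net,qlge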

  • Use the driver rpm package to add new drivers from rpm packages to the stateless initrd

Refer to the doc Using_Linux_Driver_Update_Disk#Driver_RPM_Package.

Boot the nodes

nodeset compute osimage=mycomputeimage

(If you need to update your diskless image sometime later, change your osimage attributes and the files they point to accordingly, then rerun genimage, packimage, nodeset, and boot the nodes.)

Now boot your nodes...

rsetboot compute net
rpower compute boot
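To watch the nodes come up, the standard xCAT status and console commands can be handy, for example:

nodestat compute
rcons n1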

Useful Applications of xCAT commands

This section gives some examples of using key commands and command combinations in useful ways. For any xCAT command, typing 'man <command>' will give details about using that command. For a list of xCAT commands grouped by category, see xCAT Commands. For all the xCAT man pages, see http://xcat.sourceforge.net/man1/xcat.1.html .

Adding groups to a set of nodes

In this configuration, a handy convenience group would be the lower systems in the chassis, the ones able to read temperature and fan speed. In this case, the odd-numbered systems are on the bottom, so to do this with a regular expression:

# nodech '/n.*[13579]$' groups,=bottom

or explicitly

chdef -p n1-n9,n11-n19,n21-n29,n31-n39,n41-n49,n51-n59,n61-n69,n71-n79,n81-n89,\
n91-n99,n101-n109,n111-n119,n121-n129,n131-n139,n141-n149,n151-n159,n161-n167 groups="bottom"
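Either way, you can confirm the membership of the new group afterwards:

# nodels bottom | wc -l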

Listing attributes

We can list discovered and expanded versions of attributes (actual VPD values would appear instead of the asterisks):

# nodels n97 nodepos.rack nodepos.u vpd.serial vpd.mtm
n97: nodepos.u: A-13
n97: nodepos.rack: 2
n97: vpd.serial: ********
n97: vpd.mtm: *******

You can also list all the attributes:

# lsdef n97
Object name: n97
   arch=x86_64
        .
   groups=bottom,ipmi,idataplex,42perswitch,compute,all
        .
        .
        .
   rack=1
   unit=A1

Verifying consistency and version of firmware

xCAT provides parallel commands and the sinv (inventory) command to analyze the consistency of the cluster. See Parallel_Commands_and_Inventory.

Combining the use of in-band and out-of-band utilities with the xcoll utility, it is possible to quickly analyze the level and consistency of firmware across the servers:

mgt# rinv n1-n3 mprom|xcoll
====================================
n1,n2,n3
====================================
BMC Firmware: 1.18

The BMC does not have the BIOS version, so to do the same for that, use psh:

mgt# psh n1-n3 dmidecode|grep "BIOS Information" -A4|grep Version|xcoll
====================================
n1,n2,n3
====================================
Version: I1E123A

To update the firmware on your nodes, see xCAT iDataPlex Advanced Setup#Updating Node Firmware.

Verifying or Setting ASU Settings

To do this, see xCAT iDataPlex Advanced Setup#Using ASU to Update CMOS, uEFI, or BIOS Settings on the Nodes.

Managing the IB Network

xCAT has several utilities to help manage and monitor the Mellanox IB network. See Managing the Mellanox Infiniband Network.

Reading and interpreting sensor readings

If the configuration is louder than expected (an iDataplex chassis should nominally have a fairly modest noise impact), find the nodes with elevated fan speed:

# rvitals bottom fanspeed|sort -k 4|tail -n 3
n3: PSU FAN3: 2160 RPM
n3: PSU FAN4: 2240 RPM
n3: PSU FAN1: 2320 RPM


In this example, the fan speeds are pretty typical. If fan speeds are elevated, there may be a thermal issue. In a dx340 system, a reading near 10,000 RPM probably indicates either a defective sensor or a misprogrammed power supply.


To find the warmest detected temperatures in a configuration:

# rvitals bottom temp|grep Domain|sort -t: -k 3|tail -n 3
n3: Domain B Therm 1: 46 C (115 F)
n7: Domain A Therm 1: 47 C (117 F)
n3: Domain A Therm 1: 49 C (120 F)

Change tail to head in the above examples to find the slowest fans/lowest temperatures. Currently, an iDataplex chassis without a planar tray in the top position will report '0 C' for Domain B temperatures.

For more options, see the rvitals man page: http://xcat.sourceforge.net/man1/rvitals.1.html
