Disk and File System Administration

Last Change: 07/Sep/2006

SGI will likely have their own page on copying file systems on techpubs.sgi.com. However, the information given here, based on personal experience, is intended to reflect those things that typical admins and personal users actually want to know, especially beginners. As I add more information, this page will change and be reorganised. If you have useful tips and/or advice that you think should be added, then please contact me.

Installing an Operating System
Cloning A Root Disk
Dealing With Patch Files
Backing up a System
Miscellaneous Hints and Tips

Installing an Operating System

This section shows how to install IRIX 6.2 on an SGI. Though my own experience is with Indigo, Indy and Indigo2, the information here should be applicable to other systems which can use IRIX 6.2, eg. Crimson, etc. Installing other OS versions (eg. 5.3) involves a similar procedure, although there is a new installation method for installing IRIX 6.5 - I have a separate page that describes how to install IRIX 6.5.

Note: if the CPU in your system is a 'PC' type (ie. it has no secondary cache, such as R4600PC 133MHz), please read the extra note at the end of these instructions about RxxxxPC systems.

If you already have an OS installed, such as IRIX 5.3, I recommend against 'upgrading' in the normal sense, eg. running swmgr on a 5.3 Indy and upgrading using the 6.2 CDs. It is far better to carry out a clean installation from scratch; even the file system might be different (XFS instead of EFS in the case of changing from 5.3 to 6.2, although there is a version of 5.3 with XFS). Upgrading on top of an older OS means one is never really sure that the system is configured the way it should be.

Thus, the methodology for installing an OS which I use is as follows:

Back up all necessary information which makes the system unique (this may be entire subsections of the disk, user account areas, custom data directories, etc., or just certain key system files). Make a note of any customisations needed after reinstallation, eg. extra directories, changes/additions to /usr/local, etc.
Carry out a clean installation, ie. the disk is erased beforehand if necessary.
Alter the default configuration as desired, eg. root password, host name, IP address, YP services, etc. Restore the information backed up from the original disk.

Alternatively, do the installation on a separate disk and then copy over the relevant information from the old disk afterwards.

When I ran an Indy lab at the main University in Preston (where I live), most of the files I backed up were important ones in /etc, others in /var, some local home-made web pages, etc.

Here is a list of the files I backed up (depending on your configuration, other files may also be important to you, such as /etc/sendmail.cf):

    /.cshrc
    /.tcshrc
    /.Xresources
    /.jotrc
    /etc/TIMEZONE
    /etc/fstab
    /etc/hosts.equiv
    /etc/resolv.conf
    /etc/group
    /etc/hosts
    /etc/passwd
    /etc/sys_id
    /etc/config/timed.options
    /etc/init.d/network.local
    /var/yp/ypdomain
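
A list like the one above could be gathered into a single archive before the disk is wiped. The following is only a minimal sketch under stated assumptions: the function name existing_files and the archive path in the usage line are placeholders for illustration, not part of the procedure described here. It simply filters out names that do not exist on a particular machine, so that tar is not handed missing files.

```sh
#!/bin/sh
# Sketch: print only those arguments that exist as readable files.
# existing_files is an illustrative name, not a standard tool.
existing_files() {
    for f in "$@"
    do
        if [ -r "$f" ]; then
            echo "$f"
        fi
    done
}
```

A possible use, with a hypothetical archive location:

  tar cvf /var/tmp/sysfiles.tar `existing_files /.cshrc /etc/fstab /etc/hosts /etc/sys_id`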

It's worth noting that, in some cases, I was not intending to simply move the above backed-up files into the appropriate places once the new installation was complete. Rather, they would be useful in working out what the required changes to the new setup ought to be. When it actually came to configuring the new system, I used a more efficient method of setting up the individual machines - this is described later as an example of automation.

When reinstalling the server, extra files to be backed up included all necessary DNS/NIS data (/var/yp, /var/named, etc.) and of course user data.

Naturally, I also made a complete backup of the system to DAT (see below for details).

Using the two IRIX 6.2 CDs, I made a clean installation like this (my description assumes the presence of a locally-connected CDROM drive):

Power down the machine, make sure the CDROM is properly connected and the IRIX 6.2 Part 1 of 2 CD inserted into the CDROM drive.
Power on the machine. A message will appear about system maintenance. Press the stated key to select it (usually ESC) and select 'Command Monitor'.
After entering any password asked for, type in the following command which activates the program called 'fx' that allows one to repartition the disk:
```
  boot -f dksc(0,X,8)sashARCS dksc(0,X,7)stand/fx.ARCS --x
```
where X is the SCSI ID of the CDROM drive. Note that it's possible the CDROM is on a different SCSI channel, eg. an external CDROM on Indigo2, in which case use the appropriate SCSI controller number instead of 0. For example, the command for an Indigo2 with an external CDROM on ID 4 would be:
```
  boot -f dksc(1,4,8)sashARCS dksc(1,4,7)stand/fx.ARCS --x
```
Use the hinv command in the Command Monitor to identify the correct SCSI controller number and SCSI ID of your CDROM.
According to the fx man page, the above command sequences apply to systems with the 32bit ARCS PROM, namely R4K Indigo, R4K Indigo2, Indy, R4K Onyx, R4K Challenge and O2. For systems with the 64bit ARCS PROM (ie. Power Challenge, Power Onyx, Power Indigo2, Indigo2 IMPACT 10000 or R8000 Indigo2, Origin, Onyx2, OCTANE, etc.) use this command:
```
  boot -f dksc(0,X,8)sash64 dksc(0,X,7)stand/fx.64 --x
```
Older systems such as R3K Indigo are slightly different. In these cases, the sash file and fx file are named to correspond with the system's CPU IP number, eg. R3000 Indigo is an IP12 system. Thus, the correct command for R3K Indigo is:
```
  boot -f dksc(0,X,8)sashIP12 dksc(0,X,7)stand/fx.IP12 --x
```
Note that I have sometimes seen this command fail with an error about the CDROM not being ready, or an R4K Indigo might give an error about wrong architecture. This happens more often on older systems such as Indigo when the CDROM being used is a more modern model. Usually the problem can be solved just by repeating the command by entering '!!' (without the quotes) - for some reason, just entering the command a second time makes it work ok. Alternatively, when an error occurs like this, I've often found that doing the command sequence in two stages instead of all at once can fix the problem too, ie. first boot up into the sash (I'm using the R3K Indigo example here, assuming CDROM on SCSI ID 4):
```
  boot -f dksc(0,4,8)sashIP12
```
Then, once at the sash prompt, boot into fx:
```
  boot -f dksc(0,4,7)stand/fx.IP12 --x
```
I think R3K Indigo, and probably earlier systems, are more fussy about how the CDROM behaves. I certainly found that reliably installing 5.3 on an R3K Indigo was only possible with very early models of CDROM (1X or 2X Toshiba), though I expect 5.3 could handle later models of CDROM if one applied all relevant 5.3 updates and patches.
The fx program should now have started up. You will be presented with a prompt such as this:
```
  fx: "device-name" = (dksc)
```
Assuming you're using a system whose system disk is on SCSI ID 1, the default settings will be correct, so just press Return in answer to the initial questions (dksc, ctlr, drive and lun). If this isn't the case (use hinv to check), then just enter the correct values when asked, eg. the system disk in an R10000 O2 is always on SCSI ID 2.
From the fx menu, select repartition (r), then root drive option (ro). You'll be asked about file system type - press Return for xfs. You'll then be asked to confirm the request - enter 'yes' (have you remembered to backup the system?). The disk will be reconfigured. Now enter '..' to return to the initial menu, enter 'l' to set a new disk label, enter 'sy' to write the information to the disk, and finally enter '/exit' to quit fx. After a pause, you will once again be presented with the system maintenance menu. By the way, believe me when I say, once you get used to using fx in this way, dealing with disks using fx and mkfs becomes very easy indeed. I can perform this fx procedure now in just a few seconds.
Select 'Install System Software' (usually just by pressing '2'). Confirm the source as CDROM and select 'Continue' (pressing Enter twice is often sufficient). A progress bar will appear, showing the loading of the installation tools.
The system will attempt to create a mini UNIX OS setup, stored inside the swap partition. However, it can't do this if there is not yet any valid file system on the disk. If you see a message about there being no valid file system present, then do the following:
- Answer yes to the question about whether you would like to create a new file system.
- Select xfs as the file system type.
- If the disk size is smaller than 4GB then enter 512 as the block size (otherwise enter 4096).
The system will create a new file system using mkfs. If this is what happens when you do an installation, then there is no need to manually create a new file system from within the inst program since it's already been done, in which case go to step 9.
inst will run automatically. An information file will be shown about IRIX 6.2 - quit out of the script by pressing 'Q' (to stop the text from paging) and then enter '2'. Once at the inst prompt, enter 'sh' to obtain a UNIX shell to begin the process of creating a new file system.
From the shell prompt, enter the following:
```
  umount /root
  mkfs /dev/dsk/dks0d1s0
```
Compared to IRIX 5.3, this operation is executed very quickly under 6.2 - this is because the XFS file system works in a completely different way to the old EFS file system.
Note that if you are using a disk which is smaller than 4GB (eg. 549MB, 1GB, 2GB, etc.) then it is better to have a disk block size of 512 instead of the default 4096. So, for small-size disks, use this command instead of the mkfs sequence shown above:
```
  mkfs -b size=512 /dev/dsk/dks0d1s0
```
Now remount the root disk on /root:
```
  mount /dev/dsk/dks0d1s0 /root
```
and exit the shell (enter 'exit', or press CTRL+D) to return to the inst program.
You will now be back at the inst prompt.
Enter '1' to select the 'From' option. The default is /CDROM/dist so press Return in answer to the question if this is correct. The 6.2 startup script README will appear again. Quit the script and exit without running it by pressing 'Q' and then entering '2'. The product descriptions will be read from the CD, and then the inst prompt will appear again after some extra message concerning product dependencies, file sizes, etc.
This next step is very important; enter the following:
```
  set delay_conflicts on
```
Because 6.2 comes on two CDs, it obviously isn't possible to install everything with just one 'run' of inst (6.5 handles this in a different way), unless you happen to have two CDROMs attached, or the data has been placed together on a single disk. Thus, one must tell inst to ignore any conflicts it finds while installing the first CD - any conflicts are automatically dealt with when installing the 2nd CD.
Now enter:
```
  install default
```
One could use option 7 ('Step') to manually decide which items to install, but it's much easier to do a default installation and then deal with the specifics later when one can use the GUI tools, perhaps even from a different SGI if the target system is a server. Note that a default installation takes up about twice as much space as a minimum installation. If your disk is small and you want to install as little as possible, then don't enter 'install default' after setting delay_conflicts to on.
Enter 'go' to begin the installation. It will take some time and various messages will be displayed. You might see messages that seem like errors (usually concerning Netscape), but just ignore them.
When the installation of the first CD has finished, eject the CD and insert the 'IRIX 6.2 Part 2 of 2' CD. Select 'From' in the same way (enter '1'), quit out of any script README, enter 'install default' as before (if you did for the first CD) and then enter 'go'.
At this point, it's quite common for a conflict message to be displayed concerning the installation of xlators_3d.doc.web_page (it can't be installed because it relies on part of Netscape that's not present on the 2nd CD). If this happens, just enter 'conflicts 1a' to resolve the conflict, and then commence the installation with 'go'.
When the installation has finished, enter 'quit' to exit inst. The system will do some processing for a while (see the man page for rqsall). After the rqsall procedure has completed, you will be asked whether you want to restart the system - answer 'yes' (or just enter 'y').

And that's it! Once you've done this procedure once, it becomes very much second nature to do it again. I've done it hundreds of times now.
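
The fx boot commands given in the steps above differ only in the SCSI controller number, the CDROM's SCSI ID and the suffix on the sash and fx file names (ARCS, 64, or an IP number such as IP12). As a hedged sketch, a small helper run on any other UNIX machine could print the line to type at the Command Monitor. The function name fx_boot_cmd is made up for illustration; the command text it prints is exactly the pattern shown earlier.

```sh
#!/bin/sh
# Sketch: print the Command Monitor line that starts fx from the CD.
# Usage: fx_boot_cmd <controller> <scsi-id> <suffix>
# <suffix> is ARCS (32bit ARCS PROM), 64 (64bit ARCS PROM) or an IP
# number such as IP12. fx_boot_cmd is an illustrative name only.
fx_boot_cmd() {
    ctlr=$1; id=$2; suffix=$3
    echo "boot -f dksc($ctlr,$id,8)sash$suffix dksc($ctlr,$id,7)stand/fx.$suffix --x"
}
```

For example, 'fx_boot_cmd 1 4 ARCS' prints the Indigo2 external-CDROM command shown earlier. Use hinv first to find the real controller number and SCSI ID.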

If you're installing onto a server system such as Challenge S, then after the initial installation has finished, the most effective thing to do is to alter just enough of the system files so that one can login to the server remotely from another SGI (all you need to change is /etc/sys_id and /etc/hosts, then reboot). This allows one to use the GUI tools (Software Manager) to begin the task of installing further software, configuring the DNS, NFS, NIS and so on.

Tip: install any desired patches last, usually the current Required/Recommended patch set CD.

Installing an OS on Multiple Clients

Installing a new OS and system software on multiple client machines (Indys in this example) is very similar, except that some software probably doesn't need to be installed, eg. the Berkeley DNS system. To save a lot of time and effort, it is considerably easier to create a new client root disk and then clone the information to other disks. See below for details.

I mentioned earlier that I had an easy way of configuring the individual machines in the Indy lab I ran at Preston Uni. Basically, I used a script file which, given a 'target' system name, would install the necessary files with the appropriate changes in the right places automatically. I can't claim the script was super-efficient (it contained no error-checking) but it did the job and saved me a lot of time and effort. Here's how I did it...

The server was already configured and ready with NFS, NIS, DNS, etc. all successfully running. All other machines except the 5.3 admin Indy were shut down, cleaned, and the disks ready to clone (I installed 6.2 on the admin Indy last).

Because I knew that many of the files which make a system what it is will be common to all machines, I made a directory called 'CLONE' which was copied to every machine's root disk during the main disk cloning process. The CLONE directory resided in /var/tmp. It contained the following files:

  .Xresources  .cshrc  .jotrc  .rhosts  go*
  etc/
    bootptab       fstab          hosts.equiv    resolv.conf
    bootptab.msk   group          socks.conf     TIMEZONE
    bootptab.tmp   hosts          passwd         sys_id.msk
  etc/config/timed.options
  etc/init.d/network.local*
  var/yp/ypdomain
  var/netls/nodelock

The main thing which makes a machine an individual entity is /etc/sys_id. The sys_id.msk file contains all the different host names for the 19 machines I had to configure. The script 'go' is given the target machine name as a single parameter. grep uses this name to select the appropriate line from sys_id.msk and the result is redirected into the new /etc/sys_id file. In a similar way, the /etc/bootptab file is created by grepping the appropriate line from bootptab.msk with the target name, combining it with the initial text that /etc/bootptab always has, and dumping the result into a new /etc/bootptab file.

The other actions taken by the script file include copying the various configuration files to their appropriate locations, setting up the necessary K39 and S31 network links for the network.local file (you may or may not be using a static route), chkconfigging on particular flags and erasing certain portions of /var/www (this is because the system had /var/www NFS-mounted; everything in there was removed, except the /var/www/server directory to allow the system to be booted in standalone mode should it ever be necessary).

Here is the 'go' script in detail. It contains many 'echo' statements so that I could see the script's progress when executed, and also for debugging purposes:

#!/bin/sh
echo Target system: $1
echo Copying hidden files...
cd /var/tmp/CLONE
/bin/cp .Xresources .cshrc .jotrc .rhosts /
echo Copying etc files...
cd etc
echo sys_id...
grep $1 sys_id.msk > /etc/sys_id
echo bootptab...
grep $1 bootptab.msk > bootptab.tmp
cat bootptab bootptab.tmp > /etc/bootptab
echo fstab, group, hosts, hosts.equiv, passwd, resolv.conf, TIMEZONE...
/bin/cp fstab group hosts hosts.equiv passwd resolv.conf TIMEZONE /etc
echo timed.options...
cd config
/bin/cp timed.options /etc/config
echo network.local...
cd ../init.d
/bin/cp network.local /etc/init.d
echo K39/S31 network links...
ln -s /etc/init.d/network.local /etc/rc0.d/K39network
ln -s /etc/init.d/network.local /etc/rc2.d/S31network
echo Copying var files...
echo 'nodelock (but not overwritten old nodelock)...'
cd ../../var/netls
/bin/cp nodelock /var/netls/nodelock.new
echo ypdomain...
cd ../yp
/bin/cp ypdomain /var/yp
echo Creating mounting directories...
cd /
mkdir mapleson
mkdir home
echo 'Erasing /var/www stuff (do this only after cloning next disk)...'
cd /var/www
echo Erasing cgi-bin...
/bin/rm -rf cgi-bin
echo Erasing conf...
/bin/rm -rf conf
echo Erasing htdocs...
/bin/rm -rf htdocs
echo Changing chkconfig flags...
chkconfig directoryserver on
chkconfig network on
chkconfig nfs on
chkconfig verbose on
chkconfig yp on
chkconfig videod on
echo Done.
echo 'You are now ready to reboot, erase /var/tmp/CLONE if no more'
echo 'cloning is required, and reboot again.'

The script was run on each machine only after all disks had been successfully cloned. A typical command sequence, after turning on the system and logging in as root, would look like this:

  cd /var/tmp/CLONE
  ./go AKIRA

where 'AKIRA' is the name of one of the Indys I ran.
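
Since the 'go' script has no error-checking, a plain grep would also match a longer name that contains the target (a hypothetical AKIRA2, say), or silently write an empty /etc/sys_id if the name were mistyped. The following is a minimal sketch of a stricter lookup, not the script used in the lab. It assumes the first whitespace-separated field of each line in the mask file is the host name, and the function name pick_host is invented for illustration.

```sh
#!/bin/sh
# Sketch: select exactly one line for a host from a mask file, refusing
# to continue when the name is missing or ambiguous.
# Assumption: the first field on each line is the host name.
pick_host() {
    name=$1; file=$2
    if [ -z "$name" ] || [ ! -r "$file" ]; then
        echo "usage: pick_host <hostname> <mask-file>" >&2
        return 2
    fi
    # Compare the first field exactly rather than as a substring.
    count=`awk '$1 == n { c++ } END { print c + 0 }' n="$name" "$file"`
    if [ "$count" -ne 1 ]; then
        echo "pick_host: expected one entry for $name, found $count" >&2
        return 1
    fi
    awk '$1 == n' n="$name" "$file"
}
```

Inside a script like 'go' it might be used as follows, capturing the line first so that /etc/sys_id is only written when the lookup succeeds:

  line=`pick_host "$1" sys_id.msk` || exit 1
  echo "$line" > /etc/sys_id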

Installing IRIX 6.2 on RxxxxPC Systems

There is a problem with the later kernel rollup patches that affects systems which have CPUs with no secondary cache (R4600PC 100MHz, R4600PC 133MHz and R5000PC 150MHz). A system with one of these CPUs which has a kernel rollup patch later than 2777 installed (eg. 3110 or 3156) will experience kernel faults (see my Miscellaneous System Problems page for full details).

Tip: during a 6.2 OS installation, when it comes to installing the 'Required/Recommended' patches CD, install patch 2777 first from a different source (eg. the 'IRIX 6.2 Development Foundation 1.1' CD, or the 'Varsity Update 1 of 1, August 1998' CD) and then install the automatically selected patches from the latest patch set CD except the kernel rollup patch.

SGI will likely release a newer kernel rollup at some point which solves this problem.

Cloning A Root Disk

NOTE: when the root disk in an IRIX system is swapped for another drive, for example after a cloning procedure has been done, sometimes the PROM loses the setting for OSLoadFilename, which prevents the system from booting. One of several strange-sounding errors can be reported when this occurs. Thus, if you use the cloning procedure outlined here, swap the disk but then the system does not boot, then go into Command Monitor and enter 'printenv' to make sure that the OSLoadFilename variable is still set to '/unix'. If it is blank, then enter:

   setenv OSLoadFilename /unix

Then exit Command Monitor and press 1 to boot the system.

Sometimes, it's necessary to make an exact copy of a disk, perhaps for backup purposes or perhaps during an OS upgrade, eg. a single client machine has a new OS installed and its disk is then cloned to all other client disks before individual changes are made. This is a very good way of ensuring that all client systems have identical software setups and is a lot faster than installing products manually on each machine. The procedure itself is easy, so if you're someone who has dozens of systems to configure, then don't panic! The info here should help. Note that SGI's TechPubs site has further information.

There are two main ways to clone a disk: using xfsdump in conjunction with xfsrestore, or by using the tar command. The xfsdump method is listed first as it's faster and has other advantages. Sometimes though, the xfsdump method is not appropriate, in which case tar is used - example scenarios are explained later. xfsdump is just better at handling device files, etc. whereas tar can cause problems if not used properly.

In the description given here, I'm assuming certain things:

The disk to be cloned is a root (system) disk on SCSI ID 1 on controller 0. Cloning option disks is a very similar process - specifics connected with this are given later.
The disk to be copied onto is installed as SCSI ID 2 on controller 0.
The file system type on both disks is XFS (if your source file system is EFS, then use the tar method given later).

Boot up the system and login as root. Obtain a UNIX shell.

Create a mount point:

  mkdir /0

Use fx to repartition the extra disk (don't include my comments):

  cd
  fx -x                             # Run fx
  <Enter>                           # Select dksc
  <Enter>                           # Select controller 0
  2                                 # Select drive 2
  <Enter>                           # Select lun 0
  r                                 # Select repartition option
  ro                                # Select root drive option
  <Enter>                           # Select XFS
  yes                               # Yes, continue with the operation
  ..                                # Return to the main menu
  l                                 # Create a new label
  sy                                # Write out the new label
  /exit                             # Exit fx

Use mkfs to create a new file system:

  mkfs -b size=512 /dev/dsk/dks0d2s0

Note that if the disk is 4GB or larger, then exclude the block size definition, ie. just enter:

  mkfs /dev/dsk/dks0d2s0
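
The block-size rule of thumb above can be written down as a tiny helper that prints the mkfs command for a given disk size. This is a sketch only: mkfs_cmd is an illustrative name, and treating "4GB" as 4096MB is an assumption, so adjust the threshold to suit how your disk sizes are quoted.

```sh
#!/bin/sh
# Sketch: print the mkfs command for a disk of the given size in MB,
# using 512-byte blocks below 4GB and the default block size otherwise.
# Assumption: 4GB is taken to mean 4096MB.
mkfs_cmd() {
    dev=$1; size_mb=$2
    if [ "$size_mb" -lt 4096 ]; then
        echo "mkfs -b size=512 $dev"
    else
        echo "mkfs $dev"
    fi
}
```

The helper only prints the command, so the result can be checked by eye before it is actually run.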

Mount the destination disk:

  mount /dev/dsk/dks0d2s0 /0

Confirm the amount of space available with 'df -k'.

Now begin the copy process:

  cd /0
  xfsdump -l 0 -p 5 - / | xfsrestore - .

This specifies a Level 0 dump (all files), progress report every 5 seconds, acting on the root file system. xfsdump sends the data to the standard output (by the use of the '-' character); this is piped to xfsrestore which is getting its data from the standard input (again by the use of the '-' character).

NB: I often find it useful to know how long these copy procedures take, eg. planning whether or not one has enough time to do multiple systems, etc. Thus, I always use the timex command to report how long the copy process lasted. Just put timex as the first command, ie. instead of the above, enter:

  cd /0
  timex xfsdump -l 0 -p 5 - / | xfsrestore - .

It doesn't make any difference to the copy process, but it can be useful to have an appreciation for how long these tasks take.
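
One detail of shell pipelines is worth knowing here: a prefix command such as timex applies only to the first command in the pipeline, so the figure reported is strictly that of xfsdump, which finishes at very nearly the same time as xfsrestore. If the time for the pipeline as a whole is wanted, the pipeline can be handed to a child shell. The following is a hedged sketch; the function name time_pipeline and the TIMER variable are illustrative, and TIMER exists only so the wrapper can be tried on systems that lack timex.

```sh
#!/bin/sh
# Sketch: run a complete pipeline under a timing command by giving the
# whole pipeline, as one string, to a child shell.
# TIMER defaults to timex; both names here are illustrative.
time_pipeline() {
    ${TIMER:-timex} sh -c "$1"
}
```

Possible usage for the copy shown above:

  cd /0
  time_pipeline 'xfsdump -l 0 -p 5 - / | xfsrestore - .'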

Tip 1: if you're doing all this in a standard xterm, make it wider so that the progress messages don't get wrapped onto the next line. It's easier to read.

Tip 2: if you're using a multi-CPU system, remember you can use the runon command to force the copy process to run on a particular CPU. It's best to choose a CPU that's closest to the SCSI controller(s) involved in the copy process as this minimises system traffic. This is more relevant to newer systems such as Origin, Onyx2, etc. On older systems like Onyx and Challenge, it's more useful simply as a way to prevent the default CPU 0 being used to do everything, eg. for a 4-CPU deskside one might run the task on CPU 3 thus:

  cd /0
  runon 3 timex xfsdump -l 0 -p 5 - / | xfsrestore - .

Finally, the volume header information from the root disk must be copied onto the target disk, though one could do this while the copying is going on. Enter the following:

  cd /stand
  dvhtool -v get sash sash /dev/rdsk/dks0d1vh
  dvhtool -v get ide ide /dev/rdsk/dks0d1vh
  dvhtool -v creat sash sash /dev/rdsk/dks0d2vh
  dvhtool -v creat ide ide /dev/rdsk/dks0d2vh

There may be a symmon entry in the volume header too, in which case enter these extra commands:

  dvhtool -v get symmon symmon /dev/rdsk/dks0d1vh
  dvhtool -v creat symmon symmon /dev/rdsk/dks0d2vh

Try the 'get symmon' command above; if it gives a not-found error, then there isn't any symmon entry present, so don't bother with the creat command.
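
The get/creat pairs above follow the same pattern for every entry, so they could be wrapped in a loop. This is a sketch under an assumption that has not been verified here: that dvhtool returns a non-zero exit status when 'get' cannot find an entry. The function name copy_vh_entries and the DVHTOOL variable are illustrative; DVHTOOL exists only so that the loop can be dry-run with echo.

```sh
#!/bin/sh
# Sketch: copy named volume header entries from one disk to another using
# the same dvhtool get/creat pair as above. Run it from a directory such
# as /stand, since each entry is saved to a file in the current directory.
# Assumption (unverified): dvhtool exits non-zero if 'get' finds nothing.
copy_vh_entries() {
    src=$1; dst=$2; shift 2
    for entry in "$@"
    do
        if ${DVHTOOL:-dvhtool} -v get "$entry" "$entry" "$src"; then
            ${DVHTOOL:-dvhtool} -v creat "$entry" "$entry" "$dst"
        else
            echo "skipping $entry: not present in $src" >&2
        fi
    done
}
```

Possible usage, first as a dry run that only prints the arguments, then for real:

  cd /stand
  DVHTOOL=echo copy_vh_entries /dev/rdsk/dks0d1vh /dev/rdsk/dks0d2vh sash ide symmon
  copy_vh_entries /dev/rdsk/dks0d1vh /dev/rdsk/dks0d2vh sash ide symmon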

Alternatively, one can copy the volume header interactively, which does have the advantage of being able to see exactly what is present in the volume header. Also, some systems will have other entries besides sash, ide and symmon, eg. Octane will often have a file called IP30prom. Thus, the interactive method is what I usually use. Here is what to enter (exclude my comments of course):

  cd /stand
  dvhtool /dev/rdsk/dks0d1vh      # Access the system disk volume header
  vd                              # Switch to a different menu
  l                               # List contents of volume header
  g sash sash                     # Copy volume header entries to disk;
  g ide ide                       # If the 'l' command shows other entries
  g symmon symmon                 # besides these, then copy them too.
  quit                            # Exit from this session...
  quit
  dvhtool /dev/rdsk/dks0d2vh      # Access the destination disk
  vd
  l
  d sash                          # Delete old entries (if any are shown
  d ide                           # to be present by the l command),
  d symmon                        # including any besides sash, ide and symmon.
  a sash sash                     # Copy new entries to destination volume header...
  a ide ide
  a symmon symmon
  quit
  write                           # Write out the changes
  quit

In fact, the amount of typing required for the interactive method is less, so that's another advantage.

Note that 5.3 handles the sash in a slightly different way from 6.2/6.5, so if the disk to be copied is a 5.3 installation, then the volume header copy operation can be compacted to:

  cd /0
  dvhtool -v creat /stand/sash sash /dev/rdsk/dks0d2vh

And that's it! The machine can now be powered down and the cloned disk removed. Don't forget to change the clone disk's SCSI ID to 1, though on many systems that is done automatically via the use of a disk sled.

Most of the time, using xfsdump is the best, fastest and most efficient way to clone a disk. However, sometimes it may not be appropriate, eg. if the file system spans several disks but the destination is just a single disk (xfsdump only dumps a single named file system). In such circumstances, using tar is the main alternative.

However, using tar requires some special measures: all NFS mounts should be unmounted beforehand, as should the /proc file system. Also, any CDROMs and other media should be ejected from their respective devices.

Here is what to enter after the fx/mkfs procedure, making the /0 mount point and mounting the target disk:

  umount /proc
  cd /
  tar cvBpf - . | (cd /0; tar xBpf -)

This command recursively copies the root disk or file system onto the extra disk. By recursive I mean that it also copies /0 into /0; however, at the time this is done, the only items in /0 are some hidden files (because the copy process hasn't yet alphabetically reached anything else), so not much extraneous data is copied. This is why I use /0 as a mount point: if the extra disk was mounted on /disk2 and there was a directory such as /Data or /Alias containing a lot of data, then a lot of unnecessary copying would occur, and the copy procedure might even fail due to running out of disk space. The character '0' comes before just about everything else in the ASCII character set, so these problems are prevented.

Note that one definitely does not want to try and tar over /proc since /proc does not contain 'real' files - the entries in /proc relate to process information, used, for example, by the 'ps' command and 'killall'. The entries appear as very large files even though they're not; they are effectively images of running processes; tar cannot understand this and chokes on them, so one should unmount /proc before beginning the tar procedure.

Anyway, after the tar process has finished, enter the following to remount /proc and remove the unwanted '0' directory that's inside /0:

  /etc/mntproc
  cd /0
  /bin/rm -rf 0

Using tar does have the advantage that one can see the files being copied, which is good feedback on the copying process. However, as the various necessary commands demonstrate, tar is sensitive to issues such as NFS mounts, /proc, mounted removable media, etc. After the cloning has finished, copy over the volume header information just as for the xfsdump method.
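
As a hedged sketch, the tar-to-tar pipe can be wrapped so that the source and destination directories are explicit and checked before anything is copied. The function name tar_copy is illustrative; the 'v' flag is dropped to keep the output quiet, and unmounting /proc and any NFS mounts beforehand is still up to whoever runs it.

```sh
#!/bin/sh
# Sketch: the same tar pipe as above with explicit, checked directories.
# tar_copy is an illustrative name. It does not unmount /proc or NFS
# mounts; that must still be done beforehand.
tar_copy() {
    src=$1; dst=$2
    if [ ! -d "$src" ] || [ ! -d "$dst" ]; then
        echo "tar_copy: both $src and $dst must be existing directories" >&2
        return 1
    fi
    # Each side changes directory in its own subshell, so the caller's
    # working directory is left alone.
    (cd "$src" && tar cBpf - .) | (cd "$dst" && tar xBpf -)
}
```

For the root disk clone described above, the call would be 'tar_copy / /0'.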

Option Disks

The procedure for copying option disks is similar, except that the partition number will be 7 instead of 0 (remember to select 'Option Drive' from within fx) and one does not need to worry about the volume header, since an option disk carries no boot files such as sash.
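
The device names used throughout this page follow a fixed pattern of controller, SCSI ID and partition, so the root/option difference can be captured in a short helper. This is a sketch only; fs_device is an illustrative name, not an IRIX command.

```sh
#!/bin/sh
# Sketch: build the block device name for a disk's file system partition:
# partition 0 for a root drive, partition 7 for an option drive.
# fs_device is an illustrative name.
fs_device() {
    ctlr=$1; id=$2; role=$3
    case "$role" in
        root)   part=0 ;;
        option) part=7 ;;
        *)      echo "fs_device: role must be root or option" >&2
                return 1 ;;
    esac
    echo "/dev/dsk/dks${ctlr}d${id}s${part}"
}
```

For example, 'fs_device 0 2 option' gives the name to pass to mkfs and mount for an option disk on controller 0, SCSI ID 2.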

How to clone lots of disks the easy way!

The following is useful if one has many disks to clone, eg. every client in the lab I ran is being upgraded from 5.3 to 6.2. This is the example I used at the time using tar; these days I would use xfsdump instead.

The answer? Use a script file! Here is the script I used, stored in an executable file called 'diskcopy' which is placed in the root directory:

  #!/bin/sh
  echo
  echo WARNING: this script assumes the 'fx -x' procedure has
  echo already been performed on the target disk!
  echo
  echo Making file system on /dev/dsk/dks0d2s0...
  mkfs -b size=512 /dev/dsk/dks0d2s0
  echo Mounting /dev/dsk/dks0d2s0 on /0...
  mount /dev/dsk/dks0d2s0 /0
  echo Unmounting /proc...
  umount /proc
  echo Changing to root dir...
  cd /
  echo Copying...
  tar cvBpf - . | (cd /0; tar xBpf -)
  echo
  echo Changing to /0...
  cd /0
  echo Removing unwanted contents of /0/0...
  /bin/rm -rf 0
  echo Recreating /0/0...
  mkdir 0
  echo Changing to root dir...
  cd /
  echo getting the sash from the root disk...
  dvhtool -v get sash /stand/sash /dev/rdsk/dks0d1vh
  echo Changing to /0...
  cd /0
  echo Writing the sash to the target disk...
  dvhtool -v creat /stand/sash sash /dev/rdsk/dks0d2vh
  echo Remount /proc...
  /etc/mntproc
  echo Now power down the system and remove the cloned disk.
  echo "Remember to switch the cloned disk's SCSI ID back to 1."
  echo NB: after installing the cloned disk into the next machine
  echo and powering on the system, remember to login as root and
  echo do an immediate reboot before running this script again -
  echo this will install the new unix.install file.
  echo

(I have a lot of echo comments in my scripts so I can see what is going on)

So how is the above script used? Here is an example of cloning four disks (W, X, Y and Z), using 2 Indys (A and B):

1. With the source disk W in system A on SCSI ID 1 (this is the disk to be copied), install the first target disk X on SCSI ID 2 in system A.
2. Power on the system.
3. Login as root, create a directory called /0 and copy the above script into a file called 'diskcopy' in the root directory. Make sure the diskcopy file is executable (enter 'chmod u+x diskcopy'). This step is only performed once.
4. Run 'fx -x' and perform the procedure described earlier to repartition the extra disk as a root disk.
5. Enter this command:
```
   ./diskcopy
```
   And that's it! The script makes the file system on the target disk, mounts the disk, unmounts /proc, copies the data, removes the recursive garbage, copies over the sash, etc. The copying process takes about 15 to 20 minutes for an almost-full 549MB 4500rpm disk.
6. When the script has finished, power off the machine.
7. Remove the target disk, change its SCSI ID back to 1 and install it in system B.
8. Install disks Y and Z in systems A and B, both on SCSI ID 2.
9. Power on both systems.
10. For system A, go back to step 4 to begin the cloning process again. For system B, login as root and reboot the system to begin using the new kernel (the unix.install file), then go back to step 4.

The important point here is that both systems are copying at the same time. The script makes sure no errors are made and saves a lot of time and typing. The only step that takes any time is the fx procedure, but even that takes just a few seconds (fx can actually use scripts too, so I will improve the script given here at a later date). Once the two systems have finished their scripts, 4 disks are ready for use. For larger numbers of systems and disks, just repeat the procedure on extra systems with further disks. I hope you can see how this procedure can very quickly copy dozens of disks, ie. 4 disks would clone to 8, then 8 to 16, then 16 to 32, 32 to 64, 64 to 128. One could configure 128 half-gig-disk Indys in just a few hours.
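
As a rough check on the doubling arithmetic above, the number of cloning rounds can be counted with a few lines of shell. This is only a sketch: the function name rounds_needed is made up for illustration, and it assumes a POSIX-style shell with $(( )) arithmetic, which the stock /bin/sh on an older IRIX release may not provide.

```sh
#!/bin/sh
# rounds_needed START TARGET
# Counts how many rounds of "every disk clones one more disk" are needed
# to grow from START disks to at least TARGET disks.
# (Hypothetical helper, not part of the diskcopy script above.)
rounds_needed() {
    disks=$1
    rounds=0
    while [ "$disks" -lt "$2" ]; do
        disks=$((disks * 2))      # each existing disk produces one copy
        rounds=$((rounds + 1))
    done
    echo "$rounds"
}

rounds_needed 4 128    # prints 5
```

At 15 to 20 minutes of copying per round plus the disk swapping, five rounds fits comfortably within the "few hours" mentioned above.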

Dealing With Patch Files

What's said here may or may not be what you want to know, but it's based on personal experience and so should be useful.

Firstly, installing patch files can sometimes use a lot of RAM. I once installed patch 2262 on a 6.2 Indy with 64MB; rqsall was swapping out to disk repeatedly, at times grabbing as much as 40MB. So, if you can, and assuming you've more than one patch file to do (not worth the hassle of opening up machines if one only has to install a small number of patches), temporarily increase the memory in the target system. I increased it from 32MB to 64MB and then again to 96MB after seeing what patch 2262 was doing.

Ah yes, patch files, the sysadmin's nightmare. Which to use? What must be installed before a particular patch file can be used? What are the incompatibilities?

The instructions that come with patch subsystems always say that one usually only wants to install patches for problems that one has encountered, but if one is installing a new OS it makes sense to me to install the entire patch set as a preventative measure before the system starts really being used. Besides, SGI themselves recommend installing a complete patch set.

Install an entire patch set you say? Well, not all is bad news. For a start, many patches won't be relevant to your system because of hardware options you don't have, systems you're not using (eg. I2 IMPACT) or software that isn't present (eg. 64bit libs). In my case, I installed around two-dozen patches after initially installing 6.2.

The first time I dealt with the patches for 6.2, I wasn't really bothered with the order in which I installed things and basically attempted to install the whole lot at once. Something went wrong; patches can be finicky things and I had problems - after installing a block of ten or so patches, the Indy would boot up with a network memory error and a tiny core dump (200K) was created. Well, I strive for perfection in my system so this wasn't acceptable. I did everything again from scratch and all was ok.

In fact, since that time, I've had occasion to install complete patch sets many, many times using the install script, and I've had no problems since. I think I was just unlucky the first time round.

Even so, some things do need to be said about patch sets. When one activates a patch CD from swmgr (or inst), I often find that some subsystems are selected for installation which cause conflicts, or patches are selected which just aren't needed at all. These conflicts are almost always due to the absence of software that some part of a particular patch expects to be present. So, if you get conflicts, don't panic; just check through the selected patches and see if there are any subsystems which don't need to be selected (64bit versions of things are the most common culprit), or just select the appropriate options in the conflicts window.

Sometimes, parts of patches are selected which are actually older versions of installed software. I think things like this occur because the scripting process occasionally selects all of the contents of a patch, rather than just the relevant parts of it. I've also seen patches selected which just aren't needed at all - why this happens I don't know. When I did a complete 6.2 installation on an Indy, ie. including the IDO and Varsity set, I observed these patches being selected by the auto-script (from the 'September 1998 Required/Recommended CD') that shouldn't have been selected:

ISDN/PPP patch (I'm not using either),
Performer patch (again, Performer wasn't present),
xlock security fix patch (older version),
MIPS Pro Compiler patch (older version).

Either way, I do now always run the script that one is prompted with, but afterwards I always check each selected patch before commencing the installation to make sure:

No 'older version' software has been selected,
There aren't any conflicts due to missing software,
No patch has been selected for a software system that isn't being used.

When a conflict occurs because of a missing software element, don't be alarmed. It's far more likely to be the case that the software element is something you don't need, so just deselect that part of the patch.

Some admins may feel that they only want to install particular patches as opposed to a patch set, eg. security patches. This is fine, but do take care to ensure there aren't any conflicts, and that you don't overwrite software with older versions.

Usually, people will have patch CDs from which to obtain patch files, but if you don't then you can grab many patches from SGI's ftp site(s):

  ftp://patches.sgi.com/support/patchset/
  ftp://patches.sgi.com/support/patchset/.allrecpatch/

On several occasions, I've pointed people there who wanted patches and had no other means of obtaining them.

Notes:

Can/Should I install patch 2187 on a Challenge S?

If you happen to be someone who's not sure about installing patch 2187 on a Challenge S or Power Challenge M, it's probably because you've spotted this sentence in the release notes:

  This patch contains bug fixes for Crimson and all Challenge
  systems except for Challenge S and Power Challenge M.

What this actually means is that the patch is for all platforms, but the bug fixes only affect certain Challenge systems; the patch as a whole is fine for Challenge S and Power Challenge M.

Why does my disk space go down when I install patch files?

By default, the system ensures that a patch can be removed in the future if required by retaining an installation history of the patch's effects, ie. an image of some kind that allows the system to restore the system's software to the state it was in before a patch was installed.

Some patches affect several different subsystems and so the history images required can be large. After installing all relevant 6.2 patches I found that 33MB of disk space had been used up. Since I had no intention of ever removing the patches (I expect the next big change to be a move to IRIX 6.5), I decided to remove the patches' installation histories. The first time I did this, I used individual commands such as:

  versions removehist patchSG0001537

and I removed the patches in reverse order to their installation (ie. last first). But if one intends to remove all patch histories, then there is a short-cut:

  versions removehist "*"
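
If you prefer to remove histories one patch at a time, last installed first, the command list can be generated rather than typed. The following is a cautious sketch: it only prints the commands so they can be checked before anything is run. The function name is made up, and the patch names other than patchSG0001537 are assumed to follow the same naming pattern as the example above, so check what is actually installed on your own system.

```sh
#!/bin/sh
# reverse_removehist_cmds PATCH...
# Takes patch names in the order they were installed and prints one
# 'versions removehist' command per patch, last-installed first.
# Nothing is executed; this is a dry run. (Hypothetical helper.)
reverse_removehist_cmds() {
    reversed=""
    for p in "$@"; do
        reversed="$p $reversed"     # prepend, so the list ends up reversed
    done
    for p in $reversed; do
        echo "versions removehist $p"
    done
}

# Assumed patch names, listed in installation order:
reverse_removehist_cmds patchSG0001537 patchSG0002187 patchSG0002262
```

Once the printed list looks right, it can be piped into sh as root to run the commands for real.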

Backing up a System

[this article was originally written sometime in 1999, when I ran a student lab of SGI Indys]

The following advice applies only to lower-end systems such as Indy and Indigo2. I've not had experience of backing up systems such as Octane, Origin, etc. For the latter type of system, I recommend following SGI's own advice, ie. use the proper supplied and recommended tools and methods. What I describe here are handy shortcuts and hints that will be useful to users of systems like Indy, Indigo, Indigo2, etc.

Simple Backup

Assuming one has a DAT drive attached and one is logged in as root, the easiest way to backup the system is to use the following sequence of commands and actions (an action represents one or more commands, the exact nature of which only you will know):

  cd /
  umount /proc
  <unmount any NFS-mounted file systems and option disks>
  tar cv .

As stated elsewhere, the /proc directory doesn't contain real files. They are 'images' of running processes which are used by various programs, most importantly the 'killall' command when one shuts down the system.

Backup Duration

Absolutely ages if one is unfortunate enough to be using a DDS1 DAT (2-4GB capacity, 150K/sec peak transfer without compression). DDS2 is 4-8GB capacity, and DDS3 is 12-24GB capacity (1.2MB/sec without compression). Thus, the transfer rate of DDS3 is about 8X faster than DDS1, so definitely try and use a DDS3 DAT if you can. If you don't have a DAT and are thinking of buying one, definitely get a DDS3! Trust me, it'll be well worth it in the long term. A DDS3 model which is definitely supported by SGIs is the Sony SDT9000 (make sure you have the latest Tape Driver patch installed, and that the DIP switches on the DAT unit are correctly set for SGIs, ie. switches 1 and 2 turned on, 3 and 4 off).

In case you're thinking that the above isn't important, just listen to this tale of woe: I had to reinstall the OS on the Challenge S I run due to total network failure (as it turned out, it was the hub unit that had gone wrong, but I didn't discover that until later). So, out came the DAT tape from the last required backup and into our DDS1 DAT drive it went. This is the procedure I used to restore the system:

It isn't possible to 'tar xv' extract a DAT's contents right on top of the current root disk. You'll get about as far as /lib/cpp and then everything goes haywire. Obviously, some aspects of the OS need continuous access to particular files. So, I used a second disk to perform the restoration.
The second disk was installed on SCSI ID2, repartitioned as a root disk, mkfs'd with a file system and mounted on /disk2. I knew the DAT had about 1.6GB of data on it, and the disk was 2GB so everything would definitely fit ok (the normal root disk is 4GB).
I then entered these commands:
```
  cd /disk2
  tar xv
```
And that's it! Afterwards, I powered down the machine, swapped the disks over and used the disk cloning procedure above to clone the disk's contents back onto the main 4GB disk.

So where is the woe in this tale? Oh, only that it took over five hours to get all the files off the DAT tape! Aaaaaagh! 8\

That works out to be an average transfer rate of about 102K/sec (it's nowhere near the peak rate, probably because many files are very small and cause overhead with respect to creating inodes, etc.)
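
For comparison, the best-case restore time at the peak drive rates quoted earlier can be worked out with simple integer arithmetic. This is a sketch under stated assumptions: the function name is made up, 1.6GB is rounded to 1600MB, 1.2MB/sec is rounded to 1200K/sec, and the shell must support $(( )) arithmetic.

```sh
#!/bin/sh
# transfer_minutes SIZE_MB RATE_KB_PER_SEC
# Best-case transfer time in whole minutes (integer division, rounded down).
# (Hypothetical helper; real restores are slower than the peak rate.)
transfer_minutes() {
    echo $(( $1 * 1024 / $2 / 60 ))
}

transfer_minutes 1600 150     # DDS1 at peak rate: prints 182
transfer_minutes 1600 1200    # DDS3 at peak rate: prints 22
```

So even at its peak rate a DDS1 drive needs about three hours for this amount of data, and the real figure was well over five; a DDS3 at its peak rate would need under half an hour.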

I had to stay at work overnight to get it all done. The next day, I moaned to my HoD, saying I desperately needed a DDS3, or else I would go crazy. The HoD said yes, go order one. The fact that I looked like 40 miles of rough road after such a night probably helped. :D

The moral of the story is that it's easy to forget or ignore how long backup procedures take because normally they're done overnight by cron when one isn't around. Why not do a test and see how long it takes to backup your system? The elapsed time will be a good estimate of how long it'll take to restore your system should the need ever arise. If you're in a company or business where downtime equals lost revenue, then the time it would take to restore your system is something you definitely ought to know. Enter this command sequence before leaving work one evening:

  cd /
  umount /proc
  <the usual unmounting of NFS directories, option disks, etc.>
  timex tar c .

The last line is the important one. The tar operation is done without displaying any messages; once finished, the use of timex shows how long the tar command took to complete.

Fast Backup

As I type this section, I'm in the middle of upgrading the Challenge S server I run with the latest Varsity set (Aug98). I can't be bothered waiting hours for a DAT to finish (DDS3 not arrived yet), so I'm using a slightly different method...

The user files on my system reside in /home, which is a 4.5GB FastWide external SCSI disk. At present, this disk has 2.1GB of space free. The 4.5GB UltraSCSI system root disk only has 1.7GB used, so I figured what the hell, why not just backup the root disk onto the option disk?

However, one cannot use the normal disk cloning procedure to do this because the /home contents would be copied too (all 2GB of it). Thus, the items to be backed up must be specified precisely instead of using any kind of catch-all wildcard (usually the '.' character in the tar command).

After unmounting all non-/home NFS directories and /proc, this is the command sequence I entered ('yoda' is the name of the server):

  cd /home
  mkdir yodabackup
  cd /
  tar cvBpf - .A* .S* .X* .a* .c* .d* .e* .g* .i* .j* .l* .n* .p* .r* .s*
    .v* .w* .z* CDROM bin debug dev dumpster etc floppy lib lib32 mapleson
    opt proc sbin stand tmp unix usr var | (cd /home/yodabackup; tar xBpf -)

The tar command is one long line - it's split onto three lines here in order to be more readable (for the curious, my own account exists in /mapleson on my office admin Indy). This command archived everything into /home/yodabackup, without archiving any of /home itself at all. And since this was a disk-to-disk transfer, it was much faster than backing up to DAT.
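
Typing out every top-level entry by hand is error-prone, so the list could be generated instead. The sketch below is an assumption-laden alternative rather than what I actually ran: the function name is made up, it relies on 'ls -A' and 'grep -v -x -F' behaving in the standard way, and it will mishandle names containing spaces. It also picks up every top-level entry except the excluded one, so check the output before feeding it to tar.

```sh
#!/bin/sh
# list_top_level DIR EXCLUDE
# Prints every entry in DIR (including dot files, but not . and ..),
# one per line, leaving out the single entry named EXCLUDE.
# (Hypothetical helper for building the tar argument list.)
list_top_level() {
    ( cd "$1" && ls -A | grep -v -x -F "$2" )
}

# Show what would be archived from the root directory, skipping 'home':
list_top_level / home
```

The result could then be used in place of the hand-typed list, eg. tar cvBpf - `list_top_level / home` piped into the same extraction command as before, after a 'cd /'.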

NB: reading this article years later, I wonder why I didn't use xfsdump, but I remember now it's because having a simple tar dump means I can access any required file from the dump directory straight away.

Fast System Configuration Backup

If your system's software isn't particularly customised in any way, ie. it's just the system setup that's important (NIS, NFS, DNS, /etc/hosts, etc.) then a very quick way to 'backup' one's system is to only backup the key files which make the system unique, ie. copy the /etc, /var/netls, /var/flexlm, /var/www and /var/yp directories to somewhere safe, or put them onto DAT (your own system might have other important directories, but these five are the main ones).

If anything goes wrong and a reinstall is required, one would reinstall the OS from the base CDs and then use the contents of the backed-up /etc, /var/yp, etc. directories to change the system setup back to what it should be. Key files in /etc are hosts, sys_id, bootptab and a few others, but this technique grabs them all just to be sure. It's a very quick way of backing up the essential data, but a reinstall would require one to sit through the whole CD installation process (not so bad if you have a fast CDROM and good CPU).

For example, my office admin Indy doesn't have anything special on its root disk (all the really important sysadmin information is on my own external /mapleson-mounted 2GB disk), so this is the method I use to 'backup' the system. In my case, if anything goes wrong, I can clone the disk from one of the other lab machines that has the same basic installation, and then use the key backed-up files to restore the system to its correct state. A complete reinstall would thus take less than an hour, instead of 5 hours from a DDS1 DAT.

Conclusions

If you have a spare disk, or enough room on a disk already in use, consider exploiting it for temporary or fast backups (but not permanent backups).

If your system doesn't have anything special installed, and there's another machine one can clone a similar setup from, consider backing up just the essential files which show how the system was configured (/etc, /var/yp, etc.) However, make sure you've written down the procedures that you'd have to go through in order to make the changes (configuring NIS, etc.) I have a script that does much of the work for me, an example of which is shown above in the section on 'Installing an Operating System'.

Always have a fast CDROM! It'll make a difference if/when you have to install many items from multiple CDs. I've personally bought a Toshiba 32X CDROM for my own Indigo2 at home (model no. CD-XM-6201B). Complete information on this model of Toshiba CDROM is available on their web site.

Miscellaneous Hints and Tips

During a clean OS install, do I have to install older products which I know are going to be replaced/upgraded as I go through the CDs? eg. can I not install Netscape 2 and just install Netscape 3 later?

When installing a new OS like 6.2 which might involve many CDs (six in my case, counting the IDO, Varsity, Inventor, etc.) it's tempting to think one could get around having to install older products like Netscape 2 which one knows will be upgraded by later CDs (eg. Cosmo) to Netscape 3. Don't try it. Don't even think of it. The reason is that so many different products depend on each other it can become a nightmare trying to figure out what to install and what not to install, constantly going back to earlier CDs to install missing pieces and so on. Just bite the bullet, install initial old versions and upgrade them with later CDs such as Cosmo May 97, etc.

Yowch! swmgr is taking ages to install things, especially the rqsall procedure...

I had this problem. Sometimes the rqsall was taking well over 10 to 15 minutes to complete and on occasion quit out with errors. The solution? More memory! The difference was amazing. The client Indys I run all have 32MB RAM, but I decided to see what would happen if I put the 32MB from an unused Indy into the one I was dealing with. Well, not only did various background processes grab more memory than they had before (use 'gmemusage' to watch all this happening) but swmgr grabbed waaay more memory than was initially the case (well over 45MB sometimes) showing that the slowness I had noticed before was caused by the system swapping to disk. With 64MB instead of 32MB, the rqsall procedure took less than a minute instead of 15 minutes.

I'm lucky of course since I have spare Indys from which to temporarily shunt memory around. Incidentally, installing patch files seems to grab even more RAM on occasion (use gmemusage to watch what happens during the installation of patch 2262). Even with 64MB, rqsall was grabbing over 40MB at one point and kept swapping out - time to up it to 96MB while I get these patches done...

Can I use an external disk as a root disk for an Indy?

Yes! In fact that's what I did in order to provide a temporary system with extra software. Follow all the normal procedures to do this and remember to make the external drive SCSI ID 1. I don't know whether it makes much of a difference, but I also recommend making the external drive the first item in any device chain that might be present (the system I dealt with had a CDROM and DAT attached too).

I'm short of space and want to NFS-mount things. What kind of directories can I do this to?

I deliberately decided to NFS-mount various products on Indys with 549MB disks to show how much slower some products would run compared to locally-stored software (I'm expecting to upgrade the disks to 2GB; one machine has a 2GB disk for comparison). Here is a list of some of the items I found were NFS-mountable with 6.2 (these are in addition to the usual things one mounts such as /usr/share, /var/www and any user-data filesystems):

/usr/java
/usr/CosmoPlayer
/usr/ProDev
/usr/lib/CosmoWorlds
/usr/lib/CosmoCreate
/usr/CosmoCode
/usr/WorkShop
/usr/SpeedShop
/usr/lib/debug
/usr/CaseVision
/usr/lib32
/usr/lib64
/usr/freeware
/usr/share

Note that /usr/lib32 should not be NFS-mounted with 6.5. Thanks to Pim van Riezen (pim@webcity.nl) for this information.

Of course, plenty of other things could be NFS-mounted as well, but many products have their software elements broadly distributed across several directories and so are harder to server-mount with a single NFS mount point. The above items just happen to capture entire software products in one go, especially subsystems like ProDev.
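
Since each of the directories listed above is a single mount point, the matching /etc/fstab lines for a client can be generated rather than typed. This is a sketch only: the function name is made up, using 'yoda' as the NFS server is an assumption, and so are the 'ro,bg' mount options, so check the fstab man page on your own system before using anything like this.

```sh
#!/bin/sh
# nfs_fstab_lines SERVER DIR...
# Prints one fstab-style NFS entry per directory, mounting SERVER:DIR
# on the same path locally. Output only; /etc/fstab is not touched.
# (Hypothetical helper; the mount options are an assumption.)
nfs_fstab_lines() {
    server=$1
    shift
    for dir in "$@"; do
        printf '%s:%s %s nfs ro,bg 0 0\n' "$server" "$dir" "$dir"
    done
}

nfs_fstab_lines yoda /usr/java /usr/CosmoPlayer /usr/ProDev
```

The server must of course export each of these directories, and the local mount point directories must exist on the client.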

Annoyingly, one of the largest directories that one might want to NFS mount if one could is not worth attempting, namely /usr/lib/nonshared. The reason is that many software products put their non-shared libraries in /usr/lib instead of /usr/lib/nonshared. Typical! I suppose one could NFS-mount it anyway, but not all non-shared libraries would be 'captured' in this way so watch out - some unshared libraries are huge (eg. ImageVision).

I want to make /usr/share NFS-mounted because it takes up so much space. Is it safe to erase all of /usr/share on each client?

Delete all except /usr/share/lib/terminfo. The reason is that, without this directory, booting the machine in standalone mode will result in xterms that don't know anything about how to handle themselves with regard to on-screen terminal operations (eg. the 'clear' command has no effect). The terminfo directory only takes up just over 1MB so I think it's worth retaining it.
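
Before deleting anything, it's worth listing exactly what would go. The sketch below prints every entry under a share directory except lib/terminfo and removes nothing itself. The function name is made up for illustration, and hidden files directly inside the lib directory are not listed, so check for those by hand.

```sh
#!/bin/sh
# list_prunable SHAREDIR
# Prints the entries under SHAREDIR that could be deleted while keeping
# SHAREDIR/lib/terminfo. Nothing is removed. (Hypothetical helper.)
list_prunable() {
    share=$1
    for entry in "$share"/* "$share"/.[!.]*; do
        [ -e "$entry" ] || continue          # skip unmatched glob patterns
        if [ "$entry" = "$share/lib" ]; then
            # Descend one level so that terminfo can be kept
            for sub in "$share"/lib/*; do
                [ -e "$sub" ] || continue
                [ "$sub" = "$share/lib/terminfo" ] || echo "$sub"
            done
        else
            echo "$entry"
        fi
    done
}

# Example: list_prunable /usr/share
```

Only once the printed list has been checked should any of it be handed to rm, and only on a client that can definitely reach the server's copy of /usr/share.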
