unix sysadmin archives

Wednesday, 31 August 2011

Sun Hardware Diagnosis at OBP level

Servers often develop hardware faults that operating-system-level utilities cannot diagnose. To perform a preliminary diagnosis and pinpoint the failing hardware, system admins have to rely on the diagnostic options available at the OBP (Open Boot PROM) level.
At the OBP level, a system admin has three options for investigating the issue:
  1. OBP diagnosis commands
  2. OBDiag Outputs
  3. POST Errors

1. Using OBP Commands

Below are some of the OBP commands that system admins with advanced hardware skills can use to investigate the trouble.
banner
Displays the power-on banner. The banner includes information such as CPU speed, OBP revision, total system memory, Ethernet address, and hostid.
.enet-addr
Displays the Ethernet address.
led-off/led-on
Turns the system LED off or on.
nvstore
Copies the contents of the temporary buffer to NVRAM and discards the contents of the temporary buffer.
power-off/power-on
Powers the system off or on.
printenv
Displays all parameters, settings, and values.
probe-fcal-all
Identifies Fibre Channel Arbitrated Loop (FCAL) devices on a system.
probe-sbus
Identifies devices attached to all SBUS slots. Note - This command works only on systems with SBUS slots.
probe-scsi
Identifies devices attached to the onboard SCSI bus.
probe-scsi-all
Identifies devices attached to all SCSI buses.
set-default parameter
Resets the value of parameter to the default setting.
set-defaults
Resets the value of all parameters to the default settings. Tip - You can also press the Stop and N keys simultaneously during system power-up to reset the values to their defaults.
setenv parameter value
Sets parameter to the specified value. Note - Run the reset-all command to save changes in NVRAM.
show-devs
Displays all the devices recognized by the system.
show-disks
Displays the physical device path for disk controllers.
show-displays
Displays the physical device path for frame buffers.
show-nets
Displays the physical device path for network interfaces.
show-post-results
If run after Power On Self Test (POST) is completed, this command displays the findings of POST in a readable format.
show-sbus
Displays devices attached to all SBUS slots. Similar to probe-sbus.
show-tapes
Displays the physical device path for tape controllers.
sifting string
Searches for OBP commands or methods that contain string. For example, the sifting probe command displays probe-scsi, probe-scsi-all, probe-sbus, and so on.
.speed
Displays CPU and bus speeds.
test device-specifier
Executes the selftest method for device-specifier. For example, the test net command tests the network connection.
test-all
Tests all devices that have a built-in test method.
.version
Displays OBP and POST version information.
watch-clock
Tests a clock function.
watch-net
Monitors the network connection for the primary interface.
watch-net-all
Monitors all the network connections.
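
The substring search that sifting performs can be illustrated off the box with a few lines of Python. This is only a sketch of the matching idea, not a reimplementation of the firmware command: the real command searches the whole OBP dictionary, while the list here holds just a handful of command names mentioned in this article, and the function name sift is made up for the example.

```python
# A few command names mentioned in this article. This is NOT the full OBP
# dictionary; it is only enough to show the substring match.
KNOWN_COMMANDS = [
    "probe-sbus",
    "probe-scsi",
    "probe-scsi-all",
    "reset-all",
    "set-default",
    "setenv",
    "show-post-results",
    "sifting",
    "test",
]


def sift(fragment, commands=KNOWN_COMMANDS):
    """Return every command name that contains `fragment`, in list order."""
    return [name for name in commands if fragment in name]


if __name__ == "__main__":
    # Lists the probe-* names from KNOWN_COMMANDS.
    print(sift("probe"))
```

Note that the match is on any part of the name, so a short fragment can return more than you expect (searching for set also matches reset-all).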



2. Using OBDiag

OBDiag can be used to diagnose the main logic board as well as its interfaces (e.g. PCI, SCSI, Ethernet, serial, parallel, keyboard/mouse, NVRAM, audio, and video).
To run OBDiag, enter the following at the ok prompt:
ok obdiag
You can also set up OBDiag to run automatically when the system is powered on, using either of the following methods:
    1. Set the OBP diagnostics variable: ok setenv diag-switch? true
    2. Press the Stop and D keys simultaneously while you power on the system.
Note: On Ultra Enterprise servers, just turn the key switch to the diagnostics position and power on the system to start OBDiag.



3. POST Errors

POST is a program that resides in the firmware of each board in a system; it initializes, configures, and tests the system boards. POST output is sent to serial port A, and the POST completion status is indicated by the status LEDs.
You can watch POST output in real time by attaching a terminal device to serial port A. If none is available, you can use the OBP command show-post-results to view the results after POST completes.
How To Run POST
  • Attach a terminal device to serial port A.
  • Set the OBP diagnostics variable:
ok setenv diag-switch? true
  • Set the desired testing level. POST can run at two levels: max runs all of the tests and min runs a subset. Set the OBP variable diag-level to the desired level, for example:
ok setenv diag-level max
  • If you wish to boot from disk, set the OBP variable diag-device (the system default for this variable is net):
ok setenv diag-device disk
  • Set the auto-boot variable:
ok setenv auto-boot? false
  • Save the changes
ok reset-all
  • Power cycle the system (turn it off, and then back on).
POST runs when the system is powered on, and the output is displayed on the device attached to serial port A. After POST has completed, you can also run the OBP command show-post-results to view the results.
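
If you prepare many machines, it can help to generate the setup commands from the steps above rather than retype them. The following Python sketch only builds the list of lines to enter at the ok prompt; it does not talk to the console. The function name post_setup_commands and its parameters are made up for this example, and the boolean variables are written with the trailing question mark that OBP uses for them (diag-switch?, auto-boot?).

```python
def post_setup_commands(level="max", diag_device=None):
    """Return the ok-prompt commands that prepare a POST run, in order.

    level       -- "max" or "min", the value for diag-level
    diag_device -- optional value for diag-device (for example "disk");
                   when None the variable is left at its current setting
    """
    if level not in ("max", "min"):
        raise ValueError("diag-level must be 'max' or 'min'")

    commands = [
        "setenv diag-switch? true",
        "setenv diag-level " + level,
    ]
    if diag_device is not None:
        commands.append("setenv diag-device " + diag_device)
    commands.append("setenv auto-boot? false")
    # reset-all saves the changes; the power cycle is still a manual step.
    commands.append("reset-all")
    return commands


if __name__ == "__main__":
    for line in post_setup_commands(level="max", diag_device="disk"):
        print("ok " + line)
```

The power cycle is left out on purpose, since it is a physical step and not an OBP command.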
The three status LEDs are read as follows:
Power LED (left position)
Should always be on and steady. Any other state indicates a problem; if all three LEDs are off, suspect a power problem.
Service LED (middle position)
Should be off in normal operation. If it is on, a component is in an error state and you should check the individual board LEDs. A lit service LED does not imply that there is an OS-related problem.
Cycling LED (right position)
Should be flashing; this is the normal state.
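
The LED rules above are simple enough to write down as a small decision function, which can be handy in a runbook or a monitoring note. This is a sketch under assumptions: the function name interpret_leds and the three state strings ("on", "off", "flashing") are invented for the example, and states the text does not describe (such as a flashing service LED) are simply reported as not normal.

```python
VALID_STATES = ("on", "off", "flashing")


def interpret_leds(power, service, cycling):
    """Return a list of findings for the three status LEDs.

    An empty list means all three LEDs are in their normal state.
    """
    for state in (power, service, cycling):
        if state not in VALID_STATES:
            raise ValueError("unknown LED state: %r" % (state,))

    # All three dark: the text says to suspect a power problem first.
    if power == "off" and service == "off" and cycling == "off":
        return ["all three LEDs off: suspect a power problem"]

    findings = []
    if power != "on":
        findings.append("power LED is not on and steady: problem indicated")
    if service == "on":
        findings.append(
            "service LED on: a component is in an error state, "
            "check the individual board LEDs"
        )
    elif service != "off":
        # Not described in the text; flagged rather than guessed at.
        findings.append("service LED is not off: not the normal state")
    if cycling != "flashing":
        findings.append("cycling LED is not flashing: not the normal state")
    return findings


if __name__ == "__main__":
    for finding in interpret_leds("on", "on", "flashing"):
        print(finding)
```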

Thursday, 11 August 2011

How to collect a snapshot from an M-Series machine

To collect a snapshot from an Mx000 system you will need the following information:

1.  The name (or IP) of a server on the same subnet as the XSCF
    (or service processor) where you can store the snapshot data.
2.  A user name and password for the server you will be storing the data on.
3.  The full path on the server where you want snapshot to store the data.

The syntax for running the snapshot command on the XSCF is as follows:

snapshot -LF -t username@servername:/full_path_to_data_location -k download

OR you may use the "none" option:

snapshot -LF -t username@servername:/full_path_to_data_location -k none

You will be prompted twice while snapshot is running:

1.  Accept this public key (yes/no)?  Y
2.  Enter ssh password for user 'username' on host 'servername'
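
To avoid typos when filling in the three pieces of information, the command line can be assembled with a small helper. This Python sketch only builds the string to paste at the XSCF prompt and uses nothing beyond the two forms shown above; the function name snapshot_command and the sample user, host, and path are hypothetical.

```python
def snapshot_command(user, host, path, host_key="download"):
    """Build the XSCF snapshot command line shown above.

    host_key is the value passed to -k: "download" or "none".
    """
    if host_key not in ("download", "none"):
        raise ValueError("host_key must be 'download' or 'none'")
    if not path.startswith("/"):
        # The article asks for the full path on the target server.
        raise ValueError("path must be a full path starting with '/'")
    return "snapshot -LF -t {}@{}:{} -k {}".format(user, host, path, host_key)


if __name__ == "__main__":
    # Hypothetical user, server, and directory, for illustration only.
    print(snapshot_command("admin", "backup01", "/var/tmp/snapshots"))
```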

*** Once you have gathered the snapshot, please rename the output file to
include your SR number, then upload this file into the /cores directory at
the http://supportuploads.sun.com site. ***

Note:  As an alternative option for collecting a snapshot, you may also direct
the output to a USB memory stick with the following command:

XSCF> snapshot -d usb0

In the event you are unable to collect a complete snapshot, the output from the following four XSCF commands can be substituted:

fmdump -m
fmdump -V