unix sysadmin archives

Wednesday 31 August 2011

Sun Hardware Diagnosis at OBP level

Servers often develop hardware faults that operating-system-level utilities cannot diagnose. To perform a preliminary diagnosis and pinpoint the failing hardware, system administrators have to rely on the diagnostic options available at the OBP (OpenBoot PROM) level.
At the OBP level, a system administrator has three options for investigating the issue:
  1. OBP diagnosis commands
  2. OBDiag Outputs
  3. POST Errors

1. Using OBP Commands

Below are some of the OBP commands that system administrators with advanced hardware skills can use to investigate the problem.
banner
Displays the power on banner. The banner includes information such as CPU speed, OBP revision, total system memory, ethernet address and hostid.
.enet-addr
Displays the ethernet address.
led-off/led-on
Turns the system LED off or on.
nvstore
Copies the contents of the temporary buffer to NVRAM and discards the contents of the temporary buffer.
power-off/power-on
Powers the system off or on.
printenv
Displays all parameters, settings, and values.
probe-fcal-all
Identifies Fiber Channel Arbitrated Loop (FCAL) devices on a system.
probe-sbus
Identifies devices attached to all SBUS slots. Note - This command works only on systems with SBUS slots.
probe-scsi
Identifies devices attached to the onboard SCSI bus.
probe-scsi-all
Identifies devices attached to all SCSI buses.
set-default parameter
Resets the value of parameter to the default setting.
set-defaults
Resets the value of all parameters to the default settings. Tip - You can also press the Stop and N keys simultaneously during system power-up to reset the values to their defaults.
setenv parameter value
Sets parameter to specified value. Note - Run the reset-all command to save changes in NVRAM.
show-devs
Displays all the devices recognized by the system.
show-disks
Displays the physical device path for disk controllers.
show-displays
Displays the physical device path for frame buffers.
show-nets
Displays the physical device path for network interfaces.
show-post-results
If run after Power On Self Test (POST) is completed, this command displays the findings of POST in a readable format.
show-sbus
Displays devices attached to all SBUS slots. Similar to probe-sbus.
show-tapes
Displays the physical device path for tape controllers.
sifting string
Searches for OBP commands or methods that contain string. For example, the sifting probe command displays probe-scsi, probe-scsi-all, probe-sbus, and so on.
.speed
Displays CPU and bus speeds.
test device-specifier
Executes the selftest method for device-specifier. For example, the test net command tests the network connection.
test-all
Tests all devices that have a built-in test method.
.version
Displays OBP and POST version information.
watch-clock
Tests a clock function.
watch-net
Monitors the network connection for the primary interface.
watch-net-all
Monitors all the network connections.
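The behaviour of sifting described above can be imitated with a few lines of Python. This is only a sketch: the helper name sift and the command list are illustrative assumptions, and the list holds just the command names mentioned in this post, whereas the real sifting command searches the whole dictionary of the machine's firmware.

```python
# Command names copied from the list in this post (not the full OBP dictionary).
OBP_COMMANDS = [
    "banner", ".enet-addr", "led-off", "led-on", "nvstore",
    "power-off", "power-on", "printenv", "probe-fcal-all", "probe-sbus",
    "probe-scsi", "probe-scsi-all", "set-default", "set-defaults", "setenv",
    "show-devs", "show-disks", "show-displays", "show-nets",
    "show-post-results", "show-sbus", "show-tapes", "sifting", ".speed",
    "test", "test-all", ".version", "watch-clock", "watch-net",
    "watch-net-all",
]


def sift(string, commands=OBP_COMMANDS):
    """Hypothetical helper: return every command name containing `string`."""
    return [name for name in commands if string in name]


if __name__ == "__main__":
    # Roughly what "sifting probe" would surface from the list above.
    for name in sift("probe"):
        print(name)
```

Searching for a fragment such as "show" or "watch" in the same way is a quick reminder of which related commands exist before typing them at the ok prompt.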

 

2. OBDiag


OBDiag can be used to diagnose the main logic board as well as its interfaces (e.g. PCI, SCSI, Ethernet, serial, parallel, keyboard/mouse, NVRAM, audio, video).
To start OBDiag, run:
ok obdiag
You can also set up OBDiag to run automatically when the system is powered on using the following methods:
    1. Set the OBP diagnostics variable:              ok setenv diag-switch? true
    2. Press the Stop and D keys simultaneously while you power on the system.
Note: On Ultra Enterprise servers, simply turn the key switch to the diagnostics position and power on the system to start OBDiag.

 

3. POST

POST is a program that resides in the firmware of each board in a system. It is used to initialize, configure, and test the system boards. POST output is sent to serial port A, and the POST completion status is indicated by the status LEDs.
You can watch POST output in real time by attaching a terminal device to serial port A. If none is available, you can use the OBP command show-post-results to view the results after POST completes.
How To Run POST
  • Attach a terminal device to serial port A.
  • Set the OBP diagnostics variable:
ok setenv diag-switch? true
  • Set the desired testing level. Two different levels of POST can be run, and you can choose to run all tests or some of the tests. Set the OBP variable diag-level to the desired level of testing (max or min), for example:
ok setenv diag-level max
  • If you wish to boot from disk, set the OBP variable diag-device:
ok setenv diag-device disk   (The system default for this variable is net.)
  • Set the auto-boot? variable:
ok setenv auto-boot? false
  • Save the changes
ok reset-all
  • Power cycle the system (turn it off, and then back on).
POST runs as the system powers on, and its output is displayed on the device attached to serial port A. After POST has completed, you can also run the OBP command show-post-results to view the results.
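The procedure above can be captured as a short checklist generator. This is a minimal sketch, assuming you only want the list of commands to type at the ok prompt in the right order; the function name post_setup_commands is hypothetical, and the variable names are written with the trailing "?" that OpenBoot uses for boolean variables (diag-switch?, auto-boot?).

```python
def post_setup_commands(diag_level="max", diag_device=None):
    """Return, in order, the OBP commands that prepare a POST run.

    diag_level  -- "max" or "min", as described in the steps above
    diag_device -- e.g. "disk"; None leaves the variable at its default
    """
    if diag_level not in ("max", "min"):
        raise ValueError("diag-level must be 'max' or 'min'")

    commands = [
        "setenv diag-switch? true",
        "setenv diag-level " + diag_level,
    ]
    if diag_device is not None:
        commands.append("setenv diag-device " + diag_device)
    commands.append("setenv auto-boot? false")
    commands.append("reset-all")  # final step before power cycling
    return commands


if __name__ == "__main__":
    for command in post_setup_commands("max", "disk"):
        print("ok " + command)
```

The script does not talk to the machine; it only prints the lines to enter on the console attached to serial port A, after which the system still has to be power cycled by hand.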
LED STATUS
Power LED (Left Position)
This LED should always be on. If all three LEDs are off, suspect a power problem. Any state other than steady on indicates a problem.
Service LED (Middle Position)
This LED should be off in normal operation. If it is on, a component is in an error state and you should check the individual board LEDs. A lit service LED does not imply there is an OS-related problem.
Cycling LED (Right Position)
This LED should be flashing; this is the normal state.
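The LED rules above amount to a small decision table, which can be written down as follows. This is a sketch only: the function name interpret_leds and the state strings "on", "off" and "flashing" are assumptions made for the example, and the messages merely restate the descriptions above.

```python
def interpret_leds(power, service, cycling):
    """Map the three front-panel LED states to the findings described above.

    Each argument is one of "on", "off" or "flashing" (assumed encoding).
    """
    # All three dark: the description above points to a power problem.
    if power == "off" and service == "off" and cycling == "off":
        return ["all LEDs off: suspect a power problem"]

    findings = []
    if power != "on":
        findings.append("power LED is not steady on: problem indicated")
    if service == "on":
        findings.append("service LED on: check the individual board LEDs")
    if cycling != "flashing":
        findings.append("cycling LED is not flashing: not the normal state")
    return findings or ["normal"]


if __name__ == "__main__":
    print(interpret_leds("on", "off", "flashing"))
    print(interpret_leds("on", "on", "flashing"))
```

Anything the function flags still has to be followed up on the hardware itself, for example by reading the board LEDs or the POST output.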

Thursday 11 August 2011

How to collect a snapshot from an M-Series machine

To collect a snapshot from an Mx000 system you will need the following information:

1.  The name (or IP) of a server on the same subnet as the XSCF
    (or service processor) where you can store the snapshot data.
2.  A user name and password for the server you will be storing the data on.
3.  The full path on the server where you want snapshot to store the data.

The syntax for running the snapshot command on the XSCF is as follows:

snapshot -LF -t username@servername:/full_path_to_data_location -k download

OR you may use the "none" option:

snapshot -LF -t username@servername:/full_path_to_data_location -k none

You will be prompted twice while snapshot is running:

1.  Accept this public key (yes/no)?  Y
2.  Enter ssh password for user 'username' on host 'servername'
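The three pieces of information listed earlier map directly onto the snapshot command line. The sketch below assembles that line from its parts; the function name snapshot_command and the sample values (admin, loghost, /var/tmp/snapshots) are hypothetical, and only the two -k options shown above are accepted.

```python
def snapshot_command(username, servername, full_path, key_option="download"):
    """Build the XSCF snapshot command line in the form shown above."""
    if key_option not in ("download", "none"):
        raise ValueError("key_option must be 'download' or 'none'")
    if not full_path.startswith("/"):
        raise ValueError("the data location must be a full (absolute) path")
    return "snapshot -LF -t {}@{}:{} -k {}".format(
        username, servername, full_path, key_option
    )


if __name__ == "__main__":
    # Hypothetical user, host and directory, for illustration only.
    print(snapshot_command("admin", "loghost", "/var/tmp/snapshots"))
```

The resulting line is meant to be pasted at the XSCF> prompt, where the two prompts described above will then appear.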

*** Once you have gathered the snapshot, please rename the output file to
include your SR number, then upload this file into the /cores directory at
the http://supportuploads.sun.com site. ***

Note:  As an alternative option for collecting a snapshot, you may also direct
the output to a USB memory stick with the following command:

XSCF> snapshot -d usb0

In the event you are unable to collect a complete snapshot, the output from the following four XSCF commands can be substituted:

showstatus
showhardconf
fmdump -m
fmdump -V