unix sysadmin archives

Thursday 1 December 2011

How to Force a Crash Dump When the Solaris Operating System is Hung

In most cases, a system crash dump of a hung system can be forced. However, this is not guaranteed to work for all system hang conditions. To force a dump, you often need to drop down to the boot PROM monitor (OBP) prompt, also known as the "OK prompt", suspending all current program execution.

There are several ways to drop a Sun system to the OK prompt.
1. On older Sun systems with a serial (PS2 type) Sun keyboard and monitor attached, this suspension is performed via a "Stop-A". The upper left key on a Sun keyboard is labeled "Stop". While holding down this key, press the A key.

2. On systems using ASCII terminals for the console, the terminal's predefined break sequence can be used to get to the boot PROM monitor.

3. Newer Sun systems with USB keyboards may require an alternate sequence.

4. Some Sun systems have a system controller/SSP (Enterprise 10000/15000, Sun Fire X800) or ALOM/RSC (Vx80/Vx90 and most new Netra servers) instead of serial port/keyboard access. These can be used to break a hanging system or domain.


Note: There are special procedures for Sun SPARC(R) Enterprise Mx000 (OPL) servers, T1000/T2000 systems, and x86 and x64 systems.


The boot PROM monitor will respond with:

Type 'go' to resume
ok

If you don't see this message, you were probably not successful in stopping the system.
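
If the console is captured through a serial console server or a terminal log, the check described above can be scripted. The following is only a minimal sketch: the function name reached_ok_prompt is hypothetical, and the assumption that the prompt may be prefixed (for example "{0} ok" on multiprocessor systems) and that the two lines appear last in the capture should be verified against your own console output.

```python
import re

# Message printed by the boot PROM monitor, as quoted in the text above.
# Tolerating variable whitespace and optional quotes is an assumption.
RESUME_HINT = re.compile(r"Type\s+'?go'?\s+to\s+resume", re.IGNORECASE)


def reached_ok_prompt(console_output):
    """Return True if the captured console text ends at the ok prompt.

    Expects the last two non-empty lines to be the "Type 'go' to resume"
    message followed by a line that is "ok" or ends in " ok".
    """
    lines = [ln.strip() for ln in console_output.splitlines() if ln.strip()]
    if len(lines) < 2:
        return False
    prompt_ok = lines[-1] == "ok" or lines[-1].endswith(" ok")
    return prompt_ok and bool(RESUME_HINT.search(lines[-2]))


if __name__ == "__main__":
    capture = "Type 'go' to resume\nok \n"
    print(reached_ok_prompt(capture))
```

A False result means the same thing as not seeing the message by eye: the break probably did not stop the system.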

Once at the ok prompt, type 'sync' (without the quotes) and press Enter.

The system will immediately panic. Now the hang condition has been converted into a panic, so an image of memory can be collected for later analysis. The system will attempt to reboot after the dump is complete.

The sync command forces the computer to illegally use location zero, which causes a "panic: zero". On later revisions of Solaris 8 and above you will instead see "panic: sync initiated".
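
When reviewing logs after the reboot, it can help to tell a forced dump apart from an ordinary panic. The sketch below looks for the two panic strings mentioned above. The function name forced_sync_panics is hypothetical, the sample lines are made up for illustration, and the exact layout of panic lines in your logs (for example a "[cpu0]/thread=..." part between "panic" and the colon) is an assumption to check against real output.

```python
import re

# Matches a panic line whose final reason is "zero" or "sync initiated".
# Anything between "panic" and the last colon (such as "[cpu0]/thread=...")
# is skipped; that prefix format is an assumption, not taken from the article.
FORCED_PANIC = re.compile(
    r"panic\b.*?:\s*(zero|sync initiated)\s*$", re.IGNORECASE
)


def forced_sync_panics(lines):
    """Return the lines that look like a panic forced by 'sync' at the ok prompt."""
    return [line.rstrip("\n") for line in lines if FORCED_PANIC.search(line)]


if __name__ == "__main__":
    # Illustrative sample lines only; not real log output.
    sample = [
        "panic: zero\n",
        "panic[cpu0]/thread=2a10001fd20: sync initiated\n",
        "panic[cpu1]/thread=30001: BAD TRAP: type=31\n",
    ]
    for hit in forced_sync_panics(sample):
        print(hit)
```

Lines from other panics, such as the third sample, are not reported, so anything the function returns points at a dump that an operator forced.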

Not all hang situations can be interrupted. If a single Stop-A or Break doesn't work, sending it several times in a row sometimes does the trick. Some hangs are even more stubborn and can only be interrupted by physically disconnecting the console keyboard or terminal from the system for a minute, and then plugging it back in.

If all these attempts fail, you will have to power down the system, thus sadly losing the contents of memory. With luck, a subsequent hang will be interruptible.


NOTE: On systems with keyswitches, be sure the key is not in the secure position, as this disables the break interrupt in the zs driver.
