Date: April 30, 2003
From: barefaced
HARD Reset and COLD Boot:
From the earliest days of electronics, when power is applied, the electronics must
start in a known state. But transistors in digital feedback loops could end up in either
state -- what we call logic zero and logic one. This is unacceptable for any state
machine, such as a computer, that must start every time in a known state. Therefore,
computers include a 'power-on reset' circuit so that all critical states (registers,
memories, flip-flops, etc.) power up the same way, every time.
The 'power-on reset' signal is created by hardware and connects directly to every digital
storage device (logic state memory) by a direct (hardwired) electrical connection. Even
busses, such as the ISA bus, provide a reset pin to every ISA peripheral. Traditionally,
this connection is called "RESET" both on boards and on Integrated Circuits
(ICs). Furthermore, datasheets define a protocol for how that RESET signal should
be applied.
More specifically, the protocol defines:
- how long after DC power supply voltages have stabilized the reset may be released,
- how long the reset pulse must be applied, and
- what else must happen during or after that reset pulse.
A little backtracking here... Some early digital logic families did not always provide
a master reset pin. When Texas Instruments defined pinout and function standards for their
Transistor-Transistor Logic (TTL) family, every state-dependent logic chip included a
'reset' pin so that it could be master reset on power up. By using this standard feature,
logic designs would not intermittently end up in 'metastable' states. Metastable states
could result in a complete lockup of the electronics that could only be removed by power
cycling. A "HARD" reset (should) make metastable states impossible.
This hardware reset signal creates what is known as a "HARD" reset. Every
state-dependent device connected to that system master reset signal is "HARD" reset
on powerup. A "COLD" boot occurs when a computer first powers up. Better computer
designs "HARD" reset everything during a "COLD" boot.
However, this concept was not fully understood by some hardware designers. Even the
original IBM PC suffered from neophyte designers: some peripherals did not
"HARD" reset during a "COLD" boot. For example, many video cards,
floppy disk controllers, keyboards, and modems had no connection to a system master reset.
These devices generated their own reset internally, and only in response to power cycling
("COLD" boot). If such a device locked up, the only way to restore operation was a
"COLD" boot.
CPU Startup on Reset:
Fundamental to a computer design is the first program address executed in response to a
"HARD" reset. For Intel 8080 series CPUs, the first memory address executed was
location 0000h. The 8086 and 8088 instead first executed memory location FFFF0h (1048560
decimal), written in segment:offset form as FFFF:0000h, 16 bytes below the top of the
1 MB address space. IOW, on power up, the 'power-on reset' signal sets a computer's
instruction pointer to a fixed and standard address. In a PC, that address lies in BIOS
ROM and holds the first BIOS instruction.
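The segment:offset arithmetic behind such addresses can be checked with a minimal sketch. It assumes the 8086 Real mode rule (physical address = segment x 16 + offset, wrapped to 20 bits); the function name `physical_address` is made up for illustration. Intel's 8086/8088 documentation gives the reset values as CS = FFFFh and IP = 0000h, so both F000:0000h and FFFF:0000h are shown for comparison.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real-mode 8086 translation: segment * 16 + offset, wrapped to 20 address bits."""
    return ((segment << 4) + offset) & 0xFFFFF


# F000:0000h is the start of the top 64 KB segment, where PC BIOS ROM normally sits.
bios_segment_start = physical_address(0xF000, 0x0000)

# FFFF:0000h is the 8086/8088 reset vector, 16 bytes below the 1 MB boundary.
reset_vector = physical_address(0xFFFF, 0x0000)

print(f"F000:0000h -> {bios_segment_start:05X}h = {bios_segment_start}")
print(f"FFFF:0000h -> {reset_vector:05X}h = {reset_vector}")
```

Only 16 bytes fit between the reset vector and the end of the address space, which is why the code found there is normally just a jump to the rest of the BIOS.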
The BIOS, in turn, locates and initializes all peripherals. For example, memory does not
get reset by powerup. Memory is first tested by BIOS writes and reads to:
- confirm how much memory exists
- see if memory is functional
- preset parity memory bits
Final memory values are unique to each PC's BIOS and are set only when the BIOS is
restarted, usually in response to a "HARD" reset. BTW, memory parity
errors may create "SOFT" resets (described later).
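The write-and-read-back memory test can be sketched as follows. This is only a simulation under stated assumptions: `SimulatedRam`, `cell_works`, and `power_on_memory_test` are hypothetical names, the 55h/AAh test patterns are a common choice rather than what any particular BIOS uses, and the final write of zero merely stands in for leaving every location (and its parity bit) in a known state.

```python
PATTERNS = (0x55, 0xAA)  # complementary bit patterns: 01010101 and 10101010


class SimulatedRam:
    """Hypothetical stand-in for RAM. Addresses past `size` read back as FFh."""

    def __init__(self, size, bad_addresses=()):
        self.size = size
        self.cells = bytearray(size)
        self.bad = set(bad_addresses)

    def write(self, addr, value):
        # Writes to missing or faulty cells are silently lost.
        if addr < self.size and addr not in self.bad:
            self.cells[addr] = value

    def read(self, addr):
        return self.cells[addr] if addr < self.size else 0xFF


def cell_works(ram, addr):
    """A cell passes only if it returns every pattern written to it."""
    for pattern in PATTERNS:
        ram.write(addr, pattern)
        if ram.read(addr) != pattern:
            return False
    return True


def power_on_memory_test(ram, max_size, block=1024):
    """Return (bytes of memory found, list of faulty addresses)."""
    found, faulty = 0, []
    for base in range(0, max_size, block):
        results = [cell_works(ram, a) for a in range(base, base + block)]
        if not any(results):
            break  # nothing answers in this block: end of installed memory
        faulty.extend(base + i for i, ok in enumerate(results) if not ok)
        for a in range(base, base + block):
            ram.write(a, 0x00)  # leave the block in a known state
        found += block
    return found, faulty


ram = SimulatedRam(size=4096, bad_addresses=[1500])
result = power_on_memory_test(ram, max_size=8192)
print(result)
```

In this run the test stops at the first block where no cell responds, reporting 4096 bytes found and one faulty address.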
Another peripheral initialized by the BIOS is the keyboard. After a keyboard's own
internal computer has initialized, then the BIOS would issue commands to preset that
keyboard. For example, a keyboard must be told whether to start with Num Lock key on or
off. Because "HARD" reset does not initialize a keyboard, a keyboard is preset
by software initialization commands (BIOS and OS).
The same is true of many modems, floppy controllers, and video controllers of ISA
vintage. If those peripherals did not power up to a known state, then sometimes BIOS could
not initialize those peripherals. Again, the only way to clear this intermittent failure
was to "COLD" boot.
Most peripherals, such as those inside motherboard chipsets, are also reset by this
"HARD" reset, as was standard Intel practice when those functions were sold as
individual Intel chips. But the original design of the IBM PC suffered from many designers
without good technical background. Some peripherals in the early IBM PC design did not
respond to a "HARD" reset.
SOFT Reset and WARM Boot:
A second method of resetting a state machine is "SOFT" reset. In this case,
the machine itself would execute a software instruction that might or might not trigger
the master reset line. IOW, a "SOFT" reset could also create a "HARD"
reset. But often, "SOFT" resets only implemented BIOS initialization code, and
avoided some first BIOS functions such as power-on memory test. So, many types of
"SOFT" resets were possible. A "SOFT" reset might
- trigger the master reset line ("HARD" reset), or
- only execute some initialization code in a BIOS to reboot the Operating System (OS), or
- issue an I/O reset command to one peripheral.
Fundamental to "SOFT" resets is that power is never removed. A reset with no
power cycling is called a "WARM" boot. Therefore, devices that create their own
internal reset only in response to power cycling are not reset. They remain unchanged by
the "SOFT" reset.
Typical examples of "WARM" boots without "HARD" reset are the BIOS
interrupt INT 19h (decimal 25) and the keyboard's three finger salute (Ctrl-Alt-Del). How
a "WARM" boot executes is computer and BIOS dependent.
Real and Protected Modes:
Another type of "SOFT" reset was used in a kludge design of the original
IBM-AT. The kludge involved moving a CPU from Protected mode back to Real mode. DOS
executes in Real mode, the 16 bit mode of the original 8086. But starting with the Intel
80286, the CPU included another operating mode called Protected mode (still 16 bit on the
80286, extended to 32 bit with the 80386), which is used by OS/2 and Windows NT. Early
Intel designers provided an instruction to go from Real to Protected mode, but none to go
back. Before the PC existed, the theory was: there is no reason to go from superior
Protected mode back to Real mode.
A method using "SOFT" resets permitted a Protected mode to Real mode transition
without a new CPU design.
In PCs, a "SOFT" reset instruction would cause the computer to reboot by
executing only some of the BIOS program, without doing preliminary initialization and
verification such as memory tests. Because power is not removed ("WARM" boot),
state-dependent devices could remain at a known state. Just before starting the
"SOFT" reset, the CPU would start a timer inside the keyboard's computer. Then
when the "SOFT" reset started, that keyboard timer would issue an interrupt to
instruct the CPU to remain in Real mode and not boot into Protected mode. Therefore, the
computer had switched from Protected mode back to Real mode using a "WARM" boot,
which is a kludge.
The problem with this procedure was that it took tens of milliseconds - forever to a
computer. Just another kludge that worked, but made early OS/2 a pathetic, 80286 based,
Operating System.
Reset Failures:
Sometimes complex internal hardware or software gets befuddled. A unique event puts the
peripheral into a non-functional state. This is especially a problem for what we now
call hardware modems. For example, if the modem's compression algorithm failed, the
modem's internal computer would ignore all instructions from the CPU and the ISA bus.
Neither "HARD" nor "SOFT" resets would clear this modem failure. Only
a "COLD" boot clears this failure because the peripheral did not properly
implement reset.
How HARD Resets are Created:
Early hardware would immediately create a "HARD" reset when DC voltage was
first applied, and then hold that reset line active for some hundreds of milliseconds. This
permitted DC power to stabilize before the CPU began execution. Also, CPUs required reset
to be held for a minimum number of clock cycles, so that the CPU was properly initialized.
However, early circuits, using a timing capacitor and resistor, could easily be fooled by
unstable input power. If AC mains voltage held long enough for the reset circuit to start,
dropped for a short time, and then recovered, the 'power-on reset' circuit could release
before computer voltages had stabilized. Strange things would happen to a computer when
power turned off and on both quickly and too many times. All were directly traceable to a
bad 'power-on reset' design, and to peripherals that did not respond to "HARD"
resets.
Better designs included a digital timer that maintained 'power-on reset' for a fixed
number of milliseconds after DC voltage had remained in spec. Many clone motherboards did
not implement this more comprehensive reset driver. Furthermore, motherboards differed in
their response to a loss of the power supply's "Power Good" signal. If "Power
Good" is lost (the power supply detects bad voltage output), then the computer should
"HARD" reset. But what happens to a computer system when "Power
Good" is lost and peripherals are not reset by "HARD" reset? The answer:
unstable operation or a system crash when the partial power loss is restored.
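The digital-timer approach can be sketched as a small simulation. Everything here is an assumption for illustration: the function name `reset_supervisor`, one voltage sample per millisecond, a 5 V rail with a 4.75 V to 5.25 V window, and a 100 ms hold time. The key behavior is that any excursion out of spec restarts the timer, so a brief power dip cannot release reset early.

```python
def reset_supervisor(voltage_samples, v_min=4.75, v_max=5.25, hold_ms=100):
    """Return one boolean per 1 ms sample: True while reset is asserted.

    Reset is released only after the voltage has stayed inside
    [v_min, v_max] for hold_ms consecutive milliseconds.
    """
    in_spec_ms = 0
    reset_asserted = []
    for volts in voltage_samples:
        if v_min <= volts <= v_max:
            in_spec_ms += 1
        else:
            in_spec_ms = 0  # any excursion restarts the hold timer
        reset_asserted.append(in_spec_ms < hold_ms)
    return reset_asserted


# Power comes up, holds for 50 ms, sags to 3 V for 5 ms, then recovers.
samples = [0.0] * 10 + [5.0] * 50 + [3.0] * 5 + [5.0] * 150
states = reset_supervisor(samples)
release_ms = states.index(False)
print(f"reset released {release_ms} ms after power was first applied")
```

With these numbers the 50 ms of good power before the sag does not count; reset is released only once 100 ms of uninterrupted in-spec voltage have passed after the recovery. The same logic also re-asserts reset on a later loss of good voltage.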
Also, some early Intel chips had an internal problem with the CPU's response to reset.
Intel identified a rare event where some CPUs did not properly reset every time. BIOS
manufacturers were instructed to perform a second "SOFT" reset of the CPU.
Again, some clone motherboards did not always get it right.
Sometimes a CPU upgrade requires that a "HARD" reset last longer. However,
some motherboard designs are so marginal that 'power-on reset' does not always last long
enough for an upgrade CPU. In other words, the computer will sometimes fail to
boot properly the first time due to bad 'power-on reset' circuit design.
Realtime Computing and Reset:
PCs are inferior as real-time computers. All software and hardware can fail. Signal to
noise ratio in digital logic dictates that the computer will fail to transfer a bit once
every (maybe) 10 or 100 years. Therefore real-time computers include a Watchdog Timer
(WT). Computer software must be written so that it issues a heartbeat to the WT within a
certain time period. If WT does not receive that heartbeat, then WT:
- Assumes the computer has crashed, and
- "HARD" resets the computer.
But that means that a "HARD" reset must, literally, reset everything
including peripherals controlled by that computer.
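The heartbeat arrangement can be sketched as a simulation. The class name `WatchdogTimer`, its methods `kick` and `tick`, the 50 ms timeout, and the 10 ms heartbeat interval are all assumptions for illustration; a real Watchdog Timer is separate hardware that keeps counting even when the software is stuck.

```python
class WatchdogTimer:
    """Counts milliseconds since the last heartbeat; fires a reset on timeout."""

    def __init__(self, timeout_ms, hard_reset):
        self.timeout_ms = timeout_ms
        self.hard_reset = hard_reset  # callback standing in for the reset line
        self.elapsed_ms = 0

    def kick(self):
        """Heartbeat from healthy software."""
        self.elapsed_ms = 0

    def tick(self, ms=1):
        """Advance time; called by the simulation, not by the monitored software."""
        self.elapsed_ms += ms
        if self.elapsed_ms >= self.timeout_ms:
            self.elapsed_ms = 0
            self.hard_reset()


reset_times = []
now = 0
wdt = WatchdogTimer(timeout_ms=50, hard_reset=lambda: reset_times.append(now))

for now in range(150):
    software_alive = now < 100          # the simulated software crashes at 100 ms
    if software_alive and now % 10 == 0:
        wdt.kick()                      # heartbeat every 10 ms while healthy
    wdt.tick()

print(reset_times)
```

The last heartbeat arrives at 90 ms, so the watchdog fires once 50 ms have gone by without another one. What the callback does is the hard part: it is only useful if the reset it triggers reaches every peripheral.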
As noted previously, PCs do not properly utilize the "HARD" reset everywhere.
PCs sometimes require a "COLD" boot to properly startup. A Watchdog Timer depends on a
"WARM" boot that fully resets all hardware, which PCs just do not execute properly for
real-time operation.
Real-time computers must create a "HARD" reset:
- on powerup,
- due to a Watchdog Timer, and
- if any critical power supply voltage goes out of specification.
PCs don't provide this essential hardware.
What happens to a PC controller that crashes as the 30 ton steel part is moving across
a floor? People are killed because reset was not properly implemented.
And you thought reset was a simple function? Now for PCI.
PCI and Resets:
The PCI bus is, literally, a network, independent of the CPU, with central controllers.
The process of configuring a PC's PCI bus is called Plug'n'Play (PnP).
Every PCI card is first set to a known state by the PCI signal called RST#. Every PCI
card must initialize in response to this master reset signal. No operations are to occur
for 2 to the 25th power clock times (about 33.5 million clocks, or about 0.5 seconds on a
66 MHz PCI bus) after RST# is released. In other words, unlike older busses, on the PCI
bus the "HARD" reset period continues for some 33 million clocks after the reset signal
is released. Once reset, every PCI 'target' is unprogrammed and sits in a "ready to
configure" mode.
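The clock-count arithmetic can be verified with a short calculation. The bus frequencies are rounded to 33 MHz and 66 MHz for simplicity, and the function name `wait_seconds` is made up for this sketch.

```python
RESET_CLOCKS = 2 ** 25  # clocks to wait after RST# is released


def wait_seconds(bus_hz: int) -> float:
    """Time taken by RESET_CLOCKS clock periods at the given bus frequency."""
    return RESET_CLOCKS / bus_hz


for bus_hz in (33_000_000, 66_000_000):
    print(f"{bus_hz // 1_000_000} MHz bus: {RESET_CLOCKS:,} clocks "
          f"= {wait_seconds(bus_hz):.3f} s")
```

So the same clock count that takes about half a second at 66 MHz takes about one second on a 33 MHz bus.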
ISA peripherals were hardwired to known, standard addresses and interrupt channels --
sometimes changed by jumpers or DOS configuration software. PCI peripherals are
reprogrammed by PnP after a "HARD" reset. A "HARD" reset sets a PCI
peripheral (target) into a "ready to configure" mode. Each PCI target is then
read, identified, and configured with an I/O address and an interrupt channel number as
required. These assignments are based on information stored in the target's configuration
space of sixty-four 32 bit memory locations, including a Device ID, Vendor ID number,
built in self-test, and sometimes even special driver software for that target.
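The read, identify, and configure sequence can be sketched against simulated configuration spaces. In PCI configuration space the first 32 bit location holds the Vendor ID in its low 16 bits and the Device ID in its high 16 bits, and a Vendor ID of FFFFh means no device is present. Everything else below is a simplifying assumption: the names `make_config_space` and `enumerate_bus`, the ID numbers, and the way I/O addresses and interrupts are handed out (real PnP negotiates resources through further registers in each target).

```python
NO_DEVICE = 0xFFFF
EMPTY_SLOT = [0xFFFFFFFF] * 64  # an unpopulated slot reads back as all ones


def make_config_space(vendor_id, device_id):
    """Build a simulated 64 x 32 bit configuration space for one target."""
    space = [0] * 64
    space[0] = (device_id << 16) | vendor_id
    return space


def enumerate_bus(slots, io_base=0x1000, io_step=0x100, irqs=(9, 10, 11)):
    """Identify each target and hand it an I/O address and interrupt number."""
    found = []
    for slot, space in enumerate(slots):
        vendor_id = space[0] & 0xFFFF
        device_id = (space[0] >> 16) & 0xFFFF
        if vendor_id == NO_DEVICE:
            continue  # nothing answered in this slot
        found.append({
            "slot": slot,
            "vendor_id": vendor_id,
            "device_id": device_id,
            "io_address": io_base + len(found) * io_step,  # hypothetical allocator
            "irq": irqs[len(found) % len(irqs)],
        })
    return found


# Placeholder IDs for illustration only.
slots = [make_config_space(0x1234, 0x0001), EMPTY_SLOT, make_config_space(0xABCD, 0x0002)]
devices = enumerate_bus(slots)
for dev in devices:
    print(f"slot {dev['slot']}: {dev['vendor_id']:04X}:{dev['device_id']:04X} "
          f"io={dev['io_address']:04X}h irq={dev['irq']}")
```

Because every target starts unprogrammed after reset, the assignments depend only on what is found during this scan, not on jumpers or fixed addresses.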
When all PCI peripherals are programmed (typically by PnP), then "ready to
configure" mode terminates.
Hot swapping (removing a PCI card while power is still applied to the machine) is
permitted only on specially designed machines. On such machines, power is removed from one
PCI card slot, a card is removed or installed, and then that slot is powered on and reset
into a "ready to configure" mode. This would be an example of "HARD" resetting a PCI slot
using "SOFT" resets.
Another example of "SOFT" resets involves resetting one PCI bus that is
interconnected by a 'PCI to PCI bridge' (a bus master). "SOFT" reset is issued
to that bus's master so that bus master can "HARD" reset targets only on that
bus.
Summary:
A "COLD" boot creates a 'power-on reset' that, in turn, creates a
"HARD" reset. A "WARM" boot may or may not issue a "HARD"
reset. A "SOFT" reset can be created by a "WARM" boot or simply
by normal computer execution. A "HARD" reset can be created by a 'power-on
reset' or by other sources such as a Watchdog Timer or a voltage monitor.
Booting from a power on condition is a "COLD" boot. Booting without power
cycling is a "WARM" boot.