Bring-Up & First Power · #23 of 52

Rails, Clocks & Reset

Verifying the First Layer of Life

The smoke test passed. The current-limited supply ramped up, the board drew a sane idle current, nothing got hot, nothing let go. You are tempted to plug in the debugger and load firmware right now. Resist that. A board that has survived power is not the same as a board that is alive. Underneath the firmware there is a silent layer the CPU depends on without ever asking your permission: the rails that feed it, the clock that paces it, and the reset that tells it when to wake. Get one of those three wrong and your perfectly good code runs on a machine that was never born.

Before you trust a single instruction, prove the rails, the clock, and the reset.

The previous lesson got power onto the board safely. This one verifies that the power actually built a working substrate underneath the processor. There is an order here too, and it is the same shape as before: lowest layer first, highest layer last. You check the rails before the clock, and the clock before reset, because reset is only meaningful once voltage and timing already exist. Three things, in that order, and only then does firmware get a vote.

By the end, you can

Verify each rail by its voltage, its ripple, and its place in the power-up sequence
Explain why some SoCs and FPGAs latch up when I/O rails come up before the core rail
Confirm a crystal oscillator has started by checking both its frequency and its amplitude, and recognize a long-ground-lead ringing artifact
Confirm power-on reset (POR) asserts then releases with timing that lets the rails settle before the CPU runs

Intuition first

Think of bringing a board to life the way an anaesthetist thinks of waking a patient. You do not start by asking the patient to do arithmetic. You check the vital signs first, in a fixed order: is there blood pressure (the rails), is there a steady pulse (the clock), and did the patient actually come out of sedation cleanly rather than half-awake (reset). Only when those three read normal do you ask for the first conscious act. Skip the vitals and a confusing answer to "can you hear me?" might mean the patient is fine, or might mean you are talking to someone who is still under.

Firmware is that first conscious act. If the rails are low, or the clock never started, or reset released too early, the CPU may do something, and that something will look like a firmware bug. You will burn an afternoon single-stepping perfectly correct code on a machine whose heartbeat is missing. The whole point of this lesson is to make the first layer of life trustworthy so that when firmware misbehaves later, you know the fault is in the firmware and not in the body running it.

There are exactly three vitals, and each has a specific way it can lie to you.

Vital one: the rails (voltage, ripple, sequence)

A "rail" is one regulated supply voltage with its own distribution across the board. The robot hand's finger-driver board has three: a 3.3 V logic rail for the MCU and digital glue, a 5 V sensor rail for the analog front end and the magnetic encoders, and a motor rail (often 12 V or 24 V) for the H-bridges that drive the finger tendons. Each one has to pass three separate questions, and "the LED lit" answers none of them.

Voltage. Put the DMM on each rail at its decoupling capacitor, not at the connector, and read it against the same ground every time. A 3.3 V rail that measures 3.05 V is not "close enough." Many parts have a minimum operating voltage and a brown-out threshold a few hundred millivolts below nominal, and a rail sagging under load is the classic cause of a CPU that resets at random under heavy current draw. Write the number down per rail.
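The "write the number down" habit can be made mechanical. The sketch below is one hedged way to turn a DMM reading into a one-line verdict; the function name `rail_verdict` and the default window of +/-5 % are illustrative assumptions, not values from any datasheet, so substitute the limits your own parts specify.

```python
def rail_verdict(name, nominal_v, measured_v, tolerance=0.05):
    """Return a one-line verdict for a DMM reading against its nominal value.

    tolerance is a fractional window (0.05 means +/-5 %). It is an assumed
    placeholder: replace it with the limits from your parts' datasheets.
    """
    low = nominal_v * (1 - tolerance)
    high = nominal_v * (1 + tolerance)
    status = "OK" if low <= measured_v <= high else "OUT OF WINDOW"
    return f"{name}: {measured_v:.2f} V (window {low:.2f} to {high:.2f} V) {status}"


# The two readings mentioned in this lesson, checked against the assumed window
print(rail_verdict("3.3 V logic", 3.3, 3.29))
print(rail_verdict("3.3 V logic", 3.3, 3.05))
```

Under that assumed window the 3.05 V reading is flagged, which matches the point above: a rail a quarter of a volt low is a finding to log, not a rounding error.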

Ripple. A rail is never a flat DC line. The regulator switches, the load pulses, and what rides on top of the DC is ripple. You cannot see ripple on a multimeter (it averages it away); you need the oscilloscope, AC-coupled, on a tight scale. A few tens of millivolts of ripple on a 3.3 V logic rail is usually fine. Hundreds of millivolts, or a sharp spike that coincides with a motor commutation event, is a rail that will inject noise into your ADC readings and occasionally trip a brown-out. Ripple is the difference between "the average is right" and "the worst instant is safe."
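To make "the average is right" versus "the worst instant is safe" concrete, here is a minimal sketch that takes AC-coupled scope samples and reports peak-to-peak ripple. The capture is synthetic, and the 50 mV limit is an assumed placeholder rather than a rule; the function names are hypothetical.

```python
import math


def ripple_pp_mv(samples_v):
    """Peak-to-peak ripple in millivolts from AC-coupled scope samples (volts)."""
    return (max(samples_v) - min(samples_v)) * 1000.0


def ripple_verdict(samples_v, limit_mv=50.0):
    """Return (peak-to-peak mV, verdict). limit_mv is an assumed threshold."""
    pp = ripple_pp_mv(samples_v)
    return pp, ("OK" if pp <= limit_mv else "TOO HIGH")


# Synthetic capture: a 14 mV amplitude sine standing in for real ripple
capture = [0.014 * math.sin(2 * math.pi * k / 100) for k in range(1000)]

mean_mv = 1000.0 * sum(capture) / len(capture)
pp_mv, verdict = ripple_verdict(capture)
print(f"mean {mean_mv:.3f} mV, ripple {pp_mv:.1f} mV pp, {verdict}")

# One sharp 200 mV spike barely moves the mean but dominates the peak-to-peak
spiky = capture + [0.2]
print(ripple_verdict(spiky))
```

The mean of the capture is essentially zero in both cases, which is exactly what a multimeter would tell you. Only the peak-to-peak figure exposes the spike.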

Sequence. This is the one beginners miss, and it is the one that destroys parts. The rails must come up in a defined order, and they must stay within defined windows relative to each other while they ramp. Order matters because of how the chips are built, not because of preference. We will unpack the failure mode next, but the rule to hold now is: the core (lowest-voltage) rail generally comes up first, the I/O and analog rails follow, and the high-voltage motor rail is enabled last so a tendon never twitches before the brain that commands it has settled.
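If you can export a multi-channel power-up capture, the ordering rule can be checked in a few lines. This is a sketch under stated assumptions: the ramps are synthetic straight lines, a rail is treated as "valid" at 90 % of nominal (an assumed threshold), and the rail names and helper functions are hypothetical.

```python
def ramp(t, start_s, rise_s, final_v):
    """Idealised linear rail ramp: 0 V before start_s, final_v after the rise."""
    if t <= start_s:
        return 0.0
    if t >= start_s + rise_s:
        return final_v
    return final_v * (t - start_s) / rise_s


def first_crossing(times_s, volts, threshold_v):
    """Time of the first sample at or above threshold_v, or None if never reached."""
    for t, v in zip(times_s, volts):
        if v >= threshold_v:
            return t
    return None


def sequence_ok(valid_times_s, expected_order):
    """True if every rail became valid, strictly in the expected order."""
    times = [valid_times_s[name] for name in expected_order]
    if any(t is None for t in times):
        return False
    return all(a < b for a, b in zip(times, times[1:]))


# Synthetic capture, 0 to 20 ms in 0.1 ms steps (a stand-in for scope data)
times = [k * 1e-4 for k in range(201)]
rails = {
    "core_3v3": (3.3, [ramp(t, 0.000, 0.002, 3.3) for t in times]),
    "sensor_5v": (5.0, [ramp(t, 0.005, 0.002, 5.0) for t in times]),
    "motor_12v": (12.0, [ramp(t, 0.012, 0.003, 12.0) for t in times]),
}
# "Valid" means 90 % of nominal here, which is an assumption
valid_at = {name: first_crossing(times, volts, 0.9 * nominal)
            for name, (nominal, volts) in rails.items()}

print(valid_at)
print(sequence_ok(valid_at, ["core_3v3", "sensor_5v", "motor_12v"]))
```

A real check would also compare the rails against each other during the ramp, not only at the moment each one crosses its threshold; this sketch covers the ordering half of the rule.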

Why order matters: latch-up

Here is the physics, because "do it in this order or else" is a rule you will actually follow once you can see the or else.

Every CMOS input pin has protection structures, and the layout of the n-wells and p-substrate underneath them forms an unintended pair of parasitic bipolar transistors: a PNP and an NPN cross-wired into the structure of a thyristor (an SCR). In normal operation that parasitic device is dormant. But if an I/O pin is driven to a voltage above its own supply rail (which is exactly what happens when the I/O rail is still at zero while a 5 V sensor line is already live, or when the I/O rail comes up well ahead of the core), current is injected where it should not be, the parasitic transistors turn each other on, and the thyristor latches. Once latched it is a low-resistance path from supply to ground that the chip created inside itself, and it holds until you remove power, often after the part has already cooked.

So "core before I/O" is not superstition. It guarantees that no pin is ever driven beyond a rail that has not yet arrived. The current-limited supply from the previous lesson is your backstop here: if a sequencing error does provoke latch-up on first power, the limit pins the fault current and you watch the rail collapse instead of watching a part die. Sequence correctly and keep the leash on.

Vital two: the clock (frequency and amplitude)

Digital logic does nothing without a clock. On most boards the heartbeat starts at a crystal oscillator: a small slice of quartz that flexes when you put a voltage across it (inverse piezoelectricity) and generates a voltage as it springs back. Wired into an amplifier with feedback, that flex-and-spring becomes a self-sustaining oscillation at the crystal's mechanical resonance, and because quartz is mechanically sharp (a very high Q factor) the frequency is stable to parts per million. The MCU divides and multiplies that reference up into every clock the chip needs.

Startup is not instant. At power-on the amplifier places the crystal in an unstable equilibrium and amplifies its own thermal noise; the resonant frequency wins out and ramps up over a startup time that can be hundreds of microseconds to milliseconds. So confirming the clock means confirming two things, not one.

Frequency. Probe the crystal pin (usually the output side, often labelled XO or XOUT) and confirm the oscillation is at the rated frequency. A crystal marked 40 MHz that is running at 13 MHz is oscillating on the wrong mode or being pulled by a wrong load capacitor, and the MCU's derived clocks will all be wrong.

Amplitude. A crystal that is starting but not building to full swing is a crystal that will fail intermittently, especially cold or under supply noise. You want to see it reach a healthy peak-to-peak swing, not a sickly wobble near the noise floor. A weak swing points at a wrong load capacitance, too little drive, or a contaminated board near the oscillator.
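Both checks can be run on exported scope samples. The sketch below estimates frequency from upward crossings of the waveform's mean and reports peak-to-peak swing. The waveform is synthetic, and the 1 % frequency tolerance and 0.5 V minimum swing are assumed placeholders: the tolerance reflects what a short scope capture can resolve, not the crystal's ppm spec, and the minimum swing should come from your MCU's oscillator documentation.

```python
import math


def measure_clock(samples_v, sample_rate_hz):
    """Estimate frequency (Hz) and peak-to-peak swing (V) from scope samples."""
    mean = sum(samples_v) / len(samples_v)
    centered = [v - mean for v in samples_v]
    swing = max(samples_v) - min(samples_v)
    # Sample indices where the waveform crosses its mean going upward
    rising = [i for i in range(1, len(centered))
              if centered[i - 1] < 0 <= centered[i]]
    if len(rising) < 2:
        return None, swing          # not enough cycles to estimate frequency
    cycles = len(rising) - 1
    span_samples = rising[-1] - rising[0]
    return cycles * sample_rate_hz / span_samples, swing


def clock_verdict(freq_hz, swing_v, rated_hz, min_swing_v, freq_tol=0.01):
    """Both tests must pass. min_swing_v and freq_tol are assumed limits."""
    freq_ok = freq_hz is not None and abs(freq_hz - rated_hz) <= freq_tol * rated_hz
    return freq_ok and swing_v >= min_swing_v


# Synthetic 24 MHz oscillation sampled at 1 GS/s: 0.4 V amplitude on a 0.9 V bias
fs = 1e9
wave = [0.9 + 0.4 * math.sin(2 * math.pi * 24e6 * k / fs + 0.3) for k in range(2000)]
freq, swing = measure_clock(wave, fs)
print(f"{freq / 1e6:.2f} MHz, {swing:.2f} V pp")
print(clock_verdict(freq, swing, rated_hz=24e6, min_swing_v=0.5))

# Same frequency, sickly swing: frequency alone would have passed this
weak = [0.9 + 0.02 * math.sin(2 * math.pi * 24e6 * k / fs + 0.3) for k in range(2000)]
print(clock_verdict(*measure_clock(weak, fs), rated_hz=24e6, min_swing_v=0.5))
```

The second waveform is the case to remember: it sits at the rated frequency and still fails, because the verdict requires frequency and amplitude together.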

There is a measurement trap here that is so common it deserves its own warning. The crystal node is a high-impedance, low-amplitude analog signal, and a standard 10x scope probe with its long clip-on ground lead will load the oscillator and add ringing of its own. The ringing you then see is an artifact of your probe, not the signal on the board. Use a low-capacitance probe or an active probe, and use the shortest possible ground: the little ground spring tip right at the probe barrel, not the six-inch alligator lead. A long ground lead forms a loop inductance that resonates with the probe capacitance; the overshoot and ringing it produces are yours, manufactured by the way you measured. You will see this exact lesson again, in depth, when you study probing.

Vital three: reset (assert, then release, with timing)

The last vital is the one that decides when the CPU is allowed to start, and it is the most quietly dangerous because a board with a botched reset often half-works. A power-on reset (POR) is a small circuit whose job is to hold the processor in a known idle state while power and clock are still stabilizing, and then to let go cleanly once everything is valid. It must do two things in order: first assert (drive reset active and keep the CPU pinned), then release (deassert and let the CPU fetch its first instruction) only after the rails have settled and the oscillator is running.

The classic implementation is an RC network feeding a Schmitt-trigger input. When power arrives, the capacitor charges through the resistor; the reset line is held active until the rising RC voltage crosses the Schmitt threshold, at which point reset releases. The R and C are chosen so the charge time is longer than the worst-case time for the supply to stabilize and the clock to start. The Schmitt trigger matters because it gives a clean, hysteretic edge on release; without it, a slowly rising node near the threshold would chatter the reset line and the CPU might start, stop, and start again.

Confirming reset on the bench means watching the reset pin on the scope, triggered on power-up, and seeing the shape: it goes active when power appears, stays active through the rail ramp and the oscillator startup, and then makes one clean release edge after the rails are valid. If reset releases before the rails settle, the CPU starts on a sagging supply and runs from an undefined state, the kind of fault that masquerades as a random firmware crash for days.
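Once you have timestamps off the scope, the pass/fail rule is simple arithmetic: reset must release after the latest dependency, with margin. The sketch below encodes that; the function name, the rail names, the example timestamps, and the 1 ms margin are all illustrative assumptions.

```python
def reset_release_verdict(rail_valid_s, osc_stable_s, reset_release_s, margin_s=0.0):
    """Return (slack in seconds, passed).

    rail_valid_s maps each rail the CPU depends on to the time it became valid.
    Slack is how long after the latest dependency reset actually released;
    a negative slack means the CPU was let go too early.
    """
    latest_dependency = max(max(rail_valid_s.values()), osc_stable_s)
    slack = reset_release_s - latest_dependency
    return slack, slack >= margin_s


# Hypothetical timestamps read off a power-up capture, in seconds
rails = {"core_3v3": 0.002, "sensor_5v": 0.006}

good = reset_release_verdict(rails, osc_stable_s=0.0075, reset_release_s=0.020,
                             margin_s=0.001)
early = reset_release_verdict(rails, osc_stable_s=0.0075, reset_release_s=0.005,
                              margin_s=0.001)
print(good)
print(early)
```

In the second case reset lets go before the oscillator has settled, so the slack is negative and the check fails regardless of how clean the release edge looks.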

See it: the three vitals as a sequence

No interactive widget here, just the picture you are verifying. Read it bottom to top, because that is the order the board comes alive and the order you check it:

Figure: a timing diagram showing the core rail, sensor rail, motor rail, oscillator, and reset release sequence. Reset release belongs to the right of every dependency. The first instruction should happen only after the rails are valid and the clock has settled. (TooFoo original SVG)

The debrief is the whole lesson in one sentence: the CPU's first instruction must land to the right of every other event on that diagram. Rails valid, clock swinging, reset released, in that order, with margin. If your scope shows reset releasing while a rail is still climbing or the crystal is still a wobble, you have found a real fault before firmware ever got the blame.

On the finger-driver board the 5 V sensor rail comes up several milliseconds before the 3.3 V core rail. On a larger SoC this same out-of-order ramp risks what failure?

You probe the crystal pin with a 10x scope probe on its six-inch alligator ground lead and see heavy ringing on the waveform. What is the most likely explanation?

Lab: the vitals pass

With the board on the current-limited supply from last lesson, run the three vitals in order and log a one-line verdict for each.

Rails: DMM each rail at its decoupling cap for voltage (3.3 V rail → 3.29 V, OK), then scope each rail AC-coupled for ripple (logic rail ripple → 28 mV pp, OK).
Sequence: if you have a multi-channel scope, capture two rails together on power-up and confirm core leads I/O and the motor rail comes last; if not, reason it from the enable chain on the schematic.
Clock: with a low-capacitance probe and a short ground spring, confirm the crystal reaches rated frequency and full amplitude.
Reset: scope the reset pin triggered on power-up and confirm one clean release edge that lands after the rails are valid.

You should leave the bench with five logged numbers and a clear statement: this board has rails, a clock, and a clean reset, in the right order. Now firmware gets a vote.

Why the RC reset fights the supply ramp, and what a real crystal startup looks like in math

The standard power-on reset is an RC network into a Schmitt trigger. Call the reset threshold $V_{th}$ and the rail it charges toward $V_{cc}$. With the supply already stable, the node voltage follows

$$V_C(t) = V_{cc}\left(1 - e^{-t/RC}\right)$$

and reset releases when $V_C$ reaches $V_{th}$, at time $t_{rel} = RC \ln\!\left(\dfrac{V_{cc}}{V_{cc} - V_{th}}\right)$. You size $RC$ so that $t_{rel}$ exceeds the worst-case sum of supply-settling time and oscillator startup time, giving the rails and clock margin to become valid before the CPU runs.
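The release-time formula is easy to evaluate in both directions: given R and C, how long is reset held, and given a required hold time, what RC do you need? The sketch below does both. The component values and the 2.0 V threshold are illustrative assumptions, and a real design would also budget for component and threshold tolerance.

```python
import math


def release_time_s(r_ohm, c_farad, vcc, vth):
    """t_rel = RC * ln(Vcc / (Vcc - Vth)), assuming Vcc is already stable."""
    if not 0 < vth < vcc:
        raise ValueError("threshold must lie between 0 and Vcc")
    return r_ohm * c_farad * math.log(vcc / (vcc - vth))


def required_rc_s(t_needed_s, vcc, vth):
    """RC product needed so that reset is held for at least t_needed_s."""
    if not 0 < vth < vcc:
        raise ValueError("threshold must lie between 0 and Vcc")
    return t_needed_s / math.log(vcc / (vcc - vth))


# Assumed example: 100 kOhm, 100 nF, 3.3 V rail, 2.0 V Schmitt threshold
t_rel = release_time_s(100e3, 100e-9, vcc=3.3, vth=2.0)
print(f"reset held for about {t_rel * 1e3:.2f} ms")

# Working backwards from an assumed 20 ms requirement
rc = required_rc_s(0.020, vcc=3.3, vth=2.0)
print(f"need RC of at least {rc * 1e3:.2f} ms")
```

Note that the hold time comes out a little shorter than the RC product here, because the threshold sits below the 63 % point of the charge curve. Both functions inherit the formula's assumption of a fast, already-stable supply.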

The trap the Wikipedia article on power-on reset names is the slow supply ramp. The clean formula above assumes $V_{cc}$ snaps to its final value, so the capacitor charges relative to a fixed target. If instead $V_{cc}(t)$ rises slowly, the capacitor charges alongside the rail rather than lagging it, and the node can already sit above $V_{th}$ by the time the Schmitt stage is itself powered enough to act. No reset pulse is delivered to the core. This is precisely why a bare RC is fragile and why production designs use a supervisor IC that monitors the actual rail voltage against a bandgap reference, asserting reset whenever the rail is below a guaranteed-valid level regardless of how fast or slow it ramps.
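The slow-ramp failure can be seen with a simple model. This is a sketch under three loudly stated assumptions that are not from the source: the supply ramps linearly from zero, the Schmitt threshold is a fixed fraction of the instantaneous rail, and the Schmitt stage only starts working once the rail reaches a wake voltage `v_wake`. For a linear ramp the RC node obeys $V_C(t)/V_{cc}(t) = 1 - (RC/t)\left(1 - e^{-t/RC}\right)$, so the question is how high that fraction already is at the moment the logic wakes.

```python
import math


def node_fraction_at_wake(rc_s, ramp_time_s, vcc_final, v_wake):
    """Fraction V_C / V_cc at the instant the rail reaches v_wake.

    Assumes a linear supply ramp from 0 V to vcc_final over ramp_time_s.
    """
    t_wake = ramp_time_s * v_wake / vcc_final
    x = t_wake / rc_s
    # 1 - exp(-x) written with expm1 to stay accurate when x is tiny
    return 1.0 - (-math.expm1(-x)) / x


def reset_pulse_delivered(rc_s, ramp_time_s, vcc_final=3.3, v_wake=1.8,
                          vth_fraction=0.6):
    """True if the RC node is still below threshold when the Schmitt stage wakes.

    v_wake and vth_fraction are assumed, illustrative values.
    """
    return node_fraction_at_wake(rc_s, ramp_time_s, vcc_final, v_wake) < vth_fraction


rc = 1e-3  # 1 ms reset time constant (assumed)
for ramp in (10e-6, 1e-3, 50e-3):
    frac = node_fraction_at_wake(rc, ramp, 3.3, 1.8)
    print(f"ramp {ramp * 1e3:7.3f} ms: node at {frac:.3f} of rail, "
          f"reset pulse delivered: {reset_pulse_delivered(rc, ramp)}")
```

With a fast ramp the capacitor has barely moved when the logic wakes, so reset is asserted as intended. With a ramp much slower than RC, the node is already riding just under the rail and the reset stage wakes up to find its input above threshold.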

On the clock side, a crystal models as a high-Q RLC branch (motional $L_1$, $C_1$, $R_1$) in parallel with a static shunt capacitance $C_0$, with series and parallel resonances at

$$\omega_s = \frac{1}{\sqrt{L_1 C_1}}, \qquad \omega_p = \omega_s \sqrt{1 + \frac{C_1}{C_0}} .$$
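Plugging numbers into the two resonance formulas shows how close together they sit. The motional values below are assumed, order-of-magnitude figures chosen so the series resonance lands at 24 MHz; they do not come from any particular crystal's datasheet.

```python
import math


def series_resonance_hz(l1_h, c1_f):
    """f_s = 1 / (2*pi*sqrt(L1*C1))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l1_h * c1_f))


def parallel_resonance_hz(l1_h, c1_f, c0_f):
    """f_p = f_s * sqrt(1 + C1/C0)."""
    return series_resonance_hz(l1_h, c1_f) * math.sqrt(1.0 + c1_f / c0_f)


# Assumed values: C1 = 20 fF motional, C0 = 5 pF shunt, L1 solved for 24 MHz
c1 = 20e-15
c0 = 5e-12
l1 = 1.0 / ((2.0 * math.pi * 24e6) ** 2 * c1)

f_s = series_resonance_hz(l1, c1)
f_p = parallel_resonance_hz(l1, c1, c0)
print(f"L1 = {l1 * 1e3:.2f} mH")
print(f"f_s = {f_s / 1e6:.4f} MHz, f_p = {f_p / 1e6:.4f} MHz")
print(f"gap = {(f_p - f_s) / 1e3:.1f} kHz")
```

Because $C_1$ is tiny compared with $C_0$, the two resonances differ by only a fraction of a percent. The oscillator runs somewhere in that narrow gap, with the load capacitors setting exactly where.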

Startup is exponential: the loop's small-signal gain must exceed unity, and the amplitude grows roughly as $e^{t/\tau}$ until amplifier nonlinearity clamps it, where $\tau$ grows with $Q$ and shrinks as the negative resistance the circuit presents increases. A typical quartz $Q$ of $10^4$ to $10^6$ (against perhaps $10^2$ for an LC oscillator) is why the frequency is so stable and why startup, though not instant, settles to a sharp line. That same high-impedance, low-amplitude node is exactly what a long probe ground lead corrupts: the lead's loop inductance $L_{lead}$ resonates with the probe input capacitance $C_{probe}$ near $f \approx 1/(2\pi\sqrt{L_{lead} C_{probe}})$, manufacturing ringing that is an artifact of the instrument, not of the board. One difference in emphasis from the Wikipedia crystal article is worth noting: it stresses long-term accuracy and aging, while on the bench your only question is binary: did it start, at the right frequency and at full amplitude? The long-term ppm story matters for product calibration, not for first bring-up.
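The ground-lead resonance is worth putting numbers on. The values below are rough assumptions for illustration only: about 150 nH for a six-inch clip lead, about 10 nH for a short ground spring, and 12 pF of probe input capacitance. Check your own probe's datasheet for the real capacitance.

```python
import math


def ringing_frequency_hz(l_lead_h, c_probe_f):
    """f = 1 / (2*pi*sqrt(L_lead * C_probe))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_lead_h * c_probe_f))


C_PROBE = 12e-12      # assumed 10x passive probe input capacitance
LONG_LEAD = 150e-9    # assumed inductance of a six-inch ground lead
SHORT_SPRING = 10e-9  # assumed inductance of a ground spring tip

f_long = ringing_frequency_hz(LONG_LEAD, C_PROBE)
f_short = ringing_frequency_hz(SHORT_SPRING, C_PROBE)
print(f"long lead rings near {f_long / 1e6:.0f} MHz")
print(f"short spring rings near {f_short / 1e6:.0f} MHz")
```

Under these assumptions the long lead resonates in the low hundreds of megahertz, close enough to the edges and harmonics of a tens-of-megahertz clock to show up on screen. Shortening the ground pushes the resonance several times higher, away from the signal you are trying to see.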

Grounded in Wikipedia: "Crystal oscillator", "Power-on reset" (CC BY-SA).

Key takeaways

A board that survived power is not yet alive: prove rails, clock, and reset before firmware gets a vote.
Each rail must pass three tests, not one: voltage (DMM), ripple (scope, AC-coupled), and sequence (the right order, with margin).
Out-of-order rails can latch up an SoC or FPGA: an I/O pin driven above a not-yet-present rail fires a parasitic thyristor into a self-made short.
Confirm the crystal by both frequency and amplitude, with a low-capacitance probe and a short ground. A long ground lead's ringing is your artifact, not the signal.
Power-on reset must assert then release, with the release edge landing only after rails settle and the clock runs.

Practice 1 warm-up

You measure the finger-driver board's rails: 3.3 V logic reads 3.28 V with 25 mV of ripple, 5 V sensor reads 4.97 V with 40 mV of ripple, and the motor rail reads its nominal 12 V with 90 mV of ripple. Which single measurement, by itself, is missing from this report that the lesson insists you also verify?

Worked solution

The sequence. You have voltage and ripple for all three rails, which are two of the three rail tests, but nothing here proves the rails came up in the right order (core 3.3 V first, 5 V sensor next, motor rail last). Two rails captured together on a multi-channel scope at power-up, or a reasoned trace of the enable chain on the schematic, is the missing measurement. Voltage and ripple can all read perfect on a board that still latches up because it sequenced wrong.

Practice 2 core

A crystal is rated 24 MHz. On the scope you see oscillation at 24 MHz, but the peak-to-peak swing sits low, barely above the noise, and the board boots only sometimes, more often when warm. Is the clock "confirmed"? What do you check, in order?

Worked solution

No, it is not confirmed. Confirming a crystal requires both frequency and amplitude, and here amplitude has failed even though frequency is right. A weak swing that is worse cold and intermittent on boot is a marginal oscillator. Check, in order: (1) your measurement first, low-capacitance/active probe with a short ground spring, so you are not looking at a probe-loaded waveform; (2) the load capacitors against the crystal's specified load capacitance, a wrong value detunes and weakens the loop; (3) the oscillator drive level / gain setting in the MCU's clock config; (4) board contamination or flux residue near the high-impedance crystal node. A clock that starts but does not build to full amplitude is a field failure waiting to happen, so it must be fixed before firmware is trusted.

Practice 3 stretch

A board uses a simple RC-into-Schmitt power-on reset sized for a fast supply ramp. The team adds a large bulk capacitor to the input for noise reasons, which slows the supply ramp from microseconds to several milliseconds. The board now occasionally boots into a garbage state. Explain the mechanism and give a robust fix.

Worked solution

The RC reset was sized assuming the rail snaps up fast, so the reset capacitor lags the rail and crosses the Schmitt threshold after the supply is valid. By slowing the ramp to milliseconds, the reset capacitor now charges alongside the rising rail. By the time the Schmitt-trigger stage is itself powered enough to act, its input has already passed the threshold, so no reset pulse reaches the core and the CPU boots from an undefined state, intermittently, depending on exact ramp shape. The robust fix is to stop watching a bare RC capacitor and instead watch the actual rail voltage: use a dedicated supervisor (brown-out) IC that holds reset asserted until the rail crosses a bandgap-referenced valid threshold and only then releases, independent of how fast or slow the supply ramps. That guarantees assert-then-release timing for any ramp rate.

The boards that waste the most of an engineer's life are not the ones that stay dark; those send you straight to the rails with a meter. It is the board whose silent first layer is almost right, the rail that sags a tenth of a volt under load, the crystal that starts but never quite swings, the reset that lets go a hair too soon. Check the vitals before you ask for the first conscious act, and when the firmware finally misbehaves, you will know you are debugging the mind and not the body it was born into.
