
Observability Belongs on the PC, Not in the Production Binary
Part 7 covered host-first testing. Part 8 added hardware-in-the-loop testing with an IoTest image and a Python harness. Part 9 is about what you do when the system is running and you need to understand behavior without turning the firmware into a logging framework.
The guiding principle is simple:
- Keep target observability minimal and deterministic.
- Move heavy analysis, visualization, and introspection to the host.
This avoids firmware bloat, keeps timing predictable, and makes debugging better rather than noisier.
The boundary: on-target telemetry vs off-target analysis
Most embedded observability failures come from mixing these concerns:
- On-target code tries to format rich logs, allocate strings, and emit verbose traces.
- Those logs change timing, overflow buffers, and create new failure modes.
- Developers then debug the logging system instead of the firmware.
A better split:
- On target: emit small, fixed-format events and counters.
- Off target: decode, correlate, visualize, and analyze.
The target should produce data. The host should produce insight.
What “minimal” looks like on the target
Minimal does not mean “no observability.” It means “observability that cannot break determinism.”
A good target-side observability set:
- Counters
  - loop slip count
  - queue overflow counts
  - parser error counts
  - watchdog resets, brownout events
- State snapshots
  - current mode/state id
  - last fault code
  - a small set of key inputs and outputs
- Event stream (optional)
  - fixed-size event records in a ring buffer
  - drained periodically, not emitted from ISRs unless absolutely necessary
Avoid:
- dynamic formatting
- iostreams
- variable-length strings
- “log everything” builds shipped as production candidates
Unsolicited advice: if your observability changes the system’s behavior, it is not observability. It is a new subsystem.
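To make the counter and snapshot set concrete, here is a minimal sketch of a fixed-size health block. All field names and widths are illustrative assumptions, not a prescribed layout:

```cpp
#include <cstdint>

// Illustrative fixed-size counter/snapshot block. Every field is a plain
// integer, so copying or transmitting a snapshot costs a fixed, known amount
// and never allocates or formats anything on the target.
struct HealthCounters final {
    std::uint32_t loop_slips{0U};      // control loop deadline misses
    std::uint32_t queue_overflows{0U}; // records dropped at full queues
    std::uint32_t parser_errors{0U};   // malformed frames on the wire
    std::uint16_t watchdog_resets{0U}; // resets observed since power-on
    std::uint16_t last_fault_code{0U}; // 0 means "no fault"
    std::uint8_t mode_id{0U};          // current state machine mode
};

static_assert(sizeof(HealthCounters) <= 32U, "snapshot must stay small");
```

Because the struct is trivially copyable, taking a snapshot is a single fixed-cost copy, which keeps the read side deterministic as well.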
Use fixed-size event records
If you need traces, use fixed-size records so storage and bandwidth are predictable.
A typical record:
- timestamp or tick count
- event id
- a small number of integral parameters
Keep it boring. Boring is debuggable.
One tight C++ example:
```cpp
#include <array>
#include <cstddef>
#include <cstdint>

struct TraceEvent final {
    std::uint32_t ticks{0U}; // timestamp in system ticks
    std::uint16_t id{0U};    // event id, mapped to a name on the host
    std::int32_t a{0};       // event-specific parameters
    std::int32_t b{0};
};

template <std::size_t N>
class TraceBuffer final {
public:
    // Fixed-capacity ring buffer: when full, the oldest record is
    // overwritten and the loss is counted explicitly.
    void push(const TraceEvent& e) noexcept {
        this->buf_[this->write_] = e;
        this->write_ = (this->write_ + 1U) % N;
        if(this->count_ < N) {
            ++this->count_;
        } else {
            ++this->drop_count_;
        }
    }

    [[nodiscard]] std::uint32_t drop_count() const noexcept { return this->drop_count_; }

private:
    std::array<TraceEvent, N> buf_{};
    std::size_t write_{0U};
    std::size_t count_{0U};
    std::uint32_t drop_count_{0U};
};
```
Notes:
- This is deterministic: fixed memory, fixed record size, explicit drop behavior.
- You can flush it on demand via a command or periodically in a non-hot path.
If you are using ETL for target containers, the same pattern applies. The principle is fixed capacity and explicit overflow behavior.
Prefer binary on the wire, decode on the host
Human-readable ASCII is great for IoTest bring-up and a small set of status queries. But for ongoing observability, binary is usually the right default:
- predictable size
- lower bandwidth
- less time spent formatting on the MCU
- easier to version and evolve
You can still keep it debuggable by decoding on the host into human-readable form.
A practical pattern:
- On target: emit compact records with ids and integers.
- On host: map ids to names, apply scaling, and render rich views.
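As a sketch of that pattern, assuming a hypothetical 8-byte record (the field names, id table, and scaling are invented for illustration, not a fixed protocol):

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical 8-byte wire record. The layout is only a sketch; a real
// protocol must also pin down endianness (memcpy reproduces host byte order).
struct WireRecord final {
    std::uint32_t ticks{0U};     // tick count at emission
    std::uint16_t id{0U};        // numeric event/signal id
    std::uint16_t raw_value{0U}; // e.g. millivolts; the host applies scaling
};

// Target side: serialize with a fixed-size copy. No formatting, no allocation.
inline std::array<std::uint8_t, sizeof(WireRecord)> encode(const WireRecord& r) noexcept {
    std::array<std::uint8_t, sizeof(WireRecord)> out{};
    std::memcpy(out.data(), &r, sizeof(r));
    return out;
}

// Host side: recover the record from raw bytes.
inline WireRecord decode(const std::array<std::uint8_t, sizeof(WireRecord)>& bytes) noexcept {
    WireRecord r{};
    std::memcpy(&r, bytes.data(), sizeof(r));
    return r;
}

// Host side: map the numeric id to a human-readable name.
inline std::string describe(const WireRecord& r) {
    const char* name = (r.id == 1U) ? "vbus_mv" : "unknown"; // host-side id table
    return std::string{name} + "=" + std::to_string(r.raw_value);
}
```

The expensive, flexible parts (the id table, scaling, string formatting) live entirely on the host, where they cannot perturb target timing.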
Make “versioning” part of your protocol
Observability that cannot evolve becomes a liability.
Include:
- firmware build id
- protocol version
- record schema version for trace events
This avoids silent mismatches where tooling decodes the wrong format and produces nonsense.
Unsolicited advice: schema mismatch bugs waste days. Version everything.
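One way to sketch this is a small header emitted once per stream; the field names and magic value below are hypothetical:

```cpp
#include <cstdint>

// Hypothetical stream header emitted once per connection or trace file.
// Decoders reject data whose versions they do not understand instead of
// silently decoding the wrong format.
struct StreamHeader final {
    std::uint32_t magic{0x54524143U};   // arbitrary sanity marker
    std::uint32_t build_id{0U};         // firmware build identifier
    std::uint16_t protocol_version{1U}; // framing/transport version
    std::uint16_t schema_version{1U};   // trace record layout version
};

// Host-side gate: refuse loudly rather than produce nonsense.
inline bool decoder_accepts(const StreamHeader& h,
                            std::uint16_t supported_schema) noexcept {
    return (h.magic == 0x54524143U) && (h.schema_version == supported_schema);
}
```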
Host-side tooling: where insight should live
If you keep the target signal clean, host tooling can be as rich as you want:
- trace decoding into JSON or CSV
- timeline views
- state transition diagrams
- slip and overflow dashboards
- correlation with test scenarios
This is also where you can afford heavier dependencies: parsers, GUI libraries, plotting libraries, data processing pipelines.
If your workflow is built around GitLab pipelines, host tooling also becomes a first-class artifact:
- collected traces as pipeline artifacts
- automatic decoding jobs
- visual reports attached to merge requests
CI support: make observability actionable, not just available
Observability data is useful only if it is used.
Good pipeline patterns:
- HIL jobs upload trace artifacts.
- A decode job turns traces into readable reports.
- Thresholds fail the pipeline when they indicate regressions:
  - slip count increased beyond a limit
  - overflow counters non-zero
  - unexpected fault codes
- Reports are retained for comparison across releases.
Unsolicited advice: treat overflow counters like failed assertions. If you see them, you are already outside your design envelope.
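A sketch of such a gate as a small host-side check run after decoding (the threshold values and field names are illustrative):

```cpp
#include <cstdint>

// Counters extracted from a decoded trace for one HIL run.
struct RunStats final {
    std::uint32_t slip_count{0U};
    std::uint32_t overflow_count{0U};
    std::uint16_t fault_code{0U};
};

// CI gate: overflows and faults get zero tolerance, like failed assertions;
// slips are allowed only up to an explicit budget.
inline bool within_envelope(const RunStats& s, std::uint32_t max_slips) noexcept {
    return (s.overflow_count == 0U) &&
           (s.fault_code == 0U) &&
           (s.slip_count <= max_slips);
}
```

Wiring this into a pipeline is then just a process exit code: the decode job returns nonzero when `within_envelope` is false, and the pipeline fails.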
What not to do
Avoid these common traps:
- Shipping verbose logging in production builds “just in case.”
- Printing from ISRs.
- Allocating memory to build log strings on target.
- Using observability that depends on timing-sensitive host reads.
- Adding “temporary debug code” that becomes permanent.
If you need deep introspection, build a separate debug or IoTest variant and keep production deterministic.
Minimal checklist
- On target: counters, minimal snapshots, fixed-size event records, explicit drop behavior.
- On wire: prefer compact binary for traces, decode on host.
- Version everything: build id, protocol version, schema version.
- On host: rich analysis, visualization, automated report generation.
- In CI: store trace artifacts, decode automatically, and enforce regression thresholds.
Part 10 will tie everything together: a GitLab pipeline blueprint and an incremental migration checklist, including how to structure jobs, enforce quality gates, and keep “deterministic firmware discipline” from becoming optional under schedule pressure.
The Complete "Modern C++ Firmware: Proven Strategies for Tiny, Critical Systems" Series:
- Part 1/10: The Case for Modern C++ on Tiny, Safety Critical Targets
- Part 2/10: Choosing C++20 Today, C++23 on a Short Leash
- Part 3/10: Deterministic By Construction: The Rules You Do Not Cross
- Part 4/10: Time and Scheduling Without Footguns
- Part 5/10: Concepts for Hardware Platforms, Not Vtables
- Part 6/10: No Allocation in the Loop: Memory Rules That Survive CI
- Part 7/10: Test the Firmware Without the Board: Host First Strategy
- Part 8/10: Python and ASCII Protocols for Hardware in the Loop
- Part 9/10: Observability Belongs on the PC, Not in the Production Binary
- Part 10/10: GitLab Pipeline Blueprint and a Migration Checklist
Need professional firmware development help? Engage with Polyrhythm
Discover more from John Farrier