Observability Viewpoint
Purpose
The Observability Viewpoint focuses on the design of the software tracing facilities, as the main method for obtaining diagnostic information from a running system consisting of Active Objects. This design viewpoint focuses on the software tracing component inside the QP/C Framework, called QS (from "QP Spy").
Design Concerns
One of the most important design elements of any software tracing system is its data transmission protocol. The protocol must be agreed upon by both the target-resident component and the host-resident component of the whole software tracing system (called QP/Spy).
The QP/Spy protocol is lightweight, but contains the mechanisms for checking both data integrity and continuity. It borrows many elements from the High-Level Data Link Control (HDLC) protocol defined by the International Organization for Standardization (ISO). The QP/Spy protocol has been specifically designed to minimize the data management overhead in the target, yet to allow detection of any data dropouts due to trace buffer overruns. The protocol not only has provisions for detecting gaps in the data and other errors, but it also allows for instantaneous re-synchronization after any buffering or transmission error to minimize loss of useful data.
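The continuity check can be illustrated with a minimal receiver-side sketch. It assumes that the Frame Sequence Number is a single byte that the target increments by one for every frame; the names SeqTracker and seq_gap() are hypothetical and are not part of QS or QSPY.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical receiver-side state for the continuity check. */
typedef struct {
    uint8_t last_seq;  /* sequence number of the most recent frame */
    int     started;   /* 0 until the first frame has been seen */
} SeqTracker;

/* Returns the number of frames that appear to be missing before 'seq'
 * (0 means the stream is continuous). The arithmetic is modulo 256,
 * so the wrap-around from 0xFF to 0x00 is not reported as a gap. */
static unsigned seq_gap(SeqTracker *t, uint8_t seq) {
    unsigned missing = 0U;
    assert(t != NULL);
    if (t->started) {
        uint8_t expected = (uint8_t)(t->last_seq + 1U);
        missing = (uint8_t)(seq - expected);
    }
    t->last_seq = seq;
    t->started  = 1;
    return missing;
}
```

With a one-byte counter, a dropout of exactly 256 frames would go unnoticed in this sketch, which is the usual trade-off of a small sequence number.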
The QP/Spy protocol transmits each trace record in an HDLC-like frame. The upper part of the figure above shows the serial data stream transmitted from the target, containing frames of different lengths. The bottom part of the figure above shows the details of a single frame:
One of the most essential characteristics of HDLC-type protocols is establishing very easily identifiable frames in the serial data stream. Any receiver of such a protocol can instantaneously synchronize to the frame boundary by simply finding the Flag byte, which in HDLC is binary 01111110 (hexadecimal 0x7E). This works because the special Flag byte can never occur within the content of a frame. To avoid confusing unintentional Flag bytes that can naturally occur in the data stream with an intentionally sent Flag, HDLC uses a technique known as transparency (a.k.a. byte-stuffing or escaping) to make the Flag bytes transparent during transmission. Whenever the transmitter encounters a Flag byte in the data, it inserts a two-byte escape sequence into the output stream. The first byte is the Escape byte, defined as binary 01111101 (hexadecimal 0x7D). The second byte is the original byte XOR-ed with 0x20.
Of course, now the Escape byte itself must also be transparent to avoid interpreting an unintentional Escape byte as the two-byte escape sequence. The procedure of escaping the Escape byte is identical to that of escaping the Flag byte.
The transparency of the Flag and Escape bytes complicates the computation of the Checksum slightly. The transmitter computes the Checksum over the Frame Sequence Number, the Record-Type, and all Data bytes before performing any "byte-stuffing". The receiver must apply the exact reversed procedure of performing the "byte-un-stuffing" before computing the Checksum.
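The transmitter-side steps described above (checksum over the un-stuffed bytes, then byte-stuffing, then the terminating Flag) can be sketched as follows. This is an illustration rather than the QS implementation: the function and macro names are hypothetical, and the Flag value 0x7E is taken from standard HDLC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_FLAG 0x7EU  /* frame delimiter (HDLC Flag) */
#define FRAME_ESC  0x7DU  /* Escape byte */
#define FRAME_XOR  0x20U  /* value XOR-ed into an escaped byte */

/* Append one byte to 'out', escaping it if it equals Flag or Escape. */
static size_t put_stuffed(uint8_t *out, size_t pos, size_t cap, uint8_t b) {
    if ((b == FRAME_FLAG) || (b == FRAME_ESC)) {
        assert(pos + 2U <= cap);
        out[pos++] = (uint8_t)FRAME_ESC;
        out[pos++] = (uint8_t)(b ^ FRAME_XOR);
    }
    else {
        assert(pos + 1U <= cap);
        out[pos++] = b;
    }
    return pos;
}

/* Build one frame: Seq, Record-Type, Data..., Checksum, Flag.
 * The checksum is accumulated over the original (un-stuffed) bytes.
 * Returns the number of bytes written to 'out'. */
static size_t frame_encode(uint8_t seq, uint8_t rec,
                           uint8_t const *data, size_t len,
                           uint8_t *out, size_t cap)
{
    uint8_t sum = (uint8_t)(seq + rec);
    size_t pos = 0U;

    pos = put_stuffed(out, pos, cap, seq);
    pos = put_stuffed(out, pos, cap, rec);
    for (size_t i = 0U; i < len; ++i) {
        sum = (uint8_t)(sum + data[i]);
        pos = put_stuffed(out, pos, cap, data[i]);
    }
    pos = put_stuffed(out, pos, cap, (uint8_t)~sum); /* checksum is stuffed too */

    assert(pos + 1U <= cap);
    out[pos++] = (uint8_t)FRAME_FLAG; /* the Flag itself is never escaped */
    return pos;
}
```

In the worst case every byte needs escaping, so an output buffer of 2 * (len + 3) + 1 bytes is always sufficient for this sketch.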
An example may make this clearer. Suppose that the following trace record needs to be inserted into the trace buffer (the transparent bytes are shown in code format):
Record-Type = 0x7D, Record Data = 0x7D 0x08 0x01
Assuming that the current Frame Sequence Number is, say, 0x7E, the Checksum will be computed over the following bytes:
Checksum == (uint8_t)(~(0x7E + 0x7D + 0x7D + 0x08 + 0x01)) == 0x7E
And the actual frame inserted into the QS trace buffer will be as follows:
0x7D 0x5E 0x7D 0x5D 0x7D 0x5D 0x08 0x01 0x7D 0x5E 0x7E
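On the receiving side, the frame above can be recovered by reversing the steps: un-stuff each byte, accumulate the sum, and verify it when the Flag arrives. The sketch below is a hypothetical helper, not QSPY code. It relies on the fact that the sum of all un-stuffed bytes, including the Checksum, must be 0xFF modulo 256 when the Checksum is the bitwise inverse of the sum of the other bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_FLAG 0x7EU  /* frame delimiter (HDLC Flag) */
#define FRAME_ESC  0x7DU  /* Escape byte */
#define FRAME_XOR  0x20U  /* value XOR-ed into an escaped byte */

/* Decode the first frame found in 'in'. On success, 'out' holds the
 * un-stuffed Seq, Record-Type and Data bytes and the function returns
 * their count (the Checksum is not counted). Returns -1 on any error. */
static int frame_decode(uint8_t const *in, size_t n,
                        uint8_t *out, size_t cap)
{
    size_t pos = 0U;
    uint8_t sum = 0U;
    int esc = 0;  /* 1 when the previous byte was the Escape byte */

    assert((in != NULL) && (out != NULL));
    for (size_t i = 0U; i < n; ++i) {
        uint8_t b = in[i];
        if (b == FRAME_FLAG) {
            /* need at least Seq + Record-Type + Checksum */
            if (esc || (pos < 3U) || (sum != 0xFFU)) {
                return -1;
            }
            return (int)(pos - 1U);
        }
        if (esc) {
            b = (uint8_t)(b ^ FRAME_XOR);
            esc = 0;
        }
        else if (b == FRAME_ESC) {
            esc = 1;
            continue;
        }
        if (pos >= cap) {
            return -1;
        }
        out[pos++] = b;
        sum = (uint8_t)(sum + b);
    }
    return -1;  /* no Flag found */
}
```

After an error, a receiver built this way can simply discard bytes up to the next Flag and start over, which is the re-synchronization property of the framing.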
In addition to the HDLC-like framing, the QP/Spy transmission protocol specifies the endianness of the data to be little-endian. All multi-byte data elements, such as 16-, 32-, or 64-bit integers, pointers, and floating point numbers are inserted into the QS trace buffer in the little-endian byte order (least-significant byte first). The QS data inserting macros place the data in the trace buffer in a platform-neutral manner, meaning that the data is inserted into the buffer in the little-endian order regardless of the endianness of the CPU. Also, the data-inserting macros copy the data to the buffer one byte at a time, thus avoiding any potential data misalignment problems. Many embedded CPUs, such as ARM (Cortex-M0), require specific alignment of 16-, 32-, or 64-bit quantities.
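A minimal sketch of such a platform-neutral insertion is shown below. The helper put_u32_le() is hypothetical (the actual QS macros are not reproduced here); it only illustrates how shifting and masking yields the little-endian order on any CPU while writing one byte at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store 'x' at buf[pos..pos+3], least-significant byte first.
 * Only byte accesses are used, so 'pos' may be any (odd) offset.
 * Returns the position just past the stored value. */
static size_t put_u32_le(uint8_t *buf, size_t pos, size_t cap, uint32_t x) {
    assert(buf != NULL);
    assert(pos + 4U <= cap);
    for (unsigned i = 0U; i < 4U; ++i) {
        buf[pos++] = (uint8_t)(x & 0xFFU);
        x >>= 8;
    }
    return pos;
}
```

Because the byte order comes from arithmetic on the value and not from its memory layout, the same code produces the same buffer contents on big-endian and little-endian targets.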
The QS Trace Buffers (transmit QS buffer and receive QS-RX buffer) store only complete HDLC frames, which is the pivotal point in the design of the QS target component and has two important consequences:
The target-resident component of the QP/Spy tracing system is called QS. The purpose of QS is to provide facilities for instrumenting the target code so it will produce an interesting real-time trace from code execution. In this sense, it is similar to peppering the code with printf statements. However, the main difference between QS and printf is where the data formatting and sending is done. When you use printfs, the data formatting and sending occur in the time-critical paths through your embedded code. In contrast, the QS target-resident component inserts raw binary data into the QS ring buffer, so all the time-consuming formatting is removed from the Target system and is done after the fact in the Host. Additionally, in QS, data logging and sending to the Host are separated so that the target system can typically perform the transmission outside of the time-critical path, for example, in the idle processing of the target CPU.
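The separation of logging from sending can be pictured with a small ring buffer sketch. The type Ring and the functions ring_put() and ring_get() are hypothetical and do not mirror the QS API. The overflow policy shown here (reject the whole record) is a simplifying assumption, and a real target would also need to protect ring_put() with a critical section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8U  /* deliberately tiny, for illustration only */

typedef struct {
    uint8_t buf[RING_SIZE];
    size_t head;  /* next write index */
    size_t tail;  /* next read index */
    size_t used;  /* number of bytes currently stored */
} Ring;

/* Time-critical side: copy raw bytes only (no formatting, no I/O).
 * Returns 1 on success, or 0 if the record does not fit as a whole. */
static int ring_put(Ring *r, uint8_t const *data, size_t len) {
    assert((r != NULL) && (data != NULL));
    if (len > (RING_SIZE - r->used)) {
        return 0;
    }
    for (size_t i = 0U; i < len; ++i) {
        r->buf[r->head] = data[i];
        r->head = (r->head + 1U) % RING_SIZE;
    }
    r->used += len;
    return 1;
}

/* Idle-time side: take one byte out for transmission.
 * Returns the byte (0..255), or -1 when the buffer is empty. */
static int ring_get(Ring *r) {
    int b;
    assert(r != NULL);
    if (r->used == 0U) {
        return -1;
    }
    b = r->buf[r->tail];
    r->tail = (r->tail + 1U) % RING_SIZE;
    --r->used;
    return b;
}
```

A typical arrangement would call ring_put() from the instrumented code and drain the buffer with ring_get() from the idle loop, feeding each byte to the transmit hardware.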
The QS target component consists of the QS ring buffer, the QS filters, as well as the instrumentation added to the QP/C Framework and the application, as shown in the figure below. Additionally, the QS target component contains the receive-channel (QS-RX) with its own receive buffer, which can receive data from the QSPY host component.
A nice byproduct of removing the data formatting from the Target is a natural data compression. For example, formatted output of a single byte takes two hexadecimal digits (or up to three decimal digits), so avoiding the formatting gives at least a factor of two in data density. On top of this natural compression, QS uses such techniques as data dictionaries and compressed format information, which in practice result in a compression factor of 4-5 compared to the expanded human-readable format.
The QP/C Framework contains the QS instrumentation for tracing the interesting occurrences within the framework, such as state machine activity (dispatching events, entering/exiting a state, executing transitions, etc.), Active Object activity (allocating events, posting/publishing events, time events, etc.), and more. All this instrumentation reserves 100 predefined QS trace records, which are enumerated in QS_GlbPredef. These QS records have a predefined (hard-coded) structure both in the QS target-resident component and in the QSPY host-based application. See also the documentation of the human-readable output generated from the predefined QS records.
In addition to the predefined QS records, you can add your own, flexible, application-specific trace records, which are not known in advance to the QSPY host-resident component. You can think of the application-specific records as an equivalent to printf(), but with much less overhead. The following code snippet shows an example of an application-specific QS record from your embedded code:
As you can see from the example above, an application-specific trace record always begins with QS_BEGIN_ID(), followed by several application-specific data elements, followed by QS_END().
Application-Specific Record Representation
The biggest challenge in supporting flexible "application-specific" trace records is to provide the data type information with the data itself, so that QSPY "knows" how to parse such records and move on to the next data element within the record. The figure below shows the encoding of the application-specific trace record from the previous listing.
The application-specific trace record, like all QS records, starts with the Sequence Number and the Record-Type. Every application-specific trace record also contains the timestamp immediately following the Record Type. The number of bytes used by the timestamp is configurable by the macro QS_TIME_SIZE. After the timestamp, you see the data elements, such as a byte (QS_U8()) and a string (QS_STR()). Each of these data elements starts with a fmt (format) byte, which actually contains both the data-type information (in the lower nibble) and the format width for displaying that element (in the upper nibble). For example, the data element QS_U8(1, n) will cause the value 'n' to be encoded as uint8_t with the format width of 1 decimal digit.
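The nibble layout of the fmt byte can be sketched as follows. The helper names and the numeric type codes are hypothetical (the values QS actually assigns to each data type are not given in this section); only the split into an upper width nibble and a lower type nibble follows the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type codes; the values used by QS may differ. */
enum { FMT_TYPE_U8 = 1, FMT_TYPE_STR = 8 };

/* Pack the display width (upper nibble) and data type (lower nibble). */
static uint8_t fmt_pack(uint8_t width, uint8_t type) {
    assert(width <= 0x0FU);
    assert(type <= 0x0FU);
    return (uint8_t)((width << 4) | type);
}

/* Host side: recover the display width from the fmt byte. */
static uint8_t fmt_width(uint8_t fmt) {
    return (uint8_t)(fmt >> 4);
}

/* Host side: recover the data type, which tells the parser how many
 * bytes follow and how to interpret them. */
static uint8_t fmt_type(uint8_t fmt) {
    return (uint8_t)(fmt & 0x0FU);
}
```

One consequence of this layout is that both the width and the number of distinguishable data types are limited to 16 values each.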
As shown in the listing above, typically, the application-specific records are enclosed with the QS_BEGIN_ID() / QS_END() pair of macros. This pair of macros disables interrupts at the beginning and enables them again at the end of each record. Occasionally, you might want to generate trace data from within already-established critical sections or ISRs. In such rare occasions, you would use the macros QS_BEGIN_NOCRIT() / QS_END_NOCRIT() to avoid nesting of critical sections.
The record-begin macro QS_BEGIN_ID() takes two arguments. The first argument (e.g., PHILO_STAT) is the enumerated Record-Type, which is used in the global filter and is part of each record header.
The second argument (e.g., AO_Philo[n]->prio in the example above) is used for the local filter, which allows you to log only specific objects selectively. The code snippet shows an example of an application-specific trace record, including use of the second parameter of the QS_BEGIN_ID() macro.
Application-Specific Record Examples
The following examples show the QS application-specific trace records as C code on the left, and the output generated by the QSPY host application from these records on the right. The examples assume that the QS dictionaries have been produced for the Record-Types and function/object pointers used.
| Trace Record | QSPY output |
|---|---|
| | 1018004718 PHILO_STAT 1 thinking (NOTE: produced only when AO_Philo[n] is enabled in the QS Local Filter) |
| | 1055004424 IO_CALL IO_Read -129 0 |
| | 0207024814 DATA_RX l_uart2 10 17 84 BB 40 FD 15 00 00 99 0B 00 00 90 0D 00 20 |
| | 0991501750 FP_DATA 3.141500e+003 -2.7182818280e+005 (NOTE: produced only when QS-ID QS_AP_ID + 1 is enabled in the QS Local Filter) |
Application-Specific Data Elements
The following table summarizes the supported data elements that can be used inside the Application-Specific trace records:
| Data Element | Example | Comments |
|---|---|---|
| QS_U8() | QS_U8(0, n); | Outputs a uint8_t integer with format "%u" |
| QS_I8() | QS_I8(3, m); | Outputs an int8_t integer with format "%3d" |
| QS_U16() | QS_U16(5, n); | Outputs a uint16_t integer with format "%5u" |
| QS_I16() | QS_I16(0, m); | Outputs an int16_t integer with format "%d" |
| QS_U32() | QS_U32(QS_HEX_FMT, n); | Outputs a uint32_t integer with format "%8X" |
| QS_I32() | QS_I32(0, m); | Outputs an int32_t integer with format "%d" |
| QS_U64() | QS_U64(0, n); | Outputs a uint64_t integer with format "%2"PRIi64 |
| QS_F32() | QS_F32(0, 3.1415F); | Outputs a 32-bit float with format "%7.0e" (zero digits after the decimal point) |
| QS_F64() | QS_F64(4, sqrt(2.0)); | Outputs a 64-bit double with format "%12.4e" (4 digits after the decimal point) |
| QS_STR() | QS_STR("Hello"); | Outputs a zero-terminated string with format "%s" |
| QS_MEM() | QS_MEM(&my_struct, 16); | Outputs 16 bytes of memory starting from &my_struct. The bytes are output using the hex format "%02X" |
| QS_OBJ() | QS_OBJ(&my_obj); | Outputs an object pointer. If an object dictionary for the object exists, QSPY will display the symbolic name of the object |
| QS_FUN() | QS_FUN(&foo); | Outputs a function pointer. If a function dictionary for the function exists, QSPY will display the symbolic name of the function |
| QS_SIG() | QS_SIG(TIMEOUT_SIG, (void*)0); | Outputs a signal. If a signal dictionary for the signal exists, QSPY will display the symbolic name of the signal |
QS cannot completely eliminate the overhead of software tracing. But with the fine-granularity filters available in QS, you can make this impact as small as necessary. For greatest flexibility, QS uses two complementary levels of filters: the Global Filter and the Local Filter, described below. The combination of these two complementary filtering criteria results in very selective tracing capabilities.
QS Global Filter
The Global Filter is based on trace Record-Types associated with each QS record (see QSpyRecords). This filter allows you to disable or enable each individual Record-Type or a whole group of QS records. For example, you might enable or disable QS_QEP_STATE_ENTRY (entry to a state), QS_QEP_STATE_EXIT (exit from a state), QS_QEP_INIT_TRAN (initial transition), QS_QF_ACTIVE_POST (event posting), QS_QF_PUBLISH (event publishing), and all other pre-defined and application-specific Record-Types. This level works globally for all state machines, active objects, and time event objects in the entire system.
QS provides a simple interface, QS_GLB_FILTER(), for setting and clearing individual Record-Types as well as groups of Record-Types in the Target code. The following table summarizes the Record-Types and groups of Record-Types that you can use as arguments to QS_GLB_FILTER().
Here are some examples of setting and clearing the QS Global Filter with QS_GLB_FILTER():
QS Local Filter
The Local Filter is based on QS-IDs associated with various objects in the Target memory. The QS-IDs are small integer numbers, such as the unique priorities assigned to QP Active Objects, but there are more such QS-IDs which you can assign to various objects. Then, you can set up the QS Local Filter to trace only specific groups of such QS-IDs.
The main use case for QS Local Filter is an application where certain active objects are very "noisy", and would overwhelm your trace. The QS Local Filter allows you to silence the "noisy" active objects and let the others through.
Please note that the QS Global Filter will not do the trick, because you don't want to suppress all QS records of a given Record-Type. Instead, you want to suppress only specific objects.
QS provides a simple interface, QS_LOC_FILTER(), for setting and clearing individual QS-IDs as well as groups of QS-IDs in the Target code. The following table summarizes the QS-IDs and groups of QS-IDs that you can use as arguments to QS_LOC_FILTER().
| QS-ID / Group | Range | Example | Comments |
|---|---|---|---|
| 0 | 0 | | always enabled |
| QS_AO_IDS | 1..64 | QS_LOC_FILTER(QS_AO_IDS); QS_LOC_FILTER(-QS_AO_IDS); QS_LOC_FILTER(6); QS_LOC_FILTER(-6); QS_LOC_FILTER(AO_Table->prio); | Active Object priorities |
| QS_EP_IDS | 65..80 | QS_LOC_FILTER(QS_EP_ID + 1U); | enable Event-Pool #1 |
| QS_EQ_IDS | 81..96 | QS_LOC_FILTER(QS_EQ_ID + 1U); | enable Event-Queue #1 |
| QS_AP_IDS | 97..127 | QS_LOC_FILTER(QS_AP_ID + 1U); | enable Application-Specific QS-ID |
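One way to picture the Local Filter is as a bitmask with one bit per QS-ID, as in the following sketch. It is a model under stated assumptions, not the QS implementation: the type LocFilter and the functions loc_set(), loc_filter_id() and loc_enabled() are hypothetical, the ID ranges are taken from the table above, and the sign convention imitates the QS_LOC_FILTER(6) / QS_LOC_FILTER(-6) examples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One bit per QS-ID in the range 0..127. */
typedef struct {
    uint8_t bits[16];
} LocFilter;

/* Enable (enable != 0) or disable a contiguous range of QS-IDs. */
static void loc_set(LocFilter *f, unsigned first, unsigned last, int enable) {
    assert(f != NULL);
    assert((first <= last) && (last <= 127U));
    for (unsigned id = first; id <= last; ++id) {
        uint8_t mask = (uint8_t)(1U << (id % 8U));
        if (enable) {
            f->bits[id / 8U] |= mask;
        }
        else {
            f->bits[id / 8U] &= (uint8_t)~mask;
        }
    }
}

/* Signed single-ID form: a positive ID enables, a negative ID disables. */
static void loc_filter_id(LocFilter *f, int id) {
    unsigned u = (unsigned)((id < 0) ? -id : id);
    assert(id != 0);
    loc_set(f, u, u, id > 0);
}

/* Check performed before a record is produced for the given QS-ID. */
static int loc_enabled(LocFilter const *f, unsigned id) {
    assert((f != NULL) && (id <= 127U));
    if (id == 0U) {
        return 1;  /* QS-ID 0 is always enabled, per the table above */
    }
    return (int)((f->bits[id / 8U] >> (id % 8U)) & 1U);
}
```

In this model, enabling all Active Object priorities and then silencing one "noisy" object corresponds to loc_set(&f, 1U, 64U, 1) followed by loc_filter_id(&f, -6).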
Here are some examples of setting and clearing QS Local Filter with QS_LOC_FILTER():
Most QS trace records produced by QS are time-stamped with a high-resolution counter (the resolution depends on the availability of a hardware timer-counter in the Target, but typically provides sub-microsecond granularity). QS provides an efficient API for obtaining platform-specific timestamp information. Given the right timer-counter resource in your Target system, you can provide QS with as precise timestamp information as required. The size of the timestamp is configurable to be 1, 2, or 4 bytes (see QS_TIME_SIZE).
QS maintains a set of Current Objects to which it applies commands received through the QS-RX channel. For example, the event-post operation is applied to the current Active Object, while the peek/poke/fill operations are applied to the current Application Object. QS maintains the following Current Objects:
By the time you compile and load your application image to the Target, the symbolic names of various objects, function names, and event signal names are stripped from the code. Therefore, if you want to have the symbolic information available to the QSPY host-resident component, you need to supply it somehow to the software tracing system.
The QS Target-resident component provides special dictionary trace records designed expressly for providing the symbolic information about the target code in the trace itself. These "dictionary records" are very much like the symbolic information embedded in the object files for the traditional single-step debugger. QS can supply five types of dictionary trace records:
The dictionary trace records are typically generated during the system initialization and this is the only time they are sent to the QSPY host component. It is your responsibility to code them in (by means of the QS_???_DICTIONARY() macros). The following code snippet provides some examples of generating QS dictionaries:
The QS target component contains the receive-channel (QS-RX), which can receive data from the QSPY host application. The QS-RX channel provides the following services: