Uses of Bit-fields
Bit-fields are typically used to implement communication protocols
and to create device drivers for actuators or sensors. Frequently the
actuators or sensors will use A/D and D/A converters to associate
specific bit groupings with analog voltages on device connector pins.In safety critical systems the bit patterns of the bit-fields must be correctly arranged or communication with the actuator or sensor will be incorrect. Since many safety critical systems require command of actuators, combined with resulting sensor readings to provide a closed loop control system, the ability to correctly define the bit layout for each device interface is highly safety critical.
JSF C++ Coding Standard
AV Rule 154 (MISRA Rules 111 and 112, Revised)
Bit-fields shall have explicitly unsigned integral or enumeration types only.
Rationale: Whether a plain
(neither explicitly signed nor unsigned) char, short, int or long
bit-field is signed or unsigned is implementation-defined.[10] Thus,
explicitly declaring a bit-filed unsigned prevents unexpected sign
extension or overflow.
Note: MISRA Rule
112 no longer applies since it discusses a two-bit minimum-length
requirement for bit-fields of signed types.
AV Rule 155
Bit-fields will not be used to pack data into a word for the sole purpose of saving space.
Note: Bit-packing should be reserved for use in interfacing to
hardware or conformance to communication protocols.
Warning: Certain aspects of
bit-field manipulation are implementation-defined.
AV Rule 156 (MISRA Rule 113)
All the members of a structure (or class) shall be named and shall only be accessed via their names.
Rationale: Reading/writing to
unnamed locations in memory is error prone.
Exception: An unnamed bit-field of width zero may be used
to specify alignment of the next bit-field at an allocation boundary.
[10], 9.6(2)C++ Bit-fields Explained
The following explanation of C++ bit-fields is taken from http://en.cppreference.com/w/cpp/language/bit_fieldBit field
Declares
a class data member with explicit size, in bits. Adjacent bit field
members may be packed to share and straddle the individual bytes.
A
bit field declaration is a class
data member declaration which
uses the following declarator:
identifier(optional)attr(optional)
: size |
(1) | ||
attr(C++11)
|
-
|
optional sequence of any
number of attributes
|
identifier
|
-
|
the name of the bit field
that is being declared. The name is optional: nameless bitfields
introduce the specified number of bits of padding
|
size
|
-
|
an integral constant
expression with a value greater or equal to zero. When greater
than zero, this is the number of bits that this bit field will
occupy. The value zero is only allowed for nameless bitfields
and has special meaning: it specifies that the next bit field in
the class definition will begin at an allocation unit's
boundary.
|
Explanation
The
number of bits in a bit field sets the limit to the range of values
it can hold:
#include <iostream>
struct S {
// three-bit unsigned field,
// allowed values are 0...7
};
int main()
{
S s
s.b
std::cout << s.b << '\n'; // output: 0
}
Multiple
adjacent bit fields are usually packed together (although this
behavior is implementation-defined):
#include <iostream>
struct S {
// will usually occupy 2 bytes:
// 3 bits: value of b1
// 2 bits: unused
// 6 bits: value of b2
// 2 bits: value of b3
// 3 bits: unused
};
int main()
{
std::cout << sizeof(S) << '\n'; // usually prints 2
}
The
special unnamed bit field of size zero can be forced to break up
padding. It specifies that the next bit field begins at the
beginning of its allocation unit:
#include <iostream>
struct S {
// will usually occupy 2 bytes:
// 3 bits: value of b1
// 5 bits: unused
// 6 bits: value of b2
// 2 bits: value of b3
};
int main()
{
std::cout << sizeof(S) << '\n'; // usually prints 2
}
If
the specified size of the bit field is greater than the size of its
type, the value is limited by the type: a std::uint8_t
b
:
1000;
would
still hold values between 0 and 255. the extra bits become unused
padding.
Because
bit fields do not necessarily begin at the beginning of a byte,
address of a bit field cannot be taken. Pointers and non-const
references to bit fields are not possible. When initializing
a const reference from
a bit field, a temporary is created (its type is the type of the
bit field), copy initialized with the value of the bit field, and
the reference is bound to that temporary.
The
type of a bit field can only be integral or enumeration type.
A
bit field cannot be a static
data member.Notes
The
following properties of bit fields are implementation-defined
- Everything about the actual allocation details of bit fields within the class object
- For example, on some platforms, bit fields don't straddle bytes, on others they do
- Also, on some platforms, bit fields are packed left-to-right, on others right-to-left
|
(until C++14)
|
References
- C++11 standard (ISO/IEC 14882:2011):
- 9.6 Bit-fields [class.bit]
- C++98 standard (ISO/IEC 14882:1998):
- 9.6 Bit-fields [class.bit]
Ada Alternative
Ada allows the programmer to define bit layouts within a record
with much more confidence than can be done by the C++ programmer. The
Ada 2012 Reference Manual describes the use of record representation
clauses.13.5.1 Record Representation Clauses
A record_representation_clause specifies
the storage representation of records and record extensions, that is,
the order, position, and size of components (including discriminants,
if any).
Syntax
record_representation_clause ::=
for first_subtype_local_name use
record [mod_clause]
{component_clause}
end record;
for first_subtype_local_name use
record [mod_clause]
{component_clause}
end record;
Name Resolution Rules
Legality Rules
The first_subtype_local_name of
a record_representation_clause shall
denote a specific record or record extension subtype.
If
the component_local_name is
a direct_name,
the local_name shall
denote a component of the type. For a record extension, the component
shall not be inherited, and shall not be a discriminant that
corresponds to a discriminant of the parent type. If
the component_local_name has
an attribute_designator,
the direct_name of
the local_name shall
denote either the declaration of the type or a component of the type,
and theattribute_designator shall
denote an implementation-defined implicit component of the type.
The position, first_bit,
and last_bit shall
be static expressions. The value of position and first_bit shall
be nonnegative. The value of last_bit shall
be no less than first_bit –
1.
If
the nondefault bit ordering applies to the type, then either:
the
value of first_bit shall
be zero and the value of last_bit +
1 shall be a multiple of System.Storage_Unit.
At
most one component_clause is
allowed for each component of the type, including for each
discriminant (component_clauses
may be given for some, all, or none of the components). Storage
places within acomponent_list shall
not overlap, unless they are for components in distinct variants
of the same variant_part.
A
name that denotes a component of a type is not allowed within
a record_representation_clause for
the type, except as the component_local_name of
a component_clause.
Static Semantics
If
the default bit ordering applies to the type,
the position, first_bit,
and last_bit of
each component_clause directly
specify the position and size of the corresponding component.
If
the nondefault bit ordering applies to the type, then the layout is
determined as follows:
the component_clauses
for which the value of last_bit is
greater than or equal to the size of the largest machine scalar
directly specify the position and size of the corresponding
component;
for
other component_clauses,
all of the components having the same value of position are
considered to be part of a single machine scalar, located at
that position;
this machine scalar has a size which is the smallest machine scalar
size larger than the largest last_bit for
all component_clauses
at that position;
the first_bit and last_bit of
each component_clause are
then interpreted as bit offsets in this machine scalar.
A record_representation_clause for
a record extension does not override the layout of the parent part;
if the layout was specified for the parent type, it is inherited by
the record extension.
Implementation Permissions
An
implementation may generate implementation-defined components (for
example, one containing the offset of another component). An
implementation may generate names that denote such
implementation-defined components; such names shall be
implementation-defined attribute_references.
An implementation may allow such implementation-defined names to be
used in record_representation_clauses.
An implementation can restrict such component_clauses
in any manner it sees fit.
If
a record_representation_clause is
given for an untagged derived type, the storage place attributes for
all of the components of the derived type may differ from those of
the corresponding components of the parent type, even for components
whose storage place is not specified explicitly in
the record_representation_clause.
Implementation Advice
An
implementation should support machine scalars that correspond to all
of the integer, floating point, and address formats supported by the
machine.
An
implementation should support storage places that can be extracted
with a load, mask, shift sequence of machine code, and set with a
load, shift, mask, store sequence, given the available machine
instructions and run-time model.
A
storage place should be supported if its size is equal to the Size of
the component subtype, and it starts and ends on a boundary that
obeys the Alignment of the component subtype.
For
a component with a subtype whose Size is less than the word size, any
storage place that does not cross an aligned word boundary should be
supported.
An
implementation may reserve a storage place for the tag field of a
tagged type, and disallow other components from overlapping that
place.
An
implementation need not support a component_clause for
a component of an extension part if the storage place is not after
the storage places of all components of the parent type, whether or
not those storage places had been specified.
NOTES
14 If
no component_clause is
given for a component, then the choice of the storage place for the
component is left to the implementation. If component_clauses
are given for all components,
the record_representation_clause completely
specifies the representation of the type and will be obeyed exactly
by the implementation.
Examples
Example
of specifying the layout of a record type:
Word : constant := 4; -- storage element is byte, 4 bytes per word
type State is (A,M,W,P);
type Mode is (Fix, Dec, Exp, Signif);
type Mode is (Fix, Dec, Exp, Signif);
type Byte_Mask is array (0..7) of Boolean;
type State_Mask is array (State) of Boolean;
type Mode_Mask is array (Mode) of Boolean;
type State_Mask is array (State) of Boolean;
type Mode_Mask is array (Mode) of Boolean;
type Program_Status_Word is
record
System_Mask : Byte_Mask;
Protection_Key : Integer range 0 .. 3;
Machine_State : State_Mask;
Interrupt_Cause : Interruption_Code;
Ilc : Integer range 0 .. 3;
Cc : Integer range 0 .. 3;
Program_Mask : Mode_Mask;
Inst_Address : Address;
end record;
record
System_Mask : Byte_Mask;
Protection_Key : Integer range 0 .. 3;
Machine_State : State_Mask;
Interrupt_Cause : Interruption_Code;
Ilc : Integer range 0 .. 3;
Cc : Integer range 0 .. 3;
Program_Mask : Mode_Mask;
Inst_Address : Address;
end record;
for Program_Status_Word use
record
System_Mask at 0*Word range 0 .. 7;
Protection_Key at 0*Word range 10 .. 11; -- bits 8,9 unused
Machine_State at 0*Word range 12 .. 15;
Interrupt_Cause at 0*Word range 16 .. 31;
Ilc at 1*Word range 0 .. 1; -- second word
Cc at 1*Word range 2 .. 3;
Program_Mask at 1*Word range 4 .. 7;
Inst_Address at 1*Word range 8 .. 31;
end record;
record
System_Mask at 0*Word range 0 .. 7;
Protection_Key at 0*Word range 10 .. 11; -- bits 8,9 unused
Machine_State at 0*Word range 12 .. 15;
Interrupt_Cause at 0*Word range 16 .. 31;
Ilc at 1*Word range 0 .. 1; -- second word
Cc at 1*Word range 2 .. 3;
Program_Mask at 1*Word range 4 .. 7;
Inst_Address at 1*Word range 8 .. 31;
end record;
for Program_Status_Word'Size use 8*System.Storage_Unit;
for Program_Status_Word'Alignment use 8;
for Program_Status_Word'Alignment use 8;
NOTES
15 Note
on the example: The record_representation_clause defines
the record layout. The Size clause guarantees that (at least) eight
storage elements are used for objects of the type. The Alignment
clause guarantees that aliased, imported, or exported objects of the
type will have addresses divisible by eight.
Notice that Ada allows not only the number of bits for each field, it also allows the programmer to specify the value range for each field, the bit location relative to the start of each word of the structure, and even leave some unused bits. C++ requires the use of unnamed bit-fields (with a size greater than 0) to mark unused bits. The C++ compiler has no way to know which bits the programmer wants to skip without specifying an unnamed field. Strangely, an unnamed field with a size of 0 indicates that the next field should start at an allocation unit's boundary, which is highly dependent on the hardware architecture. For instance, on an 8-bit processor the allocation boundary will be every 8 bits, while on a 32-bit processor the allocation boundary might be every 8 bits, 16 bits, or 32 bits, depending upon the processor architecture.
As a software safety engineer I find the implementation-defined aspects of C++ bit-fields to be very unsettling. A project may verify that its C++ bit-fields work as intended on a particular architecture, but when a technology refresh is performed in several years, there is no assurance that the C++ program will continue to work properly on the new hardware. The failure of the C++ program to run on the new hardware will be received as a complete surprise by most users of the upgraded system and by their management. The surprise may be accompanied by hazardous events because of incorrect control of a safety critical system. People may be injured or die, the environment may be seriously damaged, and very expensive systems may be damaged or destroyed. The root cause of the hazards will not be poor functional requirements, poor design, or programming mistakes. The root cause of the hazards will be exposure of implementation-defined behaviors in a programming language. Who will the lawyers sue over the consequences of those hazards?