Friday, March 21, 2014

Ada vs C++ Bit-fields

Uses of Bit-fields
Bit-fields are typically used to implement communication protocols and to create device drivers for actuators or sensors. Frequently the actuators or sensors will use A/D and D/A converters to associate specific bit groupings with analog voltages on device connector pins.
In safety critical systems the bit patterns of the bit-fields must be correctly arranged or communication with the actuator or sensor will be incorrect. Since many safety critical systems require command of actuators, combined with resulting sensor readings to provide a closed loop control system, the ability to correctly define the bit layout for each device interface is highly safety critical.

JSF C++ Coding Standard

AV Rule 154 (MISRA Rules 111 and 112, Revised)

Bit-fields shall have explicitly unsigned integral or enumeration types only.



Rationale: Whether a plain (neither explicitly signed nor unsigned) char, short, int or long bit-field is signed or unsigned is implementation-defined.[10] Thus, explicitly declaring a bit-filed unsigned prevents unexpected sign extension or overflow.
Note: MISRA Rule 112 no longer applies since it discusses a two-bit minimum-length requirement for bit-fields of signed types.

AV Rule 155

Bit-fields will not be used to pack data into a word for the sole purpose of saving space.
Note: Bit-packing should be reserved for use in interfacing to hardware or conformance to communication protocols.
Warning: Certain aspects of bit-field manipulation are implementation-defined.


Rationale: Bit-packing adds additional complexity to the source code. Moreover, bit-packing may not save any space at all since the reduction in data size achieved through packing is often offset by the increase in the number of instructions required to pack/unpack the data.

AV Rule 156  (MISRA Rule 113)

All the members of a structure (or class) shall be named and shall only be accessed via their names.

Rationale: Reading/writing to unnamed locations in memory is error prone.
Exception: An unnamed bit-field of width zero may be used to specify alignment of the next bit-field at an allocation boundary. [10], 9.6(2)



C++ Bit-fields Explained

The following explanation of C++ bit-fields is taken from http://en.cppreference.com/w/cpp/language/bit_field

Bit field

Declares a class data member with explicit size, in bits. Adjacent bit field members may be packed to share and straddle the individual bytes.
A bit field declaration is a class data member declaration which uses the following declarator:
identifier(optional)attr(optional) : size
(1)
The type of the bit field is introduced by the decl-specifier-seq of the declaration syntax.
attr(C++11)
-
optional sequence of any number of attributes
identifier
-
the name of the bit field that is being declared. The name is optional: nameless bitfields introduce the specified number of bits of padding
size
-
an integral constant expression with a value greater or equal to zero. When greater than zero, this is the number of bits that this bit field will occupy. The value zero is only allowed for nameless bitfields and has special meaning: it specifies that the next bit field in the class definition will begin at an allocation unit's boundary.

Explanation

The number of bits in a bit field sets the limit to the range of values it can hold:
#include <iostream>
struct S {
 // three-bit unsigned field,
 // allowed values are 0...7
 
};
int main()
{
S s 
s.b
    std::cout << s.b << '\n'; // output: 0
}
Multiple adjacent bit fields are usually packed together (although this behavior is implementation-defined):
#include <iostream>
struct S {
    // will usually occupy 2 bytes:
    // 3 bits: value of b1
    // 2 bits: unused
    // 6 bits: value of b2
    // 2 bits: value of b3
    // 3 bits: unused
 
};
int main()
{
    std::cout << sizeof(S) << '\n'; // usually prints 2
}
The special unnamed bit field of size zero can be forced to break up padding. It specifies that the next bit field begins at the beginning of its allocation unit:
#include <iostream>
struct S {
    // will usually occupy 2 bytes:
    // 3 bits: value of b1
    // 5 bits: unused
    // 6 bits: value of b2
    // 2 bits: value of b3
 
 
 
 
};
int main()
{
    std::cout << sizeof(S) << '\n'; // usually prints 2
}
If the specified size of the bit field is greater than the size of its type, the value is limited by the type: a std::uint8_t b : 1000; would still hold values between 0 and 255. the extra bits become unused padding.
Because bit fields do not necessarily begin at the beginning of a byte, address of a bit field cannot be taken. Pointers and non-const references to bit fields are not possible. When initializing a const reference from a bit field, a temporary is created (its type is the type of the bit field), copy initialized with the value of the bit field, and the reference is bound to that temporary.
The type of a bit field can only be integral or enumeration type.
A bit field cannot be a static data member.

Notes

The following properties of bit fields are implementation-defined
  • Everything about the actual allocation details of bit fields within the class object
  • For example, on some platforms, bit fields don't straddle bytes, on others they do
  • Also, on some platforms, bit fields are packed left-to-right, on others right-to-left
  • Whether char, short, int, long, and long long bit fields that aren't explicitly signed or unsigned are signed or unsigned.
  • For example, int b:3; may have the range of values 0..7 or -4..3.
(until C++14)

References

  • C++11 standard (ISO/IEC 14882:2011):
  • 9.6 Bit-fields [class.bit]
  • C++98 standard (ISO/IEC 14882:1998):
  • 9.6 Bit-fields [class.bit]


Ada Alternative

Ada allows the programmer to define bit layouts within a record with much more confidence than can be done by the C++ programmer. The Ada 2012 Reference Manual describes the use of record representation clauses.

13.5.1 Record Representation Clauses

1
record_representation_clause specifies the storage representation of records and record extensions, that is, the order, position, and size of components (including discriminants, if any).

Syntax

2
record_representation_clause ::= 
    
for first_subtype_local_name use
      
record [mod_clause]
        {
component_clause}
      
end record;
3
component_clause ::= 
    
component_local_name at position range first_bit .. last_bit;
4
position ::= static_expression
5
first_bit ::= static_simple_expression
6
last_bit ::= static_simple_expression

Name Resolution Rules

7
Each positionfirst_bit, and last_bit is expected to be of any integer type. 

Legality Rules

8/2
The first_subtype_local_name of a record_representation_clause shall denote a specific record or record extension subtype. 
9
If the component_local_name is a direct_name, the local_name shall denote a component of the type. For a record extension, the component shall not be inherited, and shall not be a discriminant that corresponds to a discriminant of the parent type. If the component_local_name has an attribute_designator, the direct_name of the local_name shall denote either the declaration of the type or a component of the type, and theattribute_designator shall denote an implementation-defined implicit component of the type.
10
The positionfirst_bit, and last_bit shall be static expressions. The value of position and first_bit shall be nonnegative. The value of last_bit shall be no less than first_bit – 1. 
10.1/2
   If the nondefault bit ordering applies to the type, then either: 
10.2/2
the value of last_bit shall be less than the size of the largest machine scalar; or
10.3/2
the value of first_bit shall be zero and the value of last_bit + 1 shall be a multiple of System.Storage_Unit. 
11
At most one component_clause is allowed for each component of the type, including for each discriminant (component_clauses may be given for some, all, or none of the components). Storage places within acomponent_list shall not overlap, unless they are for components in distinct variants of the same variant_part.
12
A name that denotes a component of a type is not allowed within a record_representation_clause for the type, except as the component_local_name of a component_clause.

Static Semantics

13/2
 record_representation_clause (without the mod_clause) specifies the layout.
13.1/2
   If the default bit ordering applies to the type, the positionfirst_bit, and last_bit of each component_clause directly specify the position and size of the corresponding component.
13.2/3
   If the nondefault bit ordering applies to the type, then the layout is determined as follows:
13.3/2
the component_clauses for which the value of last_bit is greater than or equal to the size of the largest machine scalar directly specify the position and size of the corresponding component;
13.4/2
for other component_clauses, all of the components having the same value of position are considered to be part of a single machine scalar, located at that position; this machine scalar has a size which is the smallest machine scalar size larger than the largest last_bit for all component_clauses at that position; the first_bit and last_bit of each component_clause are then interpreted as bit offsets in this machine scalar. 
14
record_representation_clause for a record extension does not override the layout of the parent part; if the layout was specified for the parent type, it is inherited by the record extension. 

Implementation Permissions

15
An implementation may generate implementation-defined components (for example, one containing the offset of another component). An implementation may generate names that denote such implementation-defined components; such names shall be implementation-defined attribute_references. An implementation may allow such implementation-defined names to be used in record_representation_clauses. An implementation can restrict such component_clauses in any manner it sees fit. 
16
If a record_representation_clause is given for an untagged derived type, the storage place attributes for all of the components of the derived type may differ from those of the corresponding components of the parent type, even for components whose storage place is not specified explicitly in the record_representation_clause.

Implementation Advice

17
The recommended level of support for record_representation_clauses is: 
17.1/2
An implementation should support machine scalars that correspond to all of the integer, floating point, and address formats supported by the machine.
18
An implementation should support storage places that can be extracted with a load, mask, shift sequence of machine code, and set with a load, shift, mask, store sequence, given the available machine instructions and run-time model.
19
A storage place should be supported if its size is equal to the Size of the component subtype, and it starts and ends on a boundary that obeys the Alignment of the component subtype.
20/2
For a component with a subtype whose Size is less than the word size, any storage place that does not cross an aligned word boundary should be supported.
21
An implementation may reserve a storage place for the tag field of a tagged type, and disallow other components from overlapping that place. 
22
An implementation need not support a component_clause for a component of an extension part if the storage place is not after the storage places of all components of the parent type, whether or not those storage places had been specified. 
NOTES
23
14  If no component_clause is given for a component, then the choice of the storage place for the component is left to the implementation. If component_clauses are given for all components, the record_representation_clause completely specifies the representation of the type and will be obeyed exactly by the implementation. 

Examples

24
Example of specifying the layout of a record type: 
25
Word : constant := 4;  --  storage element is byte, 4 bytes per word
26
type State         is (A,M,W,P);
type Mode          is (Fix, Dec, Exp, Signif);
27
type Byte_Mask     is array (0..7)  of Boolean;
type State_Mask    is array (State) of Boolean;
type Mode_Mask     is array (Mode)  of Boolean;
28
type Program_Status_Word is
  record
      System_Mask        : Byte_Mask;
      Protection_Key     : Integer range 0 .. 3;
      Machine_State      : State_Mask;
      Interrupt_Cause    : Interruption_Code;
      Ilc                : Integer range 0 .. 3;
      Cc                 : Integer range 0 .. 3;
      Program_Mask       : Mode_Mask;
      Inst_Address       : Address;
end record;
29
for Program_Status_Word use
  record
      System_Mask      at 0*Word range 0  .. 7;
      Protection_Key   at 0*Word range 10 .. 11; -- bits 8,9 unused
      Machine_State    at 0*Word range 12 .. 15;
      Interrupt_Cause  at 0*Word range 16 .. 31;
      Ilc              at 1*Word range 0  .. 1;  -- second word
      Cc               at 1*Word range 2  .. 3;
      Program_Mask     at 1*Word range 4  .. 7;
      Inst_Address     at 1*Word range 8  .. 31;
  end record;
30
for Program_Status_Word'Size use 8*System.Storage_Unit;
for Program_Status_Word'Alignment use 8;
NOTES
31
15  Note on the example: The record_representation_clause defines the record layout. The Size clause guarantees that (at least) eight storage elements are used for objects of the type. The Alignment clause guarantees that aliased, imported, or exported objects of the type will have addresses divisible by eight.

Notice that Ada allows not only the number of bits for each field, it also allows the programmer to specify the value range for each field, the bit location relative to the start of each word of the structure, and even leave some unused bits. C++ requires the use of unnamed bit-fields (with a size greater than 0) to mark unused bits. The C++ compiler has no way to know which bits the programmer wants to skip without specifying an unnamed field. Strangely, an unnamed field with a size of 0 indicates that the next field should start at an allocation unit's boundary, which is highly dependent on the hardware architecture. For instance, on an 8-bit processor the allocation boundary will be every 8 bits, while on a 32-bit processor the allocation boundary might be every 8 bits, 16 bits, or 32 bits, depending upon the processor architecture.

As a software safety engineer I find the implementation-defined aspects of C++ bit-fields to be very unsettling. A project may verify that its C++ bit-fields work as intended on a particular architecture, but when a technology refresh is performed in several years, there is no assurance that the C++ program will continue to work properly on the new hardware. The failure of the C++ program to run on the new hardware will be received as a complete surprise by most users of the upgraded system and by their management. The surprise may be accompanied by hazardous events because of incorrect control of a safety critical system. People may be injured or die, the environment may be seriously damaged, and very expensive systems may be damaged or destroyed. The root cause of the hazards will not be poor functional requirements, poor design, or programming mistakes. The root cause of the hazards will be exposure of implementation-defined behaviors in a programming language. Who will the lawyers sue over the consequences of those hazards?

Thursday, March 20, 2014

Ada vs. the JSF C++ Coding Standard

In this article I compare safety critical C++ coding rules for arrays with built in Ada array capabilities.

Use of Arrays

C and C++ array indexing always begins at 0 because that indicates a 0 offset from the address of the start of the array. The array name is actually a pointer the the start of the array. The index value is used to calculate the memory address offset from the beginning of the array. It is implied that array index values should always be non-negative. Neither the C nor C++ languages provide rules for compilers to check the validity of array indices. Any offset from the beginning of the array is allowed, including negative offsets, which will access memory locations with lower memory addresses than the beginning of the array.

AV Rule 96

Arrays shall not be treated polymorphically. See Meyers [7], item 3.
Rationale: Array indexing in C/C++ is implemented as pointer arithmetic. Hence, a[i] is equivalent to a+i*SIZEOF(array element). Since derived classes are often larger than base classes, polymorphism and pointer arithmetic are not compatible techniques.

AV Rule 97

Arrays shall not be used in interfaces. Instead, the Array class should be used.
Rationale: Arrays degenerate to pointers when passed as parameters. This “array decay” problem has long been known to be a source of errors.

Ada Alternative

Ada array indexing is not implemented as pointer arithmetic. The index type and index range of an Ada array is always available wherever the array is visible. Ada array index type may be specified to be any discrete type, which includes signed integers, modular integers, and enumerated types.

Ada arrays are first class types, not merely pointers to an address in memory. When an Ada array is passed as a parameter to a subprogram the array does not decay to a pointer as in C or C++. The abstract behavior of Ada arrays is similar to the abstract behavior of the C++ Array class, although there are some very different implementation details.

Circular Buffer Example

------------------------------------------------------------------------
-- Demonstration of a circular array buffer in Ada                    --
------------------------------------------------------------------------
with Ada.Text_IO; use Ada.Text_IO;

procedure Circular_Buffer is
   type Buffer_Index is mod 10;
   type Buffer_Type is array(Buffer_Index) of Integer;

   procedure Alter_Buffer(Item : out Buffer_Type) is
      Index : Buffer_Index := Buffer_Index'First;
   begin
      for I in 1..15 loop
         Item(Index) := I;
         Index := Index + 1;
      end loop;
   end Alter_Buffer;

   Buffer : Buffer_Type := (1,2,3,4,5,6,7,8,9,10);

begin
   Put_Line("Initial Buffer Values:");
   for Element of Buffer loop
      Put_Line(Integer'Image(Element));
   end loop;

   Alter_Buffer(Buffer);

   Put_Line("Final Buffer Values:");
   for Element of Buffer loop
      Put_Line(Integer'Image(Element));
   end loop;
end Circular_Buffer;

Program Output:

Initial Buffer Values:
1
2
3
4
5
6
7
8
9
10
Final Buffer Values:
11
12
13
14
15
6
7
8
9
10

Comparison of three algorithms for summing an array of integers

This article shares a comparison of the time taken to sum an array of integers. This article tests three algorithms for summing the array. •...