Wednesday, November 19, 2014

Graph Representation Using an Adjacency List
Summary
The graph abstract data type is very useful for identifying relationships between entities. Graphs can be used to represent navigation on a map, with the entities being the way-points and the connections between the way-points being the named routes, with additional attributes of distance and travel time. Graphs can also be used to represent states and state transitions for finite state diagrams.

Both maps and finite state diagrams benefit from a directed graph, where relationships are directional. There are also undirected graphs, which can be used to model such things as molecular structures, where the atoms are the entities and the connections between them are the chemical bonds.

The following example shows an implementation of an undirected graph with a fixed number of entities. A fixed length graph is an efficient representation when the number of entities you are modeling will not grow, such as when you are modeling the structure of a protein. This example defines a generic data type for a graph. The generic data type allows the programmer to specialize the graph upon instantiation by specifying the type of the entity and the number of entities in the graph.

As is my custom, I am using the Ada programming language for my example. The generic abstract data type is defined in an Ada package, and a simple procedure is written to illustrate how the package is used.

Ada Package Specification
The Ada package specification defines the public interfaces to the abstract data type and the private internal data representations used in the abstract data type.
------------------------------------------------------------------
-- Generic Undirected Graph Package --
------------------------------------------------------------------
generic
   type Element is private;
   Max_Elements : Positive;
package Fixed_Length_Graph is
   subtype Element_Id is Positive range 1..Max_Elements;
   type Adjacency_List is array(Element_Id) of Boolean;
   type Undirected_Graph is tagged private;
   Procedure Add_Element(to : in out Undirected_Graph;
                              Item : in Element;
                              Id   : in Element_Id);
   Procedure Add_Adjacencies(to : in out Undirected_Graph;
                                   Item : in Element_Id;
                                   List : in Adjacency_List);
   function Get_Element(From : in Undirected_Graph;
                               Id   : in Element_Id) return Element;
   function List_Adjacencies(From : in Undirected_Graph;
                             Id   : in Element_Id) return Adjacency_List;
   No_Element_Error : Exception;
private
   type Adjacency_Matrix is array(Element_Id) of Adjacency_List;
   type Element_Record is record
      Filled : Boolean := False;
      The_Element : Element;
   end record;
   type Element_List is array(Element_Id) of Element_Record;
   type Undirected_Graph is tagged record
      Adjacencies : Adjacency_Matrix := (Others => (Others => False));
      Elements    : Element_List;
   end record;
end Fixed_Length_Graph;

This package provides two public data types: Adjacency_List, which is fully exposed to the public, and Undirected_Graph, which is declared to be private and is therefore an opaque type. The completion of the type Undirected_Graph is seen in the private section of the package specification, along with the private types Adjacency_Matrix, Element_Record, and Element_List, which are completely private, having no exposure to the public.
You may notice that Adjacency_List is an array of boolean values and Adjacency_Matrix is an array of Adjacency_List elements.
Ada Package Body
The Ada package body contains the implementation of all the functions and procedures declared in the Ada package specification. The contents of the Ada package body are completely hidden from any the calling subprograms. One consequence of this is that the calling subprogram never needs to be recompiled if only the package body is modified. The calling subprogram depends upon the interface specified in the package specification, not upon the implementation of that interface.
package body Fixed_Length_Graph is



-----------------
-- Add_Element --
-----------------



procedure Add_Element(to : in out Undirected_Graph;
                    Item : in Element;
                      Id : in Element_Id)is
begin
   To.Elements(Id).The_Element := Item;
   To.Elements(Id).Filled := True;
end Add_Element;



---------------------
-- Add_Adjacencies --
---------------------



procedure Add_Adjacencies(to : in out Undirected_Graph;
                        Item : in Element_Id;
                        List : in Adjacency_List) is
begin
   for I in List'Range loop
      To.Adjacencies(Item)(I) := List(I);
      To.Adjacencies(I)(Item) := List(I);
   end loop;
end Add_Adjacencies;



-----------------
-- Get_Element --
-----------------



function Get_Element(From : in Undirected_Graph;
                       Id : in Element_Id) return Element is
begin
   if From.Elements(Id).Filled then
      return From.Elements(ID).The_Element;
   else
      raise No_Element_Error;
   end if;
end Get_Element;



----------------------
-- List_Adjacencies --
----------------------



function List_Adjacencies(From : in Undirected_Graph;
                            Id : in Element_Id) return Adjacency_List is
begin
   return From.Adjacencies(Id);
end List_Adjacencies;



end Fixed_Length_Graph;
Remember that the Adjacency_Matrix is defined as an array of arrays, not as a two-dimensional array. This was done to simplify the ability to return the adjacencies associated with a specific entity or node.

Testing the Undirected Graph Package
In this test I have decided to create a graph of bounded string values. The graph I am representing is shown in the diagram below.



Test Source Code
------------------------------------------------------------------
-- Second test of Fixed_Length_Graph package --
------------------------------------------------------------------
with Ada.Text_Io; use Ada.Text_IO;
with Ada.Strings.Bounded;
with Fixed_Length_Graph;

procedure Graph_Test_02 is
   package B_Strings is new Ada.Strings.Bounded.Generic_Bounded_Length(25);
   use B_Strings;
   package Str_Graph is new Fixed_Length_Graph(B_Strings.Bounded_String, 7);
   use Str_Graph;
   The_Graph: Undirected_Graph;
   Adj_List : Adjacency_List;

   Subtype Index is Positive range 1..7;
   type Str_Array is array(Index) of B_Strings.Bounded_String;
   Names : Str_Array;


   Adj_Array : Array(Index) of Adjacency_List;

begin
   Names(1) := To_Bounded_String("One");
   Names(2) := To_Bounded_String("Two");
   Names(3) := To_bounded_String("Three");
   Names(4) := To_Bounded_String("Four");
   Names(5) := To_Bounded_String("Five");
   Names(6) := To_Bounded_String("Six");
   Names(7) := To_Bounded_String("Seven");
   for I in Names'Range loop
      The_Graph.Add_Element(Names(I), I);
   end loop;
   Adj_Array(1) := (2 => True, 3 => True, Others => False);
   Adj_Array(2) := (1 => True, 3 => True, 4 => True, Others => False);
   Adj_Array(3) := (1 => True, 2 => True, Others => False);
   Adj_Array(4) := (2 => True, 5 => True, 6 => True, Others => False);
   Adj_Array(5) := (4 => True, 7 => True, Others => False);
   Adj_Array(6) := (4 => True, 7 => True, Others => False);
   Adj_Array(7) := (5 => True, 6 => True, Others => False);

   for I in Adj_Array'Range loop
      The_Graph.Add_Adjacencies(I, Adj_Array(I));
   end loop;

   for I in Element_Id'Range loop
      Put(To_String(The_Graph.Get_Element(I)) & "-> ");
      Adj_List := The_Graph.List_Adjacencies(I);
      for J in Adj_List'Range loop
         if Adj_List(J) = True then
            Put(To_String(The_Graph.Get_Element(J)) & " ");
         end if;
      end loop;
      New_Line;
   end loop;

end Graph_Test_02;

The output of this program is:



One-> Two Three
Two-> One Three Four
Three-> One Two
Four-> Two Five Six
Five-> Four Seven
Six-> Four Seven

Seven-> Five Six

Friday, March 21, 2014

Ada vs C++ Bit-fields

Uses of Bit-fields
Bit-fields are typically used to implement communication protocols and to create device drivers for actuators or sensors. Frequently the actuators or sensors will use A/D and D/A converters to associate specific bit groupings with analog voltages on device connector pins.
In safety critical systems the bit patterns of the bit-fields must be correctly arranged or communication with the actuator or sensor will be incorrect. Since many safety critical systems require command of actuators, combined with resulting sensor readings to provide a closed loop control system, the ability to correctly define the bit layout for each device interface is highly safety critical.

JSF C++ Coding Standard

AV Rule 154 (MISRA Rules 111 and 112, Revised)

Bit-fields shall have explicitly unsigned integral or enumeration types only.



Rationale: Whether a plain (neither explicitly signed nor unsigned) char, short, int or long bit-field is signed or unsigned is implementation-defined.[10] Thus, explicitly declaring a bit-filed unsigned prevents unexpected sign extension or overflow.
Note: MISRA Rule 112 no longer applies since it discusses a two-bit minimum-length requirement for bit-fields of signed types.

AV Rule 155

Bit-fields will not be used to pack data into a word for the sole purpose of saving space.
Note: Bit-packing should be reserved for use in interfacing to hardware or conformance to communication protocols.
Warning: Certain aspects of bit-field manipulation are implementation-defined.


Rationale: Bit-packing adds additional complexity to the source code. Moreover, bit-packing may not save any space at all since the reduction in data size achieved through packing is often offset by the increase in the number of instructions required to pack/unpack the data.

AV Rule 156  (MISRA Rule 113)

All the members of a structure (or class) shall be named and shall only be accessed via their names.

Rationale: Reading/writing to unnamed locations in memory is error prone.
Exception: An unnamed bit-field of width zero may be used to specify alignment of the next bit-field at an allocation boundary. [10], 9.6(2)



C++ Bit-fields Explained

The following explanation of C++ bit-fields is taken from http://en.cppreference.com/w/cpp/language/bit_field

Bit field

Declares a class data member with explicit size, in bits. Adjacent bit field members may be packed to share and straddle the individual bytes.
A bit field declaration is a class data member declaration which uses the following declarator:
identifier(optional)attr(optional) : size
(1)
The type of the bit field is introduced by the decl-specifier-seq of the declaration syntax.
attr(C++11)
-
optional sequence of any number of attributes
identifier
-
the name of the bit field that is being declared. The name is optional: nameless bitfields introduce the specified number of bits of padding
size
-
an integral constant expression with a value greater or equal to zero. When greater than zero, this is the number of bits that this bit field will occupy. The value zero is only allowed for nameless bitfields and has special meaning: it specifies that the next bit field in the class definition will begin at an allocation unit's boundary.

Explanation

The number of bits in a bit field sets the limit to the range of values it can hold:
#include <iostream>
struct S {
 // three-bit unsigned field,
 // allowed values are 0...7
 
};
int main()
{
S s 
s.b
    std::cout << s.b << '\n'; // output: 0
}
Multiple adjacent bit fields are usually packed together (although this behavior is implementation-defined):
#include <iostream>
struct S {
    // will usually occupy 2 bytes:
    // 3 bits: value of b1
    // 2 bits: unused
    // 6 bits: value of b2
    // 2 bits: value of b3
    // 3 bits: unused
 
};
int main()
{
    std::cout << sizeof(S) << '\n'; // usually prints 2
}
The special unnamed bit field of size zero can be forced to break up padding. It specifies that the next bit field begins at the beginning of its allocation unit:
#include <iostream>
struct S {
    // will usually occupy 2 bytes:
    // 3 bits: value of b1
    // 5 bits: unused
    // 6 bits: value of b2
    // 2 bits: value of b3
 
 
 
 
};
int main()
{
    std::cout << sizeof(S) << '\n'; // usually prints 2
}
If the specified size of the bit field is greater than the size of its type, the value is limited by the type: a std::uint8_t b : 1000; would still hold values between 0 and 255. the extra bits become unused padding.
Because bit fields do not necessarily begin at the beginning of a byte, address of a bit field cannot be taken. Pointers and non-const references to bit fields are not possible. When initializing a const reference from a bit field, a temporary is created (its type is the type of the bit field), copy initialized with the value of the bit field, and the reference is bound to that temporary.
The type of a bit field can only be integral or enumeration type.
A bit field cannot be a static data member.

Notes

The following properties of bit fields are implementation-defined
  • Everything about the actual allocation details of bit fields within the class object
  • For example, on some platforms, bit fields don't straddle bytes, on others they do
  • Also, on some platforms, bit fields are packed left-to-right, on others right-to-left
  • Whether char, short, int, long, and long long bit fields that aren't explicitly signed or unsigned are signed or unsigned.
  • For example, int b:3; may have the range of values 0..7 or -4..3.
(until C++14)

References

  • C++11 standard (ISO/IEC 14882:2011):
  • 9.6 Bit-fields [class.bit]
  • C++98 standard (ISO/IEC 14882:1998):
  • 9.6 Bit-fields [class.bit]


Ada Alternative

Ada allows the programmer to define bit layouts within a record with much more confidence than can be done by the C++ programmer. The Ada 2012 Reference Manual describes the use of record representation clauses.

13.5.1 Record Representation Clauses

1
record_representation_clause specifies the storage representation of records and record extensions, that is, the order, position, and size of components (including discriminants, if any).

Syntax

2
record_representation_clause ::= 
    
for first_subtype_local_name use
      
record [mod_clause]
        {
component_clause}
      
end record;
3
component_clause ::= 
    
component_local_name at position range first_bit .. last_bit;
4
position ::= static_expression
5
first_bit ::= static_simple_expression
6
last_bit ::= static_simple_expression

Name Resolution Rules

7
Each positionfirst_bit, and last_bit is expected to be of any integer type. 

Legality Rules

8/2
The first_subtype_local_name of a record_representation_clause shall denote a specific record or record extension subtype. 
9
If the component_local_name is a direct_name, the local_name shall denote a component of the type. For a record extension, the component shall not be inherited, and shall not be a discriminant that corresponds to a discriminant of the parent type. If the component_local_name has an attribute_designator, the direct_name of the local_name shall denote either the declaration of the type or a component of the type, and theattribute_designator shall denote an implementation-defined implicit component of the type.
10
The positionfirst_bit, and last_bit shall be static expressions. The value of position and first_bit shall be nonnegative. The value of last_bit shall be no less than first_bit – 1. 
10.1/2
   If the nondefault bit ordering applies to the type, then either: 
10.2/2
the value of last_bit shall be less than the size of the largest machine scalar; or
10.3/2
the value of first_bit shall be zero and the value of last_bit + 1 shall be a multiple of System.Storage_Unit. 
11
At most one component_clause is allowed for each component of the type, including for each discriminant (component_clauses may be given for some, all, or none of the components). Storage places within acomponent_list shall not overlap, unless they are for components in distinct variants of the same variant_part.
12
A name that denotes a component of a type is not allowed within a record_representation_clause for the type, except as the component_local_name of a component_clause.

Static Semantics

13/2
 record_representation_clause (without the mod_clause) specifies the layout.
13.1/2
   If the default bit ordering applies to the type, the positionfirst_bit, and last_bit of each component_clause directly specify the position and size of the corresponding component.
13.2/3
   If the nondefault bit ordering applies to the type, then the layout is determined as follows:
13.3/2
the component_clauses for which the value of last_bit is greater than or equal to the size of the largest machine scalar directly specify the position and size of the corresponding component;
13.4/2
for other component_clauses, all of the components having the same value of position are considered to be part of a single machine scalar, located at that position; this machine scalar has a size which is the smallest machine scalar size larger than the largest last_bit for all component_clauses at that position; the first_bit and last_bit of each component_clause are then interpreted as bit offsets in this machine scalar. 
14
record_representation_clause for a record extension does not override the layout of the parent part; if the layout was specified for the parent type, it is inherited by the record extension. 

Implementation Permissions

15
An implementation may generate implementation-defined components (for example, one containing the offset of another component). An implementation may generate names that denote such implementation-defined components; such names shall be implementation-defined attribute_references. An implementation may allow such implementation-defined names to be used in record_representation_clauses. An implementation can restrict such component_clauses in any manner it sees fit. 
16
If a record_representation_clause is given for an untagged derived type, the storage place attributes for all of the components of the derived type may differ from those of the corresponding components of the parent type, even for components whose storage place is not specified explicitly in the record_representation_clause.

Implementation Advice

17
The recommended level of support for record_representation_clauses is: 
17.1/2
An implementation should support machine scalars that correspond to all of the integer, floating point, and address formats supported by the machine.
18
An implementation should support storage places that can be extracted with a load, mask, shift sequence of machine code, and set with a load, shift, mask, store sequence, given the available machine instructions and run-time model.
19
A storage place should be supported if its size is equal to the Size of the component subtype, and it starts and ends on a boundary that obeys the Alignment of the component subtype.
20/2
For a component with a subtype whose Size is less than the word size, any storage place that does not cross an aligned word boundary should be supported.
21
An implementation may reserve a storage place for the tag field of a tagged type, and disallow other components from overlapping that place. 
22
An implementation need not support a component_clause for a component of an extension part if the storage place is not after the storage places of all components of the parent type, whether or not those storage places had been specified. 
NOTES
23
14  If no component_clause is given for a component, then the choice of the storage place for the component is left to the implementation. If component_clauses are given for all components, the record_representation_clause completely specifies the representation of the type and will be obeyed exactly by the implementation. 

Examples

24
Example of specifying the layout of a record type: 
25
Word : constant := 4;  --  storage element is byte, 4 bytes per word
26
type State         is (A,M,W,P);
type Mode          is (Fix, Dec, Exp, Signif);
27
type Byte_Mask     is array (0..7)  of Boolean;
type State_Mask    is array (State) of Boolean;
type Mode_Mask     is array (Mode)  of Boolean;
28
type Program_Status_Word is
  record
      System_Mask        : Byte_Mask;
      Protection_Key     : Integer range 0 .. 3;
      Machine_State      : State_Mask;
      Interrupt_Cause    : Interruption_Code;
      Ilc                : Integer range 0 .. 3;
      Cc                 : Integer range 0 .. 3;
      Program_Mask       : Mode_Mask;
      Inst_Address       : Address;
end record;
29
for Program_Status_Word use
  record
      System_Mask      at 0*Word range 0  .. 7;
      Protection_Key   at 0*Word range 10 .. 11; -- bits 8,9 unused
      Machine_State    at 0*Word range 12 .. 15;
      Interrupt_Cause  at 0*Word range 16 .. 31;
      Ilc              at 1*Word range 0  .. 1;  -- second word
      Cc               at 1*Word range 2  .. 3;
      Program_Mask     at 1*Word range 4  .. 7;
      Inst_Address     at 1*Word range 8  .. 31;
  end record;
30
for Program_Status_Word'Size use 8*System.Storage_Unit;
for Program_Status_Word'Alignment use 8;
NOTES
31
15  Note on the example: The record_representation_clause defines the record layout. The Size clause guarantees that (at least) eight storage elements are used for objects of the type. The Alignment clause guarantees that aliased, imported, or exported objects of the type will have addresses divisible by eight.

Notice that Ada allows not only the number of bits for each field, it also allows the programmer to specify the value range for each field, the bit location relative to the start of each word of the structure, and even leave some unused bits. C++ requires the use of unnamed bit-fields (with a size greater than 0) to mark unused bits. The C++ compiler has no way to know which bits the programmer wants to skip without specifying an unnamed field. Strangely, an unnamed field with a size of 0 indicates that the next field should start at an allocation unit's boundary, which is highly dependent on the hardware architecture. For instance, on an 8-bit processor the allocation boundary will be every 8 bits, while on a 32-bit processor the allocation boundary might be every 8 bits, 16 bits, or 32 bits, depending upon the processor architecture.

As a software safety engineer I find the implementation-defined aspects of C++ bit-fields to be very unsettling. A project may verify that its C++ bit-fields work as intended on a particular architecture, but when a technology refresh is performed in several years, there is no assurance that the C++ program will continue to work properly on the new hardware. The failure of the C++ program to run on the new hardware will be received as a complete surprise by most users of the upgraded system and by their management. The surprise may be accompanied by hazardous events because of incorrect control of a safety critical system. People may be injured or die, the environment may be seriously damaged, and very expensive systems may be damaged or destroyed. The root cause of the hazards will not be poor functional requirements, poor design, or programming mistakes. The root cause of the hazards will be exposure of implementation-defined behaviors in a programming language. Who will the lawyers sue over the consequences of those hazards?

Comparison of three algorithms for summing an array of integers

This article shares a comparison of the time taken to sum an array of integers. This article tests three algorithms for summing the array. •...