Saturday, April 22, 2023

Comparison of Bit Array Implementations using C and Ada

 

I found the following programming example in Bit Array in C - Sanfoundry

#include <stdio.h>

#define SIZE (58) /* amount of bits */

#define ARRAY_SIZE(x) (x/8+(!!(x%8)))

char get_bit(char *array, int index);

void toggle_bit(char *array, int index);


void toggle_bit(char *array, int index) {

    array[index / 8] ^= 1 << (index % 8);

}

char get_bit(char *array, int index) {

    return 1 & (array[index / 8] >> (index % 8));

}

int main(void) {

    /* initialize empty array with the right size */

    char x[ARRAY_SIZE(SIZE)] = { 0 };

    int i;

    for (i = 0; i < SIZE; i += 2)

        toggle_bit(x, i);

    toggle_bit(x, 56);

    for (i = 0; i < SIZE; i++)

        printf("%d: %d\n", i, get_bit(x, i));

    return 0;

}

 

The program creates a bit array containing 58 elements. The program works as expected and manages to illustrate very arcane features of the C programming language.

I challenge anybody not familiar with C bit shifting to completely understand how the two void functions named get_bit and toggle_bit actually work.

void toggle_bit(char *array, int index) {

    array[index / 8] ^= 1 << (index % 8);

}

char get_bit(char *array, int index) {

    return 1 & (array[index / 8] >> (index % 8));

}

These functions are examples of the “simplicity” of C programming. While the syntax is compact the semantics of these functions are not. Describing what these two functions do would take several paragraphs of text.

 

As an avid Ada programmer I decided to implement a bit array in Ada and perform the same behaviors on the bit array.

The Ada program is slightly longer, sacrificing compactness for clarity.

-- Ada program to implement a bit array

with Ada.Text_IO; use Ada.Text_IO;

procedure Main is

   type bit is range 0 .. 1;

   type index is range 0 .. 57;

   type bit_array is array (index) of bit with

      Component_Size => 1;

   x   : bit_array := (others => 0);

   Idx : index     := 0;

begin

   for I in x'Range loop

      if I mod 2 = 0 then

         toggle_bit (x, I);

      end if;

   end loop;

   toggle_bit (x, 56);


   for I in x'Range loop

      Put_Line (I'Image & ":" & bit'Image (x (I)));

   end loop;

   Put_Line("Size of bit array is" & Integer'Image(X'Size) & " bits.");

   Put_Line("Length of array is" & Integer'Image(X'Length));

end Main;

 

The Ada programming language measures the size of data types in units of bits while the C program must convert bytes to bits in determining its size.

Compare the corresponding Ada and C code sections:

C:

#define SIZE (58) /* amount of bits */

#define ARRAY_SIZE(x) (x/8+(!!(x%8)))

. . .

char x[ARRAY_SIZE(SIZE)] = { 0 };

 

Ada:

   type bit is range 0 .. 1;

   type index is range 0 .. 57;

   type bit_array is array (index) of bit with

      Component_Size => 1;

. . .

x   : bit_array := (others => 0);

 

The Ada code defines an integral type named bit with 0 and 1 as the only valid values. The Ada code then defines an index range to be used in the array type. Knowing the index range prevents buffer overflows in other parts of the program.

The array type bit_array is declared. C provides no means to define array types, only array instances.

In the C example the array named x is defined as an array of type char, but we are not interested in values of type char. We are interested in the bit values within each char element. The size of the char array must be declared to be the number of char elements needed to store 58 bits. Each char element is 8 bits, therefore 8 char elements are needed to store 58 bits.

The corresponding Ada source code is somewhat more readable. A bit_array is an array of 58 elements indexed by the values 0 through 57. The size of each array component is 1, which in Ada terms is 1 bit.

The variable x in the C example is declared to be an array of 8 char data elements. All elements (and all 64 bits) are initialized to 0.

The variable x in the Ada example is declared to be an instance of bit_array, which is declared to be an array of 58 bits. Each element is initialized to 0. Only the 58 bits are initialized to 0.

The C function named toggle_bit is difficult to understand.

void toggle_bit(char *array, int index) {

    array[index / 8] ^= 1 << (index % 8);

}

 

The purpose of this function is to identify the bit specified by the index parameter and toggle the bit. If the bit is 1 then set it to zero. If the bit is 0 then set it to 1.

The Ada procedure named toggle_bit performs the same behavior with a somewhat more verbose, but also clearer syntax.

   procedure toggle_bit (arr : in out bit_array; Idx : index) is

   begin

      if arr (Idx) = 1 then

         arr (Idx) := 0;

      else

         arr (Idx) := 1;

      end if;

   end toggle_bit;

 

Note that the bits in the bit array are indexed with the same syntax used to index any other Ada array. No special syntax is needed. The compiler writes all the low level bit shifting for the programmer, eliminating the need for an explicit get_bit procedure as found in the C example. The Ada version clearly states toggle logic. If the current value of the bit indexed by Idx is 1 then assign 0 to that bit, otherwise assign 1 to that bit.

Also note the obscurity of the C syntax for passing an array to a function. The array is not actually passed to the function. The name of the array is passed as a pointer to the first element of the array, thus the parameter used to “pass” the array is a pointer to char rather than an array. A pointer to char may be a pointer to the first element of an array of char or it may be a pointer to a char which is not a member of an array of char. C requires the programmer to know whether or not the parameter points to an array. The C syntax also does not specify whether or not the actual parameter passed to this function may be modified by the function.

The Ada procedure specifies the parameter arr is an instance of bit_array. Furthermore, the passing mode “in out” specifies that the value of the array is used and modified by the procedure, specifying that the actual parameter passed to this procedure may be modified by the procedure.

Outputs:

The output of the C program is:

0: 1

1: 0

2: 1

3: 0

4: 1

5: 0

6: 1

7: 0

8: 1

9: 0

10: 1

11: 0

12: 1

13: 0

14: 1

15: 0

16: 1

17: 0

18: 1

19: 0

20: 1

21: 0

22: 1

23: 0

24: 1

25: 0

26: 1

27: 0

28: 1

29: 0

30: 1

31: 0

32: 1

33: 0

34: 1

35: 0

36: 1

37: 0

38: 1

39: 0

40: 1

41: 0

42: 1

43: 0

44: 1

45: 0

46: 1

47: 0

48: 1

49: 0

50: 1

51: 0

52: 1

53: 0

54: 1

55: 0

56: 0

57: 0

 

The output of the Ada program is:

 0: 1

 1: 0

 2: 1

 3: 0

 4: 1

 5: 0

 6: 1

 7: 0

 8: 1

 9: 0

 10: 1

 11: 0

 12: 1

 13: 0

 14: 1

 15: 0

 16: 1

 17: 0

 18: 1

 19: 0

 20: 1

 21: 0

 22: 1

 23: 0

 24: 1

 25: 0

 26: 1

 27: 0

 28: 1

 29: 0

 30: 1

 31: 0

 32: 1

 33: 0

 34: 1

 35: 0

 36: 1

 37: 0

 38: 1

 39: 0

 40: 1

 41: 0

 42: 1

 43: 0

 44: 1

 45: 0

 46: 1

 47: 0

 48: 1

 49: 0

 50: 1

 51: 0

 52: 1

 53: 0

 54: 1

 55: 0

 56: 0

 57: 0

Size of bit array is 64 bits.

Length of array is 58

 

Tuesday, April 4, 2023

Poor Quality C Programming Examples

 

C is hard enough without low quality programming examples

I recently read through some of the C programming examples from Sanfoundry. Some of the examples are well constructed while others are very poorly constructed.

One of the bad examples I read is found under the heading of Simple C Programs. The example purports to show how to count the number of vowels and consonants in an input sentence.

The source code for the example follows.

/*

 * C program to read a sentence and count the total number of vowels

 * and consonants in the sentence.

 */

#include <stdio.h>

 

void main()

{

    char sentence[80];

    int i, vowels = 0, consonants = 0, special = 0;

 

    printf("Enter a sentence \n");

    gets(sentence);

    for (i = 0; sentence[i] != '\0'; i++)

    {

        if ((sentence[i] == 'a' || sentence[i] == 'e' || sentence[i] ==

        'i' || sentence[i] == 'o' || sentence[i] == 'u') ||

        (sentence[i] == 'A' || sentence[i] == 'E' || sentence[i] ==

        'I' || sentence[i] == 'O' || sentence[i] == 'U'))

        {

            vowels = vowels + 1;

        }

        else

        {

            consonants = consonants + 1;

        }

        if (sentence[i] =='\t' ||sentence[i] =='\0' || sentence[i] ==' ')

        {

            special = special + 1;

        }

    }

    consonants = consonants - special;

    printf("No. of vowels in %s = %d\n", sentence, vowels);

    printf("No. of consonants in %s = %d\n", sentence, consonants);

}

 

This program does not actually identify a sentence. Instead it merely reads a single input line from stdin. The first conditional carefully identifies the values of the vowels and then assumes anything not a vowel is a consonant. A second conditional makes a weak attempt to find “special” characters that are not letters, but this conditional only looks for tabs, spaces and end of string null characters. It completely ignores punctuation and numeric digits. Sentences often contain punctuation and numeric digits, which are neither vowels nor consonants. The error shows in an inflated count of consonants when an input string contains punctuation and/or numeric digits.

The example program is poorly designed and does not fulfill the requirements expressed on the page C Program to Count the Number of Vowels and Consonants in a Sentence - Sanfoundry.

I wrote an Ada program to achieve the same stated goals. This program uses approximately the same number of lines of source code as the C program while avoiding the problem of miscounting consonants.

-- Ada program to read a line of text and count the number of vowels

-- and consonants in the text

 

with Ada.Text_IO; use Ada.Text_IO;

 

procedure Main is

   subtype Letter is Character with

        Static_Predicate => Letter in 'a' .. 'z' | 'A' .. 'Z';

   subtype Vowel is Character with

     Static_Predicate => Vowel in 'a' | 'e' | 'i' | 'o' | 'u' |

       'A' | 'E' | 'I' | 'O' | 'U';

 

   Line       : String (1 .. 80);

   vowels     : Natural := 0;

   consonants : Natural := 0;

   Length     : Natural;

begin

   Put_Line ("Enter a sentence:");

   Get_Line (Item => Line, Last => Length);

   for I in 1 .. Length loop

      if Line (I) in Letter then

         if Line (I) in Vowel then

            vowels := vowels + 1;

         else

            consonants := consonants + 1;

         end if;

      end if;

   end loop;

   Put_Line

     ("Number of vowels in " & Line (1 .. Length) & " is" & vowels'Image);

   Put_Line

     ("Number of consonants in " & Line (1 .. Length) & " is" &

      consonants'Image);

end Main;

 

The Ada program explicitly defines a subtype of Character containing only English letters. It also defines a subtype of Character defining only the vowels defined in the C program.

The first conditional in the Ada program filters out all characters that are not letters. Within that conditional block the current character is tested for being a vowel. If it is not a vowel then it can only be a consonant. Both the vowel count and the consonant count are correct because the program selected only letters and filtered out all non-letter characters.

C language philosophy

The C language expects the programmer to perform all error checking while only providing very primitive means of specifying the correct data conditions. Historically C has concentrated its design on execution speed at the expense of programmer effort.

Ada language philosophy

The Ada language provides syntax tools to define subtypes of a data type based upon either a range of values or set of values. The Ada example above defines two subtypes of the predefined type Character. One subtype defines the set of all lower case and upper case letters. The other subtype defines the set of lower case and upper case letters identified as vowels.

Each of these subtypes is expressed in a very compact manner in a single expression.

These subtypes express exactly what is a letter and what is a vowel.

Conclusion

The C program never specifies what is a letter. The result is a latent error in program logic which most beginning C students would be unlikely to identify. Eliminating this error would require either a laborious list of letters in C or a list of the numeric ranges representing lower case letters and upper case letters. The first option greatly obscures the meaning of the filtering by its complexity. The second option greatly obscures the meaning of the filtering by using the numeric representation of the letters in a compound conditional expression such as

If ((sentence[i] >= 65 && sentence[i] <= 90) || (sentence[i] >= 97 && sentence[i] <= 122))

The Ada equivalent is accomplished by defining the set of values constituting a letter

subtype Letter is Character with

        Static_Predicate => Letter in 'a' .. 'z' | 'A' .. 'Z';

 

Followed by a simple conditional

if Line (I) in Letter then

The Ada “in” operator returns True if Line (I) is a member of the set of values defined for Letter and False if Line (I) is not a member of the set of values define for Letter.

The set of letters is defined as all the characters in the range starting at ‘a’ and ending at ‘z’ and all the characters in the range starting at ‘A’ and ending at ‘Z’. While this set defines the same values as the C conditional example above, it does so in a very clear manner understandable by programmers of all levels of experience, without resorting to a table of values mapping characters to numeric representations.

The Ada program clearly specifies all the values needed to count vowels and consonants, thereby eliminating the latent defect present in the C program.

Comparison of three algorithms for summing an array of integers

This article shares a comparison of the time taken to sum an array of integers. This article tests three algorithms for summing the array. •...