Parallel summation of a large array without shared data locks
The traditional producer/consumer pattern employs a shared
buffer between the producer and the consumer.
Many producer/consumer problems are simply sequential problems with the
overhead of multiple tasks and a shared buffer.
Parallel operations, on the other hand, are more naturally
concurrent without the locking overhead of a shared buffer. Instead
non-overlapping data elements of a collection such as an array are assigned to
two or more tasks, and identical tasks process their subsets of the collection
without need of locking the collection.
If the parallel task
is to sum all the elements in the array then task 1 in the diagram above will
sum the elements in the first half of the array while task 2 sums the elements
in the second half of the array. Task 1 and task 2 then simply report their
subtotals to the parent task which adds the two values and returns the final
total.
The following source code is an Ada package for parallel
addition along with a procedure to test the package.
package Parallel_Addition is
type Data_Array is array(Integer range
<>) of Integer;
type Data_Access is access all Data_Array;
function Sum(Item : in not null Data_Access)
return Integer;
end Parallel_Addition;
The package specification above defines an array type that
can be used by the Sum function. The Sum function declares a parameter of the
type Data_Accesss so that the function can handle arrays created either on the
stack or on the heap.
package body Parallel_Addition
is
---------
-- Sum --
---------
function Sum (Item : in not null
Data_Access) return Integer is
task type Adder is
entry Set (Min : Integer; Max :
Integer);
entry Report (Value : out Integer);
end Adder;
task body Adder is
Total : Integer := 0;
First : Integer;
Last
: Integer;
begin
accept Set (Min : Integer; Max : Integer)
do
First := Min;
Last := Max;
end Set;
for I in First .. Last loop
Total := Total + Item (I);
end loop;
accept Report (Value : out Integer) do
Value := Total;
end Report;
end Adder;
A1
: Adder;
A2
: Adder;
R1
: Integer;
R2
: Integer;
Mid : constant Integer := (Item'Length /
2) + Item'First;
begin
A1.Set (Min => Item'First, Max =>
Mid);
A2.Set (Min => Mid + 1, Max =>
Item'Last);
A1.Report (R1);
A2.Report (R2);
return R1 + R2;
end Sum;
end Parallel_Addition;
The package body for Parallel_Addition simply implements the
Sum function. The Sum function defines a task type named Adder. That task type
has two entries. The Set entry receives the minimum and maximum index values to
be processed. The Report entry passes the locally calculated subtotal back to
the Sum function. Each instance of Adder sums the values in the index range
from Min to Max in the array passed as the Sum formal parameter Item, then
passes the local sum back through the Report entry.
Two instances of Adder are created as well as two variables
to contain results, one result for each Adder task. The variable Mid calculates
the middle index value of the array Item.
Adder tasks A1 and A2 suspend at their Set entry until their
Set entry is called. The then concurrently process the array slices indicated
by their Min and Max values. They then suspend until their Report entry is
called.
The Sum function simply calls the two Set entries and then
calls the two Report entries. Finally Sum returns the sum of R1 and R2.
The test procedure for the Parallel_Addition package is:
with Parallel_Addition; use
Parallel_Addition;
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Calendar; use Ada.Calendar;
procedure
Parallel_Addition_Test is
The_Data : Data_Access := new Data_Array (1
.. Integer'Last);
Start
: Time;
Stop
: Time;
The_Sum
: Integer;
begin
The_Data.all := (others => 1);
Start
:= Clock;
The_Sum
:= Sum (The_Data);
Stop
:= Clock;
Put_Line ("The sum is: " &
Integer'Image (The_Sum));
Put_Line
("Addition elapsed time is "
&
Duration'Image (Stop - Start) &
" seconds.");
Put_Line
("Time per addition operation is
" &
Float'Image(Float(Stop - Start) /
Float(The_Data'Length)) &
" seconds.");
end Parallel_Addition_Test;
The variable The_Data is an instance of Data_Access which
accesses an array containing Integer’Last data elements. The variables Start
and Stop are used to capture the time required to calculate the sum of all
values in the array.
All the values of the array accessed by the variable
The_Data are initialized to 1 to ensure that the resulting sum does not exhibit
integer overflow. The variables Start and Stop record the time just before
summing the data and just after summing the data. The difference in the two
time values is the approximate elapsed time to calculate the sum. The average
time per addition operation is simply the elapsed time divided by the number of
data elements processed.
An output of this program, run on a Windows 10 computer, is:
The sum is: 2147483647
Addition elapsed time is 5.661118000 seconds.
Time per addition operation
is 2.63616E-09 seconds.
The sum is also the number of array elements processed. This
large array was used to produce a statistically significant timing sample.
Comments
Post a Comment