DOES IT MAKE SENSE TO SOLVEDIOPHANTINE LINEAR EQUATION?

We consider the problem solution of diophantine linear equation - Diophantine problem: Given an array of positive integers A[1], ..., A[N] and an positive integer C find a subset of A[1], ..., A[N] that sums to C. This problem is NP-complete. An special case for even C, where C = (A[1] + A[2] + ... + A[N]) div 2, is the partition problem. I present very simple exponential-time algorithm for finding their solutions.

# ... it can't be solved by WORLD'S FASTEST COMPUTER

Following citations have been selected from various sources):

The only known solution is brute force: test all possible subsets until the sum is found or you run out of subsets. For example for N = 300 the number of subsets is 2300, which is greater than the estimated total number of electrons, protons, and neutrons, in the entire universe. Even a computer counting at a speed of over a million or billion each second, would not reach the result until long after our sun had burned out. When you have partition items from an office, i.e. for N (approximately) equals 100, into 2 disjoint subsets so that the sum of their values in each of the two subsets is equal it doesn't make sense to solve by WORLD'S FASTEST COMPUTER.

# Algorithm DIOPHANT

I introduce a very simple algorithm for finding of a solution of this problem.

The DIOPHANT Algorithm

Sort the elements of an aray into ascending order.

Compute Ls[1] = A[1], Ls[2] = Ls[1] + A[2], ..., Ls[J] = Ls[J - 1] + A[J], ... for J = 3, ..., N.

If A[1] <= C & C <= Ls[N] then
The array Ls is scanned from the left (for indexes K = 1, 2, ...) to find an element Ls[K] >= C.

If Ls[K] = C then we found a solution and it includes as well elements A[1], ..., A[K]; algorithm ends.

The array A is scanned from the right (for indexes J = N, N - 1, ...) to find an element A[R] <= C.

If A[R] = C then we found a solution and it includes as well an element A[R]; algorithm ends.

Now, for L = K, ..., R while A[1] <= C - A[L] we will suppose that A[L] is a part of the solution and we will search a subset of A[1], ..., A[L - 1] that sums to C - A[L] (current sum), i.e. recursive call of this ghostwhite-coloured part of the description of algorithms for:

N = L - 1 and C = C - A[L]

In this article I present a non-recursive version of this algorithm in the Rexx language:

DIOPHANT
call SORT 1, N
Ls.1 = A.1
do I = 2 to N; Im1 = I - 1; Ls.I = A.I + Ls.Im1; end
if A.1 <= C & C <= Ls.N then do
S = 1; Stack.1 = N C
do while S > 0
parse var Stack.S R C V; S = S - 1
do K = 1 while Ls.K < C; end
if Ls.K = C then call EXIST V, K, 1
do while A.R > C; R = R - 1; end
if A.R = C then call EXIST V, R, R
do L = K to R while A.1 <= C - A.L
D = V A.L; S = S + 1; Stack.S = (L - 1) (C - A.L) D
end
end
end
say "Solution not exist"

An solution is stored into the V variable as an string of numbers separated by blanks. Into the stack are stored by the push operation

S = S + 1; Stack.S = (L - 1) (C - A.L) D

following data for the solution of a subproblem: size of an array, a current value of C, a current part of the solution. When a solution exists the EXIST procedure is called. My simple example of this procedure writes a solution on a screen and ends the computation:

EXIST
EXIST: procedure expose A.
parse arg V, B, E
do J = B to E by -1; V = V A.J; end
say "Solution:" V
exit

When we replace the exit instruction with S = S - 1; return instructions then the DIOPHANT will find all solutions.

# Worst Case - Complexity O(2N)

DIOPHANT is an exponential-time algorithm. The column entitled P(N) includes number of push operations for given N.

Exponential complexity, the partition problem
A[1] = 2, A[J] = 100 + A[J - 1], J = 2, ..., N
 N Exist P(N) time (sec) P(N) / P(N - 1) 2 1 0 3 1 0 1.00 4 + 3 0 5 3 0 6 5 0 1.67 7 8 0 1.60 8 + 7 0 9 27 0 10 44 0 1.63 11 77 0.05 1.75 12 + 14 0 13 265 0.11 14 429 0.16 1.60 15 772 0.28 1.80
 N Exist P(N) time (sec) P(N) / P(N - 1) 16 + 25 0 17 2875 0.94 18 4742 1.60 1.65 19 8761 2.97 1.85 20 + 38 0.06 21 33672 11.26 22 56392 19.66 1.67 23 105581 37.57 1.87 24 + 54 0 25 414524 143.74 26 703409 250.96 1.70 27 1329633 466.22 1.89 28 + 75 0.06 29 5296942 1794.41

 Notes on the partition problem

We can spare about half of the number of push operations when we will find only one subset S, A[N] is contained in S. I.e. the Diofantine problem of size N - 1: Given an array of positive integers A[1], ..., A[N - 1] and an positive integer C - A[N] find a subset of A[1], ..., A[N - 1] that sums to C - A[N], where C = (A[1] + ... + A[N]) div 2.

In this article I will solve the partition problem, as the "usual" Diophantine problem, also if (A[1] + ... + A[N]) div 2 is an odd number.

# Average? Case - Examples of Polynomial Complexity

What is average case for the Diophantine problem? That is the Question!

# Test #1

For given N = 10, 20, 30, 40, 50, 100, 200, 300, 400, 500 I generated an array A by the following fragment of program. Note that in the Rexx language the call of the RANDOM(Min, Max) built-in function returns a pseudorandom number from the interval <Min, Max>, where Max <= 100000.

Generating elements of A array in test #1
Sum = 0
do J = 1 to N
A.J = RANDOM(1, 100000); Sum = Sum + A.J
end

For the partition problem I computed the C number by integer divide (operator % in Rexx)

C for the partition problem
C = Sum % 2

For the general Diophantine problem I computed C by expression

C for the Diophantine problem
C = FORMAT(Sum * RANDI(),,0)

where the FORMAT built-in function returns integer - rounded values Sum * RANDI(). The RANDI function returns random numbers between 0 and 1. I used "the minimal standard generator" from Park S.K., Miler K.W.: Random Number Generators: Good ones are hard to find. CACM 1988, Vol. 31, No. 10, p. 1192-1201. Seed is a global variable used to hold the current value in the integer sequence. This generator must be initialized by assigning Seed a value between 1 and 2147483646. Random numbers uniformly distributed between 0 and 1 then can be generated as required via repeated calls to RANDI. Note: // is operator of operation Remainder - divide and return only the remainder.

RANDI
RANDI: procedure expose Seed
numeric digits 14;
A = 16807; M = 2147483647; Seed = (A * Seed) // M
return Seed / M

I repeated 100 times (for given N) the generation of A, C and execution of the DIOPHANT algorithm for finding an solution of the Diophantine problem and then 100 times for finding an solution of the partition problem. The following table sumarizes the results.

Partition problem
C = Sum % 2
N ExistAverage P(N)Average time (sec)
10 0 53.3 0.0205
20 86 7314.93 2.6880
30 100 3493.63 1.3191
40 100 2478.58 0.9303
50 100 2421.02 0.9429
100 100 2035.57 0.5999
200 100 5149.09 1.1845
300 100 10868.98 2.6189
400 100 18804.96 5.5589
500 100 29206.94 12.7907
Diophantine problem
C = FORMAT(Sum * RANDI(), , 0)
N ExistAverage P(N)Average time (sec)
10 0 29.42 0.0109
20 53 4673.77 1.5646
30 86 3480.79 1.1818
40 97 3022.63 1.0483
50 94 1956.00 0.6834
100 99 4050.09 0.8992
200 100 5149.09 1.1845
300 100 7814.071.9347
400 100 14301.56 4.3563
500 100 21542.35 62.5650

The following example shows one solution of the partition problem:

A solution of the partition problem for N = 100
2501 push operations, 0.83 sec
430 1989 2414 3616 4786 4851 5081 6774 7651 9716
1136213110 13913 16301 17058 18473 20594 20640 20886 25846
26730 2849530352 30646 32243 32762 34159 35037 36114 36359
39237 40442 4126442796 43803 44301 44400 44777 4572847467
49921 51136 51454 5167953961 54581 55479 57299 5753758066
59220 59308 59901 60066 6118161579 62559 62892 6390664905
65776 67353 67776 70076 70171 7131671796 75373 7755580520
81273 81681 81804 81815 83093 83490 8451384750 8493185057
88009 88180 88419 88569 89708 90075 90418 909929137592527
92877 93400 94319 94734 97172 98091 98170 98173 9833799363

 5552260 = 2776130 + 2776130

# Test #2

In the second test I generated the elements of A array by the following fragment (the first instruction of the main program was: numeric digits 16):

Generating elements of A array in the test #2
Sum = 0
do J = 1 to N
A.J = FORMAT(RANDI() * 21474836.47,,0)
Sum = Sum + A.J
end

I solved the partition problem for maximum value of item more than 21 millions. I initialized Seed by assigning the value 214748364 and then I executed the DIOPHANT algorithm 20 times for N = 100, 200, 300, 400, 500. Then I initialized Seed by assigning the value 19685089 and I continued for N = 30, 40, 50, 20, 10:

The partition problem #2
NExistAverage P(N) Maximum P(N)
10 52 98
20 22 534 40 733
30 20× 1 293 339 5 023 110
4020×777 078 4 592 315
50 20× 681 585 2 797 498
100 20×117 536 435 236
200 20×88 077 454 263
300 20× 81 816 263 396
400 20× 83 037 344 113
500 20×64 972 151 288

# Conclusion

When an array A will contain really values of items (about 500 values of items from office or garage or hangar or dock) then the problem can be solved almost always only by the Rexx interpreter and an ordinary PC in a reasonable amount of time..