Basic notions of
algorithmics
Piotr Chrząstowski-Wachtel
Uniwersytet Warszawski
Algorithmics
The most important part of computer
science
Describes
how to solve algorithmic problems,
how to choose appropriate data structures,
how to analyse program behavior
Results in the most spectacular
improvements of computer
performance!
What is algorithmics about?
Any planning – in particular writing
computer programs
We must remember that computers
must have a rigid and precise
interface. They are quite stupid
machines in general, and will not
deduce anything the way humans
do.
The roots of algorithmics
The first people known to
take a scientific approach
to algorithmic solutions
were the Ancient Greeks
The first algorithmic
problems thoroughly
investigated were the
geometric constructions
Plato
Geometric constructions
(Plato's approach)
The aim is to determine certain
objects on a plane (points, lines and
circles), satisfying some given
properties.
Example problem:
Given a circle o(O,r) and a point A outside
the circle, draw a tangent line passing through A.
Is such a solution legal?
We position the compass needle at A,
and rotate the ruler around it until the
first point of the circle shows up.
We draw a line connecting A with this
point.
Plato forbade such operations
... like sliding or rotating a ruler, and
many others, like drawing a parabola
or a spiral, or using the 3rd dimension.
What is allowed then?
Domain of geometric
constructions
We focus on 3 kinds of objects: points,
lines and circles
On these objects only 5 operations are
legal.
Legal geometric operations
For 2 points draw a line crossing them,
For 2 points draw a circle, having a center in
one of them and crossing the other one,
For 2 lines determine their intersection point
For a line and a circle, determine their
common points
For 2 circles, determine their common points
... and nothing more!
Notation for legal operations
Let's invent the names for the legal operations
l := line(X,Y) – the line containing points X and Y
o := circle(O,Y) – the circle with center O and radius OY
X := l ∩ k – the point of intersection of lines l and k
(X,Y) := l ∩ o – the points of intersection of line l and circle o
(X,Y) := o1 ∩ o2 – the points of intersection of circles o1 and o2
All these operations are partial – they are determined only
if the arguments are sensible.
The tangent problem
solution
We present here the solution of the problem: draw
a tangent line to a given circle o with center O, passing
through a given point A outside it:
l := line(O,A) – line l connects the circle's center with the point A
o1 := circle(O,A) – circle with center O and radius OA
o2 := circle(A,O) – circle with center A and radius AO
(P,Q) := o1 ∩ o2 – points of intersection of o1 and o2
k := line(P,Q) – k is the perpendicular bisector of segment OA
X := l ∩ k – the midpoint of OA
o3 := circle(X,O) – circle with center X and radius XO
(R,S) := o ∩ o3 – intersection points of o and o3
s := line(R,A) – line s is one of the 2 tangent lines
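The construction above can be checked numerically. The sketch below is my own illustration, not part of the slides: it follows the same steps with coordinates, and `circle_circle` is a helper I introduce for intersecting two circles.

```python
import math

def circle_circle(c1, r1, c2, r2):
    """Intersection points of two circles (assumes they do intersect)."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    a = (r1 * r1 - r2 * r2 + d * d) / (2 * d)   # distance from c1 to the chord
    h = math.sqrt(r1 * r1 - a * a)              # half of the chord length
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    ox, oy = -(y2 - y1) / d, (x2 - x1) / d      # unit normal to the center line
    return (mx + h * ox, my + h * oy), (mx - h * ox, my - h * oy)

# circle o with center O and radius r, and an external point A (example data)
O, r, A = (0.0, 0.0), 1.0, (3.0, 0.0)

# steps of the construction
dist_OA = math.hypot(A[0] - O[0], A[1] - O[1])
(P, Q) = circle_circle(O, dist_OA, A, dist_OA)   # o1 = circle(O,A), o2 = circle(A,O)
X = ((O[0] + A[0]) / 2, (O[1] + A[1]) / 2)       # X = l ∩ k, the midpoint of OA
(R, S) = circle_circle(O, r, X, dist_OA / 2)     # o3 = circle(X,O); R, S lie on o

# R is a tangency point: OR must be perpendicular to RA
dot = (R[0] - O[0]) * (A[0] - R[0]) + (R[1] - O[1]) * (A[1] - R[1])
print(abs(dot) < 1e-9)   # True
```

The perpendicularity of OR and RA is exactly why the Thales circle o3 works: every angle inscribed on the diameter OA is a right angle.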
Compact version of the
solution
We can present it shortly:
s := line((o ∩1 circle((line(O,A) ∩ line(circle(A,O) ∩ circle(O,A))), O)), A)
By ∩1 we mean here the first of the two intersection
points
... this is more or less what functional
programming looks like
Non-solvable problems
Ancient Greeks could not solve the following
three problems algorithmically within the given
framework:
determine the side of a square with the same area
as a given circle (squaring the circle)
divide an angle into 3 equal parts (angle
trisection)
determine the side of a cube whose volume is
twice that of a given one (cube
doubling)
Unsolvability of some
geometrical construction
problems
In the 19th century it was rigorously proven that
none of the presented 3 problems can be solved!
Maybe it is so because we have too few and too
primitive operations?
But if we add some more operations, will new
unsolvable problems arise?
Unsolvable problems
Much later (20th century) Alan
Turing showed that there exist
algorithmic problems, which are
generally unsolvable by any
computer, no matter how
powerful it is!
Post correspondence
problem
Example:
x1 = abb    y1 = a
x2 = b      y2 = abb
x3 = a      y3 = bb
Does there exist a sequence of indices i1, i2, …, in,
such that xi1…xin = yi1…yin?
The Post correspondence problem is
unsolvable!
However, for many instances the answer can be given
(like for the above example). There does not exist a
general algorithm which, for any input data x1,...,xn
and y1,...,yn, would determine whether one can equalize
the words by the same sequence of indices.
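Although no general algorithm exists, small instances like the one above can be settled by bounded brute force. The sketch below is my own illustration: a breadth-first search over index sequences, cut off at an arbitrary length bound `max_len`.

```python
from collections import deque

# the example instance: pairs (x_i, y_i)
pairs = [("abb", "a"), ("b", "abb"), ("a", "bb")]

def pcp_search(pairs, max_len=12):
    """Bounded breadth-first search for a Post-correspondence solution.
    Returns a list of 1-based indices i1..in with x_i1...x_in == y_i1...y_in,
    or None if no solution of length <= max_len exists."""
    queue = deque([("", "", [])])
    while queue:
        x, y, idx = queue.popleft()
        if idx and x == y:
            return idx
        if len(idx) == max_len:
            continue
        for i, (xi, yi) in enumerate(pairs, start=1):
            nx, ny = x + xi, y + yi
            # keep only branches where one word is a prefix of the other
            if nx.startswith(ny) or ny.startswith(nx):
                queue.append((nx, ny, idx + [i]))
    return None

sol = pcp_search(pairs)
x = "".join(pairs[i - 1][0] for i in sol)
y = "".join(pairs[i - 1][1] for i in sol)
print(sol, x == y)
```

The prefix check prunes almost all branches, so the search stays tiny here; unsolvability of the general problem means no such bound works for every instance.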
Emil Post
Where does the word
"algorithm” come from?
It must be an old word, because in
many languages it sounds similar.
Al Khwarizmi
Abu Ja'far Muhammad
ibn Musa Al-Khwarizmi
(c. 780 – c. 850)
Hisab al-jabr w'al-
muqabala
Described interesting
methods of solving
equations algorithmically!
Introduced the decimal
system
An example from Al
Khwarizmi book
We solve the equation x^2 + 10x = 39
What are we allowed to
do?
use only natural numbers
(segment lengths)
add and subtract segments
multiply (making rectangles)
take square roots (determining
the side length of a square with
the given area)
Completing the square gives x^2 + 10x + 25 = 39 + 25,
a square of area S = 64 and side x + 5 = 8,
hence x = 3
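The arithmetic behind S = 64 can be replayed in a few lines. This is a modern illustration of my own, not Al-Khwarizmi's geometric procedure:

```python
import math

# completing the square for x**2 + 10*x = 39: the x-by-x square plus two
# 5-by-x rectangles plus a 5-by-5 corner square form a square of side x + 5
S = 39 + 25            # total area: 64
side = math.sqrt(S)    # x + 5 = 8
x = side - 5
print(x)               # 3.0
```

Note that only the operations the slide allows are used: addition, and taking the side of a square of known area.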
An example from Euclid
Determine the Greatest Common
Divisor GCD(m,n)
What are we allowed to do? The same as
previously, in particular:
use only natural numbers
add and subtract them
Euclid
GCD(m,n)
There is no formula for GCD(m,n), which
would work for any m and n.
Euclid proved that if we subtract the
smaller number from the greater one,
their GCD will remain the same.
GCD(m,n) for 0 ≤ m ≤ n,
n > 0
The first idea:
The following pseudo-code will do the job:
Input(m,n);
While m>0 do
begin
if m>n then Exchange(m,n);
n:=n-m
end;
Output(n)
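A direct transcription of this pseudo-code into Python might look as follows (a sketch; the function name is mine):

```python
def gcd_subtract(m, n):
    """Euclid 1: GCD by repeated subtraction, for 0 <= m <= n, n > 0."""
    while m > 0:
        if m > n:
            m, n = n, m    # Exchange(m, n)
        n = n - m
    return n

print(gcd_subtract(12, 18))   # 6
```

Correct, but as the next slide shows, hopelessly slow when m is tiny and n is huge.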
Euclid
If m=0 then GCD(m,n)=n
If m>0 then GCD(m,n)=GCD(n–m, m)
Do we see the problem?
How long would it take to perform the
code on a computer with a 1 THz clock for
n=10^30 and m=1?
If n=10^9 and m=1, then it would take ca.
1/40 s.
But 10^21/40 s is more than 790 billion
years! A few dozen times as long as our
Universe has existed!
In RSA cryptography the numbers used
are of the order of 10^200.
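The estimate above is easy to check by scaling: if n = 10^9 takes 1/40 s, then n = 10^30 takes 10^21 times longer. A quick arithmetic sketch:

```python
# scaling the measured 1/40 s for n = 10**9 up to n = 10**30
seconds = 10**21 / 40
years = seconds / (365.25 * 24 * 3600)
print(years / 1e9)      # roughly 792, i.e. about 790 billion years
print(years / 13.8e9)   # a few dozen times the age of the Universe
```

The second line uses 13.8 billion years for the age of the Universe.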
Should we give up?
Let's look more closely at how our algorithm
works.
Subtracting the smaller number m from the bigger n
ends when the result is smaller than m.
What remains is nothing but the remainder of the
division of n by m.
This is the moment when the roles of the numbers are
exchanged: m becomes the bigger number, and n the
smaller.
So all the subtractions do nothing more than
determine the remainder!
Can't we determine the
remainder faster?
Definitely we can – for instance by performing
long (written) division.
So we can modify our algorithm:
Euclid 2
If m=0 then GCD(m,n)=n
If m>0 then GCD(m,n)=GCD(n mod m, m),
where mod stands for the remainder of the
division of n by m
Code of Euclid 2
We can program the algorithm using the
following pseudo-code:
Input(m,n);
While m>0 do
begin
r := n mod m;
n := m;
m := r
end;
Output(n)
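The same pseudo-code in Python, extended with a loop counter so that the complexity discussion below can be checked (a sketch; the counter is my addition):

```python
def gcd_mod(m, n):
    """Euclid 2: GCD via remainders; returns (gcd, number of loop turns)."""
    turns = 0
    while m > 0:
        m, n = n % m, m
        turns += 1
    return n, turns

print(gcd_mod(55, 89))   # (1, 9)
```

Note that the three assignments r := n mod m; n := m; m := r collapse into one simultaneous assignment in Python.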
Have we won anything?
This time the worst data will not be 1 and the
greatest number in the given range. The
remainder would then be 0, and the algorithm
would terminate after one loop turn.
Let's try to determine the data for which Euclid2
will perform the greatest number of loop turns.
What are the worst data for 30-digit numbers?
Pessimistic complexity
... is the number of operations that the algorithm
will perform for the most evil data.
We should determine the range of data in which we
are looking for the most pessimistic case.
Let's calm down. What are the worst data for
Euclid2 in the range 1..99?
Pessimistic complexity of
Euclid2
Let's ask the question from the other side.
Instead of asking what the worst data are, let's
ask how big the data must be at least, in
order to enforce a given number of loop turns.
Let's start with small values. What are the
smallest data which will enforce 0 loop turns?
Clearly (0,1). For 1 loop turn: (1,1). For 2 loop
turns we need at least the numbers (2,3),
because the divisor should be greater than 1,
and the dividend greater than the divisor.
..Complexity of Euclid2
The next values are:
3 loop turns: (3,5), because in order to get "the
cheapest" 2 turns – the pair (2,3) – the divisor
should be 3, and the smallest dividend giving
remainder 2 for that divisor is 5.
4 loop turns: (5,8); divisor 5 and dividend 5+3=8
5 loop turns: (8,13); because 13=8+5
...and so on
In each of these cases, the quotient is 1.
..Complexity of Euclid2
The record is (55,89) – 9 loop turns:
(55,89)
(34,55)
(21,34)
(13,21)
( 8,13)
( 5, 8)
( 3, 5)
( 2, 3)
( 1, 2)
( 0, 1)
Looking at the right column we see that:
•one cannot make 2 loop turns with numbers less
than 3,
•one cannot make 3 loop turns with numbers less
than 5,
•...
•one cannot make 9 loop turns with numbers less
than 89,
and so on...
Fibonacci numbers
So the worst numbers in each row are such that the
smaller of them equals the greater one from the row below,
and the greater one is the sum of the two numbers from
the row below.
(55,89)
(34,55)
(21,34)
(13,21)
( 8,13)
( 5, 8)
( 3, 5)
( 2, 3)
( 1, 2)
( 0, 1)
The numbers in one column form
the so-called Fibonacci sequence:
F0 = 0, F1 = 1,
Fn = Fn-1 + Fn-2 for n > 1
0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 ...
Leonardo Fibonacci
Fibonacci numbers
Here are the numbers of loop turns for given pairs of 2
consecutive Fibonacci numbers:
(F10,F11) 9 turns
(F9,F10) 8 turns
(F8,F9) 7 turns
(F7,F8) 6 turns
(F6,F7) 5 turns
(F5,F6) 4 turns
(F4,F5) 3 turns
(F3,F4) 2 turns
(F2,F3) 1 turn
(F0,F1) 0 turns
From (Fn-1,Fn) for n>2
we obtain n-2 loop turns.
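This claim can be verified mechanically. A small sketch (the helper names fib and turns are mine):

```python
def fib(k):
    """Iterative Fibonacci: F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

def turns(m, n):
    """Number of loop turns Euclid 2 performs on the pair (m, n)."""
    t = 0
    while m > 0:
        m, n, t = n % m, m, t + 1
    return t

# (F_{n-1}, F_n) should force exactly n - 2 loop turns for n > 2
for n in range(3, 15):
    assert turns(fib(n - 1), fib(n)) == n - 2
print(turns(55, 89))   # 9
```

Every division in such a run has quotient 1 (except possibly the last), which is exactly why consecutive Fibonacci numbers are the slowest inputs.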
Fibonacci numbers
Among the 30-digit numbers, the worst data for
Euclid2 are the 2 biggest Fibonacci numbers
within this range.
The index of the greater of them minus 2
equals the biggest number of loop turns
within this range.
The key question is: how fast do they grow?
If fast, then good! Because the index of the
number will be small.
Fibonacci numbers
Fibonacci himself was unable to determine a
formula for "his" numbers. But such a formula
exists!
The formula was discovered by Leonhard Euler
in the 18th century, and later re-discovered
by Jacques Binet in the 19th
century.
Euler-Binet formula
Leonhard Euler
Jacques Binet
Conclusions from the Euler-Binet
formula:
Let's introduce the notation
φ = (1+√5)/2, ψ = (1–√5)/2
These numbers are equal to ca. 1.618... and
–0.618... . The Euler-Binet formula takes the form
Fn = (φ^n – ψ^n)/√5
Now, getting rid of the second component, which
soon becomes close to 0, we get a simplified
formula:
Fn ≈ φ^n/√5
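A quick numerical check of the formula (my own sketch; the rounding absorbs floating-point error, which is safe only for moderate n):

```python
import math

PHI = (1 + math.sqrt(5)) / 2
PSI = (1 - math.sqrt(5)) / 2

def fib_binet(n):
    """Euler-Binet formula, rounded to the nearest integer."""
    return round((PHI**n - PSI**n) / math.sqrt(5))

# compare against the iterative definition F_0 = 0, F_1 = 1
a, b = 0, 1
for n in range(30):
    assert fib_binet(n) == a
    a, b = b, a + b
print(fib_binet(11))   # 89
```

Since |ψ| < 1, the ψ^n term shrinks to 0, which justifies the simplified formula Fn ≈ φ^n/√5.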
Fibonacci numbers grow
exponentially fast!
So from some point on, each Fibonacci
number is more than 60% greater than
the previous one.
Since we want to find the greatest
Fibonacci number less than 10^30, we wish
to solve the inequality
φ^n/√5 < 10^30,
and after taking the logarithm with base φ
we get n < log_φ(√5) + 30·log_φ(10) ≈ 145.2.
Finally n ≤ 145, so certainly n ≤ 150.
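The bound can also be confirmed directly with exact integer arithmetic (a sketch using Python's built-in big integers; helper name turns is mine):

```python
def turns(m, n):
    """Number of loop turns Euclid 2 performs on the pair (m, n)."""
    t = 0
    while m > 0:
        m, n, t = n % m, m, t + 1
    return t

# generate Fibonacci numbers until one crosses 10**30
fibs = [0, 1]
while fibs[-1] < 10**30:
    fibs.append(fibs[-1] + fibs[-2])

# fibs[-2] is the greatest Fibonacci number below 10**30;
# the pair just under it is the worst input in the 30-digit range
m, n = fibs[-3], fibs[-2]
print(len(fibs) - 2, turns(m, n))   # 145 143
```

So the true worst case in the 30-digit range is 143 loop turns, comfortably under the bound of 150.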
We have won!
It turns out that even for such enormous
data as 30-digit numbers, the number of
loop turns will not exceed 150.
There is, however, one difficulty:
programming the mod operation is more
challenging than simple subtraction.
... but who claims that programming is
easy?
Important aspects
Algorithmic domain
Before we start designing algorithms, we
should be aware of the exact set of
objects and operations we wish to build
our algorithm from.
For geometric constructions these were
points, lines, and circles, together with the 5
operations we admitted.
Algorithmic domains for Euclid
algorithms
In the two Euclid algorithms we used
different sets of operations:
Euclid1: (N, >, –, Exchange)
Euclid2: (N, >, mod)
Algorithmic domain
The basic notion of an algorithmic
domain is a set together with
specified operations and relations – a
relational system
(A; F, R)
where A is the carrier, and F and R are
the sets of operations and relations
on the carrier
Structured programming
When we program in a structured way,
we create hierarchical domains, letting
the higher-level domains use the
operations of lower-level domains.
At no level do we want to be
engaged with unnecessary details.
Such a programming style is
characteristic of the languages of
newer generations (object-oriented,
logic and functional programming).
Computational complexity
This is the second important notion of
algorithmics.
We consider the number of
operations, which we call a
computational cost.
We can apply it to
a concrete algorithm
an algorithmic problem.
What do we count?
It would be hard to account for each
operation separately.
Instead we choose the most frequent
operation and the most expensive one,
and concentrate on counting the number
of executions of such an operation for
some data.
The number of all other operations is then
bounded by a linear function. The size of
each program is finite!
Pessimistic and average
complexity
Most commonly we consider 2 types
of data:
worst data (pessimistic complexity)
typical data (average complexity)
Space complexity
Sometimes we need additional
memory (like in the Eratosthenes
sieve). The amount of such memory is
called space complexity. Again we
distinguish:
pessimistic space complexity
average space complexity
Complexity of a problem
Complexity of a problem is the
complexity of the best algorithm in a
given class solving this problem. Here
we can distinguish:
pessimistic problem complexity
average problem complexity
Computational complexity
So we have 3 categories of
complexity:
time – memory
pessimistic – average
algorithm – problem
Data size
We determine the data size by the
number of bits required to represent
the input data.
for arrays – their length
for numbers – the number of digits
for graphs – the sum of the number of
nodes and edges
Conclusion
Algorithmics is the essence of computer
science
It matters which algorithm we
choose. Even the fastest computers can
be useless if we choose a bad algorithm.
It is always a good idea to consider a
structural approach and to determine
the cost of the solution.