[MUSIC] The last part of our discussion of
floating point numbers is how we deal with
them in the C language.
C offers two levels of precision for
floating point numbers, as we've already
seen in the IEEE floating point
representation.
Both a 32-bit representation for floats, as
they're referred to, and a 64-bit
representation for doubles.
The default rounding mode is round to
even, to avoid the bias of always rounding
in one direction.
And there's another header file that we
can include that has some important
constants, called math.h.
It has constants, for example, for
infinity and not a number that we can use
in our programs.
One thing to keep reiterating, and that
you need to remember, is never to use
equality comparison for floating point
numbers.
There are just too many slight differences
that could occur in rounding or in how an
expression is evaluated associatively or
distributively.
And we can often get unexpected results
for our equality comparisons.
The best thing to do with floating point
numbers is to avoid equality comparisons,
and always do a subtraction of the two
values,
and then test that the difference is
small.
Okay.
Another thing we should talk about is
casting in C.
Unlike casting between signed and
unsigned integers, in this case, we do
change the bit representation.
So, for example, when we want to go from
an int to a float, and cast an integer
value into a floating point value, we
actually have to normalize that integer
value.
Right?
Get its exponent, figure out its mantissa,
and then represent that in the floating
point notation.
So, that means that the integer may in
fact get rounded. However, overflow is not
possible.
Floating point numbers can represent much
larger values than we can get to with our
integer representation, okay.
When we go from int to double, we can
actually get an exact conversion as long
as the int is no more than 53 bits.
Because now in the double notation the
fractional part is 52 bits long, plus that
one extra bit, the implicit one that sits
in front of the mantissa.
So, we get 53 bits of precision, and our
integers, if they're 32 bits, can fit
completely in that.
So there's going to be an exact conversion.
If we have a 64-bit integer, we might
have some rounding again.
And of course, if we go from float to
double, we also get an exact
cast, because a float is 32 bits, a
double is 64, and it has a larger
fraction and a larger exponent field.
So, it can definitely handle any number
that is in the float representation.
In doing conversions of doubles or floats
to integers we have a couple of issues to
think about.
One is that the fractional part of
the floating point number may be
truncated.
Because as we adjust it to take into
account the exponent, we may shift it in
such a way as to lose a few bits.
By convention, we're going to always round
these values towards zero as we do the
conversion.
Another issue is when the double or float
is bigger or smaller than we can actually
represent in our integer notation.
In that case we'll use the convention to
set the value to Tmin, the two's
complement minimum value.
And we'll probably also do that for
things like not a number. Infinities
we might set to Tmax and Tmin, for
example.
Okay, so to summarize our floating point
representations, here I've shown five
different possibilities.
So, the zero in floating point is the
string of all zeros, and we do that for
convenience, because now if we ever test
for zero, all we have to do is the same
test we did for integers.
We just look for an all-zero bit pattern,
and we know it's a zero.
Then we talked about normalized values,
where the exponent is anywhere from one
to 2^k - 2, where k is the
number of bits of the exponent.
And the significand is 1 point m,
where m is the mantissa, what's
represented in that blue portion of the
number.
We also mentioned that we reserved the
exponent of all ones to represent
positive and negative infinity.
Okay.
And we're actually going to put a further
condition on that, that it's going to be
all ones in the exponent and all zeros in
the fractional part.
So, all ones in the exponent, all zeros
in the fractional part, and of course the
sign can be positive or negative.
For not a number, we actually have many
possibilities.
The exponent is still all ones, but now
the fractional part is non-zero.
That gives us many, many values possible
for not a number.
And in fact, these are used to signify
different conditions under which the not
a number arose.
And finally we have denormalized values,
where the exponent is zero, but we treat
the significand a little differently.
And you'll notice that in this case we'll
put a zero in front of it, rather than
our typical one for normalized values.
And this is used to more densely represent
the values near zero, okay?
We're not going to talk about
denormalized values here, but they are
treated in more detail in the recommended
text by Bryant and O'Hallaron, if you
want to learn more about that.
Finally, we always have to remember that
all these representations suffer from the
problem that there's a fixed number of
bits, and that means we can get overflow
or underflow.
In floating point we also have to
consider the fact that even simple
fractions like 0.2 do not have an exact
representation.
In fact, it's a repeating
representation that we have to truncate
at some point and round, okay.
So, we can lose precision: every
operation may give a slightly wrong result
that is rounded from the exact result.
And these can pile up and that's why we
do that round to even, to make sure it
doesn't go in one direction all the time.
Okay.
The other thing we need to remember is
that we might get different results as we
apply associativity and distributivity.
Those laws do not apply
to floating point numbers because of
the inexact result of every operation.
And lastly, yet again I want to remind
you: never test floating point values for
equality.
Okay, that can get you in a lot of
trouble because of these rounding
effects.
Alright, that concludes our discussion of
number representations.