Computer Arithmetic

steve_bank · Dec 26, 2018

All math reduces to binary operations. Decimal numbers are for input and output.

Binary addition is easy to implement in hardware logic.
0+0 = 0
0+1 = 1
1+0 = 1
1+1 = 0 carry 1

The truth table for the sum bit is an exclusive OR: one or the other but not both. It takes a few more logic gates to generate the carry bit and to add in the carry bit from the previous bit position in the adder.

inputs  sum
11      0 carry 1
01      1
10      1
00      0
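
The truth table above can be sketched in a few lines of Python. This is only an illustration of the gate logic, not a hardware description; the names full_adder and ripple_add are made up for this example.

```python
def full_adder(a, b, carry_in=0):
    """Add three single bits; return (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in                    # exclusive OR gives the sum bit
    carry_out = (a & b) | (carry_in & (a ^ b))    # a few AND/OR gates give the carry
    return sum_bit, carry_out


def ripple_add(x, y, width=8):
    """Add two unsigned integers one bit at a time, passing the carry along."""
    result, carry = 0, 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry                          # carry is the final carry out


# Reproduce the truth table: inputs, then (sum, carry).
for a in (0, 1):
    for b in (0, 1):
        print(a, b, full_adder(a, b))
```

Chaining one full adder per bit, as ripple_add does, is the simplest way to build a multi-bit adder. Real hardware often uses faster carry schemes.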

While all addition, subtraction, multiplication, and division can be done in software alone, it is slow. That matters for apps like digital image processing, whose algorithms require a lot of repetitive addition and multiplication.

steve_bank · Dec 26, 2018

Signed numbers for addition and subtraction get problematic in hardware. A way around that is to transform subtraction into addition using complementary numbers.

In base 10 with a max count of 9 the complement of a digit is 9 minus the digit, so the complement of 1 is 8.

In computer arithmetic you can look at an integer like a counter that counts up and down. At a count of 0 an increment gives 1. A decrement from 0 wraps around to 9, not -1.
Counting down, the counter wraps from 0 back to 9:
9 8 7 6 5 4 3 2 1 0 9 ...

digit  complement
9      0
8      1
7      2
6      3
5      4
4      5
3      6
2      7
1      8
0      9

Subtracting 6-4.
Take the complement of 4 and add to 6. Add 1 to the result and ignore the carry.

6-4 = 2
6 + 5 + 1 = 12.
Ignore the carry and the answer is 2.

1-1
1 + 8 + 1 = 10
Ignore the carry and the answer is zero.

Subtract 4-7 = -3. This ends up a little different because the counting occurs in reverse.

Add the complement of the larger number, but do not add the 1. There is no carry, so take the complement of the sum and make it negative.

4 - 7 -> 4 + 2 = 6. The complement of 6 is 3, so the answer is -3.
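
The single digit base 10 examples above can be sketched like this. The function name subtract_digit is made up for the illustration, and using the carry to decide between the two cases is my assumption about how the steps fit together.

```python
def subtract_digit(a, b):
    """Compute a - b for single decimal digits using the complement of b."""
    comp_b = 9 - b                  # complement with a max count of 9
    total = a + comp_b + 1          # add the complement, then add 1
    if total >= 10:                 # a carry came out: the answer is not negative
        return total - 10           # ignore the carry
    # No carry: leave out the +1, complement the sum, and the answer is negative.
    return -(9 - (a + comp_b))


print(subtract_digit(6, 4))   # 6 + 5 + 1 = 12, ignore the carry
print(subtract_digit(1, 1))   # 1 + 8 + 1 = 10, ignore the carry
print(subtract_digit(4, 7))   # 4 + 2 = 6, complement of 6, negative
```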

steve_bank · Dec 26, 2018

Same principle in binary. Direct binary subtraction can be done, but it gets complicated. Using complements is much easier.

binary  decimal  complement
111     7        000
110     6        001
101     5        010
100     4        011
011     3        100
010     2        101
001     1        110
000     0        111

In binary the complement of a number is taken by inverting the bits. Decrementing from 0 the count goes to 111 or 7. This reflects a real binary counter in hardware.

In a 3 bit system with a max count of 7 the complement of 010 (2) is 101 (5).

5-3 = 2, or 101 - 011.
Take the complement of 011, which is 100, and add it to 101 along with a 1.
101 - 011 -> 101 + 100 + 1 = 1010. Ignore the carry bit and the answer is 010 = 2.

Inverting the bits and adding 1 is called 2s complement.

3-5 = -2. The complement of 101 (5) is 010 (2).
011 - 101 -> 011 + 010 = 101 with no carry. The complement of 101 is 010, decimal 2, so the answer is -2.

This is 1s complement.

The carry bit indicates positive or negative. All done easily in logic.
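
Here is a rough Python sketch of the 3 bit procedure for 101 - 011 and 011 - 101. WIDTH, MASK, invert, and subtract are names I made up, and real hardware does this with gates rather than an if statement.

```python
WIDTH = 3
MASK = (1 << WIDTH) - 1             # 0b111, the max count


def invert(x):
    """Complement: flip every bit within WIDTH bits."""
    return ~x & MASK


def subtract(a, b):
    """Compute a - b for WIDTH-bit unsigned values using complements."""
    total = a + invert(b) + 1       # invert the bits and add 1
    if total > MASK:                # carry bit set: the answer is not negative
        return total & MASK         # ignore the carry bit
    # No carry: add the plain complement, complement the sum, answer is negative.
    return -invert(a + invert(b))


print(subtract(0b101, 0b011))       # 101 + 100 + 1 = 1010, drop the carry
print(subtract(0b011, 0b101))       # 011 + 010 = 101, complement, negative
```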

steve_bank · Dec 26, 2018

Dealing with real numbers.

There are two broad categories when dealing with calculations on real numbers: fixed point and floating point.

In general apps a number is stored as a fractional part, < 1, and an integer exponent.

Consider an 8 bit system. As with an analog to digital converter, 1 is the max number (full scale) and the bits are binary weighted by 1/2^n. Each bit's weight is 1/2 of the previous bit's.

For 8 bits
b7 0.5
b6 0.25
b5 0.125
b4 0.0625
b3 0.03125
b2 0.015625
b1 0.0078125
b0 0.00390625

b0 represents the resolution.
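
As a quick check on the weights, here is a small Python sketch that rebuilds the table and converts an 8 bit pattern to its fractional value. The names bit_weights and to_fraction are just for this example.

```python
def bit_weights(bits=8):
    """Map bit names to weights, most significant bit first (b7 = 1/2)."""
    return {f"b{bits - 1 - i}": 2.0 ** -(i + 1) for i in range(bits)}


def to_fraction(pattern, bits=8):
    """Value of an unsigned bit pattern read as a binary weighted fraction."""
    return pattern / (1 << bits)


for name, weight in bit_weights().items():
    print(name, weight)

print(to_fraction(0b10000000))      # only b7 set
print(to_fraction(0b00000001))      # only b0 set: the resolution
```

Dividing by 2^8 is the same as summing the weights of the bits that are set.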

In floating point there is a high dynamic range between the smallest and biggest number. Fixed point processing is faster and cheaper in some apps. The decimal point is fixed and the app has to stay within the limited dynamic range, but calculations do not require shifting to line up the decimal points.

Floating point numbers are stored as an integer exponent that tracks the decimal point, a fractional number representing the number, and sign bits, including a sign for the exponent.
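
Python's standard math.frexp shows the fraction and exponent idea. Note this is only an illustration: the actual bit layout of a hardware float (for example IEEE 754) differs in detail, and split_float is a name made up here.

```python
import math


def split_float(x):
    """Return (sign, fraction, exponent) so that x == sign * fraction * 2**exponent."""
    fraction, exponent = math.frexp(abs(x))   # fraction is 0 or in [0.5, 1)
    sign = -1 if x < 0 else 1
    return sign, fraction, exponent


for value in (6.0, -0.1875, 1000.0):
    sign, fraction, exponent = split_float(value)
    # math.ldexp puts the pieces back together: fraction * 2**exponent
    print(value, sign, fraction, exponent, sign * math.ldexp(fraction, exponent))
```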

Multiplication, addition, and subtraction can be performed on the binary weighted fractional part.

00100000 + 00010000 = 0.125 + 0.0625 = 00110000 = 0.1875.

8 bits is common in many applications. For general math apps 32 bits usually has enough resolution that for practical calculations the results appear to be continuous real numbers.

steve_bank · Dec 26, 2018

Binary multiplication is straightforward. It works the same as base 10 multiplication, but for binary it is a simple binary shift and add.

   100    4
 x  11    3
 -----
   100
  100
 -----
  1100   12
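
The shift and add steps in the worked example can be sketched in Python as follows. This is a minimal illustration for unsigned integers; shift_add_multiply is a made up name.

```python
def shift_add_multiply(a, b):
    """Multiply unsigned integers using only shifts and adds."""
    product = 0
    shift = 0
    while b:
        if b & 1:                   # this multiplier bit is 1
            product += a << shift   # add the multiplicand, shifted into place
        b >>= 1                     # move on to the next multiplier bit
        shift += 1
    return product


print(bin(shift_add_multiply(0b100, 0b11)))   # 100 x 11
```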

For weighted multiplication:

(0.5 * 0.09375) = (10000000 * 00011000) = 0000110000000000

An 8x8 binary multiply results in 16 bits. To restore the decimal point shift 8 bits to the right.
00001100 = 0.03125 + 0.015625 = 0.046875
0.5 * 0.09375 = 0.046875
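
The 10000000 * 00011000 example can be checked with a short Python sketch. The names FRACTION_BITS, fixed_multiply, and to_fraction are assumptions for the illustration, and the right shift simply drops the low 8 bits with no rounding.

```python
FRACTION_BITS = 8


def to_fraction(pattern):
    """Value of an 8 bit pattern read as a binary weighted fraction."""
    return pattern / (1 << FRACTION_BITS)


def fixed_multiply(a, b):
    """Multiply two 8 bit fractions; return the 8 bit fractional result."""
    wide = a * b                    # 8x8 multiply gives up to 16 bits
    return wide >> FRACTION_BITS    # shift right 8 bits to restore the point


a, b = 0b10000000, 0b00011000
result = fixed_multiply(a, b)
print(format(a * b, "016b"))        # the full 16 bit product
print(format(result, "08b"), to_fraction(result))
print(to_fraction(a) * to_fraction(b))
```

The last two lines print the same decimal value, which confirms the shifted result matches an ordinary multiply of the two fractions.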

steve_bank

Diabetic retinopathy and poor eyesight. Typos ...
