C++: Integers

An integer variable is a variable that can only hold whole numbers (e.g. -2, -1, 0, 1, 2). C++ has four basic integer types available for use: char, short, int, and long. The main difference between these types is their size: the larger integers can hold bigger numbers. You can use the sizeof operator to determine how large each type is on your machine.

In the following tutorials, we will typically assume:

a char is 1 byte
a short is 2 bytes
an int is either 2 or 4 bytes
a long is 4 bytes

Declaring some integers:

char chChar;
short int nShort; // "short int" is technically correct
short nShort2; // "short" is preferred shorthand
int nInteger;
long int nLong; // "long int" is technically correct
long nLong2; // "long" is preferred shorthand

While short int and long int are technically correct, we prefer the shorthand versions short and long. Adding the int suffix makes the type harder to distinguish from a plain int, which can lead to mistakes (such as overflow) if the short or long modifier is inadvertently missed.

Because the size of char, short, int, and long can vary depending on the compiler and/or computer architecture, it can be instructive to refer to integers by their size rather than name. We often refer to integers by the number of bits or bytes a variable of that type is allocated.

As you learned in the last section, a variable with n bits can store 2^n different values. We call the set of values that a data type can hold its range. Integers can have two different ranges, depending on whether they are signed or unsigned.

Signed and unsigned variables

A signed integer is a variable that can hold both negative and positive numbers. To declare a variable as signed, you can use the signed keyword:

signed char chChar;
signed short nShort;
signed int nInt;
signed long nLong;

A 1-byte signed variable has a range of -128 to 127. Any value between -128 and 127 (inclusive) can be put in a 1-byte signed variable safely.

Sometimes, we know in advance that we are not going to need negative numbers. This is common when using a variable to store the quantity or size of something (such as your height; a negative height makes no sense). An unsigned integer is one that can only hold non-negative values (zero and positive numbers). To declare a variable as unsigned, use the unsigned keyword:

unsigned char chChar;
unsigned short nShort;
unsigned int nInt;
unsigned long nLong;

A 1-byte unsigned variable has a range of 0 to 255.

Note that declaring a variable as unsigned means that it cannot store negative numbers, but it can store positive numbers that are about twice as large!

So what happens if we do not declare a variable as signed or unsigned? All integer variables except char are signed by default. A char can be either signed or unsigned by default, depending on the compiler (but is usually signed).

short nShort; // signed by default
int nInt; // signed by default
long nLong; // signed by default
char chChar; // can be signed or unsigned by default, but probably signed.

New programmers sometimes get signed and unsigned mixed up. The following is a simple way to remember the difference: in order to differentiate positive and negative numbers, we typically use a negative sign. If a sign is not provided, we assume a number is positive. Consequently, an integer with a sign (a signed integer) can tell the difference between positive and negative. An integer without a sign (an unsigned integer) assumes all values are positive.

Now that you understand the difference between signed and unsigned, let’s take a look at the ranges for different sized signed and unsigned variables:

Size/Type	Range
1 byte signed	-128 to 127
1 byte unsigned	0 to 255
2 byte signed	-32,768 to 32,767
2 byte unsigned	0 to 65,535
4 byte signed	-2,147,483,648 to 2,147,483,647
4 byte unsigned	0 to 4,294,967,295
8 byte signed	-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
8 byte unsigned	0 to 18,446,744,073,709,551,615

For the math inclined, an n-bit signed variable has a range of -(2^(n-1)) to (2^(n-1))-1. An n-bit unsigned variable has a range of 0 to (2^n)-1. For the non-math inclined… use the table. :)

What happens if we try to put a number outside of the data type’s range into our variable? We get…

Overflow

In binary, we count from 0 to 15 like this: 0, 1, 10, 11, 100, 101, 110, 111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111. As you can see, the larger numbers require more bits to represent. Because our variables have a fixed number of bits, this puts a limit on the largest number they can hold.

Consider a hypothetical variable that can only hold 4 bits. Any of the binary numbers enumerated above would fit comfortably inside this variable because none of them needs more than 4 bits. But what happens if we try to assign a value that takes 5 bits to our variable? We get overflow: our variable will only store the lowest 4 of the 5 bits, and the excess bit is lost.

Overflow occurs when bits are lost because a variable does not have enough memory to store them.

We can see this in action with the following program:

#include <iostream>

int main()
{
    using namespace std;

    unsigned short x = 65535; // largest 2-byte unsigned value possible
    cout << "x was: " << x << endl;

    x = x + 1; // We desire 65536, but we get overflow!
    cout << "x is now: " << x << endl;
}

What do you think the result of this program will be?

x was: 65535
x is now: 0

What happened? Informally, we overflowed the variable by trying to put a number that was too big into it, and the result is that our value “wrapped around” back to the beginning of the range. This wrap-around is only guaranteed for unsigned integers. For signed integers, the C++ language does not guarantee wrap-around on overflow, so do not rely on it happening!

The following paragraph explains exactly why we ended up getting a value of 0 after overflow. It is optional reading. If all this binary stuff is confusing, you can skip it.

The number 65,535 is represented by the bit pattern 1111 1111 1111 1111 in binary. 65,535 is the largest number an unsigned 2 byte (16-bit) integer can hold, as it uses all 16 bits. When we add 1 to the value, the new value should be 65,536. However, the bit pattern of 65,536 is represented in binary as 1 0000 0000 0000 0000, which is 17 bits! Consequently, the highest bit (which is the 1) is lost, and the low 16 bits are all that is left. The bit pattern 0000 0000 0000 0000 corresponds to the number 0, which is our result!

Similarly, we can overflow the bottom end of our range as well.

#include <iostream>

int main()
{
    using namespace std;

    unsigned short x = 0; // smallest 2-byte unsigned value possible
    cout << "x was: " << x << endl;

    x = x - 1; // We expect -1, we get overflow!
    cout << "x is now: " << x << endl;
}

x was: 0
x is now: 65535

On most machines, a signed integer behaves similarly, wrapping from the top of its range to the bottom:

#include <iostream>

int main()
{
    using namespace std;

    signed short x = 32767; // largest 2-byte signed value possible
    cout << "x was: " << x << endl;

    x = x + 1; // We desire 32768, but we get overflow!
    cout << "x is now: " << x << endl;
}

x was: 32767
x is now: -32768

Overflow results in information being lost, which is almost never desirable. If there is ANY doubt that a variable might need to store a value that falls outside its range, use a larger variable!

Integer division

Integer division can also cause issues because dividing two integers can produce a fractional result, and integers cannot store fractions. Consider the statement int x = 5 / 3;. Under normal mathematical rules, x would be assigned the value of 5/3, which is approximately 1.6667. However, in integer division, the fraction is dropped, so x is assigned the value 1. Integer division always drops the fraction; it does not round.
