Bit Manipulation

Site Navigation

Main
   Site News

   Our Software

   Legal Information

   Credits

Calculators
   Information

   C Programming

     Introduction to C

     Keyboard Input

     Graphics Intro

     Slider Puzzle 1

     Functions

     Pointers

     Dynamic Memory

     Slider Puzzle 2

     Structures

     Bit Manipulation

       Part I

       Part II

       Part III

     Advanced Pointers

     File I/O

     Graduate Review

   Assembly

   Downloads

Miscellaneous
   Links

   Cool Graphs

   Feedback Form

C Programming Lessons

TIGCC Programming Lessons

The following is part of a series of lessons designed to help teach people to program in C for the TI-89, 92+, and V200 calculators using the TIGCC development environment.

If you wish, you can download the program source code, project files, and binaries here.

Lesson 8: Bit Manipulation

Step 1 - An Introduction to Bits in C

As you may or may not know, the smallest data type in C is the character, or char for short. Characters can represent values from 0 to 255, if we use unsigned characters. To represent these values, they use a series of bits. 8 bits make up a byte, and a character is one byte. We know it takes 8 bits to make a byte because we know that computers work in a binary number system, and can only count in powers of 2, and if we take 8 numeric positions, we get 2 to the 8th power (base 2 raised to the 8 number positions) and get 256, but since we start counting at 0, that gives us 0-255. This is all fine and good you may say, but how does that help me in C programming? Well, it all comes down to efficiency concerns, as usual. We want to use the least amount of space possible to do the tasks we need to do, but the smallest data type we have available uses 8 bits to store it's values, even if we using values that are much smaller than that. Imagine we needed to store a yes or no value. We could represent that with the values 1 for yes, and 0 for no. That requires only a single bit, but if we had to use a character, the smallest data type, we are wasting 7 bits that we can't use. Imagine we had 8 such values we needed to store. Now we are wasting 56 bits. Is this an efficient design? Of course not. So, what is the solution? Bit manipulation, of course.

C has built-in methods for manipulating data a single bit at a time. So, if we have 8 true/false or yes/no value pairs, which we often do need. For example, in our last lesson, when we talked about unions, we talked about the Symbol Entry structure, which stored a flag register to keep track of various items related to a symbol (file). They had 16 different flags, all with yes or no values, and they used a single short integer (2 bytes) to store all 16 flags. This means they only had to use 2 bytes to store 16 different properties about their files. Let's take a look at how they did that again real quick:
struct { 
	unsigned short int busy:1,local:1,flag1_5:1,flag1_4:1, 
			   collapsed:1,twin:1,archived:1,in_view:1; 
	unsigned short int folder:1,overwritten:1,checked:1,hidden:1, 
			   locked:1,statvar:1,graph_ref_1:1,graph_ref_0:1; 
} bits;
You can see they have unsigned short integers declared, but they have this odd :1 appended to all the variable names. So, what does this semicolon one mean? This is a bit-field. It's C's way of saying, we only want to use a single bit to represent this value. However, it does not have to be a single bit. We can use any number from 1 to 16 (remember that a short integer is 2 bytes, or 16 bits). However, there is little point in using values of 8 or 16, since we can use characters for 8 bits, and short integers for 16 bits. In this way, we can use a single bit to represent so called 'flag' values, variables that only have 2 possible values, TRUE or FALSE, YES or NO, ON or OFF, and other such similar things.

You might be thinking that there are 2 unsigned shorts here, so why is it 16 bits and not 32. The reason is because the compiler will always pack bit-fields into the smallest possible space. You could actually have put each of those declarations on a single line and it would still take only 16 bits. Note that you can only declare bit-fields inside structures and unions.

Remember that we don't have to use only a single bit, we can use multiple bits for higher values. So, if we wanted to represent small values, like the values for cards, which have face values no higher than 13 (13 faces in a poker deck per suit), and 2 suits, we could represent it all with a single character. For example:
struct Card {
	unsigned char face:4, suit:2;
};
Here, we use 4 bits to represent the card face. This means we can store any values from 0 to (2 raised to the 4th power) - 1 (2 to the 4th being 16, that puts our range at 0-15). And our suit, which takes two bits, because our suit has 4 possible values, and 2 bits can represent values from 0 - ((2 ^ 2) - 1), or 3, so 0-3. To make it less confusing, we could define an enumeration to handle our constants with more meaningful names, like this:
enum CardFaces {ACE,TWO,THREE,FOUR,FIVE,SIX,SEVEN,EIGHT,NINE,TEN,JACK,QUEEN,KING};
enum CardSuits {SPADES,DIAMONDS,CLUBS,HEARTS};
So, with these definitions, we can use meaningful names with our card structures. To check a card in the deck, we can use it's name rather than its value.

Now, if you remember our last definition, which looked more like this:
struct Card { 
	int face, suit;
};
You can see that even using integers (which would take up 2 bytes per value), we are still using 4 bytes to store every card in our deck. And considering a deck usually consists of at least 52 cards, we are using 52 * 4 bytes or 208 bytes of memory. Contrast this with our new definition, and we only use 52 bytes for an entire deck. Imagine a game of double-deck solitaire, if you've played before (it's a variation on solitaire where two decks are used). This would mean our new definition would only use 104 bytes, but our old definition would use 416 bytes. On a platform with such limited memory, this is a significant savings.

It is important to note however that though you save memory here in the storage space used, it is more difficult for the processor to work with single bits than with bytes. This means more code is generated to work with bit-fields than with standard variables. You will want to try both to see if you actually end up saving memory at all. A better use of bit-fields than memory saving concerns are flagsets like the one in the AMS symbol pointer struct.

We have seen how we can divide our variables up to use a certain number of bits, but how many bits does it actually use, if we don't use all of them? Remember in our card example, we used 4 bits for the face, and 2 bits for the suit. This adds up to only 6 bits, but an unsigned character uses 8 bits. Does this mean we have to use 8 bits, or do we only need 6? Well, the answer is we have to use 8. This is because there is no way to reserve less than 8 bits of memory at a time. This is a limitation of the processor. So it's best if we use bit fields in sets of 8 or 16, but remember that even though we are wasting 2 bits, that is much better than wasting 4 bytes (32 bits) on the 2 integer card definition we used last time.

Step 2 - The Bit Card Deck

Start TIGCC and create a new project. Create a new C Source File named bitcard. Modify the file so that it looks like this:

bitcard.c

#include <tigcclib.h>

#define DECK_SIZE   52

typedef struct {
    unsigned char face:4, suit:2;
} Card;

void initDeck(Card *deck) {
    short int loop;

    // initialize the deck of cards
    for (loop = 0; loop < DECK_SIZE; loop++) {
        deck[loop].face = loop % 13;
        deck[loop].suit = loop / 13;
    }
}

void shuffle(Card *deck) {
    short int loop, randNum;
    Card temp;

    // shuffle the deck
    for (loop = 0; loop < DECK_SIZE; loop++) {
        randNum = rand() % DECK_SIZE;

        // swap the card at the current position with
        // the one at the randomly selected position
        temp = deck[loop];
        deck[loop] = deck[randNum];
        deck[randNum] = temp;
    }

}

void printDeck(Card *deck, const char *faces[], const char *suits[]) {
    int loop;

    // clear the screen
    clrscr();

    // print the deck one card at a time
    for (loop = 0; loop < 52; loop++) {
        // print the face and suit of the card
        printf("%s of %s\n",faces[deck[loop].face],suits[deck[loop].suit]);

        // pause for each 8 cards displayed
        if ((loop % 8) == 0 && loop != 0) {
            printf("Press any key...\n");
            ngetchx();
        }
    }

    // pause to show the last few cards
    ngetchx();
}

void _main(void) {
    const char *faces[] = {"Ace","Deuce","Three","Four","Five","Six","Seven",
                "Eight","Nine","Ten","Jack","Queen","King"};
    const char *suits[] = {"Spades","Diamonds","Clubs","Hearts"};
    Card *deck = NULL;

    // allocate memory for the deck of cards
    if ((deck = calloc(DECK_SIZE,sizeof(Card))) == NULL) {
        DlgMessage("Fatal Error","Out of Memory",BT_OK,BT_NONE);
        return;
    }

    // seed the random number generator
    randomize();

    // initialize the deck of cards
    initDeck(deck);

    // shuffle the deck
    shuffle(deck);

    // print the deck
    printDeck(deck,faces,suits);

    // free the memory used by the deck of cards
    free(deck);
}

Step 2a - Compile and Run the Program

Save the project and build it. Send it to TiEmu. It will look something like this:

TI-89 AMS 2.05 bitcard.89z TI-92+ AMS 2.05 bitcard.9xz

Step 2b - Program Analysis

This program is nearly identical to the card program from lesson 7, but there are some important changes.
typedef struct {
	unsigned char face:4, suit:2;
} Card;
We use the new definition of the Card structure so we can save space.
const char *faces[] = {"Ace","Deuce","Three","Four","Five","Six","Seven",
			"Eight","Nine","Ten","Jack","Queen","King"};
const char *suits[] = {"Spades","Diamonds","Clubs","Hearts"};
The first thing of note inside the _main() function is the use of a constant character pointer array. We haven't talked about the const modifier before, so I want to do so now. The const keyword just means constant. The compiler will warn you if you try to alter the value of a constant variable, however, there is no checking to actually make sure you cannot alter a 'constant' variable. It's just a built-in feature to help you make sure you don't accidentally modify constant values, and string literals like these should definitely not change.

The rest of the program is exactly like the program from lesson 7, so I won't bore you by going over the whole program again. Just remember the important difference was our use of the bit structure to declare the card. Although the program takes up roughly the same amount of memory as the card program from lesson 7, since most of the size of the program is embedded in the string literal constants of the card name arrays and the code itself, the program will require only a fourth as much memory to allocate space for the card array. Assuming this is the only thing we need to use from the dynamic memory, we could run this card game even if the calculator had only a couple hundred bytes of free memory available. This is of huge importance on a TI-89, where memory is so very limited.

Step 3 - Bit Masks and the Bitwise Operations

Another very important concept in C is the idea of bit masks in bitwise operations. Let's imagine we wanted to represent complex maps in games. They can use be represented by a map of binary numbers, representing that a pixel should either be drawn, or not be drawn. We can also use the numbers to mean if a sprite should be drawn, like a block on a wall. So, let's image we had a maze like this:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

A

B

C

D

E

F

G

H

I

J

K

L

M

N

O

P

Q

R

S

To draw this map, or even to check a position on the map, we can use simple techniques of bit masks and bitwise operations. To give further background to the topic, we will need to talk about exactly what a bit mask is. A bit mask is simply a way of removing parts of data we don't want to look at. It is called a bit mask because it is composed of bits, and we are masking out (or putting a mask over) the data we don't want to see. And in this case, the data we don't want to see is made up of bits that we don't need.

So, imagine we were trying to see if our character was trying to move to a position on the map that is occupied by a barrier (assume the dark areas are places that our character cannot walk on, and the light areas are places our character can travel safely). Well, let's imagine our map is a grid of (x,y) locations. So, let's imagine we were trying to see if the character could safely walk onto position G7. If we see that each of our rows on the map are made up of a set of columns, then we can mask out every position except the 7th one in the G row. So, our mask would become 0x02000. Since we can't use binary in C, we have to use hex masks, but they work the same way, we just need to translate our binary to hex. So, in this case, we have our 20 columns, and our mask should be 0b0000 0010 0000 0000 0000. This would mean, the only position in the row we care about the the 7th position. Convert that to hex and we get 0x02000. Now, the question here becomes, why do we want to use all zeros except for the place we want to check. Well, this has to do with another concept in bitwise operations, known as bitwise AND. Bitwise AND is one of the logical operations we can do to test certain properties about bits. In Bitwise AND, we have two values. Bitwise AND takes those two values and returns a new value. If two bits in the same position in both values are true, then bitwise AND returns true in the result value. But if either bit is 0, then we return false. But with so many bits to test, how can we only test the bit we are looking for? Well, that's where our bit mask comes in. Remember that our bit mask sets all our bits to zero except the one we want to test. Well, in this case, we can do a bitwise AND on our variable and the bit mask, and we will be able to see if just our bit is turned on.

Remember earlier when we were talking about true and false values in C. We said that usually, 1 is used for true, and 0 for false. Well, that's not the whole idea. In C, any value other than 0 is considered true, and 0 is the only false value. So, when you do a bitwise AND with a bitmask, we will end up with a result that is either zero or non-zero. This is how we know if the bit was turned on or not.

So, imagine we have the variable position. The position variable represents a horizontal row on our map. Now, our bit mask is still set to mask out all but the seventh position on our map. This is because we are still trying to check if the position G7 is a position our character can move to, or not. So, having our position variable set to be the G row of our map, then we can perform the bitwise AND operation on position and our bit mask, which we will call the variable mask. Since this is starting to get complicated in the abstract, let's look at an example of code:
unsigned long int mask = 0x02000;
unsigned long int position = 0x82001;

if (position & mask) {
	DlgMessage("Error","Unable to move to selected location",BT_OK,BT_NONE);
}
Our bit mask, mask, is just as it was above. Our position, which needs to be defined in hex, is the G row of our maze above. It looks better if we convert it to binary, but remember that we can only use hex in C, but let's go ahead an take a look at it in binary so you can see how it looks: 0b10000010000000000001. As you can see, the positions on our map show 3 squares filled in on the G row, and our map is 20 squares wide. 8 in binary is 1000, 2 is 0010, 0 is 0000, and 1 is 0001. String them all together and we get our map row.

In C, the bitwise AND operation is represented by the single ampersand &. Remember that this is very different from the double ampersand && meaning logical truth (if this condition is true AND this condition is true). Remember that the single ampersand & returns the bitwise AND of the two values, and truth is the result of any non-zero value. Remember that our bit mask masks out all the values we don't care about. So, even if there was another 1 in the position that didn't appear in the mask (and it normally would have such values), AND would still return true so long as the bit mask 1's all correspond to position 1's.

To make this a little more clear, let's take a look at the bitwise AND function, so that we may better understand it. It is a very important concept, if you plan to write games, or do matrix operations.

If we have two values, and I will put them in binary so it's easier to see, then the AND looks something like this:
    1000 0010 0000 0000 0001
AND 0000 0010 0000 0000 0000
----------------------------
    0000 0010 0000 0000 0000

Continue the Lesson in Part II

C Programming Lessons

TIGCC Programming Lessons

Lesson 8: Bit Manipulation

Copyright © 1998-2007 Techno-Plaza All Rights Reserved Unless Otherwise Noted

Copyright © 1998-2007 Techno-Plaza
All Rights Reserved Unless Otherwise Noted