Pointers and Arrays, Part II

C Programming Lessons

TIGCC Programming Lessons

Lesson 5 - Pointers and Arrays

Step 5 - More Uses for Pointers

Well, back in step 4, we learned about the basic uses of pointers, and got an introduction to the dereferencing operator. In this step, we will learn some more uses for pointers.

We mentioned that we could use pointers for values we can't return from a function. But when would this happen? Well, let's imagine we want to manipulate a string. The only way we have to store strings is in an array of characters, and a C function can return only a single value, not a whole array. So we must use pointers to manipulate such a string.

Let's examine a very useful function called sprintf(). sprintf() is similar to printf(), only instead of printing its data to the screen, it "prints" it to a string. The name stands for "string print formatted", but who keeps track of these things anyway. You can think of printf() as doing the same formatting work as sprintf() and then handing the result to a print string routine. sprintf() needs to be able to manipulate a string's data, so it can replace all the format specifiers and format the string accordingly. But to do that, it needs some way of accessing the string. Enter the pointer. Just so we're clear, the syntax of the sprintf() function is: sprintf(string pointer, "format string", format arguments);
char string[30];
sprintf(string, "This is a %d character string", 30);
Remember that an array name passed to a function is treated as a pointer to the array's first element. Knowing that, our pointer for the sprintf() function is the string[] array. We didn't explicitly say it, but the arguments we use for the sprintf() and printf() functions do not have to be variables, just anything that has a value. So, the literal number 30 is perfectly valid. After we do this, the string array will contain the string "This is a 30 character string". That is 29 characters plus the terminator, so it just fits in our 30-byte array. Pretty simple.
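To check that for yourself, here is a minimal sketch in standard C. It assumes a compiler with <stdio.h> (under TIGCC you would include tigcclib.h instead), and the function name format_example is made up for this illustration.

```c
#include <stdio.h>

/* Formats the lesson's example sentence into buf and returns what
   sprintf() returns: the number of characters written, not counting
   the terminator. buf must have room for at least 30 bytes. */
int format_example(char *buf) {
    return sprintf(buf, "This is a %d character string", 30);
}
```

Calling format_example() with a 30-byte array fills the array completely: 29 characters of text, then the terminator in the last byte.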

Well, you might be asking yourself, how did the sprintf() function accomplish this task? Well, that's easy. The sprintf() function was given 3 pointers: the string array pointer, the string format literal pointer, and the argument list.

Since all of these pointers are arrays of a sort, let's list them like that:
string: 000000000000000000000000000000
literal: 'T','h','i','s',' ',... (you get the idea)
arguments: 30
Remember that when we begin, all the pointers are pointing to their first item. For the string pointer, that's the very first character. For the string literal, that's the 'T'.

So, when we come to the sprintf() function, it has to do several things, but the task is very simple, provided the data is all put in correctly. sprintf() will not correct mistakes in the arguments you give it. If you forget to put a variable in the argument list, it will read whatever happens to be in memory instead, and you will get very strange results. Anyway, let's take a look at the process:

1. Get the current character from the literal pointer. If we found the terminating character of the literal, skip to step 7. If it is a '%', skip to step 4. Otherwise, append it to the string pointer, like this: *string = *literal;

2. Increment the string pointer and the literal pointer. string++; literal++;

3. Repeat at step 1.

4. Since the character was a '%', increment the literal pointer and check the character there. literal++; If the new character is a '%', then append a '%' to the string pointer as in step 1 and go to step 2. Otherwise, continue with step 5.

5. Check the format specifier. 'd' = integer. 's' = string. 'u' = unsigned integer. 'f' = float. I'm not going to go through the whole process, so we'll just do the integer format. So, go to step 6.

6. For integers, grab a single integer from the argument pointer. int temp; temp = *(int *)argument; argument += sizeof(int); Convert the integer to a character string (complex, so just pretend there's a magic function that does this for us automatically), then append it to the string. for (loop = 0; loop < tempIntegerStringPointerSize; loop++) { *string = *tempIntegerString; string++; tempIntegerString++; } Increment the literal pointer past the specifier (literal++;), then repeat from step 1.

7. Append the terminator to the string (*string = 0;) and we're finished.
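The steps above can be sketched in standard C roughly like this. It is only an illustration, not how the real sprintf() is written: it handles just %d and %%, it takes its integers from a plain int array (an assumption made to keep things simple, since the real function uses a variable argument list), and the names mini_format and int_to_digits are made up.

```c
#include <stddef.h>

/* Hypothetical "magic function": writes the decimal digits of value at
   out and returns how many characters it wrote (no terminator added). */
static size_t int_to_digits(int value, char *out) {
    char tmp[12];                       /* enough for a 32-bit int */
    size_t n = 0, written = 0;
    unsigned int u = (value < 0) ? 0u - (unsigned int)value
                                 : (unsigned int)value;

    if (value < 0) {
        out[written++] = '-';
    }
    do {                                /* digits come out in reverse */
        tmp[n++] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    while (n > 0) {                     /* copy them back in order */
        out[written++] = tmp[--n];
    }
    return written;
}

/* Follows the numbered steps; returns the length of the result. */
int mini_format(char *string, const char *literal, const int *argument) {
    char *start = string;

    while (*literal != '\0') {          /* step 1: stop at the terminator */
        if (*literal != '%') {
            *string++ = *literal++;     /* steps 1-2: copy, advance both */
            continue;                   /* step 3: back to step 1 */
        }
        literal++;                      /* step 4: look past the '%' */
        if (*literal == '%') {
            *string++ = '%';
            literal++;
        } else if (*literal == 'd') {   /* steps 5-6: integer specifier */
            string += int_to_digits(*argument++, string);
            literal++;
        } else if (*literal != '\0') {
            literal++;                  /* unsupported specifier: skip it */
        }
    }
    *string = '\0';                     /* step 7: terminate the result */
    return (int)(string - start);
}
```

With an argument array holding the single value 30, mini_format(buffer, "This is a %d character string", args) produces the same text as the sprintf() call earlier in this step.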

I know that seems complicated, and believe me I oversimplified it, but the important part is that the pointer is what allowed us to change the contents of the string array without passing a copy of the array and without returning an entire array, neither of which C lets us do directly. All we had to do was pass a pointer value. The pointer is the most powerful (and therefore most difficult to master) concept in C. Give it time, and you will understand pointers perfectly.
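Here is a tiny example of that idea on its own, written in standard C as a sketch (the function name make_upper is made up, and it assumes <ctype.h> is available). The function receives only the address of the first character, yet the caller's array is what gets changed.

```c
#include <ctype.h>

/* Converts the caller's string to upper case in place. Only a pointer
   to the first character is passed; no copy of the array is made. */
void make_upper(char *text) {
    while (*text != '\0') {
        *text = (char)toupper((unsigned char)*text);
        text++;                 /* move on to the next character */
    }
}
```

If word is declared as char word[] = "pointer"; then after make_upper(word); the array holds "POINTER".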

Well, let's take a look at a program that will do something like a sprintf function.

Step 5a - Implementing Pointer Theory: A more concrete look at the sprintf() function

Start TIGCC and create a new project. Create a new C Source File and call it sprintf. Edit it to look like this.

sprintf.c

#include <tigcclib.h>

// a VERY simplified version of a sprintf() function
int sprintf2(char *str, const char *literal, const char *args) {
    int length = 0;

    // loop until we find the terminator
    while (*literal != 0) {
        if (*literal == '%') {
            // we have a format specifier
            // increment the literal string position
            literal++;

            if (*literal == 'd') {
                // we have an integer
                *str = 'i';
                *++str = 'n';
                *++str = 't';

                // it's too hard to convert
                // so we'll just pretend by skipping
                // the integer bytes
                args+=4;

                length+=3;
            } else if (*literal == 's') {
                // we have a string
                while (*args != 0) {
                    // loop till we find the string terminator
                    *str++ = *args++;
                    length++;
                }

                // make sure we aren't past the end
                str--;
                args++;
            } else if (*literal == '%') {
                *str = '%';

                length++;
            } else {
                *str = 'm';
                *++str = 'i';
                *++str = 's';
                *++str = 'c';

                // assume all other types
                // take 1 byte
                args++;

                length+=4;
            }

            str++;
            literal++;
        } else {
            // regular character
            *str = *literal;
            str++;
            literal++;
            length++;
        }
    }

    // terminate the string
    *str = 0;

    return length;
}

// Main Function
void _main(void) {
    int length;
    char string[80];

    // clear the screen
    clrscr();

    // format the string
    length = sprintf2(string,"Test %s. %s!!","String\x00QWERTY\x00");

    // print the string
    printf_xy(0,0,string);

    // format a new string
    length = sprintf2(string,"%d %f %% %m'isc'","0030FM");

    // print the new string
    printf_xy(0,10,string);

    // wait for user to press a key to exit the program
    ngetchx();
}

Step 5b - Compile and Run the Program

Save the program and compile it. Send the program to TiEmu and run it. It will look like this:

[Screenshots: TI-89 AMS 2.05 (sprintf.89z) and TI-92+ AMS 2.05 (sprintf.9xz)]

Step 5c - Program Analysis

Okay, it's time to analyze the program.
int length;
char string[80];
	
// clear the screen
clrscr();
The start of the program is fairly straightforward. We declare two variables: a length variable which will hold the length of our string, and a string array which will hold our string. We declare the string array to be 80 characters long, which leaves room for at most 79 characters plus the terminator. We are telling the compiler to reserve 80 characters for us, and the array name gives us a pointer to the first character. This will become even more clear when we cover dynamic memory allocation, but one step at a time.

I have no idea why I used printf_xy or clrscr here since DrawStr clearly would have worked fine.
// format the string
length = sprintf2(string,"Test %s. %s!!","String\x00QWERTY\x00");
Here is where we format our string using our version of sprintf(). Keep in mind this version of sprintf() is VERY scaled down, but it does do a couple of things the real sprintf() does. The function returns the length of the new string (not counting the terminator), which the real sprintf() also does, although the value is rarely used.

Note the format of the string. We don't have the luxury of a variable number of arguments to our function, because that would mean introducing a ton of new concepts I don't want to worry about here. Since functions with a variable number of arguments are rarely written, we probably won't cover how to do this at all. So, instead of having arguments passed one after another, we used a single string literal. This is less than ideal, but will suit our purposes.

So, our format string says: replace all my %s's with the strings I provide, and leave all the other characters intact. This brings up a very important concept: string termination. Every program needs a way to know where a string ends. There are two common ways of doing this: storing the length of the string alongside it, or ending the string with a special terminator character. C uses a terminator, and the terminator is always the NUL character, the byte with the value 0 (not to be confused with the character '0' or with the NULL pointer). So, at the end of every string, there must be a terminator. For string literals (the "some string" strings), the compiler adds the terminator automatically, but when we pack several strings into one literal as our arguments, we must add the terminators between them ourselves.

This brings up one of the drawbacks of our argument method. A real sprintf() function would receive each argument separately, with a pointer for each string, but we don't have that luxury here. This means it's awkward to specify strings. We terminate them with \x00, a hex escape sequence that puts the byte value 0 in the string rather than the character '0'. (The final \x00 in our literal is not strictly needed, because the compiler adds a terminator at the end anyway.)

This brings us to the biggest problem with our argument format: what can we put right after a string? Every hex digit that follows \x is interpreted as part of the escape sequence, so the next string must begin with something outside the hex digits (0-9, A-F, a-f). Our QWERTY string works because Q is not a hex digit, so it is treated as the first character of the next string. I know that seems complicated, but it's really not.
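If you want to convince yourself how the packed literal is laid out, a short sketch like this one can help. It is standard C using strlen() from <string.h>; the names packed and next_packed_string are made up for the illustration.

```c
#include <string.h>

/* The same argument literal the lesson passes to sprintf2(). */
static const char packed[] = "String\x00QWERTY\x00";

/* Returns a pointer to the string stored right after s in a packed
   buffer: skip the characters of s, then skip its terminator. */
const char *next_packed_string(const char *s) {
    return s + strlen(s) + 1;
}
```

Here sizeof packed is 15: six characters, a terminator, six more characters, our second terminator, and the one the compiler adds. strlen(packed) is only 6, because it stops at the first terminator, and next_packed_string(packed) points at "QWERTY".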

Well, we've covered the format of our new function, and how we used it in this instance, so let's take a look at the sprintf2() function we created.
// a VERY simplified version of a sprintf() function
int sprintf2(char *str, const char *literal, const char *args) {
	int length = 0;
The beginning is fairly simple. It's a function of "three" arguments (since we have other arguments embedded inside the *args pointer). We declare one variable, the length counter, which we will return at the end of the function. The const char * declarations tell the compiler that the function will not change the characters these pointers point to. This lets the compiler help us find errors: if we try to write through a const pointer, the compiler will complain.
// loop until we find the terminator
while (*literal != 0) {
Here is where we begin the main loop. Since the format string is a string literal, it does have a terminating character, so we are waiting to find it. This brings up one of the many cool things about character pointers. Remember that an array of characters is just that, one character after another after another. So, there is no real way to interpret all the characters at once, even though we think of them as a string. This is to our advantage though. Remember the * dereferencing operator? Well, since we have a character pointed to by the character pointer, we can dereference the pointer to get the character we are currently pointing to. When we start, this will be pointing at the first character in the format string. When we are finished, this will be pointing to the null terminator, which is what we want. So, we will loop until we find this character.
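The same "walk until the terminator" loop works for any string. As a small sketch (the function name count_chars is made up), here it is used to count the characters in a string, much like the standard strlen() does:

```c
/* Counts the characters in a string by walking a pointer forward
   until it reaches the terminator. */
int count_chars(const char *literal) {
    int count = 0;

    while (*literal != 0) {     /* same test as the main loop above */
        literal++;              /* point at the next character */
        count++;
    }
    return count;
}
```

For our first format string, count_chars("Test %s. %s!!") gives 13.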
if (*literal == '%') {
	// we have a format specifier
	// increment the literal string position
	literal++;
Now that we know we have more string to interpret, we have to check the characters. Well, the first thing to check is if we have a % sign, which means we have a format specifier. If we do, then we have to start pointing at the next character in the format string. To do this, we use pointer arithmetic. It's simple. Pointer arithmetic is just like regular arithmetic, but we advance in increments of the pointer type. For a character pointer, we skip 1 character, which in C is 1 byte. For int pointers, we skip 2 bytes, because integers on the TI-89/92+/V200 are 2 bytes (they can be 4, but this is not the default). Long integers are of course 4 bytes. Let's look at a more concrete example.

Characters take 1 byte, so if we increment a character pointer by "1", we will advance the pointer one byte. However, integers take two bytes, so when we increment an integer pointer by "1", we advance it 2 bytes. Pointers take 4 bytes (on machines with 32-bit addresses like the 68000), so if we increment a pointer into an array of pointers (that may not make sense right now, but there are such things), we would advance the pointer by 4 bytes. So, to put it simply, adding x to a pointer moves it x * (the size of the type it points to) bytes. So, if ptr points to a char at the address 1000, then ptr++ would mean it now points to 1001. If ptr points to an integer at 1010, then ptr++ would make it point to 1012. If ptr points to a long integer (4 bytes) at 1996, then ptr++ would mean it now points to 2000. See, pointer arithmetic is easy!
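You can measure the step size yourself. The sketch below is standard C with made-up function names; it increments a pointer once and reports how many bytes it moved. Keep in mind the numbers depend on the compiler: with TIGCC's default settings an int is 2 bytes, while on most PCs it is 4, so the code compares against sizeof rather than a fixed number.

```c
#include <stddef.h>

/* Bytes moved when a char pointer is incremented once. */
size_t char_step(void) {
    char items[2] = {0, 0};
    char *ptr = items;
    ptr++;                                          /* add "1" */
    return (size_t)(ptr - items);                   /* chars are bytes */
}

/* Bytes moved when an int pointer is incremented once. */
size_t int_step(void) {
    int items[2] = {0, 0};
    int *ptr = items;
    ptr++;                                          /* add "1" */
    return (size_t)((char *)ptr - (char *)items);   /* distance in bytes */
}

/* Bytes moved when a long pointer is incremented once. */
size_t long_step(void) {
    long items[2] = {0, 0};
    long *ptr = items;
    ptr++;                                          /* add "1" */
    return (size_t)((char *)ptr - (char *)items);   /* distance in bytes */
}
```

Whatever the machine, char_step() is 1, int_step() equals sizeof(int), and long_step() equals sizeof(long).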

Well, since literal is a character pointer, and characters take one byte, we just advanced the literal pointer by 1 character (1 byte), so it now points to the character right after the % sign. This is just where we want to be. So, let's continue the analysis.
if (*literal == 'd') {
	// we have an integer
	*str = 'i';
	*++str = 'n';
	*++str = 't';
				
	// it's too hard to convert so we'll just pretend by skipping the integer bytes
	args+=4;
	length+=3;
Well, now that we know we have a format specifier, it's time to determine which one it is. So, we will start by checking for integer specifiers, which are signified by the letter 'd' (which stands for decimal if you wondered how they got 'd' out of integer).

Here is where we start appending things to the string we have. Well, it's a bit difficult converting an integer to a string, so we aren't going to worry about that. Instead, we will put the word 'int' in the string every time we find an integer specifier. We can assign the first character by using the dereferencing operator and the assignment operator.

Now we start getting more complicated. The next two characters must increment the pointer before they add their character, or we will accidentally overwrite the first character. So, to increment the pointer first, we use the ++ pre-increment operator. We can couple this with the dereferencing operator to put them all in a single statement. Now, if we put the ++ before the pointer, it will increment the pointer before it assigns the value. So, we assign the 'n' to the position in the character array (string) one after the current position we are in. Simple, no? Well, just in case it's not, let's break it down into something easier.
++str;
*str = 'n';
In this, we can see things a bit more clearly. First, we increment the character pointer. Then we can safely assign the new character to the correct position in the array. But it's more concise to put it all in one statement. Note that in this example, we could have used pre or post increment, since it doesn't matter if we increment it before we do nothing or after we do nothing, but it does matter when we combine them because we must increment the pointer before we assign the next character.

It would probably have been nicer to enclose the ++ operator within parentheses to distinguish it from the * dereference operator. *(++ptr) is a little more readable than *++ptr even though they are evaluated exactly the same.
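To compare the two increment styles side by side, here is a small sketch in standard C (both function names are made up). Each one writes the same three characters; the only difference is where the pointer ends up.

```c
/* Pre-increment style, as in sprintf2(): move first, then store.
   Returns the pointer, which ends up AT the last character written. */
char *write_int_pre(char *str) {
    *str = 'i';
    *++str = 'n';
    *++str = 't';
    return str;
}

/* Post-increment style: store first, then move.
   Returns the pointer, which ends up one PAST the last character. */
char *write_int_post(char *str) {
    *str++ = 'i';
    *str++ = 'n';
    *str++ = 't';
    return str;
}
```

The pre-increment version leaves the pointer on the 't', which is why sprintf2() still needs its str++ at the end of the block. The post-increment version would not need it.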

We are assuming integers are two bytes, but our arguments arrive as text, so the integer is written as hex digits: 4 characters (2 characters per byte) stand for a single 2-byte integer. Converting those characters to a number is more work than we want here, so we just skip those 4 characters in the argument pointer. And, since we added 3 characters to the string, we must increment the length counter by 3.
} else if (*literal == 's') {
	// we have a string
	while (*args != 0) {
		// loop till we find the string terminator
		*str++ = *args++;
		length++;
	}
				
	// make sure we aren't past the end
	str--;
	args++;
Okay, now that we "finished" our integer part, such as it is, let's go on to the string part. We check for the 's' character, which is the cue for a string to be appended to our new string.

Now we have to make another loop, but this time, we are looking for the string terminator inside the argument pointer. This is just like the literal terminator we are looking for, so it is the same thing we did above.

Now we have to do some string concatenation. To do that, we want to append one character from the args to one character of the string, and then increment both pointers. Well, remember what we did for the integer append? We use pre-increment dereferenced assignment (okay, I just made that phrase up, but it fits). Anyway, that's similar to what we want to do here, but instead, we want to increment the pointer after we copy the character. So, we put the ++ increment operator at the end of the string pointer. This means, copy the character from each character array, then increment both pointers after the copy is done. That's just what we wanted to happen. Now all we need to do is increment the length counter by 1 for the character we just copied, and our loop is finished.

Once the loop has ended (we found the terminator), we find ourselves at a problem part. Because we almost always need to increment the string pointer, we will do that once at the end of the if-then-else block, but we are already where we want to be. So, when that pointer increment comes at the end of the block, it will throw our pointer off. So we need to subtract one from our string pointer so when we hit the pointer increment at the end, it will again be pointing at the correct place.

Now, the last thing of concern is to increment the argument pointer so it's not pointing to the terminator. If we had another string, it would never get appended because we would already be pointing at a terminating character, so make sure this doesn't happen.
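The copy loop is worth seeing by itself. This sketch in standard C pulls it out into a helper (the name append_string is made up) that hands back the new end position, so the caller never needs the str-- adjustment. It is an alternative way to arrange the same idea, not the way sprintf2() does it.

```c
/* Copies the characters of src (but not its terminator) to str using
   post-increment, and returns the position just past the last
   character copied. The caller adds the terminator when finished. */
char *append_string(char *str, const char *src) {
    while (*src != '\0') {
        *str++ = *src++;        /* copy one character, advance both */
    }
    return str;
}
```

Because the helper returns where it stopped, calls can be chained: end = append_string(out, "String"); end = append_string(end, "QWERTY"); *end = 0; leaves "StringQWERTY" in out.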
} else if (*literal == '%') {
	*str = '%';

	length++;
The next section is fairly intuitive. If we have a %%, we want to put the literal character '%' in the string. So, we just append the character to the string and increment the length.
} else {
	*str = 'm';
	*++str = 'i';
	*++str = 's';
	*++str = 'c';
				
	// assume all other types
	// take 1 byte
	args++;
				
	length+=4;
}
Well, I told you this version of sprintf() was limited. If we encounter any other kind of format specifier, we do not try to interpret it. We append "misc" to the string, and assume it needed one byte from our argument pointer. This is just like the integer specifier, so I won't waste time explaining it again.
str++;
literal++;
Now that we have exited the if-then-else block, we come to our final task. Since all of these things added at least one character to the string, it's more efficient to put the string pointer increment here, rather than in every segment of the if-then-else block. But don't forget we also need to increment the literal pointer so we can find the next character to interpret. So, take care of that, and we loop again.
} else {
	// regular character
	*str = *literal;
	str++;
	literal++;
	length++;
}
Now, what if we didn't get a format specifier? Well, that's easy. We just copy the character from the literal to the current position in the string. You can see here how we can separate the increments from the assignment if we choose. It's a preference thing, but it's probably better to put it all in one statement. Anyway, since we are only appending a single character, the length counter is only incremented by one, and we loop.
// terminate the string
*str = 0;
	
return length;
Okay, now that we have taken care of the loop, we need to finish the string. Since every string needs a terminator, we need to add that as the last character. Then, we simply return the length counter so the user knows how long the string is, if the user has a need for such knowledge. Well, that takes care of our sprintf2() function, so let's go back to the beginning.
// print the string
printf_xy(0,0,string);
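Now that we have a formatted string, we need to print it. We can use DrawStr() or printf_xy() to accomplish this. DrawStr() would have been just as good, and arguably safer: printf_xy() treats its third argument as a format string, so any '%' left in our string gets interpreted a second time. Passing "%s" as the format and string as its argument avoids that. I have no idea why I used printf_xy here.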
// format a new string
length = sprintf2(string,"%d %f %% %m'isc'","0030FM");
Okay, now we are going to reformat the string. The real reason for doing this is because it's hard to combine string values and other values inside the argument pointer, due to our limitations. But it's good to see that we don't need a blank string to format, we can reformat any string.

Well, the format specifiers are all there. We start with an integer specifier, then use two miscellaneous specifiers, one for floating point (%f) and the other as a play on words (%m'isc'), but we don't interpret either of them. When we hit the integer part of the sprintf2() function, it will skip 4 characters of the argument string (standing in for 2 bytes, since it takes two hex characters to represent one byte). Each of the miscellaneous specifiers takes one more character from the argument string, which is why it is 6 characters long. The other specifier we used was the literal %, just to show you how it works. The finished string is "int misc % misc'isc'".
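We skipped the conversion, but if you are curious what turning those 4 hex characters into a number might look like, here is one possible sketch in standard C. It assumes the 4 characters are hex digits with the most significant digit first, which is only one way to read the "0030" in our argument string, and both function names are made up.

```c
/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Reads exactly four hex characters starting at args and returns
   their value, or -1 if any of them is not a hex digit. A long is
   used so the result fits even where int is only 2 bytes. */
long hex4_to_value(const char *args) {
    long value = 0;
    int i;

    for (i = 0; i < 4; i++) {
        int digit = hex_digit_value(args[i]);
        if (digit < 0) {
            return -1;          /* also stops safely at a terminator */
        }
        value = value * 16 + digit;
    }
    return value;
}
```

Under that reading, hex4_to_value("0030FM") is 48 (hex 30), and the pointer would then move ahead 4 characters to reach the 'F', just as args+=4 does in sprintf2().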

Since we used this function above, and walked through it, this shouldn't be too much more difficult to understand than the last one.
// print the new string
printf_xy(0,10,string);
	
// wait for user to press a key to exit the program
ngetchx();
Well, the end of our program. Just print out our new string, and wait for user input to exit the program.

Step 6 - Conclusion

Well, we have reached the end of this lesson, and now it's time for me to sum everything up into a neat paragraph or two. Well, that's not going to happen. Pointers are the most difficult concept in C, especially for beginners, so this is probably the most complicated lesson. If you are lucky, you understood it all and are ready to go try to use pointers yourself.

I suppose I should say something resembling a conclusion, so here it is. Pointers and arrays, which are very similar concepts, are probably the most useful and powerful concepts in C. Arrays are used everywhere, and pointers are just as important. We have only scratched the surface of these concepts, as you will see soon. You should now be moving beyond the range of a beginning programmer. You've learned all of the basic concepts of C, and now it is time to move on to more advanced topics. Although you should be able to program most any simple or reasonably advanced program (with the help of the TIGCC library documentation of course), you haven't learned all the ways to make your programs good. There is always more than one way to do something in C, and there are often better ways to do some things.

I hope you have learned something here, because it's a most valuable thing to learn. I also hope you are ready to move to more advanced topics in C programming, because we are beyond the scope of basic things. It's time to tie all these concepts together to build real programs. I showed you how just from the first three lessons you could build a slider puzzle game. With the concepts you have now, you could build a Nibbles clone, or any simple game in C. However, C programming is a never-ending topic. This is not a bad thing, but it does require patience.

Anyway, have fun using pointers.


Copyright © 1998-2007 Techno-Plaza All Rights Reserved Unless Otherwise Noted
