Well, back in step 4, we learned about the basic uses of pointers, and
got an introduction to the dereferencing operator. In this step, we will
learn some more uses for pointers.
We mentioned that we could use pointers for values we can't return from
a function. But when would this happen? Well, let's imagine we want to
manipulate a string. The only way we have to store strings is an
array of characters, and a function can return only a single value, not
a whole array. So we must use pointers to manipulate such a string.
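To make that a little more concrete, here is a minimal sketch of a function that changes a caller's string in place. The name shout() is made up for this example, and toupper() comes from the standard ctype.h header rather than from anything in this lesson.

```c
#include <ctype.h>

/* Hypothetical helper: upper-cases the caller's string in place.
   Nothing is returned. The caller sees the change because we were
   handed a pointer to the caller's own characters, not a copy. */
void shout(char *text) {
    while (*text != 0) {
        *text = (char)toupper((unsigned char)*text);
        text++;   /* move on to the next character */
    }
}
```

If we declare char name[] = "pointer"; and call shout(name);, the array holds "POINTER" afterwards, even though shout() never returned anything.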
Let's examine a very useful function called sprintf(). sprintf() is
similar to printf(), only instead of printing its data to the screen, it
"prints" it to a string. The name stands for "string-print-formatted",
but who keeps track of these things anyway. In fact, you can think of
printf() as calling sprintf() and then calling a print string routine.
sprintf() needs to be able to manipulate a string's data, so it can
replace all the special character sequences and format the string
accordingly. But to do that, it needs some way of accessing the string.
Enter the pointer. Just so we're clear, the syntax of the sprintf()
function is sprintf(string pointer, "format string", format arguments);
char string[30];
sprintf(string, "This is a %d character string", 30);
Remember that when we pass an array to a function, it is treated like a
pointer to its first element. Knowing that, our pointer for the sprintf()
function is the string[] array. We didn't explicitly say it, but the
arguments we use for the sprintf() and printf() functions do not have to
be variables, just anything that has a value. So, the literal number 30
is perfectly valid. After we do this, the string array will contain the
string "This is a 30 character string", which is 29 characters plus the
terminating character, exactly filling the 30 characters we reserved.
Pretty simple.
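Here is the same call wrapped up so it can be tried on its own. This is only a sketch: build_message() is a name invented for this example, and it assumes the caller passes a buffer of at least 30 characters.

```c
#include <stdio.h>

/* Hypothetical wrapper around the call shown in the lesson.
   It formats the message into a buffer owned by the caller and
   returns what sprintf() returns: the number of characters written,
   not counting the terminating character.
   Assumption: buffer has room for at least 30 characters. */
int build_message(char *buffer) {
    return sprintf(buffer, "This is a %d character string", 30);
}
```

Calling build_message(string) with our char string[30]; returns 29. There is no room to spare, so a longer message would run off the end of the array, and sprintf() would not stop us.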
Well, you might be asking yourself, how did the sprintf() function
accomplish this task? Well, that's easy. The sprintf() function was
given 3 pointers: the string array pointer, the string format literal
pointer, and the argument list.
Since all of these pointers are arrays of sorts, let's list them like
that:
string: 000000000000000000000000000000
literal: 'T','h','i','s',' ',... (you get the idea)
arguments: 30
Remember that when we begin, all the pointers are pointing to their
first item. For the string pointer, that's the very first character. For
the string literal, that's the 'T'.
So, when we come to the sprintf() function, it has to do several things,
but the task is very simple, provided the data is all put in correctly.
sprintf() will not correct mistakes in your arguments. If you forget to
put a variable in the argument list, you will get very strange results.
Anyway, let's take a look at the process:
1. Get the current character from the literal pointer. If we found the
   terminating character of the literal, skip to step 7. If it is a
   '%', skip to step 4. Otherwise, append it to the string pointer,
   like this: *string = *literal;
2. Increment the string pointer and the literal pointer. string++;
   literal++;
3. Repeat at step 1.
4. Since the character was a '%', increment the literal pointer and
   check the character there. literal++; If the new character is a
   '%', then append a '%' to the string pointer as in step 1 and go to
   step 2. Otherwise, continue with step 5.
5. Check the format specifier. 'd' = integer. 's' = string. 'u' =
   unsigned integer. 'f' = float. I'm not going to go through the
   whole process, so we'll just do the integer format. So, go to
   step 6.
6. For integers, grab a single integer from the argument pointer. int
   temp; temp = *(int *)argument; argument += sizeof(int); Convert the
   integer to a character string (complex, so just pretend there's a
   magic function that does this for us automatically), then append it
   to the string. for (loop = 0; loop <
   tempIntegerStringPointerSize; loop++) { *string =
   *tempIntegerString; string++; tempIntegerString++; } Increment the
   literal pointer past the specifier and repeat from step 1.
7. Append the terminating character to the string. We're finished.
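The steps above can be sketched in C roughly like this. It is not how a real sprintf() is written. The names mini_format() and append_int() are made up, only %d and %% are handled, and the arguments are assumed to arrive as a plain array of ints instead of a real argument list. The "magic function" from step 6 is filled in with a simple digit-by-digit conversion.

```c
/* Hypothetical stand-in for the "magic function" in step 6: writes
   the decimal digits of value at string and returns a pointer to the
   position just after the last digit written. */
static char *append_int(char *string, int value) {
    char digits[24];   /* digits are collected in reverse order */
    int count = 0;
    unsigned int magnitude = (unsigned int)value;

    if (value < 0) {
        *string++ = '-';
        magnitude = 0u - magnitude;
    }
    do {
        digits[count++] = (char)('0' + magnitude % 10u);
        magnitude /= 10u;
    } while (magnitude != 0u);
    while (count > 0) {
        *string++ = digits[--count];   /* most significant digit first */
    }
    return string;
}

/* Sketch of steps 1-7 for %d and %% only.
   Assumption: argument points to an array holding one int per %d. */
void mini_format(char *string, const char *literal, const int *argument) {
    while (*literal != 0) {              /* step 1: stop at the terminator */
        if (*literal == '%') {
            literal++;                   /* step 4: look past the '%' */
            if (*literal == 0) {
                break;                   /* a lone '%' at the very end */
            }
            if (*literal == 'd') {       /* steps 5 and 6 */
                string = append_int(string, *argument);
                argument++;
                literal++;
                continue;                /* back to step 1 */
            }
            /* "%%" falls through and copies the second '%'. Any other
               specifier is not interpreted here and is copied as is. */
        }
        *string = *literal;              /* step 1: plain character */
        string++;                        /* step 2 */
        literal++;
    }
    *string = 0;                         /* step 7: terminate the string */
}
```

With int values[] = { 30 }; the call mini_format(out, "This is a %d character string", values); leaves the same text in out that the real sprintf() produced earlier.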
I know that seems complicated, and believe me I oversimplified it, but
the important part is that the pointer is what allowed us to change the
value of the string array without passing a copy of the array, and
without returning an entire array, both of which are complicated
processes. All we had to do was pass a pointer value. The pointer is the
most powerful (and therefore most difficult to master) concept in C.
Give it time, and you will understand pointers perfectly.
Well, let's take a look at a program that will do something like a
sprintf function.
Okay, it's time to analyze the program.
int length;
char string[80];
// clear the screen
clrscr();
The start of the program is fairly straightforward. We declare two
variables, a length variable which will hold the length of our string,
and a string array which will hold our string. We declare the string to
be 80 characters in length (max). We are telling the computer to reserve
80 characters for us and to give us a pointer to the first character.
This will become even more clear when we cover dynamic memory
allocation, but one step at a time.
I have no idea why I used printf_xy or clrscr here since DrawStr clearly
would have worked fine.
// format the string
length = sprintf2(string,"Test %s. %s!!","String\x00QWERTY\x00");
Here is where we format our string using our version of sprintf().
Keep in mind this version of sprintf() is VERY scaled down, but it does
do a couple of things the real sprintf() does. For one, the function
returns the length of the new string. The real sprintf() does this too,
although the value is rarely used.
Note the format of the string. We don't have the luxury of infinite
arguments to our function, because that would mean introducing a ton of
new concepts I don't want to worry about here. Since unlimited arguments
are rarely used, we probably won't cover how to do this at all. So,
instead of having arguments passed one after another, we used a single
string literal. This is less than ideal, but will suit our purposes.
So, our format string says: replace all my %s's with the strings I
provide, and leave all the other characters intact.
This brings up a concept which is very important: the issue of string
termination. Every program needs a way to tell where a string ends.
There are two ways of doing this: storing the length of the string in
characters, or ending the string with some kind of terminator
character. C uses the terminator, on your PC and on the TI-89/92+/V200
alike, and the terminator is always the null character (the value 0).
So, at the end of every string, we must put a terminator character. For
all string literals (the "some string" strings), the compiler will
automatically add the string terminator for you without you needing to
worry about it, but when we pack several strings into one argument
literal, we must add the terminators between them ourselves.
This brings up one of the drawbacks of our argument method. A real
sprintf() function would be handed each of its arguments separately,
but we don't have that luxury here. This means it's hard to specify
strings. We can terminate them by using the \x00 character, which is a
hex escape sequence. If we just typed a zero in there, we would get the
character '0' (the digit). The escape sequence puts the number 0 in the
string instead, which is what our terminator must be.
This brings us to the biggest problem in our argument specifier: what
kinds of things can we put after strings? Every hex digit that follows
the \x is interpreted to be part of that escape sequence. So the next
string has to start with something outside the hex digit range. Our
QWERTY string works because Q is not a hex digit (0-9, A-F, a-f), so
it's interpreted as being the first character of the next character
sequence. I know that seems complicated, but it's really not.
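To see what that argument literal really looks like in memory, here is a small sketch. The names packed_args and next_packed() are made up for this example, and strlen() comes from the standard string.h header.

```c
#include <string.h>

/* The argument literal from our program: two strings packed into
   one. We wrote two terminators ourselves, and the compiler adds one
   more at the very end, as it does for every string literal. */
const char packed_args[] = "String\x00QWERTY\x00";

/* Hypothetical helper: given a pointer to one packed string, returns
   a pointer to the string that follows it. */
const char *next_packed(const char *current) {
    /* skip the characters, then skip the terminator */
    return current + strlen(current) + 1;
}
```

Here strlen(packed_args) is 6, because counting stops at the first terminator, and next_packed(packed_args) points at the Q. If the second string had to start with a hex digit, one way around the escape problem is to split the literal in two, as in "String\x00" "ABC", because the compiler joins adjacent string literals only after it has finished reading the escape sequence.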
Well, we've covered the format of our new function, and how we used it
in this instance, so let's take a look at the sprintf2() function we
created.
// a VERY simplified version of a sprintf() function
int sprintf2(char *str, const char *literal, const char *args) {
int length = 0;
The beginning is fairly simple. It's a function of "three" arguments
(since we have other arguments embedded inside the *args pointer). We
declare one variable, the length counter which we will return at the end
of the function. The const char simply means the function won't change
these strings through these pointers. It lets the compiler help us find
errors. If we try to change a const string, the compiler will warn us.
// loop until we find the terminator
while (*literal != 0) {
Here is where we begin the main loop. Since the format string is a
string literal, it does have a terminating character, so we are waiting
to find it. This brings up one of the many cool things about character
pointers. Remember that an array of characters is just that, one
character after another after another. So, there is no real way to
interpret all the characters at once, even though we think of them as a
string. This is to our advantage though. Remember the * dereferencing
operator? Well, since we have a character pointed to by the character
pointer, we can dereference the pointer to get the character we are
currently pointing to. When we start, this will be pointing at the first
character in the format string. When we are finished, this will be
pointing to the null terminator, which is what we want. So, we will loop
until we find this character.
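The same loop, on its own, is enough to measure a string. This is just a sketch to show the idea. The name string_length() is made up, and the standard library already has strlen() for real use.

```c
/* Hypothetical helper: counts the characters in a string by walking
   a pointer along it until the dereferenced character is the
   terminator. The terminator itself is not counted. */
int string_length(const char *text) {
    int length = 0;
    while (*text != 0) {   /* *text is the character we point at now */
        length++;
        text++;            /* point at the next character */
    }
    return length;
}
```

For our first format string, string_length("Test %s. %s!!") gives 13.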
if (*literal == '%') {
// we have a format specifier
// increment the literal string position
literal++;
Now that we know we have more string to interpret, we have to check the
characters. Well, the first thing to check is if we have a % sign, which
means we have a format specifier. If we do, then we have to start
pointing at the next character in the format string. To do this, we use
pointer arithmetic. It's simple. Pointer arithmetic is just like regular
arithmetic, but we advance in increments of the pointer type. For a
character pointer, we skip 1 character, which in C is 1 byte. For int
pointers, we skip 2 bytes, because integers on the TI-89/92+/V200 are
2 bytes (they can be 4, but this is not the default). Long integers are
of course 4 bytes. Let's look at a more concrete example.
Characters take 1 byte, so if we increment a character pointer by
"1", we will advance the pointer one byte. However, integers take
two bytes, so when we increment an integer pointer by "1", we advance it
2 bytes. Pointers take 4 bytes (on 32-bit machines like the 68000), so
if we increment a pointer to an array of pointers (that may not make
sense right now, but there are such things), we would advance the
pointer by 4 bytes. So, to be simple, pointer arithmetic just means that
any time we add x to a pointer, the address moves by x times the size of
the type it points to. So, if *ptr is a char, and it points to the
address 1000, then ptr++ would mean it now points to 1001. If *ptr is an
integer, and it points to 1010, then ptr++ would make it point to 1012.
If *ptr is a long integer (4 bytes) and it points to 1996, then ptr++
would mean it now points to 2000. See, pointer arithmetic is easy!
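If you would rather measure this than take my word for it, here is a small sketch. The function names are made up for this example. Each one adds 1 to a pointer and reports how many bytes the address moved, using casts to char pointers so the difference is counted in bytes.

```c
#include <stddef.h>

/* How many bytes does adding 1 move a char pointer? */
size_t char_step(void) {
    char items[2];
    char *ptr = items;
    return (size_t)((char *)(ptr + 1) - (char *)ptr);
}

/* The same question for an int pointer. */
size_t int_step(void) {
    int items[2];
    int *ptr = items;
    return (size_t)((char *)(ptr + 1) - (char *)ptr);
}

/* And for a long pointer. */
size_t long_step(void) {
    long items[2];
    long *ptr = items;
    return (size_t)((char *)(ptr + 1) - (char *)ptr);
}
```

The answers always match sizeof(char), sizeof(int) and sizeof(long) on whatever machine runs the code. With the default settings on the calculator that means 1, 2 and 4. A PC compiler will most likely report a larger int, which is exactly why we let the compiler do this arithmetic for us.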
Well, since literal is a character pointer, and characters take one
byte, we just advanced the literal pointer by 1 character (1 byte), so
it now points to the character right after the % sign. This is just
where we want to be. So, let's continue the analysis.
if (*literal == 'd') {
// we have an integer
*str = 'i';
*++str = 'n';
*++str = 't';
// it's too hard to convert so we'll just pretend by skipping the integer bytes
args+=4;
length+=3;
Well, now that we know we have a format specifier, it's time to
determine which one it is. So, we will start by checking for integer
specifiers, which are signified by the letter 'd' (which stands for
decimal if you wondered how they got 'd' out of integer).
Here is where we start appending things to the string we have. Well,
it's a bit difficult converting an integer to a string, so we aren't
going to worry about that. Instead, we will put the word 'int' in the
string every time we find an integer specifier. We can assign the first
character by using the dereferencing operator and the assignment
operator. Now we start getting more complicated. The next two characters
must increment the pointer before they add their character, or we will
accidentally overwrite the first character. So, to increment the pointer
first, we use the ++ pre-increment operator. We can couple this with the
dereferencing operator to put it all in a single statement. If we put
the ++ before the pointer, it will increment the pointer before it
assigns the value. So, we assign the 'n' to the position in the
character array (string) one after the current position we are in.
Simple, no? Well, just in case it's not, let's break it down into
something easier.
++str;
*str = 'n';
In this, we can see things a bit more clearly. First, we increment the
character pointer. Then we can safely assign the new character to the
correct position in the array. But it's more concise to put it all in
one statement. Note that in this two-line version, we could have used
pre or post increment, since ++str; and str++; do the same thing when
the statement does nothing else. It does matter when we combine them,
because we must increment the pointer before we assign the next
character.
It would probably have been nicer to enclose the ++ operator within
parentheses to distinguish it from the * dereference operator. *(++ptr)
is a little more readable than *++ptr even though they are evaluated
exactly the same.
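To compare the two forms side by side, here is a small sketch with two made-up functions. Both write the word "int", one with pre-increment as in our program and one with post-increment. The only difference is where the pointer ends up.

```c
/* Hypothetical helper: writes 'i', 'n', 't' the way sprintf2() does.
   Returns the pointer as it stands afterwards, which is still ON the
   last character written ('t'). */
char *write_int_pre(char *str) {
    *str = 'i';
    *(++str) = 'n';   /* move first, then store */
    *(++str) = 't';
    return str;
}

/* Hypothetical helper: the same three characters with post-increment.
   Returns the pointer as it stands afterwards, which is one PAST the
   last character written. */
char *write_int_post(char *str) {
    *str++ = 'i';     /* store first, then move */
    *str++ = 'n';
    *str++ = 't';
    return str;
}
```

Both leave the same three characters in the array. The pre-increment version stops on the 't', so something later still has to step past it, which is what the str++ near the end of our loop is for.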
We are assuming integers are two bytes. We can't easily type raw bytes
into our argument string, so we write the integer as hex digits, two
characters per byte, which makes a single integer 4 characters long.
It's hard to convert those characters, so we just skip that space in the
argument pointer by adding 4. And, since we added 3 characters to the
string, we must increment the length counter by 3.
} else if (*literal == 's') {
// we have a string
while (*args != 0) {
// loop till we find the string terminator
*str++ = *args++;
length++;
}
// make sure we aren't past the end
str--;
args++;
Okay, now that we "finished" our integer part, such that we are going to
do, let's go on to the string part. We check for the 's' character,
which is the cue for a string to be appended to our new string.
Now we have to make another loop, but this time, we are looking for the
string terminator inside the argument pointer. This is just like the
literal terminator we are looking for, so it is the same thing we did
above.
Now we have to do some string concatenation. To do that, we want to
append one character from the args to one character of the string, and
then increment both pointers. Well, remember what we did for the integer
append? We used pre-increment dereferenced assignment (okay, I just made
that phrase up, but it fits). Anyway, that's similar to what we want to
do here, but instead, we want to increment the pointer after we
copy the character. So, we put the ++ increment operator at the end of
the string pointer. This means, copy the character from each character
array, then increment both pointers after the copy is done. That's just
what we wanted to happen. Now all we need to do is increment the length
counter by 1 for the character we just copied, and our loop is finished.
Once the loop has ended (we found the terminator), we run into a small
problem. Because we almost always need to increment the string
pointer, we do that once at the end of the format specifier block, but
here we are already where we want to be. So, when that pointer increment
comes, it will throw our pointer off. So we need to subtract one from
our string pointer so when we hit the pointer increment at the end, it
will again be pointing at the correct place.
Now, the last thing of concern is to increment the argument pointer so
it's not pointing to the terminator. If we had another string, it would
never get appended because we would already be pointing at a terminating
character, so make sure this doesn't happen.
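Here is the %s branch pulled out into a function of its own, as a sketch. The name copy_argument() and its length parameter are made up for this example. It does the same copy with post-increment and then steps over the terminator, so the returned pointer is ready for the next string.

```c
/* Hypothetical helper mirroring the %s branch of sprintf2().
   Copies one terminated string from args to dest, adds the number of
   characters copied to *length, and returns a pointer to whatever
   follows the terminator in the argument block.
   Like the loop in the lesson, it does not terminate dest. */
const char *copy_argument(char *dest, const char *args, int *length) {
    while (*args != 0) {
        *dest++ = *args++;   /* copy one character, then move both */
        (*length)++;
    }
    return args + 1;         /* step over the terminator */
}
```

Called on our "String\x00QWERTY\x00" literal with a length of 0, it copies the 6 characters of "String" and returns a pointer to the Q. Without that last + 1 the pointer would be stuck on the terminator, and the next %s would copy nothing.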
} else if (*literal == '%') {
*str = '%';
length++;
The next section is fairly intuitive. If we have a %%, we want to put
the literal character '%' in the string. So, we just append the
character to the string and increment the length.
} else {
*str = 'm';
*++str = 'i';
*++str = 's';
*++str = 'c';
// assume all other types
// take 1 byte
args++;
length+=4;
}
Well, I told you this version of sprintf() was limited. If we encounter
any other kind of format specifier, we do not try to interpret it. We
append "misc" to the string, and assume it needed one byte from our
argument pointer. This is just like the integer specifier, so I won't
waste time explaining it again.
str++;
literal++;
Now that we have exited the if-then-else block, we come to our final
task. Since all of these things added at least one character to the
string, it's more efficient to put the string pointer increment here,
rather than in every segment of the if-then-else block. But don't forget
we also need to increment the literal pointer so we can find the next
character to interpret. So, take care of that, and we loop again.
} else {
// regular character
*str = *literal;
str++;
literal++;
length++;
}
Now, what if we didn't get a format specifier? Well, that's easy. We
just copy the current character from the literal to the current position
in the string. You can see here how we can separate the increments from
the assignment if we choose. It's a preference thing, but it's probably
better to put it all in one statement. Anyway, since we are only
appending a single character, the length counter is only incremented by
one, and we loop.
// terminate the string
*str = 0;
return length;
Okay, now that we have taken care of the loop, we need to finish the
string. Since every string needs a terminator, we need to add that as
the last character. Then, we simply return the length counter so the
user knows how long the string is, if the user has a need for such
knowledge. Well, that takes care of our sprintf2() function, so let's go
back to the beginning.
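For reference, here are the pieces we just walked through put back together as one function, as a sketch. The only thing added is the check for a format string that ends in a lone '%'. That check is not part of the pieces above, and without it the literal pointer would walk past the terminator.

```c
/* a VERY simplified version of a sprintf() function */
int sprintf2(char *str, const char *literal, const char *args) {
    int length = 0;

    /* loop until we find the terminator */
    while (*literal != 0) {
        if (*literal == '%') {
            /* we have a format specifier */
            literal++;
            if (*literal == 0) {
                break;   /* added guard: lone '%' at the end */
            }
            if (*literal == 'd') {
                /* we have an integer: pretend, and skip its 4 hex digits */
                *str = 'i';
                *++str = 'n';
                *++str = 't';
                args += 4;
                length += 3;
            } else if (*literal == 's') {
                /* we have a string: copy until its terminator */
                while (*args != 0) {
                    *str++ = *args++;
                    length++;
                }
                str--;    /* undo one step, the shared str++ redoes it */
                args++;   /* move past the terminator */
            } else if (*literal == '%') {
                *str = '%';
                length++;
            } else {
                /* assume all other types take 1 byte */
                *str = 'm';
                *++str = 'i';
                *++str = 's';
                *++str = 'c';
                args++;
                length += 4;
            }
            str++;
            literal++;
        } else {
            /* regular character */
            *str = *literal;
            str++;
            literal++;
            length++;
        }
    }

    /* terminate the string */
    *str = 0;
    return length;
}
```

With the first call from our program, sprintf2(string,"Test %s. %s!!","String\x00QWERTY\x00"), the string becomes "Test String. QWERTY!!" and the function returns 21.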
// print the string
printf_xy(0,0,string);
Now that we have a formatted string, we need to print the string. We can
use DrawStr() or printf_xy() to accomplish this. DrawStr() would have
been just as good. I have no idea why I used printf_xy here.
// format a new string
length = sprintf2(string,"%d %f %% %m'isc'","0030FM");
Okay, now we are going to reformat the string. The real reason for doing
this is because it's hard to combine string values and other values
inside the argument pointer, due to our limitations. But it's good to
see that we don't need a blank string to format. We can reformat any
string.
Well, the format specifiers are all there. We used two miscellaneous
formatters, one for the floating point variables (%f) and the other as a
play on words (%m'isc'), but we don't interpret either of them. We also
put an integer specifier in the string. When we hit the integer part of
the sprintf2() function, it will skip 4 characters of the argument
string (2 bytes in a sense, as we need two characters to represent each
hex byte). Each of the miscellaneous formatters will take one character
from the argument string, which is why it is 6 characters long. The
other formatter we used was the literal %, just to show you how it
works. The finished string reads "int misc % misc'isc'".
Since we used this function above, and walked through it, this shouldn't
be too much more difficult to understand than the last one.
// print the new string
printf_xy(0,10,string);
// wait for user to press a key to exit the program
ngetchx();
Well, the end of our program. Just print out our new string, and wait
for user input to exit the program.