Strings and Characters

Strings in C are essentially arrays of characters (there is no string type as there is in C++). The ASCII character encoding scheme uses a single byte to represent each character, and strings consist of an array of single-byte characters terminated by the null character (a byte with the value 0), which marks the end of the string. The char data type is used to represent these single byte characters, and strings are represented in C as an array of type char. The following character array variable can be used to hold a string up to twenty characters in length:

char myString[21];

The character array myString has room for twenty-one characters, including the terminating null character ('\0'), so the string it holds can be at most twenty characters long. The two character arrays shown below, for example, both have space for twenty-one characters, but the strings they represent have different lengths.

Character arrays

The null character (written as a backslash followed by a zero) tells the standard library functions that process the string where the string ends. Those functions ignore any array elements that follow the null character. A character array, like any other array, can be initialised with a list of element values, as shown in the following program statement:

char myString[21] = {'H', 'e', 'l', 'l', 'o', '\0'};

Character arrays can also be initialised using string literals, as shown below:

char myString[21] = "Hello";

Strings of characters enclosed between double quotation marks (") are known as string literals. String literals have a null character appended automatically after the last character. A character array intended to hold a string literal of up to n characters should therefore be declared with n+1 elements, to allow for the terminating null character. Note that a string literal cannot be assigned to a character array with the assignment operator once the array has been declared. Each element of the array may, however, be individually assigned a value using its array subscript, a number in the range 0 to n.
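
As a minimal sketch of element-by-element assignment (the function name set_greeting is hypothetical and is not part of the standard library), the following code fills an array one subscript at a time. The caller is assumed to pass an array of at least six elements:

```c
/* Hypothetical helper: writes "Hello" into text one element at a time.
   text must have at least 6 elements (5 characters plus the null). */
void set_greeting(char text[])
{
    text[0] = 'H';
    text[1] = 'e';
    text[2] = 'l';
    text[3] = 'l';
    text[4] = 'o';
    text[5] = '\0';  /* without the null, text would not hold a valid string */
}
```

Forgetting the final assignment is a common mistake, because the array would then have no marker for the end of the string.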

To hold the string "Hello", we would require an array that could hold the 5 characters in the word "Hello", plus the terminating null character. The variable could be declared and initialised as follows:

char my_string[6];
strcpy( my_string, "Hello" );

Standard C string handling functions like the strcpy() function used in the above example can be used to manipulate strings consisting of arrays of type char. Declarations for most of the commonly used functions for working with strings can be found in the string.h header file.
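
Note that strcpy() does not check whether the destination array is large enough. The following is a minimal sketch of one way to guard against that, assuming the caller knows the size of the destination array (copy_if_fits is a hypothetical name, not a standard function):

```c
#include <string.h>

/* Hypothetical helper: copies src into dest only if it fits.
   size is the number of elements in dest.
   Returns 1 on success, 0 if src plus its null would not fit. */
int copy_if_fits(char *dest, size_t size, const char *src)
{
    if (strlen(src) + 1 > size)
        return 0;          /* copying would overrun dest */
    strcpy(dest, src);     /* safe: the length was checked above */
    return 1;
}
```

With a six-element array, copy_if_fits(my_string, sizeof my_string, "Hello") succeeds, while the same call with "Hello!" is refused and leaves the array unchanged.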


Finding the length of a string

You can determine the length of a string using the strlen() function. The value returned is the number of characters in the string, not including the terminating null character. When applied to a character array with space for 30 characters that contains the string "Hello World", strlen() returns 11 (as opposed to 12 or 30). You could use strlen(), for example, to check the length of a postcode string, as shown in the following example:

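char post_code[12] = "";  /* room for an over-long entry, the newline and the null */

/* fgets() never writes past the end of the array, unlike gets() */
fgets(post_code, sizeof post_code, stdin);
post_code[strcspn(post_code, "\n")] = '\0';  /* remove the newline kept by fgets() */

if ((strlen(post_code) < 7) || (strlen(post_code) > 9))
  printf( "Post code is incorrect length\n" );
else
  printf( "Post code is correct length" );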
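
The difference between the size of an array and the length of the string it holds can be sketched as follows. The function names are hypothetical, and the 7 to 9 character range is simply the one used in the postcode check above:

```c
#include <string.h>

/* Returns the length of the string held in a 30-element array */
size_t greeting_length(void)
{
    char greeting[30] = "Hello World";
    return strlen(greeting);   /* counts characters up to the null */
}

/* Returns the number of elements in the same array */
size_t greeting_capacity(void)
{
    char greeting[30] = "Hello World";
    return sizeof greeting;    /* size of the array, not of the string */
}

/* Returns 1 if the string is between 7 and 9 characters long */
int post_code_length_ok(const char *post_code)
{
    size_t length = strlen(post_code);
    return length >= 7 && length <= 9;
}
```

Calling strlen() once and storing the result, as post_code_length_ok() does, avoids scanning the string twice.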

You could also check whether any data has been entered at all by testing whether the length of the string is 0. Alternatively, you could check whether the first character in the array is the null character. Both methods are illustrated in the following example:

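char str[10] = "";

fgets(str, sizeof str, stdin);    /* reads at most 9 characters, unlike gets() */
str[strcspn(str, "\n")] = '\0';   /* remove the newline kept by fgets() */

if ( strlen(str) == 0 )
  printf( "Nothing entered\n" );
else
  printf( "Something entered" );

/* or */

if ( str[0] == '\0' )
  printf( "Nothing entered\n" );
else
  printf( "Something entered" );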


Comparing two strings

You can compare two strings using the strcmp() function, which takes two arguments: the strings to be compared. The comparison is based on the numeric values of the character codes. In ASCII, digits have lower values than upper-case letters, which in turn have lower values than lower-case letters. For example, A comes after 9, 0 comes before 9, Z comes after A, and a comes after Z.

The strcmp() function compares the two strings character by character, starting from the left, until it finds a pair of characters that differ or reaches the end of the strings. If every character is the same, the two strings are equal and the function returns 0. If, at the first position where the strings differ, the character in the first string has a smaller value than the character in the second string, the first string is said to be less than the second string, and the function returns a negative integer. If the character in the first string has the larger value, the first string is said to be greater than the second string, and the function returns a positive integer.

The process is similar to comparing names in a telephone directory (e.g. "Smith" comes before "Smythe"). Comparison of the third character would result in strcmp() returning a non-zero result (depending, of course, on the order in which the arguments were passed to the function). The function is typically used to check whether or not two strings are the same, or to sort a number of strings into alphanumeric order. Note that strcmp() is case-sensitive.
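
Because the standard only promises a negative, zero or positive result, code should test the sign of the value returned by strcmp() rather than compare it with a particular number. The following sketch (compare_sign is a hypothetical helper) reduces the result to -1, 0 or 1:

```c
#include <string.h>

/* Hypothetical helper: reduces the result of strcmp() to -1, 0 or 1 */
int compare_sign(const char *first, const char *second)
{
    int result = strcmp(first, second);
    return (result > 0) - (result < 0);
}
```

Assuming an ASCII character set, compare_sign("Smith", "Smythe") gives -1, swapping the arguments gives 1, and compare_sign("letmein", "LetMeIn") gives 1 because lower-case letters have higher codes than upper-case letters.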

The following code checks a hard-coded password:

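char pword[21] = "";

printf("Enter password: ");
fgets(pword, sizeof pword, stdin);    /* reads at most 20 characters, unlike gets() */
pword[strcspn(pword, "\n")] = '\0';   /* remove the newline kept by fgets() */

if (strcmp(pword, "letmein") == 0)
  printf("Password correct - continue");
else
  printf("Password incorrect. Go away, intruder!");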

The strncmp() function compares the first n characters of two strings to see whether they match, where n is an integer value passed to the strncmp() function. The following short program demonstrates the use of both the strcmp() function and the strncmp() function.

// Example program 1
#include <stdio.h>
#include <string.h>

int main(void)
{
  /* string literals must not be modified, so the pointers are const */
  const char *animal[] = {"Mole", "Mongoose", "Moose", "Mole", "Monkey"};
  int i, x;

  printf("In this list:\n\n");
  x = 0;
  for(i=0; i<5; i++)
  {
    printf("%s\n", animal[i]);
    /* strcmp() returns 0 only when the whole string matches */
    if(strcmp(animal[i], "Mole") == 0) x++;
  }
  printf("\nthe word \"Mole\" appears ");
  printf("%d times. \n\n", x);
  x = 0;
  for(i=0; i<5; i++)
  {
    /* strncmp() compares the first 3 characters only */
    if(strncmp(animal[i], "Mon", 3) == 0) x++;
  }
  printf("\nThere are %d", x);
  printf(" words in the list beginning with \"Mon\".");
  printf("\n\nPress ENTER to continue.");
  getchar();  /* wait for the user to press ENTER */
  return 0;
}

The output from example program 1 is shown below.

The output from example program 1


Joining two strings together

You often need to join strings together and output the result. There are two primary ways of joining strings together: using the strcpy() and strcat() functions, or using the sprintf() function.

The following example asks the user to input their first and second names, combines these two string variables in a third string variable, and outputs the result to the screen:

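/* full_name needs room for 20 + 1 + 20 characters plus the null */
char first_name[21] = "", surname[21] = "", full_name[42];

printf( "First name: " );
fgets( first_name, sizeof first_name, stdin );
first_name[strcspn( first_name, "\n" )] = '\0';  /* remove the newline kept by fgets() */
printf( "Second name: " );
fgets( surname, sizeof surname, stdin );
surname[strcspn( surname, "\n" )] = '\0';
strcpy( full_name, first_name );
strcat( full_name, " " );
strcat( full_name, surname );
printf( "Full name is: %s", full_name );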

Note that, in order to create the new string, we used the strcpy() function to copy the contents of first_name into full_name, and then used the strcat() function to join a space, and then surname, onto the end of full_name. The following example uses the sprintf() function:

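/* full_name needs room for 20 + 1 + 20 characters plus the null */
char first_name[21] = "", surname[21] = "", full_name[42];

printf( "First name: " );
fgets( first_name, sizeof first_name, stdin );
first_name[strcspn( first_name, "\n" )] = '\0';  /* remove the newline kept by fgets() */
printf( "Second name: " );
fgets( surname, sizeof surname, stdin );
surname[strcspn( surname, "\n" )] = '\0';
sprintf( full_name, "%s %s", first_name, surname );
printf( "Full name is: %s", full_name );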

The sprintf() function works like printf(), except that the result is output to a string variable rather than to the screen. The use of sprintf() in the example specifies that the result should consist of a string, followed by a space, followed by a string, where the first string is first_name and the second string is surname.
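
Like strcpy() and strcat(), sprintf() does not check the size of the destination array. The following sketch uses the standard snprintf() function (available since C99), which takes the size of the destination as an extra argument. The helper name make_full_name is hypothetical:

```c
#include <stdio.h>

/* Hypothetical helper: joins two names with a space between them.
   Returns 1 if the whole result fitted into full_name,
   0 if it was truncated or an error occurred. */
int make_full_name(char *full_name, size_t size,
                   const char *first_name, const char *surname)
{
    /* snprintf() returns the length the complete result would have had */
    int needed = snprintf(full_name, size, "%s %s", first_name, surname);
    return needed >= 0 && (size_t)needed < size;
}
```

If the destination is too small, snprintf() stores as much as fits, still terminates the string with a null character, and the helper reports the truncation.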


Converting strings and numbers

What would happen if you tried to compile the following program?

#include <stdio.h>
#include <string.h>

int main(void)
{
  int result;
  char num1[21], num2[21];
  strcpy( num1, "2" );
  strcpy( num2, "3" );
  result = num1 + num2;
  printf( "Result of %s+%s=%d", num1, num2, result );
  getchar();
  return 0;
}

The line result = num1 + num2; will produce a compiler error message (in Dev-C++, the compiler error will be "[Error] invalid operands to binary + (have 'char *' and 'char *')"). This is because you are trying to add two character arrays together. In this expression each array is treated as a pointer to its first character, and two pointers cannot be added. Although the strings may contain digits, they are not numeric variables. We need to convert each string to a number, and then do the arithmetic on the results.

To convert a string to an integer value, you can use the atoi() function, which takes a string as its argument and returns the integer value represented by the string. The first character that atoi() encounters within the string that it does not understand will terminate the conversion. If it is unable to extract an integer value from the string, it returns a result of zero (0). The atoi() function is declared in the stdlib.h library header file, so you should add the necessary #include directive. The amended program is shown below.
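
Because atoi() returns 0 both for the string "0" and for a string it cannot convert, it gives no way to detect bad input. The following sketch shows one alternative based on the standard strtol() function. The helper name parse_int and its return convention are assumptions made for this example:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts text to an int.
   Returns 1 and stores the result in *value on success.
   Returns 0, leaving *value unchanged, if text is not a complete
   integer or is out of range for an int. */
int parse_int(const char *text, int *value)
{
    char *end;
    long number;

    errno = 0;
    number = strtol(text, &end, 10);
    if (end == text)
        return 0;                /* no digits were found */
    if (*end != '\0')
        return 0;                /* unconverted characters follow the number */
    if (errno == ERANGE || number < INT_MIN || number > INT_MAX)
        return 0;                /* value does not fit in an int */
    *value = (int)number;
    return 1;
}
```

For comparison, atoi("12abc") returns 12 and atoi("abc") returns 0, whereas parse_int() rejects both strings.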

// Example program 2
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(void)
{
  int result;
  char num1[21], num2[21];
  strcpy( num1, "2" );
  strcpy( num2, "3" );
  /* convert each string to an int before adding */
  result = atoi(num1) + atoi(num2);
  printf( "Result of %s+%s=%d", num1, num2, result );
  getchar();  /* wait for the user to press ENTER */
  return 0;
}

The output from example program 2 is shown below.

The output from example program 2

A similar function, atof(), converts a string to a floating point number. It works in the same way, but returns a value of type double rather than int. To convert an integer or floating point number back to a string, we can again use sprintf(). The following examples illustrate this:

/* convert an integer value to a string */
int i;
char result[21];

i = 10;
sprintf( result, "%d", i );

/* convert a floating point value to a string */
float f;
char result[21];

f = 3.5;
sprintf( result, "%3.1f", f );
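
The two directions of conversion can be combined, as in the following sketch. The helper name add_number_strings is hypothetical, and snprintf() is used in place of sprintf() so that the size of the destination array is respected:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: adds the numbers held in two strings and writes
   the sum to dest with one digit after the decimal point.
   Returns 1 if the result fitted into dest, 0 otherwise. */
int add_number_strings(char *dest, size_t size, const char *a, const char *b)
{
    double sum = atof(a) + atof(b);                  /* strings to numbers */
    int needed = snprintf(dest, size, "%.1f", sum);  /* number to string */
    return needed >= 0 && (size_t)needed < size;
}
```

Called with the strings "2" and "1.5" and a 21-element destination, the helper stores the string "3.5" (assuming the default "C" locale, in which the decimal separator is a full stop).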


Searching a string for a specific character

The following short program demonstrates the use of the strchr() and strrchr() functions, which return pointers to the first occurrence of a character in a string and the last occurrence of a character in a string respectively.

// Example program 3
#include <stdio.h>
#include <string.h>

int main(void)
{
  char str[40] = "\"The boy stood on the burning deck.\"";
  char *charPtr;
  int n;

  printf("There are two occurrences of the letter \"b\" ");
  printf("in the following sentence: \n\n");
  printf("%s\n\n", str);
  charPtr = strchr(str, 'b');
  /* subtracting the start of the array gives the character's index */
  n = (int)(charPtr - str);
  printf("The first is at position %d.\n\n", n);
  charPtr = strrchr(str, 'b');
  n = (int)(charPtr - str);
  printf("The second is at position %d.", n);
  printf("\n\nPress ENTER to continue.");
  getchar();  /* wait for the user to press ENTER */
  return 0;
}

The output from example program 3 is shown below:

The output from example program 3
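
Both strchr() and strrchr() return NULL when the character is not found, so general-purpose code should check the result before using it. The following sketch shows two hypothetical helpers (index_of and count_char are not standard functions) that do so:

```c
#include <string.h>

/* Returns the index of the first occurrence of ch in str, or -1 */
int index_of(const char *str, int ch)
{
    const char *found = strchr(str, ch);
    if (found == NULL)
        return -1;                   /* ch does not occur in str */
    return (int)(found - str);
}

/* Counts the occurrences of ch in str by calling strchr() repeatedly */
int count_char(const char *str, int ch)
{
    int count = 0;
    const char *found;

    if (ch == '\0')
        return 0;                    /* the terminating null is not counted */
    found = strchr(str, ch);
    while (found != NULL)
    {
        count++;
        found = strchr(found + 1, ch);   /* resume after the last match */
    }
    return count;
}
```

Applied to the sentence used in example program 3 (including its quotation marks), index_of() returns 5 for the letter 'b' and count_char() returns 2.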


Assigning values to strings

The following short program demonstrates the use of both the strcpy() function and the strncpy() function, which are used to copy the contents of a string, or part of a string, into another string.

// Example program 4
#include <stdio.h>
#include <string.h>

int main(void)
{
  char str01[10] = "January";
  char str02[10] = "February";
  char str03[10] = "March";
  char longMon01[10], longMon02[10], longMon03[10];
  /* initialising with "" fills each array with null characters, so the
     3 characters copied by strncpy() below are followed by a null */
  char shortMon01[4] = "", shortMon02[4] = "", shortMon03[4] = "";

  strcpy(longMon01, str01);
  strcpy(longMon02, str02);
  strcpy(longMon03, str03);
  strncpy(shortMon01, str01, 3);
  strncpy(shortMon02, str02, 3);
  strncpy(shortMon03, str03, 3);
  printf("The first month of the year is %s\n", longMon01);
  printf("(this is often shortened to '%s').\n\n", shortMon01);
  printf("The second month of the year is %s\n", longMon02);
  printf("(this is often shortened to '%s').\n\n", shortMon02);
  printf("The third month of the year is %s\n", longMon03);
  printf("(this is often shortened to '%s').\n\n", shortMon03);
  printf("\n\nPress ENTER to continue.");
  getchar();  /* wait for the user to press ENTER */
  return 0;
}

The output from example program 4 is shown below:

The output from example program 4
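
One point to be aware of is that strncpy() does not append a null character when the source string is n or more characters long. The following sketch adds the terminator explicitly (copy_prefix is a hypothetical helper, and the caller is assumed to supply a destination with at least n+1 elements):

```c
#include <string.h>

/* Hypothetical helper: copies at most n characters of src into dest
   and always terminates the result.
   dest must have room for at least n + 1 elements. */
void copy_prefix(char *dest, const char *src, size_t n)
{
    strncpy(dest, src, n);   /* may leave dest without a terminating null */
    dest[n] = '\0';          /* so add one explicitly */
}
```

With this helper the destination array does not need to be initialised beforehand: copy_prefix(shortMon, "February", 3) stores "Feb" in a four-element array whatever the array held before.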


Finding a substring

The following short program demonstrates the use of the strstr() function, which searches for a specified substring within a string (note that the code must be entered by the user as upper case characters to be recognised by the program as a valid code).

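// Example program 5
#include <stdio.h>
#include <string.h>

int main(void)
{
  /* the list is 48 characters long, plus the terminating null */
  char strAirports[49] = "LHR, LGW, MAN, STN, BHX, GLA, EDI, LTN, BFS, BRS";
  char airportCode[4] = "";
  char *charPtr;
  int n, c;

  printf("Please enter a UK airport code: ");
  scanf("%3s", airportCode);  /* read no more than 3 characters */
  charPtr = strstr(strAirports, airportCode);
  if (charPtr == NULL)
  {
    printf("\n\nThat code is not one of the top 10 UK airports.");
  }
  else
  {
    n = (int)(charPtr - strAirports);
    n = n/5 + 1;  /* each entry occupies 5 characters: code, comma, space */
    printf("\n\nThat code is the number %d UK airport.", n);
  }
  printf("\n\nPress ENTER to continue.");
  while ((c = getchar()) != '\n' && c != EOF)
    ;  /* discard the rest of the input line */
  getchar();  /* wait for the user to press ENTER */
  return 0;
}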

The output from example program 5 is shown below:

The output from example program 5
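
Note that strstr() matches any fragment of the list, so input such as "HR" or "R, " would also be found. The following sketch shows one way to accept complete codes only, assuming the same list layout in which each entry occupies five characters. The function name airport_rank is hypothetical:

```c
#include <string.h>

static const char airports[] =
    "LHR, LGW, MAN, STN, BHX, GLA, EDI, LTN, BFS, BRS";

/* Hypothetical helper: returns the 1-based position of code in the
   list, or 0 if code is not a complete three-character entry. */
int airport_rank(const char *code)
{
    const char *found;

    if (strlen(code) != 3)
        return 0;                        /* too short or too long */
    found = strstr(airports, code);
    if (found == NULL)
        return 0;                        /* not in the list at all */
    if ((found - airports) % 5 != 0)
        return 0;                        /* match does not start an entry */
    return (int)((found - airports) / 5) + 1;
}
```

The length check rejects partial codes, and the remainder check rejects matches that straddle the comma and space between two entries.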


Character handling functions

The character handling functions described in the table below are declared in the ctype.h header file. With the exception of the toupper() and tolower() functions, all of the listed functions return zero (false) or non-zero (true) values.



Character Functions in <ctype.h>

Function     Declaration            Description
isalnum()    int isalnum(int c)     Returns a non-zero value if c is alphanumeric
isalpha()    int isalpha(int c)     Returns a non-zero value if c is alphabetic
iscntrl()    int iscntrl(int c)     Returns a non-zero value if c is a control character
isdigit()    int isdigit(int c)     Returns a non-zero value if c is a digit (0-9)
isgraph()    int isgraph(int c)     Returns a non-zero value if c is a printable character other than a space
islower()    int islower(int c)     Returns a non-zero value if c is a lower-case character (a-z)
isprint()    int isprint(int c)     Returns a non-zero value if c is a printable character (including a space)
ispunct()    int ispunct(int c)     Returns a non-zero value if c is a punctuation character
isspace()    int isspace(int c)     Returns a non-zero value if c is a space or one of the escape sequences '\f', '\n', '\r', '\t', or '\v'
isupper()    int isupper(int c)     Returns a non-zero value if c is an upper-case character (A-Z)
isxdigit()   int isxdigit(int c)    Returns a non-zero value if c is a hexadecimal digit (0-9, a-f, A-F)
tolower()    int tolower(int c)     Returns the lower-case version of c
toupper()    int toupper(int c)     Returns the upper-case version of c
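
These functions are most often applied to each character of a string in turn. The following sketch counts the digits in a string (count_digits is a hypothetical helper). The cast to unsigned char is needed because the functions expect a value in the range of unsigned char, or EOF, and a plain char may be negative:

```c
#include <ctype.h>

/* Hypothetical helper: returns the number of digits (0-9) in text */
int count_digits(const char *text)
{
    int count = 0;

    for (; *text != '\0'; text++)
    {
        if (isdigit((unsigned char)*text))
            count++;
    }
    return count;
}
```

For example, count_digits("SW1A 2AA") returns 2.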


The short program below illustrates the use of the character functions.

// Example program 6
#include <ctype.h>
#include <stdio.h>

int main(void)
{
  int charCode, c;
  const char *yes_no[] = {"Yes", "No"};

  printf("\n\nEnter a character code (0-127): ");
  /* the character functions must not be given values outside this range */
  if (scanf("%d", &charCode) != 1 || charCode < 0 || charCode > 127)
  {
    printf("\n\nThat is not a character code in the range 0-127.\n");
    return 1;
  }
  printf("\n\n");
  /* a result of 0 means false, so "== 0" selects index 1 ("No") */
  printf("isalnum : %s\n", yes_no[isalnum(charCode) == 0]);
  printf("isalpha : %s\n", yes_no[isalpha(charCode) == 0]);
  printf("iscntrl : %s\n", yes_no[iscntrl(charCode) == 0]);
  printf("isdigit : %s\n", yes_no[isdigit(charCode) == 0]);
  printf("isgraph : %s\n", yes_no[isgraph(charCode) == 0]);
  printf("islower : %s\n", yes_no[islower(charCode) == 0]);
  printf("isprint : %s\n", yes_no[isprint(charCode) == 0]);
  printf("ispunct : %s\n", yes_no[ispunct(charCode) == 0]);
  printf("isspace : %s\n", yes_no[isspace(charCode) == 0]);
  printf("isupper : %s\n", yes_no[isupper(charCode) == 0]);
  printf("isxdigit : %s\n\n", yes_no[isxdigit(charCode) == 0]);
  if (!isprint(charCode))
  {
    printf("The character cannot be printed.");
  }
  else if (isspace(charCode))
  {
    printf("The character is a space.");
  }
  else
  {
    printf("The character is \"%c\"\n\n", charCode);
  }
  printf("\n\nPress ENTER to continue.");
  while ((c = getchar()) != '\n' && c != EOF)
    ;  /* discard the rest of the input line */
  getchar();  /* wait for the user to press ENTER */
  return 0;
}

The output from example program 6 is shown below:

The output from example program 6

We used the strstr() function (above) to find an occurrence of a three character airport code. Unfortunately, the program fails to find the code if the user enters any or all of the code as lower case characters. The toupper() and tolower() functions can be used to convert characters to their upper or lower case representations (note that the functions only convert a character if it is not already of the required case). The following program is a revised version of the substring program we used to find an airport code. This version uses the toupper() function in a short loop construct to convert the user's input to upper case if necessary.

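// Example program 7
#include <ctype.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
  char strAirports[49] = "LHR, LGW, MAN, STN, BHX, GLA, EDI, LTN, BFS, BRS";
  char airportCode[4] = "";
  char *charPtr;
  int i, n, c;

  printf("Please enter a UK airport code: ");
  scanf("%3s", airportCode);  /* read no more than 3 characters */
  for(i=0; i<3; i++)
  {
    /* toupper() expects a value in the range of unsigned char */
    airportCode[i] = (char)toupper((unsigned char)airportCode[i]);
  }
  charPtr = strstr(strAirports, airportCode);
  if (charPtr == NULL)
  {
    printf("\n\nThat code is not one of the top 10 UK airports.");
  }
  else
  {
    n = (int)(charPtr - strAirports);
    n = n/5 + 1;  /* each entry occupies 5 characters: code, comma, space */
    printf("\n\nThat code is the number %d UK airport.", n);
  }
  printf("\n\nPress ENTER to continue.");
  while ((c = getchar()) != '\n' && c != EOF)
    ;  /* discard the rest of the input line */
  getchar();  /* wait for the user to press ENTER */
  return 0;
}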

The output from example program 7 is shown below:

The output from example program 7
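
The conversion loop can be generalised to strings of any length by stopping at the terminating null character rather than after a fixed number of characters. The following is a sketch only, and to_upper_string is a hypothetical name:

```c
#include <ctype.h>

/* Hypothetical helper: converts every character of text to upper case,
   in place. Characters with no upper-case version are left unchanged. */
void to_upper_string(char *text)
{
    for (; *text != '\0'; text++)
    {
        /* the cast keeps the argument within the range toupper() accepts */
        *text = (char)toupper((unsigned char)*text);
    }
}
```

Because the string is modified in place, the argument must be a modifiable array and not a string literal.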