Working with Strings

Overview

A PHP string is a sequence of zero or more printable characters (letters, numbers, punctuation marks etc.) that is typically enclosed within either single or double quotes. For example, we could write "Hello World!" or 'Hello World!', and both of these strings will be handled in the same way by PHP. There are circumstances, however, under which using double or single quotes will produce very different results, but we'll get to that later.

Never combine double and single quotes for the same string. Writing "Hello World!', for example, will result in an error message such as "Parse error: syntax error, unexpected end of file . . .". Keep in mind also that if a string contains a block of quoted text, either the quotes used for the quoted text should be of a different kind from the quotes used for the string containing that text, or they should be escaped. For example:

$str = "As the man walked past me, he said 'Good morning!'.";

Or alternatively:

$str = 'As the man walked past me, he said "Good morning!".';

Or (using escape sequences):

$str = "As the man walked past me, he said \"Good morning!\".";

If a string contains an apostrophe and you are using single quotes to enclose it, you will need to escape the apostrophe. For example:

$str = 'Well I\'ll go to the foot of our stairs!';

Another thing you should be aware of is that PHP natively only supports a 256-character set in which each character is represented by a single 8-bit byte. This is essentially the default character set for the HTML 4.0 standard which emerged towards the end of the 1990s, based on the ISO/IEC 8859 series of standards for 8-bit character encodings. Today, the character encoding scheme almost universally adopted for the web is UTF-8, which uses variable-length encoding.

In the UTF-8 standard, a character can be represented using from one up to four bytes, depending on the character. The first 128 characters in the UTF-8 encoding scheme, for example, are identical to the 128 characters of the ASCII encoding scheme, and require only one byte each to represent them. The letters of the Greek alphabet, on the other hand, each require two bytes to represent them.

We said earlier that a PHP string is a sequence of zero or more printable characters, but it would be more accurate to say that a PHP string is a sequence of zero or more bytes. This can lead to some interesting results when we try to use some of PHP's string handling functions with strings that contain UTF-8 characters that require more than one byte. Consider the following example:

<?php
$str = "Καλημέρα";
echo strlen($str);
// 16
?>

You can clearly see that the string variable $str contains the Greek word Καλημέρα, which means "Good morning" and has eight characters. However, the PHP strlen() function, which returns the length of any string variable passed to it as an argument, tells us that the string has a length of sixteen. The reason is that the strlen() function counts bytes, not characters, and each letter of the Greek alphabet is represented using two bytes in the UTF-8 character-encoding scheme.
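
To make the difference between bytes and characters concrete, here is a minimal sketch that counts both. The helper function countChars() is our own invention and is not part of PHP. It uses mb_strlen() if the Mbstring extension happens to be loaded, and otherwise falls back on preg_match_all() with the /u modifier, which treats the string as UTF-8. We are assuming here that the script file itself is saved as UTF-8.

```php
<?php
// Hypothetical helper: counts UTF-8 characters (code points) rather than bytes.
function countChars(string $text): int
{
    if (function_exists('mb_strlen')) {
        // Mbstring is available, so let it do the counting.
        return mb_strlen($text, 'UTF-8');
    }
    // Fallback: with the /u modifier, "." matches one whole UTF-8 character.
    return (int) preg_match_all('/./su', $text);
}

$str = "Καλημέρα";
echo strlen($str);      // 16 (bytes)
echo "<br>";
echo countChars($str);  // 8 (characters)
```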

In this article, we will be looking at a number of PHP's string handling functions. Keep in mind, however, that some of these functions are byte-oriented. They will accept UTF-8 strings without complaint, but because they count and compare bytes rather than characters, they don't always produce the expected results. Throughout the rest of this article, we will attempt to point out potential problems that could arise as a result, and what we can do to try and avoid such problems.

The maximum permitted length of a string variable (in bytes) depends on the platform. On a 32-bit build, for example, the maximum string length is two gigabytes (2,147,483,647 bytes). On 64-bit builds, as of PHP 7.0, there is no particular restriction on the length of a string other than the memory available to the script.

From a practical viewpoint, you are unlikely to ever encounter strings of such gargantuan proportions. A complete digital copy of Leo Tolstoy's "War and Peace", for example, would contain approximately 3.5 million bytes. We've actually worked out that entering a 2GB string on a keyboard, without a break, would take the average person approximately twenty-one years!

PHP provides numerous built-in functions for manipulating strings. Some of those functions will be featured in this article. A far more complete guide to PHP's standard string handling functions can be found on the PHP Group's website here. A guide to string handling functions for strings with multi-byte encoding can also be found on the PHP Group's website here.

Basic syntax

Much of the time we will be dealing with string literals. A string literal is a sequence of characters that will be stored in memory exactly as they are typed (or otherwise generated). The greeting "Hello World!" which features in so many beginner programming texts, for example, is a string literal.

We can surround a string literal with either double or single quotes and the result will be the same, whether we are saving the string to a storage medium of some kind or sending it to the screen or a printer. There are a few things to watch out for in terms of certain characters as we have seen already, such as when using an apostrophe in a singly-quoted string, but generally speaking string literals are simply chunks of text.

We mentioned earlier that, although enclosing strings in either double or single quotes will normally make no difference, there are several things we need to be aware of. The first thing you should be aware of is that double-quoted strings will expand both escaped characters and variable names, whereas singly-quoted strings won't. In the case of a singly-quoted string, the exceptions are the apostrophe and the backslash character. For example:

<?php
echo 'Here\'s a backslash: \\.<br>';
echo 'And here\'s a double quotation mark: \".';
// Here's a backslash: \.
// And here's a double quotation mark: \".
?>

As you can see, the first statement in our script produces the expected result, but when we attempt to escape the double quotation mark using an escape sequence (\"), PHP outputs the reverse solidus (backslash) as typed. The same thing will happen with other escaped characters, such as the dollar sign (\$). This is also the case for variable names. This is what happens if we try to use an embedded variable with single quotes:

<?php
$str = "Wednesday";
echo 'Today is $str.';
// Today is $str.
?>

If you need to embed escaped characters and variable names in a string, the easiest way to do so is to use double quotes to enclose the string. Let's try that again, but this time we'll use double quotes:

<?php
$str = "Wednesday";
echo "Today is $str.";
// Today is Wednesday.
?>

We can also embed elements of an array variable within a string. In the following example, we embed an element from an indexed array within a string:

<?php
$softDrinks = ["cola", "orange Juice", "lemonade"];
echo "I would like a $softDrinks[2] please.";
// I would like a lemonade please.
?>

We can do the same thing with associative array elements, as shown in the following example (note that the key of the array element must be used without quotes):

<?php
$rainbow = [
"R1"=>"red",
"R2"=>"orange",
"R3"=>"yellow",
"R4"=>"green",
"R5"=>"blue",
"R6"=>"indigo",
"R7"=>"violet"
];
echo "My favourite colour is $rainbow[R5].";
// My favourite colour is blue.
?>

If a string is enclosed in double quotes, the following escape sequences will be correctly interpreted:

Sequence    Output
\n          linefeed
\r          carriage return
\t          horizontal tab
\v          vertical tab
\e          escape
\f          form feed
\\          backslash
\$          dollar sign
\"          double-quote

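Characters can also be specified by their numeric code: a backslash followed by one to three octal digits (for example \101), \x followed by one or two hexadecimal digits (for example \x41) or, as of PHP 7.0, \u followed by a Unicode code point written in hexadecimal within curly braces (for example \u{41}). The PHP manual describes these forms using regular expression notation (we will be discussing the topic of regular expressions in another article). As with singly-quoted strings, if we attempt to escape any other characters, the backslash will be printed as well.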
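
The following short sketch shows a few of these sequences at work in double-quoted strings. The variable names are ours, chosen purely for illustration. It shows three ways of writing the letter "A" by its character code, a Unicode code point that occupies two bytes in UTF-8, and a sequence that PHP does not recognise.

```php
<?php
// Each of these double-quoted literals produces the capital letter "A".
$hex   = "\x41";     // hexadecimal notation
$octal = "\101";     // octal notation
$code  = "\u{41}";   // Unicode code point (PHP 7.0 and later)

echo $hex . $octal . $code;   // AAA
echo "<br>";

// A code point outside the ASCII range is stored as several UTF-8 bytes.
$alpha = "\u{0391}";          // Greek capital letter alpha
echo strlen($alpha);          // 2
echo "<br>";

// An unrecognised sequence such as \q keeps its backslash.
echo "\q";                    // \q
```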

The process of embedding escaped characters and variables into a string is formally referred to as interpolation. The basic syntax used to achieve interpolation is to use doubly-quoted strings, as we have described here. There is also an advanced syntax that involves the use of curly braces. This syntax allows us to do everything we can do with doubly-quoted strings, and a couple of things we can't do with doubly-quoted strings. We'll discuss the advanced syntax in more detail later in this article.

We'll also be discussing two additional syntactical forms for delimiting strings, namely heredoc, which literally means "here document", and nowdoc, which presumably means "now document". Both the heredoc syntax and the nowdoc syntax allow us to output a string spanning multiple lines exactly as it appears in the script.

The difference between heredoc and nowdoc is that heredoc allows us to interpolate strings in the same way we can with doubly-quoted strings, whereas nowdoc behaves like a singly-quoted string in this respect, and will not expand escaped characters or embedded variable names.

Strings as arrays

A string in PHP is represented internally as an array of bytes, together with an integer value that indicates the length of the string in bytes. This implementation means that we can access individual characters within a string by specifying the zero-based offset of the character we wish to access using square bracket notation, in much the same way we access an array element in an indexed array. For example:

<?php
$greeting = "Hello World!";
echo "The first character in \"$greeting\" is \"$greeting[0]\".";
// The first character in "Hello World!" is "H".
?>

There are, however, a few things to consider here. First of all, attempting to read an out-of-range offset will result in a warning ("Uninitialized string offset") rather than a fatal error, and the expression evaluates to an empty string. Attempting to write to an out-of-range offset, on the other hand, will pad the string with spaces. For example:

<?php
$str = "This is a string";
var_dump($str);
// string(16) "This is a string"
echo "<br>";
$offset = strlen($str) + 5;
$str[$offset] = "!";
var_dump($str);
// string(22) "This is a string !"
?>

As you can see from the var_dump() output, writing an exclamation mark to the offset specified by $offset, which is well beyond the end of the $str variable in its original form, has increased the length of the string from sixteen to twenty-two bytes by padding the string with spaces. We can only see the first of these spaces in the browser output because browsers collapse a run of white-space characters into a single space. Looking at the underlying page source gives us a better picture of what has happened:

[Figure: The string has been padded with spaces]

Note that attempting to write more than one character to a string offset will result in only the first character being written. The remaining characters will simply be discarded. Negative offsets are allowed, and specify the offset from the end of the string.
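
Here is a brief sketch of writing to a string by offset, including a negative offset. The variable name and the values are ours, chosen purely for illustration. Note that negative string offsets require PHP 7.1 or later.

```php
<?php
$word = "cat";

// Overwrite the first byte (offset 0).
$word[0] = "b";
echo $word;        // bat
echo "<br>";

// A negative offset counts back from the end of the string (PHP 7.1 and later).
$word[-1] = "g";
echo $word;        // bag
echo "<br>";
echo $word[-2];    // a
```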

Perhaps the most important thing to bear in mind if you are contemplating accessing individual characters within a string in this manner is that an offset refers to a byte, not a character. The results may therefore be unexpected unless every character in the string occupies a single byte, as is the case with a single-byte encoding scheme such as ISO-8859-1. A string literal is normally stored in whatever encoding the script file itself uses, so if the script is saved as UTF-8 (as most are today), any character outside the ASCII range will occupy two or more bytes.

The exception is when the Zend Multibyte feature (the zend.multibyte configuration directive, which is disabled by default) is enabled. In that case, the script may be written in some other declared or detected encoding, and string literals are converted to an internal encoding when the script is parsed. Either way, PHP's core string operations still work on bytes.

What you do need to be aware of is that some PHP string-handling functions do not work correctly with strings that contain multi-byte characters, as we saw above with the strlen() function. If in doubt, avoid using these functions and use the functions provided by the Mbstring extension, which are specifically designed to handle multi-byte strings.
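
The sketch below illustrates the problem, and one way around it that does not depend on any extension. We are assuming that the script file is saved as UTF-8. The preg_split() call with the /u modifier splits the string into whole characters, and mb_substr() is only called if the Mbstring extension is loaded.

```php
<?php
$word = "Καλημέρα";

// Square brackets return a single byte, which here is only half a character.
echo bin2hex($word[0]);   // ce
echo "<br>";

// Splitting on the empty pattern with the /u modifier yields whole characters.
$chars = preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY);
echo $chars[0];           // Κ
echo "<br>";

// If the Mbstring extension is loaded, mb_substr() does the same job directly.
if (function_exists('mb_substr')) {
    echo mb_substr($word, 0, 1, 'UTF-8');   // Κ
}
```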

Note that prior to PHP 7.4, it was possible to access individual characters in a string using either square brackets ([]), as we have in our examples, or curly braces ({}). You may encounter examples of curly braces being used in this way in older scripts or PHP tutorials. As of version 7.4, however, using the curly brace syntax was deprecated, and it was removed altogether in PHP 8.0, so avoid using it.

Advanced syntax

We have already mentioned that variables can be embedded within string literals when using double quotes. The variable will be evaluated, and its value inserted into the string in place of the variable name. There are some circumstances, however, when this will not work as expected. Consider the following example:

<?php
$guns = 21;
echo "The most common form of gun salute is the $guns-gun salute.";
// The most common form of gun salute is the 21-gun salute.
?>

This example works as expected. The variable $guns is evaluated and replaced in the echo statement by the string literal "21". Now consider this example:

<?php
$str = "Bon";
echo "Place names in the UK beginning with \"Bon\":<br><br>";
echo "$strawe<br>";
echo "$strby<br>";
echo "$strcath<br>";
echo "$strnybridge<br>";
echo "$strchurch<br>";
?>

This code doesn't work as expected at all. In fact, it generates a warning for each line in which we attempt to use the variable $str as a prefix. This is because we have effectively changed the variable name by adding additional characters to the end of it. For example, $str becomes $strawe, which will be flagged as an undefined variable. We have two options to rectify the problem. One way is to use the string concatenation operator. For example:

echo $str . "awe<br>";

Alternatively, we can use the advanced syntax, which involves enclosing the variable within curly braces:

echo "{$str}awe<br>";

The advanced syntax is sometimes the better option, especially when we are embedding multiple variables within a string. Here is a revised version of the previous example:

<?php
$str = "Bon";
echo "Place names in the UK beginning with \"Bon\":<br><br>";
echo "{$str}awe<br>";
echo "{$str}by<br>";
echo "{$str}cath<br>";
echo "{$str}nybridge<br>";
echo "{$str}church<br>";

// Place names in the UK beginning with "Bon":
//
// Bonawe
// Bonby
// Boncath
// Bonnybridge
// Bonchurch
?>

This time, the script works as expected. The advanced syntax also solves another problem in that, when using associative array elements within double-quoted strings, it is necessary to omit the quotes around the array key. The following doesn't work, even though we have used single quotes around the key instead of double quotes:

<?php
$alphabet = [
"a"=>"A",
"b"=>"B",
"c"=>"C"
];
echo "The first letter of the alphabet is $alphabet['a']";
// Parse error: syntax error, unexpected string content . . .
?>

In order to fix this, we have to lose the quotes surrounding the key altogether:

<?php
$alphabet = [
"a"=>"A",
"b"=>"B",
"c"=>"C"
];
echo "The first letter of the alphabet is $alphabet[a]";
// The first letter of the alphabet is A
?>

This works perfectly, but it means that we cannot use a defined constant as an array key or index, because it will not be evaluated as a constant when used in a double-quoted string. Instead, the name of the constant is treated as a string key. For example:

<?php
define("ALPHA", 0);
$alphaGR = ["Alpha", "Beta", "Gamma"];
echo "The first letter of the Greek alphabet is $alphaGR[ALPHA].";
// Warning: Undefined array key "ALPHA" . . .
?>

One solution is to use concatenation or output the array value using a separate echo statement. A perhaps more elegant solution is to use the advanced syntax, as shown here:

<?php
define("ALPHA", 0);
$alphaGR = ["Alpha", "Beta", "Gamma"];
echo "The first letter of the Greek alphabet is {$alphaGR[ALPHA]}.";
// The first letter of the Greek alphabet is Alpha.
?>
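
The curly brace syntax also lets us use quoted keys, nested arrays and object properties inside a double-quoted string. The following is a minimal sketch. The $book array and the $shelf object are hypothetical, and exist only to illustrate the syntax.

```php
<?php
// Hypothetical data, used only to illustrate the syntax.
$book = [
    "title"  => "Dracula",
    "author" => ["first" => "Bram", "last" => "Stoker"],
];

$shelf = new stdClass();
$shelf->label = "Gothic fiction";

// Quoted keys and nested arrays are allowed inside curly braces.
$line = "{$book['title']} was written by {$book['author']['first']} {$book['author']['last']}.";
echo $line;     // Dracula was written by Bram Stoker.
echo "<br>";

// Object properties can be embedded in the same way.
$note = "Shelved under {$shelf->label}.";
echo $note;     // Shelved under Gothic fiction.
```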

Heredoc

As well as the basic and advanced syntax for strings there are two additional syntactic forms known as heredoc and nowdoc. We're going to look first at heredoc. The heredoc syntax uses the <<< operator immediately followed by an identifier. The output string starts on the next line and can span multiple lines. It can contain embedded variables, characters that would otherwise need to be escaped, and single and double quotes in any order. The same identifier that was used to open the string must be used to close it. An example should clarify how the heredoc syntax works:

<?php
$str = <<<STRING
The boy stood on the burning deck<br>
Whence all but he had fled;<br>
The flame that lit the battle's wreck<br>
Shone round him o'er the dead.<br>
STRING;
echo $str;

// The boy stood on the burning deck
// Whence all but he had fled;
// The flame that lit the battle's wreck
// Shone round him o'er the dead.
?>

Note that if the output is intended for a web browser, the HTML break element (<br>) is still required at the end of each line of text. In fact, one of the applications for which the heredoc syntax is ideal is in creating HTML content. Consider the following example:

<?php
$topTen1975 = [
"\"Sister Golden Hair\" - America",
"\"Island Girl\" - Elton John",
"\"Love Will Keep us Together\" - Captain & Tennille",
"\"When Will I Be Loved\" - Linda Ronstadt",
"\"Fallin' in Love\" - Hamilton, Joe Frank & Reynolds",
"\"Bad Blood\" - Neil Sedaka",
"\"Philadelphia Freedom\" - Elton John Band",
"\"Sky High\" - Jigsaw",
"\"Jackie Blue\" - The Ozark Mountain Daredevils",
"\"Get Down Tonight\" - KC and The Sunshine Band"
];

$hitList = "";
$year = 1975;

foreach($topTen1975 as $hit){
$hitList .= "<li>$hit</li>";
}

$str = <<<HTML
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Top 10 Hits of $year</title>
</head>
<body>
<h1>Top 10 Hits of $year</h1>
<ul>
$hitList
</ul>
</body>
</html>
HTML;
echo $str;
?>

We can literally write an entire web page using the heredoc syntax and code it in exactly the same way we would do if we were using an HTML editor. The variable $str holds the code for the web page exactly as typed. And, because heredoc will expand variable names, we can use it to create HTML templates.

In the above example we have an array variable - $topTen1975 - that holds the titles of the top 10 chart hits of 1975, together with the artists who recorded them. We also have a variable called $year that is assigned the value 1975. The contents of the $topTen1975 array are formatted as HTML list elements and added to the string variable $hitList using a foreach loop.

Within the heredoc, the $year and $hitList variables act as placeholders for the year and the songs that were hits in that year. We could create similar array variables to hold lists of top ten hits from other years, and use the same heredoc HTML template to output a list of top ten hits from any of those years. Here is the output from our example:

[Figure: We can use heredoc to create HTML templates]

The heredoc syntax is equivalent to using double quotes except that it works for multiple lines of text and will preserve line breaks and indentation exactly as entered. We don't have to worry about escaping double or single quotes either. There are however a few rules that must be followed:

  • The heredoc string starts with the <<< operator, followed immediately by an identifier.
  • The identifier can be any string you like, but can only consist of alpha-numeric characters and underscores, and must begin with an underscore or a letter.
  • The heredoc string ends with the same identifier, which is placed on its own line immediately after the heredoc text and is normally followed by a semicolon.
  • The opening heredoc identifier cannot be followed by any whitespace characters.
  • Prior to PHP 7.3, the closing identifier could not be preceded by any whitespace characters. As of PHP 7.3 it may be indented with spaces or tabs, in which case the same indentation is removed from every line of the string.
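
From PHP 7.3 onward, the closing identifier may be indented, which makes it easier to use heredoc inside functions and other indented code. The following is a minimal sketch of this. The function renderCard() and the $user array are hypothetical names of our own. The heredoc also uses the curly brace syntax to embed array elements with quoted keys.

```php
<?php
// Hypothetical data, used only for illustration.
$user = ["name" => "Ada", "language" => "PHP"];

function renderCard(array $user): string
{
    // The closing identifier is indented by eight spaces, so eight spaces
    // are removed from the start of every line of the string (PHP 7.3+).
    return <<<CARD
        <h2>{$user['name']}</h2>
        <p>Favourite language: {$user['language']}</p>
        CARD;
}

echo renderCard($user);
// <h2>Ada</h2>
// <p>Favourite language: PHP</p>
```

Every line of the heredoc text must be indented at least as far as the closing identifier, otherwise PHP will report a parse error.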

Nowdoc

Like heredoc, the nowdoc syntax also uses the <<< operator immediately followed by an identifier. The difference here is that the identifier is enclosed within single quotes. As with heredoc, the output string starts on the next line and can span multiple lines. Embedded variables and escaped characters will not be evaluated, and will appear in the output exactly as typed. The same identifier that was used to open the string must be used to close it, but without the single quotes. An example should clarify how the nowdoc syntax works:

<?php
$str = <<<'STRING'
The boy stood on the burning deck<br>
Whence all but he had fled;<br>
The flame that lit the battle's wreck<br>
Shone round him o'er the dead.<br>
STRING;
echo $str;

// The boy stood on the burning deck
// Whence all but he had fled;
// The flame that lit the battle's wreck
// Shone round him o'er the dead.
?>

As you can see, the output is identical to that of our first heredoc example. In fact, the only difference in the script is the use of single quotes around the opening nowdoc identifier. The nowdoc syntax is equivalent to using single quotes, except that it works for multiple lines of text and will preserve line breaks and indentation exactly as entered. It can't be used for generating template documents in the same way as heredoc because it doesn't evaluate embedded variables, but other than that it behaves in a similar fashion to heredoc.

Nowdoc is useful if you want to output large chunks of text spanning multiple lines, exactly as written. Because it doesn't evaluate variable names or escape sequences, you can use it to output string literals that should not be evaluated, such as program code. For example:

<?php
$program = <<<'CODE'
// A first C Program.
#include <stdio.h>

int main(void)
{
    printf("Greetings, Maestro!");
    printf("\n\nPress ENTER to continue...");
    getchar();
    return 0;
}
CODE;

$progFile = fopen("cprog.c", "w") or die("Unable to open file!");
fwrite($progFile, $program);
fclose($progFile);
?>

In this script, we use nowdoc to create a string literal that contains the code for a simple C program and assign it to the variable $program. We then open a file called cprog.c, write the contents of the $program variable to the file, and close the file. If we subsequently open the file in a text editor we should see the following:

[Figure: We can use nowdoc to create multi-line string literals]

Apart from the single quotes around the opening identifier, the nowdoc syntax is identical to the heredoc syntax, and follows the same rules concerning naming conventions and the use of white space before or after the identifier. As with heredoc, we don't have to worry about escaping double or single quotes, and special characters like "$" and "\" are treated the same as any other character.
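
To see the difference side by side, the following sketch assigns exactly the same text to two variables, once using heredoc and once using nowdoc. The variable names are ours, chosen purely for illustration.

```php
<?php
$name = "World";   // hypothetical variable, used only for illustration

// Heredoc: the variable is expanded and \t becomes a tab character.
$interpolated = <<<TEXT
Hello $name!\tDone.
TEXT;

// Nowdoc: the text is kept exactly as typed.
$literal = <<<'TEXT'
Hello $name!\tDone.
TEXT;

echo strlen($interpolated);   // 18
echo "<br>";
echo strlen($literal);        // 19
```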

Updating strings

The value of a primitive PHP variable such as a string can be changed throughout the execution of a script. In the simplest case, we just replace one string value with another using the assignment operator. For example:

<?php
$greeting = "Hello World!";
echo "$greeting<br><br>";
echo "Or, for my friends in Greece:<br><br>";
$greeting = "Γεια σου Κόσμο!";
echo $greeting;

// Hello World!
//
// Or, for my friends in Greece:
//
// Γεια σου Κόσμο!
?>

Simply replacing the text of a string variable is a good way to re-use the same variable as circumstances change without having to replace whole sections of a script. A string variable can be used, for example, to hold different versions of a heading, a menu item, or an entire block of text, depending on the language chosen by the user. Other common use cases include storing text-based user input from online forms, displaying custom messages in a browser window, and storing current date and time information in a human-readable format.

String variables are often updated multiple times within a loop construct. In the article "Working with Arrays" in this section, we saw examples of the same string variable being updated repeatedly during the execution of a foreach loop. If you've read that article, the following example should be familiar to you:

<?php
$trees = ["Oak", "Ash", "Beech", "Elm", "Sycamore", "Cedar", "Larch", "Maple"];
$index = 0;
echo "Trees:<br>";
foreach($trees as $tree) {
$index += 1;
echo "<br>$index. $tree";
}
// Trees:
//
// 1. Oak
// 2. Ash
// 3. Beech
// 4. Elm
// 5. Sycamore
// 6. Cedar
// 7. Larch
// 8. Maple
?>

In this example, we create an array variable called $trees to hold the names of a number of species of tree. A string variable called $tree is declared in the foreach loop's header and is updated automatically on each iteration of the loop with the current array value, enabling us to display the names of all of the tree species stored in the array.

String concatenation

Often, instead of simply replacing the value of a string variable, we want to add a string literal to an existing string, or join two or more string variables together, to create a new string variable. The process of joining two or more strings in this manner is known as concatenation. In order to concatenate two strings, we use one of two string operators - the concatenation operator (.) or the concatenation assignment operator (.=).

We typically use the string concatenation operator to create a new string variable by joining two or more existing string variables and/or string literals. The string concatenation assignment operator is used to add one or more string variables or string literals to an existing string. The following example demonstrates the use of both operators:

<?php
$person = [
"firstName"=>"Bunsen",
"lastName"=>"Honeydew",
"address"=>"123 Sesame Street",
"town"=>"Muppetville"
];
$mailing = $person["firstName"] . " " . $person["lastName"] . "<br>";
$mailing .= $person["address"] . "<br>" . $person["town"];
echo $mailing;

// Bunsen Honeydew
// 123 Sesame Street
// Muppetville
?>

In this example we have an array $person that holds the name and address of a person (OK, in this case it's a muppet . . . ). We then declare a string variable called $mailing to hold a mailing list item. The value assigned to $mailing is a concatenation of $person["firstName"] (a string variable), the space character (a string literal), $person["lastName"] (another string variable), and "<br>" (another string literal).

In the next step, we concatenate the variable $person["address"], the string literal "<br>", and the variable $person["town"] to create the mailing address using the concatenation operator, and add the concatenation to $mailing using the concatenation assignment operator - all in one line of code.

We can of course break this kind of operation down into several lines of code for the sake of readability, depending on how many string variables and/or string literals we need to concatenate. For example, here is the same script again, this time spread over several lines:

<?php
$person = [
"firstName"=>"Bunsen",
"lastName"=>"Honeydew",
"address"=>"123 Sesame Street",
"town"=>"Muppetville"
];
$mailing = $person["firstName"];
$mailing .= " ";
$mailing .= $person["lastName"];
$mailing .= "<br>";
$mailing .= $person["address"];
$mailing .= "<br>";
$mailing .= $person["town"];
echo $mailing;

// Bunsen Honeydew
// 123 Sesame Street
// Muppetville
?>

This script produces exactly the same output as the previous script. It might be easier to read (although that is questionable), but it is also more code to write and probably not particularly efficient. On the other hand, we could achieve all of our concatenation in one line, as in the following script:

<?php
$person = [
"firstName"=>"Bunsen",
"lastName"=>"Honeydew",
"address"=>"123 Sesame Street",
"town"=>"Muppetville"
];
$mailing = $person["firstName"] . " " . $person["lastName"] . "<br>" . $person["address"] . "<br>" . $person["town"];
echo $mailing;

// Bunsen Honeydew
// 123 Sesame Street
// Muppetville
?>

This script produces exactly the same output as the previous two examples, with less code but at the cost of readability. We have one very long line of code that handles all of the concatenation, but looks cluttered and unwieldy. Ultimately, how you handle multiple concatenations of this nature is a case of using your judgement as a programmer in terms of balancing coding efficiency with writing code that is readable and more easily maintained.
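
One possible middle ground, offered here only as a sketch, is to keep the concatenation in a single statement but break that statement over several lines, with each line beginning with the concatenation operator. PHP ignores the line breaks between the operands, so the result is the same.

```php
<?php
$person = [
    "firstName" => "Bunsen",
    "lastName"  => "Honeydew",
    "address"   => "123 Sesame Street",
    "town"      => "Muppetville"
];

// One statement, with each line of the address on its own line of code.
$mailing = $person["firstName"] . " " . $person["lastName"] . "<br>"
    . $person["address"] . "<br>"
    . $person["town"];
echo $mailing;

// Bunsen Honeydew
// 123 Sesame Street
// Muppetville
```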

Converting variables to strings

Sometimes we want to convert a variable that is not a string into a string in order to display that variable in a specific format. Typical use cases for this type of conversion are when we want to display currency values, or numeric values with leading zeros.
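
As a quick illustration of these use cases, the following sketch uses three of PHP's built-in functions, number_format(), sprintf() and str_pad(), to produce formatted strings from numeric values. The variable names and values are ours, chosen purely for illustration.

```php
<?php
$price = 1234.5;
$ticket = 42;

// number_format() rounds to two decimal places and adds a thousands separator.
$strPrice = number_format($price, 2);
echo $strPrice;      // 1,234.50
echo "<br>";

// sprintf() with %05d pads an integer with leading zeros to five digits.
$strTicket = sprintf("%05d", $ticket);
echo $strTicket;     // 00042
echo "<br>";

// str_pad() does the same for a value that is already a string.
$strCode = str_pad("7", 3, "0", STR_PAD_LEFT);
echo $strCode;       // 007
```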

We don't usually need to worry about explicitly converting numeric values to string values when displaying them in their existing format on screen, because a program statement that is expected to produce text output will automatically convert variables to strings when required to do so. For example:

<?php
echo "The value of pi is " . M_PI . "<br><br>";
$pi = 3.1416;
echo "The value of pi to four decimal places is $pi";
// The value of pi is 3.1415926535898
//
// The value of pi to four decimal places is 3.1416
?>

In this example, the built-in PHP mathematical constant M_PI is converted automatically to the string value "3.1415926535898". Note that we need to concatenate the constant M_PI with the string literals on either side of it in the first echo statement, using the string concatenation operator, because constant names are not expanded inside a string. Had we written M_PI inside the quotes, it would have been output as literal text. We don't have that problem with the second echo statement, because the variable $pi will automatically be evaluated and displayed as a string.

If we need to explicitly convert a variable to a string, there are several methods we can use. The first method we will look at is type casting. In order to cast a variable to a string, we simply precede the variable name with (string), as demonstrated by the following example:

<?php
$euler = 2.718281828459045;
var_dump($euler);
echo "<br>";
$strEuler = (string)$euler;
var_dump($strEuler);

// float(2.718281828459045)
// string(14) "2.718281828459"
?>

We can achieve the same results with numeric variables using the PHP strval() function. For example:

<?php
$euler = 2.718281828459045;
var_dump($euler);
echo "<br>";
$strEuler = strval($euler);
var_dump($strEuler);

// float(2.718281828459045)
// string(14) "2.718281828459"
?>

In the above examples, we have used type casting or the strval() function to create new string variables from existing (non-string) variables without changing the type of the original variable. If we actually want to change the type of the original variable we can assign the result of the conversion back to that variable, like this:

<?php
$euler = 2.718281828459045;
var_dump($euler);
echo "<br>";
$euler = strval($euler);
var_dump($euler);

// float(2.718281828459045)
// string(14) "2.718281828459"
?>

We can also use the PHP function settype() to change the type of a variable directly. The function takes two arguments - the name of the variable to be converted, and the type we want to convert it to. Since we want to convert a non-string variable to a string variable, the second argument is going to be "string". The function returns a Boolean value indicating the success (TRUE) or failure (FALSE) of the conversion process. For example:

<?php
$euler = 2.718281828459045;
var_dump($euler);
echo "<br>";
$result = settype($euler, "string");
var_dump($result);
echo "<br>";
var_dump($euler);

// float(2.718281828459045)
// bool(true)
// string(14) "2.718281828459"  
?>

As we can see, the settype() function has successfully converted the variable $euler, which represents Euler's number, from a floating-point value to a string. One point to note here is that the precision of the string representation appears to have been reduced from sixteen significant digits to only thirteen significant digits.

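In fact, nothing has been lost from the floating-point value itself. A PHP floating-point value is typically stored as a 64-bit IEEE 754 double, which carries roughly fifteen to seventeen significant decimal digits. When a float is converted to a string, however, the number of significant digits is governed by the precision configuration directive, which defaults to fourteen. Here, the fourteenth significant digit happens to be a zero, which is dropped as a trailing zero, leaving thirteen. The var_dump() function shows more digits for the float because it uses the separate serialize_precision directive.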
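
The number of digits that survive the conversion of a float to a string is controlled by PHP's precision configuration directive. The following sketch sets the directive explicitly using ini_set() so that the effect can be seen. We are assuming PHP 7.1 or later, in which a value of -1 requests the shortest string that converts back to exactly the same float. Whether it is appropriate to change this setting in a real application is a separate question.

```php
<?php
$euler = 2.718281828459045;

// With the default setting of 14 significant digits, the last digits are lost.
ini_set('precision', '14');
$short = (string)$euler;
echo $short;    // 2.718281828459
echo "<br>";

// A setting of -1 (PHP 7.1 and later) asks for the shortest string that
// converts back to exactly the same float.
ini_set('precision', '-1');
$full = (string)$euler;
echo $full;     // 2.718281828459045
```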

The first two string-conversion methods shown above (type casting and the strval() function) are generally interchangeable and produce the same results. Some sources claim that using (string) is faster than using strval(), but the difference, if any, is probably negligible. Note that none of the methods described above can be used to convert an array to a string. For example:

<?php
$trees = [
"Oak",
"Ash",
"Beech",
"Birch",
"Elm",
"Cedar"
];
$strTrees = (string)$trees;
var_dump($strTrees);

// Warning: Array to string conversion . . .
// string(5) "Array"
?>

We see two things happening here. First, PHP issues a warning that we are attempting to convert an array variable to a string. Secondly, although the var_dump() function reveals that the variable $strTrees is indeed a string variable, the value it stores is simply "Array". In fact, whenever we try to convert an array variable to a string using (string), strval() or settype(), the resulting string value will always be "Array".

If we do need to convert an array variable to a string - for example to produce a more human-readable representation of the array's contents, or for the purposes of debugging - we can use PHP's implode() function to create a string that contains all of the elements of the array, separated by an appropriate string separator such as a comma followed by a space.

The implode() function takes two arguments - an optional string separator, and the name of an array variable. The return value is a string containing all of the elements in the array, separated by the string separator (if specified). If you have read the article "Working with Arrays" in this section, you might remember this example, in which an array of strings is converted to a single string variable:

<?php
$trees = ["Oak", "Ash", "Beech", "Elm", "Birch"];
$strTrees = implode(", ", $trees);
var_dump($strTrees);
// string(27) "Oak, Ash, Beech, Elm, Birch"
?>

The array elements can be strings, numeric values or Boolean values (or a mixture of these datatypes). Non-string array elements will be automatically converted to their string representation. Here is another example taken from the article "Working with Arrays" in which an array of integer values is converted to a string variable:

<?php
$primesTo50 = [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47];
$strPrimesTo50 = implode(",", $primesTo50);
var_dump($strPrimesTo50);
// string(40) "2,3,5,7,11,13,17,19,23,29,31,37,41,43,47"
?>

One use case for the implode() function is in the creation of comma-separated value (CSV) files - plain text files containing a list of comma-separated values. Such files are typically used to store tabular data, or to transfer data between spreadsheet and database applications.
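As a rough illustration of the CSV use case, the sketch below builds the lines of a small CSV document with implode(). The $rows data is made up for the example, and the approach assumes that no value contains a comma, a double quote or a line break; for real-world data, PHP's fputcsv() function takes care of quoting.

```php
<?php
// Hypothetical tabular data: a header row followed by two records.
$rows = [
  ["Tree", "Type", "Max height (m)"],
  ["Oak", "Deciduous", 40],
  ["Cedar", "Evergreen", 35]
];

$lines = [];
foreach ($rows as $row) {
  // Join the fields of each row with commas.
  $lines[] = implode(",", $row);
}

// Join the rows with newline characters to form the file contents.
$csv = implode("\n", $lines);
echo "<pre>$csv</pre>";

// Tree,Type,Max height (m)
// Oak,Deciduous,40
// Cedar,Evergreen,35
?>
```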

String length and word count

It is sometimes the case that we need to retrieve the length of a string, or the number of words a string contains. There are many reasons why we might want to find the length of a string. The input fields in online forms, for example, often impose a limit on the number of characters a user can enter, and requirements for user passwords often specify a minimum and maximum number of characters. Validating user input will often involve finding out how many characters a user has entered.

Word count is also important in certain situations. Journalists writing an article or students writing an essay, for example, will often have very specific minimum and maximum requirements in terms of the number of words they are expected to submit. Word count is used in areas like text analysis, where it is a standard metric for analysing the complexity of a text, or for measuring its overall readability. It can also be used as a metric for calculating the cost of translation services, or commercial content creation.

The standard PHP function for calculating the length of a string is strlen(). This function takes the name of a string variable as its argument, and returns the length of the string in bytes as an integer value. For example:

<?php
$str = "The boy stood on the burning deck";
$len = strlen($str);
echo "\"$str\" has $len characters.";
// "The boy stood on the burning deck" has 33 characters.
?>

In this case, the strlen() function correctly returns the number of characters in the string variable $str. We should however insert a timely reminder here that, as we mentioned earlier in this article, the number of bytes in a UTF-8 string can often significantly exceed the number of characters, because UTF-8 is a variable length encoding scheme. In the above example, every character is a plain ASCII character, which UTF-8 encodes as a single byte, so the byte count and the character count happen to be the same.

If you want to know the number of bytes in a string, strlen() will do the job. However, it is more often the case that we want to know the number of characters, in which case it would be safer to use the mb_strlen() function which returns the correct number of characters in a string regardless of the number of bytes used to represent each character.

The mb_strlen() function takes two arguments - the name of a string variable and an optional encoding scheme argument. If the encoding scheme is not specified, or given as NULL, PHP's internal encoding scheme is used (this will normally be UTF-8). The return value is the number of characters in the string passed to mb_strlen(). Consider the following example:

<?php
$str = "Καλημέρα";
$len = strlen($str);
echo "The strlen() function returns a string length of $len for \"$str\".<br><br>";
$len = mb_strlen($str);
echo "The mb_strlen() function returns a string length of $len for \"$str\".<br>";
// The strlen() function returns a string length of 16 for "Καλημέρα".
//
// The mb_strlen() function returns a string length of 8 for "Καλημέρα".
?>

As you can see, the mb_strlen() function correctly returns the number of characters in the word "Καλημέρα" (the Greek word for "Good morning"). It is rarely necessary to use the optional encoding argument with mb_strlen() because in most cases PHP will use UTF-8 by default.
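For completeness, here is a short sketch of the optional encoding argument. It assumes that the mbstring extension is enabled and that the script file itself is saved as UTF-8; "8bit" is an mbstring encoding name that treats every byte as one character:

```php
<?php
$str = "Καλημέρα";
$chars = mb_strlen($str, "UTF-8"); // counts UTF-8 characters
$bytes = mb_strlen($str, "8bit");  // counts single bytes, like strlen()
echo "Characters: $chars, bytes: $bytes";
// Characters: 8, bytes: 16
?>
```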

To count the number of words in a string, we use the str_word_count() function. This function also takes the name of a string variable as its first argument, and has two additional arguments, both optional. The first optional argument is an integer that specifies what kind of value will be returned by the function, as briefly described below:

0 - returns the number of words found
1 - returns an indexed array containing the words found in the string
2 - returns an associative array of key-value pairs in which:
      key is the numeric position of the word in the string
      value is the actual word itself

The second optional argument is a list of additional characters that will be treated as part of a word for the purposes of the word count (by default, only letters, apostrophes and hyphens are). This might be useful when carrying out a word count on a scientific or mathematical treatise, for example, where certain characters and symbols can have great significance.
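The sketch below illustrates both optional arguments. The sample strings are our own, and the digits passed as the third argument are simply one possible choice of extra word characters:

```php
<?php
$str = "The boy stood";

// Format 1: an indexed array of the words found.
$words = str_word_count($str, 1);
echo implode(" | ", $words) . "<br>";
// The | boy | stood

// Format 2: the keys are the positions at which each word starts.
$positions = str_word_count($str, 2);
echo implode(", ", array_keys($positions)) . "<br>";
// 0, 4, 8

// By default a digit ends a word, so "R2D2" is counted as "R" and "D".
$droids = "R2D2 and C3PO";
$default = str_word_count($droids);
echo "$default<br>";
// 5

// Treating digits as word characters gives the count we expect.
$withDigits = str_word_count($droids, 0, "0123456789");
echo $withDigits;
// 3
?>
```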

In most cases, we will probably only be interested in getting a basic word count for some body of text, in which case we can omit the optional parameters. Here is an example of how we might use the str_word_count() function:

<?php
$str = <<<STRING
The boy stood on the burning deck<br>
Whence all but he had fled;<br>
The flame that lit the battle's wreck<br>
Shone round him o'er the dead.<br>
STRING;
$wcount = str_word_count($str);
echo $str;
echo "(word count: $wcount)";

// The boy stood on the burning deck
// Whence all but he had fled;
// The flame that lit the battle's wreck
// Shone round him o'er the dead.
// (word count: 30)
?>

You might have noticed that the str_word_count() function has returned a word count of 30, whereas there are only 26 words in the actual text. The apparent anomaly is because we have formatted the string for output in a browser window using HTML markup elements - namely a line break element (<br>) at the end of each line. These markup elements are also counted as words, making the word count incorrect (unless we actually want to count the HTML elements as well, of course). We could instead do something like this:

<?php
$str = <<<STRING
The boy stood on the burning deck
Whence all but he had fled;
The flame that lit the battle's wreck
Shone round him o'er the dead.
STRING;
$wcount = str_word_count($str);
echo "<pre>$str</pre>";
echo "(word count: $wcount)";

// The boy stood on the burning deck
// Whence all but he had fled;
// The flame that lit the battle's wreck
// Shone round him o'er the dead.
//
// (word count: 26)
?>

Here, we have used the HTML <pre> ... </pre> element (HTML's preformatted text element) to output the text exactly as it appears within the heredoc statement. This time, the str_word_count() function returns the correct number of words.

Searching for substrings

A substring is a portion of a string consisting of a contiguous sequence of characters within the string. For example, "the burning deck" is a substring of "The boy stood on the burning deck". There are a number of ways in which we can search for and manipulate substrings, and many reasons for doing so. Search engines like Google, for example, will look for substrings in an online document that match a search term entered by a user.

The number of times a substring is found within a text can also be an important factor in determining how relevant the text is in relation to a specific topic, or whether a particular word or phrase has been overused. AI writing assistants like Grammarly and ProWritingAid, for example, will search for repetition in a text and recommend possible changes, such as removing or replacing some instances of the repeated word or phrase.

We can also search for substrings within a database in order to retrieve only those records in which a particular substring is found, or to determine how many records match a particular criterion. Most modern business applications provide a search and replace function that allows the user to rapidly find and replace instances of a particular word or phrase in a document.

The first substring-related function we're going to look at here is the strpos() function. This function searches for the first instance of a specific substring within a text. It takes three arguments, the first of which is the string to be searched - either a string literal or the name of a string variable. The second argument is the substring being searched for - also either a string literal or the name of a string variable.

The third (optional) argument is an integer offset. If specified, and if the number is positive, the search starts this number of characters from the beginning of the string. For a negative offset, the search starts this number of characters from the end of the string. If a match is found, the strpos() function returns the position of the first character in the substring. If no match is found, strpos() returns FALSE. For example:

<?php
$str = "The boy stood on the burning deck";
$pos = strpos($str, "burn");
if($pos === FALSE) {
echo "The substring was not found.";
}
else {
echo "The substring was found at position $pos.";
}
// The substring was found at position 21.
?>

Note that functions such as strpos() essentially see a string as a zero-indexed array of characters. If strpos() finds an instance of a substring within a string, the value it returns is the array index within the string of the first character of the substring. In our example, strpos() finds the substring "burn" in the string variable $str and returns a value of 21. The first character in the substring is therefore the 22nd character in the string.
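The optional offset argument makes it possible to find every occurrence of a substring, not just the first. The following is a minimal sketch of that idea; the variable names are our own:

```php
<?php
$str = "To be, or not to be, that is the question:";
$sub = "be";
$positions = [];
$offset = 0;

// Keep searching, each time starting just past the previous match.
while (($pos = strpos($str, $sub, $offset)) !== FALSE) {
  $positions[] = $pos;
  $offset = $pos + strlen($sub);
}

echo "Found at: " . implode(", ", $positions);
// Found at: 3, 17
?>
```

The strict comparison (!== FALSE) matters here, because a match at position 0 would otherwise be treated as "not found".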

One drawback to using strpos() to search for a substring arises when we want to perform a case-insensitive search. The strpos() function is case sensitive, so the word "burn" is not seen as being the same as "BURN" or "Burn". To search for a substring regardless of case, we can use the stripos() function, which is case insensitive. For example:

<?php
$str = "The boy stood on the burning deck";
$pos = stripos($str, "BURN");
if($pos === FALSE) {
echo "The substring was not found.";
}
else {
echo "The substring was found at position $pos.";
}
// The substring was found at position 21.
?>

Another drawback to using strpos() is that it is not multibyte-aware. It works with bytes rather than characters, so when it is used with strings containing multi-byte characters, the value it returns is a byte offset, which will not necessarily be an accurate indication of the substring's character position within the string. For example:

<?php
$str = "Καλημέρα";
$pos = strpos($str, "ημέ");
echo $pos;
// 6
?>

In this example, we search for the substring "ημέ" in the Greek word "Καλημέρα", which means "Good morning", and strpos() returns 6. This would imply that the first character in the substring is the seventh character in the string, when it is in fact the fourth; each of the three Greek letters that precede it occupies two bytes in UTF-8. If we want to work with character positions in a multi-byte character encoding such as UTF-8, we should use the multibyte-aware mb_ functions. In this case, for example, we could use mb_strpos(), as follows:

<?php
$str = "Καλημέρα";
$pos = mb_strpos($str, "ημέ");
echo $pos;
// 3
?>

If we just want to know whether or not a substring exists within a string, we can use the (binary-safe) str_contains() function. This function takes two arguments. The first argument is the string to be searched - either a string literal or the name of a string variable - and the second argument is the substring being searched for - also either a string literal or the name of a string variable. The return value is TRUE if the substring is found and FALSE otherwise. For example:

<?php
$str = "To be, or not to be, that is the question:";
if(str_contains($str, "quest")) {
echo "The substring was found.";
}
else {
echo "The substring was not found.";
}
// The substring was found.
?>

Unfortunately, the str_contains() function is case-sensitive, and there is no case-insensitive equivalent function at the time of writing. If we want to perform a case-insensitive search for a substring, we'll have to convert the arguments to all upper-case or all lower-case. For example:

<?php
$str = "To be, or not to be, that is the question:";
$sub = "QUEST";
if(str_contains(strtolower($str), strtolower($sub))) {
echo "The substring was found.";
}
else {
echo "The substring was not found.";
}
// The substring was found.
?>
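An alternative that avoids converting both strings is to test the result of stripos() against FALSE. The helper function below is a hypothetical wrapper of our own (PHP has no built-in function of this name); note the strict comparison, which is needed because a match at position 0 would otherwise be mistaken for FALSE:

```php
<?php
// Hypothetical helper: a case-insensitive version of str_contains().
function containsIgnoreCase(string $haystack, string $needle): bool {
  return stripos($haystack, $needle) !== FALSE;
}

$str = "To be, or not to be, that is the question:";
var_dump(containsIgnoreCase($str, "QUEST"));
echo "<br>";
var_dump(containsIgnoreCase($str, "to be"));
echo "<br>";
var_dump(containsIgnoreCase($str, "answer"));

// bool(true)
// bool(true)
// bool(false)
?>
```

For text containing non-ASCII letters, mb_stripos() could be substituted for stripos().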

Sometimes we want to know, not just the position of a substring, or whether a substring exists within a string, but whether that substring occupies a special position within the string, such as at the very beginning of the string or at the very end. PHP gives us two functions that allow us to do just this - str_starts_with(), and str_ends_with().

The names of these functions are fairly self-explanatory. Both take two arguments. The first argument is the string to be searched, and the second is the substring being searched for. Both arguments can be either a string literal or the name of a string variable. The return value in each case is TRUE if the substring is found at the specified position, and FALSE otherwise.

The difference between these two functions is where the substring is expected to be found. For str_starts_with(), the substring must be found, in its entirety, at the beginning of the string to be searched. For example:

<?php
$lincoln = <<<LINCOLN
Four score and seven years ago our fathers brought forth on this continent,
a new nation, conceived in Liberty, and dedicated to the proposition that
all men are created equal.

Now we are engaged in a great civil war, testing whether that nation,
or any nation so conceived and so dedicated, can long endure.
LINCOLN;
$searchTerm = "Four score";
if(str_starts_with($lincoln, $searchTerm)) {
echo "The string starts with \"$searchTerm\".";
}
else {
echo "The substring was not found.";
}
// The string starts with "Four score".
?>

For str_ends_with(), the substring must be found in its entirety at the end of the string to be searched. For example:

<?php
$lincoln = <<<LINCOLN
Four score and seven years ago our fathers brought forth on this continent,
a new nation, conceived in Liberty, and dedicated to the proposition that
all men are created equal.

Now we are engaged in a great civil war, testing whether that nation,
or any nation so conceived and so dedicated, can long endure.
LINCOLN;
$searchTerm = "can long endure.";
if(str_ends_with($lincoln, $searchTerm)) {
echo "The string ends with \"$searchTerm\".";
}
else {
echo "The substring was not found.";
}
// The string ends with "can long endure.".
?>

Both str_starts_with() and str_ends_with() are binary safe, but they are also both case-sensitive. If we want to perform a case-insensitive search using either of these functions, we will need to convert the arguments to all upper-case or all lower-case (we leave it to the reader to experiment further with these functions).
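As a starting point for that experimentation, here is a minimal sketch of the lower-casing approach. The function names startsWithIgnoreCase() and endsWithIgnoreCase() are our own inventions, not part of PHP:

```php
<?php
// Hypothetical helpers built on str_starts_with() and str_ends_with().
function startsWithIgnoreCase(string $str, string $sub): bool {
  return str_starts_with(strtolower($str), strtolower($sub));
}

function endsWithIgnoreCase(string $str, string $sub): bool {
  return str_ends_with(strtolower($str), strtolower($sub));
}

$str = "Four score and seven years ago";
var_dump(startsWithIgnoreCase($str, "FOUR SCORE"));
echo "<br>";
var_dump(endsWithIgnoreCase($str, "Years Ago"));
echo "<br>";
var_dump(startsWithIgnoreCase($str, "score"));

// bool(true)
// bool(true)
// bool(false)
?>
```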

Sometimes it is useful to be able to find out just how many times a particular word or phrase occurs within a text. PHP provides a function that allows us to do just that, namely substr_count(). This function takes four arguments. The first argument is the string to be searched - either the name of a string variable or a string literal. The second argument is the substring we are looking for - again, either the name of a string variable or a string literal.

The third (optional) argument is an integer offset. If specified, and if the number is positive, the count starts this number of characters from the beginning of the string. For a negative offset, the count starts this number of characters from the end of the string. If omitted, the offset defaults to 0.

The fourth (also optional) argument is an integer length argument that specifies the maximum length, following the specified offset, over which to continue the count. A negative value for length is counted from the end of the string to be searched. If the offset plus the length would take the count past the end of the string, a warning message will be generated (in PHP 8, a ValueError is thrown instead). If omitted, this argument defaults to NULL. Here is an example of how we might use substr_count():

<?php
$dreams = <<<DREAMS
Dreams - Langston Hughes

Hold fast to dreams
For if dreams die
Life is a broken-winged bird
That cannot fly.
Hold fast to dreams
For when dreams go
Life is a barren field
Frozen with snow.
DREAMS;
$searchTerm = "dreams";
$found = substr_count(strtolower($dreams), strtolower($searchTerm));
if($found > 0) {
echo "The search term was found $found times.";
}
else {
echo "The search term was not found.";
}
// The search term was found 5 times.
?>

The substr_count() function appears to be binary safe (although this is not explicitly stated in the official PHP documentation), but like most of the other substring-related functions we have seen, it is case-sensitive. If we wish to carry out a case-insensitive substring count, we need to convert the two mandatory arguments to all upper-case or all lower-case, as we have in the above example.
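The optional offset and length arguments restrict the count to part of the string. The sketch below uses a made-up string to show their effect; it also shows that substr_count() does not count overlapping matches:

```php
<?php
$str = "hello hello hello";

$all      = substr_count($str, "hello");       // whole string
$fromSix  = substr_count($str, "hello", 6);    // searches "hello hello"
$window   = substr_count($str, "hello", 3, 8); // searches "lo hello"
$lastFive = substr_count($str, "hello", -5);   // searches the final "hello"
$overlap  = substr_count("aaaa", "aa");        // matches do not overlap

echo "$all, $fromSix, $window, $lastFive, $overlap";
// 3, 2, 1, 1, 2
?>
```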

Extracting a substring

Being able to extract a substring from a string is useful when we are only interested in a specific part of the string. We might want to extract the username from an email address, or the domain name from a URL, for example. PHP provides several functions that allow us to extract a substring from a string. The first one we will look at is the substr() function.

The substr() function takes three arguments, the first of which is the name of the string variable to be searched. The second argument is offset which specifies the position at which the substring we wish to extract starts, counting from zero. If offset is positive, counting starts from the beginning of the string. If offset is negative, counting starts from the end of the string. If the string is less than offset characters long, an empty string is returned.

The third (optional) argument is length. If length is specified and is positive, the string returned will contain a maximum of length characters, starting from the offset position. If length is negative, then length characters are omitted from the end of the string.

If offset coincides with or overlaps the start of this truncation, an empty string is returned. If length is specified as 0, an empty string is returned. If length is omitted (or NULL), the substring returned will be the rest of the string, starting from offset. Consider the following script:

<?php
$email = "webmaster@technologyuk.net";
$at = strpos($email, "@");
$username = substr($email, 0, $at);
$domain = substr($email, ++$at);
echo "For email address \"$email\":<br><br>";
echo "The username is \"$username\".<br>";
echo "The domain name is \"$domain\".";

// For email address "webmaster@technologyuk.net":
//
// The username is "webmaster".
// The domain name is "technologyuk.net".
?>

In this example, we extract the username and the domain name from an email address. We call the strpos() function to determine the position of the "@" symbol in $email and assign that value to $at. We invoke the substr() function once to get the username, setting the offset argument to 0 and the length argument to the value of $at.

We invoke the substr() function a second time to get the domain name, this time using the value of $at incremented by one to get our offset. The length argument can be omitted this time because we just want the remainder of the string after the offset.
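Negative values for offset and length are easiest to understand by example. The following sketch reuses the same email address; the variable names are our own:

```php
<?php
$email = "webmaster@technologyuk.net";

// A negative offset counts back from the end of the string.
$tld = substr($email, -3);

// A negative length omits that many characters from the end.
$noSuffix = substr($email, 0, -4);

// The two can be combined: start 16 characters from the end
// and take the next 12 characters.
$name = substr($email, -16, 12);

echo "$tld<br>$noSuffix<br>$name";

// net
// webmaster@technologyuk
// technologyuk
?>
```

Like strpos(), substr() counts bytes. For multi-byte text, mb_substr() accepts the same offset and length arguments but counts characters instead.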

Replacing text in a string

We saw earlier in this article how to replace the entire contents of a string variable using the assignment operator. Sometimes, though, we only want to replace part of a string with some different text, basically replacing one substring with another. One common use case for text replacement occurs in content management systems, where placeholders are used in templates and later replaced with meaningful data in order to generate dynamic content.

In fact, a search and replace function is a common feature of most business applications, including word processing software, text and code editors, spreadsheets, accountancy software, and e-commerce platforms. It enables the user to quickly find and correct errors, or replace outdated information.

The function provided by PHP for finding and replacing substrings in a text is str_replace(). This function accepts four parameters. The first parameter is the search term, i.e. the word or phrase being searched for. The second parameter is the replacement term, i.e. the word or phrase that should replace the search term. Both of these arguments can be either string literals or the name of a string variable.

The third argument is the name of the string variable that is the subject of the search and replace operation. All occurrences of the search term found in the string will be replaced by the replacement text. The value returned by str_replace() is a new string that reflects the changes made by the search and replace operation. Unless we assign the return value of the str_replace() function to it, the original string variable will not be changed.

The fourth (optional) argument to the str_replace() function, if included, is the name of a variable that will be assigned an integer value, depending on how many instances of the search term have been replaced. This is useful if you want to know how many replacements were made. The following example demonstrates how we might use str_replace():

<?php
$aftAddr = <<<ADDR
An "afternoon address" refers to the
appropriate greeting to use in the
afternoon, such as "Good afternoon"
or "Good afternoon, [Name]".
It can also refer to the physical
address where an event or activity
taking place in the afternoon is
located.
ADDR;

$eveAddr = str_replace("afternoon", "evening", $aftAddr, $count);

echo "Revised text:<br>";
echo "<pre>$eveAddr</pre>";
echo "Number of replacements made: $count.";

// Revised text:
//
// An "evening address" refers to the
// appropriate greeting to use in the
// evening, such as "Good evening"
// or "Good evening, [Name]".
// It can also refer to the physical
// address where an event or activity
// taking place in the evening is
// located.
//
// Number of replacements made: 5.
?>

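We can also use str_replace() with arrays as the first and second arguments. This enables us to replace multiple substrings in a text with a single function call. Each element in the array passed to str_replace() as the search term is replaced by the corresponding element in the array holding the replacement terms. The replacements are carried out one after another, in array order, so a later search term can match text inserted by an earlier replacement. If the replacement array has fewer elements than the search array, the remaining search terms are replaced with an empty string, so it is usually best to give both arrays the same number of elements. Here is an example: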

<?php
$movies012025 = [
"Mufasa: The Lion King", "Sonic", "Mona",
"Nosferatu", "Wicked"
];

$movies022025 = [
"Captain America: Brave New World", "Dog Man",
"Heart Eyes", "Paddington in Peru",
"Mufasa: The Lion King"
];

$movieMonth = <<<MOV
Top five movies in January 2025:

1. Mufasa: The Lion King
2. Sonic
3. Mona
4. Nosferatu
5. Wicked
MOV;

echo "<pre>$movieMonth</pre>";

$movieMonth = str_replace("January 2025", "February 2025", $movieMonth);
$movieMonth = str_replace($movies012025, $movies022025, $movieMonth);

echo "<pre>$movieMonth</pre>";

// Top five movies in January 2025:
//
// 1. Mufasa: The Lion King
// 2. Sonic
// 3. Mona
// 4. Nosferatu
// 5. Wicked
//
// Top five movies in February 2025:
//
// 1. Captain America: Brave New World
// 2. Dog Man
// 3. Heart Eyes
// 4. Paddington in Peru
// 5. Mufasa: The Lion King
?>

We could also replace multiple search terms with the same value by using an array as the first argument and a string literal or the name of a string variable as the second argument. Note that, as with many of the other string related functions we have looked at, str_replace() is case sensitive. To carry out a case-insensitive search and replace, use str_ireplace().
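The sketch below illustrates these variations using short made-up strings. It also shows one thing to watch out for when passing arrays: str_replace() applies the replacement pairs one after another, so text inserted by an earlier pair can be changed again by a later one. The strtr() function, which is not otherwise covered in this article, is included only as a point of comparison:

```php
<?php
// Several search terms, one replacement: strip all lower-case vowels.
$noVowels = str_replace(["a", "e", "i", "o", "u"], "", "Hello World");
echo $noVowels . "<br>";
// Hll Wrld

// Case-insensitive replacement with str_ireplace().
$greeting = str_ireplace("WORLD", "there", "Hello World");
echo $greeting . "<br>";
// Hello there

// Pairs are applied in order, so the "b" produced by the first
// pair is itself replaced by the second pair.
$sequential = str_replace(["a", "b"], ["b", "c"], "ab");
echo $sequential . "<br>";
// cc

// strtr() with an array of pairs does not rescan replaced text.
$singlePass = strtr("ab", ["a" => "b", "b" => "c"]);
echo $singlePass;
// bc
?>
```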

Reversing a string

It may be useful in certain situations to be able to reverse a string so that the last character is first and the first character is last. We have to confess that in researching use cases for reversing a string, we found very few real-world examples of where being able to reverse a string would be required, although it does seem to come up frequently at job interviews - probably to test the interviewee's problem-solving skills.

One suggested use case we came across was for sorting domain names or email addresses into groups based on the domain name suffix, i.e. .com, .net, or whatever. The basic idea is that we reverse all the domain names or email addresses and then extract the entries for each top-level domain into its own group - probably using a separate array for each top-level domain.

Fortunately, we don't need to write an algorithm to reverse a string because PHP provides a function that will do this for us, namely strrev(). This function takes a single argument, which is the string to be reversed - either a string literal or the name of a string variable. The return value is a new string in which all of the characters of the original string are in the reverse order. Let's look at an example of how we might use this:

<?php
$email = "webmaster@technologyuk.net";
$emailRev = strrev($email);
$pos = strpos($emailRev, ".");
$topLevelDomain = strrev(substr($emailRev, 0, $pos));
echo "The top level domain is \"$topLevelDomain\".";

// The top level domain is "net".
?>

Here, we have used three of PHP's built-in string functions to extract the name of the top-level domain in an email address. First, we declare a variable called $email and assign it a valid email address string. We then make our first use of the strrev() function to reverse the email address string and assign the result to the variable $emailRev.

The next step is to find the position of the first instance of a period in $emailRev using the strpos() function and assign the result to $pos. We get the (reversed) top level domain name from $emailRev using the substr() function, passing values of 0 and $pos as the offset and length arguments respectively. It now only remains to reverse the result of that operation using strrev() once more, and assign the return value to the $topLevelDomain variable.
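One caveat: strrev() reverses a string byte by byte, so it will scramble text that contains multi-byte UTF-8 characters. A possible workaround, sketched below, is to split the string into characters first. The function name mbStrrev() is hypothetical, and the sketch assumes that the mbstring extension is available (mb_str_split() requires PHP 7.4 or later):

```php
<?php
// Hypothetical helper: reverse a UTF-8 string character by character.
function mbStrrev(string $str): string {
  $chars = mb_str_split($str, 1, "UTF-8");
  return implode("", array_reverse($chars));
}

echo mbStrrev("Καλημέρα");
// αρέμηλαΚ
?>
```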

Changing case

We have already seen one use case for changing the case of a string - namely to facilitate the case-insensitive use of some of PHP's case-sensitive string-handling functions for which there is no case-insensitive alternative. Sometimes, however, we simply want to change some text to all uppercase or all lowercase for aesthetic reasons, or to ensure consistency in the way text data is displayed.

Domain names, email addresses and URLs tend to use lowercase characters, which means, among other things, that you don't need to worry about case-sensitive usernames or domain names when writing emails or entering a URL into a browser address bar (although in most cases, the email client or browser doesn't care if you inadvertently use uppercase characters). Abbreviations, country codes and product identification codes tend to be written using uppercase characters.

Word processing software usually has auto correct features that will capitalise the first letter of a sentence if the user inadvertently forgets to do so, and most online banking facilities will automatically capitalise user input for International Bank Account Numbers (IBANs) and Bank Identifier Codes (BICs). Airline booking management systems also convert flight numbers and booking codes input by a user to upper-case characters automatically.

We are already familiar with two of the PHP functions for changing the case of a string or substring, namely strtoupper() and strtolower(). Both of these functions accept a single argument, namely the string to be converted to uppercase or lowercase. The argument can be either a string literal or the name of a string variable. Both functions return a new string which will consist of all uppercase characters for strtoupper(), or all lowercase characters for strtolower(). Here is an example:

<?php
$email = "WEBMASTER@TECHNOLOGYUK.NET";
echo "$email<br><br>";
$email = strtolower($email);
echo $email;

// WEBMASTER@TECHNOLOGYUK.NET
//
// webmaster@technologyuk.net
?>

Sometimes, we don't want to change the case of an entire string. We might just want to capitalise the first character in a string, for example, to ensure that a sentence always starts with a capital letter. PHP provides two functions that enable us to change the case of the first character in a string - lcfirst() and ucfirst().

Like the strtoupper() and strtolower() functions, both of these functions take a string as their only argument and return the modified version of that string. In both cases the argument passed to the function can be either a string literal or the name of a string variable. Here's an example of how the ucfirst() function might be used to capitalise the first letter of a sentence:

<?php
$verse = "the boy stood on the burning deck";
echo "$verse<br><br>";
$verse = ucfirst($verse);
echo $verse;

// the boy stood on the burning deck
//
// The boy stood on the burning deck
?>
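The sketch below shows lcfirst() in action, together with a common way of combining the case functions to tidy up mixed-case input. The sample strings are our own, and ucwords() is a related built-in function that is not otherwise covered here:

```php
<?php
// lcfirst() lower-cases only the first character.
$lower = lcfirst("Hello World");
echo "$lower<br>";
// hello World

// Normalise mixed-case input: lower-case everything, then
// capitalise the first character.
$sentence = ucfirst(strtolower("tHE BOY STOOD"));
echo "$sentence<br>";
// The boy stood

// ucwords() capitalises the first character of every word.
$title = ucwords("the boy stood on the burning deck");
echo $title;
// The Boy Stood On The Burning Deck
?>
```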

Trimming white space

Trimming white space from the beginning or end of a string is a common requirement, especially when dealing with user input from online forms and other sources. Users frequently add white space characters, either inadvertently or - on occasion - deliberately, to the beginning or end of their text input, and it is necessary to remove these white-space characters before processing the input.

PHP provides three functions for removing white space from a string. The first we are going to look at is the trim() function, which removes whitespace characters from both the beginning and the end of a string. This function takes two arguments. The first is the name of the string variable to be stripped of its leading and trailing whitespace characters.

The second (optional) argument is a string containing a sequence of whitespace characters that should be stripped from the beginning and end of the string. By default, this string is " \n\r\t\v\x00", and represents the following whitespace characters:

" " - the ASCII space character (0x20)
"\n" - the ASCII linefeed character (0x0A)
"\r" - the ASCII carriage return character (0x0D)
"\t" - the ASCII tab character (0x09)
"\v" - the ASCII vertical tab character (0x0B)
"\0" - the ASCII NUL-byte character (0x00)

We can also specify any number of other (non-whitespace) characters to remove using the second argument if required. The following example removes the whitespace characters from both ends of a string:

<?php
$usrInput = "  123.45  ";   // two spaces on each side
var_dump($usrInput);
$cleanInput = trim($usrInput);
echo "<br><br>";
var_dump($cleanInput);

// string(10) "  123.45  "
//
// string(6) "123.45"
?>
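When a second argument is supplied, it replaces the default list of characters rather than adding to it. The following is a minimal sketch, assuming a made-up reference code that arrives wrapped in asterisks and spaces; both characters are listed so that both are stripped:

```php
<?php
// Assumed input: a reference code padded with asterisks and spaces
$usrInput = "** AB-1234 **";
$cleanInput = trim($usrInput, "* ");   // strip asterisks AND spaces from both ends
var_dump($cleanInput);

// string(7) "AB-1234"
```

Only the ends of the string are affected. The hyphen in the middle of the code is left alone, as would be any asterisk or space that appeared between other characters.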

If for some reason you only want to trim the whitespace from one side of a string, PHP provides the ltrim() and rtrim() functions, which trim the whitespace characters from the left-hand side and right-hand side of a string respectively. These functions take exactly the same arguments as trim(), but only remove whitespace (or other characters) from one side of the string.

This could be useful if we want to trim a non-whitespace character from the beginning or end of a string, in which case we will utilise the second argument to ltrim() or rtrim() to specify the character (or characters) to remove. For example:

<?php
$price = "€123.45";
var_dump($price);
$cleanPrice = ltrim($price, "€");
echo "<br><br>";
var_dump($cleanPrice);

// string(9) "€123.45"
//
// string(6) "123.45"
?>

In the above example, we use the ltrim() function to remove the Euro sign (€) from the string variable $price and assign the result to the variable $cleanPrice. This will enable us to use the value of $cleanPrice in arithmetic operations elsewhere in our code.
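The rtrim() function works in the same way at the other end of the string. One thing to keep in mind is that the second argument is treated as a set of individual characters to remove, not as a prefix or suffix to be matched as a whole. The sketch below uses two made-up values to show both the intended use and the pitfall:

```php
<?php
// rtrim() with a single character: remove trailing slashes from a path
$path = "/var/www/html///";
$cleanPath = rtrim($path, "/");
var_dump($cleanPath);
echo "<br><br>";

// The second argument is a set of characters, not a suffix:
// ".", "t" and "x" are all stripped until some other character is found
$file = "report.txt";
$stripped = rtrim($file, ".txt");
var_dump($stripped);

// string(13) "/var/www/html"
//
// string(5) "repor"
```

In the second case the final "t" of "report" is removed along with the extension, because it is one of the listed characters. If the aim is to remove a specific suffix or prefix, a substring function is a safer choice than the trimming functions.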

Converting strings to arrays

We can turn a string that consists of a number of substrings separated by a common delimiter, such as a comma or a space, into an array using PHP's explode() function. This function takes three arguments. The first argument is a string that specifies the separator to be used, expressed as either a string literal or the name of a string variable. The second argument is the string from which the array elements will be derived.

The third (optional) argument, if used, is limit - an integer value that sets the maximum number of array elements to be created. If this argument is a positive integer, the array will contain at most that many elements, regardless of the number of substrings. The last element will contain the remainder of the string - all of the substrings not already allocated to the array, together with the delimiters that separate them. For example:

<?php
$numStr = "One, Two, Three, Four, Five";
$numArr = explode(", ", $numStr, 3);
var_dump($numArr);

// array(3) { [0]=> string(3) "One" [1]=> string(3) "Two" [2]=> string(17) "Three, Four, Five" }
?>

By default, the limit argument is set to the PHP constant PHP_INT_MAX, which usually evaluates to either 2147483647 (on 32-bit systems) or 9223372036854775807 (on 64-bit systems). If limit is set to a negative value, all of the substrings are added to the array except for those at the end of the string, the number omitted being the absolute value of limit. A limit of -2 therefore drops the last two substrings. For example:

<?php
$days = "Mon, Tue, Wed, Thu, Fri, Sat, Sun";
$weekdays = explode(", ", $days, -2);
var_dump($weekdays);

// array(5) { [0]=> string(3) "Mon" [1]=> string(3) "Tue" [2]=> string(3) "Wed" [3]=> string(3) "Thu" [4]=> string(3) "Fri" }
?>

The explode() function is useful for extracting values from lists. The contents of a comma-separated value (CSV) file, for example, can be read into a string variable and then assigned to an array.
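As a rough illustration, the script below splits a single comma-separated record into its fields. The record itself is invented for the example, and the call to array_map() with trim() is just one way of tidying up any stray spaces around the values:

```php
<?php
// Assumed sample record: name, price, quantity (with some stray spaces)
$record = "Widget, 15.50 ,12";
$fields = explode(",", $record);
$fields = array_map('trim', $fields);   // remove white space around each field
var_dump($fields);

// array(3) { [0]=> string(6) "Widget" [1]=> string(5) "15.50" [2]=> string(2) "12" }
```

This simple approach assumes that no field contains a comma of its own. Records that use quoted fields need a proper CSV parser, such as PHP's str_getcsv() function.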

The process of breaking a string down into an ordered list of substrings is sometimes referred to as splitting. In the examples above, we split strings based on a common delimiter using the explode() function, but PHP provides another function for splitting strings called str_split(). This function accepts two arguments, the first of which is the string to be split - either a string literal or the name of a string variable.

The second (optional) argument is length, an integer value that specifies the maximum length of each substring. If omitted, length defaults to 1. The return value is an array of substrings, each of which contains at most length characters. For example:

<?php
$org = "RSPCA";
$abbr = str_split($org);
var_dump($abbr);

// array(5) { [0]=> string(1) "R" [1]=> string(1) "S" [2]=> string(1) "P" [3]=> string(1) "C" [4]=> string(1) "A" }
?>

As you can see, unless we specify a value for length other than 1, the string will be split into individual characters. This could perhaps be useful if we want to manipulate a string character by character, but let's think about other use cases.
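A length of 2, for instance, splits a string into pairs of characters. The following sketch assumes that we have a six-digit hexadecimal colour value (without the leading # symbol) and want its red, green and blue components as decimal numbers. The variable names are illustrative, and hexdec() is used for the conversion:

```php
<?php
// Assumed input: a six-digit hexadecimal colour value without the leading #
$colour = "ff8800";
$pairs = str_split($colour, 2);
var_dump($pairs);

// Convert each two-digit pair from hexadecimal to decimal
[$red, $green, $blue] = array_map('hexdec', $pairs);
echo "<br><br>rgb($red, $green, $blue)";

// array(3) { [0]=> string(2) "ff" [1]=> string(2) "88" [2]=> string(2) "00" }
//
// rgb(255, 136, 0)
```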

By specifying a length value of 3, we could break large undelimited integer values down into groupings of three digits in order to make them easier to read. Take, for example, the PHP constant PHP_INT_MAX, which usually evaluates to 9223372036854775807 on a 64-bit system.

Such large numbers are very hard for a human being to read when expressed without delimiters, but suppose we try the following:

<?php
$largeInt = PHP_INT_MAX;
echo "This system's largest integer: $largeInt<br><br>";
$numArr = str_split($largeInt, 3);
$largeIntStr = implode(" ", $numArr);
echo "Here is the readable version: $largeIntStr";

// This system's largest integer: 9223372036854775807
//
// Here is the readable version: 922 337 203 685 477 580 7
?>

In this example, we use the implode() function to convert the array variable $numArr to a string (we haven't actually discussed this function yet, but will do so shortly). This works, and the number is certainly somewhat more readable, having been broken down into chunks with a maximum size of three digits each, separated by a space (we could also use a comma, or some other delimiter, depending on the convention being followed).

There is, however, one tiny problem, which is that the last group of digits in the sequence only contains one digit. Normally, we would expect to see groupings of three digits, starting from the right-hand side, with the left-most group being the only one to (potentially) contain fewer than three digits. We can get around this problem by using the strrev() function which we described earlier - which just goes to show that it is useful after all! Here is the revised script:

<?php
$largeInt = PHP_INT_MAX;
echo "This system's largest integer: $largeInt<br><br>";
$revInt = strrev($largeInt);
$numArr = str_split($revInt, 3);
$largeIntStr = strrev(implode(" ", $numArr));

echo "Here is the readable version: $largeIntStr";

// This system's largest integer: 9223372036854775807
//
// Here is the readable version: 9 223 372 036 854 775 807
?>

We use the strrev() function twice in this example. The first time, we use it to reverse the order of the digits in the large integer value and assign the result to the variable $revInt, which we then pass as the first argument to str_split(). We then use strrev() again to reverse the string returned by the implode() function, which gives us our space-delimited string, now with the digits correctly grouped and in the correct order.
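If we needed to do this in more than one place, the same steps could be wrapped in a function. The following is only a sketch: groupDigits() is a hypothetical helper name, and the function assumes that its input is a string of digits with no sign or decimal point:

```php
<?php
// Hypothetical helper - groups a string of digits in threes, starting from the right
function groupDigits(string $digits, string $separator = " "): string {
    $groups = str_split(strrev($digits), 3);
    // The separator is reversed as well, so that a multi-character
    // separator still reads correctly after the final strrev()
    return strrev(implode(strrev($separator), $groups));
}

echo groupDigits("1234567") . "<br>";
echo groupDigits("9223372036854775807", ",");

// 1 234 567
// 9,223,372,036,854,775,807
```

The separator defaults to a space, as in the script above, but any other delimiter can be passed as the second argument.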

Creating strings from arrays

There may be times when we want to convert the contents of an array to a string that contains all of the elements of the array, separated by an appropriate string separator such as a comma followed by a space. There are various reasons why we might want to do this. We might, for example, want a more human-readable representation of the array's contents, or we might require a string representation of the array for the purposes of debugging.

PHP provides the implode() function, which is virtually the opposite of the explode() function, for the purpose of converting arrays to strings. The implode() function takes two arguments - an optional string separator and the name of an array variable. The return value is a string containing all of the elements in the array, separated by the string separator (if specified).

The array elements can be strings, numeric values or Boolean values (or a mixture of these datatypes). Non-string array elements will be automatically converted to their string representation. In the following example, an array of strings is converted to a single string variable:

<?php
$trees = ["Oak", "Ash", "Beech", "Elm", "Birch"];
$strTrees = implode(", ", $trees);
var_dump($strTrees);
// string(27) "Oak, Ash, Beech, Elm, Birch"
?>

And in this example, an array of integer values is converted to a string variable:

<?php
$primesTo50 = [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47];
$strPrimesTo50 = implode(",", $primesTo50);
var_dump($strPrimesTo50);
// string(40) "2,3,5,7,11,13,17,19,23,29,31,37,41,43,47"
?>
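It is worth seeing how the different datatypes are converted when an array holds a mixture of them. In the sketch below (the array contents are arbitrary), note that the Boolean value true becomes "1" while false becomes an empty string:

```php
<?php
// Arbitrary mixture of a string, an integer, a float and two Booleans
$mixed = ["Oak", 42, 2.5, true, false];
$strMixed = implode(", ", $mixed);
var_dump($strMixed);

// string(17) "Oak, 42, 2.5, 1, "
```

The string ends with a separator followed by nothing at all, because the final element (false) contributes no characters.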

One example of where the implode() function can be useful is in the creation of comma-separated value (CSV) files, which are plain text files containing a list of comma-separated values. These files are typically used to store tabular data, and to transfer data between spreadsheet and database applications.

Repeating strings

A repeating string is one in which the same sequence of characters, words or phrases is repeated multiple times. We create a repeating string by taking an existing string and concatenating it with itself some specified number of times.

The function provided by PHP to facilitate the creation of a repeating string is str_repeat(). This function takes two arguments. The first argument is the string to be repeated, and can be either a string literal or the name of a string variable. The second argument is an integer value that specifies the number of times to repeat the string, and must be greater than or equal to 0.

The return value is a new string consisting of the original string, repeated for the number of times specified by the second argument to str_repeat(). Note that if the second argument to str_repeat() is 0, the value returned will be the empty string.
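A minimal sketch of both cases is shown below, using an arbitrary two-character string:

```php
<?php
// Repeat a two-character string ten times
$rule = str_repeat("=-", 10);
var_dump($rule);
echo "<br><br>";

// A repeat count of zero produces the empty string
$empty = str_repeat("=-", 0);
var_dump($empty);

// string(20) "=-=-=-=-=-=-=-=-=-=-"
//
// string(0) ""
```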

Although there are relatively few real-world use cases for str_repeat(), it could be used to generate blocks of placeholder "dummy text" for debugging purposes, or as a design tool, like the numerous online "lorem ipsum" generators that produce meaningless blocks of Latin text. These blocks of text, also called blind text or filler text, provide software designers with an impression of how a template will look when text is added to a prototype user interface.

Another possible use case might be to generate graphical displays using repeated arrangements of printable characters, such as the Unicode block elements (U+2580 - U+259F). The following example uses two of these elements to generate a checkerboard pattern:

<?php
$str = "░░░▓▓▓░░░▓▓▓░░░▓▓▓░░░▓▓▓\n";
$str .= "▓▓▓░░░▓▓▓░░░▓▓▓░░░▓▓▓░░░\n";
$pattern = str_repeat($str, 4);
echo "<pre>$pattern</pre>";
?>

Here is the output from the script:


The script generates a checkerboard pattern


Formatting strings

We often want to format a string in order to display it in a user-friendly way. We may wish, for example, to display columns of floating-point or decimal currency values so that the numbers align correctly, with the decimal point in each case occupying the same horizontal position on screen.

Alternatively, we might want to represent numbers in a format other than decimal, such as hexadecimal, octal, or binary, or even using scientific notation. The possibilities for formatting strings are too numerous to list here, but we'll look at a few examples.

The go-to function for formatting a string in PHP is the sprintf() function. This function takes multiple arguments. The first argument is a format string. This string is made up of zero or more directives. A directive can be either a string literal or a conversion specification. The arguments that follow the format string basically constitute a list of values to be inserted into the formatted string according to the corresponding conversion specifications embedded within the format string.

If the above sounds complicated, that's because it is, although things should become clearer when you have looked at some examples. The sprintf() function returns the formatted string. The literal text in the format string is copied to the result as it stands, and each conversion specification is replaced by the corresponding value from the argument list, formatted as that conversion specification dictates.

Let's look at a very basic example:

<?php
$country = "The United Kingdom";
$pop = 69.23;
$demographic = sprintf("%s has a population of %.2f million.", $country, $pop);
echo $demographic;

// The United Kingdom has a population of 69.23 million.
?>

The format string passed to the sprintf() function in the above example has two conversion specifications. The first, %s, corresponds to the first argument that follows the format string and specifies that the argument should be treated as a string. The argument actually found at that position is $country, which evaluates to the string "The United Kingdom".

The second conversion specification refers to the second argument following the format string, which is $pop - a floating-point value. This one is not quite so easy to interpret, but once you know what's going on here it is fairly straightforward. The conversion specification is %.2f. The f indicates a floating-point value, but by default floating point values are expressed using six decimal places.

Expressing $pop using six decimal digits after the decimal point is totally unnecessary. It would give us an output for this argument of 69.230000, which is counter-productive in terms of readability. The .2 that precedes the f in the conversion specification specifies precision, and limits the output to two digits after the decimal point.
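The effect of the precision is easy to see if we format the same value several times. The script below is a small sketch that reuses the $pop value from the example above with a few different precisions:

```php
<?php
$pop = 69.23;
echo sprintf("%f", $pop) . "<br>";     // default precision: six decimal places
echo sprintf("%.2f", $pop) . "<br>";   // two decimal places
echo sprintf("%.1f", $pop) . "<br>";   // one decimal place
echo sprintf("%.0f", $pop);            // no decimal places

// 69.230000
// 69.23
// 69.2
// 69
```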

Let's look at another example. In this example we'll explore the various ways in which the same numeric values can be expressed in PHP using different number bases. Here is the script:

<?php
$str = "Life, the universe and everything";
$num = 42;
$decimal = sprintf("%s = %d (decimal).", $str, $num);
$hexadecimal = sprintf("%s = %x (hexadecimal).", $str, $num);
$octal = sprintf("%s = %o (octal).", $str, $num);
$binary = sprintf("%s = %b (binary).", $str, $num);
echo "$decimal<br>";
echo "$hexadecimal<br>";
echo "$octal<br>";
echo $binary;

// Life, the universe and everything = 42 (decimal).
// Life, the universe and everything = 2a (hexadecimal).
// Life, the universe and everything = 52 (octal).
// Life, the universe and everything = 101010 (binary).
?>

The basic conversion specification starts with the percent sign (%) and ends with the type specifier. Some of the more common type specifiers are briefly described in the table below:



Type Specifiers
Specifier   Description
%           A literal percent character (no argument required).
b           Argument is interpreted as an integer and displayed as a binary number.
c           Argument is interpreted as an integer and displayed as the ASCII character with that byte code.
d           Argument is interpreted as an integer and displayed as a signed decimal number.
e           Argument is interpreted as a numeric value and displayed using scientific notation (e.g. 1.5e+3).
E           As for e but uses uppercase E (e.g. 1.5E+3).
f           Argument is interpreted as a float and displayed as a locale-aware floating-point number.
F           Argument is interpreted as a float and displayed as a non-locale-aware floating-point number.
o           Argument is interpreted as an integer and displayed as an octal number.
s           Argument is interpreted and displayed as a string.
u           Argument is interpreted as an integer and displayed as an unsigned decimal number.
x           Argument is interpreted as an integer and displayed as a hexadecimal number using lowercase letters.
X           Argument is interpreted as an integer and displayed as a hexadecimal number using uppercase letters.
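A few of the type specifiers that we have not yet used are shown in the short sketch below. The values are arbitrary, and a precision of 1 is used with e and E in order to keep the output short:

```php
<?php
$char = sprintf("%c", 65);         // the character with byte code 65
$sci = sprintf("%.1e", 1500);      // scientific notation, lowercase e
$sciUpper = sprintf("%.1E", 1500); // scientific notation, uppercase E
$hexUpper = sprintf("%X", 255);    // hexadecimal with uppercase letters
$percent = sprintf("%d%%", 50);    // %% produces a literal percent sign
echo "$char<br>$sci<br>$sciUpper<br>$hexUpper<br>$percent";

// A
// 1.5e+3
// 1.5E+3
// FF
// 50%
```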

The only mandatory components of the conversion specification are the opening percent sign and the type specifier. The full syntax of the conversion specification includes a number of optional parameters that can be used between the opening % symbol and the type specifier, as shown here:

%[argnum$][flags][width][.precision]specifier

The argnum$ option, if used, specifies which argument the current conversion specification should use. By default, this would be the argument that occupies the same position in the argument list (after the format string itself) as the conversion specification occupies within the format string.
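The following sketch shows argnum$ being used to change the order in which the arguments appear, and to use the same argument more than once. Note that the format strings are enclosed in single quotes here. Inside double quotes, PHP would try to interpret $s as the name of a variable:

```php
<?php
// Single quotes prevent PHP from treating $s as a variable
$greeting = sprintf('%2$s, %1$s!', "World", "Hello");
echo "$greeting<br>";

// The same argument can be used by several conversion specifications
$cheer = sprintf('%1$s, %1$s, %1$s', "Hip");
echo $cheer;

// Hello, World!
// Hip, Hip, Hip
```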

The flags option allows us to fine-tune the output using one or more of the flags described in the table below:



Conversion Specification Flags
Flag      Description
-         Left-justify within the given field width (right justification is the default).
+         Prefix positive numbers with a plus sign.
(space)   Pad the result with spaces (the default behaviour).
0         Left-pad numbers with zeros.
'(char)   Pad the result with the character specified by (char).

The width option is an integer value that specifies the minimum number of characters the conversion should produce.

The .precision option is an integer value preceded by a period, the meaning of which depends on the type specifier. As we have seen, when used with a floating-point value (type specifiers e, E, f and F) it specifies how many digits should appear after the decimal point. When used with the string type specifier (s), it sets a limit on the number of characters the string may contain.
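The sketch below tries out the flags, width and precision options on some arbitrary values. Square brackets are placed around each conversion specification purely to make the padding visible in the output:

```php
<?php
$examples = [
    sprintf("[%5d]", 42),         // width 5, right-justified (the default)
    sprintf("[%-5d]", 42),        // width 5, left-justified
    sprintf("[%05d]", 42),        // width 5, padded with zeros
    sprintf("[%+d]", 42),         // plus sign on a positive number
    sprintf("[%'*8s]", "abc"),    // width 8, padded with asterisks
    sprintf("[%.3s]", "abcdef"),  // string limited to three characters
    sprintf("[%8.2f]", 3.14159),  // width 8, two decimal places
];
echo "<pre>" . implode("\n", $examples) . "</pre>";

// [   42]
// [42   ]
// [00042]
// [+42]
// [*****abc]
// [abc]
// [    3.14]
```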

Let's look at one more example of how sprintf() might be used to format numbers. This time, we'll work with an associative array that holds the descriptions and prices for various items, and use the sprintf() function to create an appropriately formatted price list. Here is the script:

<?php
$products = [
"Widget"=>15.50,
"Sprocket"=>109.75,
"Rocker"=>18.60,
"Thingy"=>32.65,
"Gadget"=>116.00,
"Doofer"=>6.75
];

echo "<h1>Product Price List</h1>";
echo "<pre style=\"font-size: 1.25em;\">";
foreach($products as $item=>$price) {
$listItem = sprintf("%- 15s€ %6.2f\n", $item, $price);
echo $listItem;
}
echo "</pre>";
?>

In this example, we use a foreach loop to extract the key (the item name) and the value (the price) of each key-value pair in the $products array, and display them together on their own line. The formatting of each line is handled by the following statement:

$listItem = sprintf("%- 15s€ %6.2f\n", $item, $price);

Let's break down the first conversion specification (%- 15s) which tells us how the first argument following the format string will be displayed. The specifier (s) indicates that the first argument should be interpreted as a string. The minus sign and the space character are flags, signalling that the string should be left-justified, and padded using space characters. The integer value (15) specifies that the text should be at least fifteen characters wide.

Now to the second conversion specification (%6.2f), which tells us how the price of each item will be displayed. The specifier (f) indicates that this is a floating-point number. The number 6 immediately following the percent sign (%) signals that the number should be displayed with a minimum width of six characters, and the .2 that immediately follows it specifies a precision of 2 digits after the decimal point.

Here is the output from our script:


The formatted price list
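The effect of the two conversion specifications can also be checked on individual lines of text. The sketch below applies the same format string (without the trailing newline) to two of the items from the price list, one with a short name and one with a longer name and price:

```php
<?php
$shortItem = sprintf("%- 15s€ %6.2f", "Widget", 15.50);
$longItem = sprintf("%- 15s€ %6.2f", "Sprocket", 109.75);
echo "<pre>$shortItem\n$longItem</pre>";

// Widget         €  15.50
// Sprocket       € 109.75
```

Each name is padded with spaces to fifteen characters, so the Euro signs line up. Each price occupies six characters including the decimal point, so the decimal points line up as well.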


A full exploration of the versatility of the sprintf() function is beyond the scope of this article, but we will no doubt encounter this and other PHP string formatting functions in future articles. Meanwhile, you have hopefully gained an insight into the potential uses of sprintf() in creating rich and complex string output formats.