Performance comparison

I have laid out three algorithms to replace placeholder tags in a string. These algorithms have different performance, and as everybody in the PHP scene seems to think that performance matters, let me look into it.

No method is "best" in every case. Instead, which method to use depends on the template and on the data that is available. The following graph nicely shows this:

In this graph, the number of available data keys is plotted against the time that each method takes. The number of available data keys is the size of the $data array in our example. Of the keys in the data array, approximately 50% were actually used in the template.
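To make this setup concrete, here is a minimal sketch of how such a test case could be generated. The {key} placeholder syntax, the helper name make_benchmark_case() and the choice to use every second key are assumptions made for illustration; the actual benchmark may have built its templates differently.

```php
<?php
// Hypothetical helper: builds a $data array with $keyCount keys and a
// template that uses every second key, so about 50% of the keys are used.
// Assumption: placeholders are written as {key}.
function make_benchmark_case($keyCount)
{
    $data = [];
    $parts = [];
    for ($i = 0; $i < $keyCount; $i++) {
        $data["key$i"] = "value$i";
        if ($i % 2 === 0) {
            $parts[] = 'Some text {key' . $i . '}';
        }
    }
    return [$data, implode(' ', $parts)];
}

list($data, $template) = make_benchmark_case(10);

// Count the placeholders that ended up in the template.
$placeholderCount = preg_match_all('/\{\w+\}/', $template);
echo count($data), ' keys, ', $placeholderCount, " placeholders\n";
```

Note that in this sketch the template grows together with the data array, which may or may not match the measurements shown in the graph.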

In the str_replace method, str_replace() is called once for each data key, so the time this method takes depends directly on the size of the data array. The preg_match method calls str_replace() once for each data key that is actually used in the template. Because approximately 50% of our keys are used in the template, it calls str_replace() about half as often as the str_replace method. Both algorithms are O(n), where n is the number of items in the data array (for the preg_match method, this holds as long as the share of used keys stays the same). This means that they become linearly slower as the array becomes bigger.
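The difference in the number of str_replace() calls can be made visible with a small sketch. The function names, the {key} placeholder syntax and the call counters are assumptions added for illustration; they are not necessarily identical to the code presented earlier.

```php
<?php
// Sketch of the str_replace method: one str_replace() call per data key.
// Assumption: placeholders are written as {key}.
function fill_with_str_replace($template, array $data, &$calls = 0)
{
    foreach ($data as $key => $value) {
        $template = str_replace('{' . $key . '}', (string) $value, $template);
        $calls++;
    }
    return $template;
}

// Sketch of the preg_match method: first find out which keys the template
// uses, then call str_replace() once for each of those keys only.
function fill_with_preg_match($template, array $data, &$calls = 0)
{
    preg_match_all('/\{(\w+)\}/', $template, $matches);
    foreach (array_unique($matches[1]) as $key) {
        if (array_key_exists($key, $data)) {
            $template = str_replace('{' . $key . '}', (string) $data[$key], $template);
            $calls++;
        }
    }
    return $template;
}

$data = ['name' => 'World', 'day' => 'Monday', 'unused1' => 'x', 'unused2' => 'y'];
$template = 'Hello {name}, today is {day}. Bye {name}!';

$callsA = 0;
$callsB = 0;
$a = fill_with_str_replace($template, $data, $callsA);
$b = fill_with_preg_match($template, $data, $callsB);

echo $a, "\n";
echo "str_replace method: $callsA calls, preg_match method: $callsB calls\n";
```

With two of the four keys used in the template, the first function calls str_replace() four times and the second one twice, which mirrors the 50% ratio described above.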

The preg_replace_callback method, by contrast, is not influenced by the size of the data array. It simply searches for our placeholders and only then looks up each value in the data array. Looking up a value in an array takes an amount of time that does not depend on the size of that array, which gives an algorithm that is O(1) with respect to the data array: its speed is independent of the size of the array.
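As a rough sketch of why the array size does not matter here: the callback below does a single key lookup per placeholder found, no matter how many other keys exist. The function name fill_with_callback(), the {key} syntax and the decision to leave unknown placeholders untouched are assumptions for this example.

```php
<?php
// Sketch of the preg_replace_callback method.
// Assumption: placeholders are written as {key}.
function fill_with_callback($template, array $data)
{
    return preg_replace_callback(
        '/\{(\w+)\}/',
        function (array $m) use ($data) {
            // One hash lookup per placeholder; unknown keys stay as they are.
            return array_key_exists($m[1], $data) ? (string) $data[$m[1]] : $m[0];
        },
        $template
    );
}

$template = 'Hello {name} and {unknown}';

$small = ['name' => 'World'];

// The same data plus 10000 keys that the template never uses.
$large = $small;
for ($i = 0; $i < 10000; $i++) {
    $large["unused$i"] = 'x';
}

$fromSmall = fill_with_callback($template, $small);
$fromLarge = fill_with_callback($template, $large);
echo $fromSmall, "\n";
```

Both calls do the same amount of matching and the same two lookups, so the unused keys in $large add no replacement work.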

In the next graph, we show what happens if you keep the number of data items equal but vary the number of different placeholders that show up in the template. At the left side of the graph, only a few different placeholders are used in the template. At the right side of the graph, all available placeholders from the data array are used.

As you can see, preg_replace_callback is again not impressed by our variations. It does not care whether the placeholders it has to replace are different, because it does a lookup for each of them.

The preg_match method gets slower when it has to replace more different placeholders. Because it replaces only the placeholders which are actually used in the template, it calls str_replace() more often when there are more different placeholders to replace.

The str_replace method also gets slower when it has to replace more different items. This is an interesting detail of the implementation. After all, if str_replace() searched through the whole text and replaced all occurrences each time, each call to str_replace() would take approximately the same time, and the speed would not be influenced by the number of different keys. However, str_replace() has a little optimization: before doing anything, it first checks whether the word you're looking for actually occurs in the text. If it doesn't, it returns. If it does, it allocates a bigger memory slot to fit the text with the replacements and does the actual replacements. This makes str_replace() slower when it actually has something to replace, giving the curved line in the graph.
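If you want to check this behaviour on your own setup, a small timing sketch like the following compares a call that finds nothing with a call that has to replace something. The helper name time_replace(), the text size and the number of rounds are arbitrary choices for this example, and the measured difference will depend on your PHP version and machine.

```php
<?php
// A fairly large text with exactly one placeholder at the end.
$text = str_repeat('lorem ipsum dolor sit amet ', 10000) . '{name}';

// Hypothetical helper: time $rounds calls of str_replace() in seconds.
function time_replace($needle, $replacement, $text, $rounds = 100)
{
    $start = microtime(true);
    for ($i = 0; $i < $rounds; $i++) {
        $result = str_replace($needle, $replacement, $text);
    }
    return microtime(true) - $start;
}

// The needle does not occur: the text is only searched.
$miss = time_replace('{absent}', 'a longer replacement value', $text);

// The needle occurs: a new string has to be built as well.
$hit = time_replace('{name}', 'a longer replacement value', $text);

printf("no match: %.4f s, match: %.4f s\n", $miss, $hit);
```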

Finally, we vary the size of the template and see what happens to our functions:

As you can see, all functions get linearly slower when the template size (and thus the number of placeholders) increases, which makes sense. However, str_replace() is a lot less impressed by a big template than the preg methods. This is because str_replace() uses a much simpler algorithm than regular expression matching.