One of the improvements we tackled was WordPress Trac ticket #58291. It’s essentially about optimizing the function _wp_filter_build_unique_id, which, given a function, returns its unique hash. This function is used by WP_Hook, one of the most frequently used WordPress classes, so this improvement makes add_filter and remove_filter faster.

function sum_to_n_faster( $n ) {
    return $n * ( $n + 1 ) / 2;
}

What about any other impacts of this change? This is tricky because this function is copy-pasted into VaultPress and WP-CLI. This is a problem because if we switch core to use spl_object_id (which has a different return value from spl_object_hash), it might mess up the callbacks array when combined with VaultPress/WP-CLI.
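A minimal sketch of that risk, assuming two copies of the ID builder that disagree on which SPL function they use. The helper names `id_via_hash` and `id_via_object_id` are hypothetical stand-ins, not the actual core, VaultPress, or WP-CLI code:

```php
<?php
// Hypothetical stand-ins for two copies of the ID builder: one still using
// spl_object_hash() and one switched to spl_object_id().
function id_via_hash( object $obj, string $method ): string {
    return spl_object_hash( $obj ) . $method;
}

function id_via_object_id( object $obj, string $method ): string {
    return spl_object_id( $obj ) . $method;
}

$listener  = new stdClass();
$callbacks = array();

// Copy A (for example, an older copy-pasted version) registers the callback.
$callbacks[ id_via_hash( $listener, 'run' ) ] = array( $listener, 'run' );

// Copy B (the updated version) tries to remove the same callback.
$key_b = id_via_object_id( $listener, 'run' );
$found = isset( $callbacks[ $key_b ] );
unset( $callbacks[ $key_b ] );

var_dump( $found );              // bool(false)
var_dump( count( $callbacks ) ); // int(1): the callback is still registered
```

Because the two keys never match, the removal silently does nothing, which is the kind of breakage a mixed setup could run into.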

Code performance 101

In WordPress Trac ticket #58290, we use $x instanceof Y instead of is_object( $x ) && $x instanceof Y to save a few unnecessary calls to is_object. While this is a small improvement, the code executes in WP_Hook::build_preinitialized_hooks, which gets called in load.php, so the saving scales up with the number of hooks added.
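The change is safe because instanceof already returns false for non-objects. A small sketch that compares both forms over a few sample values; the class `Example_Hook` and both helper functions are hypothetical and exist only for this illustration:

```php
<?php
// Hypothetical class name used only for this illustration.
class Example_Hook {}

function check_verbose( $value ): bool {
    return is_object( $value ) && $value instanceof Example_Hook;
}

function check_short( $value ): bool {
    // instanceof evaluates to false for null, strings, integers, and arrays.
    return $value instanceof Example_Hook;
}

$samples = array( new Example_Hook(), new stdClass(), null, 'Example_Hook', 42, array() );
$same    = true;
foreach ( $samples as $sample ) {
    if ( check_verbose( $sample ) !== check_short( $sample ) ) {
        $same = false;
    }
}

var_dump( $same ); // bool(true)
```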

PHP is an interpreted language built on top of the C programming language. Naturally, code written in C will (most of the time) run faster than code written in PHP. The trade-off is that C is a more complex language than PHP – we must pay attention to pointers, memory management, etc.

The original measurement with cachegrind using Xdebug’s profiler showed 1.18% before our patch:
You can find more information in the Trac ticket, especially about the impact of another improvement we proposed, which underlines the importance of cross-collaboration work.

In practice, however, it’s usually trickier, and optimizing things like this is not always straightforward. Still, there are several questions we can ask ourselves that may help:

  • Where do we start?
    • What are the most frequently called functions?
    • What are the slowest functions?
    • What is the slowest behavior of the application in general?
  • What are we optimizing, and at which layer?
    • App layer? HTTP layer? Code layer? Caching?
  • How do we measure it?
  • What are the trade-offs?
    • Resources: Do we trade space (how much memory it takes during execution) or time (how much time it takes to execute)? (Space and time is an interesting general philosophical concept)
    • What is the impact of the proposed improvement?
      • How does it affect backward compatibility? How does it affect version compatibility?

Saving processing cycles by optimizing foreach

While this post is mostly about performance improvements at the code level, we want to emphasize that when we write code, we first want it to be readable, correct, and secure, and only after that, performant.

If you are interested in learning more about PHP internals, there’s the PHP Internals Book, which contains good information but is still incomplete. In our experience, navigating the php-src codebase is the easiest and best way to learn.
In WordPress Trac ticket #58457, we worked on optimizing the WP_Theme_JSON::append_to_selector method. The gist of this improvement is that we shift a conditional check one level above. So, instead of doing something like:

$ ./run-benchmarks.sh

Executing benchmarks for benchmarks/wp_slash.php
--------------------
PHP implementation of wp_slash takes 0.0000169277
C implementation of wp_slash takes 0.0000009537
Improvement of 94.366197%

_wp_filter_build_unique_id uses spl_object_hash to compute the hash. However, as of PHP 7.2, we also have spl_object_id. The idea was to switch from spl_object_hash to spl_object_id, as the latter is faster because it doesn’t make an additional sprintf call (you can compare both functions here to see that). Saving a single call to sprintf is a tiny improvement, but at scale, it will still be beneficial.
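From the PHP side, the difference between the two functions is mostly the return type. A short sketch that can be run on PHP 7.2 or later:

```php
<?php
$obj = new stdClass();

$hash = spl_object_hash( $obj ); // 32-character string
$id   = spl_object_id( $obj );   // integer object handle

var_dump( is_string( $hash ), strlen( $hash ) ); // bool(true), int(32)
var_dump( is_int( $id ) );                       // bool(true)

// The ID stays stable for the lifetime of the object...
var_dump( $id === spl_object_id( $obj ) );       // bool(true)

// ...and differs between two objects that are alive at the same time.
$other = new stdClass();
var_dump( $id !== spl_object_id( $other ) );     // bool(true)
```

Both values are only unique among objects that currently exist, so either one can be reused after an object is destroyed.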

Saving processing cycles in WordPress actions and filters

Besides using GitHub to search for potential impact, another way is to look through the whole WordPress plugin repository by manually cloning that huge repo or using something like WP Directory.

Reading a cachegrind output from a basic WordPress installation

$foo = input…
foreach ( $x as $y ) {
    if ( $foo ) {
        a( $y );
    } else {
        b( $y );
    }
}

Code performance optimization is about modifying existing code to use fewer resources while keeping the original behavior unchanged.

Improving performance with pre-computed values

PHP_FUNCTION(spl_object_hash) {
    // …
    return strpprintf(32, "%016zx0000000000000000", (intptr_t)obj->handle);
}

PHP_FUNCTION(spl_object_id) {
    // …
    RETURN_LONG((zend_long)obj->handle);
}

(Here’s a small quiz on the way. 🙂 Do you know of PHP’s zval tagged union structure?)

After our patch, the measurement showed 0.22%, an improvement of almost one percentage point in time!

This is a reduction of at least 537 calls to array_keys per request, and the saving scales up with the number of requests and plugins installed.

One would not have expected a “private” function to be copy-pasted like this, but the following quote summarizes this pretty well: 🙂

In this write-up, we talk about recent performance improvements that we made in WordPress 6.3, sharing both our findings and our journey.

Have you done any recent performance improvements? How many users did it impact? How did it impact our systems? How did you measure it?

This approach looked like one of the most promising. Besides wp_slash, we also converted a bunch of other functions, such as _wp_filter_build_unique_id, _wp_array_get, absint, and zeroise. With just these functions converted, Xdebug profiling of a wp-admin visit shows the time cut almost in half, as the image on the right shows:

This didn’t get in WordPress 6.3 but is scheduled for WordPress 6.4.

wpboost is one experiment to improve WordPress performance by climbing the programming languages abstraction ladder. We experimented with taking some frequently called WordPress PHP functions and shifting them one abstraction level below – from PHP to C. Practically, this is the lowest level of the ladder, but in theory, there is no lowest level; there is also assembly, binary, etc.

To improve its performance, instead of calling array_keys every time, we pre-compute its value (and maintain it whenever the callbacks array gets changed) so that we don’t have to compute it every time within apply_filters. With this improvement, we get the number of calls to array_keys down to 790.
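The idea can be sketched with a heavily simplified hook container. `Tiny_Hook` and its members are hypothetical and are not the real WP_Hook implementation; the sketch only shows the cached key list being refreshed on writes and reused on reads:

```php
<?php
// Hypothetical, heavily simplified hook container; not the real WP_Hook class.
class Tiny_Hook {
    /** @var array Callbacks grouped by priority. */
    private $callbacks = array();

    /** @var int[] Cached, sorted list of priorities. */
    private $priorities = array();

    public function add( callable $callback, int $priority = 10 ): void {
        $is_new_priority                = ! isset( $this->callbacks[ $priority ] );
        $this->callbacks[ $priority ][] = $callback;

        if ( $is_new_priority ) {
            ksort( $this->callbacks, SORT_NUMERIC );
            // Recompute the keys only when the set of priorities changes.
            $this->priorities = array_keys( $this->callbacks );
        }
    }

    public function apply( $value ) {
        // Reuse the cached priorities instead of calling array_keys() on every run.
        foreach ( $this->priorities as $priority ) {
            foreach ( $this->callbacks[ $priority ] as $callback ) {
                $value = $callback( $value );
            }
        }
        return $value;
    }

    public function priorities(): array {
        return $this->priorities;
    }
}

$hook = new Tiny_Hook();
$hook->add( function ( $v ) { return $v . 'b'; }, 20 );
$hook->add( function ( $v ) { return $v . 'a'; }, 10 );

echo $hook->apply( '' ), "\n"; // prints "ab"
```

Writes happen far less often than reads in a hook system, so moving the array_keys call to the write path is where the saving comes from.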

Work on performance improvements is rarely a single-person job – it involves a lot of cross-collaboration work. Getting feedback/input and information from others is beneficial to get a deeper insight into some of the proposed improvements.

Here’s a basic textbook example that sums all integers from 0 to $n:

In any case, working at the PHP C level is beneficial because it will teach you about the PHP internals, and knowing how PHP works internally will make you a better PHP developer.

For the correctness part, WordPress core already has some tests for hooks, and we rely on those, following the testing instructions.

function sum_to_n( $n ) {
    $sum = 0;
    for ( $i = 0; $i <= $n; $i++ ) {
        $sum += $i;
    }
    return $sum;
}
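Since an optimization must keep the original behavior unchanged, it helps to check a rewrite against the original before measuring it. A minimal sketch comparing the loop with the closed form n(n+1)/2; the function names are hypothetical, and intdiv is used here only to keep the result an integer:

```php
<?php
// Hypothetical names, chosen to avoid clashing with the post's own functions.
function sum_with_loop( int $n ): int {
    $sum = 0;
    for ( $i = 0; $i <= $n; $i++ ) {
        $sum += $i;
    }
    return $sum;
}

function sum_with_formula( int $n ): int {
    // n * (n + 1) is always even, so integer division is exact.
    return intdiv( $n * ( $n + 1 ), 2 );
}

// Compare both versions over a range of inputs.
$all_match = true;
for ( $n = 0; $n <= 1000; $n++ ) {
    if ( sum_with_loop( $n ) !== sum_with_formula( $n ) ) {
        $all_match = false;
    }
}

var_dump( $all_match ); // bool(true)
```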

As we were digging deeper into the hook class (WordPress Trac ticket #58458), we noticed that the function array_keys (within apply_filters) gets called about 1327 times on a basic WordPress installation.

This is a great example that shows how we should be constantly aware of the potential impact of our code, regardless of whether it is a performance improvement. We need to find a way to balance the risk and the reward.

This task is still a work in progress – especially in determining other impacts and finding ways around them.

Another good target for optimization is the WordPress hooks system, as shown in the cachegrind output below. This naturally led to the investigation of the class-wp-hook.php file.

In any case, digging into both PHP and WordPress core internals can provide insightful knowledge, providing awareness of how things work – whether at the architecture level, the function level, etc.
Hyrum’s Law, Software Engineering at Google

We need to continue experimenting, be aware of how PHP and WordPress work, and how the code we write and the data structures we choose will affect performance in the long run.

With a sufficient number of users of an API,
it does not matter what you promise in the contract:
all observable behaviors of your system
will be depended on by somebody.

With this experiment, one of the functions we tested is wp_slash. We ran a quick benchmark test using microtime() to compare a function written in PHP with its counterpart written in C. Note that even though the percentage is huge, the numbers are small. However, this scales with the number of times the function is called, the input, and the number of users it serves.
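A benchmark of this kind can be sketched as follows. Both helpers are hypothetical and are not the wpboost benchmark script; addslashes stands in for the function under test:

```php
<?php
// Hypothetical helper: average seconds per call over $iterations runs.
function average_seconds( callable $fn, int $iterations = 10000 ): float {
    $start = microtime( true );
    for ( $i = 0; $i < $iterations; $i++ ) {
        $fn();
    }
    return ( microtime( true ) - $start ) / $iterations;
}

// Hypothetical helper: relative improvement between two timings, in percent.
function improvement_percent( float $before, float $after ): float {
    return ( $before - $after ) / $before * 100;
}

$per_call = average_seconds(
    function () {
        return addslashes( "it's" );
    }
);
printf( "Average per call: %.10F seconds\n", $per_call );

// Applying the formula to the two rounded timings reported in the post.
printf( "Improvement of %.2F%%\n", improvement_percent( 0.0000169277, 0.0000009537 ) );
```

Averaging over many iterations matters here, because a single call is so short that timer resolution and noise would dominate the result.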

$foo = input…
if ( $foo ) {
    foreach ( $x as $y ) {
        a( $y );
    }
} else {
    foreach ( $x as $y ) {
        b( $y );
    }
}
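A runnable sketch of the same transformation, assuming the condition does not change while the loop runs. The helpers `tag_a` and `tag_b` are hypothetical stand-ins for a() and b():

```php
<?php
// Hypothetical stand-ins for the a() and b() calls.
function tag_a( $item ) {
    return 'a:' . $item;
}

function tag_b( $item ) {
    return 'b:' . $item;
}

// The condition is evaluated once per item.
function check_inside_loop( array $items, bool $flag ): array {
    $out = array();
    foreach ( $items as $item ) {
        if ( $flag ) {
            $out[] = tag_a( $item );
        } else {
            $out[] = tag_b( $item );
        }
    }
    return $out;
}

// The condition is evaluated once in total.
function check_hoisted( array $items, bool $flag ): array {
    $out = array();
    if ( $flag ) {
        foreach ( $items as $item ) {
            $out[] = tag_a( $item );
        }
    } else {
        foreach ( $items as $item ) {
            $out[] = tag_b( $item );
        }
    }
    return $out;
}

$items = array( 'x', 'y', 'z' );
var_dump( check_inside_loop( $items, true ) === check_hoisted( $items, true ) );   // bool(true)
var_dump( check_inside_loop( $items, false ) === check_hoisted( $items, false ) ); // bool(true)
```

The hoisted version duplicates the loop, so the trade-off is slightly more code in exchange for fewer condition checks.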

For the abstraction part, even though WP_Hook is not expected to be extended, it’s still possible that it will happen 🙂 The trade-off is that if folks extend it, they will have to make sure to maintain callbacks_keys themselves.

Props to Matthew Reishus, Romina Suarez, Nikolay Bachiyski, Daniel Bachhuber, and Donna Cavalier for their help and feedback!

Besides this improvement, one of the trade-offs is that we now introduce a maintenance burden, as we have two different codebases. For example, if wp_slash gets changed in WordPress core PHP (highly unlikely but not impossible), we must update wpboost too.

The larger our user base, the more attention we need to pay to performance. Even optimizations in the microseconds will have an impact at scale.
