Have you ever asked yourself what happens when you hit a URL on a WordPress website?
Here’s the very simplified version of the story:
There are thousands of lines of code executed in this process, but in this article I’ll concentrate on what happens in points 2 and 3 in the list above.
The “main” WordPress function
The process in which WordPress sets the main query arguments according to the current URL is the very core of how WordPress operates. The class responsible for these tasks is named WP, and the method where everything happens is named main().
That method is quite simple, below you can see its entire code:
These 9 lines of code are powering a quarter of the web.
I already mentioned that this article will focus on how WordPress goes from a URL to a WP_Query. This happens in one of the methods called inside WP::main(), namely parse_request().
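For orientation, the order of the calls inside main() can be sketched roughly as follows. This is a paraphrase from memory of how the method looked around WordPress 4.x, not a copy of core: the class name WP_Main_Sketch and the $calls log are invented here, every step is a stub, and newer WordPress releases arrange these calls somewhat differently.

```php
<?php
// Illustrative stand-in for the core WP class; not the real implementation.
class WP_Main_Sketch
{
    /** @var string[] Names of the steps, in the order they ran. */
    public $calls = array();

    public function main($query_args = '')
    {
        $this->step('init');
        $this->step('parse_request');    // URL -> query variables (the focus of this article)
        $this->step('send_headers');
        $this->step('query_posts');      // runs the main WP_Query
        $this->step('handle_404');
        $this->step('register_globals');
        // In core, the "wp" action is fired at this point.
    }

    // Stub: records the step name instead of doing any real work.
    private function step($name)
    {
        $this->calls[] = $name;
    }
}

$wp_sketch = new WP_Main_Sketch();
$wp_sketch->main();
echo implode(' -> ', $wp_sketch->calls), PHP_EOL;
```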
From a URL to a query
Let’s add some details to the “request flow” we’ve already seen, focusing on what we’ll discuss:
Please note that to make things simpler I’ve assumed pretty permalinks are enabled.
In short, what sits between the URL and the query arguments is the set of rewrite rules.
What are rewrite rules?
A rewrite rule is something that links a pretty URL to its ugly equivalent.
Let’s take a step back.
To display information, WordPress needs to know what information you want to show, and the way you tell WordPress is the URL.
If you visit a URL like http://example.com/?pagename=sample-page, you are telling WordPress you want to see the page “Sample Page”.
That URL is an ugly URL. The pretty version of it is:
A rewrite rule tells WordPress which ugly URL a pretty URL should map to. Rewrite rules do that via regular expressions.
To better understand this process, we can see the function that you have to use in WordPress to add rewrite rules:
This is an example of it in use:
The first argument represents the pretty URL and the second argument is the related ugly URL.
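As a hedged illustration of those two arguments, the sketch below registers a rule for the “Sample Page” example. The pattern and the 'top' position are assumptions chosen for illustration. The WordPress calls are guarded so that the snippet also runs outside WordPress, where it only demonstrates what the regular expression accepts.

```php
<?php
// Pretty side: a regular expression matched against the request path.
$pretty_regex = '^sample-page/?$';
// Ugly side: the query string the match should resolve to.
$ugly_query = 'index.php?pagename=sample-page';

if (function_exists('add_rewrite_rule')) {
    // Inside WordPress: register the rule once the rewrite system is available.
    add_action('init', function () use ($pretty_regex, $ugly_query) {
        add_rewrite_rule($pretty_regex, $ugly_query, 'top');
    });
}

// Outside WordPress we can still check which paths the pattern accepts.
$matches_pretty = preg_match('#' . $pretty_regex . '#', 'sample-page/') === 1;
$matches_other  = preg_match('#' . $pretty_regex . '#', 'other-page/') === 1;
```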
Where do rewrite rules come from?
Even when you don’t add any rules using add_rewrite_rule, WordPress still resolves pretty permalinks.
In fact, a vanilla WordPress installation comes with a set of rewrite rules already set.
They are the rules for:
Some of them can be modified from the WordPress backend in the Settings > Permalinks page.
Any rewrite rule you register via the rewrite API (of which add_rewrite_rule is part) is added to these default rules.
Two traps of rewrite rules
There are two important things that are worth noting:
The consequences of the first point are (sadly) well known. A piece of code that registers post types or taxonomies, or uses the rewrite API, needs to flush the rewrite rules to work.
For a plugin (or theme), flushing may be done by calling flush_rewrite_rules on activation and again on deactivation. This is because the process of flushing is expensive, and can’t be done on every request.
Another way to flush rewrite rules is by visiting the “Settings > Permalinks” page in the WordPress admin, a manual step that isn’t possible via code.
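A minimal sketch of the activation and deactivation approach might look like the following. The myplugin_ function names and the sample rule are hypothetical, and the snippet assumes it lives in the main plugin file, because that is what __FILE__ has to point to. The hook registration is guarded so the file can also be loaded outside WordPress.

```php
<?php
// Hypothetical helper: registers the rewrite rules this plugin needs.
function myplugin_register_rules()
{
    add_rewrite_rule('^sample-page/?$', 'index.php?pagename=sample-page', 'top');
}

// On activation the rules must exist before they are flushed.
function myplugin_activate()
{
    myplugin_register_rules();
    flush_rewrite_rules();
}

// On deactivation, flush again so the rules are rebuilt without this plugin's rule.
function myplugin_deactivate()
{
    flush_rewrite_rules();
}

if (function_exists('register_activation_hook')) {
    add_action('init', 'myplugin_register_rules');
    register_activation_hook(__FILE__, 'myplugin_activate');
    register_deactivation_hook(__FILE__, 'myplugin_deactivate');
}
```

Flushing only in these two callbacks keeps the expensive operation out of normal requests.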
This is annoying, but what the second point in the list above implies is probably worse and may not be immediately clear. Only “simple” query variables can be used in rewrite rules. By “simple”, I mean scalar variables (strings, numbers, booleans), but not arrays, for example.
A challenging example
Let’s assume you want to implement a URL like http://example.com/2016-jan-02-afternoon/ where “afternoon” might instead be “morning”, “evening” or “night”.
Let’s also assume that you want this URL to pull a list of posts published on the day given in the URL, but only in the specific part of the day at the end of the URL.
If you have some experience with WordPress development, you know that a query like that is a job for a date query.
The problem is that a date query has to be set with an array, and as I said above, that’s not possible directly with a rewrite rule.
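To make the problem concrete, here is a sketch of the kind of array such a query needs. The helper name and the hour ranges for each part of the day are assumptions made for illustration, and the helper takes numeric date parts, so parsing “2016-jan-02” out of the URL is deliberately left out.

```php
<?php
/**
 * Hypothetical helper: builds query arguments for one part of one day.
 * The hour ranges are assumptions; WordPress does not define them.
 */
function part_of_day_query_args($year, $month, $day, $part)
{
    $hours = array(
        'night'     => array(0, 5),
        'morning'   => array(6, 11),
        'afternoon' => array(12, 17),
        'evening'   => array(18, 23),
    );

    if (!isset($hours[$part])) {
        return array();
    }

    list($from, $to) = $hours[$part];

    return array(
        'date_query' => array(
            // Clauses are combined with AND by default.
            array('year' => (int) $year, 'month' => (int) $month, 'day' => (int) $day),
            array('hour' => $from, 'compare' => '>='),
            array('hour' => $to, 'compare' => '<='),
        ),
    );
}

$args = part_of_day_query_args(2016, 1, 2, 'afternoon');
```

A nested array like this cannot be expressed in the query string that a rewrite rule points to, which is why extra code is needed.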
How can we implement this feature? Is it even possible? Yes, it is possible, but it requires some work. We need to:
It’s quite a lot of work, and the code below clearly shows how the limitations of rewrite rules can make your life hard:
Even if you don’t understand what every line does, you can surely see that this is a lot of code. Surely more code than one may expect for such a task.
And still we have to flush rewrite rules.
Imagine a better world
Our example implies a bit of logic per se, but WordPress makes things difficult. To resolve URLs to “data” relevant for the application, other frameworks and CMSes implement what is known as a routing system.
Such a system is based on routes, something that maps URLs to an action or controller, rather than URLs to other URLs as WordPress does.
If we had a routing system in WordPress, it should map URLs to arrays of query arguments. Something like:
The fictional code above maps a URL to some query arguments, loading the route in memory. And that’s it, without the need to flush rewrite rules.
…too bad that the add_frontend_route function does not exist.
Is routing something that’s possible in WordPress?
Yes, it is. To understand the how, we need a deeper understanding of what happens when WordPress parses the request.
One important thing to know about is the filter hook "do_parse_request".
It’s fired at the top of the WP::parse_request() method, and if the callbacks hooked there return a false value then the request is not parsed at all.
Our visual overview becomes something like:
It means that when WordPress is instructed to skip request parsing via the "do_parse_request" filter, the main WP_Query is triggered anyway, and everything continues to work as usual, except that no query variable is parsed from the URL.
You may ask: “If no query variable is parsed from the URL, what are the variables that are used?”
The answer is that the variables used are the ones stored in the WP::$query_vars object property.
That object property is initially set to an empty array, so when the "do_parse_request" filter returns false, the main query is run with an empty array as an argument, which results in showing the home page of the website.
If we always return false on the "do_parse_request" filter with:
…WordPress will always show the home page, no matter what URL of the website we visit.
A more interesting experiment is setting the WP::$query_vars variable to something arbitrary and doing a return false on the filter.
To set query vars we could use global $wp, which is the variable that holds the WP class instance, but it’s not needed because the "do_parse_request" filter passes a second argument:
Now, no matter the URL of the website we visit, we will always see an archive of our pages.
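The experiment can be sketched as below. The callback name is hypothetical, and the only thing the callback relies on is that the second argument has a public query_vars property, as the WP instance does. The add_filter call is guarded so the snippet can also be run outside WordPress.

```php
<?php
/**
 * Callback for the "do_parse_request" filter (hypothetical name).
 *
 * @param bool   $do_parse Whether WordPress should parse the request.
 * @param object $wp       The WP instance, passed as the second argument.
 * @return bool
 */
function demo_force_pages_archive($do_parse, $wp)
{
    // Arbitrary query variables: ask for an archive of pages.
    $wp->query_vars = array('post_type' => 'page');

    // Returning false tells WordPress to skip parsing the URL.
    return false;
}

if (function_exists('add_filter')) {
    // 10 is the default priority; 2 is the number of arguments we accept.
    add_filter('do_parse_request', 'demo_force_pages_archive', 10, 2);
}
```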
The experiments above prove that to set main query arguments to something arbitrary, we just need 2 things:
That’s cool; however, we are ignoring the URL, and a real routing system can’t be decoupled from the current URL.
If we want to implement such a thing, we first need to retrieve the current URL, and only after that can we define some routes to compare the URL against.
Retrieving the current URL in WordPress
You may know that in PHP code, to retrieve the current URL we usually look in the $_SERVER superglobal.
We can surely do that, but that couples our code to a global variable that is hard to reproduce in a command line context, such as unit tests.
Surprisingly, WordPress doesn’t have a function to retrieve the current URL, but we can use the add_query_arg() function to get it.
That function is normally used to add query string variables to a URL, but if no URL is passed to it then the current URL is used.
Note: I escaped the URL because it was recently discovered that using add_query_arg() without passing any URL may be a security risk. The risk is averted by escaping the obtained URL.
Anyway, if we visit a URL such as http://example.com/foo/bar/, the function above returns
This is quite fine for our purposes; however, we also need to take care of the case where WordPress is installed in a subfolder.
Let’s assume that WordPress is installed in http://example.com/blog/ and we visit the URL http://example.com/blog/sample-page/. What we want to use in our routing system is
Now we need to strip any URL path that is present in the home URL.
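One possible way to do the stripping is sketched below, as a pure function that can be tried without WordPress. The function name is hypothetical, and returning the path without leading or trailing slashes is a design choice made here, not something WordPress requires.

```php
<?php
/**
 * Hypothetical helper: returns the request path relative to the home URL,
 * without the query string and without leading or trailing slashes.
 *
 * In WordPress it could be called with something like:
 * routing_relative_path(esc_url_raw(add_query_arg(array())), home_url());
 */
function routing_relative_path($request_uri, $home_url)
{
    $request_path = trim((string) parse_url($request_uri, PHP_URL_PATH), '/');
    $home_path    = trim((string) parse_url($home_url, PHP_URL_PATH), '/');

    // Strip the home path only when it is a whole leading segment,
    // so that "blog" does not match a path such as "blogger/post".
    if ($home_path !== '' && strpos($request_path . '/', $home_path . '/') === 0) {
        $request_path = trim((string) substr($request_path, strlen($home_path)), '/');
    }

    return $request_path;
}
```

With this approach a site installed in a subfolder and a site installed in the domain root produce the same relative path for the same page.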
Towards a WordPress routing system
Now that we know how to get the current URL, what we need is a way to add some routes and compare them to the URL.
A clever enough way to add routes could be to provide a hook to register our routes. This is a standard way to do things in WordPress, so anyone interacting with our code will not be confused.
The code may look something like this:
The code above, and the comments in it, should make clear what we are doing.
We add a callback to the "do_parse_request" filter, and inside of it we trigger a filter that lets users add some routes.
After that, if we have some routes, we parse them. Parse means that we compare routes to the current URL to see which one matches.
If a match is found, the matching function (that we are about to write) returns an array of variables that we store in WP::$query_vars before returning false, just like we did earlier.
We can probably figure out different ways to compare a URL to some routes, but the most flexible way is to use regular expressions.
It is a syntax we don’t need to invent, and it is very powerful for enforcing rules and extracting variables from the match. This is also the method that WordPress uses for rewrite rules, so users will be familiar with it; but in contrast to WordPress, we will map URLs to query variables and not URLs to another URL.
Let’s write the code:
The function above uses preg_match to match the route pattern with the URL path.
In case of a match, the $matches array from preg_match is then passed to a callback stored in the route to obtain an array of query arguments.
Finally, the returned arguments are merged with any query variable present in the URL.
The function assumes that $routes is an array where each item key is the regular expression pattern, and the item value is a callback that receives matches from preg_match and returns an array of query arguments.
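A matcher along these lines could be sketched as follows. The exact signature, the "~" delimiter, the null return value for “no match”, and the choice to let route variables win over URL variables when both define the same key are all assumptions of this sketch.

```php
<?php
/**
 * Sketch of a route matcher.
 *
 * @param string $path   URL path relative to the home URL, e.g. "post/latest".
 * @param string $query  Raw query string of the request, e.g. "paged=2".
 * @param array  $routes Map of regular expression pattern => callback.
 * @return array|null    Query variables, or null when no route matches.
 */
function parse_routes($path, $query, array $routes)
{
    foreach ($routes as $pattern => $callback) {
        // "~" is used as delimiter, so patterns must not contain an unescaped "~".
        if (preg_match('~' . $pattern . '~', $path, $matches) !== 1) {
            continue;
        }

        $route_vars = $callback($matches);
        if (!is_array($route_vars)) {
            continue; // The callback declined this URL; try the next route.
        }

        // Merge in the variables found in the URL query string.
        parse_str($query, $url_vars);

        return array_merge($url_vars, $route_vars);
    }

    return null;
}
```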
The missing piece is a way to add such routes.
Earlier we imagined a better world where WordPress had an add_frontend_route function to add routes.
Now we have the chance to write that function:
It’s as simple as that.
In the code we wrote to implement the routing system, we were firing the filter "routing_add_routes" to allow some code to add routes.
The function we just wrote uses that hook to add a route to the routes array, setting the route pattern as the key and the route callback as the value. That’s exactly what the parse_routes function we wrote is expecting.
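Based on that description, the function could be sketched like this. The type declarations are an addition of this sketch, and the function is declared in the global namespace here for brevity. Calling it requires WordPress, because it relies on add_filter.

```php
<?php
/**
 * Registers a route by hooking into the "routing_add_routes" filter.
 *
 * @param string   $pattern  Regular expression matched against the URL path.
 * @param callable $callback Receives the preg_match matches, returns query vars.
 */
function add_frontend_route(string $pattern, callable $callback): void
{
    add_filter('routing_add_routes', function (array $routes) use ($pattern, $callback) {
        // Pattern as key, callback as value: the shape the matcher expects.
        $routes[$pattern] = $callback;

        return $routes;
    });
}
```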
Our routing system is overriding the WordPress way to handle URLs. There are cases when this is problematic.
An example is the WordPress dashboard where our system might break things. We can prevent this by not running our system when is_admin() is true.
However, in WordPress, is_admin() is true also for AJAX requests, and we probably want to allow our system to run on AJAX requests.
The conditional code to allow the system to run would be something like:
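A minimal sketch of that condition, written as a pure function so the logic can be checked in isolation, might be the following. The function name is hypothetical. Inside WordPress the two flags would come from is_admin() and from the DOING_AJAX constant.

```php
<?php
/**
 * Hypothetical helper: decides whether the routing system should run.
 *
 * In WordPress it could be called as:
 * routing_should_run(is_admin(), defined('DOING_AJAX') && DOING_AJAX);
 */
function routing_should_run(bool $is_admin, bool $doing_ajax): bool
{
    // Run on the frontend, and also on AJAX requests even though
    // is_admin() is true for them.
    return !$is_admin || $doing_ajax;
}
```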
Another thing that may conflict with our system is the WordPress canonical redirect. Considering that WordPress does not recognize URLs matched via our system (because we are not registering them using WordPress core features), it may redirect them using the redirect_canonical() function.
That function is hooked in the "template_redirect" hook, so a solution would be to remove that hook when a route is matched.
We are already firing an action, "routing_matched_vars", when a route matches. We can use that hook to remove the canonical redirect.
Putting the pieces together
At this point we have all the pieces to create a routing system in WordPress. We can put them together in a plugin to better reuse it in different websites.
The code below is without comments to save space, but all of the lines of code below were already discussed in this article. It’s also available as a Gist.
This is the whole plugin, and it is everything we need to build our routing system.
How to use the routing plugin
After the plugin is active, using it is quite easy.
The only function you need to interact with is add_frontend_route. You need to provide a first argument, the regular expression pattern that will match URLs, and a second argument that’s a callback that receives the matches array and has to return an array of query vars.
For example, add the following route:
Visiting the URL example.com/post/latest will show the latest 5 posts, just like it will show latest 5 products if you visit
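A route with that behaviour could be sketched as below. The pattern, the callback name and the "product" post type in the usage note are assumptions. If the plugin declares add_frontend_route inside a namespace, the guarded call at the end needs the namespaced name instead.

```php
<?php
// Assumed pattern: "<post type>/latest", with the post type as first capture group.
$latest_pattern = '^([a-z0-9_-]+)/latest$';

/**
 * Route callback: receives the preg_match matches and
 * returns query vars for the 5 most recent items of that post type.
 */
function latest_items_route(array $matches)
{
    return array(
        'post_type'      => $matches[1],
        'posts_per_page' => 5,
        'orderby'        => 'date',
        'order'          => 'DESC',
    );
}

if (function_exists('add_frontend_route')) {
    add_frontend_route($latest_pattern, 'latest_items_route');
}
```

With this route, a path such as post/latest yields post_type "post", while a path such as product/latest would yield post_type "product", assuming such a post type is registered.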
Do you remember the route we wrote under the “imagine a better world” section earlier?
You can use it, and it will work (just remember to add the namespace). It seems that the better world is here.
Room for improvement
We can learn things about URL routing by looking at frameworks and CMSes out there that are already using it.
For example, most routing systems differentiate routes not only based on the URL path, but also on the HTTP method, because a GET request is usually different from a POST request, even when sent to the same URL.
We can also use hosts to differentiate routes, so that api.example.com/foo/bar is considered different from
Moreover, our plugin calls preg_match for every route we add. Regular expression functions are pretty slow, and we can improve the performance of our plugin if we use a library like FastRoute to match our routes to the URL.
Finally, we can improve our plugin by returning WP_Error objects or throwing exceptions when unexpected things happen.
…or maybe just use Cortex
Recently I updated a library of mine, named Cortex.
It implements a routing system using the same concepts I presented in this article, but the actual code is quite different.
In its latest version it uses the FastRoute library and has some additional features: routes differentiated by HTTP method and host, route groups, redirect routes, and more.
In this article we saw how WordPress creates the main query arguments starting from current URL. We discovered that rewrite rules are what WP uses to build query arguments according to the current URL, and saw two annoying issues that affect rewrite rules.
After that, taking inspiration from other software, we imagined a routing system for WordPress that could solve the rewrite rules issues. Finally, step by step, we implemented the system we imagined, giving it the shape of a plugin.