Transferring web hosting raised just one really significant issue. That was due to the new hosting server running Apache 2 with mod_security. This appears to have an unfortunate quirk of being unhappy with most European accented letters in URLs, causing an HTTP 406 error.
With the increasing number of attacks on websites, having mod_security enabled is a good thing. However, this was giving me a real problem, as many URLs are generated from plant names in the database and legitimately have accented characters. A bit of web-searching showed that many other people had hit the same problem.
First thing was to check the URL format – yes, they were already being correctly made using the PHP ‘urlencode’ function. But the web server still didn’t like them!
The most commonly suggested solution was to control the action of mod_security by commands in the .htaccess file for the site. This works for older versions of Apache but not for Apache 2. Instead it would need a configuration change by the server admin – not a good option on shared hosting.
The fix turned out to be quite simple and just involved a little extra PHP code: once the URL string has been passed through urlencode, all ‘%’ signs are changed to an arbitrary 3-character string. Apache then does not see them as significant encoded characters. The reverse change is then done on the target pages for the URLs. The substitution string can be anything that will never occur in the real name string.
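As a rough illustration of that substitution, here is a minimal sketch rather than the exact code used on the site. The marker string ‘zqz’ and both function names are made up for the example; any marker will do as long as it can never appear in a real plant name.

```php
<?php
// Hypothetical 3-character marker standing in for '%'.
// Assumption: this string never occurs in a real plant name.
const PERCENT_MARKER = 'zqz';

// Build the URL-safe token: urlencode first, then hide every '%' sign.
function encode_name_for_url(string $name): string
{
    return str_replace('%', PERCENT_MARKER, urlencode($name));
}

// Reverse step for the target page: restore the '%' signs, then decode.
function decode_name_from_url(string $token): string
{
    return urldecode(str_replace(PERCENT_MARKER, '%', $token));
}

// "\u{00EB}" is the letter e with diaeresis, written as an escape so the
// example does not depend on the encoding of this file.
$token = encode_name_for_url("Alo\u{00EB} vera");
// $token is now 'AlozqzC3zqzAB+vera', with no '%' left in it
$name = decode_name_from_url($token);
// $name is the original string again
```

The order matters on the way back: the marker has to be turned into ‘%’ before urldecode runs, otherwise there is nothing for it to decode.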
In this particular case there were also often images associated with the pages, with the same names. So an additional piece of code was needed to handle those, renaming the files on the server when first accessed.
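The image handling could look something like the sketch below. It is only a guess at the general shape: the function name, the ‘zqz’ marker, the default ‘.jpg’ extension and the assumption that the original files are stored under the plain plant name are all invented for the example.

```php
<?php
// Hypothetical marker standing in for '%', as for the page URLs.
const IMAGE_PERCENT_MARKER = 'zqz';

// Return the URL-safe file name for a plant image. The first time an image
// is requested, the file still has its original (accented) name, so it is
// renamed on disk; later requests find the safe name already in place.
// Assumption: images are stored as "<plant name>.<ext>" inside $dir.
function safe_image_name(string $dir, string $name, string $ext = 'jpg'): string
{
    $safe = str_replace('%', IMAGE_PERCENT_MARKER, urlencode($name)) . '.' . $ext;
    $original = $dir . '/' . $name . '.' . $ext;
    $target = $dir . '/' . $safe;

    // Rename only when the safe file is missing and the original exists.
    if (!file_exists($target) && file_exists($original)) {
        rename($original, $target);
    }
    return $safe;
}
```

Doing the rename lazily means there is no need for a separate batch job over the whole image folder; each file is fixed the first time its page is viewed.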
Not rocket science, but effective! No more 406 errors.