DiggingIntoWordPress

by Chris Coyier & Jeff Starr

6 .htaccess Tricks for Better WP SEO & Security


Here are six .htaccess tricks that will help improve the security and SEO quality of your WordPress-powered site. We do this using .htaccess to establish canonical URLs for key peripheral files, such as your robots.txt, favicon.ico, and sitemap.xml files. Canonicalization keeps legitimate bots on track, cuts down on wasteful requests from malicious ones, and ensures a better user experience for everyone. On the menu:

  1. Canonical robots.txt
  2. Canonical Favicons
  3. Canonical Sitemaps
  4. Canonical Category, Tag & Search URLs
  5. Canonical Feeds
  6. Simpler Login URL

Before making changes..

The .htaccess code in this post is designed to work when placed in the web-accessible root .htaccess file of your domain. Before making any changes to this file, make a good backup and keep it on hand just in case. Working with .htaccess is nothing to be afraid of, but it’s critical to not make any mistakes in syntax, spelling, or anything that’s not a comment (#). If you forget a dot working with CSS, your design might look messed up. If you forget a dot working with .htaccess your server will return a 500 – Internal Server Error. If this happens, don’t panic, just upload your backup and everything will be fine.

1. Canonical robots.txt

Help bots and visitors find your robots.txt file no matter what. Given that the robots.txt file should always be located in the root directory, you would think that this wouldn’t be an issue. Unfortunately, bad bots and malicious scripts like to scan for robots.txt files everywhere. Fortunately, this .htaccess snippet eliminates the nonsense by directing any request for “robots.txt” to the actual file in your root directory. If you’re sick of seeing endless requests for nonexistent robots files, this code’s for you:

# CANONICAL ROBOTS.TXT
<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteBase /
 # skip the real root file, otherwise the redirect would loop
 RewriteCond %{REQUEST_URI} !^/robots\.txt$ [NC]
 RewriteCond %{REQUEST_URI} robots\.txt [NC]
 RewriteRule .* http://example.com/robots.txt [R=301,L]
</IfModule>

To use this code, replace example.com with your own domain and add it to the .htaccess file in your web-accessible root directory. Together, these directives redirect every request for any “robots.txt” file except the actual robots.txt in your root directory. If you need to whitelist a similarly named file, just include the following line beneath the first RewriteCond, replacing the path and file name with your own:

RewriteCond %{REQUEST_URI} !/wordpress/robots.txt$ [NC]

Alternate Method: Instead of using Apache’s rewrite module, we can do the same thing with less code using mod_alias:

RedirectMatch 301 ^/(.*)/robots\.txt http://example.com/robots.txt

Either method is effective, but mod_alias is optimal for typical sites with only one robots.txt file. If you have multiple robots.txt files, the mod_rewrite method will enable complete granular control from the root .htaccess file.
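For a site that does keep more than one robots.txt, the whitelist condition slots into the mod_rewrite block roughly as in the sketch below. The /wordpress/ path and example.com are placeholders for illustration, so swap in whatever your own setup uses:

```apache
# CANONICAL ROBOTS.TXT, with one whitelisted copy (sketch)
# Assumptions: example.com is your domain; /wordpress/robots.txt is a
# hypothetical second robots file that should stay reachable.
<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteBase /
 # leave the real root file alone (this also prevents a redirect loop)
 RewriteCond %{REQUEST_URI} !^/robots\.txt$ [NC]
 # leave the whitelisted copy alone
 RewriteCond %{REQUEST_URI} !^/wordpress/robots\.txt$ [NC]
 # any other request ending in robots.txt goes to the root file
 RewriteCond %{REQUEST_URI} robots\.txt$ [NC]
 RewriteRule .* http://example.com/robots.txt [R=301,L]
</IfModule>
```

Add one negated RewriteCond per file you want to keep; consecutive conditions are ANDed, so a request is redirected only if it matches none of the exceptions.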

2. Canonical Favicons

Avatars, gravatars, and favicons are a big hit with malicious scanners. Evil scripts like to traverse your directory structure with requests for commonly used images such as the ubiquitous favicon.ico. So just as with robots.txt, we can stop the madness and redirect any request for “favicon.ico” to the actual file in your root directory.

# CANONICAL FAVICONS
<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteBase /
 RewriteCond %{REQUEST_URI} !^/favicon.ico$ [NC]
 RewriteCond %{REQUEST_URI} /favicon(s)?\.?(gif|ico|jpe?g?|png)?$ [NC]
 RewriteRule (.*) http://example.com/favicon.ico [R=301,L]
</IfModule>

To use this code, replace example.com with your own domain and add it to the .htaccess file in your web-accessible root directory. Together, these directives redirect every request for a file named “favicon” or “favicons”, with or without a .gif, .ico, .jpg, .jpeg, or .png extension. To prevent an infinite request loop, the code includes an exception for the actual, root “favicon.ico” file. If you need to whitelist a similarly named file, just include the following line beneath the first RewriteCond, replacing the path and file name with your own:

RewriteCond %{REQUEST_URI} !/images/favicons.png$ [NC]

3. Canonical Sitemaps

Just as idiotic bots can’t seem to find your robots and favicon files, so too are they clueless when it comes to finding your sitemap, even when declared in the robots.txt file. Nefarious bots will ignore your robots suggestions and hammer your site with ill-conceived requests for nonexistent sitemaps. This malicious behavior chews up system resources and wastes bandwidth. To eliminate the waste while helping stupid bots find your sitemap, slip this into your root .htaccess:

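# CANONICAL SITEMAPS
<IfModule mod_alias.c>
 # match only sitemaps requested below the root, so the real files are not redirected to themselves
 RedirectMatch 301 ^/(.*)/sitemap\.xml$ http://example.com/sitemap.xml
 RedirectMatch 301 ^/(.*)/sitemap\.xml\.gz$ http://example.com/sitemap.xml.gz
</IfModule>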

To use this code, edit each rule with your own domains and file paths. The first rule redirects all requests for your regular, uncompressed sitemap, and the second rule redirects all requests for your compressed (gzipped) sitemap. These rules are independent of each other, so feel free to remove either to suit your needs.
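If you run several sitemaps (say, separate ones for video and images), one possible approach is a single rule that captures the file name and reuses it in the target. This is only a sketch: example.com and the sitemap-something.xml naming scheme are assumptions, so adjust the pattern to the names your sitemap generator actually produces.

```apache
# CANONICAL SITEMAPS, several files in one rule (sketch)
# Assumptions: example.com is your domain; the sitemaps are named
# sitemap.xml, sitemap-video.xml, sitemap-images.xml, and so on
# (hypothetical names), optionally gzipped.
<IfModule mod_alias.c>
 # ^/.+/ requires at least one directory before the file name,
 # so sitemaps requested at the root are never redirected
 RedirectMatch 301 ^/.+/(sitemap(-[a-z]+)?\.xml(\.gz)?)$ http://example.com/$1
</IfModule>
```

Here $1 is the captured file name, so a request for /some/dir/sitemap-video.xml would land on /sitemap-video.xml in the root.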

4. Canonical Category, Tag, and Search URLs

Out of the box, WordPress returns a 404 – Not Found for the following URLs:

  • http://your-domain.tld/blog/tag/
  • http://your-domain.tld/blog/search/
  • http://your-domain.tld/blog/category/

These are commonly requested URLs that may be leaking valuable page rank. As explained previously, you can use the following slice of .htaccess to redirect these URLs to your home page (or anywhere else):

# CANONICAL URLs
<IfModule mod_alias.c>
 RedirectMatch 301 ^/tag/$      http://example.com/
 RedirectMatch 301 ^/search/$   http://example.com/
 RedirectMatch 301 ^/category/$ http://example.com/
</IfModule>

To use, place the previous code into your root .htaccess file and replace each example.com with the desired redirect location. Alternatively, if WordPress is installed in a subdirectory, use this code instead:

# CANONICAL URLs
<IfModule mod_alias.c>
 RedirectMatch 301 ^/blog/tag/$      http://example.com/
 RedirectMatch 301 ^/blog/search/$   http://example.com/
 RedirectMatch 301 ^/blog/category/$ http://example.com/
</IfModule>

Again, edit the examples with your actual URLs. A good place to redirect any juice or traffic from these three URLs is the home page. Hopefully future versions of WordPress will handle these redirects internally, but until then, this bit of .htaccess is a simple and effective solution.
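If all three URLs go to the same place, the rules can be folded into one with a regex alternation. A minimal sketch, assuming WordPress is installed in the site root and example.com stands in for your domain:

```apache
# CANONICAL URLs, single-rule variant (sketch)
# Assumptions: root install; example.com is your domain.
<IfModule mod_alias.c>
 # matches exactly /tag/, /search/, and /category/, nothing deeper
 RedirectMatch 301 ^/(tag|search|category)/$ http://example.com/
</IfModule>
```

For a subdirectory install, the same idea applies with the directory added to the pattern, for example ^/blog/(tag|search|category)/$. Keep the three separate rules if you want each URL to go somewhere different.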

5. Canonical Feeds

WordPress generates many different feeds for your site. The default feed configuration works great, but it’s worth including in this post a couple of useful .htaccess tricks:

  • Set up canonical feed types (e.g., deliver only RSS2 feeds and redirect other formats)
  • Set up FeedBurner feeds and redirect associated feed requests

Each of these techniques is easily achieved with a little .htaccess. Better grab a beverage..

Set up canonical feeds

WordPress provides feeds in four different formats: Atom, RDF, RSS, and RSS2. This variety was useful in the past to accommodate different apps and devices, but these days I think it’s safe to say that most readers and devices can handle any format you throw at them. So instead of having 4x the feeds, you can consolidate your feeds into a single format:

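# CANONICAL FEEDS
<IfModule mod_alias.c>
 # comments rule goes first: the unanchored /feed/ pattern below also matches /comments/feed/ URLs,
 # and mod_alias applies the first rule that matches
 RedirectMatch 301 /comments/feed/(atom|rdf|rss|rss2)/?$ http://example.com/comments/feed/
 RedirectMatch 301 /feed/(atom|rdf|rss|rss2)/?$ http://example.com/feed/
</IfModule>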

This code redirects requests for alternate feed formats to your canonical choice, for both the main-content feed and the all-comments feed. To use this code, replace both of the target URLs with your own and place it in the .htaccess file of your web-accessible root directory.

Redirect feeds to FeedBurner

Redirecting feeds to FeedBurner is another useful .htaccess snippet to have in your tool belt. Here are two snippets to do the job – one for main-content and another for all-comments feeds:

# REDIRECT to FEEDBURNER
<IfModule mod_rewrite.c>
 RewriteEngine On

 # main-content feed
 RewriteCond %{REQUEST_URI} ^/feed/ [NC]
 RewriteCond %{HTTP_USER_AGENT} !(FeedBurner|FeedValidator) [NC]
 RewriteRule .* http://feeds.feedburner.com/mainContentFeed [L,R=302]

 # all-comments feed
 RewriteCond %{REQUEST_URI} ^/comments/feed/ [NC]
 RewriteCond %{HTTP_USER_AGENT} !(FeedBurner|FeedValidator) [NC]
 RewriteRule .* http://feeds.feedburner.com/allCommentsFeed [L,R=302]
</IfModule>

The first block redirects all requests for your main-content feed, and the second block handles your all-comments feed. So to use, just replace the mainContentFeed and allCommentsFeed with the URLs of your associated FeedBurner feeds. You may also redirect special category feeds by replicating the pattern with another block of code:

 RewriteCond %{REQUEST_URI} ^/category/wordpress/feed/ [NC]
 RewriteCond %{HTTP_USER_AGENT} !(FeedBurner|FeedValidator) [NC]
 RewriteRule .* http://feeds.feedburner.com/specialCategoryFeed [L,R=302]

As before, just replace the URL of your FeedBurner feed in the RewriteRule and test accordingly.
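One thing to watch when testing: placement within the file matters. WordPress’s own permalink rules send every request that isn’t a real file or directory to index.php and then stop processing, so feed redirects placed after them will most likely never fire. The sketch below shows the ordering, assuming the stock WordPress permalink block for a root install; your own block may differ, and mainContentFeed is still a placeholder.

```apache
# Custom feed redirect goes first (sketch)
# Assumption: mainContentFeed is a placeholder for your FeedBurner feed name.
<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{REQUEST_URI} ^/feed/ [NC]
 RewriteCond %{HTTP_USER_AGENT} !(FeedBurner|FeedValidator) [NC]
 RewriteRule .* http://feeds.feedburner.com/mainContentFeed [L,R=302]
</IfModule>

# Stock WordPress permalink rules follow (assumed root install)
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```

Keeping custom rules outside the BEGIN/END WordPress markers also means WordPress won’t overwrite them when it regenerates its own block.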

6. Simpler Login URL

Lastly, a cool .htaccess trick for a better user-login experience. A single line of code placed in your root .htaccess file is all it takes:

RewriteRule ^login$ http://example.com/wp-login.php [NC,L]

Edit the example.com with the domain/path of your WordPress installation. Now instead of typing “wp-login.php”, we just type “login” and whoop, there it is.
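If you would rather have the browser’s address bar switch to the real login page, a variant with an explicit redirect is one way to do it. This is a sketch, assuming WordPress lives in the site root so that the login page is at /wp-login.php:

```apache
# SIMPLER LOGIN URL, explicit redirect variant (sketch)
# Assumption: root install, so the login page is /wp-login.php.
# Place this above the WordPress permalink rules so it gets a chance to run.
<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteBase /
 # /login or /login/ (any letter case) sends the browser to the login page
 RewriteRule ^login/?$ /wp-login.php [NC,R=302,L]
</IfModule>
```

A 302 is used here so that browsers don’t cache the shortcut permanently while you are still experimenting; switch to R=301 once you are happy with it.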

You could also use RedirectMatch to enable a login URL switch:

RedirectMatch 301 \@admin http://example.com/wp-admin

The @ prevents the rule from bothering anything else, and RedirectMatch picks up on the switch from anywhere in the site, so regardless of what page you’re on, you just append “@admin” to the URL and you’re there. I use this trick at perishable.biz if you want to see it work. Check the comments of Chris’ original post for good discussion and some more great ideas.

20 Responses

  1. Using only feedburner, will look forward to apply all of these tricks now.
    Thanks a lot for sharing

  2. John P. Bloch March 22, 2011

    Nice little list of hacks! I’m definitely going to use that favicon trick.

    One thing about the feeds, though: your suggestion will disable certain feeds that you might want to keep. For example, every category and tag has its own feed, as do monthly archives, yearly archives, author-based archives, and as of 3.1, custom post types can have their own archive-based feeds too. Adding a caret to the beginning of the match regex fixes this:

    # CANONICAL FEEDS
    <IfModule mod_alias.c>
    RedirectMatch 301 ^/feed/(atom|rdf|rss|rss2)/?$ http://example.com/feed/
    RedirectMatch 301 ^/comments/feed/(atom|rdf|rss|rss2)/?$ http://example.com/comments/feed/
    </IfModule>

    • Absolutely that is another way of doing it. In the post, I recommend redirecting everything and then making exceptions with something like:

      RewriteCond %{REQUEST_URI} ^/category/wordpress/feed/ [NC]
      RewriteCond %{HTTP_USER_AGENT} !(FeedBurner|FeedValidator) [NC]
      RewriteRule .* http://feeds.feedburner.com/specialCategoryFeed [L,R=302]

      Or you can just exclude any feed and not redirect it anywhere:

      RewriteCond %{REQUEST_URI} ^/feed/ [NC]
      RewriteCond %{REQUEST_URI} !^/category/wordpress/feed/ [NC]
      RewriteCond %{HTTP_USER_AGENT} !(FeedBurner|FeedValidator) [NC]
      RewriteRule .* http://feeds.feedburner.com/mainContentFeed [L,R=302]

      Lots of options! :)

  3. Justin Givens March 22, 2011

    Awesome information. I like the FeedBurner section a lot!

  4. Great ! Always good insights here at digwp.com.

    Thanks so much!

  5. TheKoolDots March 22, 2011

    I was concerned about the SEO aspect.
    Wow, even more stuff to learn.

    Thanks,

  6. Robert Jakobson March 23, 2011

    So what should one do if one has multiple XML sitemaps, for example for video, images, pages, etc.? Should I just duplicate the RedirectMatch 301 line?

  7. I am a little bit confused. Some of these features are now available in WordPress themes. So which is better: .htaccess hacks, or the same thing implemented in a theme?

    • Use the themes if you prefer, otherwise .htaccess is an option. All of the htaccess code is standard stuff, so they’re not really hacks, but if you feel better just using/customizing themes, then that’s the way for you.

      I would be interested in checking out a theme that does everything in this post. Can you provide one?

  8. Quick question….
    I have several Networked WP blogs set up using the blogs.dir approach so how would these be reconfigured to pick up the subdomains of the blogs?
    For example… I have my main domain at http://smallplotgardens.com and then a few subdomains such as http://smallplotgardens.com/raisedgardenbeds/ Would setting the feed URL to default to the main domain ‘break’ the feeds coming from the subdomain blogs?

    • You should be able to use the last bit of code there in the redirect-feed section of the post to prevent unwanted redirection of the subdirectories you mention. Just place your smallplotgardens.com redirect in the root directory and include this rule:

      RewriteCond %{REQUEST_URI} !^/raisedgardenbeds/feed/ [NC]

      Replicate that line for any additional exceptions. You should also be able to redirect your subdirectory feeds to FeedBurner or wherever with something like this:

      RewriteCond %{REQUEST_URI} ^/raisedgardenbeds/feed/ [NC]
      RewriteCond %{HTTP_USER_AGENT} !(FeedBurner|FeedValidator) [NC]
      RewriteRule .* http://feeds.feedburner.com/raisedgardenbeds [L,R=302]

      And then you’d just edit the redirect-to URL in the last line.

  9. I have a question, not relating to this post. I want to know: if I delete some posts, will the images attached to those posts also get deleted automatically from the wp-content/uploads folder? Or will I need to delete them manually? How does WordPress behave in this regard?

    • Good question – you should run a test and find out ;)

      • I did it, the images remained, I had to delete them manually. But there’s a link to ‘unattached images’ in the Media menu. That link shows all images which aren’t attached to any post – I deleted from there. It worked, I didn’t have to login with FTP.

        I think this could be a feature suggestion for WordPress. Deleting a post should also delete its attached images. Otherwise the images are orphaned, and unsuspecting users may not know they’re using extra space on their hosting server.

  10. I copied the canonical sitemap lines and I’m getting the error: “Firefox has detected that the server is redirecting the request for this address in a way that will never complete.”

    I’m guessing it’s redirecting “http://site.com/sitemap.xml” back to the same URL. Is there an Apache setting I need to change to fix the circular reference?

    • Hey Drew, try this instead:

      RedirectMatch 301 ^/(.*)/sitemap\.xml$ http://example.com/sitemap.xml
      RedirectMatch 301 ^/(.*)/sitemap\.xml\.gz$ http://example.com/sitemap.xml.gz

      Just edit the example.com to match your own. The logic here is that you only need redirect requests for non-existent sitemaps, which is basically anything other than root.

  11. hi, can i just clarify (sorry if im way behind, just starting to read your book), this part:

    “It was just a simple .htaccess redirect of the wp-config.php file (only for use on NON WordPress sites), but that name was too awesome not to publish.”

    has it been decided whether or not a 301 redirect via .htaccess is ok for wp sites?

    “PHP loads the file (any include for that matter) locally, right from the file system. Apache has nothing to do with that. It’s no problem to use this even if you do run WordPress.” – http://css-tricks.com/snippets/htaccess/shock-teenage-gangsters-with-wp-config-redirect/

    • Sure, it’s perfectly safe to use .htaccess redirects with WordPress (in general), and even for the wp-config.php, it’s fine to protect/redirect using 301, 302, or any other type of redirect.

      I encourage you to see this for yourself on a test site by adding the redirect code to your root htaccess and then checking to see if your site still works. If the config file is blocked, the site won’t work.

      As is mentioned in the thread, WP reads wp-config.php at the PHP level, which happens after Apache does its thing.

Comments are closed. Contact us with any critical information. Thank you!
