
Google: When indexing pages, status code comes first


Kaitlyn Petro

Director of Operations, 7+ Years Of Digital Marketing Strategy, Project Management, & Process Creation Experience

October 24th, 2019

Google can be tricky; we never know what it’s thinking or what changes it will make. Yet, as digital experts, we’re constantly trying to please the search engine.

Just think, without Google, would search engine optimization even be a thing?

Optimizing our content, websites, and other digital assets for search is nothing new to us. We know Google has algorithms that it uses to push us higher in the rankings and reward us for having optimized content. But what most of us don’t know is that Google actually checks status codes before anything else.

Moz helps us better understand status codes by describing them as “the Internet equivalent of a conversation between your browser and the server. They communicate whether things between the two are A-okay, touch-and-go, or whether something is wrong.”

A recent article on Search Engine Journal features a video of a Webmaster Central Office Hours Hangout session with Google’s John Mueller. In this video, Mueller talks about how Google checks the HTTP status code of a page before rendering or indexing its content. In fact, Google won’t render a page unless the server returns a 200 status code for it. Mueller says Google doesn’t even see the content of 404 pages, and the same goes for 500 errors and redirects.

What does crawling content mean, and why is it important?

When creating content on your website, usually the ultimate goal is for it to show up in search results. In order to have this happen, Google needs to crawl the page and index it.

Crawling refers to what Googlebot does when it requests your pages and looks for healthy, 200-status responses. Indexing describes what happens once Googlebot finds these pages and deems them “okay” to show up in search results.

If any of your pages return a 400-level (4xx) or 500-level (5xx) code, Googlebot will not go any further with that page and will not index its content to show up to searchers. In fact, returning these status codes could actually be detrimental to your website’s technical health.

A 400-level code is a client error: the request could not be fulfilled as it was sent, for example because it was malformed (400 Bad Request) or pointed at a page that doesn’t exist. The most familiar one we see is the 404 (Not Found). This often happens when we type in a URL incorrectly, or we try to go to a page that is no longer published (and doesn’t have a properly set up redirect in place).

If you see a 500 Internal Server Error, this means something on the website's server has gone wrong, but the server could not be more specific about the problem.
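To make the rules above concrete, here is a minimal Python sketch that sorts a status code into the outcomes this article describes. The function name classify_status and the wording of each outcome are our own illustration, not anything Google publishes, and the sketch treats only an exact 200 as indexable because that is the case Mueller describes.

```python
from http import HTTPStatus


def classify_status(code: int) -> str:
    """Return the rough indexing outcome for an HTTP status code.

    The labels are illustrative and follow the article's description,
    not an official Google specification.
    """
    if code == 200:
        return "index candidate"
    if 300 <= code < 400:
        return "redirect: the target page is considered instead"
    if 400 <= code < 500:
        return "client error: not indexed"
    if 500 <= code < 600:
        return "server error: not indexed"
    return "other: not covered here"


# Print the standard reason phrase next to each outcome.
for code in (200, 301, 404, 500):
    print(code, HTTPStatus(code).phrase, "->", classify_status(code))
```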

How to check or audit your page codes

Because of the importance of having fully indexable content, it’s imperative that you know how to check for page errors.

The easiest and quickest way to do this is to use a tool like SEMrush or Screaming Frog.

Both offer similar site audits that can usually be completed in under an hour (depending on how large your site is). The results of the audit will show all status codes of each of your pages, so you can easily pick out which ones need attention.

Alternatively, Google Search Console can generate a report with this information (though without a full site audit), and it’s completely free.

We recommend that an audit like this be done at least two to three times per year.
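For a quick spot check between full audits, a short script can request a handful of URLs and report their status codes. The sketch below is only an illustration using Python's standard library; the helper names (fetch_status, audit, needs_attention) are our own, and it assumes your server answers HEAD requests the same way it answers normal page loads, which is not true of every site.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so 3xx codes stay visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the status code the server sends for a single URL."""
    opener = urllib.request.build_opener(_NoRedirect)
    # Assumption: HEAD returns the same status as a normal GET would.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 3xx, 4xx and 5xx responses arrive here; the code is still useful.
        return err.code


def audit(urls: Iterable[str],
          fetch: Callable[[str], int] = fetch_status) -> Dict[str, int]:
    """Map each URL to its status code."""
    return {url: fetch(url) for url in urls}


def needs_attention(results: Dict[str, int]) -> Dict[str, int]:
    """Keep only the URLs that did not return a plain 200."""
    return {url: code for url, code in results.items() if code != 200}
```

Calling needs_attention(audit(your_urls)) with a list of your own page URLs would return only the pages worth a closer look. Connection failures (for example, a domain that doesn't resolve) are not handled here and will raise an error.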

Fixing error codes

Depending on the cause of the error, you may need to try a variety of things to rectify it. A 200 status code is the goal, and there is nothing you need to fix on these pages. A 300-level (3xx) redirect code doesn’t require any fixing, either, but it’s good to know that Google will not index the redirecting page; it will look to index the page you’ve redirected it to.

For 400-level errors, check the status of the page. Was it once published but is now unpublished? If so, you’ll have to set up a redirect to a live page.
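If your audit turns up more than a few of these, it can help to plan the redirects in one pass. The sketch below is a hypothetical helper, not part of any tool mentioned here: it assumes you already have a dictionary of paths and their status codes, plus a hand-written dictionary of which live page should replace each retired one. It only proposes a redirect when the replacement itself returns 200.

```python
from typing import Dict, List, Tuple


def plan_redirects(
    statuses: Dict[str, int],
    replacements: Dict[str, str],
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split 4xx pages into planned redirects and pages still unresolved."""
    redirects: List[Tuple[str, str]] = []
    unresolved: List[str] = []
    for url, code in statuses.items():
        if not 400 <= code < 500:
            continue  # only client errors need a redirect decision
        target = replacements.get(url)
        # Redirect only to a page that is known to be live.
        if target is not None and statuses.get(target) == 200:
            redirects.append((url, target))
        else:
            unresolved.append(url)
    return redirects, unresolved


# Example data (made up for illustration).
statuses = {"/old-pricing": 404, "/pricing": 200, "/gone": 410, "/blog": 200}
replacements = {"/old-pricing": "/pricing"}
redirects, unresolved = plan_redirects(statuses, replacements)
print(redirects)   # pages with a live replacement
print(unresolved)  # pages that still need a decision
```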

For 500-level errors, you may have to do some digging. These types of codes could be due to a permissions error, a PHP timeout, or a coding error in .htaccess. In these instances, your CMS and your web hosting provider will probably have documented help articles or resources you can use to better understand what’s going on and how to come to a resolution.

What does this mean for marketers?

The lesson here is simple: as we pay attention to both on-page and technical SEO elements that affect our site ranking, we need to put a special focus on HTTP status codes.

Conducting site audits regularly and ensuring proper redirects are put into place during page changes should be prioritized.

The most important thing to remember is that if Google recognizes too many error codes on your pages, it can devalue your site. This can decrease the amount of traffic you receive which, in turn, can destroy opportunities for additional business.
