Just So You Know

I'm not an owner or a partner in any of the companies reviewed on this site. However, some of the links on this site are affiliate links and I do get a commission if you purchase a product using these links.

Here's how to bypass my affiliate links for anybody who would like to.

Jarom Adair
IMFBO.com







Thwarting the Search Engines


Sometimes you want to post information on your site that is private in nature. You want certain people to be able to access it, but you don’t want search engines to make it public for the rest of the world to see. How do you control what gets indexed and what doesn’t?

When you don’t want something searched by spiders, there are several ways you can tell them to scram.

Things that will stop search engines

Here are things that will keep your information away from search engines. There are also a few things that won’t (covered below) that you should be aware of.

Password protected pages

Any site where your visitors have to enter a username and a password to access the information will not be available to search engines. Search engines would have to enter a username and password just like any regular visitor.

If you haven’t entered a username and password and you go to any one of the protected pages on my site, you get booted back to the login screen. If your protected pages do the same, your content is protected from search engines.
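
To make that concrete, here's a minimal Python sketch of what "booted back to the login screen" can look like on the server side. Everything in it is an assumption for illustration: the /members/ folder, the "session" cookie name, the VALID_SESSIONS set, and the request helper are all made up, and this isn't how my site or any particular membership script does it. The point is only that the check runs on every request for protected content, so a spider with no login gets the redirect and never sees the page.

```python
# Hypothetical session IDs that a login form would have issued.
VALID_SESSIONS = {"abc123"}


def get_session_id(environ):
    """Return the value of the 'session' cookie, or None if it is missing."""
    cookie_header = environ.get("HTTP_COOKIE", "")
    for pair in cookie_header.split(";"):
        name, _, value = pair.strip().partition("=")
        if name == "session":
            return value
    return None


def app(environ, start_response):
    """Tiny WSGI app: anything under /members/ requires a valid session."""
    path = environ.get("PATH_INFO", "/")
    if path.startswith("/members/") and get_session_id(environ) not in VALID_SESSIONS:
        # No valid login: send the visitor (or spider) to the login screen.
        start_response("302 Found", [("Location", "/login")])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"page content"]


def request(path, cookie=None):
    """Call the app directly, the way a web server would, and collect the reply."""
    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    if cookie:
        environ["HTTP_COOKIE"] = cookie
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

Calling request("/members/secret.html") with no cookie plays the part of a spider, and calling it with cookie="session=abc123" plays the part of a logged-in member.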

Robots.txt

You can tell spiders what they’re allowed to search and what they aren’t by creating a “robots.txt” file and placing this file in your root directory (root directory = the main folder where your home page (index file) is located).

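Some search engines ignore robots.txt files, but the major search engines will follow them. Keep in mind that the file is a request, not a lock: it only works on spiders that choose to obey it.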

Here’s an example robots.txt file (you could copy the rule lines in this box below, leaving out the translation notes, call the file “robots.txt”, put it in the main folder of your web site, and spiders would behave as described in the translations provided):

User-agent: *
Disallow: /*.pdf
Disallow: /really_bad_poetry/
Disallow: /gradeschool_stories/getting_peed_on_by_a_5th_grader.html
  • –Translation: For all spiders, don’t index any pdf files (the * wildcard is understood by the major search engines, though not necessarily by every spider), don’t search files in the “really_bad_poetry” folder, and don’t index “getting_peed_on_by_a_5th_grader.html” in the “gradeschool_stories” folder (true story, and not one I care to share with the world).
User-agent: googlebot
Disallow: /
  • –Translation: Google’s spider only: don’t index anything on this site. Yahoo, MSN, etc… you’re still welcome to search whatever the first set of rules allows (I’m not sure why you would do this, but you can).
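
If you'd like to check how a well-behaved spider would read rules like these, here's a small sketch using Python's standard urllib.robotparser module. A few assumptions to flag: www.yoursite.com is a placeholder, "Slurp" is just a stand-in name for any spider that isn't Google's, and the build_parser and is_allowed helpers are my own names. I've left the pdf rule out on purpose, because this particular parser matches plain path prefixes and doesn't understand the * wildcard.

```python
from urllib.robotparser import RobotFileParser

# The example rules from above, minus the wildcard pdf rule
# (urllib.robotparser only does plain prefix matching).
ROBOTS_TXT = """\
User-agent: *
Disallow: /really_bad_poetry/
Disallow: /gradeschool_stories/getting_peed_on_by_a_5th_grader.html

User-agent: googlebot
Disallow: /
"""


def build_parser(text):
    """Load robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, agent, path):
    """Would a spider calling itself `agent` be allowed to fetch `path`?"""
    return parser.can_fetch(agent, "http://www.yoursite.com" + path)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    for agent, path in [
        ("Slurp", "/index.html"),
        ("Slurp", "/really_bad_poetry/ode.html"),
        ("Googlebot", "/index.html"),
    ]:
        print(agent, path, is_allowed(rules, agent, path))
```

This only tells you what an obedient spider would do. A spider that ignores robots.txt can still fetch anything the file lists.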

Meta tags

Here’s some code you can put on any single page on your web site to tell spiders what to do with that particular page (this goes between the <head> and </head> tags at the beginning of the page):

<meta name="robots" content="noindex, nofollow">
  • –Translation: “noindex” = don’t list this page in search results, and “nofollow” = don’t follow any of the links on this page.

You can use these meta commands in different combinations as well: “noindex, follow”, “index, nofollow” etc…
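
To double check that a page really carries the tag you think it does, here's a rough sketch that pulls the robots directives out of a page's HTML with Python's standard html.parser module. The RobotsMetaParser class and robots_directives function are names I made up for this example, and the sketch only looks at the meta tag, nothing else a spider might consider.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value can be None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html_text):
    """Return the set of robots directives found in html_text."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(robots_directives(page))
```

An empty set means the page has no robots meta tag, so spiders will fall back to their default behavior for it.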

Things that won’t stop search engines

There are a few things that might stop humans from finding certain information, but search engines still seem to find a way around them.

Simple password pages

It’s easy to create a very simple page that requires that someone give you a password to continue. For example, if someone goes to www.yoursite.com/first_page.html and they enter the password, they go to www.yoursite.com/content.html.

If people can bypass the www.yoursite.com/first_page.html page by typing in www.yoursite.com/content.html and see the content on www.yoursite.com/content.html just fine, then search engines can (and eventually will) do the same thing and list your “password protected” content in their search results.

A simple password page like this is often built with JavaScript, and there’s probably more than one way to build one (I’m sure most programming languages can do it). However it’s built, if the content page itself doesn’t check the password, it doesn’t afford true protection from search engines.
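
A quick way to test your own pages is to do what a spider does: ask for the content URL directly, with no password and no cookies, and see what comes back. Here's a rough Python sketch of that check using the standard urllib module. The fetch and is_exposed names are mine, the URLs are the placeholder ones from above, and the test is deliberately crude: it treats "got a 200 at the same address I asked for" as exposed, so a login form served at that same address would fool it.

```python
import urllib.error
import urllib.request


def fetch(url):
    """Request url with no cookies or password; return (status, final_url)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            # urlopen follows redirects, so geturl() is where we ended up.
            return response.status, response.geturl()
    except urllib.error.HTTPError as err:
        # 401, 403, 404 and similar responses arrive as exceptions.
        return err.code, url


def is_exposed(content_url, fetcher=fetch):
    """True if an anonymous visitor gets the content page itself."""
    status, final_url = fetcher(content_url)
    return status == 200 and final_url == content_url


if __name__ == "__main__":
    url = "http://www.yoursite.com/content.html"  # placeholder address
    print("exposed" if is_exposed(url) else "not directly reachable")
```

The fetcher argument is only there so you can swap in a fake for testing without touching the network.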

Capture/squeeze pages

It’s common to create a capture page that requires someone to give you their email address to get access to special information. People give you their email address because they want the information.

Like the simple password pages, if people can bypass the www.yoursite.com/first_page.html page by typing in www.yoursite.com/content.html and see the content on www.yoursite.com/content.html just fine, then search engines will eventually find your “email required” content and make it public.

These capture/squeeze pages that require an email address to continue might stop a normal human from continuing, but don’t afford true protection from search engines.

Not linking to the page

Search engines usually find your web site from other sites that link to you. Once they get to your web site, they simply go from page to page on your site and look at everything you’ve got available.

You might think that if you put up a page and you don’t link to it from your main site, search engines won’t find it. For example, if you put www.yoursite.com/content.html on your site, but none of the pages on your existing site link to it, you’d think that search engines wouldn’t find it.

Just like with JavaScript passwords, if a human could somehow get to the content, then so can a search engine.

And search engines will eventually find that content. Either someone will link to that content from their web site, or you’ll send the link www.yoursite.com/content.html out to someone in an email and it will end up on a page somewhere online or in a conversation between two people on a forum where search engines find it.

How will they find it? It’s impossible to tell. But they will.
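
Here's a toy model of why the unlinked page gets found anyway. It treats the web as a simple table of "this page links to those pages" and lets a pretend spider follow links from wherever it starts. The page addresses, including the forum thread, are invented for the example, and real spiders are far more elaborate than this, but the basic idea holds: one link from anywhere is enough.

```python
from collections import deque


def crawl(links, start_pages):
    """Return every page reachable by following links from start_pages."""
    seen = set(start_pages)
    queue = deque(start_pages)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Your site: nothing links to content.html.
WEB = {
    "www.yoursite.com/": ["www.yoursite.com/about.html"],
    "www.yoursite.com/about.html": ["www.yoursite.com/"],
    "www.yoursite.com/content.html": [],
}

# The same web after someone pastes your link into a (made-up) forum thread.
WEB_WITH_FORUM = dict(WEB)
WEB_WITH_FORUM["forum.example.com/thread"] = ["www.yoursite.com/content.html"]

if __name__ == "__main__":
    before = crawl(WEB, ["www.yoursite.com/"])
    after = crawl(WEB_WITH_FORUM, ["www.yoursite.com/", "forum.example.com/thread"])
    print("found before:", "www.yoursite.com/content.html" in before)
    print("found after:", "www.yoursite.com/content.html" in after)
```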

What’s next?

Next you need to learn how to see what kind of content search engines have found on your site, and there are ways to raid your competitor’s site for information that they may not know search engines have found.

Join my email list and in upcoming tutorials we’ll discuss how to make sure you’re protected, and even show you how to do a little industrial espionage while you’re at it.

Yours in success,
-Jarom Adair