
Robots.txt Disallow All and Block Search Engine Spiders

You can block any visitor, including search engine spiders, and keep the data on your website private with the help of .htaccess Deny From All. A similar solution, aimed mainly at search engines, is a robots.txt file.

To disallow all search engine visits and stop any spider or crawler, create a robots.txt file and put the following text in it:

User-agent: *
Disallow: /

That is a rather strong lockdown: once you have placed the robots.txt file in the document root of your domain, almost all search engine spiders will stop accessing and indexing your entire site, preventing the information you want to keep private from leaking out.
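You can sanity-check the rules above with Python's standard-library urllib.robotparser before deploying them (a minimal sketch; the example.com URLs are placeholders, and the rules are parsed locally rather than fetched from a live site):

```python
from urllib.robotparser import RobotFileParser

# Parse the disallow-all rules directly, no network access needed.
rp = RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /"])

# Every path is off limits to every rule-abiding crawler.
print(rp.can_fetch("*", "https://example.com/"))            # False
print(rp.can_fetch("Googlebot", "https://example.com/any")) # False
```

Both checks print False, confirming that no compliant spider may fetch any URL on the site.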

More often, you only want a single sub-folder / directory under the domain excluded from the search engine crawling scope; in that case, this is what you need:

User-agent: *
Disallow: /data/

Similarly, put the robots.txt at the root directory of the domain, and all rule-abiding search engines will stay out of that directory while still crawling the rest of your site.
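The same standard-library check confirms the narrower rule only shields the /data/ directory (again a sketch with placeholder URLs, the directory name /data/ taken from the example above):

```python
from urllib.robotparser import RobotFileParser

# Parse the directory-only rules locally.
rp = RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /data/"])

# Paths under /data/ are blocked; everything else stays crawlable.
print(rp.can_fetch("*", "https://example.com/data/secret.html"))  # False
print(rp.can_fetch("*", "https://example.com/index.html"))        # True
```

Note that compliant crawlers match Disallow rules by path prefix, which is why the trailing slash in /data/ matters: without it, a file named /database.html would be blocked as well.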

By Yang Yang

Hello, I'm Yang. I build online businesses that please people. Want to join in and post some useful articles? Shoot me a message.
