You can block every visitor, search engines included, and protect the data on your website with .htaccess Deny From All. A related but weaker option is a robots.txt file, which is aimed mainly at search engines: it asks crawlers to stay away rather than denying them access.
To disallow all search engine visits and turn away any spider or crawler that honors the file, create a robots.txt and put the following text in it:
User-agent: *
Disallow: /
That is about as strict as robots.txt gets. Once you place the file in the document root of your domain, almost all search engine spiders will stop crawling your entire site. Keep in mind that robots.txt is only a request: it keeps compliant crawlers away, but it does not stop anyone from opening the pages directly, so anything truly private still needs a real access restriction such as .htaccess Deny From All.
Usually, you only want one subfolder or directory under the domain excluded from crawling. In that case, this is what you need:
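If you want to confirm how a rule-abiding crawler would read the disallow-all rules, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name is_allowed and the crawler name ExampleBot are made up for illustration; the host is the placeholder domain used in this post.

```python
from urllib.robotparser import RobotFileParser

# The disallow-all rules, one directive per line.
DISALLOW_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler may fetch url under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" is a hypothetical crawler name.
    for path in ("/", "/index.html", "/data/report.csv"):
        url = "http://www.yoursite.com" + path
        print(path, is_allowed(DISALLOW_ALL, "ExampleBot", url))
```

Every path should come back as not allowed. This only shows what a well-behaved crawler would decide; it says nothing about clients that ignore robots.txt.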
User-agent: *
Disallow: /data/
Similarly, put the robots.txt in the root directory of the domain, and all play-by-the-rules search engines will no longer crawl http://www.yoursite.com/data/ or anything beneath it.
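To see which URLs the /data/ rule actually covers, the following sketch filters a list of sample paths with Python's standard urllib.robotparser. The function name blocked_paths, the crawler name ExampleBot, and the sample paths are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# The /data/ rules, one directive per line.
DISALLOW_DATA = """\
User-agent: *
Disallow: /data/
"""


def blocked_paths(robots_txt, user_agent, paths, base="http://www.yoursite.com"):
    """Return the subset of paths a compliant crawler would refuse to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, base + p)]


if __name__ == "__main__":
    # Hypothetical sample paths; only those under /data/ should be reported.
    sample = ["/", "/about.html", "/data/", "/data/private.csv", "/database/"]
    print(blocked_paths(DISALLOW_DATA, "ExampleBot", sample))
```

The rule is a plain prefix match, so /data/private.csv is blocked while /database/ is not. The trailing slash in Disallow: /data/ is what keeps similarly named folders out of the rule.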