Categories
Content / SEO Tips & Tutorials Google Hacks, Cheats & Tips

Use robots.txt Disallow directive to forbid spiders and search engine robots

Just like .htaccess, robots.txt resides at the document root of your domain. It’s a plain text configuration file containing directives, or rules, that any well-behaved web spider or search engine robot should respect. While you can use .htaccess to forcibly prohibit any visits (including those of human visitors) to a certain part of your site, robots.txt deals only with automated web page spiders such as googlebot.

To ask all robot spiders not to access and index the /includes/ and /search/ directories of your site, simply write a robots.txt file and put in the following rules:

User-agent: *
Disallow: /includes/
Disallow: /search/

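The asterisk * stands for any robot. Under these rules, no well-behaved robot should crawl /includes/ or /search/. This is a good way to keep search engines out of certain parts of your site, but it is not a way to protect sensitive data: robots.txt is publicly readable and purely advisory, so anything truly private still needs .htaccess rules or authentication.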

Similarly, you can write rules targeted at a specific search engine:

  1. GoogleBot – Google
  2. Slurp – Yahoo
  3. Bingbot (formerly MSNBot) – Bing

(Note that these search engine providers very probably have more than one bot each; the bots listed are just the most common ones at present.)

For example, to prohibit Google from accessing and indexing /ihategoogle/ and any web documents under it, use this rule:

User-agent: GoogleBot
Disallow: /ihategoogle/

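There is no Allow directive in the original standard

The original robots.txt convention defines only Disallow. Major crawlers such as Googlebot and Bingbot do understand Allow, and it is part of the later RFC 9309 specification, but the most portable way to let a spider access your whole site is to say nothing at all or to leave Disallow empty: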

User-agent: GoogleBot
Disallow: 

To give a single bot such as GoogleBot exclusive access to your entire site:

User-agent: GoogleBot
Disallow: 

User-agent: *
Disallow: /
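
Before uploading a robots.txt, it can help to check how a parser actually reads the rules. Below is a minimal sketch using Python's standard urllib.robotparser module; the bot name ExampleBot and the sample paths are made up for illustration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# First rule set from this post: every robot is kept out of two directories.
block_dirs = build_parser("""\
User-agent: *
Disallow: /includes/
Disallow: /search/
""")

# Last rule set from this post: GoogleBot may go anywhere, everyone else nowhere.
google_only = build_parser("""\
User-agent: GoogleBot
Disallow:

User-agent: *
Disallow: /
""")

# "ExampleBot" and the paths below are hypothetical.
print(block_dirs.can_fetch("ExampleBot", "/includes/config.php"))  # False
print(block_dirs.can_fetch("ExampleBot", "/index.html"))           # True
print(google_only.can_fetch("GoogleBot", "/any/page.html"))        # True
print(google_only.can_fetch("ExampleBot", "/any/page.html"))       # False
```

Note that this only tells you what a compliant robot would do; it says nothing about robots that ignore robots.txt.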
Categories
PHP Tips & Tutorials

PHP: Count Words in a String

A quick tip for counting the number of words in a string, for example when you need to decide whether a submitted string is too short to be included or processed further.

$str = 'How to get the number of words in a string?';
$num = str_word_count($str, 0); // $num = 10

The PHP function str_word_count() is a weirdo in that the 2nd argument is an integer from 0 to 2 that selects the return format. When it’s 0 (the default), str_word_count() counts and returns the total number of words in the string $str.

If it’s 1, str_word_count() returns an array of all the words identified inside the string $str.

Else if it’s 2, str_word_count() returns the same words as it would for 1, but as an associative array in which each key is the numeric position (character offset) of the corresponding word inside $str.
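
To see the three return shapes side by side, here is a rough Python analogue. It is only a sketch: the function name word_count and the word pattern (runs of ASCII letters, apostrophes and hyphens) are my own assumptions that approximate PHP's default behaviour rather than reproduce it exactly.

```python
import re

# Assumption: a "word" is a run of ASCII letters, apostrophes and hyphens.
# This approximates PHP's default rule but is not guaranteed to be identical.
WORD = re.compile(r"[A-Za-z'-]+")


def word_count(text, fmt=0):
    """Mimic the three return formats of PHP's str_word_count()."""
    matches = list(WORD.finditer(text))
    if fmt == 0:
        return len(matches)                           # total number of words
    if fmt == 1:
        return [m.group() for m in matches]           # list of the words
    if fmt == 2:
        return {m.start(): m.group() for m in matches}  # offset => word
    raise ValueError("fmt must be 0, 1 or 2")


s = "How to get the number of words in a string?"
print(word_count(s))      # 10
print(word_count(s, 1))   # ['How', 'to', 'get', ...]
print(word_count(s, 2))   # {0: 'How', 4: 'to', 7: 'get', ...}
```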

Categories
Domains Internet Tools

Bulk Domain Name Availability Checker Tool to Batch Search Available Domains

Someone asked me via the contact form how to check potentially hundreds of domains for availability at once, since checking them one by one would be a huge waste of time. For example, when you need to check whether these 50 domain names are still registrable: californiainsurance.com, newyorkinsurance.com, virginiainsurance.com, … and so forth, you need a bulk search tool for available domains.

While an AJAX whois tool gives instant results for all major TLDs, a bulk domain name search tool checks the availability of many domains at once, preferably no fewer than 100 per batch.

Dynadot free bulk domain search is just the tool you need in a situation like this. Generate a list of the domains you want most, split it into groups of 100, and feed them batch by batch into their bulk domain checker.

Register an account at Dynadot, head to: Domain Names –> Bulk Search and you should be able to check 100 domains at a time.
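
Preparing the list itself is easy to script. The sketch below builds names in the pattern of the example above and splits them into batches of 100 ready to paste into a bulk checker; the function names and the three sample states are assumptions for illustration only.

```python
def build_domains(prefixes, suffix="insurance.com"):
    """Turn names like 'New York' into 'newyorkinsurance.com'."""
    return [p.lower().replace(" ", "") + suffix for p in prefixes]


def batches(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical input: extend this list with all the names you care about.
states = ["California", "New York", "Virginia"]
domains = build_domains(states)

for number, batch in enumerate(batches(domains), start=1):
    print(f"--- batch {number} ({len(batch)} domains) ---")
    print("\n".join(batch))
```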

Categories
SQL / MySQL Tips and Tutorials

Instantly Boost SQL Query Efficiency of REGEXP or RLIKE by 2000%

Naturally, using regular expressions in SELECT queries to check whether certain strings or patterns occur somewhere in large chunks of data is the most resource-intensive option and thus your last resort. However, regular expressions are sometimes unavoidable for complicated patterns. For example, word boundaries are a common reason to want regular expressions in SQL:

SELECT * FROM articles WHERE content REGEXP '[[:<:]]MySQL efficiency[[:>:]]'

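This query searches for any entry in table articles that has the phrase ‘MySQL efficiency’ in the content field. In a table as large as 100,000 rows, it can easily take more than 0.5 seconds, which is outrageously long and could very well take down the server if large volumes of searches flood in. (The [[:<:]] and [[:>:]] markers are the word-boundary syntax of MySQL’s older regular expression library; MySQL 8.0.4 and later use the ICU library, where \b plays that role.)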

So, let’s consider the LIKE clause:

SELECT * FROM articles WHERE content LIKE '%MySQL efficiency%'

Much better: the query time is instantly reduced to 0.01 seconds or so. The problem is that % is not a word boundary. It matches any run of characters, so the pattern also matches:

  • theMySQL efficiency-

In the case of ‘MySQL efficiency’ this is not much of a problem, because the chance of ‘theMySQL efficiency’ appearing is rather slim. However, consider the case of ‘War’: you can’t use a LIKE clause for it because it would also match ‘Edward’ and so forth. You get the idea: using LIKE alone in this manner is incorrect.

The solution

Finally, we’ve got something good to talk about. The solution that addresses both problems is to combine LIKE and REGEXP / RLIKE in the same query:

SELECT * FROM articles WHERE content LIKE '%MySQL efficiency%' AND content REGEXP '[[:<:]]MySQL efficiency[[:>:]]'

This way, not only is the query time reduced to approximately 0.025 seconds or less because of the LIKE clause, but the phrase is also matched correctly thanks to the REGEXP clause.

Presumably this is because MySQL evaluates the cheap LIKE condition first and only passes the rows that survive it to the REGEXP condition. Though REGEXP is much more resource-intensive, there is a lot less left for it to process after LIKE has done the filtering, so the query time is considerably reduced.
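
The same two-stage idea can be sketched outside the database. The Python snippet below is only an illustration of the principle, not of MySQL's internals: a cheap substring test plays the role of LIKE, and the regular expression with word boundaries runs only on rows that pass it. The function name and the sample rows are made up.

```python
import re


def find_phrase(rows, phrase):
    """Return (matching rows, number of regex evaluations performed)."""
    needle = phrase.lower()
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    regex_checks = 0
    hits = []
    for text in rows:
        # Stage 1: cheap case-insensitive substring test (the LIKE '%...%' part).
        if needle not in text.lower():
            continue
        # Stage 2: expensive word-boundary test (the REGEXP part).
        regex_checks += 1
        if pattern.search(text):
            hits.append(text)
    return hits, regex_checks


# Hypothetical sample data.
rows = [
    "War and Peace",
    "Edward the Confessor",
    "A history of peace",
    "The Cold War era",
]
hits, regex_checks = find_phrase(rows, "War")
print(hits)          # ['War and Peace', 'The Cold War era']
print(regex_checks)  # 3
```

The substring test lets ‘Edward’ through, just as LIKE would, but the regular expression then rejects it, and rows without the substring never reach the regular expression at all.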

Categories
WordPress How To

WordPress blog goes blank after editing and updating the theme files

Really weird, but not unexpected at all. I tried to override a native WordPress function, get_search_form(), by inserting these lines at the end of my theme’s functions.php:

function get_search_form() {
?>test<?php
}

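I did this to see whether the native get_search_form() function can be overridden. Turns out it can’t. Not only that, but my WordPress blog refused to load anything other than a blank white page. The reason is that PHP does not allow a function to be declared twice: WordPress core already defines get_search_form(), so the second definition triggers a fatal “Cannot redeclare” error, and with error display turned off all you see is a white page. I had to get on FTP to revert those changes.

And it recovered from the strike.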

Categories
SQL / MySQL Tips and Tutorials

mysql command line character set option for importing SQL files encoded in UTF8

For languages other than English, especially Asian languages such as Chinese, each character takes more than one byte in storage (typically 3 bytes per Chinese character in UTF8), so the text needs to be encoded in UTF8 or another encoding designed for it.

Normally, the command line mysql import has no problem with English SQL files encoded in ANSI or ISO-8859-x. When the SQL file is encoded in UTF8 and contains non-English characters, however, the imported data ends up garbled in the database.

To work around this problem, specify the default character set on the mysql command line:

mysql -h localhost -u user1 -p --default-character-set=utf8 somedb < backup.sql

And you should have no problem.
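
To see why the wrong character set produces a mess, here is a small Python illustration. It is an analogy rather than a reproduction of what the mysql client does: Python's latin-1 codec stands in for a single-byte connection character set, and the sample text is my own.

```python
text = "中文"  # two Chinese characters

# UTF-8 stores each of these characters as three bytes.
utf8_bytes = text.encode("utf-8")
print(len(utf8_bytes))

# Reading the same bytes as a single-byte encoding turns every byte into
# its own character, which is the kind of garbage that ends up in the table.
garbled = utf8_bytes.decode("latin-1")
print(len(garbled), repr(garbled))

# Reading them as UTF-8 restores the original text.
restored = utf8_bytes.decode("utf-8")
print(restored == text)
```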

Categories
Business and Marketing Programming Tips & Insights

Being simple is a blessing for development cost

Usefulness is king, and it doesn’t have to be complex. One of the primary rules of project management is never to add a feature without seeing substantial demand for it. Whatever it is, any additional feature is a burden on the system and a cost in time, both of which grow exponentially.

While many features may seem indispensable at first glance, in the long run you will be convinced that they are not. I would argue that 80% of the features of 80% of the software on the market could be eliminated with few user complaints. The benefits you gain by removing less useful features, such as faster loading, better responsiveness and a greatly reduced learning curve for users, are far more valuable than the features themselves. Not to mention the tighter, faster project cycle, which poses fewer risks to your budget and team because it has been shortened to just one third of the original.

Ideas can thus be implemented and tested in public much more quickly. Keep overheads small for all projects until user acceptance and positive feedback show solid potential in some of them. That is when you invest more in the selected projects and try to grow them from average simplicity to great simplicity.

Unnecessary sophistication is the number one sign of a novice. Use no more than 3 colors (preferably black, white and one accent tone) when designing your first web page and no one can tell whether you’ve just started learning or are an experienced hand. Beginners, on the contrary, use as many colors as possible across the spectrum before even thinking about whether they go together or are actually needed. They don’t know what’s important and should come first. They simply go with their god damn poor instincts and put everything that randomly comes to mind into the whole thing.

Being simple is simply to keep the most important stuff and peel off everything else.

The last thing you want to do when designing an application or data model is to blindly make assumptions for users and add new features out of your own daydreams. You will look like an amateur and end up with a super complex system that is hard to learn and use, full of features the users don’t actually need that badly (though you naively believe they do!).

Always start a project by unequivocally hitting the point – why would the users want to use your product? The very first feature is your answer. And that’s it.

Categories
PHP Tips & Tutorials Programming Tips & Insights

A few suggestions of good practices for accelerating PHP development

The slow and steady may win, but the fast and steady dominate. The old saying that faster is not always better may hold in other fields, but not in the IT world of today. Faster is absolutely better. You should by all means try to improve your own or your team’s programming proficiency and accelerate the development speed.

In web programming and website development, I have some suggestions for speeding up your PHP development.

  1. Maintain a library of commonly used features such as user registration module and file uploading module. This is pretty much the essence of quicker development. Create and reuse things.
  2. Write your own framework for a series of similar websites. Many websites share a large percentage of their user experience flow and functionality. For instance, the Q&A system of Stack Overflow can be used in a lot of other areas because of its generic nature. That’s why they decided to make yet another Q&A site about server administration: Server Fault.
  3. Keep the database schemas consistent across all applications, for example: 1) always have an ID, 2) always name tables in the singular, 3) always name a foreign key field after the table it references, etc. This way, you can much more easily migrate PHP logic across different data models.
  4. Recognize and take out any site-wide text strings such as common regular expressions (against URLs, emails, etc.), site title or administrator email and define them as PHP constants.
  5. Write PHP logic as independently of the application data model as possible. Specifics like table names and field names should all be kept out of your PHP code. Create an extra file dedicated to database structure description / operations, or at least encapsulate them in an instantiable object or array, so migration to another data model is as easy as possible.
  6. If open_basedir directive is left unset, you may consider the convenience of creating a central PHP library for all of your websites that’s located outside of the web accessible directories, providing common modules and interior logic services to all the websites and web applications. This would greatly reduce your development time for future projects but on the other hand, it sort of increases the bonding between different websites, making them harder to customize or transfer away.
  7. Wrap everything up in a function at least, if not in a class. Many a time you will find yourself writing quite a chunk of code that executes sequentially in a page. That’s when you should consider wrapping the chunk up as a function in the same place and calling it immediately after the function definition. This may seem pointless at first, but as the program grows it keeps the code clean and easy to read, and prevents the whole thing from developing into a bowl of noodles.

That’s obviously not everything. Care to share some of your tips for developing more quickly in PHP?