PHP: Checking Text Strings against Reserved or Censored Words

by Yang Yang on September 27, 2010

I created a free online web form builder a while back. Because it ranked well in search engines, spammers and phishers found it and started using it to create forms that collect email account usernames and passwords. I’ve got to do something before my host shuts down my site over all the complaints and alerts from university security departments. They have good reason to complain: I’m hosting all the phishing forms.

Phishers tend to use URL slugs that include words such as ‘admin’, ‘webmail’ or ‘account’ so that the form seems authoritative at first glance. After signing up, they create forms with fields labeled ‘Password’ or something similar. So I’m going to list all such words as reserved words and reject any user input that contains them.

I need a function that checks a subject string against an array of reserved words and rejects the input if it contains any of them. Here is my function:

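// Returns true if $subjectString contains none of $disallowedWords, false otherwise.
// Declared "public static" because it is meant to live inside a class.
public static function isStringLegal($subjectString = '', $disallowedWords = array()) {
	// Strip everything except ASCII letters, so 'a-d-m-i-n' becomes 'admin'.
	$alphabetSubject = preg_replace('|[^a-zA-Z]+|', '', $subjectString);
	foreach ($disallowedWords as $disallowedWord) {
		// Case-insensitive substring search; offset 0 is a valid match, hence !== false.
		if (stripos($alphabetSubject, $disallowedWord) !== false) {
			return false;
		}
	}
	return true;
}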

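The PHP function stripos() performs a case-insensitive search and returns the integer position of the first occurrence of $disallowedWord in $alphabetSubject. That position is 0 when the match sits at the very start of the string, which is why the result is compared with !== false instead of being tested loosely. If it fails to find anything, it returns false.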
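To make the return values concrete, here is a small sketch of what stripos() gives back in the three cases that matter. The sample strings are made up for illustration.

```php
<?php
// stripos() returns the zero-based offset of the first match, ignoring case.
$atStart = stripos('AdminPanel', 'admin'); // match at the very beginning: int 0
$inside  = stripos('MyWebMail', 'mail');   // match further in: int 5
$missing = stripos('contact', 'admin');    // no match: false

// A loose truthiness test treats offset 0 as "not found".
$looseCheck  = (bool) $atStart;      // false, which would let 'AdminPanel' through
$strictCheck = $atStart !== false;   // true, the match is detected

var_dump($atStart, $inside, $missing, $looseCheck, $strictCheck);
```

A disallowed word at the very start of a slug is probably the most common case, so the strict comparison matters here.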

A sample disallowed words list:

$slugDisallowedWords = array(
	'formkid', 
	'kavoir', 
	'mail', 
	'admin',
	'account',
	'password'
);
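Here is a minimal sketch of how the list and the function fit together. The class name InputFilter and the sample slugs are my own placeholders for illustration; the function body is the one shown earlier.

```php
<?php
// Hypothetical wrapper class; only the method itself comes from the post.
final class InputFilter
{
    public static function isStringLegal($subjectString = '', $disallowedWords = array()) {
        $alphabetSubject = preg_replace('|[^a-zA-Z]+|', '', $subjectString);
        foreach ($disallowedWords as $disallowedWord) {
            if (stripos($alphabetSubject, $disallowedWord) !== false) {
                return false;
            }
        }
        return true;
    }
}

$slugDisallowedWords = array('formkid', 'kavoir', 'mail', 'admin', 'account', 'password');

// Made-up slugs for illustration.
$slugs = array('contact-us', 'webmail-login', 'AccountUpdate', 'gmail-signup');

$results = array();
foreach ($slugs as $slug) {
    $results[$slug] = InputFilter::isStringLegal($slug, $slugDisallowedWords);
    echo $slug, ' => ', $results[$slug] ? 'allowed' : 'rejected', "\n";
}
```

Note that this is a substring match, so a short entry such as ‘mail’ also rejects harmless slugs like ‘gmail-signup’. Whether that trade-off is acceptable depends on how aggressive you want the filter to be.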

The disallowed words list can contain only alphabet letters. If you need a phrase such as ‘no way’, you have to add it to the array as ‘noway’. This blocks attempts to sneak in a word or phrase in forms such as ‘a-d-m-i-n’ or ‘Pa_ss Word’: all non-alphabet characters are stripped from the input first, and the remaining letters-only string is then checked against each word in the disallowed words list.
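If you would rather not clean up the list by hand, one option is to run each entry through the same stripping step as the user input. The helper below is only a sketch; normalizeDisallowedWords is a hypothetical name of mine, not part of the original code.

```php
<?php
// Hypothetical helper: reduce each list entry to ASCII letters only
// and drop entries that end up empty.
function normalizeDisallowedWords(array $words) {
    $normalized = array();
    foreach ($words as $word) {
        $lettersOnly = preg_replace('/[^a-zA-Z]+/', '', $word);
        if ($lettersOnly !== '') {
            $normalized[] = $lettersOnly;
        }
    }
    return $normalized;
}

// 'no way' and 'pass-word' lose their separators; '123' and '' are dropped.
$words = normalizeDisallowedWords(array('no way', 'pass-word', '123', ''));

// The same stripping applied to obfuscated user input.
$strippedAdmin    = preg_replace('/[^a-zA-Z]+/', '', 'a-d-m-i-n');
$strippedPassword = preg_replace('/[^a-zA-Z]+/', '', 'Pa_ss Word');

var_dump($words, $strippedAdmin, $strippedPassword);
```

Dropping empty entries is a deliberate choice: as far as I know, PHP 8 treats an empty needle in stripos() as a match at offset 0, so a single empty entry in the list would reject every string.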
