To delete all content in any directory, including all sub-directories and files, I’ve been using this:

rm -rf somedir/*

To delete all content of the current directory:

rm -rf *

However, it turns out this doesn't remove hidden files such as .htaccess (files whose names start with a dot are hidden in Linux), because the shell glob * doesn't match them — it's not a limitation of 'rm -rf' itself. To delete all the hidden files as well, I have to run a second command:

rm -rf .??*
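
Alternatively, the two commands can be replaced with a single find invocation (a sketch, assuming GNU findutils; note that the .??* glob also misses two-character dotfile names like '.a', while find catches everything):

# Delete everything inside somedir, hidden files included,
# without removing somedir itself:
find somedir -mindepth 1 -delete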

Linux: How to ‘find’ and search ONLY text files?

by Yang Yang on January 22, 2011

The ‘find’ command in Linux systems searches through a directory and returns files that satisfy certain criteria. For instance, to find the files that contain the string ‘needle text’ in the ‘mydocs’ directory:

find mydocs -type f -exec grep -l "needle text" {} \;

The problem with this approach is that it searches through ALL files in the directory, including binary ones such as images, executables and zip packages. Sensibly, we would only want to search text files for a specific string. If many binary files are present, it's a significant waste of CPU time to get what you want, because going through the binary files is totally unnecessary.

To achieve this, use this version of the above command:

find mydocs -type f -exec grep -l "needle text" {} \; -exec file {} \; | grep text | cut -d ':' -f1

I asked the question at stackoverflow.com and peoro came up with this solution. It works great.

Basically, the appended part (-exec file {} \; | grep text | cut -d ':' -f1) runs the 'file' command on each match and keeps only the files whose type description contains 'text'. According to the manual of the Linux 'file' command, we can be fairly sure that files with 'text' in their type description are text files, and that all text files have 'text' in their type description string.

Thus far, the best way to do this is:

find -type f -exec grep -Il . {} \;

Or for a particular needle text:

find -type f -exec grep -Il "needle text" {} \;

The -I option tells grep to ignore binary files, and the pattern . (which matches any character) combined with -l makes grep print a file's name as soon as it finds the first matching line, so it goes very fast.
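
If your grep supports recursive search (GNU grep does, via -r), the find can be dropped entirely; a minimal sketch that does the same thing:

# -r recurses into mydocs, -I skips binary files, -l prints matching file names
grep -rIl "needle text" mydocs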

One of my old Internet friends, Brad, made a very nice online slide deck that introduces some of the exciting new features of PHP 5.3. I'm most interested in namespaces, which would make coding in a large project and reusing code much easier, especially for people who find keeping naming conventions straight across a team a challenge.

Here’s the original post and slide: http://bradley-holt.com/2010/11/new-features-in-php-53/

All the new features and concepts are presented in a practical, extremely straightforward manner. Programmers should find them a breeze to digest.
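
To see why namespaces matter, here's a minimal sketch of PHP 5.3 namespace syntax (the Acme and Logger names are made up for illustration):

<?php
namespace Acme;

// This Logger no longer clashes with any other Logger class
// defined elsewhere in the project.
class Logger {
    public function log($message) {
        echo $message, "\n";
    }
}

// The fully qualified name resolves unambiguously:
$logger = new \Acme\Logger();
$logger->log('Hello from Acme');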

I was doing something with a regular expression, and very oddly, the connection kept being reset every time I refreshed the web page.

I tried to narrow down the problematic line by removing code in functional chunks. It finally came down to a preg_match() call, where a small, accidentally mistyped bit in the regular expression caught my attention:

(.+)+

Got rid of the second plus sign:

(.+)

And everything was all right again.
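
In hindsight, the reset was most likely catastrophic backtracking: a nested quantifier like (.+)+ gives the regex engine exponentially many ways to split the subject string, and PCRE aborts once it blows its backtracking limit. A small sketch that should reproduce the failure mode (the pattern and subject here are made up for illustration):

<?php
// 30 'a's followed by a character the pattern can never match
$subject = str_repeat('a', 30) . '!';

// The nested quantifier forces the engine to try ~2^30 splits
$result = preg_match('/^(a+)+$/', $subject);

// preg_match() returns false on failure; preg_last_error()
// should report PREG_BACKTRACK_LIMIT_ERROR in this case
var_dump($result, preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR);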

MySQL: Export Table to CSV Text Files for Excel

by Yang Yang on November 18, 2010

MySQL tables can be exported to an SQL dump file, which is basically a text file of SQL queries that can be used to import the table back into the database. To export MySQL tables into other formats such as CSV, phpMyAdmin proves very handy, provided you have changed its execution timeout to zero (so it never times out) – otherwise it won't work with large tables. However, there's another way to do this with a native SQL query, and it works all the way to the end of a very large table:

SELECT * FROM mytable INTO OUTFILE "c:/mytable.csv"
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '"'
LINES TERMINATED BY "\n";

The INTO OUTFILE clause in this MySQL query will store all rows selected from the table mytable into the text file c:/mytable.csv in this form:

"1","Anthony","24"
"2","Rachel","27"

Now you can open and use the CSV file in another application such as Excel and manipulate it any way you like.

If you need another format, just change the query accordingly.
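
For example, a tab-separated file (which Excel also opens happily) only needs a different field delimiter; a sketch:

SELECT * FROM mytable INTO OUTFILE "c:/mytable.tsv"
FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '"'
LINES TERMINATED BY "\n";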

Add column / field headers at the beginning of the CSV file

To add column names at the beginning of the CSV file so each field is identified, use this SQL snippet:

SELECT 'id', 'name', 'age' UNION
SELECT * FROM mytable INTO OUTFILE "c:/mytable.csv"
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '"'
LINES TERMINATED BY "\n";

That is, adding “SELECT ‘id’, ‘name’, ‘age’ UNION” before the original query. And you will have:

"id","name","age"
"1","Anthony","24"
"2","Rachel","27"

The awkward thing is that you have to add the field names manually, one by one. For now, I know of no other option that can put the column names before the records when dumping CSV files from MySQL, except phpMyAdmin.
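
If the table has many columns, a query against information_schema can at least generate the quoted header list for you to paste into the UNION (a sketch; substitute your own schema and table names for mydb and mytable):

-- Produces a string like: 'id','name','age'
SELECT GROUP_CONCAT(CONCAT("'", COLUMN_NAME, "'") ORDER BY ORDINAL_POSITION)
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'mydb' AND TABLE_NAME = 'mytable';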

A few SEO tips

by Yang Yang on November 10, 2010

Haven't come up with any solidly helpful posts recently, so I thought I'd throw in this one. These are some of the things I learned the hard way over the past few years ranking my websites in Google. Hopefully you'll find something new in here.

I rely on SEO to get traffic – in most cases, it yields the best traffic across all possible sources. With good SEO (both on-site content optimization and off-site reputation / link building), it's hard not to make money. Especially if you are an expert in Internet marketing (niche research, reputation building & management, consumer psychology, landing page tuning, blah blah blah…), it's even harder not to be rich. Making good money is easy; you just need time.

Don’t use a host that’s POPULAR and CHEAP.

Really popular hosts like hostgator and dreamhost have millions of domains hosted with them. Because they are cheap, spammers like them, and Google knows it. I frequently launch new sites, and from my experience with dreamhost, after submitting a new site for indexing, without building any backlinks, it typically takes a week or more to get it indexed.

However, with hawkhost and wiredtree, it's a totally different situation. Without any initial backlinks, new sites can be indexed by Google one day after submission, even when the site is just blank with an empty Apache index page. Sometimes I didn't even have to submit the site manually, and it magically, automatically got into Google's index.

Sites hosted with hawkhost tend to be more stable in Google's index. However, it's hard to keep a new site (with barely any content) in Google's index if it is hosted with dreamhost (or similarly popular hosts with cheap shared plans). Google will soon drop your new site if you don't keep working on it.

Have a 4-year-old website.

Adsense is one of my favorite money makers, and my steadiest stream of Adsense income comes from a site I built in 2006. I created some nice content (very nice and very original) back then and just left it there.

I made only $10 a month from the site in the first year, and after some very frustrating ups and downs, it gradually climbed. Now, 4 years later, it's averaging $600 a month. To be honest, I never actually spent much time on it at all. No link building, no frequent content updates, nothing, and it's now making me 600 bucks a month. Not much, but still.

Not only does it receive a large amount of steady traffic, but new content is generally positioned very well in search engine results. The older the site, the more authoritative it becomes from the search engine's point of view.

Time is the ultimate distinguisher between builders and spammers. Spammers come and go, hit and run. They are always impatient, looking to make a quick buck with a spammy site. Once they find it's not profitable, they stop renewing the domain after the first year. Google knows this too well.

So most of your sites would not actually start performing in terms of search engine traffic until at least 1 year after domain registration. Yet most people are too obsessed with quick results and never wait that long. They kill their sites just before they can make them decent money.

Be natural.

Google is becoming smarter and smarter. I would never go against them by challenging their intelligence and capabilities to identify spam (or partial spam).

Sites I intentionally optimize in title, description and content keywords, as well as in off-site link anchor texts, never seem to get anywhere substantial. It's boring, it's a chore, and it's not worth it. I can spend the same time and money creating content that's useful and exciting. Best of all, it's much more fun, and fun will keep you going!

Duplicate content is a myth.

While being original is absolutely a must in ultra saturated / competitive niches, duplicate content isn’t that big a deal in most niches.

Forget SEO. Start making friends and never stop creating stuff.

I’ve been doing SEO for 4 years and I can finally say, this is the ultimate SEO tip.

Argophilia – Eastern Europe Travel Portal

by Yang Yang on November 2, 2010

Phil is one of my best friends on the web, and he was very kind and helpful when I was just starting out. A few months back, he invited me to work on a travel project that eventually launched as Argophilia. It's a shame that I was occupied by a lot of chores then and never actually contributed anything substantial to it.

Now you can see it’s becoming something real: http://argophilia.com/

And the news portal: http://www.argophilia.com/news/

They are simply awesome! Both the design and the content are exceptionally great. Hopefully it will take off soon, as it is, for now, the only travel site targeting Eastern Europe.

I have WAMP server installed on my local computer for PHP and MySQL development. After using it for a while, the MySQL installation folder has run up some seriously huge log files that take up an enormous amount of disk space. We are talking tens of GB here. The file names look like ibdata1, mysql-bin.000017, etc. It appears that's because I frequently move large databases in and out with the 'mysql' and 'mysqldump' commands, building up large logs meant for data rollback and recovery.

Simply deleting the log files may result in problems such as MySQL refusing to start again. I tried that, and I had to reinstall MySQL to make it work again.

After some searching, I found this query to solve my problem. Just execute it after you have logged in to MySQL via the command line – of course, you will need root privileges:

PURGE BINARY LOGS BEFORE '2036-12-12 12:12:12';

Something like that purges and deletes all the huge binary logs to free up the disk. Just make the datetime as sci-fi (far in the future) as possible so that all log files are purged. Note that this only removes the mysql-bin.* files; ibdata1 is the InnoDB system tablespace rather than a log, so purging binary logs won't shrink it.
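
To keep the binary logs from piling up again on a development machine, you can also cap their lifetime in my.ini (a sketch; expire_logs_days is the setting name in MySQL 5.x):

[mysqld]
# Automatically purge binary logs older than 7 days
expire_logs_days = 7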

It can be annoying when MySQL imports your UTF-8 database (one containing exotic characters beyond English) with the default latin1 character set (latin1_swedish_ci collation), garbling the text content. It is also annoying when phpMyAdmin does the same, or when you forget to set the collation to utf8_general_ci for a new database that is very probably going to store UTF-8 characters.

Let’s fix this once and for all.

Just locate and open the MySQL configuration file my.ini and find the section [mysqld]. Add the following directives:

[mysqld]
character-set-server=utf8
collation-server=utf8_unicode_ci

Save my.ini and restart the MySQL daemon. Now MySQL will use utf8 as the default character set when importing databases or creating new ones, and the default collation offered in phpMyAdmin changes to match (utf8_unicode_ci, per the directives above).
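
These directives only affect databases created from now on; an existing database or table can be converted with ALTER statements (a sketch, using the hypothetical names mydb and mytable):

ALTER DATABASE mydb CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE mytable CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;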

Free PHP Business Directory Script

by Yang Yang on September 29, 2010

Simple PHP Script is an arsenal of website scripts for Kavoir readers. My plan is to write many simple yet useful website scripts and release them in SPS for you to download and use. Some will be commercial and some will be free. The first script is a business directory website script in PHP and MySQL that I'm releasing free here.

Homepage layout

[Screenshot: homepage of the free PHP directory script]
For the back-end administration control panel, use the demo account to log in.

How to download?

If you are interested in obtaining a copy of the script and using it for your own website, please sign up for our email list via the form in the right sidebar or the one immediately after this post.

After you have subscribed to the mailing list, not only can you download the script for free, but you will also be notified of more free PHP website script releases in the future.

What are the features?

This business directory script has a handy multi-step installation module to help you deploy it on your server. PHP 5 is required. For now, you have to install it at the root of a dedicated domain or subdomain, such as http://dir.example.com or http://www.example.com. You cannot install it in a sub-directory such as http://www.example.com/dir.

It also comes with a user registration / contribution system and a full-fledged administration control panel. You can create and edit up to 2 tiers of categories. You and registered users can add business listings with business names, phone numbers, website URLs and postal addresses. You can choose to approve listings before they appear publicly, or let users publish them on your site instantly.

Paid inclusion is not available in this version yet, nor is a usable template system. However, these are already on my list of planned features for this product.

Those who subscribe to the Kavoir email list will receive all future upgrades for free. Just download the package again with the FreeCode that will be sent to your inbox after you have confirmed your subscription.

Buy A Word Directory

A few years back, I also made this script: http://w3ec.com/ – a PHP link directory script that sells English words as links, at $1 per letter. You can purchase a word and make it link back to your website, and feature the word on the homepage for an extra fee.

You can buy the script here: http://w3ec.com/script.php
