MySQL Performance Tips

by Yang Yang on March 12, 2011

MySQLI’m not an expert but I’ve had my own errors and trials along the way so here I am. These are all the tips I believe you should use when you are optimizing for MySQL database performance. Some stuff may be very handy, but they may not be good for performance.

Design

  1. Normalize first, and de-normalize where appropriate. Table joins are never lags in performance. Use them!
  2. Choose the appropriate collation / charset. UTF16 is omnipotent but it requires 2 times the storage. UTF8 delivers more characters but it’s slower than latin1. Unless it’s necessary, use latin1.
  3. utf8_general_ci is slightly faster than utf8_unicode_ci, thus better in performance.
  4. Use NOT NULL wherever possible.
  5. Use indexes. Create indexes by analyzing the SELECTs you are doing on the table.
  6. Indexing increases performance in selecting, yet decreases performance in inserting. So don’t index everything but only those necessary. And don’t make duplicate indexes.

Querying

  1. Use smallest data types possible.
  2. IN (…) is a total fail. Avoid it wherever possible.
  3. ORDER BY RAND() is a fail too.
  4. When you need to SELECT … to make sure there’s no duplicate unique index values before inserting something, use INSERT … ON DUPLICATE KEY UPDATE … instead.
  5. If your app writes a lot, use InnoDB; if it reads a lot, use MyISAM. Here’s a comparison.
  6. Never use LIMIT xxxx1, xxxx2 with large tables especially when offset xxxx1 is a pretty large number. Instead, give a more strict WHERE … clause with LIMIT xxxx2.
  7. If a feature is deprecated, stop using it.
  8. Avoid wildcards % or _ at the beginning of LIKE queries, e.g. LIKE ‘%bloom’ – this should be avoided. For instance, you should store email addresses reversed (REVERSED()) so it’s faster to search by domain.
  9. If you have a large set of data rows to be inserted at once, create a combined INSERT query string in PHP first – that is, to bulk insert them in batches rather than one by one. LOAD DATA is also a better option than individual INSERT.
  10. Prefer GROUP BY over DISTINCT.
  11. Frequently visit Slow Query Log.
  12. Avoid calculated values in WHERE … , LIMIT … , ORDER BY … , etc.
  13. Use EXPLAIN … to get an idea of how MySQL is doing with your queries thus better optimize it. For example, it would suggest the indexes you should have created.
  14. Don’t SELECT * … on everything, SELECT only what you need.
  15. Be wary of lots of small queries that may be triggered in loops. Instead, combine them and use one large query wherever possible.

Storage Engines

  1. Know the pros and cons of different storage engines and use them accordingly. But try them out in the test environment before making the decision.
  2. Archive old data such as logs in ARCHIVE tables or MERGE tables.
  3. BLACKHOLE engine is very fast for busy things like logs.

my.ini

  1. Increase the default key buffer size: key_buffer = 128M
  2. Increase the default number of tables that can be kept open: table_cache = 128
  3. Increase the default sort buffer size:
    sort_buffer_size = 32M
    myisam_sort_buffer_size = 32M
  4. Disable binary logging that is for data replication by commenting out the line like this:  # log-bin=mysql-bin

What to read from here:

  1. Indexing: http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html
  2. “LIMIT” Optimization: http://dev.mysql.com/doc/refman/5.0/en/limit-optimization.html
  3. “ORDER BY” Optimizationhttp://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html
  4. Efficient Paging: http://www.slideshare.net/suratbhati/efficient-pagination-using-mysql-6187107
  5. Configuration Tweaks: http://docs.cellblue.nl/2007/03/17/easy-mysql-performance-tweaks/

{ Comments on this entry are closed }

phpMyAdmin doesn’t show the exact size in MB when the size of your database exceeds 1GB. It just shows something like 4.2GB, truncating everything out in the 100MB precision. So is it possible to get the database size in MB in exact numbers using MySQL query?

Yes, it is:

SELECT CONCAT(sum(ROUND(((DATA_LENGTH + INDEX_LENGTH - DATA_FREE) / 1024 / 1024),2))," MB") AS Size FROM INFORMATION_SCHEMA.TABLES where TABLE_SCHEMA like 'YOUR_DATABASE_NAME_HERE%' ;

Just change YOUR_DATABASE_NAME_HERE into the name of your database. And it would prints out something like this:

+------------+
| Size       |
+------------+
| 4906.79 MB |
+------------+
1 row in set (0.05 sec)

You can also get the size of a specific table. Read here: http://www.novell.com/communities/node/8706/check-mysql-database-size-using-sql-query

{ Comments on this entry are closed }

Pure CSS Rounded Corner, No Images

by Yang Yang on March 4, 2011

Though it does look great, I haven’t given it a try until the other day when I was developing smartbabynames.com. Notice the previous and next link immediately below the boy image at here:

http://www.smartbabynames.com/meaning/boy/Langleigh/

They look and feel great.

So here’s what I’ve done to achieve the rounded corner box in pure CSS – no images nor whatever. Just CSS:

.rounded {
	-moz-border-radius: 2px;
	-webkit-border-radius: 2px;
	-khtml-border-radius: 2px;
	border-radius: 2px;
}

Tested across major modern browsers other than IE. Sure it doesn’t work in IE6, but not sure if it works in IE7, IE8, IE9, etc. But let me know if you know.

{ Comments on this entry are closed }

ASCII is the most fundamental character set that has been around since the early days of command line interfaces. There are 128 (0 – 127) most basic ASCII characters such as a-z, A-Z, 0-9, and all the printable punctuations you can type out by a single strike of your keyboard.

As all ASCII characters have an internal decimal value of 0 – 127, which is 0x00 – 0x0F in heximal values, you can find all the non-ASCII characters in my_column by query:

SELECT * FROM my_table WHERE NOT HEX(my_column) REGEXP '^([0-7][0-9A-F])*$';

On the other hand, if you wish to find all records that a certain column (my_column) contains ASCII characters:

SELECT * FROM my_table WHERE HEX(my_column) REGEXP '^([0-7][0-9A-F])*$';

{ Comments on this entry are closed }

www.gmail.com – Tips

by Yang Yang on February 20, 2011

You can login to Gmail.com at here: http://gmail.google.com/

I have been a user of Gmail for almost 6 years. I have some tips on using Gmail.com to share with you that may make your life easier.

You can sign up and get an Gmail account at here: http://www.gmail.com. It’s completely free.

Enable keyboard shortcuts

When you are logged in your gmail account, press Shift + / (Shift key and / key, no plus key) and it’ll open up a very nice semi-transparent dialog that shows a full list of all the keyboard shortcuts available to you in Gmail.

Gmail keyboard shortcuts

Gmail keyboard shortcuts

All the shortcuts are disabled by default. Just click the “Enable” link to enable them all. Now you would have all the nice productivity boosters such as:

  1. c” for quick compose
  2. p” for previous message in conversation, and “n” for next message
  3. *” then “u” to select all unread conversations; “Shift + i” to mark as read
  4. Tab” then “Enter” to immediately send the message
  5. #” to move to trash
  6. r” to reply
  7. f” to forward
  8. etc.

How cool is that!

Make messages from a certain sender to skip inbox

If you use vps web hosting, you may be sent too many emails too frequently about the status (security, load, etc.) of your vps box. If these messages are left in the inbox, it’d be a nightmare because there are just too many of them and it makes you finding a certain message much harder and you would be more likely to miss a legitimate / personal message.

So naturally, we would want those messages to be received and dropped in another custom inbox rather than the main one. For example, all these messages are from root@host.example.com:

  1. Click the top settings cog in the top right corner –>
  2. Mail settings –>
  3. Filters –>
  4. Create a new filter –>
  5. From: root@host.example.com –>
  6. Next Step
  7. Check “Skip the Inbox (Archive it)” and check “Apply the label: New label…”and create a new label (such as “Alerts”) which will be a custom inbox that will be storing all messages coming from root@host.example.com –>
  8. Create Filter (done)

Now all messages coming from root@host.example.com will be automatically stored in the custom inbox “Alerts” rather than the main inbox.

There are a lot more you can do with filters that automatically processes or takes care of messages by certain criteria.

Use search operators to quickly find messages – by sender, recipient, subject, attachment, label, etc.

Just like with Google.com the web search engine, you can use search operators with Gmail.com as well. Some of my most frequently used operators with Gmail are:

  • from:jane@hotmail.com —- Find all messages / communications from jane@hotmail.com
  • to:michael@me.com —- Find all messages you sent to michael@me.com
  • has:attachment subject:samples —- Find all messages that have attachment and have ‘samples’ in the subject
  • filename:office.jpg label:photos is:starred —- Find all starred messages with label ‘photos’ that have the file office.jpg attached

Get to know more operators and examples at here: http://mail.google.com/support/bin/answer.py?answer=7190&hl=en. They will prove to be very handy.

Alias email addresses – you have unlimited sub-emails!

If your email address is ada.monroe@gmail.com, messages delivered to:

ada.monroe+work@gmail.com
ada.monroe+inquiry@gmail.com
ada.monroe+family@gmail.com
...
ada.monroe+******@gmail.com

Are all directed to ada.monroe@gmail.com. With one Gmail account, you have a literally unlimited number of sub-emails.

With filters, this would come out even more powerful because you can set up filters to automatically move, star, forward or apply labels to messages from a certain alias address.

Pre-saved messages / responses to be re-used by a click

If you find yourself typing the same (or almost the same) messages over and over again, you may want to save the message as Canned Responses which is an experimental module that you can turn on for your Gmail account in the Labs (Mail settings -> Labs).

Combining it with filters, you will have a highly automated email system that messages of certain criteria (such as coming from a specific sender or containing an arbitrary phrase) will be automatically replied to with one of your pre-saved responses.

Accessing multiple email accounts (such as from hotmail.com and yahoo.com) within your Gmail account

Other than Gmail itself, you can also set up 3rd party email services so that you can access them all from within Gmail. This would definitely make things simpler and save time if you have quite a few emails from different providers to check everyday. All you need is entering the POP3 server details and giving Gmail the permission to download the messages. Follow this guide for how to set it up, it’s quite easy: http://mail.google.com/support/bin/answer.py?hl=en&answer=21288

Shift-click to instantly select multiple messages

It’s just like what you would do in Windows. If there is a large series of multiple messages that you want selected, just select the first one and then hold Shift key to click the last. So simple and so intuitive.

{ Comments on this entry are closed }

MySQL: How to export a database / table to XML?

by Yang Yang on February 17, 2011

You can export any MySQL database or table into an XML file by the exporting capabilities of phpMyAdmin, the web interface of MySQL. The problem with this approach, however, is that with large tables, you may have to manually export the table more than once by sections into several sequential XML files.

A better approach is by native MySQL command line that will create and store all entries of the table into one XML file:

mysql -u db_user -p db_name --xml -e "SELECT * FROM table_name" > table_name.xml

Wherein db_user is your MySQL server user, db_name is the name of the database and table_name is the name of the table that you would like exported to XML. The resulted XML will be stored in table_name.xml.

Note that this is different from mysqldump in that it’s basically executing a query by the –e switch and output the results in XML (–xml). The results is then directed to the out file rather than displayed on terminal screen.

{ Comments on this entry are closed }

To delete all content in any directory, including all sub-directories and files, I’ve been using this:

rm -rf somedir/*

If it is to delete all content of the current directory:

rm -rf *

However, it turns out ‘rm -rf’ doesn’t remove hidden files such as .htaccess (Files with a name starting with a dot are hidden in Linux). To delete all the hidden files as well, I have to run a 2nd command:

rm -rf .??*

{ Comments on this entry are closed }

Linux: How to ‘find’ and search ONLY text files?

by Yang Yang on January 22, 2011

The ‘find’ command in Linux systems searches through a directory and return files that satisfy certain criteria. For instance, to find a file that contains the string ‘needle text’ in the ‘mydocs’ directory:

find mydocs -type f -exec grep -l "needle text" {} \;

The problem of this approach is that it would search through ALL files in this directory including the binary ones such as images, executables and zip packages. Sensibly, we would only want to search through text files for a specific string. If there are far too many of binary files in present, it’d be a significant waste of CPU usage and time to get what you want, because it’s totally unnecessary to go through the binary files.

To achieve this, use this version of the above command:

find mydocs -type f -exec grep -l "needle text" {} \; -exec file {} \; | grep text | cut -d ':' -f1

I asked the question at stackoverflow.com and peoro came up with this solution. It works great.

Basically, the bold part checks each file’s mime type information and only searches the files that have ‘text’ in its mime type description. According to the Linux ‘file’ command manual, we can be fairly sure that files with ‘text’ in its mime type string are text files AND all text files have ‘text’ in its mime type description string.

Thus far the best way to do this

find -type f -exec grep -Il . {} \;

Or for a particular needle text:

find -type f -exec grep -Il "needle text" {} \;

The -I option to grep tells it to immediately ignore binary files and the . option along with the -l will make it immediately match text files so it goes very fast.

{ Comments on this entry are closed }

One of my old Internet friends Brad made a very nice online slide that introduces to you some of the exciting new features of PHP 5.3. I’m most interested in namespace that would make coding in a large project and code reuse much easier, especially for people who find keeping naming conventions a challenge in the team.

Here’s the original post and slide: http://bradley-holt.com/2010/11/new-features-in-php-53/

All the new stuff and concepts are presented in a practical manner that’s extremely straightforward. Programmers should find them a breeze to crunch.

{ Comments on this entry are closed }

Was doing something with a regular expression and very oddly the connection keeps being reset every time I refresh the web page.

I tried to narrow down the problematic line by removing the code in functional chunks. Finally it comes down to a preg_match() instance with a small bit in the regular expression that’s accidentally and wrongly typed in caught my attention:

(.+)+

Got rid of the second plus sign:

(.+)

And it’s all right.

{ Comments on this entry are closed }