Just signed up with Mosso Cloud Servers

by Yang Yang on May 24, 2009

Unmanaged hosting provided by Rackspace acquisition Mosso, Cloud Servers can be really a bargain considering their dedicated support and ease of use from the control panel.

The pain lies in the server administration and stuff which I’m still learning my ropes with. But hey, for less than $15 a month, it’s fun!

I’ll throw in a preliminary review after a month or so with them. At the same time, for premium hosting needs, you may want to try mosso Cloud Sites.

{ Comments on this entry are closed }

Back when WordPress was pretty young there’s some loopholes that enable hackers to inject unauthorized and dangerous HTML code into your website pages, thus promoting the distribution of malware that damages the end users computer. I was once there and got penalized by Google for one of my sites. However, they are gentle enough to detect that this might not be my fault but still decided to bring down the overall ranking of all the pages on that site for a while to protect Internet users and notify me.

If you have spotted anything suspicious or sense that your overall site ranking is down, you may want to check it out for sure if your site has been infected with malware or anything else that’s a threat to your site and the visitors.

Just go here: http://www.google.com/safebrowsing/diagnostic?site=example.com

And Google will present you a detailed report of what they have found on your site for the last 90 days.

{ Comments on this entry are closed }

MySQL: Find Duplicate Entry / Rows / Records

by Yang Yang on May 19, 2009

A quick tip – I kind of have the feeling that I’ve covered this before, but whatever – well, just the simple SQL query below will do:

CREATE TABLE temp SELECT id, COUNT(id) FROM `book` GROUP BY isbn, subject HAVING COUNT(id) > 1

Wherein each combination of column isbn and column subject in each of the rows from table book should be unique and you want to find out those duplicate ones. You can simply select without creating a temporary table to store the duplicates, but this would come handy when the book table has like 1 million records.

To do more by not just finding duplicate records but also extracting the unique ones, read this.

{ Comments on this entry are closed }

Today I’m optimizing some MySQL tables with large number of records – 1 million of them, yeah, I know – , it’s simply impossible to deal with such big chunks of data without proper indexing. So there I was, adding a variety of indexes to a few of the columns.

It’s expected it’d take a few hours to complete with such mega size tables, but what’s not expected is that after 2 days, the index adding seems to be still going on as all the tables are not responding to any queries. I can’t even select the first record. It takes *forever*, both from phpmyadmin and shell mysql.

What’s the catch? I’ve no idea. But let’s at least free those poor tables from choking first. So after dropping the index I was trying to add from mysql command line, all the tables returned normal.

Yeah.

DROP INDEX culprit ON bigtable

But the ultimate convict still eludes me. Any ideas? I still need to add those indexes.

{ Comments on this entry are closed }

PHP can execute shell commands, which means it can compress selected files into a archive or zipped package:

exec('tar zcf files.tar.gz file1 file2 file3');

Or make a zip file:

exec('zip files.zip file1 file2 file3');

You can then use PHP to render and supply download of the zip file to your visitor.

{ Comments on this entry are closed }

There are times when you need to store a file (such as one that you sell for profit) outside of the document root of your domain and let the buyers download it via a PHP script so as to hide the real path, web address or URL to that file. Use of this approach enables you to:

  1. Check for permissions first before rendering the file download thus protecting it from being downloaded by unprivileged visitors.
  2. Store the file outside of the web document directory of that domain – a good practice in web security in protecting sensitive and important data.
  3. Count the number of downloads and collect other useful download statistics.

Now the actual tip. Given that you have put the file to be downloaded via the PHP script in place at /home/someuser/products/data.tar.gz, write a PHP file with the following content in it and put it in the web document directory where your site visitors can access:

$path = '/home/someuser/products/data.tar.gz'; // the file made available for download via this PHP file
$mm_type="application/octet-stream"; // modify accordingly to the file type of $path, but in most cases no need to do so

header("Pragma: public");
header("Expires: 0");
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header("Cache-Control: public");
header("Content-Description: File Transfer");
header("Content-Type: " . $mm_type);
header("Content-Length: " .(string)(filesize($path)) );
header('Content-Disposition: attachment; filename="'.basename($path).'"');
header("Content-Transfer-Encoding: binary\n");

readfile($path); // outputs the content of the file

exit();

Now your site visitors can and can only download the protected file via the PHP script.

{ Comments on this entry are closed }

Well, MySQL doesn’t actually have anything called Foreign Key, it’s just my way of saying a alien column from another table.

Let’s say 2 tables are related, 1 is books and the other is subjects. Any book in book table falls into one of the subjects, thus having a subject column referencing the ID of one of the subjects in subject table.

You need to know the number of books from each of the subjects, namely, 243 books in subject Education, 77 books in Science, etc.

Now how do we do this? Simple. Just perform the following SQL query:

SELECT subject.id, COUNT(book.id) FROM subject, book GROUP BY book.subject HAVING book.subject = subject.id

{ Comments on this entry are closed }

As MySQL doesn’t have inherent support for updating more than one rows or records with a single update query as it does for insert query, in a situation which needs us to perform updating to tens of thousands or even millions of records, one update query for each row seems to be too much.

Reducing the number of SQL database queries is the top tip for optimizing SQL applications.

So, is there any other way around with just a few MySQL queries that  equals to millions of single row update queries?

Yes, but you need a temporary table.

Step 1: Create a temporary table

This table should have 2 columns: 1) an ID column that references the original record’s primary key in the original table, 2) the column containing the new value to be updated with.

Creating this table, you can use insert queries as it conveniently inserts more than one rows at a time (with a single query) – actually, as many rows as you need, like 1000 a time.

Step 2: Transfer the new values from the temporary table to the intended table

Use the query below:

UPDATE original_table, temp_table SET original_table.update_column = temp_table.update_column WHERE original_table.id = temp_table.original_id

To transfer all the new values you have in the temporary table to the original table. After this, you have successfully updated the original table with much much less than a million queries, probably just a hundred or so.

UPDATE =============================================

So here’s perhaps a better idea in most of the cases, use the CASE WHEN ELSE END syntax of MySQL to update multiple rows in the options table of your web application:

UPDATE `options` SET `value` = CASE `name`
WHEN 'site_name' THEN 'My Blog'
WHEN 'site_url' THEN 'http://www.example.com'
WHEN 'site_email' THEN 'sam@example.com'
ELSE `value`
END

Make sure you include the ELSE `value` because if you don’t, the other rows not mentioned here will be updated to have empty values.

Update multiple records in a table by `ID` =============================================

Below is an example of updating more than one rows in a table by the id field. The field_name field is updated to the corresponding value: value_1, value_2, or value_3 if the id equals to 1, 2, or 3:

UPDATE `table_name` SET `field_name` = CASE `id`
WHEN '1' THEN 'value_1'
WHEN '2' THEN 'value_2'
WHEN '3' THEN 'value_3'
ELSE `field_name`
END

Yet another way =============================================

There’s also another way of doing this: http://www.kavoir.com/2009/05/mysql-insert-if-doesnt-exist-otherwise-update-the-existing-row.html

Or: INSERT … ON DUPLICATE KEY UPDATE

{ Comments on this entry are closed }

To find / identify a part of an original string and replace it with another string in a SQL query with MySQL, use REPLACE(), the string replace function.

For example, you may want to replace all spaces in one of the varchar or text fields to a single dash:

UPDATE `table` SET `old_field` = REPLACE(`old_field`, ' ', '-')

Or, you create a new column / field out of the old one and:

UPDATE `table` SET `new_field` = REPLACE(`old_field`, ' ', '-')

{ Comments on this entry are closed }

One of the principles of data integrity is to reduce data redundancy as much as possible and wipe out any duplicate database table entries. Another reason for this is that you need to add unique indexing to one or a group of table columns and you need the one column or combination of columns to be unique through out all the rows of the table.

There are basically 2 ways to achieve that:

  1. Identifying duplicate entries, leaving just one of them and deleting the rest.
  2. Identifying only unique table rows and collect them into a new table.

I’ll just go ahead with the 2nd approach.

There are in turn 2 ways to accomplish this task, namely SELECT DISTINCT … FROM … and SELECT … FROM GROUP BY

SELECT DISTINCT … FROM …

Rather intuitively, this SQL query selects all distinct combination of columns from a table and leaving out duplicate ones:

CREATE TABLE new_orders SELECT DISTINCT receipt, quantity, price FROM orders

One thing to note is that you have to leave out the original primary key such as ID so that all rows are only compared for uniqueness on necessary data fields. It is after this creation query is successfully completed when you add a primary key indexing column to the new_orders table.

SELECT … FROM GROUP BY …

GROUP BY clause is usually used for statistical functions of MySQL such as COUNT() or AVG() to only take a distinguished group of columns into calculation. In this situation, however, it comes handy that it inherently excludes multiple table records with the same value in the specified column and includes just one of them in the returned results.

For example, if you need to include only those records from the old table orders that’s with distinct values in column receipt:

CREATE TABLE new_orders SELECT * FROM orders GROUP BY receipt

Now all the records in the new table new_orders are with distinctly unique receipt values.

Depending on whether the original records are referenced by other tables via ID, you can regenerate the ID or leave them in the new table new_orders.

{ Comments on this entry are closed }