1 tip about ALTER TABLE xxx ADD INDEX yyy

by Yang Yang on May 14, 2009

Proper indexing can work wonders for large databases in query efficiency, it’s a compromise of disk storage in exchange for sorting speed. For databases with millions of records, indexing takes considerable amount of disk storage, and, AGONIZING lengths of time to create.

I recently compiled a database with one table mytable consisting of over 4 million records, as apparently I want some columns to be identifiable for each of the rows, I need indexing one or 2 of them to do that.

ALTER TABLE mytable ADD INDEX category

Wherein category is one of the columns by which every record in mytable is referenced to that of another table.

You know how long this short query took me?

4 hours. Much longer than the time it took me to do a unique indexing just before it, on the same table.

So there you go, a general indexing takes much much longer time to finish than unique indexing and primary key indexing.

But that’s not all of the tip today.

The tip is that, before you are inserting all the records one by one into that table which would result in a super huge set of rows, configure that particular column to be generally indexed.

This is no better solution than indexing it all after the table is fully populated though, in that with preexisting table indexes, it takes a bit longer to insert every record as MySQL has to write down redundant information every time it inserts a new row into the table.

But in terms of programmer experience, it’s much better than waiting 4 hours hoping for the indexing to succeed and fearing it would not. I’d rather spread the extra required time across each insertion of the records.

{ Comments on this entry are closed }

Largest U.S. Bankruptcies, 1980–Present

by Yang Yang on May 14, 2009

Just imagine, that much amounts of money just vaporized into thin air at a cause of nothing.

Company Bankruptcy date Total assets pre-bankruptcy (in millions)
Lehman Brothers Holdings, Inc. 9/15/2008 $691,063
Worldcom, Inc.1 7/21/2002 $103,914
Enron Corp.2 12/2/2001 $65,503
Conesco, Inc. 12/17/2002 $61,392
Pacific Gas and Electric Co. 4/06/2001 $36,152
Texaco, Inc. 4/12/1987 $34,940
Financial Corp. of America 9/9/1988 $33,864
Refco Inc. 10/17/2005 $33,333
IndyMac Bancorp, Inc. 7/31/2008 $32,734
Global Crossing Ltd. 1/28/2002 $30,185
Bank of New England Corp. 1/7/1991 $29,773
Calpine Corp. 12/20/2005 $27,216
New Century Financial Corp. 4/2/2007 $26,147
UAL Corp. 12/9/2002 $25,197
Delta Air Lines, Inc. 9/14/2005 $21,801
Adelphia Communications 6/25/2002 $21,499
MCorp. 3/31/1989 $20,228
Mirant Corp. 7/14/2003 $19,415
American Home Mortgage Investment Corp. 8/6/2007 $18,829
NTL, Inc. 5/8/2002 $16,834

1. Worldcom, Inc. assets taken from the audited annual report dated 12/31/2001.

2. The Enron assets were taken from the tax documents filed on 11/19/2001. The company has announced that the financials were under review at the time of filing for Chapter 11.

{ Comments on this entry are closed }

Google AdWords mistake (a bug?)

by Yang Yang on May 13, 2009

Naturally, you’d never expect an AdSense ad of a site to appear on itself. But this I just encountered, indicates an inherent *bug* of AdWords.

adwords bug

Or is it? But I’ll never understand otherwise. What’s the point? Maybe there’s an option for the AdWords advertiser to opt out of his advertisement listing being published on his very own site with AdSense on, but can’t Google just do it automatically. My other guess is that for a very large community site such as a popular forum, there are people (who are members of the community) who wants to advertise to the very audience (mostly members) of the site, sets up a campaign via AdWords targeting them and puts the ad exclusively on the site which is also targeted a part of it.

Anyway, I believe Google should automatically turn this off by default and let the advertiser himself to explicitly turn it on.

{ Comments on this entry are closed }

While you can always fulfill tasks such as creating a new table out of existing ones in MySQL by the help of PHP, it’d be much faster and way more efficient with native SQL queries.

Consider when you need to combine a few tables into one, more precisely, you need just certain fields / columns from each of the table and then piece them together into a new one. You can write a PHP script to do that: 1) read the original records from the tables, 2) insert them one by one with a loop. However the insanely better way is to utilize INSERT INTO … SELECT … FROM … a little more sophisticated as:

INSERT INTO new_table (col1, col2, col3) SELECT old_table_1.col2 as col1, old_table_2.col3 as col2, old_table_3.col1 as col3 FROM old_table_1, old_table_2, old_table_3

{ Comments on this entry are closed }

I’ve been DreamHost since 2006 and that’s more than 3 years of hosting experience with them on a shared plan ($9.95 / month). For those who’d prefer scanning instead of reading, I’ll sum up my points in short lists:

Pros

  1. Affordable price with generous offerings in disk storage and monthly transfer.
  2. Full pack of features – everything you can imagine for LAMP websites, they are there for you to access.
  3. 10 years of operating their own servers makes them extremely technical and the support is always to the point by knowledgeable staff.
  4. Proprietary control panel enables you to do everything without a hassle. Could be a Con though, as it’s too easy to use and somewhat addictive, once you get used to it, you’d be reluctant to switch.
  5. A lot of features are only found on DreamHost such as control panel API, Google Apps (I love it!), Jabber IM, Media Streaming and much more that are all automated and even a novice hoster can take advantage of.
  6. Open discussion forums.
  7. Very well documented wiki specifically for their users with detailed instructions to do everything.

Cons

  1. The average uptime is approximately 99.7% for shared plans – just so so.
  2. Sometimes it’s rather slow accessing your site. Blame the server load and amount of websites hosted on a single server. You can feel that you are in a shared environment without a doubt.
  3. I’ve experienced from 1 minute to 20 hours of response time from DreamHost email support. On average, it’s roughly 5 hours.
  4. Their referral program does work and the 2nd tier 5% forever rewards may successfully lock you with them “forever” because once you leave DreamHost, you leave the lifetime stream of 5% referral rewards. Pretty bad ass, huh?

DreamHost Server Stats Charts

All charts and data are gathered and provided by basicstate.com who pings (Spenser from basicstate.com points out that it’s not just pinging but also a full HTTP pull of the page) one of my sites hosted with DreamHost (shared plan) at 15-minute intervals.

DreamHost Uptime (shared)

dreamhost uptime chart

DreamHost Average Response Time (shared)

dreamhost average response time

DreamHost Total Response Time (shared)

dreamhost total response time

Conclusion

DreamHost (I can only speak of their shared plans which I have experience with) is absolutely worth the money if you are hosting hobbyist or personal sites that are not so mission critical. As a result of their sophisticated technical background, the hosting environment gives you as much freedom as you need yet balanced well in security and overall server well-being. It’s a paradise for web developers and non-technical users at the same time.

I’d especially recommend DreamHost to beginner webmasters for its affordable price and pack of features – you literally can’t find anything that’s offered by competitive hosting companies not in DreamHost’s menu. And, while you can do everything from the control panel, you can still utilize SSH to do advanced stuff. That said, the biggest drawback of DreamHost becoming a fantastic host is the reliability and performance issue that haunts its shared users.

DreamHost will be a terrific journey for you to learn and grow until you have to find a more solid pie to ensure your websites. I might go with their Private Server (Their way of VPS, a little more pricey than their shared plans in my opinion) plans soon, but with the same amount of money why don’t I go with another hosting company and try them out and in addition, to have a few different C-class IPs for the sake of SEO. ;)

Shameless Self Promotion: $60 DreamHost Coupon Code

Well, if you think my review is helpful and feel like going ahead, I’d very much appreciate it if you could use coupon CASH60 to sign up at DreamHost. It gives you an instant $60 discount with all yearly plans.

{ Comments on this entry are closed }

A good practice of increasing SQL efficiency is to reduce the number of separate queries as much as possible. In a situation where you want a row updated if it exists and inserted if it doesn’t may take a newbie MySQL developer to write 2 independent queries, namely:

  1. first, check if the row exists with “SELECT * FROM table WHERE …”
  2. second, if it exists, “UPDATE table SET …”; ir it doesn’t exist, “INSERT INTO table …”

thus loading the server with unnecessary burdens. Checking if a certain record already exists and decide between insert or update can be simply achieved by a single MySQL query:

REPLACE INTO table SET id = 42, foo = 'bar';

REPLACE INTO first checks if the row with id = 42 already exists, if it does, it deletes the row and insert the current one; if it doesn’t, it simply inserts the current record.

Another way is to use ON DUPLICATE KEY UPDATE clause in a paradigm such as INSERT … ON DUPLICATE KEY UPDATE to determine if the record to be inserted can cause a duplicate entry in a UNIQUE index or PRIMARY KEY index, if it doesn’t, it inserts it, if it does, the existing row is updated. The following two statements have the same results:

INSERT INTO table (a, b, c) VALUES (1, 2, 3) ON DUPLICATE KEY UPDATE c = c + 1;
// column ‘a’ is UNIQUE indexed, thus to prevent duplicate UNIQUE column values, the old row (whose a = 1) is updated with a new value of ‘c’
UPDATE table SET c = c + 1 WHERE a = 1;

{ Comments on this entry are closed }

3 rather interesting computer related how to

by Yang Yang on May 2, 2009

Your mouse suddenly got offline by itself and you only got half an hour before the presentation yet the PPT still needs a few adjustments. Thank God you know how to use computer without a mouse.

A retiring computer is hardly environment-neutral, you need to dispose them properly as various parts of it contain toxic chemicals. Basically, you have 4 ways to recycle a computer.

This is the fun part. An essential part of being a good hacker, many may not know, is to be able to guess passwords. People are too easy to use catchy texts or numbers for their passwords, if you know them well enough, chances are you may have guessed it right after a few tries.

{ Comments on this entry are closed }

What is a programming framework?

by Yang Yang on May 2, 2009

Programmers write programs for users so certain processes in the workflow can be automated for efficiency and correctness.

As such, to improve the programming efficiency of similar projects for addressing a common set of related problems, such as the need for content management system in web authoring, some programmers write programming frameworks to integrate the bare bone workflow of a system that acts like a prototype. It is then on the prototype that more details are added and project-specific needs are satisfied.

While you can create a php website from scratch, spinning out code line by line, you may also consider using one of the frameworks such as cakephp and zend framework to accelerate your development if you are familiar with one of them.

However, it’s not all goodness coming out of enjoying such an established foundation. First of all, you have to learn them and some modern frameworks such as Zend Framework are well established that it’s just not easy to master it. While you may spend time learning other people’s framework, you can easily create your own, especially if you have done enough coding and prototyping and have always wanted to be faster in production by reducing repeating stuff, which is the essence of programming frameworks.

Every seasoned programmer has his own library of frameworks, one way or another. If you speak PHP, you may want to read this php framework comparison article by chad.

{ Comments on this entry are closed }

VeriSign, the central registry for COM domains, recently released a new tool called Data Analyzer. Data Analyzer allows you to check the amount of traffic a domain name gets, even if the domain does not exist.

So just the kind of traffic analysis tool that examines the type-in traffic of unregistered domains (.COM). Well, not just type-in. Imagine a domain that has some history but recently got abandoned, wouldn’t it be great to know that? To sum up, you can use this tool to:

  1. Grab valuable unregistered .com domains with decent type-in traffic.
  2. Know mistyped versions of your current domains, those your visitors tried but failed to get to your site.
  3. Hunt for abandoned or dropped domains that have a history and traffic.

The tool has just now been released to domain registrar Dynadot at this address. If you are not already a Dynadot customer, sign up for an account.

Where does the data come from?

Verisign, being the central registry for COM domains, also runs the root name server for COM domains. Each time you type in a domain name in your browser, Verisign’s root name server will receive a query for that domain name. Verisign stores all queries, including those for domains that do not exist, in their massive Data Analyzer database.

How do I use the Data Analyzer?

You can find the Data Analyzer by logging into your Dynadot account and clicking on the “Search” tab and then “Data Analyzer Tool” sub menu.

On this page you can submit a list of domain names that you want to check the traffic for. They will send your domains to Verisign, and the next day you can sign back into your account and check the results. Each domain name will receive a score between 0 and 10. 0 means no traffic, and 10 means lots of traffic.

How can the Data Analyzer help me?

Say for example you own the domain name “acmewidgets.com“. You want to check to see if your customers are mistyping your domain name, and thus not getting to your website. You can submit a list of domains to the Data Analyzer such as:

axmewidgets.com
acmwidgets.com
acmewigets.com

If any of these domains return a high score, then you can register the domain name, and forward the traffic to your correct domain name.

How many domains can I upload?

You can upload as many domains as you want. In fact some customers have already uploaded millions of domains to be analyzed by the Data Analyzer. However, if you upload lots of domains, it may take many days to finish analyzing your domains. Verisign limits the number of domains Dynadot can submit to them each day.

{ Comments on this entry are closed }

Katrina Costedio Steals (KatrinaCostedio.com)

by Yang Yang on April 29, 2009

Katrina Costedio currently lives in Sarasota, Florida and has a small design firm (katrinacostedio.com) that attends to small business needs doing all kinds of design work: print design, card design, logo design and web design. She’s planning to go to San Francisco, California, though.

I worked for her on 2 projects, namely maahs acupuncture clinic site, including PSD to CSS/HTML service and a php contact form; and obrienorganics.com including ALL stuff that’s basically from PSD design to everything – PSD to CSS/HTML, php CMS and shopping cart. Other than these, I have helped her with other sites as well:

  1. some wordpress work for burkethejerk.com
  2. fixing a few css and javascript errors for searchlight-partners.com
  3. there’s some more I can’t remember now…

Katrina Costedio would have been a good lady, if she has kept her words and not stolen my work without paying me a penny.

Below is an email message I planned to send her but understandably as  she might have blacklisted me, I’ll just post it here for all to see:

So obrienorganics.com launches with my code? You know you haven’t paid me a penny for it, so technically it’s all my property and you are stealing. We never had any contract that says if all goes this way, you unconditionally own everything I did.I thought you said you would be losing thousands dollars more to finish what I started. I beg your pardon. It’s you who shut me off from the servers and projects by suddenly changing all the login passwords. I feel stupid and betrayed the next morning when I found out I couldn’t log in any more. Not to mention you never started from scratch again. Instead, you stole my work shamelessly for your own clients. Because I have spent a month on your project and it’s almost done!

Katrina Costedio is stealing.

You crapped my code as unsecure and stolen yet you use it with no shame. What kind of b*tch are you? (I still have all her original words in my mail box.)

Also we haven’t finished up with my “copyright” issues yet, which file did you say I stole from other people and didn’t even bother to remove the copyright information? Ohhh, if it happends to be jquery.js or csshover3.htc, they are licensed under LGPL for crying out loud and almost all sites use them! I can’t believe you were trying to come up with this kind of shit to demonize me? So that you can take my hard work free? At least get a programming class before doing that, katie.

{ Comments on this entry are closed }