
Linux: How to ‘find’ and search ONLY text files?

The ‘find’ command in Linux searches through a directory and returns the files that satisfy certain criteria. For instance, to find the files that contain the string ‘needle text’ in the ‘mydocs’ directory:

find mydocs -type f -exec grep -l "needle text" {} \;

The problem with this approach is that it searches through ALL the files in the directory, including binary ones such as images, executables and zip packages. Sensibly, we would only want to search through text files for a specific string. If a lot of binary files are present, it’d be a significant waste of CPU time to get what you want, because going through the binary files is totally unnecessary.

To achieve this, use this version of the above command:

find mydocs -type f -exec grep -l "needle text" {} \; -exec file {} \; | grep text | cut -d ':' -f1

I asked the question at stackoverflow.com and peoro came up with this solution. It works great.

Basically, the appended part, -exec file {} \; | grep text | cut -d ':' -f1, checks each file’s type description as reported by the ‘file’ command and keeps only the files that have ‘text’ in it. According to the Linux ‘file’ command manual, we can be fairly sure that files with ‘text’ in their type description are text files, AND that all text files have ‘text’ somewhere in that description.

Thus far, the best way to do this:

find -type f -exec grep -Il . {} \;

Or for a particular needle text:

find -type f -exec grep -Il "needle text" {} \;

The -I option tells grep to skip binary files outright, and the . pattern (a regular expression matching any single character, not an option) along with -l makes grep report each text file at its first matching line instead of reading it through, so it goes very fast.
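By the way, if your system ships GNU grep, it can walk the directory tree by itself, so the find scaffolding becomes optional in simple cases. A minimal equivalent of the last command, assuming GNU grep:

grep -rIl "needle text" mydocs

The -r switch does the recursion that find handled before, while -I and -l behave exactly as described above.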


How to bring down / optimize memory usage in your unmanaged Linux VPS box and avoid OOM (Out Of Memory) errors?

The other day I was very upset about some extraordinary downtime of my unmanaged VPS box at Linode. As it’s unmanaged, support staff at Linode are not responsible for the failures. I contacted them and they told me it was OOM (Out Of Memory), pointing me to the right documentation so I could sort the problem out myself. After a few tweaks and a week of observation, it seems that I have successfully optimized my VPS server to take on more traffic with fewer resources such as RAM.

The problem almost always lies where users are free to feed input to your website or program. Sometimes Convert Hub spikes in memory usage and forces my box into swap, which relies on disk I/O to work. This happens when someone uploads an ultra large picture to be processed or converted. While I could restrict the size of pictures allowed to be uploaded, I can also apply the following settings to optimize the entire LAMP environment so the other websites benefit as well.

Apache 2 Low-Memory Optimization

Use this command to identify the MPM you are using:

apache2 -V | grep 'MPM' # for Debian-based systems
httpd -V | grep 'MPM' # for Fedora/CentOS systems

Find and change these settings in your Apache 2 configuration file (usually found at /etc/apache2/apache2.conf):

StartServers 1
MinSpareServers 3
MaxSpareServers 6
ServerLimit 24
MaxClients 24
MaxRequestsPerChild 3000
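Note that these directives belong to the prefork MPM, which is what most LAMP stacks running mod_php use. If that’s what the command above reported, a minimal sketch of the tuned section, wrapped so it only takes effect under prefork, might look like this (same values as above):

<IfModule mpm_prefork_module>
    StartServers            1
    MinSpareServers         3
    MaxSpareServers         6
    ServerLimit            24
    MaxClients             24
    MaxRequestsPerChild  3000
</IfModule>

After editing, check the syntax and reload Apache, e.g. apache2ctl configtest && /etc/init.d/apache2 reload on Debian-based systems.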

Switch to Lighttpd or Litespeed if possible.

MySQL Low-Memory Optimization

Same as above, find and change these settings in the MySQL configuration file (may be at /etc/mysql/my.cnf) accordingly:

key_buffer = 16K
max_allowed_packet = 1M
thread_stack = 64K
table_cache = 4
sort_buffer = 64K
net_buffer_length = 2K
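These values closely mirror the my-small.cnf example bundled with MySQL for low-memory machines. Restart MySQL for the changes to take effect, and optionally verify the running values; a quick sketch, assuming a Debian-style init script and MySQL root access:

/etc/init.d/mysql restart
mysqladmin -u root -p variables | grep -E 'key_buffer|table_cache'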

PHP Low-Memory Optimization

Find your PHP configuration file (php.ini) and lower the PHP script memory limit to 32M or less (the default is 128M):

memory_limit = 32M
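To confirm the new limit is actually in effect, query it from the command line (note the CLI may read a different php.ini than the Apache module does):

php -i | grep memory_limit

If the values disagree, drop a page containing <?php phpinfo(); ?> on the web server and check memory_limit there instead.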

Use .htaccess to allow access only from a single HTTP referrer

Sometimes you want the user to access something (a web page or a downloadable file) only by clicking a link on your own website, instead of being able to access it directly by typing the URL into the browser address bar. This is achievable with a few lines in .htaccess.

RewriteEngine On
RewriteCond %{HTTP_REFERER} !(www\.)?example\.com/download-page\.php
RewriteRule .* - [F]

Put the above lines in the .htaccess of the directory that you want users to access only by clicking links on http://www.example.com/download-page.php or http://example.com/download-page.php. Direct access to anything in the directory, or access from any other HTTP referrer, will fail with 403 Forbidden.

While this is not bulletproof, as referrer information can be faked from the client side, it is a simple solution that should suffice in most cases. For example, it can be used to prevent hot linking, where other websites link directly to something on your site and steal your bandwidth.
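One caveat: the rules above also block visitors whose browsers or proxies strip the Referer header entirely, because an empty referrer fails the match as well. If you would rather let those users through, exempt empty referrers with an extra condition; a variation on the rules above:

RewriteEngine On
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !(www\.)?example\.com/download-page\.php
RewriteRule .* - [F]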


Turn off and disable magic_quotes_gpc in .htaccess

magic_quotes_gpc is not only insecure but also inconveniently forces the use of the PHP function stripslashes() every time you pull something from the database or receive something from the client side. While most hosts out there use factory PHP settings that turn magic_quotes_gpc off by default, there are a few that don’t.

The value of magic_quotes_gpc cannot be set with the ini_set() function after PHP 4.2.3. Some hosts allow a custom php.ini in your home directory, which you can use to set magic_quotes_gpc to 0 (zero) or Off; otherwise, you’d have to resort to .htaccess to set the PHP configuration values for your local directories.

To turn magic_quotes_gpc off in .htaccess (along with its cousin magic_quotes_runtime), simply put these lines in the .htaccess file of the site / directory wherein you want them disabled:

php_flag magic_quotes_gpc off
php_flag magic_quotes_runtime off
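To confirm the setting took effect, a quick check from any PHP page on the site (get_magic_quotes_gpc() is the stock function for this; also note that php_flag only works when PHP runs as an Apache module):

<?php
// prints bool(false) once magic_quotes_gpc is successfully disabled
var_dump((bool) get_magic_quotes_gpc());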

Use stat command to display file system meta information of any file or directory under Linux

PHP has a stat() function that returns an array containing the meta information of a file such as owner, size, and times of last access, modification and change. It mirrors the stat command under Linux, which shows the file system meta information of any file or directory:

stat myfile.txt

Which returns:

  File: `myfile.txt'
  Size: 1707            Blocks: 8          IO Block: 4096   regular file
Device: 811h/2065d      Inode: 96909802    Links: 1
Access: (0644/-rw-r--r--)  Uid: (1354144/    voir)   Gid: (255747/pg940032)
Access: 2010-02-16 08:00:00.000000000 -0800
Modify: 2010-02-18 04:16:51.000000000 -0800
Change: 2010-02-18 04:16:51.000000000 -0800
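If you only need a field or two rather than the whole block, GNU stat takes a custom format string; for instance, %s is the size in bytes and %y the last modification time:

stat -c '%s bytes, modified %y' myfile.txt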

To get the meta information of the current working directory:

stat .

Which returns:

  File: `.'
  Size: 4096            Blocks: 8          IO Block: 4096   directory
Device: 811h/2065d      Inode: 96904945    Links: 4
Access: (0755/drwxr-xr-x)  Uid: (1354144/    voir)   Gid: (255747/pg940032)
Access: 2009-08-31 17:07:16.000000000 -0700
Modify: 2009-12-20 05:18:57.000000000 -0800
Change: 2009-12-20 05:18:57.000000000 -0800
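For comparison, a minimal sketch of the PHP side, reading the same kind of meta information out of the array that stat() returns (assuming myfile.txt sits in the script’s working directory):

<?php
// stat() returns both numeric and named indices; the named ones read better
$info = stat('myfile.txt');
echo 'Size: ' . $info['size'] . " bytes\n";
echo 'Modified: ' . date('Y-m-d H:i:s', $info['mtime']) . "\n";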

Linux: How to open and extract an RAR zipped file and unrar the archive?

Funny I should use “zipped” for an RAR compressed package. Anyway, you can easily zip or unzip a zip file or tar up a package, but how does one do it with an RAR file? WinRAR is well distributed across Windows systems, but on Linux you first have to install the rar command package.

However, if your host has been around for quite some time, such as DreamHost, you won’t need to install it yourself as it comes with the system. Just fire up this command to unrar any RAR archive:

rar x myfiles.rar

Which will then extract all the data from myfiles.rar into the current working directory.
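To peek inside the archive without extracting anything, the same binary also has a list command:

rar l myfiles.rar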

There are other commands you can rely on to achieve the same task though, depending on your host and the server’s distribution. For example, you may have unrar instead of rar. Other than these, you can also find RAR-related packages on Debian and Ubuntu by:

aptitude search unrar

It will search and show you related available packages:

p   unrar-free                           - Unarchiver for .rar files

Which is another utility to unrar RAR files on Linux. Just install it by aptitude install unrar-free and use it to unpack the compressed RAR.


scp, rsync: Transfer Files between Remote Servers via SSH

Chances are you have a bunch of different hosts housing your website files, for the sake of data safety (never put all your eggs in one basket) and possibly some SEO advantage. If that is the case, you will occasionally need to move some files from one host server to another. How does one do that?

Well, the straightforward answer is to download the files from the source host and then upload them to the destination one via FTP. That’s not much of a time-waster with a small number of files, especially small ones. However, if it’s an impressively large chunk of data, say 4GB, or thousands of files, this would be quite a daunting job that may very well take the better part of your day or even a few days.

The shortcut is to transfer those files directly from the original host to the other via SSH. That is, of course, if you have SSH enabled on both hosts.

scp Command

Log into the destination host via SSH and try the following command:

scp -r remoteuser@remote.host.com:/home/remoteuser/dir-to-be-transferred/. /home/localuser/backup

Wherein remote.host.com is the address of the source host and remoteuser is the SSH user (shell user) account that can read the remote directory to be transferred, namely /home/remoteuser/dir-to-be-transferred. The last argument is the local path that receives the incoming files / directory.

The dot at the end of dir-to-be-transferred makes sure that all hidden files such as .htaccess are copied as well. Without the current directory sign (dot), hidden files are NOT copied by default.

You can also transfer a specific file:

scp remoteuser@remote.host.com:/home/remoteuser/mybackup.tar.gz /home/localuser/backup

As a matter of fact, scp works exactly the same way as an ordinary cp command, except that it can copy files back and forth between remote hosts. The “s” of “scp” stands for secure, because all the data is encrypted in transit over SSH.

It’s a great way to back up your valuable website data across multiple hosts that are physically far away from each other. With crontab jobs doing the regular backups automatically, this is arguably even better than some of the commercial backup services.
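For instance, a hypothetical crontab entry that pulls the remote directory every night at 3:30 AM; this assumes key-based SSH authentication is already set up, since cron can’t type a password:

30 3 * * * scp -r remoteuser@remote.host.com:/home/remoteuser/dir-to-be-transferred/. /home/localuser/backup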

rsync Command

rsync is a preferable alternative to scp for synchronizing stuff across different hosts because it compares differences and works incrementally, thus saving bandwidth, especially with large recurring backups. For example:

rsync -av --progress remoteuser@remote.host.com:/home/remoteuser/dir-to-be-transferred /home/localuser/backup

This would copy and transfer the directory dir-to-be-transferred with all its content into backup so that dir-to-be-transferred is a sub-directory of backup.

rsync -av --progress remoteuser@remote.host.com:/home/remoteuser/dir-to-be-transferred/. /home/localuser/backup

With an extra /. at the end of the source directory, only the contents of the directory dir-to-be-transferred are copied and transferred into backup. Thus all the contents of dir-to-be-transferred become immediate children of backup.

To make the transfer of a very large file resumable, use the -P switch, which is shorthand for --partial (keep partially transferred files so they can be resumed) plus --progress:

rsync -avP remoteuser@remote.host.com:/home/remoteuser/large-file.ext /home/localuser/backup

So when the transfer is interrupted, just run the same command again and rsync will automatically continue from the break point.

To specify the SSH port, such as 8023, just add:

 --rsh='ssh -p8023'
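So a complete resumable transfer over a non-standard SSH port would look like this (same hypothetical hosts and paths as above):

rsync -avP --rsh='ssh -p8023' remoteuser@remote.host.com:/home/remoteuser/large-file.ext /home/localuser/backup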

Unlike scp, rsync picks up hidden files by default, so there’s no need to add a dot at the end of the source directory for that purpose; the trailing /. only controls whether the directory itself or just its contents land in the destination, as shown above.

To exclude a specific directory from being synchronized:

 --exclude 'not/being/transferred'
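The pattern is taken relative to the source directory. Putting it all together with the same hypothetical paths, a sync that skips that sub-directory would be:

rsync -av --progress --exclude 'not/being/transferred' remoteuser@remote.host.com:/home/remoteuser/dir-to-be-transferred/. /home/localuser/backup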

Linux: Find files changed or modified within xx day or older than xx day

One of the utility commands of Linux that you should know on the first day of your Linux learning is find.

To search recursively in the directory somedir for files changed / created / modified within 1 day (note that -ctime actually tracks the inode change time, which covers content changes as well as permission and ownership changes):

find somedir -ctime -1

Or within 5 days:

find somedir -ctime -5

To search recursively in the directory somedir for files changed / created / modified more than 1 day ago:

find somedir -ctime +1

Or more than 5 days ago:

find somedir -ctime +5

To search recursively in the directory somedir for files modified within 1 day:

find somedir -mtime -1

To search recursively in the directory somedir for files modified more than 1 day ago:

find somedir -mtime +1

Or more than 2 weeks ago:

find somedir -mtime +14
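For finer granularity, GNU find also counts in minutes; for example, to list files under somedir modified within the last 30 minutes:

find somedir -type f -mmin -30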

Linux: Change Directory or CD to the Previous Directory / Last Path

cd is the command in Linux to change the current working directory. While you can change to your home directory by cd ~, you can change to the previous directory you were in by:

cd -

Which comes in very handy when you are working back and forth across multiple directories. Note that the shell only remembers one previous directory (kept in $OLDPWD), so running cd - repeatedly just toggles between the last two locations. To reach further back in your path history, use the directory stack built into bash via pushd and popd, as sketched below.
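A minimal sketch of the directory stack in bash:

pushd /var/log      # change to /var/log, pushing the old directory onto the stack
pushd /etc          # change to /etc, stacking /var/log beneath it
dirs -v             # show the stack with indices
popd                # pop /etc off and land back in /var/log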

Linux: Check how much disk storage each directory takes up (Disk Usage command – du)

The Linux command du stands for disk usage; it is used to check the amount of disk storage any particular directory or file is using. By default, the simple command:

du

Would return the disk usage, in God-knows-what-unit (1024-byte blocks, as it turns out), of each of the directories in the current working directory and those beneath them, in a recursive manner. If you happen to have lots of them, the returned stats scroll by crazily, which makes the output barely useful.

Even if you specify a particular directory such as "somedir":

du somedir

It still works in this uncomfortable way.

The solution is to use the -sh switch, the only switch a beginner will ever need:

du -sh

Which simply returns the amount of disk space that the current directory and all the stuff in it are using as a whole, something like:

2.4G

Much, much more intuitive and readable.

By:

du -sh somedir

You can find out how much disk storage directory "somedir" is using:

101M    somedir

To get the disk usage of each child directory and file under the current directory, simply use the asterisk wildcard:

du -sh *

It will then list the disk usage of each of them (a single summarized total per entry rather than a recursive listing) in a very readable manner:

8.0K    dir1
1.4G    dir2
135M    dir3
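To spot the space hogs at a glance, sort that output by human-readable size (the -h flag of sort requires GNU coreutils 7.5 or later):

du -sh * | sort -h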