The ‘find’ command on Linux systems searches through a directory tree and returns the files that satisfy certain criteria. For instance, to find the files that contain the string ‘needle text’ in the ‘mydocs’ directory:
find mydocs -type f -exec grep -l "needle text" {} \;
The problem with this approach is that it searches through ALL files in the directory, including binary ones such as images, executables and zip packages. Sensibly, we only want to search text files for a specific string. If the directory holds many binary files, scanning them wastes a significant amount of CPU time for no benefit.
To achieve this, use this version of the above command:
find mydocs -type f -exec grep -l "needle text" {} \; -exec file {} \; | grep text | cut -d ':' -f1
I asked the question at stackoverflow.com and peoro came up with this solution. It works great.
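Basically, the added part (everything from ‘-exec file’ onward) runs ‘file’ on each file that grep matched, keeps only the results whose type description contains ‘text’, and cuts the output down to the file name. Note that, as written, grep still reads every file first; the ‘file’ check only filters the results afterwards. According to the Linux ‘file’ command manual, we can be fairly sure that files with ‘text’ in their type description are text files AND that all text files have ‘text’ in their type description.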
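To keep grep away from binary files altogether, the type check can run before the search instead of after it. The following is only a minimal sketch under a few assumptions: the function name find_in_text_files is made up for illustration, and it assumes a ‘file’ implementation that supports the -b and --mime-type options (as the common Linux one does).

```shell
# Hypothetical helper (not from the original post):
# search only the files that `file` classifies as text.
# Usage: find_in_text_files DIR PATTERN
find_in_text_files() {
    dir=$1
    pattern=$2
    # The first -exec acts as a test: it succeeds only when the MIME type
    # printed by `file` starts with "text/". find runs the second -exec
    # (the grep) only for files that passed the first one.
    find "$dir" -type f \
        -exec sh -c 'file -b --mime-type "$1" | grep -q "^text/"' sh {} \; \
        -exec grep -l -- "$pattern" {} \;
}
```

For example, find_in_text_files mydocs "needle text" would print the text files under ‘mydocs’ that contain the string. It starts a shell and a ‘file’ process per file, so it trades grep time for process start-up time.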
The best way to do this that I have found so far:
find -type f -exec grep -Il . {} \;
Or for a particular needle text:
find -type f -exec grep -Il "needle text" {} \;
The -I option to grep tells it to immediately ignore binary files, and the . pattern together with -l makes it match text files right away, so it goes very fast.
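The two commands above can be wrapped as small shell functions. This is a sketch, not a definitive implementation: the names list_text_files and search_text_files are hypothetical, and it assumes a grep that supports -I (GNU and BSD grep do).

```shell
# Hypothetical helpers built on the commands from this comment.

# List every non-empty text file under a directory (default: current dir).
list_text_files() {
    # "." matches any character, so the first non-empty line is a match;
    # -I treats binary files as non-matching; -l prints file names only.
    find "${1:-.}" -type f -exec grep -Il . {} \;
}

# List the text files under DIR that contain PATTERN.
# Usage: search_text_files DIR PATTERN
search_text_files() {
    find "$1" -type f -exec grep -Il -- "$2" {} \;
}
```

For example, search_text_files mydocs "needle text". Ending the -exec with {} + instead of {} \; passes many files to each grep call and is usually faster on large trees.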
thanks a lot
grep has an option -I which is what you are looking for
-I Process a binary file as if it did not contain matching data; this is equivalent to the --binary-files=without-match option.
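Since the manual describes -I as equivalent to --binary-files=without-match, the long form can be spelled out, and grep's own -r option can do the directory walk without ‘find’. A minimal sketch, assuming GNU grep (the long option and -r may not exist in other implementations); the function name grep_text_only is made up for illustration.

```shell
# Hypothetical helper: recursive search that skips binary files.
# Usage: grep_text_only PATTERN [DIR]
grep_text_only() {
    # -r walks the directory tree, -l prints matching file names only,
    # --binary-files=without-match is the long form of -I.
    grep -rl --binary-files=without-match -- "$1" "${2:-.}"
}
```

For example, grep_text_only "needle text" mydocs.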