<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Kavoir &#187; Regular Expression Tips &amp; Tutorials</title>
	<atom:link href="http://www.kavoir.com/category/programming/regular-expressions/feed" rel="self" type="application/rss+xml" />
	<link>http://www.kavoir.com</link>
	<description>Just another dumbass webmaster, goofing around...</description>
	<lastBuildDate>Thu, 09 Feb 2012 01:59:17 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>A small mistake in a regular expression caused connection to reset &#8211; (.+)+</title>
		<link>http://www.kavoir.com/2010/12/a-small-mistake-in-a-regular-expression-caused-connection-to-reset.html</link>
		<comments>http://www.kavoir.com/2010/12/a-small-mistake-in-a-regular-expression-caused-connection-to-reset.html#comments</comments>
		<pubDate>Sat, 04 Dec 2010 06:42:02 +0000</pubDate>
		<dc:creator>Yang Yang</dc:creator>
				<category><![CDATA[Regular Expression Tips & Tutorials]]></category>

		<guid isPermaLink="false">http://www.kavoir.com/2010/12/a-small-mistake-in-a-regular-expression-caused-connection-to-reset.html</guid>
		<description><![CDATA[Was doing something with a regular expression and very oddly the connection keeps being reset every time I refresh the web page. I tried to narrow down the problematic line by removing the code in functional chunks. Finally it comes down to a preg_match() instance with a small bit in the regular expression that&#8217;s accidentally [...]]]></description>
			<content:encoded><![CDATA[<p>I was working with a regular expression when, oddly, the connection was reset every time I refreshed the web page.</p>

<p>I narrowed down the problematic line by removing the code in functional chunks. It finally came down to a preg_match() call whose pattern contained a small fragment that I had typed in by mistake:</p>
<pre><code>(.+)+</code></pre>
<p>I got rid of the second plus sign:</p>
<pre><code>(.+)</code></pre>
<p>And everything worked again.</p>
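<p>My best guess at the cause, and it is only a guess since I never traced the crash itself, is catastrophic backtracking: with nested quantifiers such as (.+)+ the engine can split the same run of characters in exponentially many ways whenever the rest of the pattern cannot match. The sketch below is an illustration, not the code from my page: the anchors, the subject string and the lowered limit are all made up for the demo, and JIT is switched off only so that the limit is applied predictably.</p>
<pre><code>// Illustration only: pattern anchors, subject and limit are assumptions for this demo.
ini_set('pcre.jit', '0');                  // use the interpreter so the limit below applies predictably
ini_set('pcre.backtrack_limit', '100000'); // keep the demo fast

// 30 matchable characters followed by a line break that "." cannot cross,
// so the trailing $ can never succeed and every split of the run gets tried.
$subject = str_repeat('a', 30) . "\nb";

$nested = preg_match('/^(.+)+$/', $subject); // gives up with an error and returns false
$nestedError = preg_last_error();            // expected to be PREG_BACKTRACK_LIMIT_ERROR

$plain = preg_match('/^(.+)$/', $subject);   // fails after a handful of steps and returns 0
$plainError = preg_last_error();

var_dump($nested, $plain);</code></pre>
<p>On a server where the engine runs out of stack before it reaches the limit, the PHP process can die instead of returning false, which would look exactly like a connection reset.</p>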
<h3>Related Posts:</h3>
<ul class="similar-posts">
<li><a href="http://www.kavoir.com/2010/09/regular-expression-for-natural-numbers-or-positive-integers-1-2-3-11-12.html" rel="bookmark" title="September 29, 2010">Regular Expressions for Natural Numbers or Positive Integers (1, 2, 3, &hellip;), Negative Integers and Non-negative Integers</a></li>
<li><a href="http://www.kavoir.com/2009/12/php-regular-expression-matching-input-subject-string-length-limit.html" rel="bookmark" title="December 12, 2009">PHP: Subject String Length Limit of Regular Expression Matching Functions</a></li>
<li><a href="http://www.kavoir.com/2010/09/regular-expression-for-date-and-time-strings.html" rel="bookmark" title="September 29, 2010">Regular Expression for Date and Time Strings</a></li>
<li><a href="http://www.kavoir.com/2009/01/using-javascript-to-refresh-and-reload-an-iframe-on-main-page.html" rel="bookmark" title="January 1, 2009">Using JavaScript to refresh and reload an iframe on main page</a></li>
<li><a href="http://www.kavoir.com/2010/03/php-check-or-validate-url-and-email-addresses-an-easier-way-than-regular-expressions-the-filter_var-function.html" rel="bookmark" title="March 4, 2010">PHP: Check or Validate URL and Email Addresses &ndash; an Easier Way than Regular Expressions, the filter_var() Function</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.kavoir.com/2010/12/a-small-mistake-in-a-regular-expression-caused-connection-to-reset.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Regular Expression for Date and Time Strings</title>
		<link>http://www.kavoir.com/2010/09/regular-expression-for-date-and-time-strings.html</link>
		<comments>http://www.kavoir.com/2010/09/regular-expression-for-date-and-time-strings.html#comments</comments>
		<pubDate>Wed, 29 Sep 2010 08:40:19 +0000</pubDate>
		<dc:creator>Yang Yang</dc:creator>
				<category><![CDATA[PHP Tips & Tutorials]]></category>
		<category><![CDATA[Regular Expression Tips & Tutorials]]></category>

		<guid isPermaLink="false">http://www.kavoir.com/2010/09/regular-expression-for-date-and-time-strings.html</guid>
		<description><![CDATA[Often we need the users to enter a valid string of date or time in the form. But how do you validate the strings with regular expressions? In PHP, you can use these functions and regular expressions. RegExp and function to validate against date string: // Default: YYYY-MM-DD function isDate($subject, $separator = '-') { return [...]]]></description>
			<content:encoded><![CDATA[<p>We often need users to enter a valid date or time string in a form. But how do you validate such strings with regular expressions? In PHP, you can use the following functions.<span id="more-2048"></span></p>
<h4>Regular expression and function to validate a date string:</h4>
<pre><code>// Default: <strong>YYYY-MM-DD</strong>
// Checks the format only: impossible dates such as 2016-02-31 still pass.
function isDate($subject, $separator = '-') {
	// preg_quote() keeps separators such as '.' from being read as regex syntax
	$sep = preg_quote($separator, '@');
	return preg_match('@^\d{4}'.$sep.'(0[1-9]|1[0-2])'.$sep.'(0[1-9]|[12][0-9]|3[01])$@', $subject);
}</code></pre>
<h4>Regular expression and function to validate a time string:</h4>
<pre><code>// Default: <strong>HH:MM:SS</strong> (24-hour clock, 00:00:00 to 23:59:59)
function isTime($subject, $separator = ':') {
	$sep = preg_quote($separator, '@');
	return preg_match('@^([01][0-9]|2[0-3])'.$sep.'([0-5][0-9])'.$sep.'([0-5][0-9])$@', $subject);
}</code></pre>
<p>If you need a different delimiter, such as <strong>2016/04/30</strong>, just pass a different $separator.</p>
<p>Now that you have the functions to validate date and time, you can combine them to verify date time strings such as <strong>2016-04-30 18:19:05</strong>:</p>
<pre><code>function isDateTime($subject) {
	$subject_array = explode(' ', $subject);
	if (count($subject_array) == 2) {
		// Also accept the all-zero placeholder value
		return (isDate($subject_array[0]) &amp;&amp; isTime($subject_array[1])) || $subject === '0000-00-00 00:00:00';
	}
	return false;
}</code></pre>
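<p>The patterns above only check the shape of the string, so a date like 2016-02-31 slips through. One way to close that gap, sketched here as a suggestion rather than what Form Kid actually runs, is to hand the captured parts to PHP&#8217;s built-in checkdate(). The function name isRealDate() is made up for this example.</p>
<pre><code>// Hypothetical helper, not one of the original functions.
function isRealDate($subject) {
	// Same YYYY-MM-DD shape as isDate(), but capturing year, month and day
	if (preg_match('@^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$@', $subject, $parts) !== 1) {
		return false;
	}
	// checkdate(month, day, year) knows month lengths and leap years
	return checkdate((int) $parts[2], (int) $parts[3], (int) $parts[1]);
}

var_dump(isRealDate('2016-02-29')); // bool(true), 2016 is a leap year
var_dump(isRealDate('2016-02-31')); // bool(false)</code></pre>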
<p>These are the functions I use at <a href="http://www.formkid.com/">Form Kid</a> for fields that need date and time validation.</p>
<h3>Related Posts:</h3>
<ul class="similar-posts">
<li><a href="http://www.kavoir.com/2010/09/regular-expression-for-natural-numbers-or-positive-integers-1-2-3-11-12.html" rel="bookmark" title="September 29, 2010">Regular Expressions for Natural Numbers or Positive Integers (1, 2, 3, &hellip;), Negative Integers and Non-negative Integers</a></li>
<li><a href="http://www.kavoir.com/2009/12/php-regular-expression-matching-input-subject-string-length-limit.html" rel="bookmark" title="December 12, 2009">PHP: Subject String Length Limit of Regular Expression Matching Functions</a></li>
<li><a href="http://www.kavoir.com/2010/03/php-check-or-validate-url-and-email-addresses-an-easier-way-than-regular-expressions-the-filter_var-function.html" rel="bookmark" title="March 4, 2010">PHP: Check or Validate URL and Email Addresses &ndash; an Easier Way than Regular Expressions, the filter_var() Function</a></li>
<li><a href="http://www.kavoir.com/2009/04/php-getting-the-current-date-and-time.html" rel="bookmark" title="April 22, 2009">PHP: Getting the Current Date and Time</a></li>
<li><a href="http://www.kavoir.com/2009/07/instantly-boost-sql-query-efficiency-of-regexp-or-rlike-by-2000.html" rel="bookmark" title="July 23, 2009">Instantly Boost SQL Query Efficiency of REGEXP or RLIKE by 2000%</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.kavoir.com/2010/09/regular-expression-for-date-and-time-strings.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Regular Expressions for Natural Numbers or Positive Integers (1, 2, 3, &#8230;), Negative Integers and Non-negative Integers</title>
		<link>http://www.kavoir.com/2010/09/regular-expression-for-natural-numbers-or-positive-integers-1-2-3-11-12.html</link>
		<comments>http://www.kavoir.com/2010/09/regular-expression-for-natural-numbers-or-positive-integers-1-2-3-11-12.html#comments</comments>
		<pubDate>Wed, 29 Sep 2010 07:18:30 +0000</pubDate>
		<dc:creator>Yang Yang</dc:creator>
				<category><![CDATA[PHP Tips & Tutorials]]></category>
		<category><![CDATA[Regular Expression Tips & Tutorials]]></category>

		<guid isPermaLink="false">http://www.kavoir.com/2010/09/regular-expression-for-natural-numbers-or-positive-integers-1-2-3-11-12.html</guid>
		<description><![CDATA[When I&#8217;m developing the online form creator that enables the users to create form fields that accept only certain type of numbers, I need to verify if a given string is a valid natural number such as 1, 2, 3, 4, …. I’m writing the code / functions in PHP but you can literally use [...]]]></description>
			<content:encoded><![CDATA[<p>While developing the <a href="http://www.formkid.com/">online form creator</a>, which lets users create form fields that accept only certain types of numbers, I needed to verify whether a given string is a valid natural number such as 1, 2, 3, 4, …. The functions are written in PHP, but the regular expressions themselves work in most other programming languages as well. I use the following function to check whether a string is a natural number (positive integer).</p>

<pre><code>function isNaturalNumber($subject) {
	return preg_match('|<strong>^[1-9][0-9]*$</strong>|', $subject);
}</code></pre>
<p>You can allow an optional leading plus sign as well. The plus has to be escaped, because a bare + is a quantifier:</p>
<pre><code>^\+?[1-9][0-9]*$</code></pre>
<h3>Regular Expression for Negative Integers?</h3>
<p>Negative integers are –1, –2, –3, …. Just add a minus sign before the regular expression for positive integers:</p>
<pre><code>^-[1-9][0-9]*$</code></pre>
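<p>Wrapped in a function the same way as isNaturalNumber(), this could look as follows. The name isNegativeInteger() is chosen to match the call made in isInteger() at the end of this post, which has no definition anywhere else here, so treat it as a sketch.</p>
<pre><code>// Sketch of the helper that isInteger() relies on.
function isNegativeInteger($subject) {
	// preg_match() returns 1 on a match, 0 on no match and false on error
	return preg_match('|^-[1-9][0-9]*$|', $subject) === 1;
}

var_dump(isNegativeInteger('-12')); // bool(true)
var_dump(isNegativeInteger('-0'));  // bool(false)
var_dump(isNegativeInteger('12'));  // bool(false)</code></pre>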
<h3>Regular Expression for Non-negative Integers?</h3>
<p>That is, 0, 1, 2, 3, 4, …. With a little help from the isNaturalNumber() function, you can check whether a string is a legal non-negative integer:</p>
<pre><code>function isNonNegativeInteger($subject) {
	// Equivalent to the pattern @^(0|[1-9][0-9]*)$@
	// Strict comparison: with ==, numeric strings such as '00' or '0.0' would also equal '0'
	return $subject === '0' || isNaturalNumber($subject) === 1;
}</code></pre>
<p>Or if you insist on using a regular expression:</p>
<pre><code>function isNonNegativeInteger($subject) {
	return preg_match('@<strong>^(0|[1-9][0-9]*)$</strong>@', $subject);
}</code></pre>
<h3>PHP functions to check if a string is a valid integer?</h3>
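<p>Just use the above functions in combination. Note that the native <strong>is_integer</strong>() function of PHP (an alias of is_int()) checks the type of a variable, not the content of a string, so it returns false for the string '42'.</p>
<pre><code>// isNegativeInteger() is assumed to wrap ^-[1-9][0-9]*$ the way isNaturalNumber() wraps its pattern
function isInteger($subject) {
	return isNegativeInteger($subject) || isNonNegativeInteger($subject);
}</code></pre>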
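<p>If you would rather lean on a built-in, filter_var() with FILTER_VALIDATE_INT is the closest native check for strings. This is a sketch of an alternative, not a drop-in replacement for the patterns above: it returns the integer on success and false on failure, and its rules differ slightly, for example it rejects values outside PHP&#8217;s integer range. The helper name isIntegerString() is made up.</p>
<pre><code>// Hypothetical helper built on PHP's filter extension.
function isIntegerString($subject) {
	// FILTER_VALIDATE_INT yields the int on success and false on failure,
	// so compare against false explicitly: a valid '0' becomes int(0).
	return filter_var($subject, FILTER_VALIDATE_INT) !== false;
}

var_dump(isIntegerString('42'));  // bool(true)
var_dump(isIntegerString('-7'));  // bool(true)
var_dump(isIntegerString('0'));   // bool(true)
var_dump(isIntegerString('1.5')); // bool(false)
var_dump(isIntegerString('abc')); // bool(false)</code></pre>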
<h3>Related Posts:</h3>
<ul class="similar-posts">
<li><a href="http://www.kavoir.com/2010/09/regular-expression-for-date-and-time-strings.html" rel="bookmark" title="September 29, 2010">Regular Expression for Date and Time Strings</a></li>
<li><a href="http://www.kavoir.com/2009/12/php-regular-expression-matching-input-subject-string-length-limit.html" rel="bookmark" title="December 12, 2009">PHP: Subject String Length Limit of Regular Expression Matching Functions</a></li>
<li><a href="http://www.kavoir.com/2010/12/a-small-mistake-in-a-regular-expression-caused-connection-to-reset.html" rel="bookmark" title="December 4, 2010">A small mistake in a regular expression caused connection to reset &ndash; (.+)+</a></li>
<li><a href="http://www.kavoir.com/2009/04/php-number-format-function-to-format-numbers-in-php-integer-or-float.html" rel="bookmark" title="April 23, 2009">PHP: Number Format Function to Format Numbers in PHP (Integer or Float)</a></li>
<li><a href="http://www.kavoir.com/2010/09/php-true-empty-function-to-check-if-a-string-is-empty-or-zero-in-length.html" rel="bookmark" title="September 29, 2010">PHP: True empty() function to check if a string is empty or zero in length</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.kavoir.com/2010/09/regular-expression-for-natural-numbers-or-positive-integers-1-2-3-11-12.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PHP: Check or Validate URL and Email Addresses &#8211; an Easier Way than Regular Expressions, the filter_var() Function</title>
		<link>http://www.kavoir.com/2010/03/php-check-or-validate-url-and-email-addresses-an-easier-way-than-regular-expressions-the-filter_var-function.html</link>
		<comments>http://www.kavoir.com/2010/03/php-check-or-validate-url-and-email-addresses-an-easier-way-than-regular-expressions-the-filter_var-function.html#comments</comments>
		<pubDate>Thu, 04 Mar 2010 02:55:59 +0000</pubDate>
		<dc:creator>Yang Yang</dc:creator>
				<category><![CDATA[Information Security]]></category>
		<category><![CDATA[PHP Tips & Tutorials]]></category>
		<category><![CDATA[Regular Expression Tips & Tutorials]]></category>

		<guid isPermaLink="false">http://www.kavoir.com/2010/03/php-check-or-validate-url-and-email-addresses-an-easier-way-than-regular-expressions-the-filter_var-function.html</guid>
		<description><![CDATA[To check if a URL or an email address is valid, the common solution is regular expressions. For instance, to validate an email address in PHP, I would use: if (preg_match('&#124;^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$&#124;i', $email)) { // $email is valid } A simpler and more forgiving one would be: &#124;^\S+@\S+\.\S+$&#124; Which is usually quite enough for signup forms [...]]]></description>
			<content:encoded><![CDATA[<p>To check whether a URL or an email address is valid, the common solution is a regular expression. For instance, to validate an email address in PHP, I would use:</p>

<pre><code>if (preg_match('<strong>|^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$|i</strong>', $email)) {
	// $email is valid
}</code></pre>
<p>The {2,4} at the end rejects longer top-level domains such as .museum, so a simpler and more forgiving one would be:</p>
<pre><code>|^\S+@\S+\.\S+$|</code></pre>
<p>This is usually enough for signup forms to catch obvious typos. The final word on whether an address is valid comes from the confirmation link you send to it anyway. For the obsessively curious, <a href="http://ex-parrot.com/~pdw/Mail-RFC822-Address.html">this</a> may serve you well.</p>
<p>For URL, you can use this one:</p>
<pre><code>|^\S+://\S+\.\S+.+$|</code></pre>
<p>Or you can use one that is insanely detailed in addressing <a href="http://stackoverflow.com/questions/161738/what-is-the-best-regular-expression-to-check-if-a-string-is-a-valid-url">what a valid URL should be</a>.</p>
<h2>The filter_var() function of PHP5</h2>
<p>The real subject of this post, though, is the <a href="http://us2.php.net/filter_var">filter_var</a>() function of PHP5 (available since PHP 5.2), which simplifies URL and email validation considerably. To validate an email:</p>
<pre><code>if (<strong>filter_var</strong>($email, <strong>FILTER_VALIDATE_EMAIL</strong>) !== false) {
	// $email contains a valid email
}</code></pre>
<p>To validate a URL:</p>
<pre><code>if (<strong>filter_var</strong>($url, <strong>FILTER_VALIDATE_URL</strong>) !== false) {
	// $url contains a valid URL
}</code></pre>
<p>filter_var() returns the filtered data on success and false on failure. With the validation filters FILTER_VALIDATE_EMAIL and FILTER_VALIDATE_URL it does not pick a valid address out of a longer text: the whole input has to be valid. So it is cleaner to store the result first, then use it if it is good or discard it if it is false:</p>
<pre><code><strong>$filtered_email</strong> = filter_var($email, FILTER_VALIDATE_EMAIL);
if (<strong>$filtered_email</strong> !== false) {
	// $filtered_email holds the validated address
} else {
	// $email is not a valid address
}</code></pre>
<p>The same applies to FILTER_VALIDATE_URL. Here’s a full list of <a href="http://us2.php.net/manual/en/filter.filters.php">filter types</a> of filter_var() you can take advantage of.</p>
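<p>One caveat worth a sketch: FILTER_VALIDATE_URL checks the syntax of a URL, not its protocol, so addresses with schemes such as ftp: or mailto: pass too. If a form should only take web addresses, the scheme can be checked separately. isHttpUrl() below is a made-up helper name and only one way of doing it.</p>
<pre><code>// Hypothetical helper: accept only http and https URLs.
function isHttpUrl($url) {
	if (filter_var($url, FILTER_VALIDATE_URL) === false) {
		return false;
	}
	// The filter is happy with other schemes as well, so look at the scheme explicitly
	$scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
	return in_array($scheme, array('http', 'https'), true);
}

var_dump(isHttpUrl('http://www.kavoir.com/')); // bool(true)
var_dump(isHttpUrl('ftp://example.com/file')); // bool(false)
var_dump(isHttpUrl('www.kavoir.com'));         // bool(false), no scheme</code></pre>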
<h3>Related Posts:</h3>
<ul class="similar-posts">
<li><a href="http://www.kavoir.com/2010/09/regular-expression-for-date-and-time-strings.html" rel="bookmark" title="September 29, 2010">Regular Expression for Date and Time Strings</a></li>
<li><a href="http://www.kavoir.com/2010/03/php-how-to-detect-get-the-real-client-ip-address-of-website-visitors.html" rel="bookmark" title="March 4, 2010">PHP: How to detect / get the real client IP address of website visitors?</a></li>
<li><a href="http://www.kavoir.com/2009/04/php-check-if-a-string-contains-another-string-or-substring.html" rel="bookmark" title="April 23, 2009">PHP: Check if a string contains another string or substring</a></li>
<li><a href="http://www.kavoir.com/2010/02/php-allow-specific-html-tags-in-text-input-controls-of-html-forms-textarea-input-typetext.html" rel="bookmark" title="February 15, 2010">PHP: Allow Specific HTML Tags in Text Input Controls of HTML Forms, &lt;textarea&gt;, &lt;input type=&rdquo;text&rdquo; /&gt;</a></li>
<li><a href="http://www.kavoir.com/2010/08/how-to-get-all-the-sub-directories-of-a-given-directory-in-php.html" rel="bookmark" title="August 1, 2010">How to get all the sub-directories of a given directory in PHP?</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.kavoir.com/2010/03/php-check-or-validate-url-and-email-addresses-an-easier-way-than-regular-expressions-the-filter_var-function.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>PHP: Subject String Length Limit of Regular Expression Matching Functions</title>
		<link>http://www.kavoir.com/2009/12/php-regular-expression-matching-input-subject-string-length-limit.html</link>
		<comments>http://www.kavoir.com/2009/12/php-regular-expression-matching-input-subject-string-length-limit.html#comments</comments>
		<pubDate>Sat, 12 Dec 2009 02:35:45 +0000</pubDate>
		<dc:creator>Yang Yang</dc:creator>
				<category><![CDATA[PHP Tips & Tutorials]]></category>
		<category><![CDATA[Regular Expression Tips & Tutorials]]></category>

		<guid isPermaLink="false">http://www.kavoir.com/?p=1542</guid>
		<description><![CDATA[Here&#8217;s a quick tip for those who have encountered this very same problem that all regular expression functions of PHP such as preg_match() and preg_replace() stop working when the input string (subject string to be searched or matched) is too long or large. If you believe your regular expressions should work but didn&#8217;t and the [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a quick tip for those who have found that the <strong>regular expression functions</strong> of PHP, such as preg_match() and preg_replace(), stop working when the input string (the subject string to be searched or matched) is very long. If you believe your regular expression should work but it doesn&#8217;t, and the subject is perhaps over 100kB in length, you have probably hit <a href="http://www.php.net/manual/en/pcre.configuration.php#ini.pcre.backtrack-limit">PCRE&#8217;s backtracking limit</a>, set by the configuration directive <strong>pcre.backtrack_limit</strong>. It is not a limit on string length as such: it caps the number of backtracking steps, which tends to grow with the length of the subject.</p>

<p>To lift the limit, to perhaps 10 times the original, override the default value of <a href="http://www.php.net/manual/en/pcre.configuration.php#ini.pcre.backtrack-limit">pcre.backtrack_limit</a> in one of the following ways:</p>
<ol>
<li>If you are using cPanel, create a text file named <strong>php.ini</strong> and put it in the directory where you need to raise the limit. Add this line to the file:<br />
<code>pcre.backtrack_limit = 1000000</code></li>
<li>If you operate your own dedicated server or VPS, modify php.ini and put this line at the end of the file:<br />
<code>pcre.backtrack_limit = 1000000</code><br />
Refer to this article to <a href="http://www.kavoir.com/2009/06/where-is-phpini-located.html">find out where your php.ini is</a>.</li>
<li>Use the runtime configuration function <a href="http://php.net/ini_set">ini_set</a>() to set it from within the script:<br />
<code>ini_set('pcre.backtrack_limit', '1000000');</code></li>
</ol>
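<p>A limit that has been hit is easy to mistake for a plain &#8220;no match&#8221;, because preg_match() then returns false, which behaves like 0 in an if. A small wrapper can make the difference visible. This is only a sketch: the function name and the choice to throw an exception are mine, not something PHP prescribes.</p>
<pre><code>// Hypothetical wrapper: the name and the exception are choices made for this example.
function pregMatchOrFail($pattern, $subject) {
	$result = preg_match($pattern, $subject, $matches);
	if ($result === false) {
		// preg_last_error() tells why, e.g. PREG_BACKTRACK_LIMIT_ERROR
		throw new RuntimeException('preg_match() failed with error code ' . preg_last_error());
	}
	// 1 means a match, 0 means a genuine "no match"
	return $result === 1 ? $matches : null;
}

var_dump(pregMatchOrFail('/b+/', 'aabbb')); // array with 'bbb' at index 0
var_dump(pregMatchOrFail('/z/', 'aabbb'));  // NULL</code></pre>
<p>If the exception reports the backtrack limit, raising pcre.backtrack_limit as described above is the fix to try.</p>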
<p>This seems to affect only PHP 5.2. Newer releases (PHP 5.3.7 and later) already default to 1000000.</p>
<h3>Related Posts:</h3>
<ul class="similar-posts">
<li><a href="http://www.kavoir.com/2010/09/regular-expression-for-date-and-time-strings.html" rel="bookmark" title="September 29, 2010">Regular Expression for Date and Time Strings</a></li>
<li><a href="http://www.kavoir.com/2010/12/a-small-mistake-in-a-regular-expression-caused-connection-to-reset.html" rel="bookmark" title="December 4, 2010">A small mistake in a regular expression caused connection to reset &ndash; (.+)+</a></li>
<li><a href="http://www.kavoir.com/2010/09/regular-expression-for-natural-numbers-or-positive-integers-1-2-3-11-12.html" rel="bookmark" title="September 29, 2010">Regular Expressions for Natural Numbers or Positive Integers (1, 2, 3, &hellip;), Negative Integers and Non-negative Integers</a></li>
<li><a href="http://www.kavoir.com/2009/05/how-to-change-vim-syntax-highlighting-colors.html" rel="bookmark" title="May 26, 2009">How to change Vim syntax highlighting colors?</a></li>
<li><a href="http://www.kavoir.com/2010/07/how-to-bring-down-optimize-memory-usage-in-your-unmanaged-linux-vps-box-and-avoid-oom-out-of-memory-errors.html" rel="bookmark" title="July 1, 2010">How to bring down / optimize memory usage in your unmanaged Linux VPS box and avoid OOM (Out Of Memory) errors?</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.kavoir.com/2009/12/php-regular-expression-matching-input-subject-string-length-limit.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>PHP: Generating Summary Abstract from A Text or HTML String, Limiting by Words or Sentences</title>
		<link>http://www.kavoir.com/2009/02/php-generating-summary-abstract-from-a-text-or-html-string-limiting-by-words-or-sentences.html</link>
		<comments>http://www.kavoir.com/2009/02/php-generating-summary-abstract-from-a-text-or-html-string-limiting-by-words-or-sentences.html#comments</comments>
		<pubDate>Sat, 28 Feb 2009 06:53:38 +0000</pubDate>
		<dc:creator>Yang Yang</dc:creator>
				<category><![CDATA[PHP Tips & Tutorials]]></category>
		<category><![CDATA[Regular Expression Tips & Tutorials]]></category>
		<category><![CDATA[WordPress How To]]></category>

		<guid isPermaLink="false">http://www.kavoir.com/2009/02/php-generating-summary-abstract-from-a-text-or-html-string-limiting-by-words-or-sentences.html</guid>
		<description><![CDATA[On index or transitional pages, such as homepage or category pages of WordPress, you don’t want to show the full texts of your deep content pages yet but just a content snippet of the first few sentences or words as a summary with a read more link to the actual article. This is generally good [...]]]></description>
			<content:encoded><![CDATA[<p>On index or transitional pages, such as the homepage or category pages of WordPress, you don’t want to show the full text of your deep content pages, just a snippet of the first few sentences or words as a summary, with a &#8220;read more&#8221; link to the actual article.</p>

<p>This is generally good in terms of SEO as it reduces <strong>duplicate content</strong> on your site and increases <strong>page views</strong>. With WordPress you can achieve this by using a plugin named <a href="http://www.thunderguy.com/semicolon/wordpress/evermore-wordpress-plugin/">Evermore</a>. With a home-made CMS, however, you will have to write a little code of your own to select and display content abstracts.</p>
<p>While you may be better off doing this in plain SQL, which I’m no expert in, here is a little PHP trick that accomplishes the same task.</p>
<h5>Full HTML Text</h5>
<p><code>$text = &lt;&lt;&lt;TEXT<br />I wrote a <strong>&lt;a href=&quot;#&quot;&gt;</strong>blog post<strong>&lt;/a&gt;</strong> yesterday about Chinese web design fonts. What did you think? It appeared that many are very interested. I guess it's the language barriers and cultural differences that make the westerners eager to know more about us. All right then, let me write more about that and maybe start a <strong>&lt;strong&gt;</strong>brand new domain<strong>&lt;/strong&gt;</strong> for it. Stay tuned!<br />TEXT;</code></p>
<h5>The Problem – select first sentences</h5>
<p>Select and display the <strong>first 3 sentences</strong> (max) of the full HTML text above.</p>
<h5>The Solution</h5>
<pre><code>&lt;?php
preg_match('/^([^.!?]*[\.!?]+){0,<strong>3</strong>}/', <strong>strip_tags</strong>($text), $abstract);
echo $abstract[0];
?&gt;</code></pre>
<p>Output:</p>
<p><code>I wrote a blog post yesterday about Chinese web design fonts. What did you think? It appeared that many are very interested.</code> </p>
<p>HTML tags are stripped from the summary because cutting the text could otherwise slice an element in half, leaving part of a tag or an opening tag without its closing tag, and so produce invalid HTML. You can preserve tags in the abstract, but that takes a more sophisticated algorithm.</p>
<h5>Another Problem – select first words</h5>
<p>You want an abstract of the <strong>first 30 words</strong> instead of whole sentences ended by punctuation such as ‘.’, ‘!’ and ‘?’.</p>
<h5>The Solution</h5>
<p>Simply modify the regular expression to:</p>
<p><code>/^([^.!?<strong>\s</strong>]*[\.!?<strong>\s</strong>]+){0,<strong>30</strong>}/</code> </p>
<p>Output:</p>
<p><code>I wrote a blog post yesterday about Chinese web design fonts. What did you think? It appeared that many are very interested. I guess it's the language barriers and cultural</code> </p>
<p>The last sentence is incomplete, so you may want to append &#8216;&#8230;&#8217; to show that the text has been cut.</p>
<p>In regular expressions, \s matches any whitespace character, including the <strong>single-byte space</strong>, <strong>tab</strong> and <strong>new line</strong>.</p>
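<p>Put together, the word-based variant might be wrapped in a function like the one below. It is a sketch under a few assumptions of mine: the name wordAbstract() is made up, the pattern gets an extra |$ alternative so that a final word without trailing punctuation still counts, and the ellipsis is added only when something was actually cut.</p>
<pre><code>// Hypothetical helper; the "|$" alternative is an addition to the pattern above.
function wordAbstract($html, $maxWords = 30) {
	$text = trim(strip_tags($html));
	preg_match('/^(?:[^.!?\s]*(?:[.!?\s]+|$)){0,' . (int) $maxWords . '}/', $text, $match);
	if (strlen($match[0]) === strlen($text)) {
		return $text; // nothing was cut off, so no ellipsis
	}
	// Drop trailing whitespace and punctuation before adding the ellipsis
	return rtrim($match[0], " \t\r\n.!?") . '...';
}

echo wordAbstract('alpha beta gamma delta', 2); // alpha beta...</code></pre>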
<h3>Related Posts:</h3>
<ul class="similar-posts">
<li><a href="http://www.kavoir.com/2010/02/php-allow-specific-html-tags-in-text-input-controls-of-html-forms-textarea-input-typetext.html" rel="bookmark" title="February 15, 2010">PHP: Allow Specific HTML Tags in Text Input Controls of HTML Forms, &lt;textarea&gt;, &lt;input type=&rdquo;text&rdquo; /&gt;</a></li>
<li><a href="http://www.kavoir.com/2009/03/css-align-right-make-text-or-image-aligned-right-in-html-page.html" rel="bookmark" title="March 2, 2009">CSS: Align Right – Make text or image aligned right in HTML page</a></li>
<li><a href="http://www.kavoir.com/2007/03/spin-your-first-web-page.html" rel="bookmark" title="March 29, 2007">Create your first web page &#8211; Learn XHTML and Make Web pages</a></li>
<li><a href="http://www.kavoir.com/2009/08/html-tags-design-for-template-theme-creation.html" rel="bookmark" title="August 27, 2009">HTML Tags Design for Template / Theme Creation</a></li>
<li><a href="http://www.kavoir.com/2009/08/how-to-display-html-code-on-a-web-page.html" rel="bookmark" title="August 4, 2009">How to display HTML code on a web page?</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.kavoir.com/2009/02/php-generating-summary-abstract-from-a-text-or-html-string-limiting-by-words-or-sentences.html/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

