Many developers spend time writing code designed to extract a very specific number of words or characters from a piece of text. This text sample, often drawn from a blog post or comment, is usually displayed with a link that leads the user to read more.
It’s possible to use JavaScript, CSS and server-side languages to create text extracts. For the purposes of this article, I’ll use PHP.
It doesn’t particularly matter how you obtain your piece of text. For this example, I’ll use the opening lines of Wikipedia’s entry on Richard III as the sample:
$string = "<p><strong>Richard III</strong> (2 October 1452 – 22 August 1485) was King of England for two years, from 1483 until his death in 1485 in the Battle of Bosworth Field. He was the last king of the House of York and the last of the Plantagenet dynasty. His defeat at Bosworth Field, the decisive battle of the Wars of the Roses, is sometimes regarded as the end of the Middle Ages in England. He is the subject of the play <cite>Richard III</cite> by <a href=//en.wikipedia.org/wiki/William_Shakespeare>William Shakespeare.</a>";
The first thing I’ll do to create a text extract is remove the HTML markup from the text contained in the variable $string:
$string = strip_tags($string);
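Note that strip_tags() removes tags but leaves HTML entities in place, so an entity such as &amp;amp; would count as five characters toward the limit. A minimal sketch of decoding them first, using made-up input rather than the article's sample and assuming the text is UTF-8:

```php
<?php
// Illustrative input, not taken from the article.
$html = "<p>Fish &amp; chips</p>";

// strip_tags() removes the markup but leaves the entity untouched.
$text = strip_tags($html);
echo $text, "\n"; // Fish &amp; chips

// Decode entities so that "&amp;" counts as one character, not five.
$text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
echo $text, "\n"; // Fish & chips
```

If the extract is later printed back into an HTML page, the decoded text would need to be escaped again, for example with htmlspecialchars().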
With the sample now pure text, I’ll trim it down to a set number of characters:
$string = substr($string, 0, 200);
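One caveat: substr() counts bytes, not characters, and the en dash in the sample occupies three bytes in UTF-8, so a byte-based cut can land in the middle of a character. The following is a small sketch of a character-based cut, assuming the source file and text are UTF-8 and that PHP's mbstring extension is installed; the $sample and $cut variables are illustrative only:

```php
<?php
// Assumes UTF-8 text and the mbstring extension.
$sample = "1452 – 1485";

// strlen() counts bytes; mb_strlen() counts characters.
echo strlen($sample), "\n";              // 13
echo mb_strlen($sample, 'UTF-8'), "\n";  // 11

// A cut at 6 bytes would land inside the en dash;
// mb_substr() cuts at 6 characters and keeps the dash whole.
$cut = mb_substr($sample, 0, 6, 'UTF-8');
echo $cut, "\n"; // 1452 –
```

For the Richard III sample the difference happens not to matter, because the later cut at the last space discards the tail of the string anyway.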
Next, I’ll ensure that the sample does not end with a comma, exclamation mark, or other punctuation:
$string = rtrim($string, "!,.-");
Finally, I’ll cut the text back to its last space, so that the sample doesn’t end with a cut-off word, before appending an ellipsis when the extract is printed on the screen:
$string = substr($string, 0, strrpos($string, ' '));
echo $string."… ";
The result will look something like this:
Richard III (2 October 1452 – 22 August 1485) was King of England for two years, from 1483 until his death in 1485 in the Battle of Bosworth Field. He was the last king of the House of York and the…
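The steps above can be gathered into one reusable function. This is only a sketch: the name text_extract() and its parameters are hypothetical, and it departs from the sequence shown in two ways, by returning short text unchanged and by trimming punctuation after the word cut rather than before, so the extract cannot end in a stray comma.

```php
<?php
// Hypothetical helper; the name and parameters are illustrative.
function text_extract(string $html, int $limit = 200): string
{
    // Remove markup so that tags never count toward the limit.
    $text = strip_tags($html);

    // Text that already fits needs no trimming and no ellipsis.
    if (strlen($text) <= $limit) {
        return $text;
    }

    // Cut to the byte limit, then back up to the last space
    // so that no word is split.
    $text = substr($text, 0, $limit);
    $lastSpace = strrpos($text, ' ');
    if ($lastSpace !== false) {
        $text = substr($text, 0, $lastSpace);
    }

    // Strip trailing punctuation and spaces last, then add the ellipsis.
    return rtrim($text, " !,.-") . "…";
}

echo text_extract("<p>The <em>quick</em> brown fox jumps over the lazy dog.</p>", 20);
// The quick brown fox…
```

Passing the limit as a parameter lets the same function produce extracts of different lengths, for example a short one for a sidebar and a longer one for an index page.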
While this method is sufficient for most purposes, the extract can be enhanced with other techniques, which I will demonstrate in future articles.
Enjoy this piece? I invite you to follow me at twitter.com/dudleystorey to learn more.