Back in a Bit

Pentametron

Pentametron, my Twitter poetry engine, is now online! An experiment in finding inadvertent art in the internet’s endless outpouring of language, Pentametron automatically collects Twitter posts that happen to be in iambic pentameter. It processes about five million tweets per day, and finds a few dozen iambic lines in that time.
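To make the idea concrete, here is a minimal sketch of how a line might be tested for iambic pentameter. It is not Pentametron's actual code: the tiny `TOY_STRESSES` lexicon, the "?" marker for stress-flexible monosyllables, and the function names are all assumptions made for illustration. A real system would need a full pronouncing dictionary.

```python
import re

# Hypothetical toy lexicon (an assumption, not Pentametron's data):
# one character per syllable, "0" = unstressed, "1" = stressed,
# "?" = a monosyllable that may take either stress.
TOY_STRESSES = {
    "shall": "?", "i": "?", "compare": "01", "thee": "?", "to": "?",
    "a": "?", "summer's": "10", "day": "?",
    "hello": "01", "world": "?",
}

# Five iambs: unstressed then stressed, repeated five times.
IAMBIC_PENTAMETER = "0101010101"


def stress_pattern(line, lexicon):
    """Return the line's syllable stress string, or None if a word is unknown."""
    words = re.findall(r"[a-z']+", line.lower())
    if not words:
        return None
    parts = []
    for word in words:
        if word not in lexicon:
            return None  # cannot scan a line containing unknown words
        parts.append(lexicon[word])
    return "".join(parts)


def is_iambic_pentameter(line, lexicon=TOY_STRESSES):
    """True if the line has ten syllables that fit the 0101010101 pattern."""
    pattern = stress_pattern(line, lexicon)
    if pattern is None or len(pattern) != len(IAMBIC_PENTAMETER):
        return False
    return all(p in ("?", t) for p, t in zip(pattern, IAMBIC_PENTAMETER))


print(is_iambic_pentameter("Shall I compare thee to a summer's day?"))
print(is_iambic_pentameter("hello world"))
```

Rejecting any line that contains an unknown word keeps the sketch simple, and it may be one reason so few lines out of millions would pass such a filter.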

Words we don’t say

In 1997, when I was first hired at New York magazine, Kurt Andersen, now a best-selling novelist and radio-show host, had just been fired as editor. Everybody was grieving about this, though not me, since I wouldn’t have had a job there otherwise. And though it wasn’t until years later that I even met Kurt, he unwittingly left me a gift: tacked to the bulletin board in the office I took over was a single page titled ‘Words We Don’t Say’. It contained, as you might surmise, words and phrases that Kurt found annoying and didn’t want used in his magazine. Just yesterday, I rescued it from a bunch of old office stuff that I was throwing out, and I have to say, 14 years later, it’s still a pretty useful list of phony-baloney vocabulary that editors are well-advised to excise from stories.

The origins of abc

We see it every day on signs, billboards, packaging, in books and magazines; in fact, you are looking at it now: the Latin or Roman alphabet, the world’s most prolific, most widespread abc. Typography is a relatively recent invention, but to unearth the origins of alphabets, we will need to travel much farther back in time, to an era contemporaneous with the emergence of (agricultural) civilisation itself.

Watch your language — most of you are wrong

Google is usually great for helping sort out uses of English, so you can check the difference between a pedaller and a peddler — though that doesn’t stop Guardian journalists getting it wrong, of course. But there are times when the majority of people get things wrong. In today’s Guardian, Patrick Barkham reports that “according to the Oxford English Corpus, a database of a billion words, dozens of traditional phrases are now more commonly misspelled than rendered correctly in written English.”

Porter Stemming Algorithm

The Porter stemming algorithm (or ‘Porter stemmer’) is a process for removing the commoner morphological and inflexional endings from words in English. Its main use is as part of a term normalisation process that is usually done when setting up Information Retrieval systems.
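As a small illustration of what removing endings looks like in practice, the sketch below covers only the first plural-stripping step of the algorithm (commonly labelled step 1a). It is not a complete stemmer, and the function name `porter_step_1a` is chosen here for illustration.

```python
def porter_step_1a(word):
    """Apply only the plural-ending rules; the full algorithm has more steps."""
    if word.endswith("sses"):
        return word[:-2]   # caresses -> caress
    if word.endswith("ies"):
        return word[:-2]   # ponies -> poni
    if word.endswith("ss"):
        return word        # caress -> caress (unchanged)
    if word.endswith("s"):
        return word[:-1]   # cats -> cat
    return word


for w in ["caresses", "ponies", "caress", "cats"]:
    print(w, "->", porter_step_1a(w))
```

The output "poni" shows that a stem need not be a dictionary word. For term normalisation it only matters that related forms map to the same string.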