The zesty sauce of After the Deadline is our language model. We use our language model to improve our spelling corrector, filter ill-fitting grammar checker suggestions, and even detect if you us...
https://blog.afterthedeadline.com/2010/07/20/after-the-deadline-bigram-corpus-our-gift-to-you/
I spent Sunday at the Computational Linguistics and Writing Workshop on Writing Processes and Authoring Aids held at NAACL-HLT 2010. There I presented After the Deadline. After the Deadline is a...
https://blog.afterthedeadline.com/2010/06/09/the-design-of-a-proofreading-software-service/
Before we begin: Did you notice my fancy and SEO friendly post title? Linguists refer to misused words as real word errors. When I write about real word errors in this post, I’m really referrin...
https://blog.afterthedeadline.com/2010/04/09/measuring-the-real-word-error-corrector/
I found an old screenshot today and thought I’d share it to give you an idea of (1) how bad my design eye is and (2) some history of After the Deadline. After the Deadline started life as a web...
https://blog.afterthedeadline.com/2010/03/19/humble-origins-polishmywriting-com/
One of the challenges with most natural language processing tasks is getting data and collapsing it into a usable model. Prepping a large data set is hard enough. Once you’ve prepped it, you ha...
https://blog.afterthedeadline.com/2010/03/04/all-about-language-models/
NGramJ is a Java library for language recognition. It uses language profiles (counts of character sequences) to guess what language some arbitrary text is. In this post I’ll briefly show you ho...
https://blog.afterthedeadline.com/2010/02/08/n-gram-language-guessing-with-ngramj/
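The idea behind profile-based language recognition can be sketched in a few lines. This is not NGramJ's actual API, just a toy illustration of the technique it uses: build a ranked profile of the most frequent character n-grams per language, then guess the language whose profile is closest (by summed rank differences) to the profile of the input text. The training strings here are stand-ins for real language corpora.

```python
# Toy character n-gram language guesser, illustrating the profile-based
# approach used by libraries like NGramJ. Not NGramJ's real API.
from collections import Counter

def profile(text, n=3, top=300):
    """Map each of the `top` most frequent character n-grams to its rank."""
    text = " " + text.lower() + " "
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    return {g: rank for rank, (g, _) in enumerate(grams.most_common(top))}

def distance(doc_profile, lang_profile, max_rank=300):
    """Sum of out-of-place rank differences; lower means more similar."""
    return sum(abs(rank - lang_profile.get(g, max_rank))
               for g, rank in doc_profile.items())

def guess(text, lang_profiles):
    """Return the language whose profile is closest to the text's profile."""
    doc = profile(text)
    return min(lang_profiles, key=lambda lang: distance(doc, lang_profiles[lang]))
```

In practice the profiles are trained on large corpora per language and stored, so classification only needs one pass over the input text plus a lookup per profile.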
Spell checking is a three-step process. Check if a word is in a dictionary, generate potential suggestions, and then sort the suggestions, hopefully with the intended word on top. Dr. Peter ...
https://blog.afterthedeadline.com/2010/01/29/how-i-trie-to-make-spelling-suggestions/
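The three steps above, dictionary lookup, candidate generation, and ranking, can be sketched minimally like this. The tiny word-frequency table is illustrative stand-in data, not a real dictionary, and ranking here is plain word frequency rather than AtD's contextual scoring.

```python
# Minimal sketch of the three-step spell check: (1) dictionary lookup,
# (2) generate candidates one edit away, (3) rank candidates by frequency.
# WORD_FREQ is toy data standing in for a real frequency dictionary.

WORD_FREQ = {"the": 500, "spell": 60, "spelling": 40, "spelled": 30,
             "checker": 25, "checked": 20}

def edits1(word):
    """All strings one delete, transpose, replace, or insert away."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in letters]
    inserts = [l + c + r for l, r in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)

def correct(word):
    if word in WORD_FREQ:                           # step 1: lookup
        return word
    candidates = edits1(word) & WORD_FREQ.keys()    # step 2: generate
    if not candidates:
        return word
    return max(candidates, key=WORD_FREQ.get)       # step 3: rank
```

Generating every one-edit variant is cheap for short words; real spell checkers prune this search with a trie or similar structure so only strings that can still reach a dictionary word are explored.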
Spell checkers have a bad rap because they give poor suggestions, don’t catch real word errors, and usually have out-of-date dictionaries. With After the Deadline I’ve made progress on these ...
https://blog.afterthedeadline.com/2009/12/09/thoughts-on-a-tiny-contextual-spell-checker/
AtD *thrives* on data and one of the best places for a variety of data is Wikipedia. This post describes how to generate a plain text corpus from a complete Wikipedia dump. This process is a modi...
https://blog.afterthedeadline.com/2009/12/04/generating-a-plain-text-corpus-from-wikipedia/
I’m often asked if AtD gets smarter the more it’s used. The answer is not yet. To stimulate the imagination and give an idea of what’s coming, this post presents some ideas about how AtD ca...
https://blog.afterthedeadline.com/2009/11/23/learning-from-your-mistakes-some-ideas/