(This is a post I've had unpublished since writing it in 2016. Just hitting publish without reviewing right now because it's something I find myself periodically looking at the charts for). As w...
http://sappingattention.blogspot.com/2019/03/whats-in-hathi-trust.html
I periodically write about Google Books here, so I thought I'd point out something that I've noticed recently that should be concerning to anyone accustomed to treating it as the largest collecti...
http://sappingattention.blogspot.com/2019/02/how-badly-is-google-books-search-broken.html
I did a slightly deeper dive into data about the salaries by college majors while working on my new Atlantic article on the humanities crisis. As I say there, the quality of data about salaries ...
http://sappingattention.blogspot.com/2018/08/some-preliminary-analysis-of-texas.html
NOTE 8/23: I've written a more thoughtful version of this argument for the Atlantic. They're not the same, but if you only read one piece, you should read that one. Back in 2013, I wrote a fe...
http://sappingattention.blogspot.com/2018/07/mea-culpa-there-is-crisis-in-humanities.html
Historians generally acknowledge that both undergraduate and graduate methods training need to teach students how to navigate and understand online searches. See, for example, this recent article...
http://sappingattention.blogspot.com/2018/07/google-books-and-open-web.html
Matthew Lincoln recently put up a Twitter bot that walks through chains of historical artwork by vector space similarity. https://twitter.com/matthewdlincoln/status/1003690836150792192. The ide...
http://sappingattention.blogspot.com/2018/06/meaning-chains-with-word-embeddings.html
This is a blog post I've had sitting around in some form for a few years; I wanted to post it today because: 1) It's about peer review, and it's peer review week! I just read this nice piece b...
http://sappingattention.blogspot.com/2017/09/peer-review-is-younger-than-you-think.html
Digging through old census data, I realized that Wikipedia has some really amazing town-level historical population data, particularly for the Northeast, thanks to one editor in particular typin...
http://sappingattention.blogspot.com/2017/07/population-density-2-old-and-new-new.html
I've been doing a lot of reading about population density cartography recently. With election-map cartography remaining a major issue, there's been lots of discussion of them: and the "Joy Plot"...
http://sappingattention.blogspot.com/2017/07/population-density-1-do-cities-have.html
Robert Leonard has an op-ed in the Times today that includes the following anecdote: > Out here some conservatives aren’t even calling them “public” schools anymore. They c...
http://sappingattention.blogspot.com/2017/07/what-is-described-as-belonging-to.html
The Library of Congress has released MARC records that I'll be doing more with over the next several months to understand the books and their classifications. As a first stab, though, I wanted to...
http://sappingattention.blogspot.com/2017/05/a-brief-visual-history-of-marc.html
One of the interesting things about contemporary data visualization is that the field has a deep sense of its own history, but that "professional" historians haven't paid a great deal of attentio...
http://sappingattention.blogspot.com/2017/04/the-history-of-looking-at-data.html
I want to post a quick methodological note on diachronic (and other forms of comparative) word2vec models. This is a really interesting field right now. Hamilton et al. have a nice paper that sh...
http://sappingattention.blogspot.com/2016/12/some-notes-on-corpora-for-diachronic.html
This is a quick digital-humanities public service post with a few sketchy questions about OCR as performed by Google. When I started working intentionally with computational texts in 2010 or so...
http://sappingattention.blogspot.com/2016/12/ocr-failures-in-2016.html
Like everyone else, I've been churning over the election results all month. Setting aside the important stuff, understanding election results temporally presents an interesting challenge for visu...
http://sappingattention.blogspot.com/2016/12/a-192-year-heatmap-of-presidential.html
I'm pulling this discussion out of the comments thread on Scott Enderle's blog, because it's fun. This is the formal statement of what will forever be known as the EFFICIENT PLOT HYPOTHESIS FOR...
http://sappingattention.blogspot.com/2016/09/the-efficient-plots-hypothesis.html
Word embedding models are kicking up some interesting debates at the confluence of ethics, semantics, computer science, and structuralism. Here I want to lay out some of the elements in one recen...
http://sappingattention.blogspot.com/2016/08/language-is-biased-what-should.html
Debates in the Digital Humanities 2016 is now online, and includes my contribution, "Do Digital Humanists Need to Understand Algorithms?" (As well as a pretty snazzy cover image…) In it I l...
http://sappingattention.blogspot.com/2016/07/why-digital-humanists-dont-need-to.html
Some scientists came up with a list of the 6 core story types. On the surface, this is extremely similar to Matt Jockers's work from last year. Like Jockers, they use a method for disentanglin...
http://sappingattention.blogspot.com/2016/07/plot-arceology-emotion-and-tension.html
I usually keep my mouth shut in the face of the many hilarious errors that crop up in the burgeoning world of datasets for cultural analytics, but this one is too good to pass up. Nature has jus...
http://sappingattention.blogspot.com/2016/07/nature-publishes-flat-earth-research.html
I started this post with a few digital-humanities posturing paragraphs: if you want to read them, you'll encounter them eventually. But instead let me just get to the point: here's a trite new categ...
http://sappingattention.blogspot.com/2016/05/literary-dopplegangers-and.html
A heads-up for those with this blog on their RSS feeds: I've just posted a couple things of potential interest on one of the two other blogs (errm) I'm running on my own site. One, "Vector Spac...
http://sappingattention.blogspot.com/2015/11/a-heads-up-for-those-with-this-blog-on.html
Mitch Fraas and I have put together a two-part interactive for the Atlantic using Bookworm as a backend to look at the changing language in the State of the Union. Yoni Appelbaum, who just took over ...
http://sappingattention.blogspot.com/2015/01/state-of-union-and-corpus-comparison.html
Far and away the most interesting idea of the new government college ratings emerges toward the end of the report. It doesn't quite square the circle of competing constituencies for the rankings ...
http://sappingattention.blogspot.com/2014/12/federal-college-rankings-pitfalls-of.html
Before the holiday, the Department of Education circulated a draft prospectus of the new college rankings they hope to release next year. That afternoon, I wrote a somewhat dyspeptic post on the...
http://sappingattention.blogspot.com/2014/12/federal-college-rankings-who-are-they.html