facebook for the characters of 19th century fiction
there are few occasions when the computer science wing of a university gets together with the english department. don’t get me wrong, the english department is an insecure scrounger all too eager to take bits and pieces from every other discipline. marxism? sure! gender studies? why not? semiotics? gimme gimme! but the one thing english has yet to grab is compsci.
and yet this paper manages to unite the two fields in one amazing topic: using computers to extract social networks from 19th-century literary fiction. from the abstract:
We present a method for extracting social networks from literature, namely, nineteenth-century British novels and serials. We derive the networks from dialogue interactions, and thus our method depends on the ability to determine when two characters are in conversation. Our approach involves character name chunking, quoted speech attribution and conversation detection given the set of quotes.
using the data presented in this paper, i mapped out the conversation network of the principal characters of jane austen’s mansfield park. the size of each oval is proportional to how often that character is mentioned (i.e. their tumblarity), and the weight of each connecting line is proportional to the length of the conversation between the two characters. among other things, we can see that edmund, despite fewer mentions, is clearly the central character of the book.
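for the curious, here is a rough python sketch of how a map like this could be put together. it is not the paper's method and not the code behind my picture, just one plausible way to do it. the character names, the numbers, and the helper functions `build_network` and `weighted_degree` are all made up for illustration, and "central" here is assumed to mean "the largest total conversation length", which is only one of several possible definitions.

```python
from collections import defaultdict

# hypothetical placeholder data, NOT figures from the paper
# mentions: how often each character is named (drives oval size)
mentions = {"character_a": 10, "character_b": 6, "character_c": 4}

# conversations: (speaker, speaker, conversation length), drives line weight
conversations = [
    ("character_a", "character_b", 5),
    ("character_b", "character_c", 4),
    ("character_a", "character_c", 1),
]


def build_network(conversations):
    """Build an undirected weighted graph as a dict of dicts."""
    graph = defaultdict(dict)
    for a, b, length in conversations:
        # add the weight in both directions; repeated pairs accumulate
        graph[a][b] = graph[a].get(b, 0) + length
        graph[b][a] = graph[b].get(a, 0) + length
    return graph


def weighted_degree(graph):
    """Sum the line weights touching each character."""
    return {name: sum(neighbours.values()) for name, neighbours in graph.items()}


network = build_network(conversations)
centrality = weighted_degree(network)

most_mentioned = max(mentions, key=mentions.get)
most_central = max(centrality, key=centrality.get)

print("most mentioned:", most_mentioned)
print("most central:", most_central)
```

with these invented numbers the most-mentioned character and the most-connected character come out different, which is the same kind of gap the map shows between mention count and centrality.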
as i always feared, it was only a matter of time before our humanities professors were squeezed out of a job by a bad boy gang of robot scholars.


