Over the summer and early autumn I had the opportunity to continue work on the Reading Lives project. Over the last few years the project has grown from a discussion at a ‘hack-day’, about a column of data in a survey result-set that nobody knew what to do with, into a web-based application that allows people to explore the answers given to the question “What role has reading played in your life?”. The fact that you are reading this now suggests that you too could answer that question, and share a short summary of the role that reading has played in your life.
About the project
The project has become what it is through some funding from the Arts and Humanities Research Council via the CATH (Collaborative Arts Triple Helix) project, led by do.collaboration at the University of Birmingham in partnership with the University of Leicester. CATH consists of several teams, each of which is made up of a developer, an arts organisation and researchers. Our team consists of researchers Danielle Fuller (University of Birmingham) and DeNel Rehberg Sedo (Mount Saint Vincent University, Canada); myself, Tim Hodson (the developer); and Writing West Midlands (the arts organisation).
At the time of writing, the project allows people not only to explore the existing survey answers, but also to contribute their own. The app presents a user profile which people can fill out with their own survey answers. Recently the app featured at the Birmingham Literature Festival, and was seen on the big screen at several of the events.
About the Data
The Answers, as I call them, are the answers to the question “What role has reading played in your life?”. Each Answer is analysed for its word content using a (relatively) simple algorithm called Term Frequency – Inverse Document Frequency (tf-idf). This algorithm allows me to decide how important a word is in a particular Answer, based on its frequency in that Answer and its frequency in the corpus of all Answers. To quote Wikipedia: “The tf-idf value increases proportionally to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus, which helps to control for the fact that some words are generally more common than others.” This calculated importance of words is used to build a word cloud, which is meant to act as an alternative way to explore the frequently occurring themes of people’s Answers. We also have demographic data for the Answers, which was collected through the original survey. We plan to use this to allow further empathetic connections between the viewer of the app and the original Answers, but we haven’t got to build that bit yet :).
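To make the tf-idf idea a little more concrete, here is a minimal sketch in Python. It is not the code the app runs: the function names (tokenize, tf_idf), the very simple tokeniser and the choice of the plain log(N / df) weighting are all assumptions made for illustration, and real implementations often add smoothing or stop-word removal.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (hypothetical helper)."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(answers):
    """Return one {word: score} dict per answer (illustrative, not the app's code)."""
    # Count how often each word occurs within each answer.
    docs = [Counter(tokenize(answer)) for answer in answers]
    n_docs = len(docs)

    # Document frequency: the number of answers that contain each word.
    df = Counter()
    for counts in docs:
        df.update(counts.keys())

    scores = []
    for counts in docs:
        total = sum(counts.values())
        scores.append({
            # Term frequency (share of this answer's words) multiplied by
            # inverse document frequency (rarer across the corpus = larger).
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return scores
```

With this weighting, a word that appears in every Answer scores zero, while a word repeated in one Answer and absent from the others scores highest, which is roughly what you want when sizing words in a word cloud.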
About the app (warning – gets technical!)