Elena M. Friot


Thickening the Data: How Excel Helped Me Become a Better Historian

The purpose of this post is to show how I started rethinking my approach to historical research – finding sources, reading them, interpreting them, and using them to ask and answer new questions.

For Week 5 of our Digital History seminar, we had to read a few essays about the nature of historical data.  A post by Trevor Owens suggested that as historians, we can think of data in three particular ways – as artifact, as text, and as information.  Regardless of the form, data allows us to glean that stuff we call “evidence” from the stacks (or bytes) parked in front of us.  Owens also teamed up with Fred Gibbs on a collaborative essay about data interpretation and historical writing, and one passage really resonated with me as I dove into my own research:

But data does not always have to be used as evidence. It can also help with discovering and framing research questions. Especially as increasing amounts of historical data is provided…playing with data—in all its formats and forms—is more important than ever. This view of iterative interaction with data as a part of the hermeneutic process—especially when explored in graphical form—resonates with some recent theoretical work that describes knowledge from visualizations as not simply “transferred, revealed, or perceived, but…created through a dynamic process.”*  [Paragraph 11, The Hermeneutics of Data and Historical Writing, *Martyn Jessop, “Digital Visualization as a Scholarly Activity,” Literary and Linguistic Computing, 23.3 (2008): 281-293, 282.]

That phrase – iterative interaction with data – is the one that stuck out to me.  I have done plenty of research in the past, but my standard approach was to read the source, squeeze out the information, and move on to the next piece.  Thinking about working with my research over and over again at first sounded tiresome, but like they say – don’t knock it until you try it!

I added that new gem to what Tricia Wang said about big data and thick data.  After untangling the remaining historiographical cobwebs of Clifford Geertz and thick description, I started thinking about the importance of the story.  Where could I find the story hidden in the data? My growing Excel spreadsheet looked impressive with all its carefully labeled columns, but where was the poignancy? What did service numbers, parents’ names, and regiments tell me about the soldiers and their community? What did all this data have to do with the way these soldiers were remembered by the town?

The steps below outline my foray into treating my research as data, and might serve as a model as you tackle your own research:

Step 1: If You Want to Think of Research as Data, Create an Excel Spreadsheet!

I had never used an Excel spreadsheet to organize my research prior to starting this project.  I stuck to index cards, legal pads, and a whole lot of Post-It Notes.  When it came time to write the paper, I was a mess.  I always pulled it off, but not without a gradual loss of sanity and random moments of anti-social behavior.  When I started thinking about my research as data, I didn’t head for my stash of paper products – I went right for my laptop and fired up Excel.  With Excel, I can add columns every time I come across a different type of data.  I can color-code cells to create categories and connections.  Best of all, my data is digitized and I can save and export the information in a variety of formats for use in different digital tools.


Only a couple columns in this spreadsheet came from the newspaper clippings. The rest were added based on searches completed with the information from those clippings.
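One practical payoff of keeping research in a spreadsheet is that an export becomes input for other tools.  As a minimal sketch – with made-up rows and illustrative column names, not my actual headers – here is how a CSV exported from Excel can be read into Python, where each row becomes a dictionary keyed by column name:

```python
import csv
import io

# Hypothetical rows standing in for an exported slice of the spreadsheet;
# the column names here are illustrative, not the real headers.
exported = io.StringIO(
    "name,street,unit,date_of_death\n"
    "Edwin L. Haven,Haven Street,,1944-12-24\n"
    "John Doe,Doe Avenue,135th Infantry Regiment,1944-06-06\n"
)

# DictReader pairs each value with its column header.
rows = list(csv.DictReader(exported))

for row in rows:
    print(row["name"], "->", row["street"])
```

From here the same rows can be handed to mapping, timeline, or visualization tools without any retyping – the digitized spreadsheet does double duty as a data set.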

Step 2: Venture into the Unknown Wealth of Cyberspace – Add and Fill Columns!

I had A LOT of blank spaces on my spreadsheet when I first started.  My research was helped by the kind staff at the Swift County Historical Society, who sent me an envelope full of newspaper clippings from the 1940s, when the Appleton streets were renamed for their war dead.  Short biographies of the 29 soldiers from Appleton who died during World War II helped fill in some of the categories – name, street name, and date of death.  Information in these bios was inconsistent, so for only some of the men I had their unit, date of birth, parents’ names, and location of death.  Empty cells irked me, so I had to start thinking about where I might find that information.  If I had all the time and grant money in the world, I could easily hop on a plane and plant myself in the archives, city records offices, and libraries that have this information.  Because I don’t, I turned to the Internet (and all the FREE stuff I could find) to fill in the blanks.

I found information about these men that I had not anticipated.  As a result of hours of trying various search terms and moaning at the services that promised to find what I needed (but for a price), I was able to fill out most of my table and even add columns I had not previously thought of or considered important.  Adding the units they served in gave me a way to find out about their lives as soldiers.  What campaigns were they involved in? Would families back home have read about these in the news? What were they like as soldiers? Did they earn any honors and awards? How did they die? Did their place of death have any bearing on where they were buried? I discovered that PFC Edwin L. Haven was on board the SS Leopoldville, a Belgian troop transport ship, when it was sunk by a German U-Boat on Christmas Eve, 1944.  I unearthed photographs of the ship and information about the efforts of shipwreck-hunters to find it in the English Channel.
This is the thick data that gives meaning to the names, numbers, and dates that pepper my spreadsheet, but I only found this thick data because the spreadsheet screamed at me to be filled.
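Those nagging blanks can even be listed automatically.  A short sketch, again with stand-in rows and hypothetical column names, that scans an exported CSV and builds a to-research list of missing cells per soldier:

```python
import csv
import io

# Illustrative rows with gaps, like the early version of my spreadsheet.
data = io.StringIO(
    "name,unit,parents,place_of_death\n"
    "Edwin L. Haven,,Mr. and Mrs. Haven,\n"
    "John Doe,135th Infantry Regiment,,Normandy\n"
)

todo = {}  # name -> list of columns still missing
for row in csv.DictReader(data):
    missing = [col for col, val in row.items() if not val.strip()]
    if missing:
        todo[row["name"]] = missing

print(todo)
```

The result is essentially a research agenda: each entry names a soldier and the columns still waiting to be filled, which is exactly what sent me off on those search-engine expeditions.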

This portion of the spreadsheet was added after perusing online military records, census data from 1940, and sites like www.findagrave.com.


Step 3: Seek Ye the Treasures of “Iterative Interaction” – Search, and Ye Shall Find!

The stories I discovered are fabulous, but I also unearthed a variety of other sources that help me construct a social history of wartime Appleton and its residents.  Every time I opened up the spreadsheet I used the information to cobble together unusual search strings.  One of the gems I discovered was the website of composer Daniel Kallman.  He was asked to write a piece for Appleton and the 34th Infantry Division, the unit in which many of Appleton’s soldiers served (and still serve today!).  Read what he writes about how the piece, Streets of Honor, came to be:


Prior to composing “Streets of Honor,” Kallman visited Appleton, listened to interviews conducted by author Erling Kolke, and read the same newspaper articles I did from the Appleton Press.

I found this because I kept playing with my data.  Kallman’s website led me to the semi-fictitious novel Streets of Honor by Erling H. Kolke.  As historians we sometimes hesitate to include fiction among our sources, but I snapped up this self-published novel and read it immediately.  This “down-home novel about the activities and families of the 135th Infantry Regiment” tells the story of a small Minnesota town with deep communal ties.  Reading this sheds light on why the community chose to honor their war dead by renaming their streets in 1947.

One search term in particular led to a treasure-trove of wartime letters written by one family from roughly 1941-1945.  I typed in “Ole Veum, Appleton” and first found a newspaper article about his exploits in the skies above Tunisia, but when I put in “Ole Veum, Africa” I found the letters.  “Letters from World War II” is a blog maintained by descendants of the Nelson family from Appleton, Minnesota.  The blog contains transcriptions of hundreds of letters, as well as scanned photographs, menus, postcards, and telegrams the family sent back and forth for the duration of the war.  THIS IS A GOLDMINE!


This website is a historian’s jackpot, especially since it contains the transcriptions of the letters instead of scanned images. The text is easy to dump into text-mining and topic modeling tools.
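Even before reaching for a full topic-modeling tool, a few lines of Python can give a first taste of what text mining on the transcriptions might surface.  A minimal sketch – the letter below is a made-up stand-in, not an actual Nelson family letter – that counts the most frequent content words:

```python
import re
from collections import Counter

# A stand-in snippet; the real input would be letter transcriptions
# copied from the "Letters from World War II" blog.
letter = """Dear folks, the weather in Africa is hot. Ole says the
flying is good and the weather should hold until Christmas."""

# Lowercase, pull out words, and drop a tiny hand-made stopword list.
words = re.findall(r"[a-z']+", letter.lower())
stopwords = {"the", "is", "in", "and", "a", "should", "until", "says"}
counts = Counter(w for w in words if w not in stopwords)

print(counts.most_common(3))
```

Scaled up across hundreds of letters, the same idea – tokenize, filter, count – is the first step most text-mining and topic-modeling pipelines take before anything fancier happens.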

The lesson here is that precious evidence is sometimes hidden beyond the first search pages, and that rethinking and playing with data helped me locate sources I might never have found had I given up after the first unproductive search terms.  Though I feel like I spent too many hours sitting in front of a computer digging for this information, think about the time (and money) I might have had to spend to find similar information in a dusty archive.  Now I have great stories and fabulous data, and I can narrow my archival needs more accurately.

Conclusion – So, Why Should You Do This, Too? 

The simple answer is because you can.   The better answer is because actively engaging with my data produced new questions and new sources.  I started with an Excel spreadsheet, and the quantity and quality of my data snowballed from there.  I did this all from the comfort of home (and my slightly overheated university office) without having to pay a penny for travel, photocopying, and morale-busting days in the archive.  I am a more productive and creative researcher as a result.  My data is organized, and because of cloud-sharing made possible by Google, Apple, and Dropbox, I can access my data anywhere and easily manipulate it to upload into various digital tools.  Because I found the letters, I now have materials to use in a text-mining and topic-modeling project.  Don’t worry that you will be any less of a historian because you call your research “data.”  Keep in mind Wang’s recommendation about “thick data” and ponder how your quantitative data can help you uncover the qualitative data that produces the richest historical analyses.

