I can’t recall where I stumbled upon this awesome visualization tool, but I also blogged about it on my cliotech blog. This is the type of cool application that makes me wish I were still teaching history.
Here is the information from the developer, Chirag Mehta.
- The above tag cloud shows the popularity, frequency, and trends in the usages of words within speeches, official documents, declarations, and letters written by the Presidents of the US between 1776 – 2007 AD.
- The dataset consists of over 365 documents downloaded from Encyclopedia Britannica, ThisNation.com, and WhiteHouse.Gov.
- Once the documents have been dated and converted to plain-text, my tag-cloud-generation script goes through every text chronologically and makes a list of all the unique words that have been used and counts how many times each word is used.
- Next it identifies the different variations of words and combines them under the most common variation. E.g. "promised", "promises", "promising", and "promise" might be grouped under "promises". I use the Porter Stemming Algorithm (PHP port by Richard Heyes) to find the root word for variations. I wish Snowball had a non-PECL PHP class.
- Then it removes the most common words like "the", "and", "this", "that" and some not-so-common language-specific words like "hitherto" and "notwithstanding".
- Once the commonly used language-specific words are removed, it makes a "tag cloud" in which the more commonly used words are shown in bigger font size than the less frequently used ones.
- Additionally, it tries to figure out how long ago a given word hit its peak usage and brightens the recently used words while fading away words that haven’t been used in a while.
- Due to the lack of precise date information, I’ve had to estimate the date for some of the texts. Nevertheless, I think it gives quite a good overview of the word usage and consequently highlights the primary issues of the day.
- This aging tag cloud was created by the same algorithm that generates my personal tag cloud using the plain-text html-stripped contents from my ‘blog entries.
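For readers curious how the steps above might fit together, here is a minimal sketch in Python. It is not Mehta's script (his is written in PHP and uses the Porter Stemming Algorithm). The `crude_stem` helper is a rough stand-in for a real stemmer, and the function names, the tiny stopword list, the font-size range, and the brightness formula are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption); a real one would be much longer.
STOPWORDS = {"the", "and", "this", "that", "hitherto", "notwithstanding"}


def crude_stem(word):
    """Rough stand-in for the Porter stemmer: strip one common suffix."""
    for suffix in ("ing", "ed", "es", "e", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_cloud(documents, min_size=10.0, max_size=40.0):
    """documents is a list of (year, plain_text) pairs.

    Returns {label: {"count", "size", "peak_year", "brightness"}}.
    """
    totals = Counter()              # stem -> total occurrences
    variants = defaultdict(Counter)  # stem -> how often each spelling appears
    by_year = defaultdict(Counter)   # stem -> occurrences per year

    # Walk the texts chronologically, counting every non-stopword.
    for year, text in sorted(documents):
        for word in re.findall(r"[a-z]+", text.lower()):
            if word in STOPWORDS:
                continue
            stem = crude_stem(word)
            totals[stem] += 1
            variants[stem][word] += 1
            by_year[stem][year] += 1

    if not totals:
        return {}

    years = [year for year, _ in documents]
    latest = max(years)
    span = latest - min(years)
    top = max(totals.values())

    cloud = {}
    for stem, count in totals.items():
        # Show the group under its most common variation.
        label = variants[stem].most_common(1)[0][0]
        # Peak year: highest count, with ties going to the later year.
        peak_year = max(by_year[stem], key=lambda y: (by_year[stem][y], y))
        # Assumed linear scaling: more frequent words get a bigger font.
        size = min_size + (max_size - min_size) * count / top
        # Assumed fade: 1.0 if the peak is recent, 0.0 if it is the oldest.
        brightness = 1.0 if span == 0 else 1.0 - (latest - peak_year) / span
        cloud[label] = {
            "count": count,
            "size": size,
            "peak_year": peak_year,
            "brightness": brightness,
        }
    return cloud


sample = [
    (1800, "The war promised freedom. War and promises."),
    (2000, "Freedom freedom promises promising freedom"),
]
cloud = build_cloud(sample)
```

With these two made-up snippets, "promised", "promises", and "promising" end up grouped under "promises", while "war", which peaks in the older text, gets the lowest brightness. Note that this sketch drops stopwords before stemming, whereas the description above removes them afterward; for a list this small the result is the same.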
What a clear and visual way to examine political trends and themes over time. Not only can students make generalizations about common themes, but they can also examine how the attention to those themes has waxed and waned over time.
They can make comparisons and contrasts among various periods in history. They could even use this tool along with a mapping application, like GoogleEarth, Frapper, or Google MyMaps, to geotag America’s growing "footprint" on the world stage.
Just imagine the possibilities . . . asking students to analyze the issues facing Presidents Lincoln, Roosevelt, Kennedy, Reagan, and Bush using this tool. Or, imagine asking students to create a mind map of themes that have resonated throughout different presidential terms.
Users can even download the elegant tagging application to apply to other texts.