We present a web-based visualization that allows the user to interactively filter and display characteristics of 32,795 of Hillary Clinton's emails as provided by Wikileaks. The visualization focuses on the metadata of each email, including its senders, receivers, and the timestamp at which the email appeared on the Clinton server (from the Wikileaks source). An interactive time range slider filters all emails, and all displays update automatically in response to changes in the slider. The main display shows Clinton's most frequent correspondents arranged as nodes of a spoked graph with Clinton at the centre. Volume determines the thickness of each spoke, and high volume determines an inner circle whose spokes are shortened. Correspondents and their edges are coloured according to whether that email account could be identified as being an approved Federal government account or not. A second display shows two daily time series: the total number of emails for that day, and the number meeting the selection criteria. A third display shows a scatterplot of the time of day versus the day on which each email appeared. Scatterplot points are coloured by whether the email was redacted or not. Other displays add some information beyond metadata. FOIA exemption codes appear as a selectable list, and a barplot shows email counts by FOIA code. The (stemmed) terms having the highest frequency in the displayed emails, and those having the highest tf-idf, are listed in separate displays. All displays are interactively filtered by time range and selected FOIA codes. We illustrate how the filtered displays can be used to generate hypotheses and uncover interesting information. These touch on contentious issues including the handling of classified information, the 2012 attack on the U.S. diplomatic compound in Benghazi, and emails apparently missing from those released publicly. The data are extracted from Wikileaks HTML files, cleaned, and stored in a form useful for interactive exploration.
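As a rough illustration of the tf-idf term ranking mentioned in this abstract, the following minimal Python sketch scores stemmed terms per email. It assumes the common definition (term frequency within an email multiplied by the log of the inverse document frequency); the exact weighting and the way scores are aggregated over the displayed emails are not stated here, so the function names `tf_idf` and `top_terms` and the toy tokens are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Score terms per document.

    docs: list of token lists, assumed to be already stemmed.
    Returns one dict per document mapping term -> tf-idf score,
    using tf = count / document length and idf = log(N / df).
    This weighting is an assumption, not the authors' stated method.
    """
    n_docs = len(docs)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    scores = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(score, k=3):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    return sorted(score, key=lambda t: (-score[t], t))[:k]


# Toy, made-up token lists standing in for stemmed email bodies.
emails = [
    ["benghazi", "attack", "report"],
    ["schedul", "meet", "report"],
    ["schedul", "call", "report"],
]
scores = tf_idf(emails)
best = top_terms(scores[0], k=2)
```

A term that occurs in every email (here `report`) receives a score of zero, which is why a tf-idf list can surface different terms from a plain frequency list.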