Affiliates’ Projects

This document records the ongoing digital projects by dSHARP affiliates at CMU. If you are working on a digital project and don’t see yours on this list, please let us know!

Current Affiliates’ Projects

DocuScope

DocuScope is a text analysis environment with a suite of interactive visualization tools for corpus-based rhetorical analysis. The DocuScope Project began in 1998 as a result of collaboration between David Kaufer and Suguru Ishizaki at Carnegie Mellon University. David created what we call the generic (default) dictionary, consisting of over 40 million linguistic patterns of English classified into over 100 categories of rhetorical effects. Suguru designed and implemented the analysis and visualization software, which can annotate a corpus of text against any dictionary of regular strings that are classified into a hierarchy of rhetorical effects. While we designed DocuScope as a tool for rhetorical analysis, we also found that it was extremely effective for developing the dictionary in a systematic fashion.

Six Degrees of Francis Bacon

Six Degrees of Francis Bacon is a digital reconstruction of the early modern social network that scholars and students from all over the world can collaboratively expand, revise, curate, and critique. Historians and literary critics have long studied the way that early modern people associated with each other and participated in various kinds of formal and informal groups. By data-mining existing scholarship that describes relationships between early modern persons, documents, and institutions, we have created a unified, systematized representation of the way people in early modern England were connected. Unlike published prose, Six Degrees is extensible, collaborative, and interoperable: extensible in that actors and associations can always be added, modified, developed, or removed; collaborative in that it synthesizes the work of many scholars; interoperable in that new work on the network is put into immediate relation to previously mapped relationships.

E-thos

E-thos is a digital humanities project which analyzes appeals to expertise in scientific testimony over climate change. The project advances work in digital humanities methods through its development of strategies for identifying argumentative appeals in phrases and within clauses as well as through the development of digital tools which support these modes of textual analysis.

Latin American Comic Archive

This is a project to explore, identify and pilot the use of digital tools to: 1) create an online platform to house a curated archive of digital Latin American comic books; 2) enhance research and teaching of Spanish-language comics through student/scholar collaboration.

Mapping Gandhi: A History of Social Change in Wardha, India

The Mapping Gandhi project aims to build an interactive online map that will 1) illustrate the history of Mahatma Gandhi and several of his most prominent disciples in Wardha, a rural district in central India; 2) overlay the contemporary efforts of the Kamalnayan Jamnalal Bajaj Foundation to continue Gandhi’s legacy in Wardha; 3) record Bajaj staff and local villagers discussing the legacy of Gandhi; 4) empower users to ask questions of Bajaj staff and villagers; and thus 5) explore the spatial dimensions of Gandhi’s legacy and the politics of memory in local and transnational contexts. Wardha played a special role in Gandhi’s life, as it was the location of his last ashram and his primary residence from the early 1930s until his death in 1948. Wardha was also the base of several of Gandhi’s most prominent disciples, including the land reform activist Vinoba Bhave and the social worker Baba Amte, who worked to defend and empower leprosy patients. Today, Wardha is home to the Kamalnayan Jamnalal Bajaj Foundation, a large rural development organization funded by the Bajaj family, a family with strong historic ties to Gandhi. By linking the history of Gandhi to the work of the Bajaj Foundation, our interactive map explores the politics of memory and the potential for digital platforms to empower efforts at public history that are simultaneously local and transnational.

Print and Probability: A Statistical Approach to Clandestine Publication

Early modern British printers often printed their works clandestinely in order to evade responsibility for introducing religious and political dissent, avoiding risks of ignominy, imprisonment, or official seizure of printing materials. As a result, there are still over 100,000 early modern books and pamphlets whose printers remain unknown. However, defects and variations in the printing tools of this era may hold the key to identifying these printers. Painstaking, individual studies have found telltale defects in the printing of individual characters from this era, due to damaged type pieces. These defects show up in several works from the same printers, potentially allowing anonymous printers to be identified. For this approach to be effective on a large scale, it must be possible to automatically screen for such defects and then to classify works based on a large set of potential defects. This suggests an automated, statistical approach to both defect identification and document classification. We will develop methodology for both of these problems, and validate its reliability in a preliminary study. Such a computational method for inferring printers of early modern books would have profound implications for early modern studies, yielding insights into the secret print networks of individual authors and the sociology of anonymous early modern printing.

Alumni Projects

Visual Haggard

Visual Haggard is a digital archive intended to centralize and improve access to the illustrations of popular Victorian novelist H. Rider Haggard. Kate Holterhoff created this project during her time as a Ph.D. candidate in the Literary and Cultural Studies department at Carnegie Mellon University.