A Brief Introduction to Data Mining Projects in the Humanities

Data mining offers the capability to view data in a new light, discovering associations and patterns not appreciated before. For the humanities domain, it exemplifies the interdisciplinary efforts of digital humanities. The technique provides answers, and its discoveries prompt further questions. As part of knowledge discovery in databases, data mining involves identifying relevant n-grams, classifying and reclassifying results, modeling the interdependence of variables, and clustering results into meaningful subgroups. From designing research questions to determining how best to display and communicate results, the process requires collaboration between information professionals and humanities scholars. A selection of data mining projects illustrates how the technique is being applied in humanities research. Tools for data mining are readily available online, either through simple web interfaces or as downloads that can be customized for optimal results.
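
As a rough illustration of the first step named above, identifying n-grams, the following minimal Python sketch counts word bigrams in a short passage. The function name `ngram_counts`, the simple tokenization rule, and the sample sentence are all assumptions made for illustration; they are not taken from any project or tool discussed here, and real corpora would likely need more careful tokenization.

```python
import re
from collections import Counter


def ngram_counts(text, n=2):
    """Count word n-grams in a text (hypothetical helper for illustration)."""
    # Assumed tokenization: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Slide a window of length n across the token list.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


# Invented sample text, not drawn from any real corpus.
sample = "the whale and the sea and the whale"
counts = ngram_counts(sample, n=2)
print(counts.most_common(2))
```

Frequent n-grams found this way could then serve as features for the classification or clustering steps, though how they are selected would depend on the research question.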