- Start date: -
- End date: -
- Funder: AHRC
- Co-investigators: Serge Sharoff, Tony Hartley
Much humanities research relies on, or would benefit from, the analysis of electronic corpora (representative collections of texts). Compared with hand-picked examples, corpora make it possible to collect data systematically, to assess how central particular features are to the research material, and to test empirically for potential trends in the data. The major difficulty facing corpus-based studies in the humanities, however, is that creating and annotating a new corpus and designing an appropriate search engine for textual analysis require complex technical support.
IntelliText's novel contribution lies in tuning advanced tools and methods from computer science to the needs of humanities researchers, integrating them into a single software application with a simple interface and good documentation. This allows humanities researchers with no specialised background in computer science or corpus linguistics to take advantage of powerful methods of text collection and analysis. It enables them to collect new project corpora from the web, have them enriched automatically with linguistic and other annotations, and then easily uncover interesting patterns of usage, starting either from their own intuitions and hypotheses, or from expressions and patterns identified as potentially noteworthy by the system.