Legalese as Seen Through the Lens of Corpus Linguistics. An Introduction to Software Tools for Terminological Analysis

María José Marín


Despite the many possibilities that Corpus Linguistics offers for the study of legal English, corpus-based research on this variety has been less fruitful than research on other branches of ESP. The present study serves as an introduction to major issues in the design and compilation of a legal corpus, such as the application of appropriate sampling strategies to ensure its representativeness. It also examines the implementation of Automatic Term Recognition (ATR) methods for the analysis of legal terminology and the automatic deployment of collocate networks. The first section explores the controversial issue of establishing the ideal size for a specialised corpus by applying the type/token ratio to a corpus of judicial decisions, the BLaRC, used as reference. Section 3 describes the assessment of different ATR methods. Of the five methods tested, Drouin’s (2003) TermoStat proves the most efficient for legal term mining and is therefore recommended. Finally, sections 4 and 5 demonstrate the practicality of collocate networks (Williams, 1998; 2001), which reveal lexico-grammatical patterns that provide plenty of information for the study of legal texts. Section 5.2 presents a case study of the sub-technical legal term "party" using Lancsbox, designed by Brezina, McEnery & Wattam (2015), in which the general and specialised contexts of the term are examined. This scrutiny brings to the foreground interesting data, such as the relevance of marriages of convenience in a collection of judicial decisions.

Legal English, Corpus Linguistics, Terminology, Automatic Term Recognition, Collocate Networks, Lancsbox

