BYU Law Hosts 6th Annual Law and Corpus Linguistics Conference to Advance the Study of Language in the Law

Virtual Event to Include Harvard Law Project Announcement, Keynote by Prominent Linguistics Professor and Public Interface Launch

January 27, 2021 11:05 AM EST

PROVO, Utah, Jan. 27, 2021 /PRNewswire/ -- BYU Law, a leading national law school focused on leadership in the legal profession, today announced it will hold its sixth annual Law and Corpus Linguistics Conference on February 5. The event will virtually convene prominent legal and linguistics scholars, judges and industry professionals interested in furthering the discipline of corpus linguistics – the methodology for understanding the meaning of words at the time they were written by analyzing language in large collections of texts called "corpora."

In addition to hosting this annual conference, BYU Law develops pioneering legal research corpora and fosters influential scholarship and training using this method. BYU Law will release the public version of its law and corpus linguistics platform and search interface in the first quarter of 2021. The open public beta has already been used by thousands of researchers, including federal and state justices and appellate attorneys. The platform enables legal professionals to analyze the meaning of words that can be applied to current cases, and several of the corpora have been referenced in legal opinions.

"It's an exciting and turbulent time in the world of law as courts continue to seek ways to interpret the meaning of words from important historical rulings and founding-era documents," said D. Gordon Smith, Dean, BYU Law. "I'm inspired to see how far we've come in just a decade since we recognized the potential of corpus linguistics to revolutionize the process of interpretation in the legal space. I expect continued advances in the discipline as we grow our community of interdisciplinary scholars and legal professionals familiar with the practice."

2021 Conference

BYU Law's sixth annual Law and Corpus Linguistics Conference, sponsored by Schaerr Jaffe, LLP, brings together legal scholars from across various areas of scholarship, prominent corpus linguistics scholars, and judges who have employed corpus linguistics analysis in their decisions. The keynote presenter is Tammy Gales, Associate Professor of Linguistics at Hofstra University. She has presented lectures and published numerous articles about using corpus linguistics as a tool in legal interpretation and forensic linguistics.

The event will include three conference sessions on papers or panel topics to address best practices in corpus methods, development of new corpora, triangulation using corpus linguistics and other methods, and interpretation applications (statutes, contracts, canons of interpretation, patents, etc.). For more information about the conference, visit or click here to register.

CAPCorpus: Applying Corpus Linguistics to Harvard's Groundbreaking Caselaw Access Project

Following the conference, BYU Law will debut the CAPCorpus, encompassing Harvard Law School's American Case Law Access Project, which includes over 6.7 million cases representing roughly 12 billion words. Covering 360 years of American case law, the corpus will provide avenues for research never before available to law and linguistics researchers. The scope of the data shared by Harvard required a complete rewrite of the underlying law and corpus linguistics research platform. Users will be able to search the entire CAPCorpus or select segments to create more narrowly defined custom corpora. For replicability, searches and results will continue to be savable and linkable to Google Sheets, which can be shared with others to verify research conclusions and cited without fear of alteration after citation.

Interface Public Release

In tandem with the conference, BYU Law will demo the formal first public release of its law and corpus linguistics research platform – an unprecedented legal technology tool launched in 2018 that makes available the first large-scale data sets of all U.S. Supreme Court rulings and founding-era documents to provide historical context for the usage and meaning of words for legal use. Each corpus contains millions of words from thousands of texts representing language from a relevant period or court. The platform enables legal professionals to analyze the meaning of words that can be applied to current cases.

Beta tested by thousands of users, the first public release will include a simplified interface and a variety of measures to help researchers use the tool. These include distribution and adjusted frequency measures. Distribution metrics measure how evenly a word is spread across the corpus, contrasting words that are evenly distributed with those that clump in a small number of documents. More than 20 new measures have been added, including Gries' DP and Gries' DP Norm. Both measure dispersion, allowing researchers to see not only how often a word appears but also how widespread it is. This is useful because words that appear often in a small handful of texts are significant in different ways from words that appear a few times in a wide range of texts. Compared with Gries' DP, the normalized version has defined minimum and maximum possible values.

Corpus linguistics is largely concerned with frequencies, built on the assumption that how frequently (or infrequently) a word appears is meaningful. Frequency information is used in a variety of subdisciplines: by language teachers wanting to prioritize the words their students will most frequently encounter, by legal scholars attempting to determine which meanings of a word are most common or ordinary, and by psycholinguists designing experiments to understand language comprehension. Given the central importance of frequency information, it is also critical to consider how words are distributed in the corpus. A word that appears a few times in many different documents is different from a word that appears many times in a single document. A word that has a high frequency but appears in only one document is likely to tell us more about that document than about the corpus as a whole. Corpus researchers who are interested in frequency information in most cases want to ensure that their data is not skewed by words with low distribution. The new interface will allow users to select which measures are most appropriate for the research they are conducting.
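
As a rough illustration of the dispersion measures named above, the following Python sketch computes Gries' DP and its normalized form from per-document counts. It follows the published definitions as commonly stated (DP is half the sum of absolute differences between each document's share of the corpus and its share of the word's occurrences; the normalized form divides by one minus the smallest document share). The function names and sample figures are hypothetical, and nothing here is meant to describe how the BYU Law platform itself computes these values.

```python
from typing import Sequence


def gries_dp(word_counts: Sequence[int], part_sizes: Sequence[int]) -> float:
    """Gries' DP: 0 means perfectly even spread; higher means more clumped."""
    if len(word_counts) != len(part_sizes):
        raise ValueError("word_counts and part_sizes must have the same length")
    total_words = sum(part_sizes)
    total_hits = sum(word_counts)
    if total_words == 0 or total_hits == 0:
        raise ValueError("corpus and word frequency must both be non-zero")
    # Compare each document's share of the word's hits with its share of the corpus.
    return 0.5 * sum(
        abs(hits / total_hits - size / total_words)
        for hits, size in zip(word_counts, part_sizes)
    )


def gries_dp_norm(word_counts: Sequence[int], part_sizes: Sequence[int]) -> float:
    """Normalized DP: rescales DP so its maximum possible value is 1."""
    if len(part_sizes) < 2:
        raise ValueError("normalization needs at least two corpus parts")
    smallest_share = min(part_sizes) / sum(part_sizes)
    return gries_dp(word_counts, part_sizes) / (1 - smallest_share)


# Hypothetical corpus: four documents of 1,000 words each; the word occurs 40 times.
sizes = [1000, 1000, 1000, 1000]
even = [10, 10, 10, 10]    # a few occurrences in every document
clumped = [40, 0, 0, 0]    # every occurrence in a single document

print(gries_dp(even, sizes))          # 0.0
print(gries_dp(clumped, sizes))       # 0.75
print(gries_dp_norm(clumped, sizes))  # 1.0
```

Both sample words have the same raw frequency (40), yet the clumped word reaches the maximum normalized score while the evenly spread word scores zero, which is the contrast a researcher would want to see before drawing conclusions from frequency alone.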

For more information about BYU Law's Law and Corpus Linguistics project, visit

About BYU Law School

Founded in 1971, the J. Reuben Clark Law School (BYU Law) has grown into one of the nation's leading law schools – recognized for innovative research and teaching in social change, transactional design, entrepreneurship, corpus linguistics, criminal justice and religious freedom. The Law School has more than 6,000 alumni serving in communities around the world. In its most recent rankings, SoFi ranked BYU Law as the #1 best-value U.S. law school in its Return on Education Law School Ranking. For more information, visit
