Chapter 2
Literature Review
This chapter reviews the academic literature on corpus linguistics, specifically in the context of Business English (BE), provides an overview of the use of corporate annual reports in ELT pedagogy, and establishes the theoretical framework (i.e. the lexical approach) that underpins the corpus-based materials that were developed.
2.1 Corpus Linguistics
2.1.1 Definition
Bowker and Pearson (2002:9) describe a corpus (pl. corpora) simply as “a large collection of authentic texts that have been gathered in electronic form according to a specific set of criteria”; McEnery and Baker (2017:1) similarly characterise corpora as “large bodies of machine-readable texts”. Corpora are digital files that can be analysed with the help of software known as corpus analysis tools or concordancers. Bowker and Pearson (2002) point out that corpora are an extraordinary resource for linguists and language researchers. In a corpus investigation, small fragments of a text, such as individual words or multi-word bundles, are examined, and multiple fragments can be examined simultaneously; our interaction with corpora therefore differs significantly from the way we interact with printed texts. Finally, it is important to note that a corpus is “not simply a random collection of texts” (Bowker and Pearson, 2002:11); rather, the texts in a corpus are selected according to “explicit criteria in order to be used as a representative sample of a particular language or subset of that language” (ibid.:11), i.e. of a particular subject field.
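To illustrate how a concordancer presents such fragments, the following is a minimal sketch in Python of a keyword-in-context (KWIC) display. The function name `kwic`, the sample sentences and the context width are assumptions introduced purely for illustration; they are not taken from any of the tools cited here, and the simple tokenisation ignores sentence boundaries.

```python
import re

def kwic(text, node, width=3):
    """Return one keyword-in-context line per occurrence of `node`.

    `width` is the number of words shown on each side (an illustrative default).
    """
    # Naive tokenisation: lower-case the text and keep alphabetic word forms only.
    tokens = re.findall(r"[a-z']+", text.lower())
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}")
    return lines

# Hypothetical two-sentence sample, not drawn from a real corpus.
sample = ("The company reported strong revenue growth. "
          "Revenue growth was driven by new markets.")

for line in kwic(sample, "growth"):
    print(line)
```

Each printed line places the search word in brackets with its immediate context on either side, which is the view of a text that distinguishes corpus consultation from linear reading.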
2.1.2 Types of corpora
Bowker and Pearson state: “There are almost as many different types of corpora as there are types of investigations” (2002:11). They subsequently identify types of corpora according to various criteria, such as size, purpose and type of texts. Corpora are therefore classified as:
(a) General reference vs special purpose: a general reference corpus is one that is representative of a given language as a whole and can therefore be used to make general observations about that language. A special purpose corpus focuses on a particular aspect of a language, such as the LSP of a particular subject field, a specific text type, a language variety or the language used by members of a certain demographic group.
(b) Written vs spoken vs multimodal: a written corpus contains written texts, while a spoken corpus consists of transcripts of spoken material. Additionally, multimodal corpora combine different media.
(c) Monolingual vs multilingual: a monolingual corpus contains texts in a single language, whereas a multilingual corpus contains texts in two or more languages.
(d) Synchronic vs diachronic: a synchronic corpus presents a snapshot of language use during a limited time frame, whereas a diachronic corpus can be used to study how a language has evolved over a long period of time.
(e) Open vs closed: an open (monitor) corpus is one that is constantly being expanded, whereas a closed (finite) corpus is one that is not augmented once it has been compiled.
(f) Learner corpus: a learner corpus is one that contains texts written by learners of a foreign language, created for the purpose of comparison and identification of the types of errors made by the learners (ibid.:11-13).
Following this taxonomy, the corpus compiled for the purpose of this thesis can be described as specialised, written, monolingual, synchronic and closed.
2.1.3 Corpus Linguistics
Corpus Linguistics (CL) is an approach or methodology for studying language use through the analysis of corpora. It is an empirical approach that involves studying examples of actual, authentic language rather than hypothesising about it. CL also makes extensive use of information technology, which means that data can be analysed in ways that are not possible when dealing with printed material (Bowker and Pearson, 2002:9).
CL first gained recognition among linguists about thirty years ago (Church, 1990), and the reason behind its sudden popularity was the fact that “text was more available than ever before” (ibid.). Thirty years later, owing to technological progress and accessible corpus tools, this relatively new area of linguistics has evolved into a “vibrant discipline” (Szudarski, 2017:1). Bowker and Pearson (2002:10) likewise recognise the role of constantly developing technology in the renewal of interest in CL. The emergence of new software programmes and corpus tools makes it easier to compile and consult electronic corpora, which are typically much larger than their printed equivalents. Accordingly, electronic texts can be gathered and consulted much more quickly than printed texts (Bowker and Pearson, 2002:11). As a result, “Corpora are becoming a very popular resource for people who want to learn more about language use” (ibid.:1).
2.1.4 General applications of CL in different fields
The increasing sophistication of CL methods and technology has led to a rapid expansion in their use over the last three decades and to what has been described as a “remarkable renaissance” (Rutherford, 2005:354; McEnery and Wilson, 2001:1) of CL. Its development has spread across the social and psychological sciences (McEnery and Wilson, 2001, ch. 4). Corpus linguistics methods have been applied in various disciplines and in genre analysis (McEnery and Wilson, 2001:117-119). Interestingly, they have also been employed in the examination of the pragmatics of questions in formal police interviews (Johnson, 2002, cited by Rutherford, 2005). Depending on the researchers’ interests, CL approaches can be applied to a number of areas of linguistic study: language pedagogy, discourse analysis, translation studies, lexicography, LSP pedagogy, pragmatics, sociolinguistics, media and business discourse, and literary or political linguistics (Bowker and Pearson, 2002:11; Szudarski, 2018). What is more, in the developing field of computational linguistics, AI systems and language processing tools commonly adopt corpus-based resources (Bowker and Pearson, 2002:11). Also noteworthy is the use of corpora to help historians gain insights into societies of the past (McEnery and Baker, 2017:1). As Bowker and Pearson (2002) note, “corpora can be used by anyone who wants to study authentic examples of language use.”
As the physical constraints of printed media do not apply to electronic corpora, and millions of words of running text can be stored digitally, language corpora have the potential to be used more extensively than other resources. Their electronic form makes them easy to update and much more straightforward to consult than printed resources: a corpus search can be conducted in seconds (Bowker and Pearson, 2002:18). It is therefore not surprising that CL research approaches have been applied in a wide range of disciplines and used to investigate a broad range of linguistic issues.
2.2 Business English and Annual Reports
This section discusses Business English and the genre of Annual Reports from a CL perspective.
2.2.1 Business English and its distinctive features
Business English is a broad concept within the ESP varieties; it is often used “as an umbrella term to refer to any interaction, written or spoken, that takes place in English, where the purpose of that interaction is to conduct business” (Nickerson and Planken, 2016:3). The main purpose of BE instruction is to enable learners to communicate effectively in a professional context. As Frendo (2005:1) states, millions of people around the world use English daily in their business activities. Essentially, BE is communication with other people within a specific context, and its main function is bringing “people together to accomplish things they could not do as individuals” (Frendo, 2005:1).
Due to its practical, task-oriented nature, BE has a set of distinctive features that can be briefly summarised as:
(a) asymmetrical – business interactions often involve participants of unequal status (e.g. manager vs. trainee), which is reflected in the language and the communication strategies used;
(b) topic-centred and task-oriented – the language serves the purpose of accomplishing certain tasks in order to fulfil the organisation’s goals;
(c) standardised – the structure of formal business interactions is typically ordered: it tends to progress through a number of stages and involves specific turn-taking rules;
(d) specialised – BE involves specific, professional lexis relevant to the participants’ specialism, the business discipline and the company’s core activity, making it distinctively different from everyday English (Nickerson and Planken, 2006:42-43).
Nelson’s (2006) findings correspond to these conclusions. His corpus analysis reveals, first of all, that BE words and phrases represent a limited number of semantic categories and combinations of words compared with everyday English. Additionally, BE lexis is predominantly related to business and tends to be more positive in nature. Finally, adjectives in BE largely refer to products and companies rather than people, and they emphasise action rather than emotion. Nelson (2006) also highlights the importance of business-specific semantic prosody, understood as “the collocational meaning arising from the interaction between a given node and typical collocates” (McEnery and Xiao, 2006:5). Semantic prosody refers to the relationship of a given word to speakers and hearers and is concerned with attitudes (Baker et al., 2006:144); like collocation, it cannot be accessed by non-native speakers through conscious introspection or intuition (Sardinha, 2000). Nelson (2006) argues that these characteristics have direct implications for BE instruction and have to be taken into consideration when designing a BE syllabus.
2.2.2 Authentic and corpus-informed materials in the BE classroom
In the interest of narrowing the gap between teaching and professional business communication, many scholars (Nickerson and Planken, 2016:44; Frendo, 2005:40; Koester, 2006; Nelson, 2006) emphasise the importance of real language data in BE pedagogy. There are many reasons for using authentic materials in the BE classroom, with the term authentic referring to materials “not written for pedagogic processes” (Wallace, 1998:145). One crucial benefit is that authentic materials allow students to better understand what they will find in their prospective professional environments (Ruiz-Garrido and Palmer-Silveira, 2015). Authentic materials also appear to increase students’ motivation and interest (Breen, 1985; Tomlinson, 2001), resulting in eager, meaningfully engaged learners (Apsari, 2014).
For good reasons, the discussion of authenticity in the classroom has also been re-energised by the availability of corpus data (O’Keeffe et al., 2007:26). Many researchers (Tsai, 2021; Skorczynska, 2010; Walker, 2011; O’Keeffe et al., 2007) suggest that corpus evidence should be taken into consideration when selecting materials for BE instruction. The literature shows that the unnatural-sounding linguistic components in published textbooks do not fully reflect the complexity of real-life language use that a corpus investigation reveals. For example, Skorczynska’s (2010) study of BE textbooks reveals a gap between the textbook sample and the corpus sample with respect to metaphors: her corpus investigation established that nearly a third of the textbooks’ metaphors were never used in the corpus. These findings demonstrate the importance of corpus evidence when selecting teachable material for BE instruction. In addition, Evans’s (2012) statement that “Only the most deluded materials designer could imagine that BE materials can accurately reflect the complexities of a global, wired world” underlines the need for genuinely authentic materials in BE instruction. Among the scarce innovative corpus-based BE textbooks, Business Advantage (Koester et al., 2012) deserves attention. In Investigating Workplace Discourse (Koester et al., 2006), the authors also argue for combining, in BE pedagogy, quantitative corpus-based methods that focus on investigating specific linguistic features in different genres with qualitative methods such as the analysis of conversations.
Frendo (2005:45) also calls for authenticity in the context of BE instruction, considering it a key issue when selecting teaching materials. The use of language corpora, he suggests, can be particularly useful, as it is “now relatively easy either to compile one’s own corpus of language or to gain access to huge, computerised language databases” that can be accessed by both teachers and learners (Frendo, 2005:49).
2.2.3 Annual Reports – definitions, genre and functions
Annual Reports (ARs) are “published documents used by most public companies to disclose the important corporate information to shareholders and the general public” (Lu and Ren, 2021:84). Bhatia (2010:39) characterises their main purpose as “informing their shareholders about the performance and health of the company, specifically its successes and failures, current problems, and prospects for its future development.” Contemporary ARs are typically visually attractive and consistently branded promotional documents (Cao et al., 2012). Created by insiders, ARs are designed to be “used by audiences of both insiders and non-experts” (Cao et al., 2012). The primary audiences for ARs are shareholders and potential investors, while other targeted audiences are employees, customers, suppliers, governments, contractors and the community. Furthermore, ARs may be of interest to researchers, as well as to BE teachers and learners (Cao, 2012; de Groot et al., 2011).
2.2.4 Annual Reports and Corpora – pedagogical applications in BE
As for the pedagogical value and practicality of ARs in the BE classroom, Nickerson and Planken (2015:97) describe them as “common forms of promotional Business English texts” and consider their suitability and usefulness in BE pedagogy. They assert that “students at higher levels of language proficiency could easily work with texts like annual reports” (ibid.:105) and that, since annual reports are easily accessible, they can readily provide practitioners with “a rich source of information to use with advanced Business English classes” (ibid.:105). Similarly, Poole’s (2017) research demonstrates the value of ARs in a corpus-based analysis, in which he also advocates the use of “pedagogically-downsized specialized corpora” that can be implemented in the BE classroom and, more broadly, in ESP contexts.
In one of the most recent research projects, Lu and Ren (2021) offer a corpus-based analysis of the linguistic features of a specific section of ARs, namely Management Discussion & Analysis, of public companies in China, in which they compare the linguistic practices of Chinese corporate writers with international norms. The authors argue that the findings on the linguistic differences between the narratives produced by Chinese and American companies have implications for Business English pedagogy, specifically in the Chinese context.
Further details and the rationale for choosing Annual Reports for this project are discussed in the Methodology chapter, in section 3.3.
2.3 Corpus-based applications in BE
The literature provides various examples of adopting CL in language teaching and specifically in reference to BE.
Clifton and Philips (2006:76) argue that, for the syllabus to have high surrender value “and not waste the learners’ time with language that is not pertinent to their discourse community”, it is essential for the instructor to carry out a language audit or needs analysis. By building up a corpus that reflects the nature of the relevant interactions, the instructor “can take a more objective look at what language is useful to the learner” (ibid.:76). This ensures that the syllabus and the teaching materials are based on authentic linguistic material applicable to the particular discourse community and not on intuition, which tends to be false (ibid.).
Like Nelson (2006), who provides corpus-based evidence that specific areas of lexis are characteristic of authentic BE and different from everyday English, Walker (2011) demonstrates the usefulness of corpus-based analysis in teaching those unique aspects of vocabulary and collocation. His study addresses learners’ questions about collocations and semantic prosody in the context of BE and reveals the effectiveness of corpus investigation in helping senior managers in global companies to develop a more sophisticated command of what is perceived as the key lexis. A corpus-based investigation of the authentic texts of a specific business genre, or of the collocational behaviour of key lexis, can be used to answer specific lexical questions in a very precise and accurate way.
In the realm of BE pedagogy, the value of CL was also confirmed by Tsai’s (2021) study, which explores the effects of corpus consultation in a BE course, more specifically a business letter writing course. Her research provides evidence that corpus use can be a very beneficial learning tool for BE writing, as it improves students’ writing quantity and quality in terms of lexical and syntactic complexity. She advocates “easy-to-use software tools for corpus analysis” for both learners and instructors and the popularisation of corpus application in BE writing and EFL overall.
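As an illustration of how the collocational behaviour of a key word might be inspected, the sketch below counts the words occurring within a fixed span either side of a node word. It is a simplified assumption-based example, not a reproduction of Walker's (2011) procedure: the function name `collocates`, the span of two words and the sample text are all hypothetical, and raw co-occurrence counts stand in for the statistical association measures that corpus tools normally provide.

```python
import re
from collections import Counter

def collocates(text, node, span=2):
    """Count the words found within `span` tokens to the left or right of `node`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            # Collect the window around the node, excluding the node itself.
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts

# Hypothetical sample text, invented for this illustration.
sample = ("We expect strong growth next year. Strong growth in sales "
          "offset weak growth in services.")

for word, freq in collocates(sample, "growth").most_common(3):
    print(word, freq)
```

In a real investigation the same count would be run over a full specialised corpus, and function words such as "in" would usually be filtered out or down-weighted before the results were shown to learners.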
2.4 CL and the lexical approach in materials development
2.4.1 Corpus-informed BE materials
Previous studies reveal that the majority of the available material does not reflect the findings of corpus-based research into business discourse (Hyland, 1994; Williams, 1988, as cited by Walker, 2011). Textbooks have been criticised for their choice of unnatural-sounding linguistic components, and business meetings have been shown to contain “a high degree of linguistic complexity not reflected in the (teaching) resources” (Walker, 2011). Researchers believe that CL can help bridge the gap between instruction and practical communication. In addition to their application in producing grammars and dictionaries, corpora can be effectively used to produce textbooks and teaching materials by providing authentic language examples for materials developers (Nickerson and Planken, 2016:173; Cotos, 2017). Teachers themselves can successfully use concordancer software to develop exercises that “prompt students to test linguistic hypotheses, notice contextual meanings, examine collocations” (Cotos, 2017:7). Koester (2014), whose work on Business English textbooks already featured CL, advises supplementing materials with corpus-based examples that show how learners can employ various language functions to improve their overall performance in negotiations (Koester, 2014:175-6).
A recent and compelling reflection of the increasing popularity of corpora in teaching is a project coordinated by Le Foll (2021). Its outcomes are presented as an online step-by-step guide for teachers on using online and corpus resources. Created with contributions from pre-service trainee teachers at Osnabrück University (Germany) and published as an Open Educational Resource, the project aims to empower ELT teachers to design their own authentic, corpus-based lessons.
2.4.2 The Lexical approach
The lexical approach was first introduced and described by Lewis (1993). It is a method of teaching foreign languages based on the premise that “language consists of grammaticalised lexis, not lexicalised grammar” and that “the grammar/vocabulary dichotomy is invalid; much language consists of multi-word ‘chunks’” (Lewis, 1993:vi). The approach rests on the idea that an important part of learning a language is “being able to understand and produce lexical phrases as chunks” (Selivan, 2012). Learners first internalise chunks and can then extract regularities from them, in much the same way as native-speaker children master English grammar without explicitly attending to tense rules (ibid.). A lexical syllabus is based on the principle that it is beneficial to teach the most frequent words in a language first: because these words have a wide variety of uses, students acquire flexibility in the language and cover its main points of grammar without having to memorise a large vocabulary (Baker et al., 2006:107). The frequency information for the target language can be provided by a corpus analysis.
Any description of the lexical approach requires elaboration of the term ‘chunking’. As described by Lewis (1993:121), it refers to the way in which lexical items are naturally stored in memory, i.e. redundantly: not only as individual morphemes, but also as parts of phrases and as longer memorised chunks of speech. Lexical items are retrieved from memory in pre-assembled chunks. Native speakers retain many chunks and achieve fluency by combining them. This process has direct implications for language learning, as chunking significantly reduces processing challenges: “tremendous demands are made upon students as they attempt to re-create language from scratch” (Lewis, 1993:121).
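The kind of frequency information referred to here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `frequency_list` and the sample text are hypothetical, and the tokenisation is deliberately naive (no lemmatisation, so "pay" and "paid" would be counted separately).

```python
import re
from collections import Counter

def frequency_list(text, top=5):
    """Return the `top` most frequent word forms with their raw counts."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens).most_common(top)

# Hypothetical sample text, invented for this illustration.
sample = "The board approved the dividend. The dividend will be paid in May."

for word, freq in frequency_list(sample, top=2):
    print(word, freq)
```

Applied to a specialised corpus rather than two sentences, a list of this kind gives a first indication of which items a lexical syllabus might prioritise.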
2.4.3 Rationale behind lexical approach
Clifton and Philips (2006) argue that the lexical approach can be an effective pedagogic tool for three main reasons. Firstly, the lexical approach disputes “an overemphasis on teaching single, decontextualized words”, claiming it may “hinder the development of an L2 lexicon and deny the learner the possibility of rapid and fluent use of the L2” (ibid.:75). Another argument is that “it is more effective for the learner to learn the whole and break it into its parts than to build up to the meaning of a lexical chunk by focusing on the separate parts” (ibid.:75). They argue that by promoting the more frequent chunks within the target genre or discourse, teachers will contribute significantly to developing learners’ fluency and accuracy. As errors in collocations are particularly frequent among language learners, an approach that specifically targets collocations rather than isolated vocabulary helps to address this “problematic area of language learning” (ibid.:76).
In addition to the evidence suggesting that the lexical approach is an effective means of BE instruction, corpus-driven approaches are argued to provide samples of authentic language to be used in actual instruction (McEnery and Wilson, 2001; Bowker and Pearson, 2002). Nickerson and Planken (2016:173) emphasise the importance of corpus-based studies as input for course and teaching materials development, not only because corpora can provide authentic language examples for course developers, but also because such studies have uncovered fundamental differences between Business English and general English. Clifton and Philips (2006) also endorse the combined use of lexical and CL approaches. They propose a data-driven lexical approach in which the teacher, drawing on the linguistic evidence of a corpus taken from the Target Discourse Community, can gain legitimacy and authority on the specialised language analysed. The corpus, and further analysis of its most frequently occurring lexical items, should be the fundamental resource in bridging the gap “between learners’ specialised needs and the instructor’s limited knowledge of the language of a particular discourse community” (ibid.:73).
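One way in which the more frequent chunks of a target genre could be identified is by counting recurring word sequences (n-grams). The sketch below is an illustrative assumption rather than the procedure described by Clifton and Philips (2006): the function name `frequent_chunks`, the chunk length of three words, the minimum count of two and the sample text are all hypothetical.

```python
import re
from collections import Counter

def frequent_chunks(text, n=3, min_count=2):
    """Return n-word sequences that occur at least `min_count` times."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # Slide a window of n tokens across the text to build every n-gram.
    ngrams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(" ".join(gram) for gram in ngrams)
    return [(chunk, c) for chunk, c in counts.most_common() if c >= min_count]

# Hypothetical sample text, invented for this illustration.
sample = ("As a result of higher costs, margins fell. "
          "As a result of the merger, revenue rose.")

for chunk, freq in frequent_chunks(sample):
    print(chunk, freq)
```

Recurring sequences surfaced in this way are candidates only; a teacher would still need to judge which of them are meaningful, teachable chunks.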
