Chapter 7 · An Adaptive Data-Driven Approach to Second Language Acquisition

The role of Deliberate Learning

In her study, Elgort (Elgort, 2011) investigates how effective the deliberate learning (DL) of words is and how it compares to acquiring words through Krashen's comprehensible input. She points out that she and other authors have already suggested that acquisition via comprehensible input is not enough to acquire L2 vocabulary. Rather, it needs to be "supplemented by deliberate form-focused learning [as it] provides an efficient and convenient way of memorizing vocabulary" (Elgort, 2011, p. 368).

Elgort also refers to Nation (Nation, 1980), saying that users are able to learn from 30 to 100 new words per hour with what Elgort calls "bilingual word pairs", which are more commonly known as "flashcards" (Elgort, 2011, p. 368).

Elgort states that with DL, retention rates of words are on average much higher than when learning from natural input. However, she adds, the quality of this vocabulary knowledge is questionable: it remains unclear whether it can reach the quality of vocabulary acquired from natural input and whether the learned words are good enough to be used in real-world language use.

Elgort acknowledges that the user's mental lexicon is probably organized in two different systems in the brain, or at least that the information is organized differently in the user's brain based on form ('formal-lexical') and meaning ('lexical-semantic'). Note that this is in line with the word-language schema distinction and indicates that a transfer from the simple word schema to the language schema might indeed not be trivial, since they are, in Elgort's words, organized differently.

Elgort suggests that 'acquisition' means that the user can 'fluently' access a word in the mental lexicon. Because, as she says, empirical evidence is lacking, she conducts a study to answer the question of how high the quality of word acquisition via DL is (Elgort, 2011, p. 368).

With the findings from her study, Elgort concludes that Krashen's learning-acquisition dichotomy is not justified, as DL is not only convenient and efficient but also effective and thus allows for acquisition. She adds that natural input is still valuable, as DL benefits from using a word in different contexts.

Compared to natural input, "[DL] increases the learning rate and improves accuracy of vocabulary knowledge, [because of this,] this method is particularly appropriate when speedy acquisition of a finite set of words is needed" (Elgort, 2011, p. 400). Here Elgort refers to Chung and Nation (Chung & Nation, 2003) as well as other authors and suggests that these characteristics are especially beneficial when the user wants to learn a limited number of words in a short period of time because he needs to use those words in a specific domain such as travel or business.

DL thrives most when it is combined with such a domain-specific word need, as the freshly and efficiently learned words are then encountered in real-world language use that shows them in rich contexts (Elgort, 2011, p. 399f).

Finally, Elgort, referring to Nation (Nation, 2007), concludes that DL should only be one part of a bigger mix of instructional designs. She specifies a mixed approach that balances four parts: meaning-focused input, meaning-focused output, language-focused learning, and fluency development (Elgort, 2011, p. 400).

Elgort on Deliberate Learning

The conclusion of Elgort's study is in line with our suggestion that Krashen underestimates the role of deliberate learning (or, as he calls it, 'learning'). The increased efficiency of deliberately learning 'bilingual word pairs' confirms our interpretation of CLT: a bilingual word pair is exactly what results from reducing the complexity of the language schema to simple word schemas that, at least for a novice, contain only the word and its translation. The prediction that learning such simplified word schemas is more efficient than starting with input, because of the difference in cognitive load, is supported by the results of Elgort's study.

Elgort's suggestion to use efficient DL to learn the needed words, and then to solidify them and expand their nuance by using and encountering them in domain-specific contexts, is also a good answer to the question of how to integrate DL: expanding and automating the language schema is sped up when it is complemented with deliberate learning and practice of the corresponding word schemas. When reading an article about, for example, an innovative new smartphone, it seems intuitive to learn new words such as 'battery', 'screen', and 'software' before or after reading the article. This also demonstrates to the user that these simple word schemas, learned more or less in isolation, have a rich context and are actually used in real-world language use.

The strong point of synergy between Deliberate learning and acquisition

This synergy of synchronizing deliberate learning and practice of word schemas (in the form of vocabulary) on the one hand with comprehensible input for the language schema on the other appears to be a key factor in fast and efficient L2 acquisition.

With regard to CLT's schema automation, it also seems likely that the more advanced the language schema becomes, the less the simple word schemas will be needed, because the user will be better equipped to deal with the complexity. At the same time, learning new word schemas should also become easier, so the reduced need and the increased ease will balance each other out. Learning 100 new word schemas when the user already knows 500 words will be harder, but also more significant, than learning 100 new word schemas when the user already knows 1000 words; at that advanced stage, learning those 100 words will probably be easier.
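The comparison between the two stages can be made concrete with a small calculation. The following is only a minimal sketch, assuming that 'more significant' is read as the relative growth of the user's lexicon; this reading, as well as the function name `relative_growth`, is an assumption made for illustration and is not part of Elgort's study or of CLT.

```python
def relative_growth(known_words: int, new_words: int) -> float:
    """Fraction by which the lexicon grows when `new_words` are added.

    Assumption: 'significance' is interpreted as relative lexicon growth,
    which is only one possible reading of the argument.
    """
    if known_words <= 0:
        raise ValueError("known_words must be positive")
    return new_words / known_words


if __name__ == "__main__":
    # The two stages used in the text: 500 and 1000 known words.
    for known in (500, 1000):
        growth = relative_growth(known, 100)
        print(f"{known} known words + 100 new: lexicon grows by {growth:.0%}")
```

Under this reading, the same 100 words enlarge the lexicon of the user at 500 words by 20%, but that of the user at 1000 words by only 10%.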

Critique on Elgort and the mix of learning activities

However, our conclusion based on Krashen and CLT appears to contradict Elgort, and by extension Nation, in her final conclusion about the balance of the four learning types. Here she contradicts Krashen, who claims that output comes by itself when the user feels ready. Even more so, she contradicts him with her claim that language-focused learning and fluency development have the same value for users as DL and comprehensible input ('meaning focused input' in her words). As we will encounter the advocacy for a mixed and balanced learning system more often later on, we will expand on this now:

While her claim for the high utility of DL and its weakening of Krashen's learning-acquisition dichotomy seems plausible, this does not transfer to rejecting Krashen's stance on other areas such as output and language- and fluency-based learning practices. The word-language schema distinction derived from CLT counters the notion that learning activities other than learning vocabulary and language acquisition are of the same value.

Under the assumption that time, motivation, and cognitive energy are limited, it is best to learn as quickly and efficiently as possible, or more precisely, to transition quickly from learning artificially simplified word schemas to learning the actual language with the language schema. Moreover, the premise for learning simple word schemas (as vocabulary) is to break down the complexity and interdependence of natural input. This means that word schemas should be as small and lean as possible. When the language schema is still non-existent or underdeveloped, the word schemas need to consist of few elements: they can contain a bilingual word pair, such as 'cat' and 'gato', but they should not contain the gender, plural forms, pronunciation, conjugations in the case of a verb, or any other elements such as perfect spelling. In other words, it is more useful for users to focus on building a lean dictionary (or 'lexicon', in Elgort's words) whose word schemas consist of very few elements. There are three reasons for this:

First, the fast transition to natural language is important, as building the word schema is only an aid to building the language schema in the first place. 100 well-learned word schemas, each consisting of many elements such as perfect spelling, conjugations, gender, and pronunciation, are a much weaker aid in comprehending natural input than 1000 learned word schemas with only two elements, the bilingual word pair.

Secondly, according to Zipf's law, the frequency of a word in a natural text is roughly inversely proportional to its rank in the corresponding frequency table (George, 1935). As a consequence, a few words at the top of the table are so frequent that they appear in many articles. Since frequency drops rapidly the further down the table a word sits, each additional word becomes less useful once a certain number of words has been learned. Unless the user first learns almost all words that exist in the language, he will always encounter new words in a natural text (or any other form of natural input). This is supported by Nation, who stated that about 9000 words are needed to comprehend 98% of a text (Nation, 2010). When the user starts reading natural input, he will learn new word schemas by encountering them repeatedly. These newly constructed word schemas will be small and have far fewer elements, because the user will first learn a word's meaning before learning other attributes such as its grammar. It seems wasteful to learn word schemas in detail if new schemas learned through input will be simple regardless.

Thirdly, if the user omits all the details when constructing word schemas and deliberately learns only the meaning, then over time, after having started to read input, he will learn all the sub-elements (such as conjugation, gender, and pronunciation) automatically through input acquisition. Only then, it seems, should the user start new types of learning exercises that directly target these elements and thus aid him in constructing and automating his language schema, in the same way that deliberate learning and practice of vocabulary is most valuable in synergy with natural input. This, however, should be initiated only by the user, similar to how Krashen suggests that output is initiated by the user. It is also in line with the claim made in CLT that the content of the learning system needs to be tailored to the user, because content that is germane for one user might be extraneous for another.
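The second reason, the diminishing usefulness of less frequent words, can be illustrated with a short calculation. The following is a minimal sketch under strong assumptions: it uses an idealized Zipf distribution in which a word's frequency is proportional to 1 / rank, and a hypothetical language of 50,000 distinct words. Real corpora deviate from this idealization, so the sketch is not meant to reproduce Nation's coverage figure. The function name `zipf_coverage` and all parameter values are chosen for illustration only.

```python
def zipf_coverage(known_words: int, vocabulary_size: int, exponent: float = 1.0) -> float:
    """Share of running text covered by the `known_words` most frequent words.

    Assumption: an idealized Zipf distribution, where the frequency of the
    word at a given rank is proportional to 1 / rank**exponent.
    """
    if not 0 <= known_words <= vocabulary_size:
        raise ValueError("known_words must lie between 0 and vocabulary_size")
    weights = [1.0 / rank ** exponent for rank in range(1, vocabulary_size + 1)]
    return sum(weights[:known_words]) / sum(weights)


if __name__ == "__main__":
    VOCABULARY_SIZE = 50_000  # hypothetical number of distinct words
    previous = 0.0
    for known in (100, 1_000, 2_000, 5_000, 10_000):
        coverage = zipf_coverage(known, VOCABULARY_SIZE)
        # "gain" is the additional coverage compared to the previous row
        print(f"{known:>6} words: coverage {coverage:.1%}, gain {coverage - previous:.1%}")
        previous = coverage
```

Under these assumptions, the first 1000 words add more coverage than the second 1000 words, which mirrors the argument that lean word schemas for many frequent words are a stronger aid than detailed schemas for a few.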

Comparing different learning activities

Finally, we will contrast more valuable with less valuable learning activities. First, we will replace Elgort's four types of learning activities with more basic descriptors that still correspond to her four types:

Explicit learning and practice of vocabulary

Reading

Listening

Speaking

Writing

Explicit learning and practice of grammar

Explicit learning and practice of pronunciation

In addition, there could also be an arbitrary number of word games that someone could come up with (such as putting words in boxes or connecting them with lines).

Learning vocabulary, for example by using bilingual word pairs in the form of flashcards, has a low cognitive load. It is efficient but also transparent, as it is easy to track what a user knows, how well he knows it, and what he does not know. It is also convenient, as Elgort notes, and it allows for Krashen's monitor, as time and pressure can be controlled by the user. For example, he can do a relaxed practice session in private when he has time, or a competitive practice session with a timer.

Reading a text is similar. The cognitive load is far higher, but the setting can be controlled and the monitor can be used. An unknown word can be looked up, and the user can take time to read slowly or to read something again. A text also provides immediate feedback on how much a user already knows.

This is in contrast to a real-world conversation or in-class listening comprehension activities. Both have a high cognitive load, but the cognitive load can be reduced. For example, the listening comprehension activity can be replaced by an internet video (e.g. a YouTube video), which the user can pause and replay at his own pace.

Activities such as writing have their use cases, but for a novice, writing inevitably leads to cognitive overload, to an even higher degree than reading, because the user has to actively understand and produce the language instead of just interpreting it. CLT claims that germane load can be high, and Hulstijn and Laufer (Hulstijn & Laufer, 2001) stated that high cognitive energy improves vocabulary recall, so for a user with a sufficiently advanced language schema, writing might be optimal. Here, the difference in complexity between writing an essay and writing very simple content such as a text message should be considered.

Practicing pronunciation appears to be the least valuable activity. If the user knows enough words to communicate rudimentarily with others in the L2, others can probably still communicate with him even if his pronunciation is at a very low level. If, in contrast, the user knows fewer words but can pronounce them very well, this will not be of much help to him, because he lacks too many words. So learning pronunciation later rather than sooner seems plausible. Additionally, pronunciation is only needed in real-world language use for communication with others. This use case is so advanced that it will only occur at the last stages of L2 acquisition, so there is almost no point in learning pronunciation early, just as one has to learn to walk before one can run. Finally, CLT predicts that once the language schema is advanced, it will likely be easier to learn and add additional elements such as pronunciation, similar to how it becomes easier to add new words.
