We help you speak English clearly.

Clear Talk Mode and the Task-Dynamic Model of Speech Production

Why read this? It’s for people who are a little intense about getting the best out of learning.


Does the nature of the speaking task, like the nature of the movement task, change the dynamics of the system?

Have you ever run in a race? Did you sprint 50 or 100 meters or yards, or did you run long distance? Then you know that the act of running changes the tenseness of the muscles and the rate and rhythm of movement compared to walking. Those features of muscle action also differ between a sprint (a relatively short distance “as fast as you can”) and long-distance or marathon running. This idea is captured in the task-dynamic model (Kelso and Tuller, 1984; Saltzman et al., 2010) and in the hierarchical task-based control model of speech incorporating sensory feedback (Parrell et al., 2018).

For human speaking, these task-dynamic modes are called registers, speaking styles, or speaking modes. One speaking style or mode is “motherese,” the speaking pattern mothers all over the world use when talking to young children. What do you hear in “motherese”? Higher pitch, words emphasized by a rise in pitch, and emphasized sounds and movements of the articulators: lips, tongue, jaw, teeth.

Other modes in English have been investigated since the 1920s. That work was summarized by Denes and Pinson (1993), and later studies were summarized by Smiljanic and Bradlow (2009). These investigations covered the effect of the range of sound intensity (loudness) on intelligibility, the effect on intelligibility of talking over background noise, the features of English speech when the task is to talk clearly to people who have a hearing impairment, and the style called Clear Speech. Smiljanic and Bradlow also report very limited study of the Clear Speech style in other languages, or of the Clear Speech style of talking in English by non-native speakers of American English (AE). In 2000, Antonia Johnson published her dissertation, which compared a prescribed Clear Speech mode or style of speaking to conversational style in non-native speakers of English.

Briefly, this research determined that the features of clear speech in English include greater speech volume (louder), feature enhancement for consonants (e.g., making fricatives like S, SH, F, V, TH longer in duration), and feature enhancement for vowels (e.g., opening the mouth wider for the first part of the long vowels A, I, and O, extending the duration of the English short vowels A and O, and changing formant frequencies).

Since 2000, we at Clear Talk Mastery have scientifically analyzed intelligibility assessments for almost a thousand different people with 63 different languages and from 64 countries (pre-course diagnostic assessments, mid-course assessments, and end-of-course assessments).

The task for our student-learners has been to acquire clear English speech, that is, to increase the intelligibility or understandability of their American English speech. Johnson discovered in her dissertation work that non-native speakers of English needed more than the six strategies which native-born speakers of American English use when speaking clearly in order to go into the clear speaking AE mode. They required specific enunciation instruction: which features of the 23 consonants and 14 vowel sounds to enhance, precisely where to position the articulators (tongue, lips, teeth, jaw), how to make the articulator muscles stiff and tense, and which speech sounds to lengthen in duration and which to produce quickly. We called these Tactics (Tactics are the details of strategies).

Notably, the task-dynamic model was our important guide: the task was for student-learners to acquire a clear speaking mode which made AE (American English) highly understandable to all listeners (native-born English speakers and non-native speakers of English).

A sidebar: Based on how quickly learners are able to use the strategies of clear speaking derived from previous research and our own, we have concluded that all languages have a clear speaking mode. It has probably been used since childhood, at a minimum when talking in noisy environments or perhaps when talking to a person with a hearing impairment (like a grandparent), whenever the purpose of the speaking task was to be understood. For example, who hasn’t noticed a toddler, aged 18 months up to age 4, requesting an item from the mom or caretaker in a noisy room?

We found that the style of clear talking, or Clear Talking Mode, when first learned along with specific enunciation instructions, produced a predictable mode or style of speaking. Student-learners reported using high energy and high attention when first learning the Clear Talk Strategies with the added enunciation instructions (including Tactics). When speaking in a sentence, the features of this mode included pauses between the words (and syllables) as the talker processed in the brain the “plan” for the next word and quickly reviewed the accuracy of the previous word. Our instruction for AE speech sounds included, for learning purposes, hyperarticulating the consonants so they were at least double loud and either double slow (for the lengthier-duration consonants) or double fast (for the quick AE consonants). The rationale for this hyperarticulation was that the brain learns faster when the movement or action is highly salient, that is, easy to feel and hear.

Accurate vowel pronunciation for the 14 AE vowels was taught after “mastery” of consonants (about 80-90%) using the clear speaking mode. For home practice, to speed up learning, student-learners used maximum effort, maximum accuracy in positioning the articulators, hyper or very enhanced prolonging of the appropriate AE consonants, and maximum tensing of the articulator muscles. By 2017, our observations during instruction and assessments made it clear that this learning mode has unique characteristics, so we gave it a name: Workout Practice, or Workout Mode of Clear Talk. There is more to be said on this, which I will get to later.

Importantly, we emphasized that in daily life, when talking with other people, the optimal Clear Talk Mode or style of talking would be one in which the articulator muscles continued to be stiff and tense, but not maximally tense, and the pauses between words were not as lengthy. This mode or style we called Careful Clear Talk Mode. Because the muscles and the brain or central nervous system were learning a new series of patterns (procedural learning), it was impossible for student-learners to move quickly through the consonant-to-vowel-to-consonant speech gestures in a word. For example, for the word “tag,” the speaker pushes the tongue hard and quickly to the roof of the mouth for the T consonant, then pushes the tongue forward (and flat) for the AE short vowel A, then raises the back of the tongue blade to the roof of the mouth at the back of the mouth for the G consonant. Non-native speakers could not perform this series of speech gestures as quickly as an adult or child native speaker of English, because native speakers of AE had literally years of practice. In other words, it was impossible for the non-native learner to imitate the speed of a native-born adult AE speaker.

Based on much research, including our own Action Research (ongoing assessment which directed change in instruction), we adhered to the Task-Dynamic Model of human movement and speech production. It was a mode we were instructing, much as a physical trainer would instruct a runner eager to succeed in long-distance running.

Like other physical activities, speech involves the central nervous system (brain and nerves) and muscles. Just as there is the Task-Dynamic and Hierarchical Task-Dynamic model for Motor Control (motor means movement), there is also a Task-Dynamic Model for Speech Production.

For efficacious clear AE speech instruction, we used diagnostic pre-assessment, mid-course assessment, and post-course assessment. More on this later.

copyright 2023, Clear Talk Mastery, Inc. All rights reserved.

Does Exercise Strengthen Speech Muscles?

You already know that specific exercise will strengthen your skeletal muscles: for example, leg muscles for running and football (both soccer and American football), and arm muscles and the upper body for many sports, including tennis.

What about muscles for human speaking? Skeletal muscle is found throughout the body, attached to bones via tendons. It is also present in the tongue, lips, and cheeks, the muscles attached to the jaw, the cricothyroid muscle, which acts on the vocal folds for voicing, the esophagus, and the diaphragm.

“The Human Tongue Slows Down to Speak: Muscle Fibers of the Human Tongue” by Sanders et al. (2013) found that the average percentage of slow muscle fibers (MFs) in adult and 2-year-old muscle specimens was 54%, compared to 32% in newborn humans. In contrast, the tongue muscles of the rat and cat have no slow MFs (their tongue MFs are exclusively fast), and those of the macaque monkey have 28% slow MFs. In humans, slow MFs in the tongue were distributed medially and posteriorly. Special to adult human tongues were MF-type grouping, large amounts of loose connective tissue, and short MF branching.

One possible explanation for the similar percentage of slow MFs in two-year-olds and adults: by two years of age, human toddlers have been vocalizing (crying) since birth, babbling often since 6 months of age, and speaking words for up to a year, largely by employing the same muscle structures they used for feeding. But the movements and the stiffness or tenseness of these muscles, the tongue and lip muscles for example, are different for speech than for suckling from birth.

An old adage: if you want to strengthen the muscles for an activity, say bicycling, swimming, or “whatever,” then do that activity. The same holds for human speaking and, especially, the muscle fibers in the tongue.

The importance of slow MFs in the tongue for North American English speech is this: of the 25 consonants, 17 have a lengthened duration (slow consonants), and of the 14 vowel sounds, 10 have a lengthened duration. It stands to reason that production of these lengthened English speech sounds is associated with using slow MFs (also called slow twitch muscles). The lengthening or longer duration of specific English consonants and vowels can be measured via acoustic analysis (Johnson, 2000), and the need for 2nd formant change in “long vowel” production is likewise well documented. By way of contrast, consonant and vowel sounds in many other languages have significantly shorter durations than in North American English speech. (Spanish and Japanese have been described as the fastest spoken languages in the world, with all or virtually all consonants and vowels spoken quickly.)

Positioning or placement of the tongue, and tongue shape, are critical for accurate recognition of speech sounds by listeners. For example, Spanish and other spoken languages such as Mandarin produce “f” and “v” sounds with a short duration by pushing air through partially open lips. In North American English, the “f” and “v” sounds are prolonged. Physiologically, this likely uses slow MFs for the jaw (which holds steady the position of the lips, with the upper teeth resting on the lower lip) to allow a more prolonged push of air and air friction through the lips, and prolonged voicing for the “v” sound via action at the vocal folds by the cricothyroid muscle.

Two forms of muscle movement (loading the muscles) have been identified to “grow” slow MFs and fast MFs (aka slow twitch muscles and fast twitch muscles, respectively). One is lengthening the muscles and the other is isometric action (stiffening or tensing the muscles).

Lengthening movement forward of the human tongue is used for specific English consonants and vowels. For example, for accurate articulation of the North American “L” consonant sound, pushing the tongue forward to touch the lower lip and holding it there produces an “L” consonant sound which is consistently recognized by human listeners as the English “L” sound. For accurate recognition, the “L” consonant sound must have a lengthened duration. Based on our assessments, other positions of the tongue are less consistently recognized as an “L” speech sound, probably due to coarticulation effects from the preceding and following speech sounds in a word with the “L” consonant sound.

To jump forward to answering the question first posed: for acquisition of clear North American English, strengthening of the skeletal muscles, especially those of the tongue, lips, and jaw and the cricothyroid muscle acting on the vocal folds, can be accomplished through two means: accurate pronunciation of English speech sounds in words and, as we now understand, muscle exercises specific to slow MFs and fast MFs.

In the last year, our coached instruction for acquiring clear North American English speech for nonnative-born speakers of English has included vocal exercise aimed at strengthening slow MFs (slow twitch muscles) and fast MFs (fast twitch muscles) for selected consonants and vowels. Those exercises have enhanced speech intelligibility outcomes; that is, measured word and speech sound intelligibility has increased substantially for those doing our coached courses.

Next time: a description of specific exercises and speaking tactics during home practice (also called direct practice) for words and sentences to grow slow MFs and fast MFs (slow twitch muscles and fast twitch muscles, respectively).

Just so you know, from over 20 years of instruction and scientific assessment we have identified 14 dimensions for successful acquisition of clear North American English speech by nonnative-born speakers of North American English. Success is defined as optimally efficient learning which is long lasting. The Clear Talk Mastery method of instruction and learning is both an art and a science; the art is in getting to all 14 dimensions. In this article, we are describing one dimension.

Be sure to check out our coached instruction: go to www.ClearTalkMastery.com and click on “Services.” And be sure to check out the weekly self-learning program, our proven subscription called ClearTalk Weekly, with video and audio tutoring you can access 24/7 at www.subscription.cleartalkmastery.com. It works for people new to the admirable goal of making their English speech better for career and for life. It works for people who have done a coached course but want to rev up their accuracy.

©Clear Talk Mastery, Inc. 2023

Find What People Want and Need

Come to full attention when you hear the following phrases:

“I want …”

“I need…”

“My goal is to …”

“I’m having a problem with…”

“I’m looking for …”

“I’m involved in a project that …”

Speech Tip #2

English Speech Tip 2: Short Vowels a, e

In this video Dr. Antonia Johnson shows how to pronounce the American English short vowel A and short vowel E.

Let us know what sounds or words you would like help with!

Speech Tip #1

English Speech Tip 1: l, r, e “clear”

In this video Dr. Antonia Johnson shows how to pronounce the American English word “clear.” She focuses on the English l, r, and long ē sounds.

Let us know what sounds or words you would like help with!