R. Edward Smith, Ph.D.

(This paper is a somewhat revised and amended version of my December 1980 University of Hawaii doctoral dissertation in Linguistics of the same title. My dissertation committee was Gregory Lee (Chairman), Derek Bickerton, Robert Cheng, Gordon H. Fairbanks, Stanley Starosta, and Donald M. Topping.)



The successful completion of this work would hardly have been possible without the inspiration and support of certain teachers, colleagues, and friends. My sincere appreciation goes to the members of my committee, in particular my chairman, Greg Lee, in whose classes I first became aware of Natural Phonology and the new and interesting questions it brings to the study of pronunciation. His patience, accessibility, and erudition are responsible for much of the enlightenment I have achieved in the process of destroying many of my favorite prejudices. I also owe much to David Stampe whose presence in Hawaii on the faculty of the Linguistic Institute in 1977 and as a visiting scholar in the summer of 1980 was a source of inspiration and enlightenment and who was kind enough to comment on much of the work in these pages. Thanks go also to my good friend and colleague Shozo Kurokawa whose kindness and generosity have contributed much to the realization of this work. Interviews recorded by him in both the Toohoku and Hiroshima areas of Japan were an invaluable source of data and his readiness to discuss matters having to do with Japanese pronunciation made many of my conclusions possible. For the recorded interviews of Hawaiian Japanese speakers upon which some of the data in Chapter 5 is based I wish to thank two of my former students in Japanese linguistics in the East Asian Language Department at the University of Hawaii – Carolyn Kagawa and Utako Walsh. Finally I send a fond aloha to the entire Linguistics Department for making my stay there a warm, human, and fulfilling experience.



This paper applies the theory of Natural Phonology to the study of Japanese pronunciation. In Japanese as in any language constraints are placed on the universal set of phonetically motivated processes leaving a particular subset to govern underlying and derived structure. In this work 30 important processes of Japanese are identified by form and function. Evidence comes from careful and hypoarticulate speech data and the pronunciation of loan words.

Chapter 1 presents a general introduction to the theory of Natural Phonology.

In Chapter 2 processes affecting vowels and glides are identified including an important process, syllabicity reversal, which changes downgliding diphthongs into up-gliding ones. The important role played by this process in both diachronic developments and synchronic speech processing is shown.

Chapter 3 discusses consonants and consonant processes including four processes which are important to the derivation of superficially palatalized consonants from underlying plain ones.

Chapter 4 describes the role of processes in the processing of hypoarticulate speech. Some of the processes which govern the lexicon are shown to govern derived structure as well, while others are subject to counterfeeding.

In Chapter 5 the role of processes in the nativization of loan words is explored. Two borrowing strategies are proposed – an innovating one oriented toward imitating some of the phonetic characteristics of the source language and a conservative one oriented toward preserving the canons of the target language.

Chapter 6 analyses some features of Hawaiian Japanese speakers who have resided in Hawaii since prior to 1924. The speech of immigrants from Hiroshima and Fukushima is contrasted in terms of differing constraints on certain processes. The question of whether there is phonological levelling in the direction of Hiroshima speech is asked, and on the basis of data presented there the question is answered negatively. The difference between phonetically and conventionally motivated constraints on pronunciation is exemplified and an answer to the question of whether phonetically motivated constraints are more 'persistent' in adult speech is explored. On the basis of the data presented no significant difference is found.




1.0   Introduction
1.1   Natural phonology and phonemics
1.2   Fortition and Lenition Processes
1.3   Context-free and Context-sensitive processes
1.4   Constraints on underlying representation
1.5   Processes vs Rules
1.6   Constraints on the Application of Processes
1.6.1   Ordering
1.6.2   Suppression
1.6.3   Limitation


2.1   Syllable and mora in Japanese
2.1.1   The syllable
2.1.2   The mora   Mora duration   Mora and the Kana Orthography
2.2   Vowels
2.2.1   Context-free Vowel Processes   Raising   Bleaching   Coloring
2.3   Japanese Underlying Vowels
2.3.1   Vowel Length
2.3.2   Diphthongs   On-gliding diphthongs   Off-gliding diphthongs   Types of off-gliding diphthongs   Derivation vs. Borrowing
2.3.3   Glide Epenthesis

3.0   Consonants
3.1   Japanese has twelve underlying consonants
3.1.1   Underlying /dz/
3.1.2   Syllable initial (onset) distribution
3.1.3   Syllable final (offset) distribution   Distribution of offset nasal   Distribution of offset obstruents   NP and GP analyses of geminate consonants compared
3.1.4   Obligatory Consonant Lenitions   Offset /n/ lenition   Onset consonant lenitions   /h/ is pronounced [ɸ] before /u/   /h/ is pronounced [ç] before /i, y/   /dz/ is pronounced [z] intervocalically   /g/ is pronounced [ŋ] intervocalically   Palatalization   Palatalization and Language Change
4.0   The term hypoarticulate speech
4.1   Vowel Lenitions
4.1.1   Vowel unvoicing   Complete assimilation of voiceless vowels
4.1.2   Voiced vowel assimilation   Offset /r/ nasalization
4.1.3   Syllabicity reversal
4.1.4   Vowel Coalescence
4.1.5   Vowel Shortening
4.1.6   Height Assimilation
4.2   Glide Lenitions
4.2.1   Glide fronting
4.2.2   Labial glide deletion
4.2.3   Palatal glide deletion
4.2.4   Post consonantal palatal glide deletion
4.3   Consonant lenitions
4.3.1   Obstruent voicing assimilation
4.3.2   Regressive obstruent assimilation
5.0   Modern Loan Words
5.1   Conservative vs Innovating
5.2   Processes in NP
5.3   Stampe's analysis applied to 'conservative' and 'innovating'
5.4   Two Borrowing Strategies
5.5   [ wo, we, wi ]
5.6   [ye] [C'e]
5.7   Three categories of processes
5.8   [ ti, tu, tyu ]
5.9   [ di, du, dyu ]
5.10   Conservative borrowing strategy
5.11   Innovating borrowing strategy
5.12   Correlation of strategies with ease of assimilation
6.0   Introduction
6.1   Process-governed features
6.1.1   Comparative analysis of dialects in Table 6.2
6.2   Persistence of VOWEL RETRACTION and Y DEL in Fukushima issei speech
6.3   Rule-governed features of Fukushima dialect
1.1   Suppression of Processes
2.1   Foreign Verb Stems
2.2   Japanese Lexical Vowels
2.3   Foreign Vowels and Their Japanese Cognates
2.4   'Upside-Down Words'
3.1   Underlying Consonant Inventory
3.2   Consonant Feature Matrix
3.3   Short Syllable Inventory
6.1   Number of Issei in Hawaii in 1924 with their Prefecture of Origin
6.2   Merger of H/Y Syllables in F
6.3   Variation in the Speech of a Fukushima Resident
6.4   Occurrence of Vowel Retraction and Y Deletion in the Speech of Two Fukushima Issei
6.5   W-stem Verb Paradigm of the Verb 'to buy'
6.6   Alternative Analyses of the w-stem verb paradigm
6.7   Occurrences of w-stems in the speech of M and F
P14   H LAB
P25   OFFSET /r/ NAS
P17   PAL




1.0 Introduction

The treatment in this paper is based on the theory of natural phonology (NP) first proposed by David Stampe. See for instance Stampe (1969, 1973), and Donegan and Stampe (1979). In Donegan and Stampe (1979) what is to be studied in phonology is characterized as 'the discrepancy between the sound perceived and intended and the sound pronounced.' This suggests three main areas of concern:

i) Phonological (underlying) representation – the sound perceived and intended;

ii) Phonetic (surface) representation – the sound pronounced;

iii) Processes – the phonetically motivated substitutions which "form the system of limitations standing between the intention and actualization of speech" (ibid p. 78).

The elements in phonological representation are taken from the lexicon – the list of items stored in the speaker's long-term memory. A lexical representation is an empirical hypothesis on the form in which an item is stored in the lexicon. This is the form which Stampe (with Sapir) believes is most readily brought to consciousness by the speaker. Such representation is constrained by processes which elsewhere actually govern substitutions in the language, but which in the lexicon act in the manner of morpheme structure conditions in generative phonology to disallow segments and sequences to which they might apply. According to Stampe (1973) 'These processes are mental operations performed on behalf of the physical system in speech perception and production. The purpose of these processes is to substitute for a class of sounds or sound sequences presenting a specific common difficulty to the speech capacity of the individual, an alternative class identical but lacking the difficult property.' The existence and form of processes may be inferred from substitutions which occur in child language, in diachronic development, in loan words, in speech errors, in secret languages (e.g. pig latin) and in hyper- and hypo-articulated speech.

The following example derivations are from Stampe (1973):

(i). /kænt/ → [kʰæ̃t] 'can't'

(ii). /kæt/ → [kʰæt] 'cat'


In (i) the underlying representation on the left represents what the speaker intends, and the phonetic representation on the right shows the resulting pronunciation after the application of processes of aspiration, vowel nasalization, and nasal deletion. Compare this with the derivation of 'cat' in (ii) where the surface form differs from (i) only by the oral versus the nasal vowel.(1) The contrast of oral and nasal vowels in this pair of words, though phonetically distinctive, is below the level of consciousness of the naive speaker. It is thus the underlying representations which best represent the speaker's perception of the pronunciation of these words.(2)
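The two derivations can be sketched as a sequence of processes applied to a list of segments. This is only an illustrative sketch: the underlying form of 'can't' is assumed to be /kænt/, the segment classes are limited to what these two words need, and the contexts of the three processes (initial aspiration, nasalization before a nasal, nasal deletion between a vowel and a stop) are simplified assumptions rather than Stampe's own formulations.

```python
# Hypothetical segment classes covering only the two example words.
VOWELS = {"æ"}
NASALS = {"n"}
STOPS = {"k", "t"}

def aspirate(segs):
    """Aspirate a word-initial voiceless stop (simplified context)."""
    if segs and segs[0] in STOPS:
        return [segs[0] + "\u02b0"] + segs[1:]  # add superscript h
    return list(segs)

def nasalize_vowels(segs):
    """Nasalize a vowel that is immediately followed by a nasal."""
    out = list(segs)
    for i in range(len(out) - 1):
        if out[i] in VOWELS and out[i + 1] in NASALS:
            out[i] = out[i] + "\u0303"  # add combining tilde
    return out

def delete_nasal(segs):
    """Delete a nasal between a preceding vowel and a following stop."""
    out = []
    for i, seg in enumerate(segs):
        after_vowel = i > 0 and segs[i - 1][0] in VOWELS
        before_stop = i + 1 < len(segs) and segs[i + 1] in STOPS
        if seg in NASALS and after_vowel and before_stop:
            continue
        out.append(seg)
    return out

def derive(underlying):
    """Map an underlying form to a surface form by applying the processes in order."""
    segs = list(underlying)
    for process in (aspirate, nasalize_vowels, delete_nasal):
        segs = process(segs)
    return "".join(segs)

print(derive("kænt"))  # surface form of 'can't'
print(derive("kæt"))   # surface form of 'cat'
```

Nasalization has to apply before nasal deletion in this sketch, since deletion removes the context on which nasalization depends.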

1.1 Natural phonology and phonemics

In Stampe (1973) the relationship between NP and traditional phonemics is characterized in the following way:

"The relationship between contrary processes is a systematic account of the notion 'allophone' in traditional phonemics. An allophone is a sound which does not occur in underlying (phonemic) representation, but only in superficial (phonetic) representation, due to a context sensitive 'allophonic' process. An allophonic process is any process, like vowel nasalization in English, which creates sounds which do not occur in underlying representation in the language... In natural phonology, the nonoccurrence of certain sounds (nasal vowels) in underlying representation in a language is attributed to a process (vowel denasalization) in the phonological system of the language. Thus the notion 'allophonic process' translates as any process which gives rise to sounds eliminated by a prior, more general process in the system. Vowel nasalization in English is allophonic because it gives rise to sounds which the prior context-free process of vowel denasalization eliminates – nasal vowels. And nasal vowels are therefore 'allophones of' nonnasal vowels in English." (p. 25).

1.2 Fortition and Lenition Processes

In Donegan and Stampe (1979) segmental processes are either 'fortitions' or 'lenitions.' Fortitions – which include dissimilations, diphthongizations, syllabications, and epentheses – are designed to make pronunciations more perceptible. Lenitions – which include assimilations, monophthongizations, desyllabications, reductions, and deletions – make segments and sequences of segments easier to pronounce. The nasalization of the vowel in 'can't,' a lenition, is a response to the difficulty involved in pronouncing a vowel with raised velum when a sound with lowered velum follows.

1.3 Context-free and Context-sensitive processes

Lenitions are generally context-sensitive or syntagmatic processes. Fortitions are generally context-free or paradigmatic processes and have the effect of maximizing the phonetic properties of individual segments, often heightening their differences with neighboring segments. Thus the context-free vowel denasalization process in English maximizes the vocalic quality of the underlying vowel in 'can't.' It also accounts for the fact that English speakers perceive vowels as nonnasal even when they superficially aren't. In addition to governing the lexicon, vowel denasalization in English also applies to foreign words such as French [mamã], rendering it [mama] without nasalization. On the other hand French [monami] without a nasalized first vowel is rendered [mõnami] by English speakers (examples from Stampe 1973). The nasal vowel in French [mamã] cannot survive in English because there is no post-vocalic nasal context in which it can be derived by processes of English. [kʰæ̃t] also lacks a post-vocalic nasal, but its absence is due to the process which deletes nasals between a preceding vowel and a following stop in English. No such process exists for deleting post-vocalic nasals not followed by a stop. In the case of French [monami] the first vowel is followed by a post-vocalic nasal, thus making vowel nasalization derivable in English.

1.4 Constraints on underlying representation

In NP there are circumstances in which underlying representations may be deeper than 'phonemic'. The case of syllable final obstruent devoicing in German furnishes an example. In German the words 'organization' and 'many colored' are homophonously [bunt] in their uninflected form, but their inflected alternants are [bunde] and [bunte] respectively. The neutralization of [d,t] in the uninflected form is due to obligatory application of the syllable final obstruent devoicing process. Under such circumstances, where the lexical representation is relatable to the surface by processes, a 'morphophonemic' underlying representation is justified, and 'organization' is /bund/ (inflected /bunde/) and 'many-colored' is /bunt/ (inflected /bunte/).
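The German case can be sketched in a few lines. This is a minimal illustration, assuming that syllable boundaries are supplied by hand as periods and that the voiced-voiceless pairs listed are sufficient for the example; neither assumption comes from the sources cited.

```python
# Hypothetical voiced -> voiceless obstruent pairs, enough for the example.
DEVOICE = {"b": "p", "d": "t", "g": "k", "z": "s", "v": "f"}

def final_devoicing(form):
    """Devoice an obstruent at the end of each syllable.

    Syllable boundaries are marked with '.' in the input.
    """
    syllables = []
    for syllable in form.split("."):
        if syllable and syllable[-1] in DEVOICE:
            syllable = syllable[:-1] + DEVOICE[syllable[-1]]
        syllables.append(syllable)
    return ".".join(syllables)

for underlying in ("bund", "bun.de", "bunt", "bun.te"):
    print(underlying, "->", final_devoicing(underlying))
```

The uninflected forms /bund/ and /bunt/ come out alike, while the inflected forms stay distinct because /d/ there is a syllable onset and lies outside the scope of the process.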

There is one other case where non-allophonic alternants on the surface may be represented by a single underlying representation, viz. surface alternants that are derivable by optional application of processes, e.g. careful-casual alternants like 'hands' /hæ̃ndz/ → [hæ̃ndz] ~ [hæ̃nz], due to the optional application of [d] → Ø / n_z.

Thus in NP underlying segments are identical with their surface representations except that 1) 'allophonic' features are barred from underlying representation, and 2) 'morphophonemic' representations are legal where required by alternation provided they are mapped to the surface by obligatory or optional processes of the language.

1.5 Processes vs Rules

Stampe makes a sharp distinction between processes, which have synchronic phonetic motivation, and rules, which define conventional substitutions without synchronic phonetic motivation. These are the neo-phonetic and paleophonetic alternations, respectively, of Baudouin de Courtenay (1895). The so-called 'velar softening' rule which alternates e.g. [k/s] in 'electri[k]electri[s]ity' is a rule not a process. Rules can be ignored if a speaker so chooses without creating difficulties in pronunciation e.g. 'electri[k]ity' is quite pronounceable by speakers of English as are 'persnickity' and 'lickity split.' Conventional substitutions are typically obligatory however, wherever they apply. (English speakers don't say 'electri[k]ity' even though capable of it). Rules play no role in the processing of forms resulting from speech errors or 'tongue slips.' In a spoonerism, 'Cynical guys' becomes [ǵinikl saiz] not [dz̆inikl kai̯z] i.e. velar softening does not apply to /g/ before /i/, but fronting, a process, does. As Stampe (1973) puts it: 'Phonological constraints which are learned (i.e. rules RES) do not govern our phonetic behavior (p. 44).' Nor, in the example, does [s] in 'cynical' revert to /k/ before non /i/. Or to pick a better example (Donegan and Stampe 1979 p. 166) in the secret language Ob, in which the syllable /ab/ is inserted before every vowel, 'electricity' is pronounced [abilabɛktrabɪsabɪtabɪ], not *[--k--] even though the vowel of /ab/ would block velar softening of a hypothetical underlying /k/. This shows that the relationship of [s] in 'electricity' to /k/ in 'electric' is not represented in the mind as an underlying /k/ in 'electricity.' And in general, Donegan and Stampe conclude, such 'systematic phonemic' representations – underlying representations related to surface representation through the application of rules rather than processes – are not confirmed by empirical evidence. 
Hence, they conclude, rules play no role in either the productive or perceptual aspects of speech processing.

There is the possibility then that rules play only an interpretive role in the grammar. This in fact seems implicit in some of the non-NP writings which posit a less abstract lexical representation than is characteristic of generative phonology. Leben and Robinson (1977) suggest, for example, that if the level of lexical representation is phonemic then 'allophonic' rules might relate this to the surface while 'morphological' rules would apply interpretively to determine if lexical items shared a morpheme.

1.6 Constraints on the Application of Processes

1.6.1 Ordering

Processes which govern lexical representation are context-free where they determine the underlying segmental inventory and context-sensitive where they constrain underlying sequences. Where the effects of context-free and context-sensitive processes contradict each other, as in the case of vowel denasalization and nasalization, the context-free process is ordered first and governs the lexicon, rendering all underlying vowels nonnasal. The context-sensitive process governs surface structure and provides for the nasalization of vowels in a nasal context.

1.6.2 Suppression

Though the set of innate processes is universal, constraints on the application of processes are language particular and take the form of suppression, limitation, or ordering of the processes. In achieving mature speech the child is thus faced not with the task of learning processes but of learning to constrain their application in appropriate ways. Thus e.g. 'wabbit' will not become 'rabbit' until the English-speaking child has learned to suppress the context-free process r → w which substitutes simpler /w/ for more complex /r/. Since suppressing the application of a process takes effort, a frequent cause of language change is the failure of children acquiring language to suppress a process to the extent that mature speakers do.

The suppression of processes affects the phonological and phonetic structure of the language in various ways. In the case of vowel denasalization and nasalization four logical possibilities exist. These are shown below along with a representative language of each type and the phonological and phonetic consequences:

Table 1.1

V-denas        V-nas          Language    Effect
Unsuppressed   Unsuppressed   English     /v/ → [v, ṽ]
Suppressed     Unsuppressed   Hindi       /v/ → [v]; /ṽ/ → [ṽ]
Suppressed     Suppressed     French      /v/ → [v]; /ṽ/ → [ṽ]
Unsuppressed   Suppressed     Hawaiian    /v/ → [v]

1.6.3 Limitation

In English there is a process that deletes [h] before sonorants – affecting the sequences [hn, hl, hr, hw, hy, hV]. Historically, the gradually more general application of [h]-deletion proceeded along a parameter based on the degree of sonority of the following segment. Starting before the least sonorous segment, [h]-deletion affected e.g. OE [hnutu] 'nut', [hlaxan] 'laugh', [hring] 'ring', and is presently affecting e.g. Mod E [hweil] 'whale', [hjuː] 'hue', [hau̯s] 'house' in various dialects. There is an implicational order involved such that speakers who delete [h] before [j] (pronouncing 'hue' and 'you' alike) also do so before [w] (pronouncing 'whale' and 'wail' alike). And those who delete [h] before vowels (e.g. Cockney speakers) do so before all other sonorants as well. The fact of the implicational relationship means we are dealing with a single process. Where [h] is invariably present that part of the process is suppressed. Where [h] is invariably absent the relevant sub-process of [h]-deletion governs the lexicon, and where [h] is variably absent the relevant sub-processes govern only surface structure.
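The implicational relationship can be stated as a simple check on the set of contexts in which a speaker deletes [h]. This is a sketch only: the ordering of contexts is taken from the sequence listed above ([hn, hl, hr, hw, hy, hV], with "j" standing for the palatal glide), and the relative order of n, l, and r within it is an assumption, since the text asserts the implication only for [w], [j], and vowels.

```python
# Contexts for [h]-deletion, ordered from least to most sonorous
# following segment ("V" stands for any vowel). The n < l < r order
# is assumed from the order of the listing in the text.
HIERARCHY = ["n", "l", "r", "w", "j", "V"]

def is_implicationally_consistent(deleting_contexts):
    """Return True if deletion in a context implies deletion in every
    context earlier in HIERARCHY, i.e. there are no gaps."""
    flags = [context in deleting_contexts for context in HIERARCHY]
    # A later context may delete only if the one before it does too.
    return all(earlier or not later for earlier, later in zip(flags, flags[1:]))

# A speaker who merges 'whale' and 'wail' but keeps [h] in 'hue' and 'house'.
print(is_implicationally_consistent({"n", "l", "r", "w"}))
# A hypothetical speaker who deletes before [j] but not before [w].
print(is_implicationally_consistent({"n", "l", "r", "j"}))
```

A pattern that passes the check corresponds to a single cut-off point on the sonority parameter, which is what treating [h]-deletion as one process predicts.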


(1) There are dialects where the pronunciations of (i) and (ii) differ in additional ways. Stampe has pointed out to me that some speakers have breaking i.e. [æ] → [æe̯] (southern U.S.) or [æ] → [aə̯] ~ [æe̯] (northern U.S.) before nasals and therefore have a distinct vowel in 'can't.'
(2) A source of native speaker awareness of /n/ in [kʰæ̃t] is the presence of /n/ in 'can'. It is evidently there because speakers who experience no difficulty in pronouncing [kʰæt] backwards [tʰæk] will when asked to pronounce [kʰæ̃t] backwards attempt [tnæk] even though it is unpronounceable in English. Evidence for the lack of phonetic [n] in [kʰæ̃t] is the fact that the [t] is often flapped in 'can't I' [kʰæ̃ɾae̯] just as it is in 'cat eye' [kʰæɾae̯]. The presence of a preceding nasal would block flapping.




2.1 Syllable and mora in Japanese

2.1.1 The syllable

Because the syllable is the basic unit of segmental organization in all languages, many patterns of segmental distribution and evolution may best be understood within the context of syllable structure. The syllable may be characterized as an organization of segments in a sonority pattern. An utterance is a series of syllables – sonority peaks surrounded by less sonorous satellite segments. Vowels, the most sonorous segments, typically serve as peaks, with obstruents, the least sonorous segments, as satellites. Liquids, nasals, and glides occupy intermediate positions between vowels and obstruents. Satellite segments which precede the syllable peak comprise the syllable onset, while those which follow the peak comprise the syllable offset. Perceptually the ideal syllable is characterized by maximum sonority at the peak and minimum sonority in the satellites. Coupled with the tendency toward open syllables (syllables with no offset) this may explain the universal presence in infant speech of combinations of the type [papa], [tata], [dada], [mama].

Japanese syllable structure approaches the ideal of vocalic peaks of sonority separated by relatively non-sonorous consonant onsets. There are more complicated canonical shapes as well (cf. Chapter 3) but the classic CVCV pattern prevails overall. Such a pattern offers maximum perceptual clarity, but lenition processes diminish this clarity in favor of ease of articulation. The tension between these two teleologies explains many of the contradictory developments in synchronic speech processing as well as in diachronic phonology. For example, all syllables have vocalic peaks in lexical representation, but the application of vowel weakening processes (cf. Chapter 4) sometimes results in syllables with obstruent or nasal peaks on the surface e.g. / → [ ] 'nice', /uma/ → [] 'horse.'

Syllable structure influences the application of processes in various ways. For instance, other things being equal, onset segments are less likely to undergo lenition processing than offset segments, e.g. offset /n/ is subject to a number of assimilatory processes to which onset /n/ is immune:

(1) /tan.i/ → [tãã.i] 'unit'
(2) / → [] 'valley'

In (1) /n/ weakens under the influence of an adjoining vowel while in (2) it does not. (1) also illustrates the fact that the application of processes is often constrained by the presence of syllable boundaries. In Japanese the offset nasal segment will extend nasalization to the preceding peak but not to a segment of the following syllable.

2.1.2 The mora

Most treatments of Japanese phonology fail to distinguish between syllable and mora. Hattori (1960 p. 247ff) discusses syllable structure, and McCawley (1968 p. 58ff) and Martin (1967 p. 246 and 1975 p. I-1 ff) clearly contrast the two. But all too frequently the terms 'syllable' and 'mora' are used interchangeably in reference to what is in fact the mora.

The mora concept arises in languages which make a distinction between long and short syllables, e.g. Japanese, Classical Latin, Classical Greek. A mora is a rhythmical unit of which short syllables have one and long syllables have two. The extra mora in long syllables is due to the presence of an offset segment. Thus in Japanese /a/, /ka/, /kya/ are short syllables and /aa/, /kaa/, /kyaa/, /kyan/ are long syllables. McCawley (1968), using terminology from Troubetskoy, has dubbed Japanese a 'mora counting syllable language,' by which he means that the syllable is the prosodic unit or bearer of pitch accent (i.e. there is no distinction between syllables accented on the first mora and syllables accented on the second mora), and the mora is the unit of phonological distance (i.e. accent rules are of the type 'place the accent on the antepenultimate mora'). The example McCawley uses to show that both the syllable and the mora are necessary is the loan word /erebeetaa/ 'elevator' which in accordance with the above rule of accentuation of Japanese would be accented /erebeétaa/, but which in fact is accented /erebéetaa/ because long syllables are never accented on the second mora. The rule then must be worded 'place the accent on the syllable containing the antepenultimate mora.'
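The reworded rule can be sketched as follows. The representation is an assumption made for illustration: each syllable is a pair of onset-plus-peak and offset, a syllable with an offset counts as two moras, and words with fewer than three moras are left outside the sketch because the rule as quoted does not cover them.

```python
def mora_count(syllable):
    """A syllable with an offset segment is long (two moras), otherwise short."""
    _body, offset = syllable
    return 2 if offset else 1

def accented_syllable(syllables):
    """Return the index of the syllable containing the antepenultimate mora."""
    total = sum(mora_count(s) for s in syllables)
    if total < 3:
        raise ValueError("fewer than three moras: not covered by the rule as quoted")
    target = total - 2  # 1-based position of the antepenultimate mora
    seen = 0
    for index, syllable in enumerate(syllables):
        seen += mora_count(syllable)
        if seen >= target:
            return index

# /e.re.bee.taa/ 'elevator': six moras, the antepenultimate one being the
# second mora of /bee/; the accent is assigned to that syllable as a whole.
elevator = [("e", ""), ("re", ""), ("be", "e"), ("ta", "a")]
print(accented_syllable(elevator))  # index of /bee/
```

Distance is counted in moras, but the result is a syllable index, which mirrors the division of labor between the two units in McCawley's characterization.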

There is other evidence that Japanese is a 'syllable language.' Ashworth and Lincoln (1973) cite verb stems derived from borrowed words which are shortened forms of the originals to which a native inflectional suffix is attached, e.g.

Table 2.1

Foreign Verb Stems

    Original            Verb stem   Non-past
1.  /a.dzi.tee.syon/    /adzir-/    /adziru/    'to agitate'
2.  /                                           'to demonstrate'
3.  /                                           'to harmonize'
4.  /da.ben/(1)         /daber-/    /daberu/    'to chat idly'

To state the rule for deriving the verb stem from the original in 2, 3, and 4 it will be necessary to recognize not only the syllable but the difference between long and short syllables since the stem apparently must begin with two short syllables followed by /r/ even if the syllables in question were originally long. Thus the verb stem is formed on the first two syllables (not moras) of the original minus any offset segments.

Mora duration

It is usually claimed (e.g. Han 1960) that in Japanese the moras in an utterance are of more or less equal duration, but spectrographic evidence (Wang 1968) shows that on the contrary the duration of moras differs widely depending on segmental constituency and other factors. Native speakers are aware of the number of moras in an utterance (or a line of poetry) and it is apparently length based on this quantity that they respond to rather than length measured in centiseconds. To the extent that the latter plays a role it is in the intention and perception of speech rather than its actuation.

Mora and the Kana Orthography

The confusion between the syllable and the mora in Japanese may derive to some extent from the 'syllabic' kana orthography with which the language is written. At its inception the kana orthography was, indeed, syllabic since there was apparently no distinction between long and short syllables in Old Japanese (OJ). Subsequent developments introduced such a distinction however, and today the kana orthography would be better termed 'moraic.'

2.2 Vowels

In her study of the natural phonology of vowels Donegan (1978) distinguishes three basic identifying vowel features – palatality, labiality, and sonority.(2) A cover term for the features palatality and labiality is color (timbre), and vowels with the palatal and/or labial feature are chromatic vowels. The relationship between color and sonority is such that vowels with a high degree of one will have a low degree of the other.


2.2.1 Context-free Vowel Processes

Donegan also proposes a number of universal context-free processes which account for the commonly occurring vowel substitutions in natural languages. Included are the following apparently contradictory pairs of processes: Raising – Lowering; Bleaching (depalatalization or delabialization) – Coloring (palatalization or labialization); and Tensing – Laxing. Raising, Coloring, and Tensing increase color and Lowering, Bleaching and Laxing increase sonority. Although the pairs of processes have contradictory teleologies, the fact is that only one member of the pair tends to apply under given phonetic conditions.

Raising

Raising tends to apply to chromatic vowels, especially lower ones, while Lowering tends to apply to achromatic vowels, especially higher ones:

[+ PAL]   [- PAL]   [- PAL]
[- LAB]   [- LAB]   [+ LAB]
   i         ɨ         u
   e         ʌ         o
   æ         a         ɔ

Bleaching

Bleaching is more likely to apply to lower chromatic vowels:

[+ PAL]         [- PAL]         [- PAL]
[- LAB]         [- LAB]         [+ LAB]
i →→            ɨ            ←← u
e →→→→          ʌ          ←←←← o
æ →→→→→→        a        ←←←←←← ɔ

Coloring

Coloring tends to apply to higher achromatic vowels:

[+ PAL]         [- PAL]         [- PAL]
[- LAB]         [- LAB]         [+ LAB]
i ←←←←←←        ɨ        →→→→→→ u
e ←←←←          ʌ          →→→→ o
æ ←←            a            →→ ɔ

A characteristic of any process is that the application of its subprocesses reflects a strict implicational hierarchy such that for instance Coloring never applies to a lower achromatic unless it applies to higher ones (e.g. ʌ → e implies ɨ → i), or Lowering never applies to a mid achromatic vowel unless it applies to a high one as well (e.g. ʌ → a implies ɨ → ʌ). Since context-free processes in NP act in the manner of morpheme structure rules in generative phonology, the hierarchical conditions on the application of processes governing the lexicon account for the fact that certain vowels (e.g. /ɨ, æ, ɔ/) are frequently absent from underlying inventories.

2.3 Japanese Underlying Vowels

Japanese has the following five lexical vowels:

Table 2.2

Japanese Lexical Vowels

            [+ PAL]   [- PAL]   [- PAL]
            [- LAB]   [- LAB]   [+ LAB]
[+ HIGH]       i                   u
[- HIGH]
[- LOW]        e                   o
[+ LOW]                  a

From the absence of low chromatic vowels we can infer that Raising or Bleaching govern the lexicon. Likewise, the absence of the less sonorous achromatics implies that Lowering or Coloring also govern the lexicon. There is diachronic evidence for the raising of (long) low chromatic vowels. OJ [au̯] underwent monophthongization to [ɔɔ̯] (Nishihara 1970). We can assume [ai̯] > [ææ̯] as well since this latter remains in certain dialects (e.g. Hiroshima prefecture according to Itoo (1979)), but in the standard language these monophthongs were raised to [oo̯] and [ee̯] respectively.

Loan phonology furnishes further evidence of vowel processes that govern the lexicon. Lovins (1973) cites a number of correspondences between foreign vowels and their Japanese cognates, e.g.

Table 2.3

Foreign Vowels and Their Japanese Cognates

/ɔ/   o   Raising     /ooto/ 'auto'
/ʌ/   a   Lowering    /rabu/ 'love'
/æ/   a   Bleaching   /batto/ 'bat'
/ə/   a   Lowering    /saakasu/ 'circus'

(See Lovins ibid. p. 72ff for discussion of exceptions due to orthographic influence and other reasons.)
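The first three correspondences in Table 2.3 can be read as single steps on the three-by-three vowel grid used in the charts of §2.2.1. The sketch below makes that reading explicit. The grid layout follows those charts, while the function names and the one-step formulation are illustrative assumptions; /ə/ is left out because it has no cell on this grid.

```python
# Rows: high, mid, low. Columns: palatal, achromatic, labial.
GRID = [
    ["i", "ɨ", "u"],
    ["e", "ʌ", "o"],
    ["æ", "a", "ɔ"],
]
JAPANESE_VOWELS = {"i", "e", "a", "o", "u"}

def locate(vowel):
    """Return (row, column) of a vowel on the grid."""
    for row_index, row in enumerate(GRID):
        if vowel in row:
            return row_index, row.index(vowel)
    raise ValueError(f"{vowel!r} is not on the grid")

def raising(vowel):
    """Move one step up in the same column."""
    row, col = locate(vowel)
    return GRID[max(row - 1, 0)][col]

def lowering(vowel):
    """Move one step down in the same column."""
    row, col = locate(vowel)
    return GRID[min(row + 1, len(GRID) - 1)][col]

def bleaching(vowel):
    """Remove color: move to the achromatic column at the same height."""
    row, _col = locate(vowel)
    return GRID[row][1]

for foreign, process in (("ɔ", raising), ("ʌ", lowering), ("æ", bleaching)):
    result = process(foreign)
    print(foreign, "->", result, result in JAPANESE_VOWELS)
```

Which process applies to a given foreign vowel is taken from the table, not computed; the sketch only checks that each listed process lands on a vowel of the Japanese inventory.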

2.3.1 Vowel length

As is typical of mora-counting languages, vowel length in Japanese is distinctive, and all five vowels occur in both long and short varieties. Long vowels will be written as a vowel plus homorganic glide sequence (VV̯), the vowel representing the syllable peak and the homorganic glide the offset e.g.

/ki/ 'tree'   –   /kii̯/ 'strange'
/me/ 'eye'   –   /mee̯/ 'niece'
/obasan/ 'aunt'   –   /obaa̯san/ 'grandmother'
/ku/ 'nine'  –   /kuu̯/ 'emptiness'
/ko/ 'child'  –   /koo̯/ 'this way'

Although for convenience long vowels are here written as two segments, there is evidence, both diachronic and synchronic, that they are in fact unitary. Historically, long vowels have not undergone diphthongization as might be expected if they were structurally bipartite (cf. Donegan 1978, p. 56). What changes they have undergone, e.g. raising [ææ̯] > [ee̯], have affected the long vowel as a whole. There is also evidence from synchronic speech processing. Martin (1974) cites a game called sakasa kotoba 'upside-down words' (also the source of a special language referred to as yakuza kotoba 'gangster argot') which requires that the first and last portions of words be reversed, e.g. /kore/ → /reko/. Table 2.4 contains examples involving long vowels and other syllables:

Table 2.4

'Upside-down Words'

1. /tai̯ra/ 'Proper Name'   →   /rai̯ta/
2. /kappu/ 'cup'           →   /pukka/
3. /ringo/ 'apple'         →   /gonri/ ~ /gorin/
4. /gindza/ 'Ginza'        →   /dzangi/ ~ /dzagin/
5. /byoo̯bu/ 'screen'       →   /bubyoo̯/
6. /tii̯dzu/ 'cheese'       →   /dzutii̯/

Where long syllables are involved the syllable is divided if it contains a diphthong (example 1) or an offset nasal (examples 3 and 4). However, it is never divided if it contains a long vowel (examples 5 and 6).
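The division pattern just described can be sketched in code. The following Python fragment is an illustration only, under simplifying assumptions: each word is taken to consist of a long syllable (an onset-plus-peak portion and an offset) followed by a short syllable; the function name `upside_down` and the category labels are hypothetical; and the treatment of the offset obstruent in example 2 (the offset stays in place and geminates with the onset that comes to follow it) is inferred from /kappu/ → /pukka/ rather than stated in the text.

```python
NONSYL = "\u032f"  # combining mark written under a non-syllabic vowel (the offset glide)


def upside_down(first, offset, last, kind):
    """Return the possible reversals of a word shaped first+offset . last.

    first, last: onset-plus-peak portions such as "ta" or "ri"
    offset:      offset segment of the long first syllable
    kind:        "long", "diphthong", "nasal" or "obstruent" (hypothetical labels)
    """
    if kind == "long":
        # A long vowel behaves as a unit: the long syllable moves as a whole.
        return [last + first + offset]
    if kind == "obstruent":
        # Assumption: the offset obstruent keeps its position and geminates
        # with the onset that now follows it.
        return [last + first[0] + first]
    divided = last + offset + first  # the offset stays put, the rest is swapped
    if kind == "nasal":
        # Table 2.4 lists both the divided and the undivided outcome.
        return [divided, last + first + offset]
    return [divided]  # diphthong: only the divided outcome is listed


print(upside_down("ri", "n", "go", "nasal"))
print(upside_down("byo", "o" + NONSYL, "bu", "long"))
```

The point of the sketch is the asymmetry: the long-vowel branch never produces a divided form, whereas the diphthong and nasal branches do.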

Long vowels may also arise on the surface in lenited speech due to the application of the process Vowel Coalescence (cf. §4.1.4) to a sequence of identical vowels in adjacent syllables, e.g. [suu.ri] 'vinegar vendor' (< /su/ 'vinegar' + /uri/ 'sell').

2.3.2 Diphthongs

A diphthong is a syllable nucleus with two vowel segments only one of which is syllabic. The non-syllabic may come from an adjoining consonant which is weakened, e.g. z > y, b > w; or from an adjoining vowel which loses its syllabicity, e.g. i > y, u > w. Diphthongs may also arise from simple vowels (Donegan 1978 p. 111) but such a development seems not to have occurred in Japanese. In on-gliding diphthongs the non-syllabic precedes the syllabic, e.g. [ya, wa]. In off-gliding diphthongs the non-syllabic follows the syllabic, e.g. [ai̯, au̯]. In mora-counting languages such as Japanese an off-gliding diphthong constitutes a long (i.e. two-mora) syllable.

On-gliding diphthongs

OJ had the following on-gliding diphthongs (Martin 1976):

ye, ya, yo, yu
wi, we, wa, wo

Due to the gradually more general application of processes eliminating prevocalic glides SJ now has only:

ya, yo, yu

and [wa] alternates with [a] in hypoarticulate speech (cf. Chapter 4). The loss of the labial glide before vowels occurred in the following historical order: earliest before the palatal vowels /i, e/ (where it apparently shifted to /y/, which was itself subsequently lost (Martin 1975)); next before /o/ (this is a recent development); and last before /a/, where it is now optionally fronted or deleted depending on the environment. The following two processes account for these developments:(3)

(P1)   GLIDE FRONTING:   W → Y   /   ___ [V, -lab, !higher]

Before /i, e/ this process governs the lexicon, assuring */wi, we/. Before /a/ and preceded by a palatal vowel it optionally governs surface structure, accounting for [iwa, ewa] → [iya, eya], e.g. /i wa sinai/ → [i ya sinai] 'it is not found', /kore wa/ → [kore ya] 'this (topic)'. The application of this subprocess of P1 is most frequent where the topic marker 'wa' is involved.

(P2)   PRE VOC LAB GLIDE DEL:   W → Ø   /   ___ [V, -pal, !higher]

Before /u,o/ this process governs the lexicon assuring */wu,wo/. Before /a/ application is constrained as follows:

(i) Preceded by a non-palatal vowel it optionally governs surface structure, e.g. /boku wa/ → [boku a] 'I (topic)', /soko wa/ → [soko a] 'there (topic)', /siawase/ → [s̆iaase] 'happiness'.

(ii) Preceded by # (word boundary) it usually does not apply, e.g. /waza waza/ → *[aza aza]. (Note however [atas̆i], a female speech version of [watas̆i] 'I'.)

The loss of /ye/ was due to the palatal counterpart of LAB GLIDE DEL and is accounted for by the following process:

(P3)   PRE VOC PAL GLIDE DEL:   Y → Ø   /   ___ [V, +pal, !higher]

(P1), (P2), and (P3) together assure */yi, ye, wi, we, wu, wo/. The effects of (P2) and (P3) can be seen in loan phonology. The following data is from Lovins (1973 p. 97):

/iisuto/   'yeast'
/uuru/   'wool'
/ikooru/  'equal'

In Chapter 5 alternate forms of these loan words are discussed.
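As a rough illustration of how (P1), (P2), and (P3) jointly filter glide-plus-vowel sequences in the lexicon, the following Python sketch applies them in their lexicon-governing form only (the optional surface applications before /a/ are left out). The function name `lexical_glide_vowel` and the two vowel sets are illustrative assumptions, not part of the formal statement of the processes.

```python
PALATAL_VOWELS = {"i", "e"}  # the [+pal] vowels
LABIAL_VOWELS = {"u", "o"}   # the [-pal] vowels before which /w/ is barred lexically


def lexical_glide_vowel(glide, vowel):
    """Apply (P1)-(P3) as lexical constraints to one glide + vowel sequence."""
    # (P1) GLIDE FRONTING: w -> y before the palatal vowels
    if glide == "w" and vowel in PALATAL_VOWELS:
        glide = "y"
    # (P3) PRE VOC PAL GLIDE DEL: y -> zero before the palatal vowels
    if glide == "y" and vowel in PALATAL_VOWELS:
        glide = ""
    # (P2) PRE VOC LAB GLIDE DEL: w -> zero before /u, o/
    if glide == "w" and vowel in LABIAL_VOWELS:
        glide = ""
    return glide + vowel


# Sequences that pass through unchanged are the admissible lexical ones.
surviving = [g + v for g in "yw" for v in "ieaou"
             if lexical_glide_vowel(g, v) == g + v]
print(surviving)
```

Under these assumptions only /ya, yo, yu, wa/ survive, and the initial sequences of 'yeast' and 'wool' come out glideless, as in the loan forms above.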

The 'mirror image' of PRE VOC PAL GLIDE DEL applies optionally in hypoarticulate speech to delete [y] after palatal vowels before non-palatals, as in [iyoo̯] → [ioo̯] 'let's stay' (cf. §4.2.3).

Although there is diachronic evidence for Glide Fronting, at least before /e/, synchronic evidence is difficult to come by due to */yi, ye/. However there are some hypoarticulated forms which show its effects, e.g.

/inoue/ → [inouwe] → [inouye] 'Proper Name'
/nihon e/ → [nihõõ e] → [nihõõ we] → [nihõõ ye] 'to Japan'

The /w/ in these forms arises from application of Glide Epenthesis, a fortition process described in §2.3.3. It is not unusual for sequences which are lexically inadmissible (e.g. [we] and [ye] in these examples) to be created on the surface in hypoarticulate speech processing. Stampe (1973) gives the example of the hypoarticulate form [bniθ] 'beneath' which violates the lexical constraint in English against initial stop-nasal clusters.

Off-gliding diphthongs

There is general agreement that the off-gliding diphthongs in SJ are [ai̯, ei̯, oi̯, ui̯] (Hattori 1960, Martin 1975, McCawley 1968) with Martin adding [au̯] as a possibility in loan words. But the phonetic basis for these assertions is not clear. Martin (1967) gives only 'morphophonemic' status to bi-moric syllables saying they have 'nothing to do with any assumed physiological manifestations (p. 247).' Later however (Martin 1975) he says the notion of the bi-moric syllable is based on 'auditory impression and on accent behavior; it is assumed that experimental phonetic investigations will prove the existence of an articulatory (motor-production) unit that corresponds to these ... syllables (p. I-2).' If we assume bi-moric syllables with two vowel qualities (i.e. off-gliding diphthongs) then one of the vowels must be [+ syllabic] (the peak) and one [- syllabic] (the offset). It follows that the well-known constraint that only the first mora of a long syllable can bear the accent (McCawley 1968) is due to the fact that accent must fall on the syllable peak. In [tadáI̯ma] 'right now' (< [táda] 'just' + [íma] 'now') the accent shift is a prerequisite to considering to be a diphthong what was originally two separate vowels. In /siíru/ 'to force' the location of the accent shows the /i/'s are not tautosyllabic. (Cf. Martin (1975 p. 24-5) for a relevant discussion of verb accentuation.)

A criterion for assigning two vowel qualities to the same syllable might be their joint participation in processes such as monophthongization. Historically the substitutions [ai̯, ei̯, oi̯] > [ee̯] have occurred in at least some dialects, and the same may be said for [au̯, ao̯, ou̯] > [oo̯] and for [eu̯] > [yoo̯]. In the discussion below vowel sequences which form the input to processes of this sort are assumed to be off-gliding diphthongs.

In the history of Japanese, intramorphemic vowel sequences arose from diachronic developments of the type CVCV > CVV where the elided consonant included, among others, the palatal and labial glides discussed above under on-gliding diphthongs. The resulting vowel sequences plus subsequent diachronic and synchronic vowel sandhi developments, if any, are indicated below. It will be seen that the same processes are involved in diachronic and synchronic substitutions, the different effects being due to limitations on the application of the process in one case but not the other. (All diachronic evidence cited in 1-20 below is from Martin 1976.)

/Vi̯/ Sequences:

1. /ei̯/ virtually non-occurring in ModJ due to /ei̯/ > /ee̯/ which governs the lexicon for most speakers. e.g. /sensei̯/ > /sensee̯/ 'teacher'; also applies in loan phonology (cf Chapter 5).
2. /ai̯/ occurs in ModJ. Diachronic evidence for /ai̯/ > /ææ̯/ >/ee̯/. Merges with /ae̯/ in hypoarticulate speech (cf Chapter 4) and in some dialects (cf Iitoyo 1976, p. 272 ff).
3. /oi̯/ occurs in ModJ. Diachronic evidence for /oi̯/ > /ee̯/. Merges with /oe̯/ in hypoarticulate speech.
4. /ui̯/ occurs in ModJ. Merges with /ii̯/ in hypoarticulate speech.

/Vu̯/ Sequences:
5. /ou̯/ virtually non-occurring in ModJ due to /ou̯/ > /oo̯/ which governs the lexicon for most speakers, e.g. /kou̯/ > /koo̯/ 'this way'. Also applies in loan phonology.
6. /au̯/ non-occurring (except in recent loans) in ModJ morphemes due to /au̯/ > /oo̯/ e.g. /siyau̯/ > /siyoo̯/ 'a method'.
7. /eu̯/ non-occurring in ModJ due to /eu̯/ > /yoo̯/ e.g. /keu̯/ > /kyoo̯/ 'today'.
8. /iu̯/ occurs in ModJ though there is diachronic evidence (e.g. /yorosiku/ > /yorosiu̯/ > /yorosyuu̯/ 'good') and synchronic evidence (cf Chapters 4 and 5) for /iu̯/ > /yuu̯/.

/Ve̯/ Sequences:
9. /ae̯/ occurs in ModJ though there is diachronic evidence for /ae̯/ > /ee̯/ (e.g. /temae̯/ > /temee̯/ 'you' in some dialects).
10. /oe̯/ occurs in ModJ.
11. /ue̯/ occurs in ModJ. Evidence for [ue̯] → [ (w)ee̯] in hypoarticulate speech and loan words.
12. /ie̯/ occurs in ModJ though there is diachronic evidence for /ie̯/ > /ee̯/ e.g. /kie̯ru/ > /kee̯ru/ 'to be extinguished' (Hiroshima dialect) and evidence in hypoarticulate speech for [ie̯] → [(y)ee̯].

/Vo̯/ Sequences:
13. /ao̯/ occurs in ModJ though there is diachronic evidence for /ao̯/ > /oo̯/ e.g. /mao̯su/ > /moo̯su/ 'to say'.
14. /uo̯/ occurs in ModJ. There is evidence in hypoarticulate speech and loan words for [uo̯] → [(w)oo̯].
15. /io̯/ occurs in ModJ. There is evidence in hypoarticulate speech for [io̯] → [yoo̯].
16. /eo̯/ non-occurring in ModJ due to /eo̯/ > /yoo̯/ e.g. /meo̯to/ > /myoo̯to/ 'husband and wife'.

/Va̯/ Sequences:
There are no Va sequences from diachronic developments. The few occurrences in ModJ are due to recent loans.
17. /ia̯/ e.g. /kasimia̯/ 'cashmere' but there is evidence for /ia̯/ → /iya/ e.g. /siberiya/ ~ /siberia̯/ 'Siberia', /itariya/ ~ /itaria̯/ 'Italy' and many such pairs. Also school children are known to have spelling difficulty with the distinction. There is also evidence for /ia̯/ → /ya/ e.g. /syamu/ 'Siam' (Lovins 1973 p. 98). There is some diachronic evidence for /ia̯/ > /e/ e.g. /kiyasu/ > /kia̯su/ > /kesu/ 'to erase'.
18. /ea̯/ e.g. /sukuea̯/ 'square' but there is evidence for /ea/ → /eya/ e.g. /hea̯/ ~ /heya/ 'hair', /hurea̯/ ~ /hureya/ 'flare' (Lovins ibid p. 77).
19. /oa̯/ e.g. /doa̯/ 'door', /sukoa̯/ 'score'.
20. /ua̯/ e.g. /manikyua̯/ 'manicure', /amatyua̯/ 'amateur'.

Types of off-gliding diphthongs

Typologically the sequences in 1-16 may be divided into three groups of off-gliding diphthongs:

(I) Up-gliding diphthongs not of mixed color: /ei̯, ai̯, ou̯, au̯, ae̯, ao̯/.
(II) Up-gliding or level diphthongs of mixed color: /oi̯, ui̯, eu̯, iu̯, oe̯, eo̯/.
(III) Down-gliding diphthongs: /ie̯, io̯, ue̯, uo̯/.
Historical and dialectal evidence indicates that where the non-syllabic was higher than the syllabic (an up-gliding diphthong), the non-syllabic sought the height of the syllabic. I will account for this development by the process

(P4)   GLIDE HT ASSIM:   [n height]   →   [n - 1 height]   /   [V, m height] ___
Condition: n higher than m

Where the vowel was low and the glide high, the assimilation presumably occurred in two steps through iterative application of the process:

Group I    ei̯ > ee̯
           ou̯ > oo̯
           ai̯ > ae̯ > aæ̯
           ae̯ > ae̯ > aæ̯
           au̯ > ao̯ > aɔ̯
           ao̯ > ao̯ > aɔ̯
Group II   oi̯ > oe̯
           eu̯ > eo̯

The sequences involving low achromatic vowels were then subject to

(P5)   VOWEL COLORING:   [-color]   →   [α color]   /   ___ [V, α color]

e.g.   aæ̯ > ææ̯
       aɔ̯ > ɔɔ̯

[ɔɔ̯] < [au̯] is attested in MidJ (Nishihara 1970) and was subsequently raised to [oo̯]. [ææ̯] is attested in some ModJ dialects (Itoo 1979, Iitoyo 1976) but in SJ was raised to [ee̯].

For Group I sequences the result of these developments was monophthongization which effected the mergers

ae̯   >   ee̯
ao̯   >   oo̯

Group II sequences underwent further change by Syllabicity Reversal, a process which applies to off-gliding diphthongs whose non-syllabic is not higher than its syllabic:

(P6)   SYL REV:   [+syll, -cons, n ht] [-syll, -cons, m ht]   →   [-syll, -cons] [+syll, -cons, +long]
Condition: m not higher than n

The [+long] syllabic on the right side of the arrow is written VV̯ in accordance with the convention proposed in §2.3.1, e.g.:

oe̯   >   o̯ee̯
eo̯   >   e̯oo̯
ui̯   >   wii̯
iu̯   >   yuu̯

The resulting non-high onset glides underwent Raising (the same process proposed earlier for vowels). Glides are tenser and more chromatic than their homorganic vowels and so are even more subject to Raising. This implicational relationship (i.e. the raising of vowels implies the raising of homorganic glides) establishes that the same process is involved. The following formulation is from Donegan (1978) with a slight revision to include glides:

(P7)   RAISING:   [!-cons, !n high, !+chromatic, !+tense, !lower]   →   [n + 1 high, +tense]
(! = especially)

e.g.   o̯ee̯ → wee̯
       e̯oo̯ → yoo̯

[wee̯] was then subject to Glide Fronting and Palatal Glide Del: [wee̯] → [ yee̯ ] → [ ee̯ ].
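The processes introduced in this section (glide height assimilation, vowel coloring, syllabicity reversal, Raising, Glide Fronting, and palatal glide deletion) can be chained mechanically. The Python sketch below does so under stated assumptions: vowels are reduced to a three-step height scale and a color value, long vowels are written as doubled letters without the non-syllabic diacritic, every process applies in its most general form, and all names (`derive`, the lookup tables) are hypothetical conveniences rather than part of the formal statements.

```python
HEIGHT = {"i": 2, "u": 2, "e": 1, "o": 1, "a": 0, "æ": 0, "ɔ": 0}
COLOR = {"i": "pal", "e": "pal", "æ": "pal",
         "u": "lab", "o": "lab", "ɔ": "lab", "a": None}
LOWER = {"i": "e", "e": "æ", "u": "o", "o": "ɔ"}  # one step down, same color
RAISE = {"æ": "e", "ɔ": "o", "e": "i", "o": "u"}  # one step up, same color
LOW_BY_COLOR = {"pal": "æ", "lab": "ɔ"}


def derive(syllabic, glide):
    """Run one off-gliding diphthong through the diachronic processes."""
    v, g, onset = syllabic, glide, ""
    # Glide height assimilation, applied iteratively.
    while HEIGHT[g] > HEIGHT[v]:
        g = LOWER[g]
    # Vowel coloring: an achromatic syllabic takes the color of its glide.
    if COLOR[v] is None and COLOR[g] is not None:
        v = LOW_BY_COLOR[COLOR[g]]
    # Syllabicity reversal: the old syllabic becomes an onset glide and the
    # old non-syllabic becomes a long vowel.
    if v != g:
        onset, v = v, g
    # Raising: non-high onset glides and long low chromatic vowels.
    if onset and HEIGHT[onset] < 2:
        onset = RAISE[onset]
    if HEIGHT[v] == 0 and COLOR[v] is not None:
        v = RAISE[v]
    # Glide fronting, then palatal glide deletion, before palatal vowels.
    if onset == "u" and COLOR[v] == "pal":
        onset = "i"
    if onset == "i" and COLOR[v] == "pal":
        onset = ""
    return {"i": "y", "u": "w", "": ""}[onset] + v + v


for d in ["eu", "ui", "oi", "iu", "ai", "au", "ei", "ou"]:
    print(d, "->", derive(d[0], d[1]))
```

Because the sketch applies syllabicity reversal whenever its condition is met, it says nothing about cases where a process historically failed to apply.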

The following are complete diachronic derivations exemplifying monophthongization and syllabicity reversal:

                       eu̯      ui̯      oi̯      iu̯      ai̯      au̯      ei̯      ou̯
GLIDE HT ASSIM         eo̯      --      oe̯      --      aæ̯      aɔ̯      ee̯      oo̯
VOWEL COLORING         --      --      --      --      ææ̯      ɔɔ̯      --      --
SYLLABICITY REVERSAL   e̯oo̯     wii̯     o̯ee̯     yuu̯     --      --      --      --
RAISING                yoo̯     --      wee̯     --      ee̯      oo̯      --      --
GLIDE FRONTING         --      yii̯     yee̯     --      --      --      --      --
GLIDE DELETION         --      ii̯      ee̯      --      --      --      --      --
                       [yoo̯]   [ii̯]    [ee̯]    [yuu̯]   [ee̯]    [oo̯]    [ee̯]    [oo̯]

Derivation vs. Borrowing

Stated in their most general form processes often define the goal of a sound change in progress and provide a principled account of the relationship between various dialects in terms of evolution toward that goal. However processes do not always apply in their most general form, and sound changes frequently stop short of completion e.g. there are many [ai̯] sequences which did not monophthongize or which did so only in some dialects. Generative treatments (e.g. Hasegawa 1979, Slawson 1970) often note that only certain lexical items are subject to the 'ai̯, oi̯, ae̯ → ee̯ rule' and that there are certain social constraints on its application. Consider the following pairs from Hasegawa (ibid p. 129)

1. /atarimae̯/   /atarimee̯/   'of course'
2. /nai̯/        /nee̯/        'negative suffix'
3. /sugoi̯/      /sugee̯/      'terrific'
4. /koi̯tu/      */kee̯tu/     'this man'
5. /sai̯go/      */see̯go/     'the last'
6. /kimae̯/      */kimee̯/     'generosity'

The forms on the left occur in SJ. The use of the right hand forms in examples 1, 2, and 3 by standard speakers is restricted to males. The impression is dialectal or slangy. The right hand forms in examples 4, 5, and 6 do not occur at all in standard speech. Under the circumstances it seems clear that the [ai̯] – [ee̯] alternation in the above pairs is no longer a phonetically motivated substitution. The alternants are equally pronounceable by standard speakers and are not sensitive to rate of speech or other factors which might suggest phonetic motivation. For SJ the source of the right hand forms is interdialectal borrowing, not derivation by processes from the forms on the left. In Chapters 4, 5, and 6 evidence from hypoarticulate speech, loan phonology, and Hawaiian Japanese, respectively, will be examined to throw light on the synchronic status of these processes.

The down-gliding diphthongs labelled Group III above (i.e. [ie̯, io̯, ue̯, uo̯]) do not seem in general historically to have undergone SYL REV though they meet the input conditions as formulated for that process. (The change /kie̯ru/ > /kee̯ru/ cited in item 12 above is an exception, however.) Apparently SYL REV applied primarily to sequences where the syllabic and non-syllabic were the same height, as in Group II. However, we shall see that the application of SYL REV does include down-gliding diphthongs in hypoarticulate speech (cf. Chapter 4) and loan words (cf. Chapter 5).

2.3.3 Glide Epenthesis

Besides the lenition processes so far considered in this chapter there are also fortition processes which act to eliminate vowel sequences and restore a CVCV pattern. Glide Epenthesis is an optional fortition which interposes a non-syllabic between two vowels. This prevents the lenition of vowels in a sequence thus maximizing perception, the typical result of fortition processing.

(P8)   GLIDE EPENTHESIS:   Ø   →   [-cons, -syll, α color]   /   [+syll, α color, !higher] ___ [+syll, -α color, !lower]


1. /siawase/ → [siyawase] 'happiness'
2. /sio/ → [siyo] 'salt'
3. /iu/ → [iyu] 'to say'
4. /kore o/ → [koreyo] 'this (accusative)'
5. /o atuu̯/ → [owatsuu̯] 'hot (honorific)'

The application of Glide Epenthesis can result in mergers of the following sort:

6. /kioo̯/ 'the past' → [kiyoo̯] (merges with /kiyoo̯/ 'dexterity').
7. /kare ni aru/ 'belongs to him' → [kare ni yaru] (merges with /kare ni yaru/ 'give to him').
8. /uti ni oru/ 'be at home' → [utsi ni yoru] (merges with /uti ni yoru/ 'call at home').
9. /o ari desu ka/ 'does it exist?' → [owari des ka] (merges with /owari desu ka/ 'are you finished?').
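The core of Glide Epenthesis, a glide that copies the color of the first vowel and is inserted before a vowel of a different color, can be sketched as follows. This is a simplified illustration, not the process itself: the gradient '!' conditions are ignored, the optional process is applied wherever its categorical conditions are met, inputs are written without spaces or diacritics, and the name `glide_epenthesis` is hypothetical.

```python
COLOR = {"i": "pal", "e": "pal", "u": "lab", "o": "lab", "a": None}
GLIDE = {"pal": "y", "lab": "w"}  # the glide homorganic with each color


def glide_epenthesis(word):
    """Insert a glide between a chromatic vowel and a vowel of another color."""
    out = []
    for seg in word:
        prev = out[-1] if out else None
        if prev in COLOR and seg in COLOR:
            # The first vowel must be chromatic and the second must differ in color.
            if COLOR[prev] is not None and COLOR[prev] != COLOR[seg]:
                out.append(GLIDE[COLOR[prev]])
        out.append(seg)
    return "".join(out)


for form in ["siawase", "sio", "iu", "koreo", "oatuu", "inoue"]:
    print(form, "->", glide_epenthesis(form))
```

Sequences of like-colored vowels, such as the long /uu̯/ of example 5, are left untouched, and no glide is inserted after achromatic /a/.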


(1) daben 'idle chat' is a Sino-Japanese compound.
(2) I owe much of my understanding of vowel processes to Donegan (1978) and (Donegan) Miller (1972, 1973).
(3) The notation used in the formalization of processes will be alphabetical where no sacrifice in clarity will result. Where feature notation is used it will vary from cover terms such as 'glide,' 'height,' 'color' to the feature terms themselves. In all cases the allusion is to articulatory gestures. The notational device '!' (read 'especially') is from (Donegan) Miller (1972). The feature matrix for consonants will appear in Chapter 3. That for vowels and glides is as follows:

        i    e    a    o    u    y(i̯)   e̯    a̯    o̯    w(u̯)
cons    -    -    -    -    -    -      -    -    -    -
syll    +    +    +    +    +    -      -    -    -    -
pal     +    +    -    -    -    +      +    -    -    -
lab     -    -    -    +    +    -      -    -    +    +
high    +    -    -    -    +    +      -    -    -    +
low     -    -    +    -    -    -      -    +    -    -




3.0 Consonants

3.1 Japanese has the following twelve underlying consonants:

Table 3.1

Underlying Consonant Inventory

p t s k h
b d dz g  
m n r    

Consonants will be referred to either segmentally, as above, or by feature specification. The following articulatorily based features will be necessary:

Table 3.2

Consonant Feature Matrix

p b t d s dz k g h m n r
cons + + + + + + + + + + + +
syll - - - - - - - - - - - -
son - - - - - - - - - + + +
high - - - - - - + + - - - -
back - - - - - - + + - - - -
low - - - - - - - - + - - -
ant + + + + + + - - - + + +
cor - - + + + + - - - + + +
voi - + - + - + - + - + + +
cont - - - - + - - - + - - -
nas - - - - - - - - - + + -
str - - - - + + - - - - - -
del rel - - - - ? + - - - - - -
flap - - - - - - - - - - - +
lab + + - - - - - - - + - -
pal - - - - - - - - - - - -

3.1.1 Underlying /dz/

The choice of /dz/ as the voiced counterpart of /s/ requires some comment. Most analyses posit /z/ with an accompanying statement that for most SJ speakers it is realized as [dz] initially and [z] intervocalically. Structurally, the /z/ analysis is more symmetrical and in terms of language universals it is usually true that languages with affricates have the corresponding fricatives as well. Nevertheless, from the point of view of NP it seems more natural to make the stronger form underlying and provide for the occurrence of the weaker 'allophone' by a lenition process. The NP analysis involves a context-sensitive fortition z → dz /.__ (. = syllable boundary) which governs the lexicon and a context-sensitive lenition dz → z / V-V ordered after it which governs derived structure. Han (1960) describes the distribution of the two variants as follows:

[dz] and [z] do not contrast. They are in free variation with some speakers and with others they are in complementary distribution. In the speech of the author, a number of spectrographic experiments revealed that [dz] occurs initially as in [dzuibun] (considerably) [dzoo] (elephant), whereas [z] occurs in non-initial position as in [suzus̆ii] (is cool) [s̆izuka] (is quiet). However, in slow careful speech [dz] may occur in non-initial position. (pp 49-50)

There is also data from non-standard dialects (cf. Iitoyo 1974) showing [dz] in all positions. Frequent substitutions in children's speech of the sort [yowamus̆i] → [yowamuts̆i] 'sissy' suggest that the z → dz fortition may be part of a more general process s, z → ts, dz with the voiceless portion eventually suppressed by the child as he approximates mature speech. Historically Arisaka (1957) claims that sibilants in ModJ were affricates in OJ. If so, the more general form of the process governed the OJ lexicon.(1)

3.1.2 Syllable initial (onset) distribution:

The twelve consonants combine with the five simple vowels /a, i, u, e, o/ and the rising diphthongs /ya, yu, yo, wa/ (each of which constitutes a short syllable itself) to form the following 100 short syllables:

Table 3.3

Short Syllable Inventory

a    pa    ta    ka    ba    da    ga    sa    dza    ha    ma    na    ra
i    pi    ti    ki    bi          gi    si    dzi    hi    mi    ni    ri
u    pu    tu    ku    bu          gu    su    dzu    hu    mu    nu    ru
e    pe    te    ke    be    de    ge    se    dze    he    me    ne    re
o    po    to    ko    bo    do    go    so    dzo    ho    mo    no    ro
ya   pya   tya   kya   bya         gya   sya   dzya   hya   mya   nya   rya
yu   pyu   tyu   kyu   byu         gyu   syu   dzyu   hyu   myu   nyu   ryu
yo   pyo   tyo   kyo   byo         gyo   syo   dzyo   hyo   myo   nyo   ryo
wa
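The figure of 100 short syllables can be checked by enumeration. The Python sketch below assumes that each of the twelve consonants (or no consonant) combines with the five simple vowels and with /ya, yu, yo/, that the five /d/ combinations barred from the lexicon are excluded, and that /wa/ occurs only without a preceding consonant; the variable names are illustrative.

```python
# The twelve underlying consonants plus the empty onset.
ONSETS = ["", "p", "t", "k", "b", "d", "g", "s", "dz", "h", "m", "n", "r"]
# The five simple vowels and the on-gliding diphthongs that take an onset.
NUCLEI = ["a", "i", "u", "e", "o", "ya", "yu", "yo"]
# Combinations barred from the lexicon.
BARRED = {"di", "du", "dya", "dyu", "dyo"}

short_syllables = [c + n for n in NUCLEI for c in ONSETS if c + n not in BARRED]
short_syllables.append("wa")  # /wa/ takes no tautosyllabic consonant in SJ

print(len(short_syllables))  # 13 * 8 - 5 + 1 = 100
```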

/wa/ may not be preceded by a tautosyllabic consonant in SJ due to POST CONS LAB GLIDE DEL which governs the lexicon, but /kwa/ and /gwa/ occur in some dialects.

(P9)   POST CONS LAB GLIDE DEL:   /w/ → Ø / .C__

For an explanation of */di, du, dya, dyu, dyo/ see

3.1.3 Syllable final (offset) distribution:

The consonants /p, t, k, s, n/ may also occur in syllable final position. As offset segments they have the value of one mora (they are often referred to as 'syllabic' or 'mora' consonants) and the syllables in which they occur are long syllables.(2) Thus the 105 short syllables of Table 3.3 form the basis for 525 possible long syllables when combined with each of the five offset consonants, e.g. /ap, at, ak, as, an, tap, tat, tak, tas, tan/.(3)

Distribution of offset nasal

Offset /n/ may be followed by any segment or by # (word boundary), e.g. /den.po/ 'telegram', /han.tai̯/ 'opposite', /den.wa/ 'telephone', /hon.ya/ 'bookstore', /tan.i/ 'a unit', /gan#/ 'cancer'.

Distribution of offset obstruents

There is a sequential constraint on the occurrence of offset obstruents /p, t, k, s/ such that they may not be followed by # and that the following syllable must begin with an identical obstruent. The result is inter-syllabic geminates as in /ip.pai̯/ 'full', /ot.to/ 'husband', /kek.kyo.ku/ 'after all', /is.syo/ 'together'.

NP and GP analyses of geminate consonants compared

Most occurrences of geminate obstruents in Japanese are in vocabulary of Chinese origin (cf. Kuroda 1964 for a thorough study of this and other sources). Japanese has many bi-morphemic lexical items borrowed from Chinese where the first element in the compound is a bi-syllabic morpheme ending in /-tu/ or /-ti/, e.g. /zetu/ 'tongue', /niti/ 'day', and the second element in the compound begins with a voiced sound, e.g.

/zetu + on/   'lingual sound'
/niti + dzyoo/   'every day'

However, if the second morpheme of the compound begins with a voiceless sound (i.e. /k, t, h(<p), s/) the compound will have geminate /kk, tt, pp, ss/ at the morpheme boundary, e.g.

(i) /zetu/ 'tongue'              (ii) /niti/ 'day'
a) /zek + ken/ 'dorsum'          a) /nik + ken/ 'daily'
b) /zet + too/ 'apex'            b) /nit + tee/ 'daily routine'
c) /zep + poo/ 'tongue'          c) /nip + poo/ 'daily report'
d) /zes + sen/ 'word war'        d) /nis + si/ 'diary'

McCawley (1968) formulates a classic generative analysis of the data in (i) and (ii). He represents these Sino-Japanese compounds in the lexicon in their ungeminated forms and provides for gemination by 1) deleting the high vowel between /t/ and /k, t, h, s/ and 2) regressively assimilating the point and manner of articulation of the then morpheme-final /t/ to the following voiceless obstruent.
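For concreteness, the two-step generative derivation just described can be sketched in Python. This is an illustration of the analysis as summarized above, not of McCawley's own formalism; the function name `generative_compound` is hypothetical, and the /h/-initial case is handled by assuming the geminate is /pp/, following the notation h(<p).

```python
VOICELESS_INITIALS = {"k", "t", "h", "s"}


def generative_compound(first, second):
    """Derive a Sino-Japanese compound from ungeminated lexical forms."""
    if first[-2:] in ("tu", "ti") and second[0] in VOICELESS_INITIALS:
        # Step 1: delete the high vowel, leaving a morpheme-final /t/.
        stem = first[:-2]
        # Step 2: assimilate that /t/ completely to the following voiceless
        # obstruent (/h/ counting as /p/), which yields a geminate.
        onset = "p" if second[0] == "h" else second[0]
        return stem + onset + onset + second[1:]
    # Before a voiced sound nothing happens.
    return first + second


for pair in [("zetu", "ken"), ("zetu", "hoo"), ("niti", "si"), ("zetu", "on")]:
    print(pair, "->", generative_compound(*pair))
```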

The NP analysis I am proposing has lexical representations less abstract than those proposed by McCawley. Compounds with geminate consonants will appear as such in the lexicon. On this analysis McCawley's obligatory high vowel deletion and regressive obstruent assimilation rules play no role in the underlying-to-surface derivation of Sino-Japanese compounds. High vowel deletion is, in fact, a live process in Japanese, but its application is optional and not limited to Sino-Japanese compounds. It manifests itself in the processing of hypoarticulate speech under purely phonological conditions as I will show in §4.1.1. As for regressive obstruent assimilation, it governs the lexicon assuring */tp, ts, tk/ and applies optionally though less generally than high vowel deletion in hypoarticulate speech (cf. §4.3.2). McCawley's analysis is rejected because processes which apply optionally and in a very general fashion under conditions implying phonetic motivation (i.e. in hypoarticulate speech) would have to be specified under non-phonetic conditions as applying obligatorily to a portion of the lexicon.

An alternative analysis suggested to me by Robert Cheng would have just two allomorphs – a vowel-final one (/zetu/, /niti/) and a consonant-final one (/zet/, /nit/). On this analysis the underlying representation of the above compounds would be as follows:

/zet+ken/ /nit+ken/ /zet+too/ /nit+tee/ /zet+poo/ /nit+poo/ /zet+sen/ /nit+si/

This analysis requires that REGR OBS ASSIM which governed the lexicon under the original analysis be constrained to apply only derivationally to yield superficial geminates. This cost would be worthwhile if all geminate obstruents in the language could be derived from /-t/ plus /p, t, k, s/, for we would then gain the advantage of allowing only one syllable-final obstruent (/t/) in the lexicon – a simpler phonology than one allowing four since in the former case only the process(es) responsible for barring syllable-final /t/ must be constrained so as to allow it whereas in the latter case the process(es) barring syllable-final /p, k, s/ must also be constrained so as to allow them. However, there are some geminate obstruents which according to the principles of PP cannot be derived from an underlying /t/. Consider the following alternations:

[gakubu] 'faculty'
[gakusei] 'student'
[gakuhu] 'seat of learning'
[gakuto] 'student'
[gakkoo] 'school'

Unlike the earlier set of /t/ morphemes those with /k/ do not geminate except when followed by homorganic /k/ as in [gakkoo]. There are thus only two allomorphs [gaku] and [gak], and the 'morphophonemic' representation of the latter as /gat/ is not justified since there is no alternation and the phonetic representation can be derived from a 'phonemic' one. The 'economy' of the alternative analysis is thus lost along with the justification for choosing it over the less abstract one originally proposed.

3.1.4 Obligatory Consonant Lenitions

The twelve underlying consonants are rendered pronounceable by the obligatory application of a number of lenition processes.

Offset /n/ lenition:

Offset /n/ is particularly susceptible to lenition, occurring as it does in a 'weak' (syllable final) position. If it is followed by a continuant or # it will be realized as a nasalized continuant, i.e. an offglide homorganic with the preceding vowel, e.g. /tan.i/ [tãã̯.i] 'a unit', /hon.ya/ [hõõ̯ya] 'bookstore', /ben.si/ [bẽẽ̯.s̆i] 'narrator', /bin.sen/ [bĩĩ̯.sẽẽ̯] 'stationery'. The following process provides for this case:

(P10)   PROG OFFSET /n/ ASSIM:   /n/   →   [-syll, +nas, α F]   /   [V, α F] ___ { [+cont], # }
(F = all features not specified)

If offset /n/ is followed by a non-continuant it will be realized as a nasal non-continuant homorganic with the following non-continuant, e.g. /sen.bee/ [sẽm.bee̯] 'cracker' /hon.too̯/ [hõn.too̯] 'true', /dzin.koo̯/ [dz̆iŋ.koo̯] 'population'. The following process provides for this case:

(P11)   REGR OFFSET /n/ ASSIM:   /n/ → [α position]   /   ___ [-cont, α position]
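The complementary contexts of (P10) and (P11) can be summarized in a short Python sketch. It returns only the realization of the offset /n/ itself. The classification of following segments as non-continuant is read off Table 3.2, the place-to-nasal mapping and the function name `offset_n` are assumptions made for illustration, and nasalization and non-syllabicity are written with combining diacritics.

```python
NASAL_TILDE = "\u0303"  # combining tilde: nasalization
NONSYL = "\u032f"       # combining mark for a non-syllabic vowel

# Non-continuants (per Table 3.2) mapped to the nasal at the same position.
HOMORGANIC_NASAL = {
    "p": "m", "b": "m", "m": "m",
    "t": "n", "d": "n", "dz": "n", "n": "n", "r": "n",
    "k": "ŋ", "g": "ŋ",
}


def offset_n(prev_vowel, following):
    """Realize offset /n/ given the preceding vowel and the next segment ('#' = boundary)."""
    if following in HOMORGANIC_NASAL:
        # (P11): a nasal non-continuant homorganic with what follows.
        return HOMORGANIC_NASAL[following]
    # (P10): before a continuant or #, a nasalized offglide that copies
    # the preceding vowel.
    return prev_vowel + NASAL_TILDE + NONSYL


print(offset_n("e", "b"), offset_n("i", "k"), offset_n("a", "i"))
```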

The vowel preceding offset /n/ is regressively nasalized by the following process:

(P12)   TAUTO-SYL REGR VOWEL NAS:   /V/ → [+nas]   /   ___ [+nas]

(The contradictory context-free process

(P13)   VOWEL DENAS:   /V/ → [ - nas ]

governs the lexicon.)

Onset consonant lenitions

/h/ is pronounced [ɸ] (a voiceless bilabial fricative) before /u/.

(P14)   H LAB:   /h/ → [ɸ]   /   ___ u   e.g. /huu̯too̯/ [ɸuu̯too̯] 'envelope'

/h/ is pronounced [ç] (a voiceless palatal fricative) before /i, y/ by the process PAL to be described below.

/h/ → [ç]   /   ___ i, y   e.g. /hidoi̯/ [çidoi̯] 'severe'

/dz/ is pronounced [z] intervocalically:

(P15)   AFFRIC WEAKENING:   /dz/ → [z]   /   V ___ V

/g/ is pronounced [ŋ] intervocalically:

(P16)   STOP NAS:   /g/ → [ŋ]   /   V ___ V

Palatalization

A number of substitution types are frequently subsumed under the term 'palatalization'. Bhat (1974) gives numerous examples from a wide variety of languages of the fronting, raising, and/or spirantization of consonants under the influence of (usually) palatal vowels or glides. In Japanese 'palatalization' velar consonants are fronted to prevelar, labial consonants are raised, and coronal consonants are raised, affricated, and become [-anterior] all before /y, i/. /y/ after [-anterior] coronals is then deleted. As Bhat shows, these substitutions apply quite independently and in various combinations in the world's languages. In the following analysis based on principles of NP each substitution type will be attributed to the application of a distinct process. At appropriate points I will draw contrasts with two well-known alternative treatments – Bloch (1950) and McCawley (1968).

Palatalization is distinctive only before non-palatal vowels. Letting /s/ stand for all consonants the phonetic distribution is as follows:

Plain:         [si se sa so su]
Palatalized:   [s̆i se s̆a s̆o s̆u]

There is paradigmatic evidence in the verb morphology for underlying plain consonants before /i/ e.g.

[hanas̆i]       /hanas + i/       infinitive of 'to speak'
[hanase]       /hanas + e/       imperative "
[hanasanai̯]    /hanas + anai̯/    negative "
[hanasoo̯]      /hanas + oo̯/      tentative "
[hanasu]       /hanas + u/       non-past "

Palatalization is then derived via a context-sensitive lenition process:

PAL   [+cons] → [+pal]   /   ___ [+syll, +pal, +high]

(Note: For consonants [+pal] implies [+high, -back] i.e. either the fronting of [+back] or the raising of [-back] consonants.)

This process is attested in loan phonology where foreign words with plain consonants before /i/ are borrowed with [s̆], e.g. Eng 'sea' = Jpn [s̆I]. Palatalized consonants followed by back vowels are also analyzed as underlyingly plain with palatalization due to the presence of a following palatal glide, e.g. [s̆a, s̆o, s̆u] = /sya, syo, syu/. On this analysis, which is also reflected in the native orthography, coronal consonants pattern with non-coronals, e.g. [p'ya, s̆a, k'ya] = /pya, sya, kya/. To provide for palatalization by /y/ in addition to /i/, PAL must be slightly generalized as follows:

(P17)   PAL   [+cons] → [+pal]   /   ___ [+pal, +high] (4)

There are dialects, notably in Kyushu, which have [s̆e] where the standard language has [se]. In such dialects [s̆e] might be analyzed as either /se/ or /sye/.

On the /se/ analysis PAL applies in its most general form i.e. before all [+pal] segments not just [+high] ones. In dialects where there is no contrast between [se] and [s̆e] this is a possible analysis. (cf. §5.6 for a discussion of SJ idiolects with such a contrast.)

On the /sye/ analysis palatalization is due to the presence of a post-consonantal /y/ and [s̆e] patterns with the non-palatal vowels:

[s̆e][s̆u] [s̆o] [s̆a]
/sye/ /syu/ /syo/ /sya/

There is historical support for the latter analysis. In pre-modern times the standard language (based on the Kyoto dialect) had [ye] and [s̆e] where modern SJ has [e] and [se]. It is apparently also true (Martin 1976) that wherever [s̆e] > [se] occurred [ye] > [e] also did, and that those dialects which still have [s̆e] also have [ye]. On this analysis the loss of both [s̆e] (and [ye]) would be attributable to a more general application of PRE-VOCALIC PAL GLIDE DEL which (due to the '! higher' condition) already applied before high palatal vowels.

Note also e.g. Jpn [sepaado] 'shepherd dog' (< Eng 'shepherd'), a 19th century borrowing showing evidence of the limitation of PAL to exclude application before Jpn /e/ in the standard language. (See however §5.6 for current practice involving the foreign sequence [s̆e].)

In NP terms (suggested by Stampe as reported in Ohso 1971) the non-occurrence of palatalized consonants in the lexicon is due to a context-free depalatalization process:

(P18)   DEPAL   [ + cons ] → [ - pal ]

(Note: For consonants [-pal] implies either [+back] or [-back, -high].)

DEPAL is ordered before PAL in accordance with the universal precedence principle which requires that when a context-free and a context-sensitive process contradict each other the context-free process is ordered first and governs the lexicon, and the context-sensitive process then applies and governs derived structure.

Two more processes play a role in the derivation.

(P19)   ALVPALADJ   [ + cor, + pal, - flap ]   →   [ - anterior ]

which adjusts the place of articulation of palatalized coronals except /r/ from dental or alveolar to alveopalatal, e.g. [s'] → [s̆]. ALVPALADJ is a common substitution in a wide variety of languages (cf. Bhat 1974). The other process is

(P20)   POST CONS PAL GLIDE DEL   [ y ] → Ø   /   [ + cor, - ant ] ___

which deletes [y] after non-anterior coronals. (cf. §4.2.4 for wider application of this process in hypoarticulate speech.)

The following derivation shows all four processes that take part in palatalization:

/syasi/ 'luxury' (governed by DEPAL)
s'yas'i (by PAL)
s̆yas̆i (by ALVPALADJ)
[s̆as̆i] (by POST CONS PAL GLIDE DEL)

Bloch (1950), a phonemic treatment, specifically rejects the analysis that treats coronals and non-coronals alike. Thus for Bloch [p'ya, s̆a, k'ya] are phonemicized /pya, s̆a, kya/ because [s̆a] and [s̆i] have phonetically identical consonants and there is no phonetic reflex of /y/ after alveo-palatals.(5) Since /s̆/ is a phoneme in [s̆a], under the once a phoneme always a phoneme constraint [s̆i] is also /s̆i/ for Bloch.

POST CONS PAL GLIDE DEL defines a partial merger, what Kiparsky (1968) calls 'contextual neutralization', because in some positions (i.e. after palatalized coronals) /y/ merges with /Ø/. From the point of view of NP, however, an underlying /y/ after coronal obstruents which is deleted on the surface does not violate the constraints on lexical representation given in Stampe 1973:

i) Lexical representation is identical with the pronunciation of surface forms (e.g. [p'ya, s̆a, k'ya]) except that
ii) A class of phonetic segments is barred from lexical representation if:
        a) derivable by natural processes (e.g. [s̆] derivable from /sy/ by PAL, ALVPALADJ, POST CONS PAL GLIDE DEL).
        b) there is an unsuppressed context-free process which eliminates that class of segments (e.g. DEPAL eliminates /s̆/).

Principle (iib) is based on Stampe's assumption that ordering is more economical (i.e. takes less effort) than suppression as a strategy for avoiding the application of a process, since ordering prevents a process from applying only in certain environments whereas suppression prevents it from applying at all. Assume that a child in learning Japanese has available to him the universal processes DEPAL, PAL, and POST CONS PAL GLIDE DEL and is attempting to decide whether [s̆a] is /s̆a/ or /sya/. If he suppresses DEPAL then PAL and POST CONS PAL GLIDE DEL govern the lexicon and [s̆a] cannot be /sya/ or /s̆ya/ so must be /s̆a/. If on the other hand he orders DEPAL before PAL then DEPAL governs the lexicon and [s̆a] cannot be /s̆a/ so must be /sya/. Since ordering is more economical than suppression, the /sya/ analysis is preferred.

In NP terms the differences between Bloch's analysis and the one proposed here are:

Proposed NP analysis:
1.   DEPAL governs the lexicon barring /s̆/.
2.   PAL is an allophonic process ordered after DEPAL and [s̆as̆i] is /syasi/.
Bloch's analysis:
1.   DEPAL is suppressed so /s̆/ is not barred from the lexicon.
2.   PAL governs the lexicon (since there is no longer a prior contradictory process) and /si/ and /sy/ are barred from the lexicon.
3.   POST CONS PAL GLIDE DEL, which governed the lexicon under the proposed NP analysis as well but had no work to do, now bars /s̆y/.

McCawley's analysis of palatalization is considerably more complex than these due to the abstract nature of his underlying forms and to the fact that he divides the lexicon into four strata – Native, Sino-Japanese, Onomatopoeia, and Foreign – each with its own underlying segment inventory.(6) As far as palatalization is concerned the strata divide into two types – one with underlying palatalized consonants contrasting with plain ones (the SJ, Onom, and Foreign strata) and one without such a contrast (the Native stratum). This reflects the historical assumption that prior to the massive influx of Chinese loanwords beginning in the sixth century AD Japanese had little or no palatalization. McCawley notes that in the Native stratum palatalization is only distinctive before /a/. (He is apparently unaware of the examples of contrast before /o/ which, as Shibatani (1973) notes, are few but perhaps no fewer than those before /a/.) To avoid setting up both a plain and a palatalized consonant series in the lexicon McCawley posits a second low vowel /â/ which conditions palatalization and then merges with /a/ without surfacing. This analysis has been attacked even by some who accept absolute neutralization because the conditioning feature McCawley chooses to include in the matrix of /â/ is [+high], resulting in a vowel specified [+high, +low]. McCawley states that since the segment never surfaces this combination of features is acceptable.

Although McCawley (1969) has since disavowed this analysis it should be pointed out that in addition to absolute neutralization there are two things about it which violate principles of NP. First, a [+high, +low] vowel is unpronounceable in any language and lexical representations in NP must be fully specified and pronounceable (though not necessarily by speakers of the language in question). Second, a situation where a high and a low vowel (/i/ and /â/) condition palatalization and a mid vowel does not violates the implicational hierarchy which says that if a lower vowel conditions palatalization all higher vowels must also. In the standard language which McCawley is analyzing /e/ does not condition palatalization.

To complete the NP analysis of palatalization we must look beyond the consonant /s/ which I have used as an exemplar, to the remaining coronal obstruents /dz/ (the voiced counterpart of /s/), /t/, and /d/.

The stops /t, d/ are affricated before high vowels and /y/ by the process

(P21)   STOPAFF   [ + cor, - son ]   →   [ + delrel ]   /   ___ [ - cons, + high ]

A result of STOPAFF is that historical /di, du, dyu/ have merged with /dzi, dzu, dzyu/ respectively. The affricates arbitrarily will be assumed to underlie all occurrences of [dzi, dzu, dz̆u](7). The relevant portion of STOPAFF thus governs the lexicon. The following sample derivations involve the processes so far discussed in connection with palatalization:

/pi/      p'i                          [p'i]
/pyu/     p'yu                         [p'yu]
/ti/      ts'i      ts̆i                [ts̆i]
/tyu/     ts'yu     ts̆yu     ts̆u       [ts̆u]
/tu/      tsu                          [tsu]
/si/      s'i       s̆i                 [s̆i]
/syu/     s'yu      s̆yu      s̆u        [s̆u]
/dzi/     dz'i      dz̆i                [dz̆i]
/dzyu/    dz'yu     dz̆yu     dz̆u       [dz̆u]
/hi/      çi                           [çi]
/hyu/     çyu                          [çyu]
/ki/      k'i                          [k'i]
/kyu/     k'yu                         [k'yu]
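
The derivations above can be traced mechanically. The following is a minimal sketch in Python, offered only as an illustration and not as part of the original analysis. It assumes ASCII stand-ins that are my own: "S", "tS", "dZ" for the alveopalatals and a trailing apostrophe for a palatalized consonant. It covers only the processes named in this section (STOPAFF, PAL, ALVPALADJ, POST CONS PAL GLIDE DEL), so /h/ and the geminate case of Note 4 are left out, and the function names are hypothetical.

```python
# ASCII stand-ins (assumptions of this sketch): "S", "tS", "dZ" for the
# alveopalatals; a trailing apostrophe marks a palatalized consonant.
CONSONANTS = {"p", "t", "d", "s", "ts", "dz", "k"}
HIGH_VOCOIDS = {"i", "u", "y"}      # [-cons, +high]: context of STOPAFF
PALATAL_HIGH = {"i", "y"}           # [+pal, +high]: context of PAL (P17)
ALVEOPALATAL = {"s'": "S", "ts'": "tS", "dz'": "dZ"}


def stopaff_and_pal(segs):
    """Apply STOPAFF and PAL together; both look only at the next segment."""
    out = []
    for i, seg in enumerate(segs):
        nxt = segs[i + 1] if i + 1 < len(segs) else None
        if seg in ("t", "d") and nxt in HIGH_VOCOIDS:
            seg = "ts" if seg == "t" else "dz"      # STOPAFF
        if seg in CONSONANTS and nxt in PALATAL_HIGH:
            seg += "'"                              # PAL
        out.append(seg)
    return out


def alvpaladj(segs):
    """ALVPALADJ: palatalized coronal obstruents become alveopalatal."""
    return [ALVEOPALATAL.get(seg, seg) for seg in segs]


def post_cons_pal_glide_del(segs):
    """Delete y immediately after an alveopalatal consonant."""
    out = []
    for seg in segs:
        if seg == "y" and out and out[-1] in ALVEOPALATAL.values():
            continue
        out.append(seg)
    return out


def derive(segs):
    """Return the underlying form plus every stage that changed it."""
    stages = [list(segs)]
    for process in (stopaff_and_pal, alvpaladj, post_cons_pal_glide_del):
        result = process(stages[-1])
        if result != stages[-1]:
            stages.append(result)
    return ["".join(stage) for stage in stages]


print(derive(["s", "y", "a", "s", "i"]))
print(derive(["t", "y", "u"]))
```

Segments are passed as a list so that affricates count as single units. The first call mirrors the /syasi/ derivation given earlier, and the second mirrors the /tyu/ row of the table.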

In Donegan and Stampe (1979) the application of processes is characterized as simultaneous and iterative. In the above derivations PAL and STOPAFF apply simultaneously, feeding ALVPALADJ which in turn feeds POST CONS PAL GLIDE DEL. For an example of iterative application (application of a process more than once in a single derivation) in regard to PAL see Note 4. For an example involving RAISING see §. Additional examples will be found in Chapter 4.

Palatalization and Language Change

The static format of PAL and DEPAL belies a certain dynamism recently introduced into the language by the borrowing of lexical items mostly from English. The effect of these borrowings is the gradual introduction of a distinction between plain and palatalized consonants before front vowels. Among innovating borrowers (see Chapter 5 for an elaboration of this and related terms) occurrences of foreign [ti] are borrowed as Jpn [ti] as in [paatii] 'party'. Conservative speakers borrow this as [paats̆ii]. It is rare however for [si] and [zi] to be borrowed intact. Both innovating and conservative speakers have [s̆i] and [dz̆i] as in [s̆i] 'sea' and [dz̆iiru] 'zeal'. From the data of loan phonology we may generalize a hierarchy of resistance to the suppression of palatalization, viz. stops resist less than fricatives (i.e. fricatives are quicker to palatalize than stops). Voicing provides an additional parameter. To formalize this hierarchy the relevant processes, PAL and DEPAL, must be stated in decomposed form so that the various sub-processes may be dealt with separately:

(1)   DEPAL   { (a) s, (b) dz, (c) t }   →   [ - pal ]
(2)   PAL     { (a) s, (b) dz, (c) t }   →   [ + pal ]   /   ___ [ - cons, + high, + pal ]

The present situation in the Japanese speech community may be described as follows (See Chapter 5 for evidence supporting these assertions):

Conservative speakers apply the full forms of (1) and (2) and say:

[paats̆ii] 'party'   [dz̆iiru] 'zeal'   [s̆i] 'sea'

Other speakers have suppressed (1c) and (2c) and say:

[paatii] 'party'   [dz̆iiru] 'zeal'   [s̆i] 'sea'

Others have suppressed (1bc) and (2bc) and say:

[paatii] 'party'   [dziiru] 'zeal'   [s̆i] 'sea'

Finally there are a few speakers who have suppressed both processes entirely and say:

[paatii] 'party'   [dziiru] 'zeal'   [si] 'sea'

There is also historical evidence supporting this implicational hierarchy. Hattori (1960) reports that in the 16th century standard language there was (and still is in some modern dialects) the following distribution of coronals before palatal vowels:

[s̆i, s̆e, z̆i, z̆e] but [ts̆i, te, dz̆i, de]

Assuming the analysis /si, se, zi, ze/ and /ti, te, di, de/ these data show PAL applying to [s, z] before it applies to [t, d] in pre-mid vowel position at least. The implications of this for loan phonology will be explored in Chapter 5. It further supports the assertion that a higher vowel conditions the application of PAL before a lower one does.

Note that the application of the subprocesses of DEPAL and PAL is linked. One cannot suppress e.g. (1c) and not (2c). To do so would allow /ti/, /ts̆i/ and [ts̆i] while disallowing [ti] – a case of absolute neutralization. Conversely one cannot suppress (2c) and not (1c), thus disallowing [ts̆i] (and /ts̆i/), which is contrary to fact.

Note a second constraint. To say there is a hierarchy of resistance to change is to assert an implicational relationship among the subprocesses of DEPAL and PAL. The claim is that any speaker who has the distinction [si] ≠ [s̆i] (i.e. who says [si] 'sea') will also have the distinction [ti] ≠ [ts̆i] (i.e. will say [paatii] 'party'). Though I have no data on this point, my impression based on personal experience is that it holds true.
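
The four speaker groups and the implicational claim can be stated compactly. The sketch below is an illustrative Python rendering under my own simplifying assumptions: each linked pair of sub-processes, e.g. (1c) and (2c), is represented by a single label ("t", "dz", "s"), only the context before /i/ is modelled, and "S", "tS", "dZ" are ASCII stand-ins for the alveopalatals. The function names are hypothetical.

```python
# Order in which the linked sub-process pairs are suppressed, per the four
# speaker groups described above: (c) t first, then (b) dz, then (a) s.
HIERARCHY = ["t", "dz", "s"]
PALATALIZED = {"t": "tS", "dz": "dZ", "s": "S"}   # ASCII stand-ins


def adapt(segs, suppressed):
    """Palatalize a coronal obstruent before i unless its pair is suppressed."""
    out = []
    for i, seg in enumerate(segs):
        nxt = segs[i + 1] if i + 1 < len(segs) else None
        if seg in PALATALIZED and seg not in suppressed and nxt == "i":
            seg = PALATALIZED[seg]
        out.append(seg)
    return "".join(out)


def is_possible_grammar(suppressed):
    """True if the suppressed pairs form an initial stretch of HIERARCHY."""
    suppressed = set(suppressed)
    return suppressed == set(HIERARCHY[:len(suppressed)])


PARTY = ["p", "a", "a", "t", "i", "i"]
ZEAL = ["dz", "i", "i", "r", "u"]
SEA = ["s", "i"]

for group in (set(), {"t"}, {"t", "dz"}, {"t", "dz", "s"}):
    print(sorted(group), adapt(PARTY, group), adapt(ZEAL, group), adapt(SEA, group))
```

On this rendering a speaker who suppressed only the /s/ pair, i.e. who said [si] 'sea' but [paats̆ii] 'party', would be excluded by `is_possible_grammar`, which is the content of the implicational claim.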


(1) On the other hand the also frequent substitution in children's speech s → ts̆ before non-palatal vowels as in e.g. 'chan' for 'san', the title signifying Mr., Mrs., or Miss suffixed to proper names, would suggest that more than affrication is involved.
(2) The mora value of offset consonants and of the offset glides treated in § is due to a prosodic rather than a segmental process.
(3) The 100 short syllables of Table 3.3 may also become long by the addition of an offset glide as in the long vowels and diphthongs of Chapter 2, e.g. /aa̯, ai̯, taa̯, tai̯, koo̯, koi̯, syuu̯, syui̯/.
(4) This revision also provides for those derivations involving geminate palatalized consonants, e.g. /dzassi/ 'magazine' → [dzass̆i] → [dzas̆s̆i] where after the second consonant is regressively palatalized by a following vowel the first consonant, by iterative application of PAL, is also regressively palatalized.
(5) In the Toohoku dialect discussed in Chapter 6 the application of ALVPALADJ is in some cases variable. Where it does not apply the conditions for POST CONS PAL GLIDE DEL do not obtain and the glide is clearly heard, e.g. /sya/ → [s'ya].
(6) Much of McCawley's motivation for dividing the lexicon depends on the (non-)application of rules not processes. The rules/processes distinction is discussed in Chapters 1 and 6.
(7) This assumption is reflected in the spelling of these sequences in conventional Japanese orthography. The practice results from an official regulation of 1946 which decreed that thenceforth the kana symbols for the /d/ syllables would be replaced by those for the /dz/ syllables where the high vowels and /y/ followed. The reform was in response to the difficulty experienced by the general population, even the highly educated, in choosing the historically accurate alternative. This argues for the psychological reality of the merger but not, of course, of the alternative chosen to represent it orthographically.




4.0 The term hypoarticulate speech

The term hypoarticulate speech subsumes a variety of labels including, among others, contracted, reduced, casual, informal, fast, careless, and sloppy speech. As these terms imply, the occurrence of hypoarticulate speech depends on such extralinguistic factors as rate of articulation, level of attention, and social setting. High frequency forms and collocations also seem particularly susceptible to hypoarticulate speech processing.

In NP hypoarticulate speech usually involves the more general application of processes through the relaxation of suppressions or limitations as compared with the situation in careful speech. For example in § it was noted that the subprocess of LAB GLIDE DEL that deletes labial glides before non-low vowels governs the lexicon. In hypoarticulate speech the non-low limitation is relaxed, and the process applies in its most general form resulting in surface contrasts of the following kind:

1. [ yawarakai ] 'soft' (careful)
2. [ yaarakai ] 'soft' (hypoarticulate)

A paramount fact of hypoarticulate speech is that it displays a reduced number of segments and contrasts vis a vis careful speech. This is exactly what one would expect when suppressions or limitations on processes are relaxed since the application of lenition processes tends to eliminate differences between segments. In NP, then, relaxed speech implies the relaxing of constraints on the application of processes. Generative phonology on the other hand requires of the speaker that the more relaxed and unmonitored his speech the more rules he must apply.

As a result of the more general application of processes, mergers are common in hypoarticulate speech e.g.

3. /owari desu ka/ 'Are you finished?'
4. /o ari desu ka/ 'Does it exist?'

merge to [oari desu ka] due to the wider application of LAB GLIDE DEL. Articulatory ease is allowed to take precedence over clarity precisely in those situations where a loss of information is either not likely or not crucial i.e. in familiar or relaxed settings.(1)

A hypoarticulate speech derivation involves a careful speech form to which one or more optional lenition processes have applied. An optional process is thus one whose application is not essential to the pronounceability of a form i.e. whose input and output are both pronounceable by native speakers of the language. Compare this with the obligatory processes of Chapters 2 and 3 without whose application the pronunciation of underlying forms would be difficult if not impossible for native speakers of Japanese.(2)

Hypoarticulate speech is important in NP as a source of evidence for both the existence and the interaction of processes. This is in sharp contrast with GP where such phenomena are relegated either to 'performance' or to 'low level phonetics' with the implication that they are of little theoretical interest to phonologists. In the NP view ignoring surface alternations deprives us of some important insights into underlying structure since the same processes often have both phonetic and phonological consequences. Stampe (1973) cites the example in English of stop voicing assimilation after /s/ in hypoarticulate speech which converts e.g. [its bin] 'it's been' into [itspin]. This process governs not only surface alternations but also lexical representation, preventing such sequences as */sb, sd, sg/ while allowing /sp, st, sk/. (Cf. §4.3.1 for Japanese examples involving obstruent voicing assimilation.)

4.1 Vowel Lenitions

The following types of vowel weakening in hypoarticulate speech will be discussed: unvoicing, total assimilation (deletion), coalescence, syllabicity reversal, height assimilation, and shortening.

4.1.1 Vowel unvoicing

A striking characteristic of Japanese pronunciation is the frequent occurrence of voiceless vowels. These result from the application of a vowel unvoicing process which is conditioned by a number of factors the most important of which is that the consonants surrounding the vowel (or preceding it if the vowel is followed by #) must be voiceless. In Japanese such consonants are limited to the three stops [p, t, k], the two affricates [ts, ts̆] and the five fricatives [s, s̆, ɸ, ç, h] e.g.

1. /suki/ [su̥ki] 'likeable'
2. /sita/ [s̆i̥ta] 'below'
3. /hito/ [çi̥to] 'man'
4. /huta/ [ɸu̥ta] 'cover'
5. /tuku/ [tsu̥ku] 'to attach'
6. /titi/ [ts̆i̥ts̆i] 'father'
7. /pika/ [pi̥ka] 'flash'
8. /kusi/ [ku̥s̆i] 'comb'
9. /kita/ [ki̥ta] 'came'
10. /desu/ [desu̥] 'is'

Another important condition is the sonority of the vowel, the less sonorous high vowels being more likely to unvoice than the more sonorous lower ones e.g. the unvoiced non-high vowels in

11. [ko̥koro] 'heart'
12. [to̥koro] 'place'
13. [ke̥kkoo̯] 'all right'
14. [atḁtakai] 'warm'
15. [hḁha] 'mother'
16. [wakḁkatta] 'understood'

occur under more restricted conditions than those governing the unvoicing of the high vowels in 1-10, e.g. a faster rate of speech. According to Martin (1952 p. 14), when unvoicing of non-high vowels occurs 'it is usually of the initial syllable and coinciding with the repetition of the same vowel in the following syllable...' Identity of the preceding and following consonant would also seem to be a contributory factor.

In a study of vowel unvoicing based on spectrographic data, Han (1962) proposed inherent vowel length as a conditioning factor in vowel unvoicing. She found that other things being equal the inherent length of the five Japanese vowels differed non-distinctively as follows with the shortest vowel /u/ representing a ratio of one:

/u/ /i/ /o/ /e/ /a/

The correlation between vowel height, inherent length, and sonority suggests that a single feature may be involved here.

Other significant variables studied by Han were: the effect of tempo (the faster the tempo the more unvoicing), pitch accent (low-pitched vowels are more likely to unvoice than high-pitched ones) and the manner of articulation of neighboring sounds (fricatives, affricates, and stops in that order are more likely to cause unvoicing). Thus she found that in comparing

17. ki̥kukoto desu   'is to listen'
18. kiku̥koto desu   'is with Kikuko'

it is the first vowel in (17) that is unvoiced while in (18) it is the second. The difference is due to the difference in pitch level. On the other hand in

19. si̥kuto itta   'he said "four by nine"'

the first vowel is unvoiced regardless of the high pitch, showing the overriding effect of the fricative manner of articulation.

Given the complexity of the unvoicing phenomenon it is not surprising that a satisfactory formalization of the process has not been arrived at. The usual formulation is more or less as follows:

[ + high ]   →   [ - voi ]   /   [ - voi ] ___ { [ - voi ], # }

which ignores almost as much as it captures. After an essentially similar formulation (his Rule 26 p. 127) McCawley (1968) describes an additional complication:

'I am unable to state the exact form of the rule, which will be considerably more complicated than the above due to the fact that when several consecutive syllables each contain a diffuse short vowel between voiceless consonants, only alternate vowels become voiceless. However, whether the first, third, fifth, etc., or the second, fourth, etc. vowels become voiceless depends on several factors such as which vowels are /i/'s and which /u/'s and what the consonants are.' (1968 p. 127)
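
The usual formulation, together with the alternation McCawley describes, can be sketched as follows. This is a rough Python illustration under stated assumptions, not a formalization of the process: it works on surface-like segment lists, uses ASCII stand-ins of my own ("S", "tS", "F", "C" for the alveopalatals and the voiceless bilabial and palatal fricatives), and resolves the alternation by a simple left-to-right scan. It ignores pitch, tempo, and manner of articulation, all of which the text shows to matter.

```python
VOICELESS = {"p", "t", "k", "ts", "tS", "s", "S", "F", "C", "h"}
VOWELS = {"a", "e", "i", "o", "u"}
HIGH = {"i", "u"}
RING = "\u0325"   # combining ring below, the mark for a voiceless vowel


def unvoice(segs):
    """Unvoice a high vowel between voiceless consonants or before a pause.

    A vowel is skipped when the preceding vowel was unvoiced, a crude
    left-to-right stand-in for the 'alternate vowels' restriction.
    """
    out = []
    prev_vowel_unvoiced = False
    for i, seg in enumerate(segs):
        if seg in VOWELS:
            before = i > 0 and segs[i - 1] in VOICELESS
            after = i == len(segs) - 1 or segs[i + 1] in VOICELESS
            if seg in HIGH and before and after and not prev_vowel_unvoiced:
                out.append(seg + RING)
                prev_vowel_unvoiced = True
                continue
            prev_vowel_unvoiced = False
        out.append(seg)
    return out


for word in (["s", "u", "k", "i"], ["tS", "i", "tS", "i"], ["d", "e", "s", "u"]):
    print("".join(unvoice(word)))
```

The three calls correspond to examples 1, 6, and 10 above. The left-to-right choice would wrongly predict the first vowel in example (18), where pitch decides the matter, which is one concrete way of seeing how much the usual formulation leaves out.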

In fact even the alternate syllable constraint on the application of unvoicing is relaxed in extremely lenited speech. As Han reports in her spectrographic study:

'If the sequence CV̥CV̥CV̥ is composed of voiceless fricatives and vowels as in /huhukuka/ 'Is it a complaint?' or /susukika/ 'Is it Susuki?' two successive vowels may be unvoiced. The sequence /huhuku/ is sometimes pronounced [ huhu̥ku ] or [ hu̥hu̥ku ] or at a faster tempo it is further reduced to [ h:h:ku ] or even to [ h: : ku ] in which the two vowels and the intervening /h/ are reduced to a mere durational feature.' (p. 39)

A solution to the problem stated above by McCawley lies in the recognition that processes apply implicationally and variably along axes defined by features present in the input and/or environmental conditions of the process. Using the notational device [!] (read 'especially') as proposed in (Donegan) Miller (1972) the vowel unvoicing process might be stated as follows:

[ ! higher ]   →   [ - voi ]   /   [ - voi, ! +strid ] ___ { [ + cons, - voi, ! +strid ], # }
with rate of articulation as an extralinguistic contributory factor.

Complete assimilation of voiceless vowels

Han's statement that voiceless vowels are 'reduced to a mere durational feature' has been called vowel loss or deletion by others (Martin 1952, Shevelov and Chew 1969). David Stampe has suggested to me that both on phonetic grounds and in the interest of a simpler universal phonology these deletions and indeed perhaps all deletions may be looked upon as complete assimilations of the 'deleted' segment to an adjoining one. In Japanese we might say that voiceless vowels are completely assimilated to an adjoining voiceless consonant e.g. /suki/ → [su̥ki] → [sski]. [ss] (or any sequence created by this process) will by convention maintain the original prosodic length. The following process is responsible:

VCL VOWEL ASSIM:   V [ - voi ]   →   Ci [ - voi ]   /   Cj [ - voi ] ___ { Ck [ - voi ], # }
(Where Ci = Cj or Ck)


1. /suki/ [su̥ki] [] 'likeable'
2. /sita/ [s̆i̥ta] [s̆i.ta] 'below'
3. /hito/ [çi̥to] [ç] 'man'
4. /huta/ [ɸu̥ta] [ɸ.ta] 'cover'
5. /tuki/ [tsu̥ki] [] 'moon'
6. /tikai/ [ts̆i̥kai] [ts̆.kai] 'near'
7. /kusai/ [ku̥sai] [k.sai] 'odorous'
8. /kisya/ [ki̥s̆a] [k.s̆a] 'train'
9. /ikutu/ [iku̥tsu] [ik.tsu] 'how much'
10. /akita/ [aki̥ta] [ak.ta] 'proper name'
11. /akiko/ [aki̥ko] [ak.ko] 'proper name'

There are a number of conditions which are not expressed in the above formulation of VCL VOWEL ASSIM. (1) Application is obligatory if the vowel is preceded by a fricative or affricate (i.e. a prolongable strident) as in examples 1-6. (2) The vowel assimilates to the segment most resembling it i.e. the most vowel-like of the adjacent consonants whether it be the preceding or the following one (compare items 1 and 7 where assimilation is to the continuant rather than the non-continuant). (3) Application is optional if condition (1) is not met as in examples 7-11.

Mergers due to VCL VOWEL ASSIM

The assimilation of the vowel between identical consonants as in example (11) results in mergers of the following type:

12a. /ikko/    [ikko] 'one thing'
12b. /ikuko/ [iku̥ko] [ikko] 'how many things'

Forms with different underlying high vowels may also merge where the vowels to be assimilated are preceded by a stop which is unreleased or released into a following stop or affricate:

13a. /kaki ka/ [kaki̥ka] [kakka] 'It's a persimmon?'
13b. /kaku ka/ [kaku̥ka] [kakka] 'Does he write?'
14a. /akitagawa/ [aki̥tagawa] [aktagawa] 'pr. noun'
14b. /akutagawa/ [aku̥tagawa] [aktagawa] 'pr. noun'
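
The mergers in 12-14 can be reproduced with a very small sketch. The Python below is illustrative only and deliberately crude: it collapses unvoicing and assimilation into a single step and models the outcome as loss of a high vowel between voiceless stops (the traditional 'vowel loss' description rather than the assimilation analysis argued for here), on plain ASCII phonemic strings with word spaces removed. The function name is hypothetical.

```python
import re


def vowel_loss_between_stops(form):
    """Drop i or u standing between two voiceless stops (p, t, k)."""
    return re.sub(r"(?<=[ptk])[iu](?=[ptk])", "", form.replace(" ", ""))


pairs = [("ikko", "ikuko"), ("kaki ka", "kaku ka"), ("akitagawa", "akutagawa")]
for a, b in pairs:
    print(a, b, vowel_loss_between_stops(a) == vowel_loss_between_stops(b))
```

Each pair comes out identical, which is the sense in which the underlying distinction between /i/ and /u/ (or between /u/ and nothing) is lost in these forms.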

Such mergers are resisted however if the first obstruent is released - the usual case if one of the obstruents adjacent to the voiceless vowel is continuant and the other is not:

15a. /sikoo̯/ [s̆i̥koo̯] [s̆.koo̯] 'contemplation'
15b. /syukoo̯/ [s̆u̥koo̯] [s̆.koo̯] 'manual work'
16a. /kisi/ [ki̥s̆i] [k's̆.s̆i] 'shore'
16b. /kusi/ [ku̥s̆i] [ks̆.s̆i] 'comb'
17a. /kisai̯/ [ki̥sai̯] [k's.sai̯] 'statement'
17b. /kusai̯/ [ku̥sai̯] [k's.sai̯] 'malodorous'

Ohso (1971) describes the [ s̆ ] before [ i ] as 'bright' and before [ u ] as 'dark.' Schane (1971) describes the [ s̆ ] before [u] as 'rounded.' My own impression is that the difference lies in the fact that the initial [ s̆ ] in (15a) is pronounced with the lips spread while that in (15b) is not. The difference between (16a) and (16b) is between a front and back /k/ plus lip spread in the former. It is clear in any case that some coloration acquired from the assimilated vowel is manifested during the release of the preceding consonant thus maintaining the contrast in the speech of some speakers. For some items /si/ and /syu/ are in free variation e.g. /syukudai/ 'homework' is also /sikudai/ for some speakers. The same is true of /dzi/ and /dzyu/ e.g. /sindzyuku/ 'Shinjuku' is /sindziku/ for some speakers.

4.1.2 Voiced vowel assimilation

Under certain conditions voiced vowels, especially high ones, may be totally assimilated to an adjacent voiced consonant segment. The environment requires a prolongable consonant adjacent to the vowel i.e. a nasal, fricative, or affricate preceding it or a nasal or fricative following it.


VCD VOWEL ASSIM:   [ ! high, ! lab ]   →   Ci [ + voi ]   /   { [ Cj ], # } ___ [ Ck ]
Conditions:   1. Cj or Ck must be voiced.
              2. Vcd Cj = nasal, fricative, or affricate
              3. Vcd Ck = nasal or fricative
              4. Ci = Cj or Ck

Assimilation is most likely between identical consonants, especially nasals:

1. /tanomimasita/    [ta.nom.mas̆.ta] 'asked'
2. /tanomu mo/    [] 'even asking'
3. /ani no/    [] 'my brother's'
4. /kinu no/    [] 'silken'
5. /doogu ga/ [dooŋu ŋa] [dooŋ.ŋa] 'tool (subj)'
6. /ogigaya/ [oŋiŋaya] [oŋ.ŋa.ya] 'place name'
7. /dzudzan/ [dzuzan] [dz.zan] 'careless'

If the adjacent consonants are dissimilar the assimilation seems more likely where one of them is voiceless (example (16) below seems more frequent than example (11)).

8. /umai/   [umai] [m.mai] 'tasty'
9. /unagi/   [unaŋi] [ŋi] 'eel'
10. /ugoku nai̯/   [uŋokunai̯] [ŋ.ŋo.ku.nai̯] 'not move'
11. /tadzuneru/      [] 'to inquire'
12. /sunawati/      [̆i] 'namely'
13. /kimitati/      [kim.ta.ts̆i] 'you (plural)'
14. /kokugo/   [kokuŋo] [ko.kŋ.ŋo] 'national language'
15. /hadzukasii̯/ [hazukasii̯] *[haz.ka.sii̯] [has.ka.s̆ii̯] 'bashful'
16. /dzutto/ [dzutto] *[] [] 'fully'
17. /dzitu/ [dz̆itsu] *[dz̆.tsu] [ts̆.tsu] 'truth'

In 15, 16, and 17 the last stage of the derivation is due to OBS VOICING ASSIM (cf. §4.3.1).

Even non-high vowels may be assimilated, especially if an identical vowel occurs in the following syllable:

18. /sono mama de/ [sono de] 'in that way'
19. /sono momo wa/ [sono wa] 'that peach'
20. /momo iro/ [sono iro] 'pink'
21. /nanadzyuu̯/ [̆uu̯] 'seventy'
22. /sono toki/ [son.toki] 'that time'
23. /anata/ [an.ta] 'you'

In some cases VCD VOWEL ASSIM creates inadmissible consonant sequences which obligatorily undergo further lenition:

24. /sono goro/ [sonoŋoro] *[sonŋoro] [soŋŋoro] 'at that time'
25. /nani ga/ [naniŋa] *[nanŋa] [naŋŋa] 'what (subject)'

The last stage of derivations (24) and (25) is due to OFFSET /n/ LENITION (cf. §).

Offset /r/ nasalization

VCD VOWEL ASSIM also applies to sequences involving the flap /r/ followed by a vowel and a homorganic (i.e. coronal) consonant e.g.

1. /tarinai̯/ [tarinai̯] *[tarnai̯] [tannai̯] 'lacks'
2. /wakaranai̯/ [wakaranai̯] *[wakarnai̯] [wakannai̯] 'don't understand'
3. /kurenai̯/ [kurenai̯] *[kurnai̯] [kunnai̯] 'don't receive'
4. /kuru no/ [kuruno] *[kurno] [kunno] 'going'
5. /kuru to/ [kuruto] *[kurto] [kunto] 'if (he) goes'
6. /nomeru to/ [nomeruto] *[nomerto] [nomento] 'if (he) can drink'

The following process governs the lexicon and also applies obligatorily to the inadmissible [rC] sequences created by VCD VOWEL ASSIM in the above derivations.

(P25)   OFFSET /r/ NAS:   [ + flap ]   →   [ - flap, + nas ]   /   ___   [ + cor ]
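
The two-step derivations in 1-6 can be sketched as follows. This is a minimal Python illustration under my own assumptions: forms are plain ASCII strings with word spaces removed, the first step is a crude stand-in for VCD VOWEL ASSIM that simply drops a vowel between /r/ and a following /n/ or /t/ (the only following consonants in the examples), and the function names are hypothetical.

```python
import re


def vcd_vowel_assim_after_flap(form):
    """Crude stand-in: drop a vowel between r and a following n or t."""
    return re.sub(r"r[aeiou](?=[nt])", "r", form.replace(" ", ""))


def offset_r_nas(form):
    """OFFSET /r/ NAS: r becomes n before a coronal consonant."""
    return re.sub(r"r(?=[ntdsz])", "n", form)


def derive(form):
    """Return the careful form, the starred intermediate, and the output."""
    careful = form.replace(" ", "")
    intermediate = vcd_vowel_assim_after_flap(careful)
    return [careful, intermediate, offset_r_nas(intermediate)]


for item in ("tarinai", "wakaranai", "kuru to"):
    print(derive(item))
```

The middle member of each result corresponds to the starred form in the derivations above, i.e. the inadmissible [rC] sequence to which the nasalization process applies obligatorily.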

4.1.3 Syllabicity reversal

The process SYL REV whose diachronic consequences were described in § also applies synchronically in hypoarticulate speech. The process is restated below for convenience:

(P6)   SYL REV:   [ + syll, - cons, n ht ] [ - syll, - cons, m ht ]   →   [ - syll, - cons ] [ + syll, - cons, + long ]
Condition: m not higher than n

In the following hypoarticulate speech derivations involving SYL REV unstarred forms are either attested or predicted to occur. Some of these forms show counterfeeding(3) by SYL REV of processes which govern lexical representation and careful speech, e.g. in 1-5, 8, 9 the intermediate forms with consonant-labial glide sequences are sometimes heard. Also initial labial glides as in 6-7 are not fronted or deleted (cf. § for the constraints on the occurrence of labial glides in the lexicon). As for the palatal glides POST CONS PAL GLIDE DEL (cf. §) applies in 10, 13, and 14, but PRE VOC PAL GLIDE DEL does not apply (i.e. is counterfed by SYL REV) in 11 (cf. § for the constraints on the occurrence of palatal glides in the lexicon).

1. /atui̯/  [atsui̯] [atswii̯] [atsii̯] 'hot'
2. /samui̯/  [samui̯] [samwii̯] [samii̯] 'cold'
3. /natue̯/  [natsue̯] [natswee̯] [natsee̯] 'girl's name'
4. /ai̯tu wa/  [ai̯tsu a] [ai̯tswaa̯] [ai̯tsaa̯] 'he'
5. /ai̯tu o/  [ai̯tsu o] [ai̯tswoo̯] [ai̯tsoo̯] 'him'
6. /ui̯mago/    [wii̯mago] 'first grandchild'
7. /uo̯itiba/    [woo̯its̆iba] 'fish market'
8. /soto wa/ [soto a] *[soto̯aa̯] [sotwaa̯] [sotaa̯] 'outside'
9. /soko e/  *[soko̯ee̯] [sokwee̯] [sokee̯] 'there'
10. /tie̯ko/  [ts̆ie̯ko] *[ts̆yeeko] [ts̆eeko] 'girl's name'
11. /mie̯ru/    [mie̯ru] [myee̯ru] 'to be able to see'
12. /miu̯ra/    [miu̯ra] [myuu̯ra] 'proper name'
13. /nio̯i/  [nio̯i] *[ɲyoo̯i] [ɲoo̯i] 'odor'
14. /sia̯wase/  [s̆ia̯wase] *[s̆yaa̯wase] [s̆aa̯wase] 'happiness'

These examples furnish evidence of the relative persistence (resistance to suppression) of processes. The post-consonantal labial glides in 1-5, 8, and 9 are less frequent and less salient to native speakers than the initial glide-vowel sequences in 6, 7, and 8. This may be due to the preference for canonical CV syllables which the former examples violate and the latter examples support.
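
The SYL REV step itself, including its height condition, can be sketched in a few lines. The Python below is an illustration under my own simplifications: forms are ASCII strings in which any vowel directly following another vowel is taken to be the non-syllabic one, three height levels are assumed (a < e, o < i, u), front vowels glide to y and back rounded vowels to w, and identical vowel sequences are left alone. Later glide deletions, as in [atswii̯] → [atsii̯], are not modelled.

```python
HEIGHT = {"a": 1, "e": 2, "o": 2, "i": 3, "u": 3}
GLIDE = {"i": "y", "e": "y", "u": "w", "o": "w"}   # /a/ has no glide here


def syl_rev(form):
    """V1 V2 -> glide(V1) + long V2, provided V2 is not higher than V1."""
    out = []
    i = 0
    while i < len(form):
        v1 = form[i]
        v2 = form[i + 1] if i + 1 < len(form) else ""
        if v1 in GLIDE and v2 in HEIGHT and v2 != v1 and HEIGHT[v2] <= HEIGHT[v1]:
            out.append(GLIDE[v1] + v2 * 2)   # syllabicity moves to V2, which lengthens
            i += 2
        else:
            out.append(v1)
            i += 1
    return "".join(out)


for item in ("atsui", "natsue", "sotoa", "mieru", "koi"):
    print(item, syl_rev(item))
```

The last item shows the condition at work: in "koi" the second vowel is higher than the first, so the sketch leaves the form unchanged.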

4.1.4 Vowel Coalescence

In §2.3.1 allusion was made to a process which shifts a vowel into the same syllable as a preceding vowel. The following process is responsible:

(P26)   VOWEL COALESCENCE:   V.V   →   VV̯

Where the vowels are identical the effect is to convert a vowel sequence into a long vowel e.g.

1. /hako o/ [hakoo̯] 'box (acc)'

Mergers may result:

2a. /su+uri/ [suu̯.ri] 'vinegar vendor'
2b. /suu̯+ri/ [suu̯.ri] 'mathematical principle'
3a. /sato+oya/ [sa.too̯.ya] 'foster parent'
3b. /satoo̯+ya/ [sa.too̯.ya] 'sugar salesman'

Where the vowels are not identical the result of vowel coalescence is an off-gliding diphthong e.g.

4. /tada+ima/ [ta.dai̯.ma] 'right now'

4.1.5 Vowel Shortening

Long vowels may become short. The following process is responsible:

(P27)   VOWEL SHORTNG:   Vi V̯i   →   Vi

The process appears to apply most frequently in word final position as in the following examples:

1. /gakkoo̯/ [gakkoo̯] [gakko] 'school'
2. /hontoo̯/ [hõntoo̯] [hõnto] 'true'
3. /sensee̯/ [sẽẽsee̯] [sẽẽse] 'teacher'
4. /too̯kyoo̯/ [too̯kyoo̯] [too̯kyo] 'Tokyo'
5. /arigatoo̯/ [arigatoo̯] [arigato] 'thanks'
6. /ohayoo̯/ [ohayoo̯] [ohayo] 'good morning'

As Martin (1959) notes, shortening is particularly common in words where the preceding syllable is long as in (1)-(4). Long vowels in high frequency expressions as in (5) and (6) are also frequently shortened. Long vowels which arise from the compensatory lengthening associated with syllabicity reversal also frequently undergo shortening as in the final stage of the following derivation:

7. /kore wa/ [kore ya] [kore a] [koryaa̯] [korya] 'this (topic)'

4.1.6 Height Assimilation

The process PROGR HT ASSIM whose diachronic consequences were described in § also applies synchronically in hypoarticulate speech.

PROGR HT ASSIM:   [ n ht ]   →   [ n-1 ht ]   /   [ m ht ]   ___
Condition: n higher than m


1. /dai̯tai̯/ [dai̯tai̯] [dae̯tae̯] 'generally'
2. /kau̯ntaa̯/ [kau̯ntaa̯] [kao̯ntaa̯] 'counter'

Mergers may result, e.g.

3. /hai̯/ [hai̯] [hae̯] 'yes'
4. /hae̯/   [hae̯] 'fly'

4.2 Glide Lenitions

4.2.1 Glide fronting

In addition to the lexicon-governing subprocesses of GLIDE FRONTING which assure */wi, we/ (cf. §), the process applies more generally in hypoarticulate speech between a preceding palatal vowel and a following /a/, e.g.

1. /i wa sinai̯/ [i ya s̆inai̯] 'it is not found'
2. /kore wa/ [kore ya] 'this (topic)'

The application of this subprocess is most frequent where the topic marker /wa/ is involved.

4.2.2 Labial glide deletion

In addition to the lexicon-governing subprocesses of LAB GLIDE DEL which assure */wu, wo/ (cf. §), the process applies more generally in hypoarticulate speech between a preceding non-palatal vowel and a following /a/.(4)


1. /uwagi/ [uagi] 'overcoat'
2. /owari desu ka/ [o ari des ka] 'are you finished?'
3. /sunawati/ [sunaats̆i] 'that is to say'

Example (2) represents a merger with the sentence /o ari desu ka/ 'Does it exist?'. In §2.3.3 these utterances were shown to merge in the opposite direction due to the application of the fortition process GLIDE EPENTHESIS.

4.2.3 Palatal glide deletion

In addition to the lexicon-governing subprocesses of PRE VOC PAL GLIDE DEL which assure */yi, ye/ (cf. §), there is a process POST VOC PAL GLIDE DEL which applies optionally in hypoarticulate speech between a preceding palatal vowel and a following non-palatal vowel:

(P28)   POST VOC PAL GLIDE DEL:   Y   →   Ø   /   [ + syll, + pal ]   ___   [ + syll, - pal ]


1. /iyoo̯/   [ioo̯] 'Let's stay'
2. /uti ni yoru/   [uts̆i ni oru] 'call at home'
3. /heya/   [hea] 'room'
4. /kore wa/ [kore ya] [kore a] 'this (topic)'

Example (1) represents a merger with /ioo̯/ 'Let's say' and (2) with /uti ni oru/ 'be at home'. Mergers in the reverse direction would result from the application of GLIDE EPENTHESIS (cf. §2.3.3).

4.2.4 Post consonantal palatal glide deletion

In addition to the subprocess of POST CONS PAL GLIDE DEL which bars a palatal glide after non-anterior coronals, e.g. */s̆ya/ and *[ s̆ya ], the process applies more generally in hypoarticulate speech after all coronals (i.e. including after [ r ], which is [ + anterior ]), e.g.

/sore wa/   →   sore ya   →   sore a   →   sorya   →   sora   'that (topic)'
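The chain of glide lenitions in this derivation can be illustrated with a small Python sketch. It is a rough approximation under stated assumptions, not a formalization of the processes: each process is reduced to one regular-expression rewrite, forms are written in plain ASCII with a space at the word boundary, SYL REV is modeled without compensatory lengthening, and every process is applied once in a fixed order even though the text treats them as optional. The names `PROCESSES` and `derive` are hypothetical.

```python
import re

# (name, pattern, replacement): one simplified rewrite per process.
PROCESSES = [
    # w -> y between a palatal vowel and a following /a/
    ("GLIDE FRONTING", r"([ie])( ?)w(a)", r"\1\2y\3"),
    # y -> zero between a palatal vowel and a non-palatal vowel
    ("POST VOC PAL GLIDE DEL", r"([ie])( ?)y([aou])", r"\1\2\3"),
    # a palatal vowel before /a/ becomes a glide after a consonant
    ("SYL REV", r"([^aiueo ])[ie] ?a", r"\1ya"),
    # y -> zero after a coronal (here including r)
    ("POST CONS PAL GLIDE DEL", r"([tdsznr])y", r"\1"),
]


def derive(form):
    """Return the list of stages obtained by applying each process in order."""
    stages = [form]
    for _name, pattern, repl in PROCESSES:
        form = re.sub(pattern, repl, form)
        stages.append(form)
    return stages


print(" -> ".join(derive("sore wa")))
```

Stopping the loop early yields the intermediate forms, which is one way to picture the optionality of the later steps in hypoarticulate speech.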

4.3 Consonant lenitions

4.3.1 Obstruent voicing assimilation

The application of VCD VOWEL ASSIM (cf. §4.1.2) feeds the process:

(P29)   OBS VOICING ASSIM:   [ - son ]   →   [ - voi ]   /   ___   [ - son, - voi ]
(Note: this formulation means the affected voiced obstruent may either precede or follow the voiceless obstruent)


1. /hadzukasii/ [hazukas̆ii̯] *[hazkas̆ii̯] [haskas̆ii̯] 'embarrassed'
2. /dzitu/ [dz̆itsu] *[dz̆tsu] [ts̆tsu] 'truth'
3. /sugoi/ [sugoi] *[s.goi] [s.koi] 'terrific'

This process also governs the lexicon preventing obstruent sequences with mixed voicing e.g. */sz, zs, kg, dt/.
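The mirror-image character of OBS VOICING ASSIM noted under (P29) can be sketched as follows. This is a minimal illustration under simplifying assumptions: forms are plain ASCII strings in which vowel devoicing and deletion have already applied, only the plain obstruents b, d, g, z and p, t, k, s are handled (affricates and [ s̆ ] are left out), and an optional syllable dot may separate the two obstruents. The names `DEVOICE` and `obs_voicing_assim` are hypothetical.

```python
import re

# Voiced obstruents and their voiceless counterparts (simplified inventory).
DEVOICE = {"b": "p", "d": "t", "g": "k", "z": "s"}


def obs_voicing_assim(form):
    """Devoice a voiced obstruent adjacent to a voiceless one, on either side."""
    # Voiced obstruent immediately before a voiceless obstruent.
    form = re.sub(r"([bdgz])(?=\.?[ptks])",
                  lambda m: DEVOICE[m.group(1)], form)
    # Voiced obstruent immediately after a voiceless obstruent (mirror image).
    form = re.sub(r"(?<=[ptks])(\.?)([bdgz])",
                  lambda m: m.group(1) + DEVOICE[m.group(2)], form)
    return form


print(obs_voicing_assim("hazkasii"))  # cf. example 1, with s for [ s̆ ]
print(obs_voicing_assim("s.goi"))     # cf. example 3
```

Two passes are used because the affected obstruent may stand on either side of the voiceless one; an intervocalic voiced obstruent, as in the careful form, is left untouched.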

4.3.2 Regressive obstruent assimilation

In § mention was made of a process which regressively assimilates [ t ] to a following [ p, t, k, s ] producing geminates [ pp, tt, kk, ss ]. This process governs the lexicon ruling out non-geminate obstruent sequences where [ t ] is the first member.

(P30)   REGR OBS ASSIM:   [ + cor, - cont, - voi ]   →   [ α F ]   /   ___   [ - son, - voi, α F ]

It also applies optionally in hypoarticulate speech, e.g.

1. /gotisoo/ [gots̆isoo] [gots̆.soo] [gosso] 'dainty'
2. /natuko/ [natsuko] [nats.ko] [nakko] 'female name'
3. /mitiko/ [mits̆iko] [mits̆.ko] [mikko] 'female name'
4. /butukaru/ [butsukaru] [] [bukkaru] 'to hit (int)'


(1) In §2.3.3 examples (3) and (4) were shown to merge in the opposite direction under fortition due to GLIDE EPENTHESIS. Cf Donegan and Stampe 1978 p. 143 for numerous examples in English of such contradictory mergers due on the one hand to fortition and on the other to lenition processes.
(2) As noted in Chapter 3 even underlying forms must be pronounceable but not necessarily by speakers of the language involved. The pronounceability of underlying forms follows from the NP stricture against specifications in underlying representation which are archiphonemic or which represent phonetically impossible combinations.
(3) If SYL REV counterfeeds e.g. POST CONS LAB GLIDE DEL then the latter process does not apply to the output of SYL REV even though its input conditions are met. Cf. Donegan and Stampe (1979) for a discussion of counterfeeding and other ordering constraints on processes. Although historically SYL REV did not counterfeed GLIDE FRONTING, LAB GLIDE DEL or PAL GLIDE DEL in the diachronic developments outlined in §, it appears to do so in the synchronic grammar.
(4) The existence of this subprocess gives rise to the historically inaccurate backformation /bawai/ 'case' for many speakers alongside /baai/ the historically attested and also occurring form.




5.0 Modern Loan Words

There is a class of words in Japanese about whose pronunciation there is a lack of agreement among native speakers. These are the modern loan words, most of which have come from European languages within the last century.

5.1 Conservative vs Innovating

Bloch (1950) divides Japanese speakers into two groups – those who stick close to native sounds and sequences in pronouncing these loans and those who use 'sounds and combinations not present elsewhere in their speech' (p.330). Bloch labels the two pronunciations 'conservative' and 'innovating'. One of his examples is the English word 'film' which according to Bloch is variously pronounced in Japanese:

           I               II                III
'film'     [ çirumu ]      [ Φuirumu ]       [ Φirumu ]
           / hirumu /      / huirumu /       / Φirumu /

He calls the first two pronunciations 'conservative' because they involve sounds or sequences already present in the language and the third pronunciation 'innovating' because the sequence [ Φi ] does not occur except in these loans. For innovating speakers the labial allophone of /h/ before /u/ has, according to Bloch, become a phoneme in its own right contrasting with /h/ before other vowels. As further evidence of the new status of [Φ] he cites examples of its occurrence in other innovating (column III) pronunciations:

'fair'     [ hee̯a ]        [ Φuee̯a ]         [ Φee̯a ]
           / hee̯a /        / huee̯a /         / Φee̯a /
'foul'     [ hau̯ru ]       [ Φuau̯ru ]        [ Φau̯ru ]
           / hau̯ru /       / huau̯ru /        / Φau̯ru /
'fork'     [ hoo̯ku ]       [ Φuoo̯ku ]        [ Φoo̯ku ]
           / hoo̯ku /       / huoo̯ku /        / Φoo̯ku /
'fuse'     [ çyuu̯zu ]      [ Φuyuu̯zu ]       [ Φyuu̯zu ]
           / hyuu̯zu /      / huyuu̯zu /       / Φyuu̯zu /

5.2 Processes in NP

In NP processes play a key role in loan phonology. Stampe proposes two major mechanisms for assimilating foreign pronunciations from a source language (Ls) into a target language (Lt). 'One involves treating the foreign sound or sequence as if it were derived through native processes from forms which conform to the constraints this system imposes on underlying representations.' (Stampe 1972 p. 69).

In Stampe's example, faced with the foreign pronunciation [ŋujen] 'Nguyen', the English speaker, perceiving initial [ŋ-] as a derived sound, substitutes /ingujen/ or /ɨngujen/ via the backwards derivation:

/ ingujen / 
 nasal assimilation
 stop assimilation
 nasal monophthongization
[ ŋujen ] 

This derivation involves independently motivated context-sensitive processes of English and treats a lexically inadmissible element (initial /ŋ-/) as if it were derived from lexically admissible /ing/. Note that the motivation for this roundabout derivation is the existence of a context-free process ŋ → n which, according to Stampe, governs the lexicon in English and thus prevents a simple carrying over of VN [ŋujen] to Eng */ŋujen/.

Stampe's second major mechanism for assimilating foreign pronunciations is illustrated by the alternative pronunciation [nujen] used by some English speakers. Here, the English speaker hears [ŋujen] as such, but the ŋ → n process which governs lexical representation and which prompted the first speaker to derive [ŋ-] from /ing/ applies in his pronunciation of the word to give /nujen/.(1)

5.3 Stampe's analysis applied to 'conservative' and 'innovating'

Stampe's analysis provides the basis for a principled account of Bloch's conservative and innovating pronunciations. Faced with English 'film', 'fair', 'foul', 'fork', 'fuse' with lexically inadmissible [f], the conservative speaker will lexicalize them as /hirumu/, /hee̯a/, /hau̯ru/, /hoo̯ku/, /hyuu̯zu/ via the context-free process f → h which governs the Japanese lexicon. This accounts for the column I pronunciations in §5.1.

Alternatively the conservative speaker, identifying Eng [ f ] with Jpn [ Φ ] (the allophone of /h/ which occurs before /u/) perceives and pronounces Eng [ fi ], [ fe ], [ fa ], [ fo ], [ fyu ] as [ Φui ], [Φue], [ Φua ], [ Φuo ], [ Φuyu ] derived through native processes from lexically admissible /hui/, /hue/, /hua/, /huo/, /huyu/. This accounts for the column II pronunciations in §5.1.

At this point, contrary to Bloch, we can derive his innovating pronunciations (§5.1 column III) from these same column II lexical representations by means of a native process, SYLLABICITY REVERSAL (VV̯ → V̯V), which plays an important role in the language both diachronically (cf. §) and synchronically in the processing of hypoarticulate speech (cf. §4.1.3). Thus, many of what Bloch characterized as 'combinations not present elsewhere in their speech' are in fact found in casual speech. SYL REV weakens a vowel (in this case [u]) between a preceding consonant or # and a following vowel. The weakened [u], a labial glide, is then deleted by LAB GLIDE DEL (cf. §). Applied to the column II pronunciations in §5.1, SYL REV and LAB GLIDE DEL produce the following innovating pronunciations:

                             Column II                                Column III
'film'     / hui̯rumu /      [ Φui̯rumu ]    →   ? [ Φwirumu ]    →   [ Φirumu ]
'fair'     / huee̯a /        [ Φuee̯a ]      →   ? [ Φwee̯a ]      →   [ Φee̯a ]
'foul'     / hua̯uru /       [ Φua̯uru ]     →   ? [ Φwau̯ru ]     →   [ Φau̯ru ]
'fork'     / huoo̯ku /       [ Φuoo̯ku ]     →   ? [ Φwoo̯ku ]     →   [ Φoo̯ku ]
'fuse'     / huyuu̯zu /      [ Φuyuu̯zu ]    →   * [ Φwyuu̯zu ]    →   [ Φyuu̯zu ]

On this analysis Bloch's innovating pronunciation (column III) and one type of conservative pronunciation (column II) are different stages in a single derivation based on a uniform lexical representation. Their relationship is thus analogous to that between careful and hypoarticulate speech forms characterized in Chapter 4. Such a treatment was not open to Bloch whose structural phonemic model contained no mechanism for integrating careful and hypoarticulate speech forms in a single derivation. Ironically, a careful pronunciation of a foreign form (probably fairly frequent since it is foreign) will show more distortion of the Ls original than a casual one.
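The single-derivation view of columns II and III can be made concrete with a short Python sketch. It is illustrative only and rests on several assumptions: forms are written without length or glide diacritics, SYL REV is reduced to turning [u] into [w] between a consonant (or word boundary) and a following vowel, and LAB GLIDE DEL is restricted here to the position after [Φ], which is all the examples above require. The function names are hypothetical.

```python
import re


def syl_rev(form):
    """Simplified SYL REV: u -> w between a consonant (or #) and a vowel."""
    return re.sub(r"(^|[^aiueo])u(?=[aieo])", r"\1w", form)


def lab_glide_del(form):
    """Simplified LAB GLIDE DEL: delete w after the labial fricative."""
    return re.sub(r"(?<=Φ)w", "", form)


def column_iii(column_ii):
    """Derive a column III form from a column II form in two steps."""
    return lab_glide_del(syl_rev(column_ii))


for word in ["Φuirumu", "Φueea", "Φuauru", "Φuooku"]:
    print(word, "->", syl_rev(word), "->", column_iii(word))
```

On this picture the column II form is simply the stage at which neither optional process has applied, so no separate lexical entry is needed for column III.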

5.4 Two Borrowing Strategies

The foregoing NP analysis invites a relabelling of Bloch's various pronunciations. There appear to be basically two borrowing strategies involved resulting in two types of pronunciations of loan words – one oriented toward lexical admissibility in Lt, the other toward phonetic resemblance to the Ls form.

Let us label the Lt-oriented strategy the conservative one because the result (Bloch's column I pronunciation) is more native and less concerned with the phonetics of Ls.

The Ls-oriented strategy encompassing Bloch's column II and III pronunciations is the innovating one since it is relatively more concerned with approximating Ls phonetics. The innovating strategy produces a continuum from relatively less innovating (column II) to relatively more innovating (column III) pronunciations. This is the point at which language change may take place. Speakers who invariably produce forms at the more innovating end of the continuum (i.e. column III) may restructure the lexical item making the necessary changes in their phonologies (e.g. suppression of Φ → h to accommodate the new lexical representation /Φirumu/) recognized by Bloch.

5.5 [ wo, we, wi ]

These lexically inadmissible sequences are lexicalized as /uo, ue, ui/ by conservative speakers and pronounced [ uo, ue, ui ] e.g.

'water'    / uoo̯taa̯ /    [ uoo̯taa̯ ]
'waiter'   / uee̯taa̯ /    [ uee̯taa̯ ]
'wit'      / uitto /      [ uitto ]

Innovating speakers have [ woo̯taa̯ ], [ wee̯taa̯ ], and [ witto ] via SYL REV which counterfeeds GLIDE FRONTING and LAB GLIDE DEL (cf. §4.1.3). If restructuring to /woo̯taa̯/, /wee̯taa̯/, and /witto/ takes place then GLIDE FRONTING, which previously governed the lexicon, is suppressed thus allowing /wi, we/, and LAB GLIDE DEL, which also governed the lexicon, is limited to apply only before the high vowel thus allowing /wo/. Post-consonantal [ wi ] is also borrowed, e.g.

'quick'                 / kuikku / → [ kuikku ] → [ kwikku ]
'the twist' (a dance)   / tuisuto / → [ tsuis:to ] → [ tswis:to ]

The latter example provides evidence of the strong identification of Ls [w] before /i/ with Lt /u/ since it is borrowed as /u/ even at the cost of introducing unwanted affrication on the preceding /t/. Note that final [-t ] in this same example (and in virtually all Ls words) is borrowed as [-to] presumably to avoid affrication, but where a post-consonantal [w] is sought this general strategy is abandoned.

5.6 [ye] [C'e]

These lexically inadmissible sequences have been borrowed in various ways. The following pronunciations are attested:

               I               II             III
'yes'          [ esu ]         [ iesu ]       [ yesu ]
'yellow'       [ eroo̯ ]        [ ieroo̯ ]      [ yeroo̯ ]
'Yale'         [ ee̯ru ]        [ iee̯ru ]      [ yee̯ru ]
'shepherd'     [ sepaa̯do ]                    [ s̆epaa̯do ]
'jelly'        [ dzerii̯ ]                     [ dz̆erii ]
'cello'        [ sero ]                       [ ts̆ero ]
'chainstore'                                  [ ts̆ee̯nsutoa ]
'check'                                       [ ts̆ekki ]

The column I pronunciations are due to PAL GLIDE DEL or DEPAL which govern the lexicon of the conservative speaker. This is analogous to the strategy employed by conservative speakers who borrow 'film' as /hirumu/.

The column II pronunciations result from lexicalizations /iesu/ etc. – a strategy analogous to that employed by speakers who borrow 'film' as /huirumu/.

At this point we might propose deriving the column III pronunciations from column II by optional application of SYL REV. The data support such an analysis for 'yes', 'yellow', and 'Yale'. However there do not appear to be any attested column II pronunciations of Ls [C'e] in the last five examples. Though native speakers I have queried feel that forms like [s̆iepaado] etc. are theoretically possible, none has ever heard or used them. This holds true for other [C'e] loans as well, e.g.

'shaker'    *[ s̆iekaa ]       [ s̆eekaa ]
'shell'     *[ s̆ieru ]        [ s̆eru ]
'jet'       *[ dz̆ietto ]      [ dz̆etto ]
'general'   *[ dz̆ieneraru ]   [ dz̆eneraru ]

With no intermediate forms attested I will assume restructuring has taken place and Ls [C'e] is Jpn /Cye/ (cf. /Cya, Cyo, Cyu/ in §). This entails limiting PAL GLIDE DEL to apply only before high palatal vowels, a development which parallels the limitation on LAB GLIDE DEL to allow /wo/. This yields the following innovating derivations for column III pronunciations:

'shepherd'     / syepaa̯do /       →   *[ s̆yepaado ]        →   [ s̆epaado ]
'jelly'        / dzyerii̯ /        →   *[ dz̆yerii ]         →   [ dz̆erii ]
'chainstore'   / tyee̯nsutoa̯ /     →   *[ ts̆yee̯nsutoa̯ ]     →   [ ts̆eens:toa̯ ]
'check'        / tyekki /         →   *[ ts̆yekki ]         →   [ ts̆ekki ]

On this analysis, we can still maintain our two borrowing strategies, i.e. the Lt-oriented one (§5.1 column I) and the Ls-oriented one (§5.1 column III). Except for [sero] 'cello' there are no attested column I pronunciations for Ls [ts̆e] sequences. In general I believe that conservative speakers are now using the same pronunciation as innovating speakers for this sequence. The strategy underlying the lone and somewhat old fashioned [sero] nativization may have historical implications. Arisaka (1957) claims that ModJ sibilants /s/ and /z/ (our /dz/) come from earlier affricates. In NP terms a CF process Affricate → Fricative, suppressed at an earlier time thus allowing /ts, dz/, now applies in part (i.e. the voiceless part) ruling out /ts/. The conservative item /sero/ 'cello' (from Eng [ts̆elo]) might be due to the application of this context-free process (along with PAL GLIDE DEL which also governs the lexicon) to yield /sero/ rather than the inadmissible /tsyero/. Conservative [ dzerii ] 'jelly' (rather than [ zerii ]) shows that the voiced portion of the process remains suppressed.

5.7 Three categories of processes

We have identified three categories of processes in the grammar:

  1. Processes that govern lexical representation barring certain segments (via context-free processes) or sequences (via context-sensitive processes) from the lexicon.
  2. Processes that obligatorily govern derived structure typically applying to underlying representation to give careful speech forms.
  3. Processes that optionally govern derived structure typically applying to careful speech forms to give hypoarticulated forms.

In borrowing, a conservative speaker will lexicalize Ls pronunciations in terms of either (1) or (2). An innovating speaker may also make use of (3).

5.8 [ ti, tu, tyu ]

So far our analysis of foreign assimilations has involved lexically inadmissible Ls sequences e.g. [ fi, wi, ye ]. Innovating speakers also borrow Ls sequences that are lexically admissible but do not occur on the surface in Lt, i.e. Ls [ ti, tu, tyu ]. These sequences are lexically admissible in Japanese but normally would obligatorily undergo the application of certain derivational processes to yield [ ts̆i, tsu, ts̆u ] on the surface. These latter are in fact the pronunciations used by conservative speakers, and on the basis of this we will assume that the Ls phonetic sequences are simply taken as lexical representations and are then subjected to whatever derivational processes apply. Innovating speakers, in order to approximate the Ls pronunciation, presumably suspend the derivational processes yielding a surface pronunciation identical with the lexical representation. This is a different strategy from that proposed in §5.3 - §5.5 where innovating speakers apply casual speech processes to approximate Ls pronunciations. In the present case there do not appear to be any casual speech processes by which to derive [ ti, tu, tyu ].


'party'   / paatii /       [ paats̆ii ]      [ paatii ]
'two'     / tuu /          [ tsuu ]         [ tuu ]
'tulip'   / tyuurippu /    [ ts̆uu̯rippu ]    [ tyuurippu ]

The lexical entries of innovating speakers will contain a diacritic showing they are exempt from the application of the relevant derivational processes i.e. STOP AFF, PAL, ALVPALADJ, and PAL GLIDE DEL.(2)
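The role of such a diacritic can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the combined effect of STOP AFF, PAL, ALVPALADJ, and PAL GLIDE DEL on /t/ is collapsed into three string replacements, vowel length marks are omitted, and the diacritic is modeled as a boolean flag. The names `SH`, `obligatory_processes`, and `pronounce` are hypothetical.

```python
SH = "s\u0306"  # s with combining breve, standing in for the text's [ s̆ ]


def obligatory_processes(lexical):
    """Rough stand-in for the obligatory derivational processes affecting /t/."""
    form = lexical.replace("tyu", "t" + SH + "u")  # /tyu/ -> [ ts̆u ]
    form = form.replace("ti", "t" + SH + "i")      # /ti/  -> [ ts̆i ]
    form = form.replace("tu", "tsu")               # /tu/  -> [ tsu ]
    return form


def pronounce(lexical, exempt=False):
    """Conservative speakers derive the surface form; an exempt entry surfaces unchanged."""
    return lexical if exempt else obligatory_processes(lexical)


for entry in ["paatii", "tuu", "tyuurippu"]:
    print(entry, pronounce(entry), pronounce(entry, exempt=True))
```

Both kinds of speaker share the same lexical representation in this sketch; only the flag on the entry differs, which is the point of treating the innovation as a suspension of derivational processes rather than as a change in the lexicon.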

5.9 [ di, du, dyu ]

Since the voiced subprocess of STOP AFF governs the lexicon (cf. §), these sequences will be lexicalized by conservative speakers as /dzi, dzu, dzyu/ and pronounced [ dz̆i, dzu, dz̆u ] via PAL, ALVPALADJ, and PAL GLIDE DEL, e.g.

'diesel'    / dziidzeru /   [ dz̆iizeru ]
'drawers'   / dzuroosu /    [ dzuroosu ]
'duet'      / dzyuetto /    [ dz̆uetto ]

Innovating speakers, in order to render admissible the pronunciations [di, du, dyu], must lexicalize the above sequences as /di, du, dyu/ by suppressing STOP AFF so that it no longer governs the lexicon. They must also suspend the obligatory derivational processes PAL, ALVPALADJ, and PAL GLIDE DEL, e.g.

'diesel'    / dii̯dzeru /   [ dii̯zeru ]
'laundry'   / randuri /    [ randuri ]
'duet'      / dyuetto /    [ dyuetto ]

5.10 Conservative borrowing strategy

At this point we may characterize the conservative (Lt oriented) borrowing strategy in the following NP terms:

(1) All Ls phonetic forms are lexicalized via unaltered native processes (i.e. no foreign segments or sequences are put in the lexicon).

(2) Ls phonetic segments which are lexically inadmissible in Lt are lexicalized via CF processes that govern the lexicon, e.g.

'film'       / hirumu /     via f → h
'violin'     / bai̯orin /    via v → b
'shepherd'   / sepaado /    via C' → C

(3) Ls phonetic sequences that are lexically inadmissible in Lt are lexicalized via context-sensitive processes that govern the lexicon, e.g.

'yes'   / esu /   via PAL GLIDE DEL

(4) Ls phonetic sequences that are lexically admissible in Lt are so lexicalized and undergo applicable obligatory derivational processes, e.g.

'sea'   / si / → [ s̆i ] via PAL
'tea'   / ti / → [ ts̆i ] via PAL, STOP AFF

(5) Ls phonetic sequences which match Lt surface sequences are lexicalized so as to yield those surface sequences via appropriate derivational processes, e.g.

'she'      [ s̆i ] < / si / via PAL
'cheese'   [ ts̆iizu ] < / tiizu / via PAL, STOP AFF

None of the conservative strategies involve the suspension or alteration of native processes.

5.11 Innovating borrowing strategy

The innovating or Ls-oriented borrowing strategy may be characterized in the following NP terms:

(1) Ls phonetic segments which are (lexically and superficially) inadmissible in Lt will not be admitted to the lexicon of Lt. This rules out segments such as /v/ which, as a study described in Neustupny (1978 p. 87) indicates, have only marginal status among some bilingual Japanese speakers. Haugen's observation concerning the Norwegian language in America is reaffirmed in Japanese loan phonology:

"In general all such new phonemes remain in a highly marginal position in the language structure. Many of them are limited to bilingual speakers, and the rest are limited to particular words and expressions." (Haugen 1953 p. 410.)

(2) Ls phonetic segments which are lexically inadmissible but superficially admissible (i.e. have allophonic status) may be admitted to the lexicon via suppression of a context-free process that governs the lexicon, e.g. /Φirumu/ by suppression of Φ → h.

(3) Ls phonetic sequences that are lexically inadmissible in Lt may be lexicalized via the limitation of context-sensitive processes that govern the lexicon, e.g. /syepaado/ 'shepherd' by limitation of PAL GLIDE DEL to before high vowels.

(4) Ls phonetic sequences that are lexically admissible but superficially inadmissible may be rendered pronounceable by suspension of derivational (sub)processes, e.g. Eng 'sea' Jpn /si/ → [ si ] by suspension of PAL.

5.12 Correlation of strategies with ease of assimilation

Given the various strategies involved in the innovative borrowing of Ls sequences it is tempting to attempt a correlation of these strategies with the relative ease of assimilation of the resulting sequences. Neustupny (1978) reports several Japanese studies on the relative ease of articulation and perception of various 'foreign' sequences by native speakers of Japanese.

One study, Kawakami (1963), contains the author's subjective judgements of the difficulty of pronouncing a number of foreign sequences. Confining ourselves to the sequences we have considered in this chapter, he found the following order of difficulty beginning with the easiest:

s̆e   dz̆e   ts̆e   ti   tu   di   du   d'u   Φa   Φi   Φe   Φo   wi   we

Another work (Uechi and Kanno 1961) studied the ease with which Japanese speakers could perceive the difference between certain 'innovating' sequences and their 'conservative' counterparts. The following scale was determined beginning with the easiest to distinguish from its counterpart (in parentheses):


Although both studies are open to several objections and interpretations they do agree remarkably well. The following correlation exists between their hierarchy of difficulty and the innovating strategies proposed in this chapter:

(easiest)          s̆e    limit processes that govern the lexicon
                   ti    suspend obligatory derivational processes
                   di    suspend lexicon governing process and obligatory derivational processes
                   Φa    apply optional derivational processes
(most difficult)   we    "

Neustupny (p. 88ff) seeks an explanation for the hierarchy of difficulty in terms of 'the degree of differentiation' of the Ls form from its closest native counterpart. 'Degree of differentiation' depends on whether the feature(s) by which the Ls and Lt forms differ are distinctive (i.e. phonemic) and, if so, on the frequency and generality with which the distinction is employed in the language (i.e. functional load), e.g. [ s̆e ] and [ dz̆e ] possess an almost native degree of distinctiveness because they are distinguished from their native counterparts [ se ] and [ ze ] by palatalization (and stridency) which is 'phonemic' in Japanese. The number of features by which they differ is also a factor, according to Neustupny, e.g. [ ti ] and [ di ] possess a higher degree of distinctiveness (from [ ts̆i ] and [ dz̆i ]) than [ tu ] and [ du ] (from [ tsu ] and [ dzu ]) because the former pair differ from the native syllables by lack of aspiration as well as palatalization while the latter pair differ only by affrication. Where the feature(s) by which they differ are nondistinctive (i.e. allophonic) the 'degree of differentiation' is less. According to Neustupny the greater the degree of differentiation the easier it is to distinguish between the Ls and Lt forms.

The NP analysis based on processes and strategies may provide a principled account of these matters treated informally by Neustupny. Neustupny's interest here seems confined to the perceptual side. But the agreement between Kawakami's hierarchy of pronunciation difficulty and Uechi/Kanno's hierarchy of perceptual difficulty is accounted for if we assume, as NP does, that for a given utterance the same processes mediate underlying and surface forms in either articulation or perception.


(1) In an interesting further comment (p. 70) Stampe says that for some speakers the ŋ → n process seems to become a perceptual constraint so that for [ ŋujen ] they hear [ nujen ] in the first place.
(2) The dental stridents /s/, /dz/ are quite resistant to this strategy and only the conservative pronunciations [ s̆i ], [ dz̆i ] are attested for Ls [ si ], [ zi ].




6.0 Introduction

This chapter will extend the analysis in the preceding chapters to the speech of selected speakers of Hawaiian Japanese. Differences in dialects will be shown to depend on differing constraints on the application of processes. Data is from the speech of several long-time residents of Hawaii who emigrated from Japan prior to 1924 when the Oriental Exclusion Act cut off further large-scale immigration. These immigrants are known locally as 'issei', a Japanese word meaning 'first generation.' The term actually includes any first generation Japanese immigrant but in this study will have the more restricted meaning of Japanese immigrant arriving before 1924. The data was taken from tape recorded interviews with issei from two dialect areas of Japan – Hiroshima and Yamaguchi Prefectures in Western Japan and Fukushima Prefecture in Eastern Japan.

Issei in Hawaii came, for the most part, from farm families whose speech reflected local dialects rather than SJ. Television, radio, movies, and universal education were either non-existent or much less influential than today in the spread of the standard language. Most issei therefore arrived speaking a local dialect. Table 6.1 shows the number of issei in Hawaii in 1924 along with their prefecture of origin. The information is from the Publication Committee of the United Japanese Society of Hawaii (1964:314-5).

Table 6.1

Number of Issei in Hawaii in 1924
with their Prefecture of Origin
Prefecture of Origin   Number/%
Total                  116,615/100%

The earliest and most numerous group (48%) were from Hiroshima/Yamaguchi. It is said that their speech forms the basis for standard Hawaiian Japanese and that issei from other dialect areas, including Fukushima, have tended to adapt their speech to this standard. One of the purposes of this chapter will be to present phonological evidence bearing on the question of whether the speech of Fukushima issei has levelled in the direction of H/Y. A second question to be explored bears on the distinction between processes and rules discussed in §1.5 and elsewhere. Stampe's characterization of process-governed behavior as innate and therefore difficult to suspend and rule-governed behavior as learned and, though usually obligatory, easily suspended seems to imply that the persistent features in a dialect levelling situation will be the process-governed ones while the rule-governed ones will be relatively easily levelled.

The Fukushima dialect in Japan differs from the Hiroshima/Yamaguchi one by well-known features, both process-governed and rule-governed. Our procedure will be to isolate several of these and measure their relative frequency in the speech of Fukushima issei.

6.1 Process-governed features

A striking feature of Fukushima dialect is the merger of syllable types shown in Table 6.2. The syllables labelled H/Y are identical to the SJ ones.

Table 6.2

Merger of H/Y Syllables in F
/ si /   [ s̆i ]   ↘ 
/ su /   [ su ]   →/ su / [ su ]
/ syu /   [ s̆u ]   ↗ 
/ ti /   [ ts̆i ]   ↘ 
/ tu /   [ tsu ]   →/ tu / [ tsu ]
/ tyu /   [ ts̆u ]   ↗ 
/ dzi /   [ dz̆i ]   ↘ 
/ dzu /   [ dzu ]   →/ dzu / [ dzu ]
/ dzyu /   [ dz̆u ]   ↗ 

Although the Fukushima dialect is often characterized as in Table 6.2 there is in fact a great deal of variation among speakers due to the influence of the standard language and other little understood factors. A recent cross-generational language survey of two communities in Fukushima prefecture (Iitoyo 1974) showed SJ forms (i.e. those on the left in Table 6.2 since H/Y = SJ in this case) appearing in the speech of most of those surveyed, especially the younger generation. Another finding was that palatal /i/ did not always merge with /u/ but remained distinct though centralized. Most of the data in Iitoyo's survey comes from single word elicitation. My own data taken from a 45-minute recorded interview do not contradict his findings, however. The interview was conducted by a 40-year-old male SJ-speaking visitor to a Fukushima village in the same locality which produced most of the Fukushima immigrants to Hawaii. The interviewee is an 84-year-old male, life-long resident of the village. He is a retired farmer with an elementary school education. Although his pronunciation sounds quite dialectal, there are numerous SJ variants as well. Table 6.3 shows some of this variation.

Table 6.3

Variation in the Speech of a Fukushima Resident
Substitution   % of Occurrences
[ s̆u ] > [ su ] 100% (3 of 3)
[ ts̆u ] > [ tsu ]   65% (10 of 16)
[ dz̆u ] > [ dzu ]   35% (6 of 17)
[ s̆i ] > [ si ] ~ [ su ]   43% (73 of 168)
[ ts̆i ] > [ tsi ] ~ [ tsu ]   37% (24 of 66)
[ dz̆i ] > [ dzi ] ~ [ dzu ]   33% (12 of 39)

6.1.1 Comparative analysis of dialects in Table 6.2

The mergers shown in Table 6.2 are process-governed features of Fukushima dialect. Ignoring variation, Fukushima differs from H/Y dialect in the following ways:

  1. POST CONS PAL GLIDE DEL governs the lexicon preventing */Cyu/ where C = a coronal obstruent.
  2. VOWEL RETRACTION governs the lexicon preventing */Ci/ where C = a coronal obstruent.


'husband'     / syudzin / [ s̆udz̆in ]          / sudzun / [ suzũũ ]
'sushi'       / susi / [ sus̆i ]               / susu / [ susu ]
'to differ'   / tigau̯ / [ ts̆igau̯ ]            / tugau̯ / [ tsugau̯ ]
'surgery'     / syudzyutu / [ s̆udz̆utsu ]      / sudzutu / [ sũndzutsu ]*
*This form shows prenasalization which is another distinguishing Fukushima feature.

6.2 Persistence of VOWEL RETRACTION and Y DEL in Fukushima issei speech

The distinctive features of Fukushima speech exemplified by the mergers shown in Table 6.2 are process governed features due to the lexicon governing processes VOWEL RETRACTION and POST CONS PAL GLIDE DEL. Since there is lexical and syntactic evidence that Fukushima speakers are adapting their speech to a H/Y standard we might assume that ceteris paribus levelling would also occur on the phonological level. Samples of the speech of two Fukushima issei speakers, one male and one female, were examined in an attempt to ascertain this. The samples consisted of tape recorded interviews. Each interview lasted 45 minutes and was conducted at the home of the speaker.

M (the male speaker) arrived in Hawaii in 1915 at the age of 15. He was interviewed by a female speaker of SJ from Japan. M stated prior to the interview that he always speaks 'pure Fukushima dialect' when he converses with his relatives who are also from Fukushima, but that he tries to speak standard Japanese when conversing with others. It is not known whether M's model of 'standard Japanese' is SJ or the speech of H/Y issei. However, for the features under investigation it is immaterial since SJ and H/Y are identical.

F (the female speaker) arrived in Hawaii in 1916 at the age of 23. She was interviewed by a third-generation Hawaiian Japanese female student of Japanese at the University of Hawaii. The interviewer was a friend of the family but does not speak Hawaiian Japanese so addressed F in the SJ she had learned at the university.

Table 6.4 shows the occurrences of the two features in the speech of M and F.

Table 6.4

Occurrence of Vowel Retraction and Y Deletion
in the Speech of Two Fukushima Issei
Process            Substitution                    Speaker   % of Occurrences
Y DELETION         [ s̆u ] > [ su ]                 M         50% (2 of 4)
                                                   F         No Data
                   [ ts̆u ] > [ tsu ]               M         No Data
                                                   F         0% (0 of 1)
                   [ dz̆u ] > [ dzu ]               M         20% (2 of 10)
                                                   F         40% (4 of 10)
VOWEL RETRACTION   [ s̆i ] > [ si ] ~ [ su ]        M         78% (80 of 96)
                                                   F         80% (157 of 175)
                   [ ts̆i ] > [ tsi ] ~ [ tsu ]     M         80% (30 of 37)
                                                   F         59% (20 of 36)
                   [ dz̆i ] > [ dzi ] ~ [ dzu ]     M         79% (18 of 23)
                                                   F         62% (31 of 50)

Because these are process-governed features, NP predicts that the forms to the right of the arrow will persist even in the face of a tendency to level toward the forms to the left of the arrow. In the case of VOWEL RETRACTION persistence is strong and the prediction is borne out. Results are mixed in the case of Y DELETION, partly, perhaps, because of the sparseness of the data (due to the low frequency of CyV syllables in SJ). Given the additional fact that these substitutions are strongly stigmatized among Japanese speakers all over Japan and Hawaii as 'rustic' and 'unrefined', not to say comical, their continued use would be all the more remarkable in the absence of some explanation of their persistence.

6.3 Rule-governed features of Fukushima dialect

There is a well-known difference between the Fukushima and Hiroshima/Yamaguchi dialects in the shape of the stem of a class of verbs called w-stem verbs. The difference occurs before stopped suffixes. Table 6.5 shows a representative paradigm with an IC cut between stem and suffix.

Table 6.5

W-stem Verb Paradigm of the Verb 'to buy'
                  Fukushima    Hiroshima / Yamaguchi
1. Present Ind    ka/u         ka/u
7. Past Ind       kat/ta       koo/ta

The difference in the stem in items 6-9 is apparently due to divergent historical developments of the following sort:

'to buy'   OJ   kapite
> kaΦite > kaΦute > kaute > koote
> kapte > katte

The label 'w-stem' verb is due to the presence of a stem final /w/ in the negative form of this class of verbs.

Table 6.6

Alternative Analyses of the w-stem verb paradigm
                  A        B        C
1. Present Ind    ka/u     kaw/u    kap

Column A is the phonemic representation of these forms found in Bloch (1950).

Column C shows the OJ form of the verb stem. This is also the lexical representation McCawley (1968) proposes in his generative phonological analysis. The surface forms of the stem are derived by rules which to some extent recapitulate the historical evolution of the form.

Column B shows the NP lexical representation of the stems in items 1-5 based on the principle that it shall be no deeper than 'phonemic' unless required by alternation, and that any such 'morphophonemic' representation must be relatable to the surface by processes, not rules, of the language, i.e. kaw → ka /_e,a,o,u by LAB GLIDE DEL and GLIDE FRONTING (cf. §

The NP lexical representation in Column B differs from the phonemic representation in A due to the above principle whereby a lexical /w/ is allowed in forms where none occurs on the surface. It differs from the generative phonological representation in C because there are apparently no processes in the language relating /p/ to the phonetic representations. Note further that neither the gerundial stem /kat-/ in /katte/, the Fukushima form, nor /koo-/ in /koote/, the H/Y form can be represented lexically as /kaw-/ since there are no synchronic processes in Japanese by which to derive [kat] or [koo] from /kaw/. (The process VOWEL COLORING (cf. § which played an essential role in the diachronic development /kaw/ > /koo/ is no longer a live process in Japanese.) This amounts to a claim that the alternations /kaw/ ~ /kat/ and /kaw/ ~ /koo̯/ are rule-governed rather than process-governed alternations. The prediction is that the replacement of /katte/ (and all such forms with stopped suffixes) by the H/Y form /koo̯te/ would be relatively easy.

To see if this prediction is borne out the speech of the two Fukushima issei, M and F, was analyzed. Verbs of this class are relatively rare in the language, and occurrences were few in the speech of either speaker. Table 6.7 shows the results.

Table 6.7

Occurrences of w-stems in the speech of M and F
                       Speaker M    Speaker F
/ kat / type stems     5            5
/ koo̯ / type stems     1            2
Total                  6            7
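The proportion of /koo/-type stems for each speaker can be checked directly from the counts in Table 6.7. The following is a minimal sketch; the dictionary layout and the function name are assumptions made for illustration, and only the counts themselves come from the table.

```python
# Stem counts copied from Table 6.7; the data layout is illustrative only.
counts = {
    "M": {"kat": 5, "koo": 1},
    "F": {"kat": 5, "koo": 2},
}


def koo_share(speaker_counts: dict) -> float:
    """Fraction of a speaker's w-stem tokens that show the H/Y /koo/-type stem."""
    total = speaker_counts["kat"] + speaker_counts["koo"]
    return speaker_counts["koo"] / total


shares = {speaker: koo_share(c) for speaker, c in counts.items()}
for speaker, share in shares.items():
    print(f"{speaker}: {share:.1%}")
```

With so few tokens (6 and 7), a single additional occurrence would shift either proportion by well over ten percentage points, so the figures are best read as rough indications.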

Since the /koo/-type stems belong to the dominant dialect in Hawaii, one would have expected more levelling in that direction if, in fact, rule-governed features are easily replaced. Instead, only about 15% of M's occurrences (1 of 6) and 30% of F's (2 of 7) were in the expected direction. Two factors may have influenced the results. First, contrary to the case involving Y DELETION and VOWEL RETRACTION, the Fukushima verb stems correspond to those of SJ and the /koo/ verb stems are non-standard. Second, the interviewers were SJ speakers. Together these factors may explain why the H/Y forms occurred so rarely. Obviously the speakers' awareness of and attitudes toward these alternative forms can play an important role in their distribution, all the more so perhaps if they are rule-governed. In any case, in the light of the data in the above table, there is no basis for the claim that rule-governed features are less persistent than process-governed ones. However, it would appear that those features characterized as process-governed in §6.2, though stigmatized, persist to a significant degree in the speech of Fukushima issei. The evidence then is at least compatible with the hypothesis that this is due to their process-governed nature. As for the question posed at the beginning of this chapter, i.e. 'Are Fukushima issei speakers adapting their speech to that of H/Y issei speakers?', the data support an affirmative answer, at least to the extent that /koo/-type verb stems occur in the speech of Fukushima issei speakers, since these forms are foreign to both Fukushima dialect and Standard Japanese.




The essential descriptive and explanatory elements in Natural Phonology are the processes – the universal set of phonetically motivated substitutions available to speakers of all natural languages. In the pronunciation of Japanese a particular subset of these universal processes governs underlying and derived structure. In the foregoing chapters, 30 important processes of Japanese have been identified and their form and function described. The evidence has come from careful and hypoarticulate speech and from the pronunciation of loan words.

Processes may govern lexical or derived structure or both. For instance, DEPAL and V DENAS are context-free processes which govern the Japanese lexicon, assuring that underlying consonants will be plain rather than palatalized and that underlying vowels will be oral, not nasal. PAL and V NAS are context-sensitive processes ordered after their CF contradictory counterparts, thus assuring that consonants and vowels will be superficially palatalized and nasalized, respectively. Sometimes a single process governs both lexical and phonetic representation. PRE VOC LAB GLIDE DEL governs the lexicon, preventing underlying homorganic glide-vowel sequences */wu, wo/. It also optionally governs the superficial alternations [ wa ] ~ [ a ] in hypoarticulate speech. Similarly, OBS VOICING ASSIM, which governs the lexicon by preventing underlying sequences with non-uniform voicing like */zk, sg/, also governs surface structure, obligatorily applying to sequences with mixed voicing which arise in hypoarticulate speech processing.

These examples support Stampe's contention (1973 p.30) that there is no reason to propose a general extrinsic distinction between the roles of processes whether they govern underlying representation ('phonotactic', 'morpheme structure', 'redundancy' rules) or derived representation ('morphophonemic', 'allophonic', 'P-' rules). Indeed there is reason not to do so – notably the same one that prompted Halle to reject a distinction between the allophonic and morphophonemic substitutions involved in Russian obstruent voicing assimilation, i.e. a single process is responsible for [ c̆ ] → [ j ] and [ t ] → [ d ] even though in Russian the former substitution is allophonic and the latter morphophonemic.

The analysis of palatalization in § and w-stem verb stems in §6.3 exemplifies another important feature of NP – principled constraint of the abstractness of lexical representations. Lexical representation is constrained in two ways. First, it must be at least 'phonemic' due to the ordering of contradictory processes (e.g. V DENAS and V NAS) which bars allophonic features from the lexicon. Secondly, the lexical representation of a form may be deeper than phonemic if there are alternants of the form which cannot be derived from the phonemic representation. However, the depth of this 'morphophonemic' representation is constrained by the condition that it must be relatable to the surface by synchronic processes of the language. The importance of the abstractness issue can be gauged by the number of proposals in the phonological literature e.g. Kiparsky (1973), Schane (1971) designed to constrain the abstractness of underlying representations.

Concerning the pronunciation of loan words in Japanese, there is evidence in Chapter V which is significant in two very different ways. I refer to conservative and innovating pronunciations. The analysis of conservative pronunciation seems fairly straightforward within the context of the principles of NP, e.g. when speakers borrow 'she' as [ s̆i ] we can be fairly certain that it has the same underlying representation as Jpn [ s̆i ]. Or when they borrow 'sea' as [ s̆i ] it seems clear that 1) there is a process in Japanese that substitutes palatalized [ s̆ ] for plain [ s ] and 2) the borrower has taken the Ls pronunciation as underlying. On the other hand the analysis of innovating pronunciations is less obvious. When 'sea' is borrowed as [ si ] or 'film' as [ Φirumu ] there are a number of possible analyses. We may simply allow inadmissible items in the lexicon with the prediction that at some future time they will become admissible by the pressure or alteration of native processes – a possibility suggested in Lee (1975). Or we may assume that the presence of these innovating pronunciations already attests to the alteration of native processes and speculate as to what these alterations are. I have chosen the latter course in Chapter V simply to provide an opportunity to see where the speculation might lead. But given the present state of our knowledge of these matters, it is the conservative pronunciations which provide the clearest evidence for the existence and interrelation of processes.

The analysis of Hawaiian Japanese in Chapter VI is at once an attempt conveniently to contrast the phonology of SJ with that of another dialect through a comparison of certain processes in the two phonologies and an attempt to determine whether the distinction between processes and rules could be the basis for predicting the relative persistence of certain forms in the speech of adults. The results indicate that process-governed features tend to persist in the speech of adults, but due possibly to the presence of sociolinguistic factors it is not possible to conclude that rule-governed features are less persistent as NP would predict. This chapter did furnish evidence that Fukushima issei have indeed adopted some of the rule-governed features of Hiroshima/Yamaguchi speech in Hawaii.



Arisaka, Hideyo. 1957. Kokugo oninshi no kenkyuu (Studies in the history of the Japanese phonemes). Tokyo: Sanseido.
Ashworth, D. and P. Lincoln. 1973. Loanwords and the morphophonemics of the Japanese verb. In Papers in Japanese Linguistics 2/2. Los Angeles: University of Southern California.
Baudouin de Courtenay, Jan. 1895. Versuch einer Theorie phonetischer Alternationen. Strassburg-Cracow. Translation in Stankiewicz 1972: 144-212.
Bloch, Bernard. 1950. Japanese phonetics. Language 26.86-125. Reprinted in Joos 1957: 329-348.
Concise gairaigo jiten (Abridged foreign language dictionary). 1972. Tokyo: Sanseido.
Donegan, Patricia. 1973. Bleaching and coloring. In Papers from the ninth regional meeting. Chicago: Chicago Linguistic Society.
______. 1978. On the natural phonology of vowels. Ohio State University Working papers in linguistics 23.1-159. Columbus: Department of Linguistics.
Donegan, Patricia and David Stampe. 1979. The study of natural phonology. In Current approaches to phonological theory. Ed. Daniel A. Dinnsen. Bloomington: Indiana University Press.
Han, Mieko. 1962. Japanese phonology. Tokyo: Kenkyusha.
Hasegawa, N. 1979. Casual speech vs. fast speech. In Papers from the fifteenth regional meeting. Chicago: Chicago Linguistic Society.
Hattori, Shiroo. 1960. Gengogaku no hoohoo (Methods in linguistics). Tokyo: Iwanami Shoten.
Haugen, Einar. 1953. The Norwegian language in America. Philadelphia: University of Pennsylvania Press.
Iitoyo, K. 1974. Gengo siyoo no hensen (Changes in language use). Tokyo: The National Language Research Institute.
Joos, Martin. 1957. Readings in linguistics. Washington, D.C.: American Council of Learned Societies.
Kawakami, S. 1963. Gendaigo no hatsuon (Contemporary Japanese pronunciation). Kooza Gendaigo 1.184-210. Tokyo: Meiji Shoin.
Kiparsky, Paul. 1968. Linguistic universals and linguistic change. In Universals in linguistic theory. Eds. E. Bach and R. Harms. New York: Holt, Rinehart, Winston.
______. 1973. How abstract is phonology. In Three dimensions of linguistic theory. Ed. O. Fujimura. Tokyo: TEC Company.
Kuroda, S.Y. 1964. Generative grammatical studies on the Japanese language. Unpublished Ph.D. dissertation. Massachusetts Institute of Technology.
Leben, W. R. and O. W. Robinson. 1977. Upside down phonology. Language 53.1-21.
Lee, Gregory. 1975. Natural phonological descriptions, part I. In Working papers in linguistics 7/5. Honolulu: Department of Linguistics.
______. 1976. Natural phonological descriptions, part II. In Working papers in linguistics 8/3. Honolulu: Department of Linguistics.
Lovins, Julie. 1973. Loanwords and the phonological structure of Japanese. Unpublished Ph.D. dissertation. University of Chicago.
Martin, Samuel. 1952. Morphophonemics of standard colloquial Japanese. Baltimore: Linguistic Society of America.
______. 1959. Review of Wenck's Japanische Phonetik. Language 35.370-82.
______. 1967. On the accent of Japanese adjectives. Language 43.246-77.
______. 1975. Reference grammar of Japanese. New Haven: Yale University Press.
______. 1976. Earlier Japanese. Unpublished manuscript.
McCawley, James. 1968. The phonological component of a grammar of Japanese. The Hague: Mouton.
______. 1969. Length and voicing in Tübatulabal. In Papers from the fifth regional meeting, pp. 407-415. Chicago: Chicago Linguistic Society.
Miller, Patricia (Donegan). 1972. Some context-free processes affecting vowels. Ohio State University Working papers in linguistics 11.136-168. Columbus: Department of Linguistics.
Neustupny, J. 1978. Post-structural approaches to language. Tokyo: University of Tokyo Press.
Nishihara, S. 1970. Phonological change and verb morphology of Japanese. Unpublished Ph.D. dissertation. University of Michigan.
Ohso, Mieko. 1971. A phonological study of some English loan words in Japanese. Ohio State University Working papers in linguistics. Columbus: Department of Linguistics.
Schane, Sanford. 1971. The phoneme revisited. Language 47.503-21.
Shevelov, George and John Chew, Jr. 1969. Open syllable languages and their evolution: common Slavic and Japanese. Word 25/1-3.
Shibatani, Masayoshi. 1973. Review of McCawley's The phonological component of a grammar of Japanese. PCIA 17.127-143. Berkeley: University of California.
Stampe, David. 1969. The acquisition of phonetic representation. In Papers from the fifth regional meeting of the Chicago Linguistic Society, pp. 443-454. Chicago: Chicago Linguistic Society.
______. 1973. A dissertation on natural phonology. Unpublished Ph.D. dissertation. University of Chicago.
Stankiewicz, Edward. 1972. A Baudouin de Courtenay anthology. Bloomington: Indiana University Press.
Uechi, N. and K. Kanno. 1961. Nihongo ni okeru gaikokugo no hyooki to hatsuon (The notation and pronunciation of foreign languages in Japanese). Nenpo No. 6. Tokyo: NHK Hoosoo Bunka Kenkyuusyo.
United Japanese Society of Hawaii Publication Committee. 1964. Hawaii nihonjin iminshi (A history of Japanese immigrants in Hawaii). Honolulu: United Japanese Society of Hawaii.
Wang, William. 1968. Acoustic measurement of Japanese mora. PCLA. Berkeley: University of California.

With thanks to my good friend Steve Trussel for converting this into elegant electronic (html) form for the web.
[Any errors are, of course, probably mine - ST].