A K-ToBI transcription for an utterance consists minimally of a recording of the speech, an associated record of the fundamental frequency contour, and (the transcription proper) symbolic labels for events on the following four parallel tiers:
1. a word tier
2. a tone tier
3. a break-index tier
4. a miscellaneous tier
The original English ToBI allows the free proliferation of site-specific extra tiers, and so do J_ToBI and K-ToBI. Sites with an aligner for English, for example, have generally added a phones tier for phonetic segmentation, and J_ToBI users have agreed to add an obligatory "finality" tier where intonational phrases that sound "final" to a turn are minimally marked as such (until we can develop a more complete model of discourse finality to govern a hierarchy of labels for this tier). In accordance with this general design principle, K-ToBI users are encouraged to add their own customized tiers to label events of site-specific interest, and to keep records of why each particular tier was added and how it is used. By comparing extra tiers across labeling sites we, too, may find that we all agree on the desirability of some generally used tiers specific to Korean labeling, which can then be made obligatory for K-ToBI.
If the labeling platform is xwaves and xlabel (or any
similar labeling platform that works in terms of time flags), the word
label should be placed at the end of the final segment in the word, as
determined by the labeler from the waveform or spectrogram record. That
is, each word should be marked at its right edge. Filled pauses and
the like should also be marked, using some site-specific convention for
the Hangul or romanized spelling. The romanization convention used
at the UCLA and ATR sites is given in Appendix A.
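The right-edge rule can be sketched as follows. This is a minimal illustration only: it assumes the labeler has already determined word start and end times, and the function name, the tuple layout, and the times in the example are hypothetical and not part of xwaves/xlabel.

```python
from typing import List, Tuple

# Hypothetical input: (start_sec, end_sec, romanized_spelling) per word,
# as a labeler might read them off the waveform or spectrogram.
Interval = Tuple[float, float, str]


def word_tier_flags(intervals: List[Interval]) -> List[Tuple[float, str]]:
    """Return one (time, label) flag per word, placed at the word's right edge."""
    flags = []
    for start, end, spelling in sorted(intervals):
        if end <= start:
            raise ValueError(f"word {spelling!r} has a non-positive duration")
        # The flag goes at the end of the final segment of the word.
        flags.append((end, spelling))
    return flags


# Illustrative times only; they are not measurements from a real recording.
words = [(0.10, 0.62, "azumEninga"), (0.62, 0.95, "ENze"), (0.95, 1.48, "maNdIrEjo")]
flags = word_tier_flags(words)
```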
Values for the break index are chosen from the following
set:
1. <<l8c3>>
azumEninga ENze maNdIrEjo?
2 1 3
‘madam when make’
-> ‘When is Madam making (it)?’
2. <<t1p2s10>>
igEsIN uridIR maIMU segjeedo hAdaQdweNda
2 2 2 2 3
‘This our mind world too apply to’
-> ‘This applies to inside our mind’
3. <<t1p2s6>>
zIG, saNhonIN saraiSImjE aMsEgIN zugEiNnIn gEsida
3 2 3 2 1 3
‘That is, coral-TOP alive and rock-TOP dead-progressive rel.marker to be’
-> ‘That is, coral is alive and rock is dead’
(For an example of a ‘0’ break, see example sentence 6 below.)
Note that while the accentual phrase and intonational phrase are defined in the prosodic model by tonal markings, the break index value indicates the labeler’s subjective sense of disjuncture and not simply the juncture that typifies the apparent tones. Thus, the break index tier markings are not completely redundant with the tone tier markings for break index levels 2 and 3. In cases of mismatch, the number should follow the perceived juncture rather than the tones, although it should be flagged with the diacritic "m", as in:
2m a medium strength disjuncture that typically would
be marked by the tonal pattern of the accentual phrase, but without
any tonal markings, or with the tonal markings of an intonational
phrase edge.
3m a stronger disjuncture that typically would be marked
by the tonal pattern of an intonation phrase, but with the tones
of an accentual phrase rather than a boundary tone.
Note that low-ending phrases marked with the 2m label in theory allow two corresponding analyses on the tone tier: (1) no tonal markings for the end of the accentual phrase, or (2) a L% boundary tone. Since the latter is a more complex analysis, we prescribe the first.
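The 2m and 3m definitions above can be read as a small decision rule. The sketch below is one hedged reading of it: the function name and the three string codes for the tonal evidence ("none", "AP", "IP") are assumptions made for illustration, and combinations that the definitions do not mention are rejected rather than guessed.

```python
def break_index_label(perceived: int, edge_tones: str) -> str:
    """Combine a perceived disjuncture level with the tonal evidence at that edge.

    perceived  -- 2 (accentual-phrase-like) or 3 (intonational-phrase-like)
    edge_tones -- "none", "AP" (accentual phrase tones, e.g. LHa), or
                  "IP" (an intonational phrase boundary tone)
    """
    expected = {2: "AP", 3: "IP"}
    if perceived not in expected:
        raise ValueError("the mismatch diacritic is defined only for levels 2 and 3")
    if edge_tones == expected[perceived]:
        return str(perceived)  # tones and perceived juncture agree
    if perceived == 2 and edge_tones in ("none", "IP"):
        return "2m"  # AP-strength break without AP tones
    if perceived == 3 and edge_tones == "AP":
        return "3m"  # IP-strength break marked only by AP tones
    raise ValueError("combination not covered by the 2m/3m definitions")
```

The number always follows the perceived juncture; the tonal evidence only decides whether "m" is added.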
In an xwaves/xlabel type system, the break index label
should be associated with a point in time at the end of each word, as indicated
in the word tier. It should be located exactly at, or slightly to
the right of, this word marker, so that break indices can be unambiguously
associated with other tiers. Transcriber uncertainty about break-index
strength is to be indicated with a minus ("-") diacritic affixed directly
to the right of the break index -- e.g. "1-" to indicate uncertainty
between "0" and "1"; "2-" to indicate uncertainty between "2" and "1";
and so on.
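The placement rule lets break indices be paired with words by time alone. The following sketch shows one way to do that pairing, assuming both tiers are available as lists of (time, label) flags; the function name and the 20 ms tolerance for "slightly to the right" are assumptions, not values prescribed by K-ToBI.

```python
from typing import List, Tuple

Flag = Tuple[float, str]  # (time in seconds, label text)


def attach_breaks(word_flags: List[Flag], break_flags: List[Flag],
                  tolerance: float = 0.02) -> List[Tuple[str, str]]:
    """Pair each break-index flag with the word flag at or just left of it."""
    if len(word_flags) != len(break_flags):
        raise ValueError("every word needs exactly one break index")
    eps = 1e-9  # absorbs floating-point rounding in the time comparison
    pairs = []
    for (w_time, word), (b_time, index) in zip(sorted(word_flags), sorted(break_flags)):
        offset = b_time - w_time
        if not (-eps <= offset <= tolerance + eps):
            raise ValueError(f"break index {index!r} is not aligned with word {word!r}")
        pairs.append((word, index))
    return pairs


# Illustrative times only.
word_tier = [(0.62, "azumEninga"), (0.95, "ENze"), (1.48, "maNdIrEjo")]
break_tier = [(0.62, "2"), (0.96, "1"), (1.48, "3")]
pairs = attach_breaks(word_tier, break_tier)
```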
Note that since the "m" diacritic suggests certainty
about the break index analysis in the face of conflicting tonal evidence,
the "-" diacritic should not be used together with "m". That is,
for example, in the case of a break with the sense of disjuncture usually
associated with an accentual phrase but no corresponding rise indicative
of LHa, either the labeler is unsure that there is such a strong sense
of disjuncture (i.e. "2-") or the labeler is sure of the
disjuncture and this is a mismatch (i.e. "2m").
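A label checker can enforce the diacritic rules mechanically. The sketch below parses a break index label and rejects "-" combined with "m"; the names are hypothetical, and two further restrictions are assumptions drawn from the text rather than stated rules: "m" is accepted only on levels 2 and 3 (the only mismatch labels defined), and "0-" is rejected because there is no lower level to be uncertain about.

```python
import re
from typing import NamedTuple, Tuple


class BreakIndex(NamedTuple):
    level: int
    uncertain: bool  # "-" diacritic
    mismatch: bool   # "m" diacritic


_LABEL = re.compile(r"([0-3])([m-]*)")


def parse_break_index(label: str) -> BreakIndex:
    match = _LABEL.fullmatch(label)
    if match is None:
        raise ValueError(f"not a break index label: {label!r}")
    level = int(match.group(1))
    diacritics = match.group(2)
    if len(diacritics) > 1:
        # Covers "2m-" and "2-m": "m" asserts certainty, "-" denies it.
        raise ValueError(f"only one diacritic is allowed: {label!r}")
    uncertain = diacritics == "-"
    mismatch = diacritics == "m"
    if mismatch and level not in (2, 3):
        raise ValueError("assumed: 'm' applies only to levels 2 and 3")
    if uncertain and level == 0:
        raise ValueError("assumed: '0-' has no lower level to hedge toward")
    return BreakIndex(level, uncertain, mismatch)


def candidate_levels(index: BreakIndex) -> Tuple[int, ...]:
    """Levels the labeler is choosing between, e.g. "2-" means 1 or 2."""
    return (index.level - 1, index.level) if index.uncertain else (index.level,)
```

For example, `candidate_levels(parse_break_index("1-"))` gives the pair of levels 0 and 1, matching the reading of "1-" given above.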
Example sentences (sound files and f0 tracks will be available
soon)
4. <<t1p1s2>>
doQgi bujEU du hjEQtA zuQesE iRbaNzEgiN kEsIn waNzEnhwa,
2 2 2- 1 2m 1 3 3
‘motivation giving-poss two method among general-rel thing-top completeness’
-> ‘Among two kinds of providing motivation, the general thing is completeness’
5. <<t1p2s5>>
gIrEna, gatIN hjENmigjEQe sanho zogagIR noko bomjEN
3- 2 3 1 2- 1 3
‘but, same microscope-loc. coral piece-acc. to put and see if’
-> ‘But, if you see a piece of coral under microscope,...’
6. <<t1p2s5>>
sanhoga sEQzaQhamjENsE bjENhwahago iDTanIN gEsIR aR SuiDTa.
2m 2 2- 1 2 0 3
‘coral-nom. growing change-prog.-rel. thing-acc. to see’
-> ‘We can see that coral is growing and changing’
LHa the accentual phrase edge tone. It marks the
right edge of an intonation-phrase-medial accentual phrase.
H- the phrasal H-. It marks the early peak seen
around the first or second syllable of some accentual phrases. Such
a peak occurs very often in the last accentual phrase of an intonational
phrase that ends in L%. (In fact, here it is the typical case unless
the phrase is so short that there cannot be any rise at all.)
The peak also can readily occur in phrases with final rises (i.e. medial
accentual phrases, which end in LHa, or final accentual phrases before
H%, LH%, etc.) in cases where the phrase is long enough to realize another
smaller peak at the beginning. The timing of the peak depends on
the initial segment in the phrase. When the phrase begins with an
aspirated or tense obstruent, the peak will be on the first syllable, but
if the initial segment is anything other than an aspirated or tense obstruent,
the peak for the H- will occur later, and there will be a noticeable rise
to it from a L tone on the first syllable. The H- early phrasal tone
should be placed at the corresponding peak observed in the F0 contour.
The LHa tone should be placed at or just before the corresponding
break index marker regardless of the actual location of the peak.
If the peak is not at the phrase boundary as observed in the waveform,
then the corresponding F0 peak should be marked by one of the
following:
- > indicates an early peak corresponding to a later tone label
- < indicates a late peak corresponding to an earlier tone label
Note that while here we include both logical possibilities, in reality, we have yet to observe an instance of "<". That is, typically the peak is slightly before the accentual phrase boundary.
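The choice between the two peak markers depends only on where the F0 peak falls relative to the tone label. A minimal sketch, assuming both times are known in seconds; the function name and the 10 ms window within which the peak counts as being at the boundary are assumptions.

```python
from typing import Optional


def peak_marker(tone_time: float, peak_time: float,
                tolerance: float = 0.01) -> Optional[str]:
    """Return ">" or "<" for a displaced F0 peak, or None if no marker is needed.

    tone_time -- time of the tone label (at or just before the break index)
    peak_time -- time of the corresponding peak in the F0 contour
    """
    if abs(peak_time - tone_time) <= tolerance:
        return None  # peak is at the phrase boundary
    # ">" sits at an early peak and points ahead to the later tone label;
    # "<" sits at a late peak and points back to the earlier tone label.
    return ">" if peak_time < tone_time else "<"
```

The marker itself is then placed at `peak_time`, while the tone label stays at the boundary.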
Example sentences (sound files and f0 tracks will be available
soon):
7. <<t1p2s8>>
sEQzaQhago iNnIN gEsi saraiNnIN gEsida
H- > LHa H- L%
‘to grow-prog. is to live-prog.’
-> ‘Growing means that it is alive’
8. <<t1p2s6>>
zIG, saNhonIN saraiSImjE aMsEgIN zugEiNnIn gEsida
H- L% H- LHa L% LHa H- L%
‘i.e., coral-TOP alive and rock-TOP dead-progressive rel.marker to be’
-> ‘That is, coral is alive and rock is dead’
The boundary tones for intonational phrases include:
L%
- a level ending, or a gently falling boundary tone spread over much of the last accentual phrase from the LH- at the beginning
H%
- a rising boundary tone that begins to rise from a minimum on a syllable near the end of the intonation phrase, but earlier than for LH%
LH%
- a rising boundary tone that is more localized than H%, rising sharply from a valley well within the final syllable (That is, by comparison to H%, this is a sharper, later rise, starting after the onset of the final syllable.)
HL%
- a rising-falling boundary tone that rises to a peak before the last syllable, and then falls during the last syllable; essentially a combination tone, consisting of H% followed by L%.
LHL%
- a rising-falling boundary tone, like HL%, except that the rise begins considerably later; essentially a combination tone consisting of LH% followed by L%.
HLH%
- a rise-fall-rise boundary tone; a combination of HL% and H%. That is, the timing of the rise is the same as for HL%, but it is followed by a shallow dip and then another small rise.
All intonation-phrase final boundary tones are placed at or just before the break index mark, regardless of F0 contour. When a word is final to an accentual phrase and final to an intonational phrase, only the intonational phrase boundary tone is written at or just before the end of the word. The actual peak in the F0 contour corresponding to H%, LH%, HL%, LHL%, etc., should be marked by one of the following (unless the peak for a H% or LH% is actually at the phrase edge):
- < late peak
- > early peak
For any of the complex boundary tones with more than one H tone, the peak that is marked in this way should be the highest one. In this way, K-ToBI (like J_ToBI) provides for pitch range estimators without a separate HiF0 label. Again, note that while we include both logical possibilities for boundary tones ending in a H, in reality, we have yet to see a case of a rising boundary tone with "<", and of course "<" is logically impossible for rising-falling tones HL%, LHL%, etc.
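The constraints on peak marking for boundary tones can be checked with a short routine. This sketch treats the six labels listed in this section as a closed inventory, which is an assumption since the text says only that the boundary tones "include" them; the function names and the (time, F0) tuple layout are likewise hypothetical.

```python
from typing import List, Set, Tuple

# Assumed closed inventory: the six boundary tones listed in this section.
BOUNDARY_TONES = ("L%", "H%", "LH%", "HL%", "LHL%", "HLH%")


def allowed_peak_markers(tone: str) -> Set[str]:
    """Peak markers that are logically possible for a given boundary tone."""
    if tone not in BOUNDARY_TONES:
        raise ValueError(f"unknown boundary tone: {tone!r}")
    if "H" not in tone:
        return set()       # L% has no H peak to mark
    if tone.endswith("H%"):
        return {">", "<"}  # tones ending in H allow both in principle
    return {">"}           # rising-falling tones: a late peak is impossible


def peak_to_mark(peaks: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Pick the highest of several candidate (time_sec, f0_hz) peaks."""
    if not peaks:
        raise ValueError("no candidate peaks given")
    return max(peaks, key=lambda peak: peak[1])
```

For a tone with two H targets such as HLH%, `peak_to_mark` selects the higher of the two peaks, which is the one the marker should be attached to.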
Example sentences (sound files and f0 tracks will be available
soon):
9. <<4boundary>> gIrASEjo
H- H%
‘Is that so?’
10. <<4boundary>> gIrASEjo
H- LH%
‘Is that so?’
11. <<4boundary>> gIrASEjo
HL%
‘Is that so?’
12. <<4boundary>> gIrASEjo
H- LHL%
‘Is that so?’
13. <<J3A2 >> onIR zEnyEge nuga mEgEyo
LHa H- HLH%
‘Today night who eat?’
-> ‘Who is eating tonight?’
so that the interval is delimited by the < .... > pair.
laughter < beginning of interval of laughter
laughter > end of a period of laughter
Beckman, Mary; & Hirschberg, Julia (1994) The ToBI Annotation Conventions. Manuscript, Ohio State University. [For information on obtaining by ftp, send e-mail to tobi@ling.ohio-state.edu.]
De Jong, Kenneth (1994) "Initial tones and prominence in Seoul Korean," Ohio State University Working Papers in Linguistics, No. 43, pp. 1-14.
Jun, Sun-Ah (1990) "The prosodic structure of Korean -- in terms of voicing," In E-J. Baek, ed., Proceedings of the 7th International Conference on Korean Linguistics, pp. 87-104. University of Toronto Press.
Jun, Sun-Ah (1993) The Phonetics and Phonology of Korean Prosody. Doctoral Dissertation, Linguistics, Ohio State University. [For information on ordering, send e-mail to osdl@ling.ohio-state.edu.]
Jun, Sun-Ah (1995) "Asymmetrical prosodic effects on the laryngeal gesture in Korean," In Bruce Connell and Amalia Arvaniti, eds., Phonology and Phonetic Evidence: Papers in Laboratory Phonology IV, pp. 235-253. Cambridge University Press.
Lee, Sook-hyang (1989) "Intonational domains of the Seoul
dialect of Korean," Journal of the Acoustical Society of America, vol.
85, suppl. 1, p. S99.
Pitrelli, John; Beckman, Mary; & Hirschberg, Julia
(1994) "Evaluation of prosodic transcription labeling reliability in the
ToBI framework," Proceedings of the 1994 International Conference on Spoken
Language Processing, vol. 1, pp. 123-126.
Silverman, Kim; Beckman, Mary; Pitrelli, John; Ostendorf, Mari; Wightman, Colin; Price, Patti; Pierrehumbert, Janet; & Hirschberg, Julia (1992) "ToBI: a standard for labeling English prosody," Proceedings of the 1992 International Conference on Spoken Language Processing, vol. 2, pp. 867-870.
Venditti, Jennifer (1995) Japanese ToBI Labeling Guidelines. Manuscript with examples, Ohio State University. [For information on obtaining by ftp, send e-mail to venditti@ling.ohio-state.edu.]
Campbell, Nick; & Venditti, Jennifer (1995) "J-ToBI:
an intonational labeling system for Japanese," Paper presented at the Autumn,
1995, Meeting of the Acoustical Society of Japan.