Bruce P. Hayes

Department of Linguistics

UCLA

What is a wug-shaped curve and why is it of interest?

The wug-shaped curve is a pattern of frequency widely found in quantitative studies of
variable phenomena in linguistics. Indeed, it is so widespread that I
believe its appearance may be meaningful from the viewpoint of theoretical linguistics. Visually, the wug-shaped
curve takes the form of two or more identical sigmoid (logistic) curves, spaced apart.

The wug-shaped curve is a natural consequence of probabilistic versions of Harmonic Grammar, such as MaxEnt.
Here is how the analysis is set up: we divide
the constraint set into two families, having different forms or
teleologies: Baseline constraints and
Perturbers. We then plot the empirical data points, with observed
probability (zero to one) on the vertical axis and Baseline
Harmony on the horizontal axis. This is done separately,
in a different color, for each data series defined by violations of the
Perturber constraints. We also plot the sigmoid curves themselves,
which show the model fit; ideally, the data points will cling to
their respective sigmoids. You can judge how often this happens by
looking at the charts below.
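To see why the curves come out as identical sigmoids spaced apart, here is a minimal Python sketch of a two-candidate MaxEnt system. The function name and the sign convention (each argument is a Harmony difference oriented so that larger values favor the plotted candidate) are illustrative assumptions, not notation taken from the paper.

```python
import math

def maxent_prob(baseline_h, perturber_h):
    """Probability of the plotted candidate in a two-candidate system.

    Assumption: both arguments are Harmony differences oriented so that
    larger values favor the plotted candidate.
    """
    return 1.0 / (1.0 + math.exp(-(baseline_h + perturber_h)))

# Each Perturber value gives the same sigmoid, shifted along the Baseline axis.
for perturber_h in (0.0, 2.0):          # hypothetical Perturber contributions
    series = [round(maxent_prob(b, perturber_h), 3) for b in range(-4, 5, 2)]
    print(perturber_h, series)
```

Because the two Harmony contributions are simply added, changing the Perturber value slides the curve sideways without changing its shape, which is what produces the multiple parallel sigmoids.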

To make the research agenda plain: along with some of my colleagues, I suspect that MaxEnt, or something like it, is correct for natural language, and that is why we see wug-shaped curves all over the place when we look at quantitative language data. You can judge for yourself by browsing through the images in this gallery, or by analyzing your own data in this way (see last section for how).

Various
people see various things in multiple sigmoids. It was Dustin Bowers
who suggested to me that they look like wugs. The wug was invented and
first drawn in 1958 by Jean Berko Gleason,
in one of the most famous papers ever
written in linguistics. In recent years, the wug has been adopted by
the field of linguistics as a sort of mascot. The real wug is
cuter than the mathematical one.

What is this web page for?

I've written a paper about wug-shaped curves in linguistics, which you can
download here. Due to length limitations, it doesn't have all of the cases I've
compiled, and it seems that a web site would be the best format to
display them all together. I've included all the wug-shaped curves I have ever plotted for this project, including the ones where the data don't look entirely pretty.

Browsing hints

For references, please follow the links or look at the bibliography section of my paper. In most cases, a spreadsheet is linked just above the curve, which will tell you how I obtained the data, did the MaxEnt analysis, and plotted the curve. For a few, my spreadsheet is currently very messy and can't be shared yet (though you could ask).

Directory

Wug-shaped curves in phonology

Wug-shaped curves in phonetics

Wug-shaped curves in syntax

Wug-shaped curves in sociolinguistics

Wug-shaped curves in semantics/pragmatics

Wug-shaped curves in language change

Wug-shaped curves in sound symbolism

Appendices to the paper

Some graphs used to diagnose theories

How I made the curves

A. Deriving a single data point for Kluender et al. (1988)

B. Recoding Harmony as a single value in a two-candidate system

C. Why language change normally occurs at a constant rate

D. Deriving the MaxEnt sigmoid from first principles

E. On interaction terms/conjoined constraints

F. On statistics and linguistics

Wug-shaped curves in phonology

Hungarian vowel harmony

Sources: Hayes and Londe (2006), Hayes et al. (2009), Zuraw and Hayes (2017), Hayes (in progress)

Y-axis: how often a stem will take back suffixes in a wug experiment

Baseline constraints: stem vowels influence harmony

Perturber constraints: stem-final consonants influence harmony

Spreadsheet (forthcoming), plotting script

French liaison

Sources: Zuraw and Hayes (2017), Hayes (in progress)

Y-axis: likelihood of elision or liaison; e.g. use of [l] instead of [la] for feminine definite article

Baseline constraints: lexical propensity of Word 1 to act as an h-aspire word

Perturber constraints: lexical propensity of Word 2 to appear in its isolation form

Spreadsheet, plotting script

Tagalog Nasal Substitution

Scholarly sources: Zuraw (2000, 2010), Zuraw and Hayes (2017), Hayes (in progress)

Y-axis: how often a stem of a given type will undergo the process of Nasal Substitution

Baseline constraints: place and manner of stem-initial consonant

Perturber constraints: propensity of a particular prefix to trigger the process

Spreadsheet, plotting script

Inversion of Final Devoicing in Dutch

Sources: Ernestus and Baayen (2003), Hayes (in progress)

Y-axis: how often speakers guess that a stem-final consonant will appear as voiced when suffixed

Baseline constraints: place and manner of stem-final consonants, neighboring segment

Perturber constraints: based on three degrees of vowel length in the stem

Spreadsheet (forthcoming), plotting script

Finnish genitive plurals

Sources: Anttila (1997), Boersma and Hayes (2001), Goldwater and Johnson (2003), Hayes (in progress)

Y-axis: how often a stem will take the longer [-den] allomorph of the genitive plural

Baseline constraints: whether allomorph choice will result in two consecutive light syllables

Perturber constraints: based on vowel height and weight of stem syllables.
One perturber is inviolable and therefore produces a flat line, not a sigmoid.

Spreadsheet, plotting script

Schwa/zero alternations in French

Source: Smith and Pater (2020)

Y-axis: how often zero shows up in French schwa/zero alternations

Baseline constraints: whether schwa is inserted or deleted, consonants in environment

Perturber constraint: whether deletion of a schwa creates clashing (adjacent) stressed syllables

Spreadsheet, plotting script

Stress placement in Hupa

Source: Ryan (2019)

Y-axis: probability of initial stress rather than second syllable stress

Baseline constraints: weight of initial syllable

Perturber constraints: weight of second syllable

Spreadsheet, plotting script

Wug-shaped curves in phonetics

Perception of voicing based on closure duration and length of preceding vowel

Scholarly sources: Kluender et al. (1988), Hayes (in progress)

Y-axis: likelihood an experimental participant will perceive a voiced instead of a voiceless stop

Baseline constraints: closure duration

Perturber constraints: long vs. short preceding vowel

Spreadsheet, plotting script

Wug-shaped curves in syntax

Datives in English

Sources: Szmrecsanyi et al. (2017), Hayes (in progress)

Y-axis: how often the meaning of the dative construction will be expressed using NP NP rather than NP to NP

Baseline constraints: governing various properties of the Recipient

Perturber constraints: status of the Theme

Spreadsheet (forthcoming), plotting script

Genitives in English

Sources: Szmrecsanyi et al. (2017), Hayes (in progress)

Y-axis: how often the meaning of the possessive will be expressed using NP's N rather than N of NP

Baseline constraints: an amalgam; see Hayes (in progress)

Perturber constraints: based on length of possessor in words

Spreadsheet (forthcoming), plotting script

One can also plot the same data with length as the Perturber, like this:

Wug-shaped curves in sociolinguistics

Contraction of the copula in Black English

Sources: Labov (1969), Cedergren and Sankoff (1974), Hayes (in progress)

Y-axis: how often the speaker uses a contracted (vowelless) allomorph of the copula

Baseline constraints: left side environment, including pronominal portmanteaux like "he's"

Perturber constraints: right side syntactic environment

Spreadsheet, plotting script

Deletion of the copula in Black English

Sources: Labov (1969), Cedergren and Sankoff (1974), Hayes (in progress)

Y-axis: how often the speaker uses a null allomorph of the copula, assuming they have already chosen to contract

Baseline constraints: left side environment, including pronominal portmanteaux like "he's"

Perturber constraints: right side syntactic environment

Spreadsheet, plotting script

This is perhaps the messiest case I have seen; perhaps the use of conditional probability is the problem?

Deletion of [l] in Quebec French

Sources: ms. by Gillian Sankoff, cited and discussed in Bailey (1973), Hayes (in progress)

Y-axis: deletion rate of [l]

Baseline constraints: varying propensity of various function words to lose their [l]

Perturber constraints: sex and social class of speaker, perhaps a proxy for speaking style

Spreadsheet, plotting script

This particular case stands out as problematic for Stochastic OT, critiqued in Zuraw and Hayes (2017) and Hayes (in progress). Here is a graph of a best-fit model of these data in Stochastic OT:

Omission of que in Quebec French

Source: Cedergren and Sankoff (1974)

Y-axis: retention rate for que

Baseline constraints: surrounding consonants

Perturber constraints: formality of style, as varied by type of speaker

Spreadsheet, plotting script

R-Spirantization in Panamanian Spanish

Source: Cedergren and Sankoff (1974)

Y-axis: probability of realizing /r/ as a spirant

Baseline constraints: phrasal position, whether /r/ is part of the infinitive ending, speaking style

Perturber constraints: following segment

This is a rather messy one, I admit, and in particular lacks extreme values of probability.

Spreadsheet, plotting script

R-Dropping in New York City English

Source: William Labov, via Cedergren and Sankoff (1974)

Y-axis: probability of deleting /r/ in syllable codas

Baseline constraints: speaking context

Perturber constraints: designating different dialects spoken in the same speech community. I'm not sure what the grammatical status of these would be.

Spreadsheet, plotting script

Cluster Simplification in Detroit Black English

Sources: Wolfram (1969), Hayes (in progress)

Y-axis: probability of deleting one of a pair of adjacent consonants

Baseline constraints: neighboring vowel/consonant, whether deleting consonant is part of past tense suffix

Perturber constraints: social class, assumed to be proxy for speaking style

This curve has a puzzling too-close vertical grouping for the _ C/ -ed case.

Spreadsheet, plotting script

Wug-shaped curves in language change

Portuguese definite articles

Sources: Kroch (1989), Hayes (in progress)

Y-axis: probability of use of a definite article when an NP also has an NP possessor

Baseline constraints: rising constraint preferring this usage, over centuries

No perturber

Spreadsheet, plotting script

Evolution of have from Aux to main verb in English

Source: Zimmermann (2017). See Hayes (in progress) for discussion.

Y-axis: probability of employing have syntactically as a main verb rather than as an aux

Baseline constraints: a preference constraint shifting over time

Perturber constraints: governing various distinct uses of auxiliary verbs

The right graph plots the same thing in different coordinates (harmony difference), showing identical slopes
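One way to see why shifted sigmoids share a slope is to re-express probability as log-odds, which turns each logistic curve into a straight line. The Python sketch below assumes that the alternative coordinates behave like a log-odds scale; that is an assumption made for illustration, not something stated on this page, and all function names are hypothetical.

```python
import math

def sigmoid(x):
    """Logistic curve."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def slope_in_logit_space(shift, x0=-1.0, x1=1.0):
    """Slope of logit(sigmoid(x + shift)) measured between x0 and x1."""
    y0 = logit(sigmoid(x0 + shift))
    y1 = logit(sigmoid(x1 + shift))
    return (y1 - y0) / (x1 - x0)

# Two hypothetical Perturber shifts give lines with the same slope.
print(slope_in_logit_space(0.0), slope_in_logit_space(2.5))
```

The shift changes only the intercept of each line, so every series advances at the same rate on this scale.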

Wug-shaped curves in semantics/pragmatics

Quantifier scope

Sources: AnderBois et al. (2012), Hayes (in progress)

Y-axis: probability subjects will prefer narrow scope

Baseline constraint: whether the target quantifier is in first or second position

Perturber constraint: whether the target quantifier is in subject or object position

Spreadsheet, plotting script

Wug-shaped curves in sound symbolism

Classification of Pokemon character names

Source: Kawahara (in press)

Y-axis: probability subjects will rate a Pokemon name as appropriate for an "unevolved," smaller Pokemon creature

Baseline constraints: length of name in moras

Perturber constraints: whether name includes an initial voiced obstruent (such as [d])

Spreadsheet (forthcoming), plotting script

Graphs used to diagnose theories

The MaxEnt sigmoid

This is discussed extensively in the main text of my paper and is plotted here to permit the comparisons that follow.

The asymmetrical sigmoid of one version of Noisy Harmonic Grammar

In one version of Noisy Harmonic Grammar, discussed in section 7.2 of the article text, the "noise" that makes the theory stochastic is added to the constraint weights. This ends up producing a sigmoid curve quite different from that of MaxEnt; it is asymmetrical, and the long tail can be shown to asymptote at a value above zero.

The symmetrical, oddly-similar sigmoid of another version of Noisy Harmonic Grammar

If, in designing a Noisy Harmonic Grammar framework, you instead add the "noise" to the completed Harmony values of candidates, you get a sigmoid that is astonishingly similar to the MaxEnt sigmoid (astonishing, since the math is completely different). Here are the MaxEnt and NHG sigmoids superposed, with constraint weights suitably scaled to make the resemblance clear.
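The resemblance can be checked numerically with a short Python sketch. It assumes that the noise is independent Gaussian noise added to each of two candidates' Harmony values, which makes the NHG curve a cumulative normal. The rescaling constant 1.702 is the standard logistic-to-normal approximation and is not necessarily the scaling used for the graph here; the function names are hypothetical.

```python
import math

def maxent_sigmoid(x):
    """Logistic curve; x is the Harmony advantage of the plotted candidate."""
    return 1.0 / (1.0 + math.exp(-x))

def nhg_sigmoid(x, sigma):
    """P(plotted candidate wins) when independent N(0, sigma^2) noise is
    added to each of two candidates' Harmony values (an assumption).
    The difference of the two noise terms has standard deviation
    sigma * sqrt(2), so the result is a cumulative normal curve."""
    return 0.5 * (1.0 + math.erf(x / (2.0 * sigma)))

# Choose sigma so that the noise difference has standard deviation 1.702.
sigma = 1.702 / math.sqrt(2.0)
grid = [i / 100.0 for i in range(-800, 801)]
max_gap = max(abs(maxent_sigmoid(x) - nhg_sigmoid(x, sigma)) for x in grid)
print(max_gap)
```

With this scaling the two curves never differ by as much as one percentage point, which is why they are hard to tell apart by eye.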

How I made the curves

I'm sure there are better ways (for instance, R is probably good), but this was the method I arrived at on an ad hoc basis. You can see examples of how all this works if you download the spreadsheets and plotting scripts for individual cases above.

1. Obtain data. Some authors web-post their data, others have the data printed in their article, and still others give just a graph. Even with the latter, it is not hard to use Microsoft Paint to get the values: look at the bottom of the screen for vertical and horizontal coordinates of points in pixels. Hover over the data points, and over the legend ticks, and put their values into a spreadsheet. Then you can use arithmetic (or the handy Excel FORECAST() function) to convert pixels into real values.
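The pixel-to-value arithmetic is plain linear interpolation between two ticks whose values are known. Here is a minimal Python sketch; the function name and the example pixel readings are hypothetical.

```python
def pixel_to_value(pixel, pixel_a, value_a, pixel_b, value_b):
    """Convert a pixel coordinate to a data value by linear interpolation
    between two reference ticks (pixel_a, value_a) and (pixel_b, value_b)."""
    slope = (value_b - value_a) / (pixel_b - pixel_a)
    return value_a + slope * (pixel - pixel_a)

# Hypothetical vertical axis: the tick for 0.0 sits at pixel 400 and the
# tick for 1.0 at pixel 100 (pixel rows count downward from the top).
print(pixel_to_value(250, 400, 0.0, 100, 1.0))
```

The same function works for the horizontal axis; just supply two horizontal ticks instead.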

2. If necessary, reduce data from tokens to types-plus-counts. I do this by applying my little Typizer program to the rows of a spreadsheet, read in plain text form, containing just the constraint violations.
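For readers without the Typizer program, the same reduction can be sketched in a few lines of Python. This is a hypothetical stand-in, not the Typizer itself; it assumes each token is one line of text holding its constraint violations.

```python
from collections import Counter

def typize(lines):
    """Collapse identical token rows into (row, count) pairs.
    Blank lines are skipped; rows are returned in sorted order."""
    counts = Counter(line.rstrip("\n") for line in lines if line.strip())
    return sorted(counts.items())

# Hypothetical tab-separated violation rows, one per token.
tokens = ["1\t0", "0\t1", "1\t0"]
for row, count in typize(tokens):
    print(row, count, sep="\t")
```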

3. Do a MaxEnt analysis of the data, which is
easily done in spreadsheet form. The spreadsheets above
show you how to do this; it also helps to know the basics of MaxEnt (see the paper). The key step requires you to deploy the Excel Solver
(which is free, but must be activated) in order to calculate
constraint weights. During this stage, you should calculate
Harmony in two columns, one for Baseline
constraints and one for Perturber constraints, then use their sum to
give the overall Harmony from which probabilities are calculated.
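For those who prefer code to the Solver, here is a minimal Python sketch of the weight-finding step for two-candidate inputs, using plain gradient ascent on the log likelihood. It is not the spreadsheet method used for this gallery. It assumes that Harmony is a weighted sum of violations acting as a penalty (probability proportional to exp(-Harmony)), that each input is described by Candidate A's violations minus Candidate B's, and it omits any bounds or priors on the weights. All names and data are hypothetical.

```python
import math

def prob_a(weights, diffs):
    """P(Candidate A); diffs[i] = A's violations minus B's for constraint i."""
    harmony = sum(w * d for w, d in zip(weights, diffs))
    return 1.0 / (1.0 + math.exp(harmony))

def fit_weights(data, n_constraints, rate=0.5, steps=2000):
    """Fit weights by gradient ascent on the log likelihood.
    data: list of (diffs, count_a, count_b) tuples, one per input."""
    weights = [0.0] * n_constraints
    total = sum(a + b for _, a, b in data)
    for _ in range(steps):
        grad = [0.0] * n_constraints
        for diffs, count_a, count_b in data:
            p = prob_a(weights, diffs)
            # Expected count of A minus observed count of A.
            excess = (count_a + count_b) * p - count_a
            for i, d in enumerate(diffs):
                grad[i] += excess * d
        weights = [w + rate * g / total for w, g in zip(weights, grad)]
    return weights

# Hypothetical data: A violates the one constraint once and wins 25 of 100 times.
weights = fit_weights([([1], 25, 75)], n_constraints=1)
print(weights, prob_a(weights, [1]))
```

To keep Baseline and Perturber Harmony separate, sum the weighted violations over each constraint family on its own and add the two sums only when computing probability.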

In doing the MaxEnt analysis, use this trick, assuming a particular input has two
candidates A and B: if Candidate B has one violation
of Constraint X, record the violation in the spreadsheet not as a 1 in
Candidate B's row, but rather as a -1 in Candidate A's row. Then
the B row ends up blank, other than the crucial frequency value for B.
The math will come out the same, and it gives you the harmony values
in a form plottable as a single number, as described in my paper.
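A quick numerical check shows that the recoding leaves the probabilities unchanged. The Python sketch below assumes the usual MaxEnt formula, with probability proportional to exp(-Harmony) and Harmony a weighted sum of violations; the weights and violation rows are hypothetical.

```python
import math

def maxent_probs(weights, violations):
    """Probabilities for one input; violations has one row per candidate."""
    harmonies = [sum(w * v for w, v in zip(weights, row)) for row in violations]
    scores = [math.exp(-h) for h in harmonies]
    total = sum(scores)
    return [s / total for s in scores]

weights = [2.0, 0.5]        # hypothetical constraint weights

standard = [[1, 0],         # Candidate A violates the first constraint
            [0, 1]]         # Candidate B violates the second constraint

recoded = [[1, -1],         # B's violation entered as -1 in A's row
           [0, 0]]          # B's row is now blank

print(maxent_probs(weights, standard))
print(maxent_probs(weights, recoded))
```

After recoding, Candidate A's Harmony is the whole Harmony difference between the two candidates, so a single number per input determines the probability.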

4. Collate the data, keeping only
Candidate A for each pair. I perform this collation with formulas in the space
below the main MaxEnt analysis. You must also collate the values
for Observed Frequency,
Base Harmony, and Perturber Harmony. Optionally, you can include
data for Counts, if you'd like the program to plot poorly attested
data points at a smaller size. It is also good to gather values
for Predicted Frequency; then you can make a scattergram of Observed
against Predicted,
calculate the correlation, and in general assess whether your MaxEnt model
is a good model.
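The Observed-against-Predicted check can also be done outside the spreadsheet. Here is a minimal Python sketch of the Pearson correlation; the frequency values are hypothetical, and a high correlation is only one rough indicator of fit.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

observed = [0.10, 0.35, 0.80, 0.95]     # hypothetical Observed Frequency
predicted = [0.12, 0.30, 0.78, 0.97]    # hypothetical Predicted Frequency
print(round(pearson(observed, predicted), 3))
```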

5. Within the spreadsheet, fill in the
necessary fields to make a plot. These are shown in blue in the
spreadsheets posted here, and also can be seen in the downloadable plotting scripts.

6. Clip the blue material out of your spreadsheet and save it as a text file, which is the plotting script.

7. Download my PlotSigmoids.exe
program (Windows only, sorry!), put it in a new folder of your choice,
click on it, and drag a plotting script file onto the designated blank
area of the interface. It will make a bmp image and put it into the "out"
subfolder.

Questions: bhayes@humnet.ucla.edu.

Last updated March 2021