## UCLA Phonetics Lab: Statistics

In the Phonetics Lab, we use Excel, SPSS (we have only one license now), and R. This
page collects some class handouts and some tutorials first presented
in the lab's "Practical Data Analysis" series.

Linear mixed-effects (LME) models are an alternative to RM-ANOVA that addresses the sphericity problem, has more power, deals with missing data, and lets you test alternative models. See Colin Wilson's presentation on mixed-effects modeling (illustrated in R, not SPSS), and a data file from Baayen et al. to practice with (used on a Mac; PC users may want to open it in Excel first). See also the tutorial by Quené and van den Bergh (2004), "On multi-level modeling of data from repeated measures designs: a tutorial", *Speech Communication* 43: 103-121 (online access is available through the UCLA Library). It is now possible to do LME in SPSS as well as in R: see the online SPSS tutorial on restructuring data files and running analyses.

About analyzing data from speech perception experiments:

- Analyzing identification data
- Analyzing discrimination data (and comparing identification and discrimination)
- Analyzing discrimination and identification data with D-prime (signal detection)
- Analyzing ratings data

Excel (Microsoft): should be on every general-purpose computer in the lab. You may be surprised at how much you can do just with Excel. (The info below may refer to older versions of Excel.)

However, see this page by Hans Pottel about limitations of Excel for serious statistical analysis.

- a Ling. 104/204 handout on how to do some basic statistics in Excel
- FAQ: How to add error bars to bars in Excel charts? An answer is given at the bottom of this page.

SPSS:

- Allows Repeated Measures ANOVA with sphericity correction.
- Fully supported by the statistics consultants on campus.
- The CLICC offers remote access to SPSS through their Software Shortcut website.

- Go to SPSS Home Page
- Go to UCLA ATS consulting page on SPSS, including their SPSS Starter Kit for new users
- Go to RM ANOVA: Doing it in SPSS

FAQ: How to add error bars to bars in Excel charts?

This question comes up all the time, and since pages elsewhere with
the answer come and go, we now include it here, incongruous as it may
be.

The relevant Excel command is "**Format Data Series**", which can be
reached either by clicking on data after the plot is made, or through
the Format menu. Under Format Data Series go to the Y Error Bars
tab, and (generally) choose either 1 standard deviation, or the standard
error.

Which one to use? Depends what you want to show...

**Standard Deviation:** for a normal curve, the region between the mean and
1 SD above it contains about 34% of the total data under the curve; thus
plus-minus 1 SD around a mean value in a graph covers about 68% (roughly 2/3)
of the total data.
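
The 34% and 2/3 figures can be checked against the standard normal distribution. The following is a minimal Python sketch using only the standard library; the function name `coverage_within` is our own, chosen for illustration, and the figures hold only to the extent that the data are normally distributed.

```python
from statistics import NormalDist


def coverage_within(k_sd):
    """Proportion of a normal distribution lying within +/- k_sd SDs of the mean."""
    std_normal = NormalDist()  # mean 0, SD 1
    return std_normal.cdf(k_sd) - std_normal.cdf(-k_sd)


# Share of the data between the mean and 1 SD above it
print(round(NormalDist().cdf(1) - 0.5, 4))  # 0.3413

# Share of the data within plus-minus 1 SD of the mean
print(round(coverage_within(1), 4))  # 0.6827
```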

**Standard Error:** the SD of the sampling distribution of the
mean (the square root of the variance of the mean). The standard error
is proportional to the SD and inversely proportional to the square root of
the sample size (SE = SD divided by the square root of N). 1.96 standard
errors gives a 95% confidence interval
for the mean. That is, the 95% confidence interval around the mean
is plus-minus 1.96 times the standard error (e.g. if the mean is 100
and its standard error is 1, then the 95% confidence interval around that
mean is from 98.04 to 101.96). It’s useful to show the 95% confidence
interval on a graph because these intervals can be visually compared across
means: if the confidence intervals don’t overlap then the means are reliably
different. But the standard error by itself (not 1.96 times it) is
not so useful visually. Similarly, the 99% confidence interval is
the mean plus-minus 2.58 times the standard error. (Note: these
multipliers come from the normal distribution. For small samples they are
adjusted upward using the t distribution, e.g. to 2.13 and 2.95 with 15
degrees of freedom.)
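
For anyone who wants to check these numbers outside Excel, here is a minimal Python sketch using only the standard library. The function names (`z_multiplier`, `confidence_interval`, `standard_error`) are our own, and the list `scores` is made-up data for illustration only. The sketch uses the large-sample normal multipliers, not the small-sample adjustment.

```python
import math
import statistics
from statistics import NormalDist


def z_multiplier(confidence=0.95):
    """Two-sided normal multiplier, e.g. about 1.96 for a 95% interval."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def confidence_interval(mean, std_error, confidence=0.95):
    """Lower and upper bounds: mean plus-minus multiplier times standard error."""
    half_width = z_multiplier(confidence) * std_error
    return mean - half_width, mean + half_width


def standard_error(values):
    """Sample SD (n - 1 denominator) divided by the square root of n."""
    return statistics.stdev(values) / math.sqrt(len(values))


# The worked example from the text: mean 100, standard error 1
low, high = confidence_interval(100, 1)
print(round(low, 2), round(high, 2))  # 98.04 101.96

# Multiplier for a 99% interval
print(round(z_multiplier(0.99), 2))  # 2.58

# Made-up data, for illustration only
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(standard_error(scores), 3))  # 0.756
```

The value returned by `standard_error` is what you would enter as a custom error amount in Excel if you wanted plain standard-error bars; multiply it by 1.96 for bars showing the 95% confidence interval.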

*Last updated July 2011 by P. Keating*