Mathematical Structures in Language (LING 218)
Topic: "Counting Techniques and Statistics"
Winter 2005
Ed Keenan and Marcus Kracht
- Time: MW 11:00–1:00
- Place:
- Prerequisites: Ling 180/208 or equivalent
- Instructors: Ed Keenan & Marcus Kracht
- Work required: weekly homework problems
Short Description of the Course
This course will cover two areas: combinatorics and statistics. The
content will be posted soon, and a syllabus will be made available in
due course (see below for a download).
Weeks 1 - 3:
Basic counting techniques:
- cardinalities of disjoint unions (addition),
- cross products (multiplication),
- function spaces (exponentiation),
- permutations (with fixed points),
- partitions (equivalence relations),
- and the "n choose k" notation.
We will practice defining bijections, since a major way of counting a
set is to show that it is in bijection with a set we already know how
to count. We end with some recurrence relations and an introduction
to elementary probability. Linguistic applications: measuring the
strength of constraints. For example, extensional adjectives are
restricting: Adj+N denotes a subset of N. How many maps from sets to
sets are restricting? How many Det denotations satisfy the
conservativity condition? How many are intersective?
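As a small illustration of this counting style (a sketch for a toy case, not course material): on a two-element universe we can count the restricting maps by brute force and check the result against the closed form, where each subset N contributes 2^|N| choices for its image.

```python
from itertools import combinations, product
from math import comb, prod

def powerset(xs):
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1)
            for c in combinations(xs, r)]

E = {0, 1}                  # a tiny universe, chosen for illustration
subsets = powerset(E)

# Brute force: a map f on subsets is "restricting" if f(N) ⊆ N for all N.
count = 0
for images in product(subsets, repeat=len(subsets)):
    if all(img <= N for N, img in zip(subsets, images)):
        count += 1

# Closed form: each N allows 2^|N| images, and there are C(n, k)
# subsets of size k, so count = prod over k of (2^k)^C(n, k).
n = len(E)
closed = prod((2 ** k) ** comb(n, k) for k in range(n + 1))
print(count, closed)  # both 16 for n = 2
```

The same pattern — enumerate for a small case, then match a closed formula — is how many of the counting questions above can be checked by hand.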
Weeks 4 - 6:
- Probability spaces: definitions and basic laws;
- Bayesian statistics;
- Basic probability distributions;
- Random variables: expectation and variance.
This will first of all give us the tools for statistical
investigation: probability spaces, and how they are related to a
particular problem at hand. Once this is set up we can ask: what is
the likelihood of a random letter (or a random sound) being equal to
"e" or "r"? What is the likelihood that "e" follows "p"? And so on.
Since the probability of "e" is obtained by dividing the number of
cases in which "e" occurs by the total number of cases, the counting
techniques developed in Weeks 1 - 3 will be put to immediate use.
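Such letter and bigram probabilities can be estimated from raw counts. A minimal sketch, using a made-up toy corpus rather than any data from the course:

```python
from collections import Counter

# A toy corpus, made up for illustration.
text = "peter piper picked a peck of pickled peppers"
letters = [c for c in text if c.isalpha()]

# Estimate P(letter) as count(letter) / total number of letters.
unigram = Counter(letters)
total = len(letters)
p_e = unigram["e"] / total

# Estimate P(next = "e" | current = "p") from bigram counts.
# Note: dropping the spaces glues words together, so some bigrams
# cross word boundaries; a careful treatment would mark boundaries.
bigrams = Counter(zip(letters, letters[1:]))
p_first = sum(c for (a, _), c in bigrams.items() if a == "p")
p_e_after_p = bigrams[("p", "e")] / p_first
print(round(p_e, 3), round(p_e_after_p, 3))
```

Dividing occurrences of "e" by the total number of cases is exactly the counting-based definition of probability from Weeks 1 - 3.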
Week 7:
Parameter estimation (e.g. maximum likelihood). Once we have decided
on a particular type of model, there are various parameters to fix.
For example, a Markov model for letter sequences (or other
applications) requires fixing the transition probabilities. But
how do we get them in the first place? This is a very important
issue, as there are theoretically reliable methods which are often
impractical, so compromises have to be found (whose validity is
once again dependent on statistical facts).
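The simplest instance of maximum likelihood is estimating the bias of a coin: the estimate is just the empirical frequency of heads. A sketch with an invented sample, confirming by grid search that this frequency really does maximize the likelihood:

```python
from math import log

# A hypothetical coin-flip sample, made up for illustration.
flips = "HHTHTHHHTH"
n_heads = flips.count("H")
n = len(flips)

def log_likelihood(p):
    # Log-probability of the observed sample under bias p.
    return n_heads * log(p) + (n - n_heads) * log(1 - p)

# The ML estimate is the empirical frequency heads / n; a grid
# search over candidate values of p confirms it is the maximizer.
grid = [i / 1000 for i in range(1, 1000)]
best = max(grid, key=log_likelihood)
mle = n_heads / n
print(mle, best)  # 0.7 and 0.7
```

Estimating Markov transition probabilities works the same way: the ML estimate of P(b | a) is the count of the bigram "ab" divided by the count of "a".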
Week 8:
Hypothesis testing (t-test; chi-square test). Once everything is
in place and we have two competing hypotheses A and B concerning
our data, which one fits better? Various methods of testing have
been proposed, each with its own merits and drawbacks.
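Pearson's chi-square test, for example, can be computed by hand. A sketch with invented counts, testing whether five letters are uniformly distributed:

```python
# Observed counts of five letters, made up for illustration.
observed = [30, 14, 34, 45, 57]
total = sum(observed)

# Null hypothesis: all five letters are equally likely.
expected = [total / len(observed)] * len(observed)

# Pearson chi-square statistic: sum of (O - E)^2 / E.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# With 5 - 1 = 4 degrees of freedom, the 5% critical value is
# about 9.488; reject the uniform hypothesis if chi_sq exceeds it.
print(round(chi_sq, 2), chi_sq > 9.488)
```

Here the statistic far exceeds the critical value, so the uniform hypothesis would be rejected at the 5% level for this made-up sample.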
Weeks 9 & 10:
Markov and hidden Markov models; the ergodic theorem (behavior in the
limit). Markov models play an important role, e.g. in speech
recognition and tagging. They rely on predicting the future from the
present, without any memory of the past. The ergodic theorem states
that for an ergodic chain, whatever the initial probability
distribution is, the distribution converges in the limit to the
unique stationary distribution. This allows us to extract from the
model the probability of occurrence of individual states.
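This limiting behavior is easy to see numerically. A sketch with an invented two-state transition matrix, iterating the chain from several starting distributions:

```python
# A two-state transition matrix, made up for illustration:
# P[i][j] is the probability of moving from state i to state j.
P = [[0.9, 0.1],
     [0.4, 0.6]]

def step(dist, P):
    # One step of the chain: new_j = sum_i dist_i * P[i][j].
    return [sum(dist[i] * P[i][j] for i in range(len(P)))
            for j in range(len(P))]

# Whatever the initial distribution, after many steps the
# distribution settles to the same stationary distribution.
for start in ([1.0, 0.0], [0.0, 1.0], [0.5, 0.5]):
    dist = start
    for _ in range(100):
        dist = step(dist, P)
    print([round(x, 4) for x in dist])  # [0.8, 0.2] every time
```

For this matrix the stationary distribution solves pi = pi P, giving pi = (0.8, 0.2), and these are the long-run occurrence probabilities of the two states.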
Course Material