Due: September 29
Theme: Examine naturally occurring spoken discourse
Collect a sample of naturally occurring spoken discourse. "Naturally occurring"
can be broadly construed to include radio or talk shows, children's play,
radio or TV news items, spontaneous or scripted storytelling, classroom
interactions, task-oriented conversations, classroom lectures, etc. By
"collect" we mean make video recordings (unless that is absolutely impossible
and audio recording is your only option). You should collect a minimum
of 15 minutes, and then transcribe at least 5 continuous minutes (usually
the middle of the discourse is the most natural). By "transcribe" we mean
you should make a record on paper of what you saw and heard, a record good
enough that when we read the transcript, we know what went on. Read
the Schiffrin appendix before doing this. You will probably want to use
one of her methods; if you do not, you will have to justify why.
The point is to push you to think about what discourse is and what
makes it hard to model discourse in a computational system. You may want
to have an interactive system in mind when you choose your sample.
Think about how a computer could replace a participant in the discourse.
Supposing that you had perfect word recognition, what are the most challenging
issues in processing the discourse? Are some of these challenges specific
to the sample domain you chose? Another point is to think about what makes
a sufficient record of discourse: how do you turn a speech event into an
on-paper transcript? What parameters need to be transcribed (the words,
the pronunciation of words, the intonation, the facial expression, the
gestures, fidgeting, pauses, etc.)?
What you must turn in:
Please turn in the typed transcript and a ONE-PAGE discussion
of the points listed above. That is, minimally, discuss the issue of what
makes an adequate transcription, and what challenges a computer would face
in participating in the discourse that you have collected.