Collaboration Using Speech Recognition
Call In Radio Programs
The best-quality radio conversations
might be constructed asynchronously, with the broadcast occurring only after a
highly engaged segment of the audience has had the opportunity to interact.
High-accuracy recognition of a relatively small set of user-preferred control
phrases (e.g. "what can I do?", "back", "forward", "time?", "sssh!", "yes",
"no", "I disagree", "that's wrong", "I have a question", "I have a comment",
"who said that?", "who disagreed?", "who laughed?", "why?", "that is
interesting/dumb/irrelevant/redundant") may be used to
record, sequence and layer soundbites in order to simulate a linear, real-time
conversation that is optimized to engage the audience. While that audience
will initially be identified by mass demographics, these will become
increasingly focused until every stream received can be personalized.
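As a minimal sketch of how a small control vocabulary might be handled, the Python below maps a recognized utterance to a command. It assumes the speech recognizer has already produced a text transcript; the command names (HELP, DISAGREE and so on) and the function names are hypothetical and do not come from the original design.

```python
import re

# Hypothetical mapping from spoken control phrases (taken from the list
# above) to internal command names. The command names are illustrative.
CONTROL_PHRASES = {
    "what can i do": "HELP",
    "back": "BACK",
    "forward": "FORWARD",
    "time": "TIME",
    "sssh": "SILENCE",
    "yes": "YES",
    "no": "NO",
    "i disagree": "DISAGREE",
    "that's wrong": "DISAGREE",
    "i have a question": "QUESTION",
    "i have a comment": "COMMENT",
    "who said that": "WHO_SAID",
}


def normalize(utterance: str) -> str:
    """Lowercase, drop punctuation other than apostrophes, squeeze spaces."""
    cleaned = re.sub(r"[^a-z' ]+", " ", utterance.lower())
    return " ".join(cleaned.split())


def match_command(utterance: str):
    """Return the command for an exact phrase match, or None.

    Anything outside the small vocabulary is treated as free speech
    (for example, a soundbite to be recorded) rather than a command.
    """
    return CONTROL_PHRASES.get(normalize(utterance))
```

Keeping the vocabulary this small is what makes high recognition accuracy plausible: the matcher only has to tell a dozen phrases apart, and everything else is passed through as recorded speech.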
Working Group Meetings
Similar techniques may be used to augment or supplant business meetings.
Instead of occurring at a fixed time, such meetings would be asynchronous,
allowing objectionable and weak contributions to be weeded out or refined
without costing the whole group time. Because such
meetings tend to be continuous, time-sensitive subjects can be raised
and receive the group's attention more quickly than at a scheduled meeting.
Subscribers could be alerted to emerging high-priority areas by a voice or email
message that directs them to a linear conversation. When the recorded parties
have a high degree of interest in a new listener's reaction (e.g. when the
listener is higher in the chain of command), they might be paged to participate
live in the resulting conversation.
Process
The following is an outline of how the process could augment the call-in format
commonly used in radio.
- The program is distributed live on radio, preceded by an announcement like,
"This program is part of a series on ____ being continuously improved by
feedback from callers to (phone number) and visitors to our web site. If you
would like to ask a question, comment or disagree, please note the exact time,
as it will help you find the context for your interaction when you call. The
most interesting interactions will be integrated into a rebroadcast of
this program on/at ______."
- The program is recorded and the host's questions are marked and tagged to
facilitate navigation.
- Callers are asked to identify themselves, find the point at which they
wish to interject and speak their piece. With speaker barge-in, a
short utterance, cough or hurrah may be captured and used later as an audio
overlay. Such overlays will cue subsequent users that interaction with
other participants is possible. The volume of the overlay may be
adjusted to indicate the significance of the interjection.
- If widely supported, the barge-in may interrupt the conversation and set a
new direction. It is important for support to be registered quickly so
that the host and speaker can be stopped and redirected to the problematic area.
The producer may override a strongly supported interjection if they feel that
it detracts from the need to present a case.
- The system instructs callers in the use of the control language, introducing
only one or two words at a time, so that novices can easily achieve the most
common objectives while repeat users become adept at using the controls.
- Contributors are asked to classify each contribution as a
question, disagreement, comment or restatement. They are also asked
whether they would like to record a short interjection, as if participating
in a live studio audience, such as an empathetic word, a cough or a hurrah.
This earcon will be inserted or mixed into the program to serve as a cue for
subsequent interactions. Feedback from subsequent users' reactions to
the piece will decide whether to insert the piece automatically, play the
piece only if the user interacts with the earcon, or discard the piece.
- The contributor then hears other contributions to the program and is asked
for an opinion on each. These opinions are collectively weighed (based on the
feedback received above; interactions can embed recursively) and influence
the subsequent prominence (principally volume) of the contributions as
represented by their earcons. Interesting and popular questions and
comments will tend to rise in prominence. Pieces whose feedback crosses a
certain threshold will change from earcons to become part of the program.
- The program staff monitors for emergent hotspots and can queue
those questions and comments for the live program. This can also be done
automatically.
- The guest may stay after the live program ends, or call in himself, in
order to have more time to talk with the audience "off the air".
- The show is rebroadcast at a later time or date enhanced by the
asynchronous audience participation. Once the system has been primed
by the program host, the
audience collectively directs the program.
- A reward system is instituted to motivate high value contributions.
This could range from acknowledgements to cash rewards.
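The feedback steps in the outline above can be sketched as a simple scoring rule. This is only an illustration under stated assumptions: the outline gives no numeric values, so the rating scale of -1 to 1, the thresholds PROMOTE_AT and DISCARD_AT, and the class and method names are all hypothetical. The per-rating weight stands in for the recursive weighing of opinions, for instance by giving more weight to listeners whose own contributions were well received.

```python
from dataclasses import dataclass, field

# Assumed thresholds; the outline does not give numeric values.
PROMOTE_AT = 0.5   # at or above: insert the piece into the program
DISCARD_AT = -0.5  # at or below: drop the piece


@dataclass
class Contribution:
    kind: str  # "question", "disagreement", "comment" or "restatement"
    # Each entry is (rating, weight): rating in [-1, 1], weight > 0.
    feedback: list = field(default_factory=list)

    def rate(self, rating: float, weight: float = 1.0) -> None:
        """Record one listener's reaction to this contribution."""
        self.feedback.append((rating, weight))

    def score(self) -> float:
        """Weighted mean rating, or 0.0 when nobody has reacted yet."""
        total = sum(w for _, w in self.feedback)
        if total == 0:
            return 0.0
        return sum(r * w for r, w in self.feedback) / total

    def treatment(self) -> str:
        """Decide how the piece is presented to subsequent listeners."""
        s = self.score()
        if s >= PROMOTE_AT:
            return "insert"   # becomes part of the program
        if s <= DISCARD_AT:
            return "discard"
        return "earcon"       # played only if the listener interacts

    def earcon_volume(self) -> float:
        """Map the score from [-1, 1] onto a volume in [0, 1]."""
        return max(0.0, min(1.0, (self.score() + 1) / 2))
```

A contribution with no feedback starts as an earcon at middle volume, so new pieces get a hearing before the audience's reactions move them up into the program or out of it.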
Notes: As a general principle, the novice user is exposed to a minimal spoken
command set. As users become more experienced, they are offered
opportunities to use and become fluent in more complex command
vocabularies. This approach may also be used in conjunction with a WWW
interface or Internet broadcast.
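One way to picture this staged exposure is a tiered vocabulary that widens as a caller gains experience. The sketch below is an assumption-laden illustration: the grouping of phrases into tiers, the pacing constant CALLS_PER_TIER, and the use of completed calls as the measure of experience are all hypothetical.

```python
# Hypothetical tiers, ordered from the most common commands to the least.
VOCABULARY_TIERS = [
    ["what can i do", "back"],
    ["forward", "yes", "no"],
    ["i have a question", "i have a comment"],
    ["i disagree", "who said that"],
]

# Assumed pacing: one further tier per this many completed calls.
CALLS_PER_TIER = 2


def active_vocabulary(completed_calls: int) -> list:
    """Return the phrases offered to a caller with this much experience."""
    tiers_unlocked = 1 + completed_calls // CALLS_PER_TIER
    phrases = []
    for tier in VOCABULARY_TIERS[:tiers_unlocked]:
        phrases.extend(tier)
    return phrases
```

A first-time caller would be offered only the first tier, while a caller with many completed calls would be offered every phrase.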
Revision History
Created 2/10/2003. Improved process or edited 2/25/04, 6/19/03,
1/30/04, 6/11/2004.
© 2003-2005 discussIT.org