In the Service of Collaboration that Balances Openness and Accuracy

Collaboration Using Speech Recognition

Call In Radio Programs

The best radio conversations might be constructed asynchronously, with the broadcast occurring only after a highly engaged segment of the audience has had the opportunity to interact.  High-accuracy recognition of a relatively small set of user-preferred control phrases (e.g. "what can I do?", "back", "forward", "time?", "sssh!", "yes", "no", "I disagree", "that's wrong", "I have a question", "I have a comment", "who said that?", "who disagreed?", "who laughed?", "why?", "that is interesting/dumb/irrelevant/redundant") may be used to record, sequence and layer soundbites in order to simulate a linear, real-time conversation that is optimized to engage the audience.  While that audience will be identified initially by mass demographics, the targeting will become increasingly focused until every stream received can be personalized.
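
As a minimal sketch of how such a small control vocabulary might be mapped onto actions, assume the speech recognizer has already returned a text transcript. The `Action` names, the `PHRASES` table and the `parse_command` helper are hypothetical illustrations, not part of any system described here.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    HELP = "help"
    BACK = "back"
    FORWARD = "forward"
    TIME = "time"
    QUIET = "quiet"
    YES = "yes"
    NO = "no"
    DISAGREE = "disagree"
    QUESTION = "question"
    COMMENT = "comment"


# Hypothetical mapping from recognized phrases to actions (an assumption).
PHRASES = {
    "what can i do": Action.HELP,
    "back": Action.BACK,
    "forward": Action.FORWARD,
    "time": Action.TIME,
    "sssh": Action.QUIET,
    "yes": Action.YES,
    "no": Action.NO,
    "i disagree": Action.DISAGREE,
    "that's wrong": Action.DISAGREE,
    "i have a question": Action.QUESTION,
    "i have a comment": Action.COMMENT,
}


def parse_command(transcript: str) -> Optional[Action]:
    """Return the action for a recognized phrase, or None if it is not a command."""
    # Normalize case, surrounding whitespace and trailing punctuation.
    key = transcript.strip().lower().rstrip("?!.").strip()
    return PHRASES.get(key)
```

Anything that does not match a control phrase returns `None`, so the caller's speech can be treated as a contribution to record rather than as a command.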

Working Group Meetings

Similar techniques may be used to augment or supplant business meetings.  Instead of occurring at a fixed time, such meetings would be asynchronous, allowing objectionable and weak contributions to be weeded out or refined without necessarily exposing the whole group to time losses.  Because such meetings tend to be continuous, time-sensitive subjects can be raised quickly and receive the group's attention sooner than at a scheduled meeting.  Subscribers could be alerted to emerging high-priority areas by voice or email, which directs them to a linear conversation.  When the recorded parties have a high degree of interest in the new listener's reaction (e.g. when the listener is higher in the chain of command), they might be paged to participate live in the resulting conversation.

Process

The following is an outline of how the process could augment the call-in format commonly used in radio.

  1. The program is distributed live on radio preceded by an announcement like, "This program is part of a series on ____ being continuously improved by feedback from callers to (phone number) and visitors to our web site.  If you would like to ask a question, comment or disagree please note the exact time as it will help you to find the context for your interaction when you call.  The most interesting interactions will be integrated into a rebroadcast of this program on/at ______."
  2. The program is recorded and the host's questions are marked and tagged to facilitate navigation.
  3. Callers are asked to identify themselves, find the point at which they wish to interject and speak their piece.  With speaker barge-in, a short utterance, cough or hurrah may be captured and used later as an audio overlay.  Such overlays will cue subsequent users that interaction with other participants is possible.  The volume of the overlay may be adjusted to indicate the significance of the interjection. 
  4. If widely supported, the barge-in may interrupt the conversation and set a new direction.  It is important for support to be registered quickly so that the host and speaker can be stopped and redirected to the problematic area.  The producer may override a strongly supported interjection if they feel that it detracts from the need to present a case.
  5. The system instructs callers in the use of the control language introducing only one or two words at a time so that novices are easily able to achieve the most common objectives while repeat users can become adept at using the controls.
  6. Contributors are asked to classify each contribution as a question, disagreement, comment or restatement.  They are also asked whether they would like to record a short interjection, as if participating in a live studio audience, such as an empathetic word, a cough or a hurrah.  This earcon will be inserted or mixed into the program to serve as a cue for subsequent interactions.  Feedback from subsequent users' reactions to the piece will decide whether the system inserts the piece automatically, plays it only if the user interacts with the earcon, or discards it.
  7. The contributor then hears other contributions to the program and is asked for an opinion on each.  These opinions are collectively weighed (based on the feedback received above; interactions can embed recursively) and influence the subsequent prominence (principally volume) of the contributions as represented by their earcons.  Interesting and popular questions and comments will tend to rise in prominence.  According to this feedback, pieces which cross a certain threshold will change from earcons to become part of the program.
  8. The program staff monitors for emergent hotspots and can queue those questions and comments for the live program.  Alternatively, this can be done automatically.
  9. The guest may stay after the live program ends, or call in himself, in order to have more time to talk with the audience "off the air".
  10. The show is rebroadcast at a later time or date enhanced by the asynchronous audience participation.  Once the system has been primed by the program host, the audience collectively directs the program.  
  11. A reward system is instituted to motivate high value contributions.  This could range from acknowledgements to cash rewards.
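
The feedback loop in steps 6 and 7 can be sketched roughly as follows. This is only an illustration under stated assumptions: the numeric thresholds, the opinion scale of -1.0 to 1.0, the per-listener weights and the `Contribution` class are all hypothetical, since the outline does not specify how opinions are scored.

```python
from dataclasses import dataclass

# Assumed thresholds; the outline gives no numeric values.
DISCARD_BELOW = -0.5
PROMOTE_ABOVE = 0.5


@dataclass
class Contribution:
    speaker: str
    kind: str  # "question", "disagreement", "comment" or "restatement"
    weighted_sum: float = 0.0
    total_weight: float = 0.0

    def add_feedback(self, opinion: float, weight: float = 1.0) -> None:
        """Record one listener opinion in [-1, 1], scaled by the listener's weight."""
        clipped = max(-1.0, min(1.0, opinion))
        self.weighted_sum += clipped * weight
        self.total_weight += weight

    def score(self) -> float:
        """Weighted mean opinion; 0.0 when nobody has reacted yet."""
        if self.total_weight <= 0:
            return 0.0
        return self.weighted_sum / self.total_weight

    def status(self) -> str:
        """Decide whether the piece is discarded, kept as an earcon, or inserted."""
        s = self.score()
        if s < DISCARD_BELOW:
            return "discard"
        if s > PROMOTE_ABOVE:
            return "insert"
        return "earcon"

    def earcon_volume(self) -> float:
        """Map the score from [-1, 1] onto a playback gain in [0, 1]."""
        return max(0.0, min(1.0, (self.score() + 1.0) / 2.0))
```

A new contribution starts as an earcon at half volume; as listeners react, its earcon grows louder or quieter, and once its score crosses the assumed promotion threshold it would be played as part of the program itself.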

Notes:  As a general principle, the novice user is exposed to a minimal spoken command set.  As users become more experienced, they are offered opportunities to use and become fluent in more complex command vocabularies.  This approach may also be used in conjunction with a WWW interface or Internet broadcast.
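
One way to picture this gradual introduction of commands is a tiered vocabulary that unlocks with experience. The sketch below is an assumption-laden illustration: the grouping of phrases into `TIERS`, the pacing constant `CALLS_PER_TIER` and the `active_vocabulary` helper are hypothetical, and a real system might measure experience by something other than completed calls.

```python
from typing import List

# Hypothetical tiers, ordered from most basic to most advanced.
TIERS: List[List[str]] = [
    ["yes", "no"],
    ["back", "forward"],
    ["i have a question", "i have a comment"],
    ["i disagree", "who said that"],
]

# Assumed pacing: one new tier after every two completed calls.
CALLS_PER_TIER = 2


def active_vocabulary(completed_calls: int) -> List[str]:
    """Return the commands offered to a caller with the given call history."""
    # Novices always get the first tier; further tiers unlock with experience.
    unlocked = min(len(TIERS), 1 + max(0, completed_calls) // CALLS_PER_TIER)
    return [phrase for tier in TIERS[:unlocked] for phrase in tier]
```

Keeping the active vocabulary small for novices should also help the recognizer, since it has fewer phrases to distinguish.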

Revision History
Created 2/10/2003.  Improved process or edited 2/25/04, 6/19/03, 1/30/04, 6/11/2004.

  © 2003-2005 discussIT.org