Auditory User Interfaces --Foreword
By David Gries

In his award-winning Ph.D. thesis some three years ago, T.V. Raman described a computing system, called AsTeR, for rendering electronic documents aurally. Instead of reading documents on a monitor or on paper, one listens to them. AsTeR's spoken math is far easier to understand than ours. The listener browses the spoken document and can have parts of it repeated ---even in a different speaking style. AsTeR allows the usually passive listener to be an active participant in understanding an aural rendering of a document.

Raman extended his ideas on talking computers; they speak to him in an increasingly sophisticated fashion. His aural desktop lets him listen to applications as he navigates his file system, manages tasks, maintains his calendar and rolodex, edits, handles his email, browses the web, develops and debugs programs, and reads articles and books.

Raman's speech-enabling idea is to allow applications to produce aural output directly, using the same information that is used for more conventional visual output. The AUI (Auditory User Interface) works directly with the computational core of the application itself, just as the conventional GUI (Graphical User Interface) does.

You will read Raman's philosophy on user interaction, learn of the shortcomings of speaking the screen and the benefits of a real AUI, and see how Raman uses sophisticated facilities in Emacs to implement his ideas for AUI's.

Raman calls for a return to simplicity, for that can provide better human-computer interaction. In the early days, there was a clean separation between computation and user interface ---there almost had to be since I/O was so primitive. As peripherals became more complicated, the separation became muddier. By enforcing a cleaner separation, Raman builds nice AUI's and makes his computing more effective and efficient. His system adds a dimension to human-computer interaction. I continue to be amazed at all these advances. Let me spend just a few paragraphs on the changes I have seen.

I took my only computer course in 1959, as a college senior. We learned how to program in a virtual assembly language. It didn't matter that the language was not real, since there weren't any computers to run our programs anyway. Forty years ago, user-computer interaction was almost nonexistent.

Around 1964, I helped teach programming in Germany on a machine whose input device was a paper-tape reader. Paper tape came on a roll; holes were punched in the paper to record information. Of course, the punched card reader had also been available for years (a punched card could contain up to 80 characters of information). If you made a mistake on a punched card, you only had to retype that card, and not a whole paper tape.

Punched cards did not always make the computer readily accessible. For example, at Cornell, in about 1970, the mainframe computer that ran students' programs was near the airport, some 4-5 miles away. Twice daily, the decks of punched cards to be run on the machine were trucked to the airport, and the output from execution came back four or five hours later! A year or two later, card readers hooked directly to the mainframe were placed at several locations around the campus. But even then, all through the 1970's, as the deadline for a programming assignment neared, the line of students waiting to put their cards into the card reader grew longer and longer. Sometimes, students waited one-half hour ---and then waited another hour for the program to run and the output to be printed. In the late 1970's, in many places ``terminals'' replaced punched-card input. But the real change didn't come until personal computers were introduced ---first, on machines like the Terak and finally in about 1983 with the introduction of the Macintosh (with 256K of memory, a floppy disk, and no hard disk). For the first time, one had almost instant feedback during compilation and execution. No longer did one have to wait five minutes to five hours for the output of a compilation or execution!

This desktop paradigm, with keyboard-mouse input and screen output, has remained largely unchanged for perhaps ten years ---although machines got faster, disks and memory larger, and program environments more sophisticated. And graphics, not just text, came to be an important part of the human-computer interface.

The latest change was the advent of the WWW and then Java as a language for writing interactions. Now, programs written by students in the first programming course use browsers like Netscape, making the course more interesting.

In summary, forty years has seen remarkable change in human-computer interaction. How do all these changes come about, and what changes can we expect in the future? I can't answer the second question, but the answer to the first question is easy: the changes are driven by the vision of people like T.V. Raman. Discontented with their current situation, but enthusiastic, creative, and persistent, these visionaries work to make drastic improvements. I wonder what the next forty years of computing will bring.

David Gries
William L. Lewis Professor of Engineering
Computer Science, Cornell University
Ithaca, NY


------------------------------------------------------------------------ Last modified: Tue Aug 19 17:08:44 1997