Audio Component Post-Mortem

Update as of October 15, 1998.

Most of this document was written in early September 1998. As of this date (October 15, 1998), some additional information has come to light that invalidates a few of our conclusions. Rather than rewriting the entire document, we've simply added comments in the affected places. See the status section at the end of this document for a current status report.

Things We Messed Up

Every project has its problems. This portion of the document describes some of the ones that have to be laid squarely in our laps.

ViaVoice and ViaVoice Gold

We're using IBM's JSAPI implementation for Windows 95 PCs for the speech portion of the Audio Look and Feel. One of the prerequisites for this package is IBM's ViaVoice Gold. When we began working on a Java version of the audio component, we had ViaVoice (but not the Gold version). Since time was tight (we were looking at a June 30 deadline), we tried using the software we had (ViaVoice) and set the process in motion to obtain a copy of ViaVoice Gold.

Speech worked with ViaVoice, but we encountered some problems when we tried using speech and sound in the same application. In particular, we found that after a sound had been played with a JMF Player, speech wouldn't work. It was necessary to stop() and deallocate() the Player before speech could be performed. Another problem was that in approximately 5-10% of cases, sound would not start after speech unless we added a 50 millisecond wait after speech completed. Both of these "work-arounds" were incorporated into our implementation.
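
The ordering of the two work-arounds can be sketched roughly as follows. This is only an illustration, not our actual code: SoundPlayer and Speaker are hypothetical stand-ins for the JMF Player and the JSAPI synthesizer (only the stop() and deallocate() names come from JMF), and SpeechSoundSequencer is an invented name. It assumes all calls come from a single worker thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the sound side. stop() and deallocate() mirror the
// JMF Player methods named above; the interface itself is an assumption.
interface SoundPlayer {
    void start();
    void stop();
    void deallocate();
}

// Hypothetical stand-in for the speech side: returns once the text has been spoken.
interface Speaker {
    void speakAndWait(String text);
}

// Not thread-safe; intended to be driven by one worker thread.
class SpeechSoundSequencer {
    // The 50 ms pause after speech described above.
    static final long POST_SPEECH_DELAY_MS = 50;

    private final SoundPlayer player;
    private final Speaker speaker;
    private boolean playerHoldsDevice = false;

    SpeechSoundSequencer(SoundPlayer player, Speaker speaker) {
        this.player = player;
        this.speaker = speaker;
    }

    void playSound() {
        player.start();
        playerHoldsDevice = true;
    }

    void speak(String text) throws InterruptedException {
        if (playerHoldsDevice) {
            // Work-around 1: stop and deallocate the player before any speech.
            player.stop();
            player.deallocate();
            playerHoldsDevice = false;
        }
        speaker.speakAndWait(text);
        // Work-around 2: pause briefly so the next sound starts reliably.
        Thread.sleep(POST_SPEECH_DELAY_MS);
    }
}

class SequencerDemo {
    // Records the order of calls made on a fake player and a fake speaker.
    static List<String> run() {
        final List<String> log = new ArrayList<>();
        SoundPlayer fakePlayer = new SoundPlayer() {
            public void start() { log.add("start"); }
            public void stop() { log.add("stop"); }
            public void deallocate() { log.add("deallocate"); }
        };
        Speaker fakeSpeaker = text -> log.add("speak:" + text);
        SpeechSoundSequencer sequencer = new SpeechSoundSequencer(fakePlayer, fakeSpeaker);
        try {
            sequencer.playSound();
            sequencer.speak("File menu");
            sequencer.playSound();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The fakes in SequencerDemo exist only to make the call order visible; with real devices the same sequence would apply to the actual Player and synthesizer.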

We were evidently pioneers in the integration of JMF and IBM's JSAPI implementation, since the IBM JSAPI newsgroup wasn't able to provide any advice on our initial difficulties.

We eventually received ViaVoice Gold, but by this time we were involved with other problems and saw no need to install it.

When we eventually did install ViaVoice Gold, we discovered that neither of the problems described above happened with the new product. In fact, both speech and sound could be played simultaneously! Much to our detriment (see below), however, the deallocate() method of the JMF Player remained in our code.

Update: The problems actually seem to have disappeared not because of ViaVoice Gold, but because the machine we installed ViaVoice Gold on had a different sound card/driver. Using ViaVoice Gold on a machine with the same sound card as our original test machine showed the same sound/speech conflict as described above.

Multi-Threading

Multi-threaded programming is difficult.

For the audio component, however, it is a necessity. Unlike a visual look and feel, a sonified component can take many seconds or even tens of seconds to "display" (consider a long menu). If the playing of a sonified component is done in the AWT-event thread, the entire application will seem to hang until a Report has completed. Furthermore, both sound and speech complete asynchronously and notification of their completion is handled by independent threads associated with JMF and IBM's JSAPI implementation. This means that code that starts playing a sound or speaking a piece of text must pause until notified of its completion. Finally, we may wish to pause and resume a running Report. All these considerations suggest that a multi-threaded architecture is the most appropriate one for the audio component.
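
As a rough illustration of "pause until notified of completion", the sketch below blocks a worker thread (never the AWT-event thread) until a callback arriving on another thread reports that an item has finished. CompletionSignal and CompletionDemo are hypothetical names invented for this example, and the notifier thread merely simulates the JMF or JSAPI notification thread.

```java
// Hypothetical helper: lets a worker thread block until a completion callback,
// delivered on some other thread, reports that a sound or utterance has finished.
class CompletionSignal {
    private boolean done = false;

    // Called from the (simulated) media or speech notification thread.
    synchronized void markDone() {
        done = true;
        notifyAll();
    }

    // Called from the worker thread. The loop guards against spurious wake-ups
    // and against a completion that arrives before the wait begins.
    synchronized void awaitDone() throws InterruptedException {
        while (!done) {
            wait();
        }
        done = false; // reset so the signal can be reused for the next item
    }
}

class CompletionDemo {
    // Simulates one asynchronous item; returns true if the worker resumed
    // only after the item had completed.
    static boolean run() {
        final CompletionSignal signal = new CompletionSignal();
        final boolean[] finished = { false };
        Thread notifier = new Thread(() -> {
            try {
                Thread.sleep(100); // pretend the sound takes 100 ms to play
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            finished[0] = true;
            signal.markDone();
        });
        notifier.start();
        try {
            signal.awaitDone(); // the worker pauses here until notified
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return finished[0];
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Because the flag is checked under the same lock that guards the notification, a completion that fires before the worker starts waiting is not lost.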

The version of the Audio Look and Feel implemented for the March 1998 JavaOne Conference used JNI to communicate with native Windows code that handled sound and speech. This version was multi-threaded. In its model, the RPManager invoked synchronized methods of the ReportProcessor class to stop Reports, and ReportProcessors invoked synchronized methods of RPManager to notify it of significant events related to the Report being processed.

This model worked well enough for a demo, but after we adapted it to use Java-based sound and speech we encountered a number of deadlock situations. After many attempts to patch what seemed to be an interminable problem, we eventually rewrote the RPManager class to run as a separate thread and adopted the rule that all communication with the RPManager was to take place by placing events and requests on a queue. The deadlock problems vanished. See the architecture document for additional information.
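
A minimal sketch of the queue-based rule, under stated assumptions: ManagerMessage, its Kind values, and QueueDrivenManager are hypothetical names chosen for illustration and do not reflect the actual RPManager protocol (see the architecture document for that). The point is only that callers enqueue and return, while a single manager thread owns all manager state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical message type; the kinds are illustrative only.
final class ManagerMessage {
    enum Kind { START_REPORT, STOP_REPORT, REPORT_FINISHED, SHUTDOWN }

    final Kind kind;
    final String reportId;

    ManagerMessage(Kind kind, String reportId) {
        this.kind = kind;
        this.reportId = reportId;
    }
}

class QueueDrivenManager implements Runnable {
    private final BlockingQueue<ManagerMessage> inbox = new LinkedBlockingQueue<>();
    // Touched only by the manager thread, so it needs no locking of its own.
    private final List<String> handled = new ArrayList<>();

    // Safe to call from any thread: the caller enqueues and returns without
    // waiting for the manager to act on the message.
    void post(ManagerMessage message) {
        inbox.add(message);
    }

    @Override
    public void run() {
        try {
            while (true) {
                ManagerMessage message = inbox.take(); // blocks until a message arrives
                if (message.kind == ManagerMessage.Kind.SHUTDOWN) {
                    return;
                }
                handled.add(message.kind + ":" + message.reportId);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Read only after the manager thread has terminated.
    List<String> handled() {
        return handled;
    }
}

class ManagerDemo {
    static List<String> run() {
        QueueDrivenManager manager = new QueueDrivenManager();
        Thread managerThread = new Thread(manager, "manager");
        managerThread.start();
        manager.post(new ManagerMessage(ManagerMessage.Kind.START_REPORT, "menu"));
        manager.post(new ManagerMessage(ManagerMessage.Kind.STOP_REPORT, "menu"));
        manager.post(new ManagerMessage(ManagerMessage.Kind.SHUTDOWN, ""));
        try {
            managerThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return manager.handled();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With this shape, no thread ever calls a synchronized method on the manager while holding its own lock, which is the pattern of mutual calls that the earlier design relied on.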

Multi-threaded programming is difficult.

Experiences with IBM's JSAPI Implementation and Sun's Java Media Foundation

IBM's JSAPI Implementation

IBM's AlphaWorks site has provided us with an implementation of speech for Java which we're using for the speech portions of the audio look and feel. Although labelled as alpha code, the portion we've worked with (we haven't needed to use speech recognition for this application) has been surprisingly robust. We've encountered a few cases in which a particular method hasn't yet been implemented, but we've always been able to find an alternative way to accomplish the same thing.

IBM should be commended for their work on this product.

Sun's Java Media Foundation

The Java Sound API as described at JavaOne in March appeared to be just what we needed for the sound portion of the audio look and feel. Unfortunately, neither an API nor an implementation was available, nor was a timeframe. What was available when we started work on the Java version of the audio component was version 1.0 of the Java Media Foundation Player class. It seemed like overkill for our needs, but it was no longer in beta test mode, so we figured it would probably do the job.

By following the examples supplied with the package we were able to get a working Player for sounds up quickly. We did, however, have some difficulties integrating it with speech but were able to find workarounds (see above).

After completing a series of unit tests on the audio component we began testing with the Swing Notepad application. The application worked well when the sonified components were permitted to complete but intermittent problems were encountered when one component interrupted another (e.g. rapidly moving through the items of a menu). Certain sounds would stop working after a while and eventually the entire application would hang. We moved back into the unit test environment.

Extensive trace code was added and the problems were determined to lie with the JMF Players; speech was not involved. Additional code (running in a separate thread) was added to wait a random amount of time (typically between 0 and 1.5 seconds) after starting a JMF Player and then submit an interrupt Report. A typical test run would involve 125 wait Reports, each consisting of a speech item, a sound file, and another speech item. The sound file was interrupted as described above. This volume of Reports was usually sufficient to produce an unrecoverable error.

A number of different problems were discovered, but most had a few common characteristics:

The same soundfile could be played and interrupted without always producing a problem.
When a problem occurred, an exception was thrown and a stacktrace was produced several methods deep in JMF code.
After a problem occurred, the JMF Player would be incapable of playing its file again.
None of the error events documented for the Player class were produced when a problem occurred.

This last characteristic was especially significant because it meant that not only did the sound fail to play, but we had no way to detect from within the program that a problem had occurred! While we might have been able to catch some of these errors, the majority of them occurred in a Thread to which we didn't have access.

Some of the Exceptions and Errors were:

java.lang.NullPointerException
    at com.sun.media.jmf.audio.AudioPlay$AudioStream.drain(AudioPlay.java:526)
java.lang.NullPointerException:
    at com.sun.media.protocol.reliable.AudioSourceNode.stop(AudioSourceNode.java:213)
javax.media.ClockStartedError: setMediaTime() cannot be used on a started clock.
    at com.sun.media.MediaClock.setMediaTime(Compiled Code)

We also encountered a number of cases in which invoking the stop() method of a JMF Player caused the invoking thread to hang. No exception was thrown; no error event was posted.

All these problems were reported to Sun along with an offer to supply lots of additional information if it would be useful.

The javax.media.ClockStartedError, although a problem, need not have occurred in our application. The error occurred when we called the JMF Player's deallocate() method. According to the Player documentation, deallocate() may only be invoked on a stopped Player. We always did a stop() first, so everything should have been fine. However, after a long study of trace files, we also noticed that in cases in which the problem occurred, a start event had never been received by our event listener. Deciding whether to call deallocate() was trivial, but we also ended up adding some rather convoluted logic to help us decide whether the Player was damaged and needed to be re-created. This extra code was eventually scrapped when we determined that ViaVoice Gold obviated the need for deallocate() (see above).

Update: As noted above, the need for deallocate() remains for some sound cards.

Work-arounds for these problems require two stages:

Detecting that a problem has occurred.
Taking corrective action to recover from it.

A number of partial work-arounds were investigated. The most general one appears to be the following:

Create a Thread to monitor the JMF Player method which may potentially abort or hang. This thread starts in a wait state and is awakened just before the method in question is invoked. Upon awakening, the monitor thread enters a timed wait for a period greater than the maximum amount of time required for the method to complete normally. Normal completion is detected when a Listener receives the appropriate state change event from the JMF event notification thread. The listener code (which runs in a different Thread from the code that issued the JMF Player call) sets a switch that can be tested by the monitor thread. The listener code then issues a notify() call to awaken the monitor thread. When the monitor thread awakens, it tests the switch. If the switch has been set, the monitor thread knows the JMF Player method was successful and can again enter its initial wait state. If the switch is not set, the monitor thread knows the JMF Player method did not complete normally and can take appropriate action. This will probably involve creating a new Player (the old one may be damaged beyond repair), perhaps stopping the Thread that issued the JMF call and creating another one to replace it, and doing any other cleaning up that may be required.
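
The monitor described above might look something like the following sketch. It is not our production code: CallMonitor, arm(), signalCompleted(), and the onTimeout recovery hook are hypothetical names, and the demo uses a 200 ms timeout and a counter in place of real Player calls and real cleanup.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CallMonitor implements Runnable {
    private final long timeoutMs;
    private final Runnable onTimeout; // hypothetical recovery hook (e.g. re-create the Player)
    private boolean armed = false;     // set just before the monitored call is issued
    private boolean completed = false; // the "switch" set by the listener
    private boolean shutdown = false;

    CallMonitor(long timeoutMs, Runnable onTimeout) {
        this.timeoutMs = timeoutMs;
        this.onTimeout = onTimeout;
    }

    // Called by the thread about to invoke the monitored method.
    synchronized void arm() {
        completed = false;
        armed = true;
        notifyAll();
    }

    // Called by the listener when the expected state change event arrives.
    synchronized void signalCompleted() {
        completed = true;
        notifyAll();
    }

    synchronized void shutdown() {
        shutdown = true;
        notifyAll();
    }

    @Override
    public void run() {
        while (true) {
            boolean timedOut;
            synchronized (this) {
                try {
                    while (!armed && !shutdown) {
                        wait(); // initial wait state
                    }
                    if (shutdown) {
                        return;
                    }
                    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
                    long remainingMs = timeoutMs;
                    while (!completed && !shutdown && remainingMs > 0) {
                        wait(remainingMs); // timed wait for normal completion
                        remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (shutdown) {
                    return;
                }
                timedOut = !completed; // test the switch
                armed = false;
            }
            if (timedOut) {
                onTimeout.run(); // corrective action, run outside the lock
            }
        }
    }
}

class MonitorDemo {
    // Returns how many times the recovery hook fired.
    static int run() {
        AtomicInteger recoveries = new AtomicInteger();
        CallMonitor monitor = new CallMonitor(200, recoveries::incrementAndGet);
        Thread monitorThread = new Thread(monitor, "call-monitor");
        monitorThread.start();
        try {
            monitor.arm();             // a call that completes normally
            monitor.signalCompleted(); // what the listener would do
            Thread.sleep(400);
            monitor.arm();             // a call whose completion event never arrives
            Thread.sleep(600);
            monitor.shutdown();
            monitorThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return recoveries.get();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Running the recovery hook outside the synchronized block is a deliberate choice in this sketch, so that cleanup code can call back into arm() or signalCompleted() without blocking on the monitor's own lock.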

The monitor code has already been written and tested. The cleanup code will be ready shortly.

What We Need

Sun has stated in its JMF mailing list that it is working on a bug-fix release. It would be extremely useful if we knew which bugs were going to be fixed and when we could expect the release. We've posted this request to the list, but other than a request to submit any bugs we consider show stoppers and a promise to fix as many as time permits, we've received no answer. This isn't terribly useful to developers who are facing deadlines and trying to decide whether to implement work-arounds or wait for a fix.

On a more general level, it would be useful to have JMF bugs posted using the same mechanism that is used for the JDK: the Java Developer Bug Parade. This mechanism is extremely useful to developers in that it lets us know what problems Sun is already aware of and helps us distinguish between problems in our own code and problems with Java. Sun's engineers may also find it useful in that it probably cuts down on the number of duplicate bug reports that are received. Particularly useful are comments from Sun's engineers about the status of a bug. For example, if an engineer remarks that she can't duplicate the problem, then a developer who encounters a similar problem will know that it's worth an investment of his time to try to gather additional information to be passed on.

Audio Component Status as of October 15

As described above, the need to deallocate a JMF Player before doing speech seems to be sound card/driver specific; it's required on at least one type of sound card and not on another. There may be other hardware dependencies we haven't encountered yet.

We've also encountered a memory leak that appears to be related to JMF v1.01 and/or JDK 1.1.6. After approximately 30-45 minutes of running, we get an "out of memory!" message and the application crashes. We assume it's JMF related, since a 30-hour test run (and a number of shorter runs) using only speech Reports showed no difficulties.

We've started testing with JDK 1.1.7 and the new JMF 1.02 (announced October 8). So far, we've completed three two-hour test runs without any memory leaks. Nor have there been any signs of the other problems we've described above. Thus far, we've only tested on one of the sound cards.