Chapter 25
Audio Presentations
Think back to the last time you used
a tape recorder not to play but to record a cassette. Maybe you
taped your daughter's high school graduation ceremony or a company
lecture. Perhaps you took a recorder with you to a family reunion
to preserve some oral history. You might have even accidentally
recorded a phone conversation on your answering machine. Most
likely, though, it hasn't been all that long since you played
a tape in a cassette player. Maybe you listened to an audio book
driving to work this morning, or perhaps you carried a small portable
cassette player with you on your daily walk. If you're a fan of
music, you probably have a CD player on your stereo system. You
also might have a CD player on your home or work computer. Both
the tape recorder and CD player are familiar tools with which
to play and create audio. But you can use another tool to make
audio clips and import them into presentations, Web pages, and
movies. This tool is the computer.
Your computer has the potential to be a microphone, mixer, and
boom box all in one. The computer doesn't care what you are recording: your
own voice, a telephone conversation off the speaker phone, a music
CD, a bird call, a water fountain. You need only a microphone,
speakers, an editor, and a sound card for your computer's motherboard.
These days, many personal computers (and all Macintosh computers)
are built with sound cards.
Creating audio files for an online (or offline) presentation is
about as easy as tape-recording your grandfather's war stories, once
you get the hang of it. The hardest part is determining your hardware
needs and selecting suitable software.
This chapter explores the current options available for adding
an audio segment to your computer system and discusses the future
of computers and audio. To begin this discussion, let's look at
what's happening to a fictional company called Nemosyne, Inc.
Nemosyne's main product is a series of foreign-language study
packages. Each package contains a manual and two cassette tapes
oriented to one of 30 different languages. Nemosyne's primary
market is American executives who travel overseas on business.
Although Nemosyne wants to expand its market to European and Asian
executives, they have not yet reached that point. Currently, their
biggest seller is the English-to-Japanese study package, with
French running a close second.
Not too long ago, Nemosyne hired a Webmaster to create and maintain
a corporate intranet because the board of directors wanted to
capitalize on the advantages of a well-developed Web site. The
new Webmaster set up a few employee bulletin boards, an online
contact database, and CGI scripts that allow customers to order
products via an electronic form. Nemosyne has already noticed
an increase in sales.
The board of directors has big plans for expanding the public
side of their Web site: They want to be the first to bring foreign
language study to the Web. The company's Webmaster has told the
board that by creating a large base of audio files (all of which
can be extracted from their existing language tapes), they can
make a site to which American travelers can refer when they need
to know pronunciation for certain words or phrases.
A businessman on assignment in China, for example, needs to find
a dry cleaner. If he's online, the businessman can go to the Nemosyne
site where he can find a hotlink that looks something like this:
DryCleaning.ra
After he clicks the link, he is presented with a list of audio
files for dry cleaning-related phrases such as "Where would
I find a dry cleaner?" He selects this phrase, and within
seconds a Chinese version of this phrase plays on his laptop.
He can repeat the phrase as many times as he likes, until he can
say it himself. (By this time, he might be so impressed that,
before he looks for the dry cleaners, he'll take the time to order
Nemosyne's Chinese language package electronically with his credit
card.)
The board thinks the concept of online "instant phrases"
is terrific, but they're still a bit hesitant. The Webmaster has
been asked to create an online presentation that will give them
a feel for how the proposed site would operate. The presentation
needs to be posted on Nemosyne's intranet so that the CEO (who
is spending the summer in Bermuda) can access it easily.
The Webmaster begins by browsing the Web, just to see what other
users are doing with audio files. In one two-hour session, it
becomes apparent that creating online audio is not a difficult
task.
The biggest challenge for the Webmaster in developing the Nemosyne
presentation is overcoming limited bandwidth. This is the challenge
that all Web multimedia developers are facing currently. A 28.8
Kbps modem can move about 3.6 kilobytes (3.6K) per second, making
it about 50 times slower than the 176K per second that is needed
to play CD-quality audio.
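The arithmetic behind these figures is easy to check. The short sketch below assumes the standard CD parameters (44.1 kHz sampling, 16-bit samples, two channels), which the text does not spell out, and it ignores modem protocol overhead, so treat the result as a rough estimate rather than a measurement.

```python
# Back-of-the-envelope check of the modem vs. CD-audio figures.
# Assumed (not stated in the chapter): standard CD audio parameters.

MODEM_BITS_PER_SECOND = 28_800   # 28.8 Kbps modem
CD_SAMPLE_RATE_HZ = 44_100       # samples per second, per channel
CD_BYTES_PER_SAMPLE = 2          # 16-bit samples
CD_CHANNELS = 2                  # stereo

# Eight bits per byte; protocol overhead is ignored in this sketch.
modem_bytes_per_second = MODEM_BITS_PER_SECOND / 8
cd_bytes_per_second = CD_SAMPLE_RATE_HZ * CD_BYTES_PER_SAMPLE * CD_CHANNELS
slowdown = cd_bytes_per_second / modem_bytes_per_second

print(f"Modem throughput: {modem_bytes_per_second / 1000:.1f}K per second")
print(f"CD-quality audio: {cd_bytes_per_second / 1000:.1f}K per second")
print(f"Ratio:            {slowdown:.0f}x")
```

The ratio works out to roughly 49, which is where the "about 50 times slower" figure comes from.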
To overcome the lack of bandwidth, you must compress all audio
files for the Web. Otherwise, they would come out sounding like
a tape recorder running on old batteries (or even worse). Although
you can compress audio files in many different ways, you can offer
audio on your Web site in only two ways.
The first option is to post files for users to download and save
to their hard drives. The disadvantage of this method is that
it asks a lot of the users. The users must know how to download,
how to find and use an audio player, and they must sacrifice hard
disk space, all just to listen to something that sounds merely
okay. The second option, streaming audio, also sounds merely okay
but is much less demanding. A streaming audio file plays just
seconds after a user clicks its link, provided a plug-in player
has been downloaded and stored in the proper directory.
Streaming audio works by creating an ultra-compressed version
of a regular digital audio file while keeping the compressed data
in an order that the computer (with the help of the player) can
decode as the data comes in. Currently, streaming audio products
are marketed by several different companies (more about them later
in this chapter), but they all rely on a codec (coder/decoder) with
two components: a compressor, which encodes the audio
stream; and a decompressor, which decodes it for playback.
The compressor codes the original audio, and the decompressor
rebuilds the data into audio that is similar to the original
but not the same.
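To see why decoded audio is "similar to the original but not the same," consider the following minimal sketch of a lossy compressor and decompressor. It is not the codec used by any streaming product in this chapter; it swaps in a simple mu-law-style companding scheme (the idea behind the uLaw encoding associated with .AU files) purely for illustration. The function names and sample values are hypothetical.

```python
import math

MU = 255  # companding constant used by mu-law telephony audio


def encode(samples):
    """Compress 16-bit samples (-32768..32767) into one byte each."""
    encoded = []
    for s in samples:
        x = max(-1.0, min(1.0, s / 32768))       # normalize to [-1, 1]
        y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
        encoded.append(round((y + 1) / 2 * 255))  # quantize to 0..255
    return bytes(encoded)


def decode(data):
    """Expand each byte back into an approximate 16-bit sample."""
    decoded = []
    for b in data:
        y = b / 255 * 2 - 1                       # back to [-1, 1]
        x = math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)
        decoded.append(round(x * 32767))
    return decoded


original = [0, 100, -100, 1000, -12345, 32767]
restored = decode(encode(original))

for before, after in zip(original, restored):
    print(f"{before:>7} -> {after:>7}")
```

Each 16-bit sample is squeezed into a single byte, so the encoded data is half the size. The decoded values land close to the originals without matching them exactly, which is the trade every lossy codec makes. Real streaming codecs compress far more aggressively than this.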
To hear some clips of presidential speeches that have been transformed
into streaming audio files, visit Webcorp's audio archives at
http://www.webcorp.com. To
hear some streaming audio examples of presentation music, visit
NetworkMusic's home page at http://www.networkmusic.com.
Opinions about the quality of streaming audio are varied. Although
having a short download time is nice, many streaming audio clips (especially
those that were originally created on old-fashioned tape players) sound
rough. Don't expect even the most basic stereo sound from streaming
audio; however, the quality should progress as bandwidth increases.
For the Nemosyne presentation, the Webmaster wants to use streaming
audio. Using these files is quicker than making each user download
the entire audio file before it is played. The Webmaster, however,
anticipates that the proposed site will eventually include some
audio files for downloading. If the Nemosyne company wants to
offer a sample tape for downloading to potential customers, for
example, streaming audio would not be the best choice. In this
case, having a less-compressed higher-quality version is better.
As it now stands, streaming audio is best for audio files that
are meant to be played spontaneously online. Other formats might
be used for longer audio files, especially ones in which sound
quality is especially important.
As the quality of streaming audio improves, it might completely
replace the other audio file formats (for online purposes). Streaming
audio is also easier to manage: the server software determines how
many simultaneous streams a site can deliver. A company that wants
unlimited streams has that option; likewise, a site can be limited to
as few as five or six streams. This flexibility is a new development
in the ways to distribute audio files over a network. But like
most technology having to do with the Internet, its capabilities
and protocols are always changing.
Many times you'll be required to create your file in a sound editor
and then transform it for real-time playback with a streaming
audio encoder. For that reason, you should become familiar with
the audio formats used to digitize sound for the Web. If you don't
plan to stream your audio files, you can post any audio format
on your Web site. Remember that your users will have to download
the files and have the means to obtain a player. Including a compatible
player along with your bank of audio files is usually a good idea.
On the Web, the most frequent uses of nonstreaming audio are song
samples (visit Geffen/DGC Records at http://www.geffen.com),
audio greetings, stories, and clips from speeches.
Several different audio formats are available to Web developers.
Each has advantages and disadvantages. Many of the encodings used
to shrink files are lossy, meaning the compression causes a decrease in
sound quality. Remember that a particular sound format might support
more than one way to encode the sound data.
For the Web, audio-MPEG, .WAV, .AIFF, and .AU (described next) are
some of the most popular. Most likely, your sound editor will
allow you to create files in one of these formats. However, you
will occasionally run across sites that use other formats. Distinguishing
between audio formats is a matter of testing them. You also should
take into consideration what type of sound you want to post. Some
formats work better with speech than with music and vice versa.
The same holds true for sound editors and players, including the
streaming variety.
The following are some descriptions of audio file formats:
- MPEG-Audio: A favorite of multimedia
experts, audio-MPEG (Moving Picture Experts Group) is the audio
counterpart of the digital video format MPEG. Using a technique
called perceptual coding, audio-MPEG compresses audio by
removing data that listeners are unlikely to hear. This format produces small file sizes
and good sound quality, probably the best of all the formats.
You can choose between MPEG 1 (slower) or MPEG 2 (faster). You
probably won't have to flip a coin to decide.
- .AU (also uLaw, NeXT, and Sun Audio):
A common UNIX format, .AU is still popular for the Web. .AU is
a versatile format used mainly for its capability to make a very
small file rather than for its playback quality, which is usually
rather poor.
- .WAV (MS Windows) and .AIFF (Macintosh):
You can decide between 8- or 16-bit sound when using these formats.
The advantage is the flexibility, but the disadvantage is a tendency
to produce a "grainy" sound. These files have the potential
to become very large, which means a long download time for the
users.
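To get a feel for how quickly uncompressed files grow, you can estimate the size of a one-minute clip at a few settings. The sketch below is only an estimate under stated assumptions: the sample rates are illustrative choices, not recommendations from this chapter, and the download time assumes a 28.8 Kbps modem running at full speed with no overhead.

```python
def pcm_size_bytes(seconds, sample_rate_hz, bits_per_sample, channels):
    """Size of uncompressed audio data, ignoring the small file header."""
    return seconds * sample_rate_hz * (bits_per_sample // 8) * channels


def download_seconds(size_bytes, modem_bits_per_second=28_800):
    """Idealized transfer time over a modem with no protocol overhead."""
    return size_bytes * 8 / modem_bits_per_second


# Hypothetical settings for a one-minute clip.
settings = [
    ("8-bit mono, 8 kHz", 8_000, 8, 1),
    ("8-bit mono, 22.05 kHz", 22_050, 8, 1),
    ("16-bit stereo, 44.1 kHz", 44_100, 16, 2),
]

for label, rate, bits, channels in settings:
    size = pcm_size_bytes(60, rate, bits, channels)
    minutes = download_seconds(size) / 60
    print(f"{label}: {size / 1000:.0f}K, about {minutes:.1f} minutes to download")
```

Moving from 8-bit to 16-bit sound doubles the file size, and adding a second channel doubles it again, so the "flexibility" of these formats comes with a real cost in download time.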
Chances are good that you will be able to find a shareware program
that produces audio-MPEG, .WAV, .AIFF, and .AU files. Go to Virtual
Noise's Audio Help Desk at http://www.virtualnoise.com/audio1.html.
You can find some hot links that lead you to some outstanding
shareware sound-editing programs and players.
It's becoming more common for computer companies to pre-install
sound editors, players, and other multimedia applications along
with the system software. You can use the Sound Recorder that
comes with Windows 95 or Macintosh's SoundMachine to make audio
files and to play them. If you want to offer all your online audio
files in .AU, audio-MPEG, .WAV, or .AIFF format, give your users
instructions on what applications they need to play the files.
You can be very general and say something like "You'll need
a sound player on your machine to hear this file." To link
an audio file from your site for download, use the following standard
HTML anchor tag:
<A HREF="example.wav">Example</A>
If you plan to make any part of your site dependent on audio files,
and you anticipate that a shareware program will not meet your
needs, you might want to look into some of the commercially available
sound-editing packages, such as Macromedia's SoundEdit Pro (for
Macintosh only) or Cakewalk Music Software's Cakewalk Pro Audio
(Windows). Expect to pay around $400 to $500.
One issue the Webmaster considers while developing the online
Nemosyne presentation is whether to install an audio server. Think
of the audio server as a supplement to the Web and mail servers:
it gives you the capability to play audio streams from a given
site. You should not confuse an audio server with sound-editing
software. Think of sound editors as the "drawing programs"
of audio: they are nonserver applications that provide tools for
editing and manipulating sound.
A streaming audio server, available from the companies that are
currently developing streaming audio products, allows a Web site
to offer live, on-demand audio in real time. With one of the more
expensive packages, at least a couple dozen users (and possibly
many more) could request the streaming audio files at
any given time. No downloading is involved. Without the audio
server, you would have to link streaming audio files to a site
that does have audio server software.
The Webmaster decides on the most basic version of Progressive
Networks' RealAudio server software, which allows about five or
six simultaneous users. Nemosyne will pay the $600 it costs to
provide this service. The deluxe server is considerably pricier:
the capacity for unlimited audio streams runs about $5,000
to $6,000. The Webmaster plans to ask for the deluxe server
software during the final Nemosyne presentation.
Most streaming audio packages come in three parts: the server,
the encoder, and the player. Most likely, you can get some parts
(if not all) of the package for free by downloading them from
Web sites (URLs are listed next). Every package offers different
features. Do some research to find the one that's best for you.
- RealAudio by Progressive Networks:
RealAudio, developed by Progressive Networks (a pioneer in streaming
audio technology), offers the most polished, most popular package.
Most people find that RealAudio is easy to use, fully functional,
and reasonably priced (as far as streaming packages go). RealAudio
moves real-time audio over 14.4 Kbps and faster modems, and the
player can be downloaded for free. Progressive Networks offers
several packages especially for intranets. Each includes the server,
the deluxe version of the encoder, license, and upgrades. Prices
begin at $495. For more information, visit http://realaudio.com.
- StreamWorks by Xing: StreamWorks
is a well-developed (although not particularly user-friendly)
package. It's more expensive than some of the others but would
work well for an intranet that maintains hundreds of simultaneous
audio streams, like a broadcasting company. Like RealAudio, StreamWorks
requires dedicated server software, whereas TrueSpeech and IWave
(described in a moment) do not. Technical support for the free
downloadable player costs $29. You might need the tech support
considering that the Xing player is a bare-minimum MIME (Multipurpose
Internet Mail Extensions) version operating from an embedded applet.
Xing StreamWorks server software (platforms offered are SGI, Sun,
HP, and Linux) starts at $3,500. For more information, visit http://www.xingtech.com.
- TrueSpeech by DSP: TrueSpeech,
a freeware package, offers excellent quality at low bandwidth
for both music and speech. Along with IWave (below), TrueSpeech
requires no special server software, which makes both of them
much less expensive than their competitors. The free TrueSpeech
player, however, is a bit too basic for large files: users do
not have the option of stopping and resuming the playback, or
adjusting the volume. The TrueSpeech encoder is included with
MS Windows 95 (it's integrated into the Sound Recorder). For more
information, visit http://www.dspg.com.
- IWave by VocalTec: IWave (Internet
Wave) offers high-quality music at low bandwidth, but it does
not handle speech as well as other players. You can download IWave
for free. You also can obtain a well-designed (but large) player
by downloading the demo version of VocalTec's Internet phone.
The IWave server is limited in its functionality, but if your
streaming audio needs are for music, IWave is a good choice. For
more information, visit http://www.vocaltec.com/.
- ToolVox by Voxware, Inc.: A freeware
package, ToolVox is solely for speech. It lacks a fully functional
player and does not have a server component. ToolVox audio files
must be linked with standard HTML tags and are then distributed
by the Web server in the same manner as GIF and JPEG files. IWave and TrueSpeech
(above) also use the HTML method of distribution. The disadvantage
of not having server software is that you lose the control and
flexibility you might need for a high-volume commercial system.
ToolVox is a good test run if you are thinking about developing
streaming audio capabilities into your site. For more information,
visit http://www.voxware.com.
With a cassette from one of the foreign-language packages in hand,
the Nemosyne Webmaster is ready to make a recording. She can record
directly into the version of the RealAudio Encoder that came with
the RealAudio server, but she wants to gain experience creating
an audio file in a sound-editing program and then converting it
to RealAudio format. For this reason, she must download the free
version of the RealAudio Encoder from the RealAudio Web page.
(The free version does not have the capability to record live;
it can only convert an existing audio file into a RealAudio file.)
While the encoder is downloading, the Webmaster searches the Web
for a good sound-editing program.
A sound editor's first purpose is to record a sound and transform
it into a digital format (preferably one of the more popular ones).
A good audio editor has the following functions:
- Operates with specialized sound cards
- Provides accurate input meters
- Lets you change bandwidth
and format specifications
- Offers transitions between the RAM and
the hard disk
- Creates more than one track
- Provides a variety of ways to mark time,
such as in measures/beats and seconds/milliseconds
The Webmaster is going to use a Microsoft Windows-compatible program
called GoldWave as a sound editor. The good thing about GoldWave
is that it is a shareware program and can be downloaded from the
Web for free. If you would like a copy of GoldWave, along with
some good documentation, visit GoldWave's home page at
http://garfield.cs.mun.ca/%7Echris3/goldwave/goldwave.html
You also can download many other freeware audio editor packages
(such as Cool Edit, Sonic Screwdriver, WAVany, and WHAM) from
the Web.
After GoldWave is installed, the Webmaster creates a new file
in which to record the first audio file, as shown in Figure 25.1.
Figure 25.1: A new file is created in GoldWave.
Because the Webmaster is making a new audio file from an existing
source (a cassette), she needs to connect the computer to a tape
player or stereo system. Most computers have audio ports that
allow this connection. See Figure 25.2.
Figure 25.2: GoldWave's Device Controls panel allows
you to play and record sound.
Using a sound-editing program, the Webmaster can extract a one-minute
clip from the tape and adjust the bandwidth specifications. Once
edited, the file is called "greeting" and is saved as
a .WAV file, as shown in Figure 25.3.
Figure 25.3: The file is saved as greeting.WAV.
After an audio file is in .WAV (or another) format, it can be
stored in the proper directory and linked into an HTML file. A
user can then download the file and play it with a computer audio
player.
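If you want to see what goes into a .WAV file without opening a sound editor, the following sketch writes one from scratch. It is not part of the GoldWave workflow described here: it uses Python's standard wave module, substitutes a one-second test tone for a real recording, and assumes an 8 kHz, 16-bit mono format chosen only to keep the file small. The file name greeting.wav simply echoes the example above.

```python
import math
import struct
import wave

SAMPLE_RATE_HZ = 8_000   # assumed; low rates keep speech files small
DURATION_SECONDS = 1
TONE_HZ = 440            # stand-in for the recorded voice


def write_test_wav(path):
    """Write a short 16-bit mono sine tone to `path` as a .WAV file."""
    frames = bytearray()
    for n in range(SAMPLE_RATE_HZ * DURATION_SECONDS):
        value = int(16_000 * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE_HZ))
        frames += struct.pack("<h", value)   # little-endian signed 16-bit
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)                  # mono
        wav.setsampwidth(2)                  # 2 bytes = 16 bits per sample
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(bytes(frames))


write_test_wav("greeting.wav")

# Read the header back to confirm what was written.
with wave.open("greeting.wav", "rb") as wav:
    print(wav.getnchannels(), wav.getsampwidth(), wav.getframerate(), wav.getnframes())
```

The channel count, sample width, and sample rate set here correspond to the bandwidth and format specifications a sound editor lets you adjust before saving.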
.WAV files, like any files that have to be downloaded, do not play in
real time, however. For this reason, the Webmaster opens the
RealAudio Encoder, shown in Figure 25.4. It's already been decided
that the Nemosyne presentation files will be streaming audio.
Figure 25.4: The RealAudio Encoder can transform a variety
of audio formats into streaming audio.
The RealAudio Encoder allows the Webmaster to transform the .WAV
file into an .RA file (.RA is the extension used specifically
for RealAudio files): greeting.RA.
Now that the file is in .RA format, the Webmaster wants to link
it to the online presentation site. To do so, RealAudio requires
that an additional document, called a metafile, be created
and attached to the site. The metafile, which has the extension
.RAM, is a text file that contains the URL of the RealAudio file.
It provides a link between the Web server and the RealAudio server.
The Webmaster also configures the Web server to recognize the
.RA and .RAM MIME types. The RealAudio page (http://www.realaudio.com)
gives specific instructions on creating the metafile and configuring
the Web server.
The Webmaster creates the metafile (greeting.RAM), configures
the Web server, and then links the metafile to the desired page.
The link itself requires only a simple HTML tag:
<A HREF="/greeting.RAM">Greeting</A>
Provided that the RealAudio Player is installed, the file should
play on the computer in real time. This procedure can be repeated
to make additional real-time audio files.
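Because the metafile is just a text file containing a URL, you can generate it with a few lines of script when you have many clips to publish. The sketch below is an illustration under assumptions: the server name audio.nemosyne.example is made up, and the pnm:// URL scheme is an assumption about how the RealAudio server is addressed, so confirm the exact URL form in the RealAudio documentation. Web server MIME configuration is not covered here.

```python
from pathlib import Path

# Hypothetical server name and URL scheme; verify both against the
# RealAudio documentation before using them on a real site.
AUDIO_URL = "pnm://audio.nemosyne.example/greeting.RA"


def write_metafile(metafile_path, audio_url):
    """Write a one-line text metafile that points at the streamed audio file."""
    Path(metafile_path).write_text(audio_url + "\n", encoding="ascii")


def anchor_tag(metafile_href, label):
    """Return the HTML link that a Web page uses to reach the metafile."""
    return f'<A HREF="{metafile_href}">{label}</A>'


write_metafile("greeting.RAM", AUDIO_URL)
print(anchor_tag("/greeting.RAM", "Greeting"))
```

The Web page links to the metafile, and the metafile in turn points the player at the audio file on the audio server. Repeating the two calls for each clip produces one metafile and one link per phrase.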
Streaming audio files are going to provide the Webmaster with
what she needs to present her proposal effectively to Nemosyne's
board of directors. But you should remember that the procedure
described in this chapter is only one of many that you can use
to make an audio file. You have to decide whether to purchase
an audio server and what editing software you use. You also might
decide that you don't want to use streaming audio. It's the trend
of the future, but it's also expensive and a bit underdeveloped.
Waiting for the streaming audio technology to progress also is
an option.
Developments like streaming audio and other online multimedia
are indicators that the Web is rapidly becoming a truly interactive
environment. Real-time audio and video capabilities bring to the
Web what satellite dishes brought to television. Conferences,
court proceedings, talk shows, and celebrity chat hours eventually
might be available online.
Internet Phone is more than a speaker phone and more than a chat
room; VocalTec's Internet Phone gives Internet users the ability
to talk to each other in their real voices. By connecting to the
IRC (Internet Relay Chat) network, the Internet Phone software
provides a list of online users and conversation topics. After
you have a TCP/IP Internet connection, select a user from the
list to call. The minimum connection is a modem SLIP/PPP connection
of 14,400 bps. Internet Phone works best with at least a 25MHz 486SX
PC with 8MB of RAM. Versions for both Windows and Macintosh
are available.
Internet Phone works by employing a voice compression algorithm
that minimizes bandwidth consumption. Calls made from the Internet
Phone cannot be traced, and the software allows for "private
topics" that cannot be accessed by outsiders. Users of Internet
Phone speak into a computer microphone.
Internet Phone is a novel idea, but its most obvious disadvantage is that you
can make Internet phone calls only to people who also have the
software. At this point, the Internet Phone is something like
a very sophisticated chat room or BBS. But who knows, maybe someday
everyone will be trashing touch-tones and buying high-tech microphones
(with built-in answering machines, of course).
Besides the VocalTec phone, other audio-conferencing software
is available. An excellent FAQ is available at http://www.gi.net/NET/PM-1995/95-04/95-04-28/0004.html.
For more general information, visit VocalTec's home page at http://vocaltec.com.
Also in the future is the development of voice recognition computing.
Imagine what it would be like to direct your computer to your
favorite Web sites not with a mouse, but with your voice! Voice
recognition often is used in word processing and other software
as an aid to the visually impaired.
Just about any computer function that is performed with a keyboard
and mouse has the potential to be performed with voice recognition.
This is good news for people who get tired of moving the mouse
around, or who were never that adept at dragging and clicking
to begin with. Although it's doubtful that voice recognition would
make the mouse and pad extinct (drawing programs are especially
dependent on the trackball), you can count on the technology becoming
integrated with more software packages.
The concept of machine-voice communication is older than you might
think: about 60 years. It wasn't until the 1980s, however, that
small vocabulary speech recognition software was developed to
run on IBM PCs. The software has continued to progress and is
greatly assisted by the Pentium and other powerful processors.
A twist on voice recognition is voice verification. You've probably
seen science-fiction movies in which a fingerprint is used as
a passkey; but because your voice is as unique as your fingerprint,
anticipate the development of the "voiceprint," which
might consist of a spoken password or phrase ("Open Sesame"?),
or the repetition of certain words at the computer's request.
You can find an excellent directory of voice recognition resources
at http://www.kurz-ai.com/gen-vr.html.
| MCKEON & JEFFRIES | In the short term, McKeon & Jeffries has very little use or need for audio. Not only do most of their machines lack audio capability, but the information that's most important to them is technical research that is meant to be read, not heard. On the other hand, M&J hopes to use audio in a few applications in the future:
- Continuing education. Every CPA has to keep up with current tax law and accounting practices. To do so, many accountants attend seminars and talks given by experts. McKeon & Jeffries hopes to someday be able to broadcast these seminars through the intranet to their users to save time and money, as shown in Figure 25.5.
- Conference calls. In the distant future, M&J hopes to be able to hold online audio conferencing through its intranet. The next step would be to recognize the speech in the audio, save it to a file, and make it searchable for individuals who want to use it as a reference.
|
Figure 25.5: Continuing education is much more convenient
and efficient using an intranet.
| THE SPORTING GOODS AND APPAREL ASSOCIATION | The SGAA plans to implement sound in several different areas of the intranet. As their users become more sophisticated and gain the ability not only to listen to audio but also to create it, the site will use more and more audio technology.
- What's new. The SGAA wants to greet users each day with a list of the new things added to the site, as shown in Figure 25.6. However, they don't want to crowd the home page with a lot of text. Using the RealAudio server, the SGAA staff can record a new message every day so that when users log on, they can hear about new attractions on the site, fresh news, and any events scheduled for that day, such as online chats or audio conferences.
- Advertising. Several of the manufacturers and distributors have product pages that contain information on specific products. For some of these products, radio advertising or the audio from television advertising is captured and made available on the server for resellers and distributors to hear.
- Audio conferencing. Several times a month, the SGAA's executive committee engages in a short conference call to discuss association business. Members are invited to listen in using RealAudio through the intranet.
|
Figure 25.6: A different audio message greets users
every day at the SGAA site.
You can use digital audio for online presentations and as a means
to develop the resources of your intranet. Creating high-quality,
functional audio is a matter of determining your needs (speech,
music, or both) and scouting for software that will help you accomplish
your goals. Deciding how to compress your audio is most important.
Streaming audio packages enable you to play back in real time,
so if you want to bring live conferences to your site, streaming
audio is for you. On the other hand, if you are more interested
in posting music files that do not have to be played simultaneously,
and you want to preserve quality, you might want to post your
files for downloading. You must then decide what format to use
for posting your audio files. Audio-MPEG, .AIFF, .WAV, and .AU
formats each offer a different balance of file size and sound quality.
In both cases, take into consideration the software choices you
have available: many servers, encoders, and players are available
through shareware, so if audio will not be a large part of your
presentation or intranet, they might be the best route. If you
have plans to make audio an important part of your presentation
or your intranet, look into one of the more sophisticated commercially
available packages. Don't forget that online sources can answer
questions that arise while you are in the process of creating
audio.