Opening up the Internet through Voice Portals
By , freelance journalist
The beginning of the new millennium will be remembered for many things, but in terms of the Internet, one of the most significant developments has been the rapid emergence of the voice (or speech) portal.
Not long ago, speech-enabling the World Wide Web was a mere germ of an idea. Today, that idea is being turned into reality, with voice portals leading the way.
Voice portals began life as dedicated systems that provided access to a database of information via a voice channel. In a voice portal, input from the user is through spoken command, which the system can accept thanks to Advanced Speech Recognition (ASR) techniques. Output from the system back to the user is performed by text-to-speech (TTS).
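The flow just described (spoken command in, database lookup, synthesised speech out) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `recognise` and `speak` are hypothetical stand-ins for a real ASR engine and a real TTS engine, and the `CONTENT` entries are made up.

```python
from typing import Dict, Iterable, Optional

# Illustrative "database" of content; the entries are invented for this sketch.
CONTENT: Dict[str, str] = {
    "weather": "Today will be dry with sunny spells.",
    "news": "Headlines are read out here.",
}


def recognise(utterance: str, vocabulary: Iterable[str]) -> Optional[str]:
    """Stand-in for ASR: match the caller's words against a small vocabulary."""
    words = utterance.lower().split()
    for term in vocabulary:
        if term in words:
            return term
    return None


def speak(text: str) -> str:
    """Stand-in for TTS: a real system would synthesise audio from the text."""
    return f"[spoken] {text}"


def handle_turn(utterance: str) -> str:
    """One dialogue turn: spoken command in, database lookup, speech out."""
    topic = recognise(utterance, CONTENT)
    if topic is None:
        return speak("Sorry, I did not catch that. Please say weather or news.")
    return speak(CONTENT[topic])
```

Calling `handle_turn("What is the weather like")` looks up the weather entry, while an unrecognised request produces a reprompt, which is roughly how a dedicated voice system keeps the caller within its vocabulary.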
Until recently, voice portals typically did not provide access to the whole worldwide Internet, but rather to 'walled gardens' of content managed by individual service providers, which related to their customers' particular data and services. Now, the Web itself is becoming available through voice interfaces.
This has the potential to transform the nature of Web access, because at a stroke it makes Web content accessible via any telephone. And there are far more telephone users in the world than there are computer users: some 800m wireless phone users, and 1.2bn telephone lines, compared with a 'mere' 300m computer-based Internet users.
Voice portals are an important development because they bring the various benefits of voice-based access to the Internet. For one thing, rapid retrieval of information can be much easier via voice, because a user can simply state an item stored in a list or directory, without having to remember a number, scroll through a menu, or listen to each option.
For example, simply saying the railway stations that you want to travel to and from is much easier than inputting the same information via a phone with a WAP (Wireless Application Protocol) capability, using the phone's keypad.
However, voice portals are not necessarily direct competitors with text-based techniques for mobile telephones like WAP. Instead they can be used as complementary technologies. For example, a train travel information site may be displayed on a WAP portal. On selecting this option, the user is transferred to an automated speech-enabled application to say where they want to travel to and from. This information is then delivered to the customer either via the voice prompt, or possibly in the form of text by SMS (Short Message Service), enabling the customer to store the data on their handset.
Hands-free usage and access to information while on the move, via a mobile telephone, are other major benefits of speech-enabling websites through voice portals. The value of this has been demonstrated by the widespread availability of email-by-speech services, offered by many leading Internet Service Providers (ISPs).
Major Internet players are moving into voice portals thick and fast; one example is the Yahoo-by-Phone service announced by the well-known search engine company. AT&T, IBM and Lucent are just three of the world's largest IT companies actively developing voice portal technologies.
A series of market reports predict voice portals will be one of the fastest growing and most widely used Internet technologies, given their ability to make Web content and services far more accessible.
UK market consultancy Ovum forecasts that by 2005, voice portals will become a $26 billion market. Ovum says a major application for voice portals will be to promote the use of 'personal assistant services'. People will dial in, and through them have access to a huge range of services, from ordering a pizza to placing calls, to obtaining any kind of information from websites.
"Compared to the PC market, telephony is still in the era of the DOS prompt," says the Ovum report's writer, Dan Ridsdale. "PC usage exploded when DOS was replaced by Windows. A large factor in this is that user interfaces have become more intuitive and user-friendly, bringing the functionality of the PC to telecommunications."
Improvements in speech recognition, expanding Internet use, and the arrival of microbrowsers on telephones will see a massive market emerge, he claims.
"Personal assistant service providers will effectively own the customer interface, becoming the users' personal portal into the network," he said.
The people at the cutting edge of this new market, such as mobile phone operators, agree. Karen Sinclair, Senior Product Manager for Orange, France Telecom's subsidiary providing cellular networks, says that with increased call revenue, such services could make up to 50% of Orange's business in the near future.
"The voice portal will be the first thing you will use. People are on the move. The voice portal will access everything for you: the Web, travel timetables, news," she says.
Kelsey Group, the eCommerce market research organisation, claims there will be 3m frequent and 10m occasional users of voice-enabled websites by 2002. And perhaps the most upbeat forecast of all comes from US company Datacomm Research, which says that by 2005 more than 2bn people will be using voice portals, voice-enabled websites and Web-based interactive voice response systems (IVRs).
"Voice-based services will humanise the Internet, extending Internet access to every telephone and making online shopping easier and more natural," says Paul Pauesick, Datacomm Research's Director of Research and principal author of the report. "By 2005, more people will surf the Web from phones than from PCs."
"Voice portals and voice application service providers will dramatically reduce costs associated with call centres and customer premises equipment," said Ira Brodsky, President of Datacomm Research. "Voice-based Internet services will also spawn new competition for local, long-distance, and international telephone services," he added.
The Datacomm Research report says voice portals will help conventional businesses to exploit Internet-based e-commerce and customer relationship management (CRM) solutions. Voice buttons on business websites will enhance sales and service, providing a richer shopping experience, and VXML (a variety of the XML Web language that is designed for voice applications) will extend the Internet's reach to all telephones.
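To give a feel for what a VXML dialogue looks like, the Python sketch below generates a minimal VoiceXML-style form for the train enquiry mentioned earlier. It is an illustration only: the element names (`vxml`, `form`, `field`, `prompt`, `filled`, `submit`) follow the VoiceXML specification, but the prompts and the `http://example.com/timetable` address are hypothetical, and the speech grammars a working application would need are left out.

```python
import xml.etree.ElementTree as ET


def build_station_form(submit_url: str) -> str:
    """Build a minimal VoiceXML-style form asking for two station names."""
    vxml = ET.Element("vxml", version="1.0")
    form = ET.SubElement(vxml, "form", id="journey")

    # One <field> per piece of information the caller is asked to say.
    questions = (
        ("origin", "Which station are you travelling from?"),
        ("destination", "Which station are you travelling to?"),
    )
    for name, question in questions:
        field = ET.SubElement(form, "field", name=name)
        prompt = ET.SubElement(field, "prompt")
        prompt.text = question

    # Once both fields are filled, hand the answers to the Web application.
    filled = ET.SubElement(form, "filled")
    ET.SubElement(filled, "submit", next=submit_url,
                  namelist="origin destination")
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical address, used purely for illustration.
    print(build_station_form("http://example.com/timetable"))
```

The point of the markup is that the same Web server that delivers pages to a browser can deliver dialogues like this one to a voice gateway, which then plays the prompts and listens for the answers.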
The report comes to several specific conclusions as to how the voice portal market should develop. For example, it says incumbent local exchange carriers (ILECs) must divorce services from the central office in order to survive. But legacy systems, employee resistance, and shareholder addiction to dividends will hinder ILECs, creating huge opportunities for voice application service providers to offer 'virtual central office' services. This will enable customers to reduce their investment in telephone equipment, support personnel, training, and facilities, and will gradually absorb today's IVR applications, it says.
Voice portals will become a key information source for mobile users, Datacomm Research claims, with the most successful voice portals being those prepared to absorb high production costs, provide local flavour, and quickly scale up their systems.
In the longer term, personal digital assistants will become Internet-based, talking avatars (also known as 'virtual agents' or 'verbots'). Such 'virtual agents' will serve as intermediaries between people and services: everything from e-commerce sites to home security systems.
It may be a few years before we chat regularly with our personal avatar, but there is no doubt about the potential of the voice portal market. Across Europe, there are already several examples of services that illustrate what can be achieved. In Greece, for instance, mobile phone operator STET Hellas is providing a range of innovative services that are essentially based on the voice portals concept. Working with Ericsson and speech recognition specialist Vocalis, STET Hellas has several systems either implemented or under development, including voice mail retrieval, and location dependent services.
The latter enable the mobile network to automatically identify where a caller is located and provide the most appropriate response to calls for emergency services, say, or facilities like tourist information. Location dependent services are currently at the cutting edge of mobile telephony, and are seen by many observers as an area with huge potential that will grow rapidly over the next few years.
STET Hellas has developed a Smart Tourist Guide, designed to enable people who are walking or driving the streets of Athens to call a number and be told by the automated system where interesting tourist sites are in relation to their current location. Planned extensions to the system will tell callers where other relevant facilities are, such as the nearest hotel or railway station.
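The core of such a location dependent service is a nearest-point lookup. The Python sketch below shows one way it might work, assuming the network supplies the caller's position as a latitude and longitude; it is not a description of the STET Hellas system. The site coordinates are approximate and included for illustration only, and distances are computed with the standard haversine formula.

```python
import math

# Approximate (latitude, longitude) pairs, for illustration only.
SITES = {
    "Acropolis": (37.9715, 23.7257),
    "Syntagma Square": (37.9755, 23.7348),
    "National Archaeological Museum": (37.9890, 23.7327),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_site(caller):
    """Return the name of the closest site and its distance in kilometres."""
    name = min(SITES, key=lambda n: haversine_km(caller, SITES[n]))
    return name, haversine_km(caller, SITES[name])


def announcement(caller):
    """Text that would be handed to the text-to-speech engine."""
    name, km = nearest_site(caller)
    return f"The nearest site is {name}, about {km:.1f} kilometres away."
```

The same lookup would serve the proposed extensions: swapping the table of tourist sites for one of hotels or railway stations changes the answer without changing the logic.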
In Sweden, Arico AB is an Internet solution and content provider that specialises in e-commerce, Internet sales and interactive reservation systems as well as remote access updateable database systems. It has implemented two sophisticated voice-driven services: an automated on-line hotel booking system that can be operated entirely by speaking over the telephone, and a dual language email-by-voice facility.
For the hotel booking service, called LedigaRum ('JustRoom'), a comprehensive interactive voice response (IVR) system acts as the front end to Arico's interactive database technology. It enables customers to call a specified number, say the location they want, and obtain details of room availability including last-minute special offers. It allows hotels to ensure that the details of their latest room vacancies and offers are constantly available, therefore minimising the number of rooms left empty.
The second service is believed to be the first example of a dual language, email-by-phone service. As with LedigaRum, Advanced Speech Recognition (ASR) and Text-To-Speech (TTS) give users a comprehensive speech interface to their email, allowing them to access, delete and respond to email messages using speech.
Whichever language was used to create the original email, Swedish or English, the system's spoken output can be switched to the appropriate language. After the user dials in to the service, the system reads out the heading of their first email in Swedish. If this email was created in Swedish, the content is read back to them. However, if at any time the system speaks a subject heading or text that is incorrectly pronounced, the user is able to switch to English simply by saying 'English'. The English section is read to them, after which they can revert to Swedish. This is useful for the many Swedes, especially business users, who use both languages frequently.
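The language switching can be pictured as a small piece of state that the caller's spoken commands update. The Python sketch below is an assumption-laden illustration rather than Arico's implementation: the class name, the `[sv]` and `[en]` markers standing in for the two synthesised voices, and the use of the word 'Swedish' to revert are all hypothetical.

```python
class EmailReader:
    """Tracks which text-to-speech voice is active during a call."""

    def __init__(self):
        # The service starts by reading in Swedish.
        self.language = "sv"

    def on_command(self, spoken: str) -> None:
        """Update the active language from a recognised spoken command."""
        command = spoken.strip().lower()
        if command == "english":
            self.language = "en"
        elif command == "swedish":  # hypothetical command for reverting
            self.language = "sv"

    def read(self, text: str) -> str:
        """Stand-in for TTS: tag the text with the voice that would speak it."""
        return f"[{self.language}] {text}"
```

A call might then run as follows: the reader speaks a heading with the Swedish voice, the caller says 'English', the next section is spoken with the English voice, and a further command returns the reader to Swedish.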
Another illustration of what is being achieved comes from a UK company, BrowseByPhone. Calling itself the UK's first Voice Internet Service Provider (vISP), BrowseByPhone allows ordinary telephone users to access Web content by voice and a telephone keypad. BrowseByPhone has been developed by Birmingham-based wireless Internet specialist, Waperture, as an extension of its successful Wapgata wireless Internet service.
The launch service, called GataGrab, enables users of the BrowseByPhone website to tell the system which Web pages they are interested in (a 'Grab', to use BrowseByPhone's terminology). From then on, they can dial the BrowseByPhone access number and hear the page being read back to them, live off the Internet.
Thanks to services like BrowseByPhone, all of the UK's telephone users are now effectively potential Internet users, with no need to buy a PC or special phone. Users range from sports and hobby enthusiasts accessing their club or team events and results lists, to business executives accessing driving directions or company reports, to web masters remotely monitoring website use. Waperture is also keen to investigate applications for the visually impaired.
BrowseByPhone also integrates with WAP technology to enable users to view their selected web pages on a WAP phone instead of listening to them. The system dynamically translates from web page protocols to WAP protocols, potentially making any web page accessible from a WAP phone. Small and medium size enterprises (SMEs) can use BrowseByPhone to give themselves an instant telephone information line and WAP presence, simply by setting up a Grab to a page on their existing website.
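Whether a page is to be read aloud or squeezed onto a WAP screen, the first step is to reduce it to its readable text. The Python sketch below shows one simple way of doing this with the standard library's `html.parser` module. It is an illustration of the general idea, not Waperture's method, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class ReadableText(HTMLParser):
    """Collects the visible text of a page, skipping scripts and styles."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Collapse runs of whitespace so the speech engine gets clean text.
        text = " ".join(data.split())
        if text and not self._skip_depth:
            self.chunks.append(text)


def page_to_speech_text(html: str) -> str:
    """Return the text a text-to-speech engine would be asked to read."""
    parser = ReadableText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

A production service would need to do far more, such as deciding what to do with tables, images and links, but the sketch shows why pages with simple, well-structured text make the best candidates for a Grab.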
The BrowseByPhone service can be accessed on the Internet from www.wapgata.com or www.browsebyphone.com, and over WAP from wap.wapgata.com. To access from a UK telephone, dial for the BrowseByPhone/GataGrab service, or for a GataGrab demo.
The voice portal feature of the BrowseByPhone service is central to its operations, as David Shaw, Marketing Manager of Waperture, says.
"Although we still feel that well developed WAP applications have their place, the fact is that much of what people want to access when they are mobile contains too much data for a WAP phone, and even now WAP phones have nowhere near the penetration of ordinary mobile phones.
"Even with the development of GPRS (general packet radio service) and 3G wireless technologies, voice access will still be preferable to text or graphical interfaces for many users and applications," adds Shaw.
"BrowseByPhone allows anybody with any type of telephone to access Web-based content. Also, the GataGrab service is targeted at individual web pages to make user interaction simple and easy to use. As users become familiar with voice Internet access we may extend the service to provide almost full web surfing capabilities."
Once the service is established in the UK, Waperture is looking to expand into the US and Europe, and is currently talking to voice technology companies in Ireland, Spain, Germany and the US. Waperture also plans to release a number of different voice services under the BrowseByPhone banner, including conventional portal services such as sports, news and weather, and the ability for users to build their own voice sites and services.
It is no accident that for humans, speech is far and away the preferred method of communicating. It is far less effort to learn and use than writing, much quicker, and can be done anywhere. Originally, of course, it required being within earshot of your intended audience, but Alexander Graham Bell's invention solved that problem for us more than a century ago.
Today, voice-based communication is transforming the way we use one of the 20th century's last major inventions, the Internet and the World Wide Web, for the same reasons. In computing terms, these can be summed up very simply as: "A voice portal means all you need to do is boot up with your ear, and log on with your mouth."
* * *
The author is an independent journalist specialising in technical issues.
The editors of HLTCentral would welcome any feedback on the article.