AT&T Labs Mashes Up Voice, TV, Gestures, and Twitter

By | Wednesday, November 17, 2010 at 11:46 am

“Watching TV is supposed to be fun, right?” asked AT&T’s Michael Johnston. At a press event at AT&T Labs in New York City, Johnston and other researchers showed off iRemote, Talkalytics, and dozens of other projects now under way that pair AT&T’s long-running Watson speech recognition technology with search, gestures, and Twitter analysis.

With hundreds of TV channels available today, it can be harder than ever to figure out what to watch, Johnston observed. But with iRemote, an app currently in development, you can speak a command into a smartphone and get an immediate list of “all reality shows on Thursday night,” or another category of programs small enough to digest easily, displayed on your TV screen.
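A command like that ultimately reduces to a filter over a listings database. Here is a minimal, hypothetical sketch of that final step; the listings schema and the `shows` helper are assumptions for illustration, and the speech recognition that produces the genre and day is not modeled.

```python
# Hypothetical TV-listings data; in practice this would come from a
# program guide database, not a hard-coded list.
listings = [
    {"title": "Survivor", "genre": "reality", "day": "Thursday"},
    {"title": "Meet the Press", "genre": "news", "day": "Sunday"},
    {"title": "The Bachelor", "genre": "reality", "day": "Thursday"},
]

def shows(genre, day):
    """Return titles matching a recognized voice query like
    'all reality shows on Thursday night'."""
    return [s["title"] for s in listings
            if s["genre"] == genre and s["day"] == day]
```

The point of the demo is that the hard part is the recognition, not the query: once “reality” and “Thursday” are extracted from speech, the lookup itself is simple.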

Other apps, such as Google Voice, can already do voice search, Johnston conceded. But unlike Google, AT&T is focusing its voice search on the contents of databases, such as TV listings and other directories, rather than the entire Web. Through years of research, he contended, AT&T has fine-tuned its voice search to deliver highly accurate results for these kinds of queries.

“We think that Internet TV is the future,” said Junlan Feng, a researcher in a neighboring booth. On flat panel TV screens there, AT&T showed how conglomerations of tweets from Twitter can be organized into new kinds of categories for display to end users and companies.

AT&T is studying how consumers might tweet their responses to TV shows and then compare their own thoughts to other tweets. As a consumer, you could switch out of a TV program like Meet the Press to view Top Tweets, Recent Tweets, Most Positive and Most Negative Tweets, and the show’s popularity, measured by the number of tweets per day.

But why would TV viewers care about these aggregated tweets, anyway? “People always want to know what others are saying,” said Feng. After all, she noted, people are already commenting on news stories on the Web and responding to each other. Feng predicted that collected responses from social media might someday even replace conventional measures like Nielsen ratings, since tweets are instantaneous.
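The tweet views described above can be sketched very simply once each tweet carries a sentiment score. This is a hypothetical illustration, not AT&T's implementation: the scores, field names, and helper functions are all assumptions, and the sentiment scorer itself (the hard part) is taken as given.

```python
from datetime import date

# Assumed sample data: tweets already tagged with a sentiment score
# in [-1, 1] by some upstream classifier.
tweets = [
    {"text": "Great episode tonight!", "sentiment": 0.9, "date": date(2010, 11, 14)},
    {"text": "Boring panel this week.", "sentiment": -0.7, "date": date(2010, 11, 14)},
    {"text": "Solid interview segment.", "sentiment": 0.4, "date": date(2010, 11, 15)},
]

def most_positive(tweets, n=1):
    """'Most Positive Tweets' view: highest sentiment first."""
    return sorted(tweets, key=lambda t: t["sentiment"], reverse=True)[:n]

def most_negative(tweets, n=1):
    """'Most Negative Tweets' view: lowest sentiment first."""
    return sorted(tweets, key=lambda t: t["sentiment"])[:n]

def tweets_per_day(tweets, day):
    """Popularity metric: tweet volume on a given day."""
    return sum(1 for t in tweets if t["date"] == day)
```

Everything interesting lives in the classifier that assigns the scores; the views themselves are just sorts and counts over its output.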

Businesses are interested in tweets, too, to keep on top of public response to “events” such as product recalls. In a demo of its Sonar (Social Network Analysis and Reporting) project, AT&T showed line graphs of tweeted “public sentiment” over its own decision first to pull, and then reinstate, food programming on U-verse last week.
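A line graph like Sonar's boils down to averaging sentiment per day. The sketch below is a guess at that aggregation step only; the day keys and sentiment scores are assumed inputs, and plotting is left out.

```python
from collections import defaultdict

def daily_sentiment(tweets):
    """Average sentiment per day -- the y-axis of a Sonar-style
    line graph tracking public response to an event over time."""
    buckets = defaultdict(list)
    for t in tweets:
        buckets[t["day"]].append(t["sentiment"])
    return {day: sum(vals) / len(vals)
            for day, vals in sorted(buckets.items())}
```

Feeding the resulting day-to-average mapping into any charting tool yields the kind of sentiment curve shown in the demo.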

But research like this calls for new ways of analyzing and categorizing data, involving the application of principles of language syntax and semantics, for example, to “give you the meaning of the tweets,” she said. Tweets pose special challenges, I was told: they are short, capped at 140 characters, and “often ungrammatical.”

In a demo of Talkalytics, AT&T showed how sophisticated algorithms have already been built for quick analysis across large volumes of recorded customer service phone calls. Phone calls are first turned into searchable text through speech-to-text conversion. After issues like service outages have been flagged by computer systems, humans step in to do further troubleshooting.
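Once calls exist as text, the flagging stage can be as simple as a keyword scan over the transcripts. This is a hypothetical sketch of that stage alone: the transcript format and keyword list are assumptions, and the speech-to-text step that produces the transcripts is not shown.

```python
def flag_calls(transcripts, keywords=("outage", "no service")):
    """Return IDs of calls whose speech-to-text transcript mentions
    an issue keyword, for a human agent to review."""
    flagged = []
    for call_id, text in transcripts.items():
        lowered = text.lower()
        if any(k in lowered for k in keywords):
            flagged.append(call_id)
    return flagged
```

A production system would rank and cluster the hits rather than just list them, but the pipeline shape is the same: transcribe, scan, hand off to a human.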

AT&T Mobility has been using this kind of technology internally for about a year now, I was told. (By the way, users are informed in advance that their calls might be recorded and monitored.)

AT&T is also working on search technologies that combine voice with gestures for so-called “multimodal search.” One such app already available, AT&T’s own Speak4it, is aimed at helping users of Apple iPhones and iPads find nearby restaurants, stores, and other places.

Although the app is voice-enabled, you can also use the touchscreen to draw a circle around a geographic area, or a line pointing in a particular direction, if you have a rough idea of where you’re headed but aren’t sure of the exact address, for instance.
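The drawn circle effectively becomes a geographic filter on the voice query's results. Here is a minimal sketch of that filtering step, using the standard haversine great-circle distance; the place records and function names are assumptions for illustration, not Speak4it's actual internals.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def places_in_circle(places, center, radius_km):
    """Keep only search results inside the circle the user drew."""
    lat, lon = center
    return [p for p in places
            if haversine_km(lat, lon, p["lat"], p["lon"]) <= radius_km]
```

The gesture supplies the center and radius; the spoken query supplies what to look for; intersecting the two is what makes the search “multimodal.”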

To get voice search and other speech apps out the door faster, AT&T is now collaborating with third-party developers. Vlingo, one partner, has already released a multiplatform smartphone app with hands-free operations, including letting you speak rather than type if you want to send messages while driving a car.

AT&T plans to launch a Web portal in 2011 that will open up its speech API for wider third-party development. The portal will support creation of new apps for iPhones, BlackBerries, and any other mobile phone environments supported by AT&T at the time.


