CN104754364A

CN104754364A - Video advertisement voice interaction system and method

Info

Publication number: CN104754364A
Application number: CN201510145559.8A
Authority: CN
Inventors: 张云锋; 蒋子俊; 周盛; 姚键; 张大伟; 曹磊; 唐端荣; 潘柏宇; 卢述奇
Original assignee: Unification Infotech (beijing) Co Ltd
Current assignee: Unification Infotech (beijing) Co Ltd
Priority date: 2015-03-30
Filing date: 2015-03-30
Publication date: 2015-07-01

Abstract

The invention discloses a video advertisement voice interaction system and method, belonging to the technical field of internet video advertising. It addresses a problem in the prior art: users must register and pay in order to skip video advertisements, while simply letting users skip them causes losses to advertisers. The system comprises a video playback client, an advertisement delivery server and a speech recognition server. In the method, while the video playback client plays a video advertisement, the user turns on a voice monitoring switch and speaks; the voice monitoring module collects the voice information and sends the extracted speech data to the speech recognition server; the speech recognition server returns the recognition result to the video playback client; and the video playback client calls the corresponding player interface to trigger the corresponding event.

Description

Video advertisement voice interaction system and method

Technical field

The present invention relates to a video advertisement voice interaction system and method, and belongs to the technical field of internet video advertising.

Background art

Video advertising has become the main form of advertising on the internet, and the growing number of video advertisements is a considerable annoyance to users. For this reason, some websites have begun to let premium users choose which video advertisements are played and skip certain advertisements. This, however, requires the user to register and pay, which most users will not do. Simply allowing video advertisements to be skipped, on the other hand, inevitably causes losses to advertisers, who lose the opportunity to promote their products.

Summary of the invention

The present invention therefore addresses the following problem in the prior art: skipping video advertisements requires users to register and pay, which most users will not do, while simply allowing advertisements to be skipped inevitably causes losses to advertisers, who lose the opportunity to promote their products. To this end, the invention provides a video advertisement voice interaction system comprising a video playback client, an advertisement delivery server and a speech recognition server. The advertisement delivery server provides video advertisement code to the video playback client in response to the client's video advertisement request. The system is characterized in that the video playback client comprises a voice monitoring switch and a voice monitoring module; the voice monitoring module collects voice information, extracts speech data and sends it to the speech recognition server; and the speech recognition server recognizes the speech data and returns the result text to the video playback client.

The speech recognition server comprises a speech recognition module, which comprises an acoustic model, a dictionary file and a language model. The acoustic model is obtained by feature extraction and acoustic model training on a speech corpus; the language model is obtained by language model training on the text in a text corpus; and the dictionary file stores a mapping table between words and phonemes.

The video playback client is a mobile phone, a tablet computer, a notebook computer or a desktop computer.

The video advertisement voice interaction method implemented by the above system is characterized as follows. The video playback client sends an ad request to the advertisement delivery server, and the advertisement delivery server provides ad code to the video playback client. The video playback client plays the video advertisement. When the voice monitoring switch is on and the user speaks, the voice monitoring module collects the voice information and sends the speech data to the speech recognition server. The speech recognition server returns the result text of recognizing the speech data to the video playback client. The video playback client determines whether the result text contains a specified command and, if so, calls the corresponding player interface with that command to trigger the corresponding event.

The specified commands comprise built-in commands and non-built-in commands.

After each triggered event, the video playback client records a log entry by calling the logging interface provided by the advertisement delivery server.

The beneficial effects of the present invention are as follows. With the video advertisement voice interaction system and method of the invention, voice interaction technology lets the user interact with the system by voice. This meets the user's need to skip advertisements without registering or paying, while the conditions imposed by the voice interaction system, such as requiring the user to say the name of the advertised product, give the advertiser's product a promotional effect beyond expectations. The user can also use voice interaction for other functions such as replay and pause.

Brief description of the drawings

Fig. 1 is a schematic structural diagram of the video advertisement voice interaction system of the present invention;

Fig. 2 is a flow chart of playback control in the video playback client;

Fig. 3 is a flow chart of the implementation of the speech recognition service.

The reference numerals are as follows:

1, video playback client;

2, advertisement delivery server;

3, speech recognition server.

Detailed description

Specific embodiments of the present invention are described below with reference to the accompanying drawings:

As shown in Fig. 1, the video advertisement voice interaction system comprises a video playback client 1, an advertisement delivery server 2 and a speech recognition server 3. The advertisement delivery server 2 provides video advertisement code to the video playback client 1 in response to the video advertisement request of the video playback client 1. The video playback client 1 comprises a voice monitoring switch and a voice monitoring module. The voice monitoring switch turns the voice monitoring module on and off; the voice monitoring module collects voice information, extracts speech data and sends it to the speech recognition server. The speech recognition server 3 recognizes the speech data and returns the result text to the video playback client 1. The playback control flow of the video playback client 1 is shown in Fig. 2.
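
The gating role of the switch described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `VoiceMonitoringModule`, its methods and the stand-in `fake_recognizer` are hypothetical, and real audio capture and network transport are left out.

```python
class VoiceMonitoringModule:
    """Hypothetical client-side module: forwards audio only while the switch is on."""

    def __init__(self, send_to_recognizer):
        self.enabled = False              # the switch starts in the "off" state
        self._send = send_to_recognizer   # callable standing in for the recognition server

    def set_switch(self, on):
        """Open or close the monitoring module."""
        self.enabled = bool(on)

    def on_audio(self, audio_bytes):
        """Return the recognizer's result text, or None when the switch is off."""
        if not self.enabled:
            return None
        return self._send(audio_bytes)


def fake_recognizer(audio_bytes):
    # Stand-in for the speech recognition server; always "hears" the same word.
    return "replay"


module = VoiceMonitoringModule(fake_recognizer)
ignored = module.on_audio(b"\x00\x01")   # switch is off, so nothing is sent
module.set_switch(True)
heard = module.on_audio(b"\x00\x01")     # switch is on, so audio is forwarded
```

Passing the recognizer in as a callable keeps the sketch independent of any particular transport between the client and the recognition server.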

The speech recognition server 3 comprises a speech recognition module, which comprises an acoustic model, a dictionary file and a language model. The acoustic model is obtained by feature extraction and acoustic model training on a speech corpus; the language model is obtained by language model training on the text in a text corpus; and the dictionary file stores a mapping table between words and phonemes. The implementation flow of the speech recognition service is shown in Fig. 3.
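
As an illustration of the word-to-phoneme mapping table held in the dictionary file, the sketch below loads a small table from text. The file layout (one word per line followed by its phonemes) and the phoneme symbols are assumptions made for the example; the source does not specify either.

```python
# Hypothetical dictionary contents: "word PH1 PH2 ..." on each line.
DICT_TEXT = """\
replay R IY P L EY
pause P AO Z
skip S K IH P
"""


def load_dictionary(text):
    """Parse lines of 'word PH1 PH2 ...' into a {word: [phonemes]} mapping."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # ignore blank or malformed lines
        table[parts[0]] = parts[1:]
    return table


lexicon = load_dictionary(DICT_TEXT)
```

A recognizer would consult such a table to relate the phoneme sequences scored by the acoustic model to the words scored by the language model.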

The video playback client 1 is a mobile phone, a tablet computer, a notebook computer or a desktop computer, so the system is applicable to a variety of platforms.

In the video advertisement voice interaction method implemented by the above system, the video playback client 1 sends an ad request to the advertisement delivery server 2, and the advertisement delivery server 2 provides ad code to the video playback client 1. The ad code is a string in XML or JSON format generated according to a predefined advertisement interaction protocol. It contains the information related to ad playback, such as the URL of the ad material, the URLs for counting ad impressions, clicks and completed plays, and the ad's impression and click monitoring URLs. The client parses the XML or JSON string and then plays the advertisement and triggers the related events.

Every advertisement that requires voice interaction has an attribute named "skip-ad keyword", for which the brand name of the advertisement is generally used. Voice interaction logging is also added, in order to collect statistics on users' interactions with the advertisements being played; these statistics can be provided to advertisers for reference. Specifically, a new node "skipword" is added under the ad node corresponding to each advertisement, and its value is the keyword for skipping the advertisement. A further node "recurl" is added after the skipword node; its value is the URL of the logging interface that records user interaction, and the parameters contained in this URL are written to the log. One of these parameters is actid, whose value is a macro, "##ACTIONID##". When the request is actually sent, the macro is replaced with the value corresponding to the action the user actually triggered, and the request for the resulting URL is then sent.

The video playback client 1 plays the video advertisement. When the voice monitoring switch is on and the user speaks, the voice monitoring module collects the voice information and sends the speech data to the speech recognition server 3. The speech recognition server 3 returns the result text of recognizing the speech data to the video playback client 1. The video playback client 1 determines whether the result text contains a specified command and, if so, calls the corresponding player interface with that command to trigger the corresponding event.
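
For the XML form of the ad code, extracting the voice interaction nodes might look like the sketch below. Only the node names ad, skipword and recurl and the actid parameter come from the description; the root element name, the `id` attribute, the `url` node and the example.com URLs are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical ad code; "&amp;" is the XML escape for "&" inside the URL.
AD_CODE_XML = """
<ads>
  <ad id="123">
    <url>http://cdn.example.com/great-wall.mp4</url>
    <skipword>Great Wall</skipword>
    <recurl>http://ads.example.com/log?adid=123&amp;actid=##ACTIONID##</recurl>
  </ad>
</ads>
"""


def parse_voice_attributes(xml_text):
    """Return one dict per ad node with its id, skip keyword and logging URL."""
    root = ET.fromstring(xml_text.strip())
    result = []
    for ad in root.iter("ad"):
        result.append({
            "id": ad.get("id"),
            "skipword": ad.findtext("skipword"),  # None if the node is absent
            "recurl": ad.findtext("recurl"),      # None if the node is absent
        })
    return result


parsed_ads = parse_voice_attributes(AD_CODE_XML)
```

An advertisement without voice interaction would simply yield `None` for both fields, which the client can use to decide whether voice commands apply to it.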

The specified commands comprise built-in commands and non-built-in commands. For example:

"replay": built-in command; replays the current advertisement;

"pause": built-in command; pauses the current advertisement;

"Great Wall": non-built-in command. If the user says the skip keyword (skipword) of the current advertisement, that is, the brand name of the current advertisement, the current advertisement is skipped.

After each triggered event, the video playback client 1 records a log entry by calling the logging interface provided by the advertisement delivery server 2.
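
The command check, player call and log call described above can be sketched together as follows. The `Player` class, the function `handle_result_text`, the case-insensitive substring matching and the rule that built-in commands are checked before the skip keyword are all assumptions made for the example; the source only states that the client checks whether the result text contains a specified command.

```python
# Spoken word -> name of the player interface to call (hypothetical names).
BUILT_IN_COMMANDS = {"replay": "replay", "pause": "pause"}


class Player:
    """Hypothetical stand-in for the player interfaces; records each call."""

    def __init__(self):
        self.calls = []

    def replay(self):
        self.calls.append("replay")

    def pause(self):
        self.calls.append("pause")

    def skip(self):
        self.calls.append("skip")


def handle_result_text(result_text, skipword, player, log):
    """Trigger the matching player event and log it; return the action or None."""
    text = result_text.lower()
    action = None
    for word, name in BUILT_IN_COMMANDS.items():
        if word in text:
            action = name
            break
    # Non-built-in command: the skip keyword of the current advertisement.
    if action is None and skipword and skipword.lower() in text:
        action = "skip"
    if action is None:
        return None
    getattr(player, action)()  # call the corresponding player interface
    log(action)                # stand-in for the server's logging interface
    return action


player = Player()
logged = []
first = handle_result_text("Great Wall", "Great Wall", player, logged.append)
second = handle_result_text("please pause", "Great Wall", player, logged.append)
third = handle_result_text("hello", "Great Wall", player, logged.append)
```

Unrecognized speech leaves playback untouched and produces no log entry, since logging happens only after an event is triggered.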

The JSON fragment in this example is the result returned by the advertisement delivery server for an ad request sent by a client; it contains two marketing advertisements, one for each of the two brands Great Wall and Changjiang. Here "ads" is an array holding multiple "ad" child nodes, each corresponding to one advertisement, and each "ad" child node in turn has a "skipword" child node. When the user turns on the voice monitoring switch and says "Great Wall", the advertisement with id 123 stops playing and playback jumps directly to the next advertisement, with id 124.

After collecting the voice information, the client checks for the recurl node. If the node exists, the client takes its URL and replaces "##ACTIONID##" in the URL with the number of the event actually triggered by the recognized string (for example, 1: replay, 2: pause, 3: skip), and then accesses the URL. The URL corresponds to a log collection service on the advertisement delivery server; on receiving the request, the service parses the relevant parameters and writes the log entry. The main JSON code is as follows:
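
The listing below is an illustrative reconstruction based only on the description above, not the original listing of the publication. The names ads, ad, id, skipword, recurl and actid, the ids 123 and 124 and the event numbers come from the text; the exact nesting of the JSON, the `adid` parameter and the example.com URLs are assumptions. The final HTTP request to the log collection service is deliberately left out.

```python
import json

# Hypothetical response of the advertisement delivery server.
AD_RESPONSE = """
{
  "ads": [
    {"ad": {"id": 123, "skipword": "Great Wall",
            "recurl": "http://ads.example.com/log?adid=123&actid=##ACTIONID##"}},
    {"ad": {"id": 124, "skipword": "Changjiang",
            "recurl": "http://ads.example.com/log?adid=124&actid=##ACTIONID##"}}
  ]
}
"""

# Event numbering given in the description: 1 replay, 2 pause, 3 skip.
ACTION_IDS = {"replay": 1, "pause": 2, "skip": 3}


def build_log_url(ad, action):
    """Replace the ##ACTIONID## macro in recurl; return None if recurl is absent."""
    recurl = ad.get("recurl")
    if not recurl:
        return None
    return recurl.replace("##ACTIONID##", str(ACTION_IDS[action]))


ads = [entry["ad"] for entry in json.loads(AD_RESPONSE)["ads"]]
log_url = build_log_url(ads[0], "skip")
# The client would then request log_url so the log collection service can record it.
```

Keeping the macro in the URL lets the server define the log format once, while the client only substitutes the number of the event that actually occurred.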

The above is a preferred embodiment of the present invention. It should be noted that those skilled in the art may make various improvements and modifications without departing from the principle of the invention, and such improvements and modifications shall also be regarded as falling within the scope of protection of the invention.

Claims

1. A video advertisement voice interaction system, comprising a video playback client, an advertisement delivery server and a speech recognition server, the advertisement delivery server being configured to provide video advertisement code to the video playback client in response to a video advertisement request from the video playback client, characterized in that the video playback client comprises a voice monitoring switch and a voice monitoring module, the voice monitoring module being configured to collect voice information, extract speech data and send it to the speech recognition server, and the speech recognition server being configured to recognize the speech and return the recognition result text to the video playback client.

2. The video advertisement voice interaction system according to claim 1, characterized in that the speech recognition server comprises a speech recognition module, the speech recognition module comprises an acoustic model, a dictionary file and a language model, the acoustic model is obtained by feature extraction and acoustic model training on a speech corpus, the language model is obtained by language model training on the text in a text corpus, and the dictionary file stores a mapping table between words and phonemes.

3. The video advertisement voice interaction system according to claim 1, characterized in that the video playback client is a mobile phone, a tablet computer, a notebook computer or a desktop computer.

4. A video advertisement voice interaction method implemented by the system according to any one of claims 1 to 3, characterized in that the method is as follows: the video playback client sends an ad request to the advertisement delivery server; the advertisement delivery server provides ad code to the video playback client; the video playback client plays the video advertisement; when the voice monitoring switch is on and the user speaks, the voice monitoring module collects the voice information and sends the speech data to the speech recognition server; the speech recognition server returns the result text of recognizing the speech data to the video playback client; and the video playback client determines whether the result text contains a specified command and, if so, calls the corresponding player interface with that command to trigger the corresponding event.

5. The video advertisement voice interaction method according to claim 4, characterized in that the specified commands comprise built-in commands and non-built-in commands.

6. The video advertisement voice interaction method according to claim 4, characterized in that, after each triggered event, the video playback client records a log entry by calling the logging interface provided by the advertisement delivery server.