CN106027485A - Rich media display method and system based on voice interaction
- Publication number: CN106027485A (CN201610279818.0A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H04L65/75: Media network packet handling (under H04L65/60, Network streaming of media packets)
- G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L15/30: Distributed recognition, e.g. in client-server systems, for mobile phones or network applications (under G10L15/28, Constructional details of speech recognition systems)
- G10L25/54: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00, specially adapted for comparison or discrimination, for retrieval
Abstract
The invention discloses a rich media display method based on voice interaction. The method comprises the steps of acquiring voice data input by a user; performing frequency domain conversion processing on the voice data, thus acquiring spectrum characteristics of the voice data, and looking up a preset user attribute list according to the spectrum characteristics to acquire the attribute of the user; performing semantic recognition on the voice data to acquire semantic information of the voice data, and finding and acquiring rich media information related to the semantic information according to the semantic information; and displaying the rich media information according to a preset display rule corresponding to the user attribute. The invention also discloses a rich media display system based on voice interaction. According to the rich media display method and system based on voice interaction, the voice data is subjected to frequency domain conversion processing, and the user attribute is acquired, so that differential processing of voice interaction can be achieved according to the user attribute, and the display of the rich media is more targeted. By acquiring the rich media information related to the voice, the voice interaction efficiency is improved.
Description
Technical field
The present invention relates to the technical field of speech processing and rich media display, and in particular to a rich media display method and system based on voice interaction.
Background technology
With the development of information technology, user interaction techniques have come into wide use. Voice interaction, the new generation of user interaction following keyboard, mouse, and touch-screen interaction, is convenient and efficient; it is gradually being accepted by users and has the potential for large-scale adoption. Among the related applications, intelligent voice services and their associated functions are the most attractive. For example, more and more voice-related applications are appearing on smart mobile terminals, and smart TV manufacturers are adopting voice interaction technology to replace traditional hand-held remote controls. In the prior art, voice interaction is based on speech recognition: after receiving a segment of speech, the voice interaction system first performs content recognition on the voice data, obtains a content recognition result, and infers the user's intent from that result. The system then performs the operation corresponding to the speech according to the user's intent, or returns the information corresponding to the speech to the end user.
However, existing voice interaction systems can, on the one hand, only distinguish differences in the semantics contained in the voice data and cannot process different users differently. On the other hand, the interactive display they offer is monotonous, limited to voice only or text only, so that for users who want to obtain more information resources the function and effect are far from ideal. In particular, in the upbringing or education of children, existing interaction systems cannot meet the needs of child users.
Summary of the invention
In view of this, the object of the present invention is to provide a rich media display method and system based on voice interaction that make the display of rich media more targeted and improve the effect of voice-interactive display.
To achieve the above object, the present invention provides a rich media display method based on voice interaction, comprising:
acquiring voice data input by a user;
performing frequency-domain transform processing on the voice data to obtain spectral features of the voice data, and looking up a preset user attribute list according to the spectral features to obtain the attribute of the user;
performing semantic recognition on the voice data to obtain semantic information of the voice data, and searching for and acquiring rich media information related to the semantic information according to the semantic information;
displaying the rich media information according to a preset display rule corresponding to the user attribute.
Preferably, after the step of obtaining the semantic information of the voice data, the method further comprises:
converting the voice data into text information and displaying the text information in the interface for a preset time.
Preferably, before the step of performing frequency-domain transform processing on the voice data to obtain the spectral features of the voice data, the method further comprises:
determining, according to the voice data, whether the current user is a system user;
if the current user is a system user, acquiring preset user information corresponding to the current user; performing semantic recognition on the voice data to obtain semantic information of the voice data; searching for and acquiring rich media information related to the semantic information according to the semantic information; and displaying the rich media information according to a preset display rule corresponding to the user information;
if the current user is not a system user, performing the step of performing frequency-domain transform processing on the voice data to obtain the spectral features of the voice data.
Preferably, the step of displaying the rich media information according to the preset display rule corresponding to the user attribute comprises:
displaying rich media files according to the display rule in the rich media information, wherein the rich media information contains rich media file information and corresponding display rule information, and the display rule includes the order and the manner in which the rich media files are displayed.
Preferably, after the step of displaying the rich media information according to the preset display rule corresponding to the user attribute, the method further comprises:
playing preset voice guidance information according to the rich media information;
acquiring new voice data input by the user;
displaying, according to the new voice data, the rich media information corresponding to the new voice data.
Preferably, the step of searching for and acquiring the rich media information related to the semantic information according to the semantic information further comprises:
obtaining attribute features of the rich media information found by the search;
looking up and determining, according to the attribute features of the rich media information and the attribute of the user, whether the rich media belongs to the mask information preset for that user attribute;
if so, filtering out the rich media information.
The present invention also provides a rich media display system based on voice interaction, comprising:
a data acquisition module, configured to acquire voice data input by a user and send the acquired voice data to an attribute lookup module and an information acquisition module;
the attribute lookup module, configured to receive the voice data sent by the data acquisition module, perform frequency-domain transform processing on the voice data to obtain spectral features of the voice data, look up a preset user attribute list according to the spectral features to obtain the attribute of the user, and send the obtained user attribute information to an information display module;
the information acquisition module, configured to receive the voice data sent by the data acquisition module, perform semantic recognition on the voice data to obtain semantic information of the voice data, search for and acquire rich media information related to the semantic information according to the semantic information, and send the acquired rich media information to the information display module;
the information display module, configured to receive the user attribute information sent by the attribute lookup module and the rich media information sent by the information acquisition module, and display the rich media information according to a preset display rule corresponding to the user attribute.
Preferably, the system further comprises a text display module;
the data acquisition module is further configured to send the acquired voice data to the text display module;
the text display module is configured to receive the voice data sent by the data acquisition module, convert the voice data into text information, and display the text information in the interface for a preset time.
Preferably, the system further comprises a user determination module;
the data acquisition module is further configured to send the acquired voice data to the user determination module;
the user determination module is configured to receive the voice data sent by the data acquisition module and determine, according to the voice data, whether the current user is a system user; if the current user is a system user, to acquire preset user information corresponding to the current user and send the preset user information to the information display module; and if the current user is not a system user, to send the voice data to the attribute lookup module;
the information display module is further configured to receive the preset user information sent by the user determination module and display the rich media information according to a preset display rule corresponding to the preset user information.
Preferably, the information display module is further configured to
display rich media files according to the display rule in the rich media information, wherein the rich media information contains rich media file information and corresponding display rule information, and the display rule includes the order and the manner in which the rich media files are displayed.
Preferably, the system further comprises a guidance module;
the information display module is further configured to send a guidance instruction to the guidance module according to the rich media information;
the guidance module is configured to receive the guidance instruction sent by the information display module and play preset voice guidance information;
the data acquisition module is further configured to acquire new voice data input by the user and send the new voice data to the information display module;
the information display module is further configured to receive the new voice data sent by the data acquisition module and display, according to the new voice data, the rich media information corresponding to the new voice data.
Preferably, the information acquisition module is further configured to:
obtain attribute features of the rich media information found by the search;
look up and determine, according to the attribute features of the rich media information and the attribute of the user, whether the rich media belongs to the mask information preset for that user attribute;
if so, filter out the rich media information.
As can be seen from the above, the rich media display method and system based on voice interaction provided by the present invention perform frequency-domain transform processing on the voice data and use the resulting spectral features to obtain the attribute of the user, so that voice interaction can be handled differently according to the user's attributes and the display of rich media is more targeted. That is, rich media can be displayed under different display rules according to the different attributes of users. At the same time, through semantic recognition the method and system can search for and acquire rich media information related to the voice data, which increases the amount and variety of information presented in voice interaction and greatly improves the efficiency and effect of voice-interactive display.
Brief description of the drawings
Fig. 1 is a flowchart of one embodiment of the rich media display method based on voice interaction provided by the present invention;
Fig. 2 is a flowchart of another embodiment of the rich media display method based on voice interaction provided by the present invention;
Fig. 3 is a flowchart of one embodiment of the rich media display system based on voice interaction provided by the present invention;
Fig. 4 is a flowchart of another embodiment of the rich media display system based on voice interaction provided by the present invention.
Detailed description of the invention
To make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in more detail below with reference to specific embodiments and the accompanying drawings.
It should be noted that all uses of "first" and "second" in the embodiments of the present invention serve to distinguish two non-identical entities or non-identical parameters that share the same name. "First" and "second" are used only for convenience of expression and should not be construed as limiting the embodiments of the present invention; subsequent embodiments do not explain this again.
Referring to Fig. 1, a flowchart of one embodiment of the rich media display method based on voice interaction provided by the present invention is shown. The rich media display method based on voice interaction comprises:
Step 101: acquiring voice data input by a user.
Here, the voice data is the voice data produced from the user's voice input. Voice input means that the user speaks the voice command to be interacted with at the voice receiving part of the terminal or related device; for example, a user who wants to search for "apple" needs to say the word "apple" aloud. The user may also use previously recorded voice data as the voice input, for example by using another playback device to play back speech that the user recorded in advance. The terminal or interaction system then carries out voice interaction with the user according to the voice data.
Step 102: performing frequency-domain transform processing on the voice data to obtain spectral features of the voice data, and looking up a preset user attribute list according to the spectral features to obtain the attribute of the user.
Here, frequency-domain transform processing means performing frequency-domain analysis on the acquired voice data to obtain the frequency-domain information of the speech and, from it, the spectral characteristics of the voice data. The user attribute list is a preset list that maps different user attributes to their corresponding spectral features; looking up the obtained spectral features in this list determines the user attribute corresponding to the voice data. The user attributes include the user's age, sex, ethnic group, and the like. Other distinctions among user attributes may of course also be derived from the spectral features.
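Step 102 can be illustrated with a minimal sketch. The patent does not state which spectral feature is used or what the attribute list contains, so everything concrete here is an assumption: the feature is taken to be the dominant frequency of a naive discrete Fourier transform, and `USER_ATTRIBUTE_LIST`, its frequency ranges, and the function names are hypothetical.

```python
import cmath
import math

# Hypothetical preset list: (low Hz, high Hz, user attribute).
# The ranges are illustrative only and are not taken from the patent.
USER_ATTRIBUTE_LIST = [
    (60.0, 165.0, "adult_male"),
    (165.0, 255.0, "adult_female"),
    (255.0, 500.0, "child"),
]


def dominant_frequency(samples, sample_rate):
    """Return the frequency in Hz of the strongest non-DC bin of a naive DFT."""
    n = len(samples)
    best_bin, best_magnitude = 1, 0.0
    for k in range(1, n // 2):
        coefficient = sum(
            s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples)
        )
        if abs(coefficient) > best_magnitude:
            best_bin, best_magnitude = k, abs(coefficient)
    return best_bin * sample_rate / n


def lookup_user_attribute(frequency_hz, attribute_list=USER_ATTRIBUTE_LIST):
    """Map a spectral feature to a user attribute via the preset list."""
    for low, high, attribute in attribute_list:
        if low <= frequency_hz < high:
            return attribute
    return "unknown"


def synth_tone(frequency_hz, sample_rate=2000, n=400):
    """Generate a pure sine tone standing in for recorded voice data."""
    return [math.sin(2 * math.pi * frequency_hz * i / sample_rate) for i in range(n)]
```

A real system would likely use an FFT and a richer feature set than a single frequency; the sketch only shows the shape of the transform-then-lookup step.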
Step 103: performing semantic recognition on the voice data to obtain semantic information of the voice data, and searching for and acquiring rich media information related to the semantic information according to the semantic information.
Here, semantic recognition of the voice data generally begins with speech recognition on the voice data, followed by semantic recognition on the speech recognition result. Speech recognition determines the textual content of the voice data; semantic recognition identifies the intent and meaning of the utterance. Rich media includes all kinds of multimedia information such as text, pictures, video, and audio.
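The lookup half of step 103 might be sketched as follows, assuming speech recognition has already produced a transcript. Semantic recognition is reduced here to simple keyword matching, which is a stand-in and not the patent's method; `RICH_MEDIA_INDEX`, its entries, and the function names are hypothetical.

```python
# Hypothetical index from a recognized keyword to related rich media items.
RICH_MEDIA_INDEX = {
    "apple": [
        {"type": "picture", "name": "apple.jpg"},
        {"type": "video", "name": "apple_orchard.mp4"},
    ],
    "tablet": [
        {"type": "picture", "name": "tablet.jpg"},
    ],
}


def extract_keywords(transcript, index=RICH_MEDIA_INDEX):
    """Stand-in for semantic recognition: keep the words the index knows."""
    return [word for word in transcript.lower().split() if word in index]


def find_rich_media(transcript, index=RICH_MEDIA_INDEX):
    """Collect the rich media items related to the recognized keywords."""
    results = []
    for keyword in extract_keywords(transcript, index):
        results.extend(index[keyword])
    return results
```

Calling `find_rich_media("I want to search apple")` returns the two items indexed under "apple"; a transcript with no known keyword returns an empty list.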
Step 104: displaying the rich media information according to a preset display rule corresponding to the user attribute.
Here, the display rule corresponding to the user attribute means that each class of user has its own display rule. A display rule covers the layout of the display interface, the order in which rich media is displayed, the form in which rich media is displayed, and so on.
As the above embodiment shows, the rich media display method based on voice interaction performs frequency-domain transform processing on the voice data input by the user and uses the resulting spectral features to obtain the attribute of the user, so that voice interaction can be handled differently according to the user's attributes and the display of rich media is more targeted. That is, the interaction system or terminal can interact differently with different types of users, displaying rich media under different display rules according to the users' attributes. At the same time, through semantic recognition the method can search for and acquire rich media information related to the voice data, which increases the amount and variety of information presented in voice interaction and thus greatly improves the efficiency and effect of voice-interactive display. For voice interaction aimed at children or at education in particular, the rich media display method based on voice interaction of the present invention not only gives users richer information resources through voice interaction but also, through the interactive display of rich media, greatly improves the user experience, holding the user's attention while deepening the user's impression and improving the effect of upbringing and education.
In some preferred embodiments of the present invention, after the step of obtaining the semantic information of the voice data, the method further comprises: converting the voice data into text information according to the semantic recognition result and displaying the text information in the interface for a preset time. Converting the voice data into text allows the user, on the one hand, to confirm whether the interaction system or terminal recognized the speech correctly and, on the other hand, makes the display of the interaction result more intuitive. The text is usually hidden after it has been shown in the interface for a certain time, so that it does not interfere with the display of the rich media. The text may be shown at the top of the interface or in a corner of the interface, such as the lower right corner; the display position can be chosen as needed. This not only improves the accuracy of the voice information in voice interaction but also lets users with impaired vision further confirm whether the speech they input is accurate, and it makes it easier for onlookers to identify the voice information input by the user who is carrying out the voice interaction.
In another preferred embodiment of the present invention, before the step of performing frequency-domain transform processing on the voice data to obtain the spectral features of the voice data, the method further comprises:
determining, according to the voice data, whether the current user is a system user;
if the current user is a system user, acquiring preset user information corresponding to the current user; performing semantic recognition on the voice data to obtain semantic information of the voice data; searching for and acquiring rich media information related to the semantic information according to the semantic information; and displaying the rich media information according to a preset display rule corresponding to the user information;
if the current user is not a system user, performing the step of performing frequency-domain transform processing on the voice data to obtain the spectral features of the voice data.
Here, a system user is a user who has an account stored in the system in advance, and may also be called a member user. For example, a mobile phone terminal may store the spectrum information of the phone's owner in advance and record more detailed information about that user; in this case the phone's owner is the system user. Determining whether the current user is a system user thus further distinguishes system users from non-system users. Moreover, the preset user information provides more detailed information about the system user, so that rich media can be displayed more precisely for system users.
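The branch between system users and other users might look like the sketch below. The patent does not say how a speaker is matched against stored spectrum information, so the Euclidean distance comparison, the threshold, and all names (`match_system_user`, `resolve_display_key`, the feature vectors) are assumptions made for illustration.

```python
import math


def match_system_user(features, enrolled, threshold=10.0):
    """Return the id of the closest enrolled user within the threshold, else None."""
    best_id, best_distance = None, threshold
    for user_id, stored_features in enrolled.items():
        distance = math.dist(features, stored_features)
        if distance <= best_distance:
            best_id, best_distance = user_id, distance
    return best_id


def resolve_display_key(features, enrolled, profiles, attribute_lookup):
    """Use the preset user information for system users, else a looked-up attribute."""
    user_id = match_system_user(features, enrolled)
    if user_id is not None:
        return ("profile", profiles[user_id])
    # Not a system user: fall back to the spectral-feature attribute lookup.
    return ("attribute", attribute_lookup(features))
```

The returned pair tells the display step whether to pick its rule from the stored user information or from the generic user attribute.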
In some preferred embodiments, the step of displaying the rich media information according to the preset display rule corresponding to the user attribute comprises: displaying rich media files according to the display rule in the rich media information, wherein the rich media information contains rich media file information and corresponding display rule information, and the display rule includes the order and the manner in which the rich media files are displayed. Rich media content includes video, audio, text, animation, and even executable applications. The content of rich media is of course not limited to the items listed and can be extended as needed; this is not described further here. The display rule, or the way it is embodied, includes but is not limited to: playing video, playing audio, playing audio and video in sequence, picture display, background audio, animation, or other rules or modes of interactive rich media display, application-style display, and so on. For example, multiple pictures may be displayed by sliding through them one after another, or may be displayed as in a PPT. The rich media information may or may not include a display rule; when it does not, a default rule may be configured in the system and used as the display rule. In this way, rich media can be displayed under different rules, which increases the diversity of rich media display. Moreover, adding the display rule to the rich media information lets the system set a corresponding display rule for each search type and thereby achieve richer display modes.
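The fallback to a default rule described above can be sketched in a few lines. The dictionary layout, the `display_rule` key, and `DEFAULT_DISPLAY_RULE` are hypothetical; the patent only states that a system default is used when the rich media information carries no rule of its own.

```python
# Hypothetical system-wide default, used when rich media carries no rule.
DEFAULT_DISPLAY_RULE = {"order": ["video", "picture", "text"], "mode": "sequential"}


def effective_display_rule(rich_media_info, default_rule=DEFAULT_DISPLAY_RULE):
    """Prefer the rule embedded in the rich media information, else the default."""
    rule = rich_media_info.get("display_rule")
    return rule if rule is not None else default_rule
```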
Further, after the step of displaying the rich media information according to the preset display rule corresponding to the user attribute, the method further comprises:
playing preset voice guidance information according to the rich media information;
acquiring new voice data input by the user;
displaying, according to the new voice data, the rich media information corresponding to the new voice data.
In this way, the guidance information can prompt the user to select further rich media, and the rich media file that the user then selects is displayed. This makes the interactive display process more flexible and improves the user experience.
In another preferred embodiment of the present invention, step 104 of displaying the rich media information according to the preset display rule corresponding to the user attribute comprises: looking up, according to the user attribute, a preset list of correspondences between user attributes and display rules to obtain the display rule of the current user, the display rule including an ordering of the rich media information; and automatically playing and displaying the rich media information in the order given by the display rule. Here, automatically playing and displaying the rich media information in that order means playing or displaying each rich media item in turn according to the order in the display rule. For audio and video information, playing means playing the audio or video file directly; for text or picture information, playing means presenting it as a slideshow, although other forms of presentation may of course also be used. In this way, all the rich media that the user obtains from the voice data is presented to the user automatically, in an order based on the display rule corresponding to the user's own attributes. This not only improves the user's interaction experience but also strengthens the effect of the rich media display and thereby improves the efficiency of voice interaction.
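A minimal sketch of this ordered playback follows, assuming each rich media item carries a `type` and a `name`. The rule table `DISPLAY_RULES`, its orderings, and the fallback attribute are hypothetical values chosen for illustration.

```python
# Hypothetical list mapping a user attribute to an ordering of media types.
DISPLAY_RULES = {
    "child": ["picture", "video", "audio", "text"],
    "adult": ["text", "picture", "video", "audio"],
}
DEFAULT_ATTRIBUTE = "adult"


def playback_plan(items, user_attribute):
    """Order items by the user's display rule and choose how each is presented."""
    order = DISPLAY_RULES.get(user_attribute, DISPLAY_RULES[DEFAULT_ATTRIBUTE])
    rank = {media_type: position for position, media_type in enumerate(order)}
    plan = []
    for item in sorted(items, key=lambda it: rank.get(it["type"], len(rank))):
        # Audio and video are played directly; text and pictures use a slideshow.
        action = "play" if item["type"] in ("audio", "video") else "slideshow"
        plan.append((action, item["name"]))
    return plan
```

Because `sorted` is stable, items of the same type keep the order in which the search returned them.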
In a further embodiment of the present invention, the step of displaying the rich media information according to the preset display rule corresponding to the user attribute further includes: judging whether the rich media information currently being displayed is voice or video information; if it is, performing no additional operation; if it is not, obtaining the voice information bound to the rich media information and presenting that voice information together with the rich media information. For information files that are not voice or video information, a bound voice message needs to be preset, so that a voice presentation is also achieved when pictures, text, or similar information are displayed. For example, if the items displayed are pictures of tablet computers, each picture has a voice message that briefly introduces it, such as "** brand tablet". The voice information may also be obtained by the interactive system or the terminal by parsing the rich media information. For example, if the rich media is text information, the interactive system can convert the text into voice information in the background and bind it to that text. This increases the variety of voice interaction and further improves the user experience.
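The narration rule described above can be sketched as follows. This is a minimal illustration only: the `RichMedia` type, the `voice_for` function, and the `fake_tts` stand-in for a text-to-speech engine are hypothetical names that do not appear in the patent.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Kinds that already carry their own sound; the text says no extra operation is needed.
AUDIO_VIDEO = {"voice", "video"}


@dataclass
class RichMedia:
    kind: str                          # "voice", "video", "text" or "picture" (assumed labels)
    content: str                       # file name or text body
    bound_voice: Optional[str] = None  # preset narration bound to this item, if any


def voice_for(item: RichMedia, synthesize: Callable[[str], str]) -> Optional[str]:
    """Return the narration to present alongside the item, or None."""
    if item.kind in AUDIO_VIDEO:
        return None                    # voice or video: do nothing extra
    if item.bound_voice is not None:
        return item.bound_voice        # preset narration, e.g. for a picture
    if item.kind == "text":
        # Fallback described in the text: convert the text to voice and bind it.
        item.bound_voice = synthesize(item.content)
        return item.bound_voice
    return None                        # nothing bound and nothing to synthesize from


def fake_tts(text: str) -> str:
    """Hypothetical placeholder for a real text-to-speech engine."""
    return f"tts:{text}"
```

A picture with a preset narration returns that narration unchanged, while a text item without one is converted once and then reuses the bound result.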
In some optional embodiments, step 103 of searching for and obtaining the rich media information related to the semantic information further includes: obtaining the attribute features of the rich media information found by the search; according to the attribute features of the rich media information and the attributes of the user, looking up and judging whether the rich media belongs to the blocked information preset for this user attribute; and, if so, filtering out that rich media information. Here, the attribute feature generally refers to the type of the rich media information. For example, video files can be divided into types such as arts, pornography, violence, murder, adventure, and science fiction, and pictures can be divided into types such as bloody, fresh, and disgusting. A separate list of blocked information is set for each user attribute. For example, for children the blocked information is unfavorable information such as games, violence, murder, and pornography, while for adults the blocked information is information such as certain specific religions and crime. Different blocked-information lists can also be set for users of different sexes as required. This helps to further improve the efficiency of voice interaction and avoids displaying erroneous or unsuitable information. In the field of education, the blocked information can screen out much of the information that would distract a child's attention, making the child's voice interaction healthier and more effective.
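The filtering described above amounts to a set lookup per user attribute. The sketch below is illustrative: the dictionary layout, the `category` key, and the function name are assumptions, and the list contents simply repeat the examples given in the text.

```python
from typing import Dict, List, Set

# One blocked-information list per user attribute (contents taken from the examples above).
MASK_LISTS: Dict[str, Set[str]] = {
    "child": {"game", "violence", "murder", "pornography"},
    "adult": {"specific religion", "crime"},
}


def filter_rich_media(items: List[dict], user_attribute: str,
                      mask_lists: Dict[str, Set[str]] = MASK_LISTS) -> List[dict]:
    """Keep only the items whose type is not blocked for this user attribute."""
    blocked = mask_lists.get(user_attribute, set())  # unknown attribute: nothing blocked
    return [item for item in items if item["category"] not in blocked]


results = [
    {"name": "space.mp4", "category": "science fiction"},
    {"name": "fight.mp4", "category": "violence"},
    {"name": "heist.mp4", "category": "crime"},
]
for_child = filter_rich_media(results, "child")
for_adult = filter_rich_media(results, "adult")
```

Treating an unknown attribute as "nothing blocked" is a design choice made here for brevity; a stricter deployment might instead fall back to the most restrictive list.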
Fig. 2 is a flowchart of another embodiment of the rich media display method based on voice interaction provided by the present invention. The rich media display method based on voice interaction includes:
Step 201: obtain the speech data input by the user;
Step 202: judge whether the current user is a system user; if so, perform step 204; otherwise, perform step 203;
Step 203: apply frequency-domain transform processing to the speech data to obtain the spectral features of the speech data, look up the preset user attribute list according to the spectral features, and obtain the attributes of the user;
Step 204: obtain the preset user information corresponding to the current user;
Step 205: perform semantic recognition on the speech data to obtain the semantic information of the speech data, and search for and obtain the rich media information related to the semantic information;
Step 206: convert the speech data into text information and display the text information at the top of the interface for a preset time;
Step 207: obtain the attribute features of the rich media information found by the search;
Step 208: judge whether the rich media belongs to the blocked information preset for this user attribute; if so, perform step 210; otherwise, perform step 209;
Step 209: since step 208 found that the rich media does not belong to the blocked information preset for this user attribute, retain the rich media information;
Step 210: since step 208 found that the rich media belongs to the blocked information preset for this user attribute, filter out the rich media information, that is, remove rich media information of this type;
Step 211: automatically play and display the rich media information in the order given by the display rule.
As can be seen from the above embodiment, the rich media display method based on voice interaction converts the voice information into text information and displays it, which improves the accuracy of voice interaction. By obtaining the feature information of the rich media information, the method can exclude the blocked information preset for the corresponding user attribute, so that the whole rich media display process is more stable and reliable and the display effect of voice interaction is improved. The rich media display method based on voice interaction of the present invention therefore improves both the accuracy and the effect of the interaction and gives the user a better interactive experience.
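The flow of steps 201 to 211 can be summarized in a short sketch. Every recognizer here is passed in as a callable because the patent does not specify how speaker identification, semantic recognition, or speech-to-text are implemented; all parameter names and the dictionary keys `kind` and `category` are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List, Optional, Set, Tuple


def run_interaction(
    speech: Any,                                         # step 201: the user's speech data
    identify_user: Callable[[Any], Optional[str]],       # attribute of a known system user, or None
    attribute_from_spectrum: Callable[[Any], str],
    recognize: Callable[[Any], str],
    search: Callable[[str], List[dict]],
    transcribe: Callable[[Any], str],
    mask_lists: Dict[str, Set[str]],
    display_rules: Dict[str, List[str]],
) -> Tuple[str, List[dict]]:
    # Steps 202-204: use preset information for a system user, else the spectrum lookup.
    attribute = identify_user(speech)
    if attribute is None:
        attribute = attribute_from_spectrum(speech)      # step 203
    # Step 205: semantic recognition, then search for related rich media.
    candidates = search(recognize(speech))
    # Step 206: text shown at the top of the interface for a preset time.
    caption = transcribe(speech)
    # Steps 207-210: drop items whose type is blocked for this attribute.
    blocked = mask_lists.get(attribute, set())
    kept = [m for m in candidates if m["category"] not in blocked]
    # Step 211: order the remaining items by the display rule for this attribute.
    rank = {kind: i for i, kind in enumerate(display_rules.get(attribute, []))}
    kept.sort(key=lambda m: rank.get(m["kind"], len(rank)))
    return caption, kept


library = {"tablets": [
    {"name": "intro.mp4", "kind": "video", "category": "science"},
    {"name": "shooter.png", "kind": "picture", "category": "violence"},
    {"name": "specs.txt", "kind": "text", "category": "science"},
]}
caption, shown = run_interaction(
    b"raw-audio",
    identify_user=lambda s: None,                        # not a system user
    attribute_from_spectrum=lambda s: "child",
    recognize=lambda s: "tablets",
    search=lambda q: list(library.get(q, [])),
    transcribe=lambda s: "show me tablets",
    mask_lists={"child": {"violence"}},
    display_rules={"child": ["text", "video", "picture"]},
)
```

In this example the violent picture is filtered out and the remaining items are ordered text first, then video, as the hypothetical rule for the "child" attribute requests.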
Fig. 3 is a diagram of one embodiment of the rich media display system based on voice interaction provided by the present invention. The rich media display system based on voice interaction includes:
Data acquisition module 301, configured to obtain the speech data input by the user and to send the obtained speech data to the attribute lookup module 302 and the information acquisition module 303;
Attribute lookup module 302, configured to receive the speech data sent by the data acquisition module 301, apply frequency-domain transform processing to the speech data to obtain the spectral features of the speech data, look up the preset user attribute list according to the spectral features to obtain the attributes of the user, and send the obtained user attribute information to the information display module 304;
Information acquisition module 303, configured to receive the speech data sent by the data acquisition module 301, perform semantic recognition on the speech data to obtain the semantic information of the speech data, search for and obtain the rich media information related to the semantic information, and send the obtained rich media information to the information display module 304;
Information display module 304, configured to receive the user attribute information sent by the attribute lookup module 302 and the rich media information sent by the information acquisition module 303, and to display the rich media information according to the preset display rule corresponding to the user attribute.
As can be seen from the above embodiment, the rich media display system based on voice interaction obtains the speech data input by the user through the data acquisition module 301, determines the attributes of the user through the attribute lookup module 302, obtains the rich media information corresponding to the speech data through the information acquisition module 303, and finally displays the rich media information through the information display module 304 according to the preset display rule. As a result, the interactive system can handle interactions differently according to the different attributes of users, and the efficiency and effect of the user's voice interaction are greatly improved.
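The work of the attribute lookup module 302 can be illustrated with a small sketch: transform the samples to the frequency domain, take one spectral feature, and look it up in a preset table. The patent does not say which spectral feature or which table entries are used, so the dominant-frequency feature, the 250 Hz boundary, and the two labels below are placeholders chosen only to make the example runnable.

```python
import cmath
import math
from typing import List, Optional, Tuple


def dominant_frequency(samples: List[float], sample_rate: float) -> float:
    """Return the frequency (Hz) of the strongest bin of a naive discrete Fourier transform."""
    n = len(samples)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):                       # skip the DC bin, stop below Nyquist
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n


# Hypothetical preset user attribute list: (low Hz, high Hz, attribute).
ATTRIBUTE_TABLE: List[Tuple[float, float, str]] = [
    (0.0, 250.0, "adult"),
    (250.0, float("inf"), "child"),
]


def lookup_attribute(freq: float,
                     table: List[Tuple[float, float, str]] = ATTRIBUTE_TABLE) -> Optional[str]:
    """Return the attribute whose frequency range contains freq, if any."""
    for low, high, attribute in table:
        if low <= freq < high:
            return attribute
    return None


# A synthetic 300 Hz tone sampled at 8 kHz stands in for real speech data.
tone = [math.sin(2 * math.pi * 300 * t / 8000) for t in range(400)]
feature = dominant_frequency(tone, 8000)
attribute = lookup_attribute(feature)
```

A practical system would use a fast Fourier transform and a richer feature set; the quadratic-time transform here is only meant to keep the example free of third-party dependencies.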
In some preferred embodiments of the present invention, referring to Fig. 4, the rich media display system based on voice interaction further includes a text display module 305. The data acquisition module 301 is further configured to send the obtained speech data to the text display module 305. The text display module 305 is configured to receive the speech data sent by the data acquisition module, convert the speech data into text information, and display the text information in the interface for a preset time.
In other preferred embodiments of the present invention, the system further includes a user judgment module 306.
The data acquisition module 301 is further configured to send the obtained speech data to the user judgment module 306;
The user judgment module 306 is configured to receive the speech data sent by the data acquisition module 301 and to judge, according to the speech data, whether the current user is a system user; if the current user is a system user, it obtains the preset user information corresponding to the current user and sends the preset user information to the information display module 304; if the current user is not a system user, it sends the speech data to the attribute lookup module 302;
The information display module 304 is further configured to receive the preset user information sent by the user judgment module 306 and to display the rich media information according to the preset display rule corresponding to the preset user information.
In a further embodiment of the present invention, the information display module 304 is further configured to display the rich media files according to the display rule contained in the rich media information, where the rich media information contains the rich media file information and the corresponding display rule information, and the display rule includes the order in which the rich media files are displayed and the manner in which they are displayed.
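A rich media information record that carries its own display rule, as described above, could look like the following sketch. The field names (`files`, `display_rule`, `order`, `modes`) and the mode labels are assumptions; the patent only states that the rule includes a display order and a display manner.

```python
from typing import Dict, List, Tuple


def presentation_plan(rich_media_info: Dict) -> List[Tuple[str, str]]:
    """Turn rich media information into an ordered list of (file name, display mode)."""
    files = {f["name"] for f in rich_media_info["files"]}
    rule = rich_media_info["display_rule"]
    plan = []
    for name in rule["order"]:                      # order in which files are displayed
        if name in files:                           # ignore rule entries without a file
            plan.append((name, rule["modes"].get(name, "play")))
    return plan


info = {
    "files": [{"name": "intro.mp4"}, {"name": "photo.png"}, {"name": "specs.txt"}],
    "display_rule": {
        "order": ["specs.txt", "intro.mp4", "photo.png"],
        "modes": {"specs.txt": "slideshow", "photo.png": "slideshow"},
    },
}
plan = presentation_plan(info)
```

Files without an explicit mode default to direct playback here, which matches the earlier description of audio and video files being played directly.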
In an optional embodiment of the present invention, the system further includes a guidance module 307.
The information display module 304 is further configured to send a guidance instruction to the guidance module 307 according to the rich media information;
The guidance module 307 is configured to receive the guidance instruction sent by the information display module 304 and to play the preset voice guidance information;
The data acquisition module 301 is further configured to obtain new speech data input by the user and to send the new speech data to the information display module 304;
The information display module 304 is further configured to receive the new speech data sent by the data acquisition module 301 and, according to the new speech data, to display the rich media information corresponding to the new speech data.
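One round of the guided interaction described above can be sketched as a single function. The callables `play`, `listen`, and `find_rich_media`, the `topic` key, and the prompt table are hypothetical stand-ins for the guidance module 307, the data acquisition module 301, and the lookup performed for the information display module 304.

```python
from typing import Callable, Dict, List, Optional


def guided_round(
    rich_media: Dict,
    prompts: Dict[str, str],                 # preset voice guidance per topic (assumed layout)
    play: Callable[[str], None],
    listen: Callable[[], str],
    find_rich_media: Callable[[str], List[str]],
) -> Optional[List[str]]:
    """Play the preset guidance for the shown item, then display what the reply asks for."""
    prompt = prompts.get(rich_media["topic"])
    if prompt is None:
        return None                          # no guidance preset for this item
    play(prompt)                             # guidance module plays the voice guidance
    new_speech = listen()                    # data acquisition module gets new speech data
    return find_rich_media(new_speech)       # rich media corresponding to the new speech


played: List[str] = []
follow_up = guided_round(
    {"topic": "tablets"},
    prompts={"tablets": "Would you like to hear about prices?"},
    play=played.append,
    listen=lambda: "yes, prices",
    find_rich_media=lambda speech: ["prices.png"] if "prices" in speech else [],
)
```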
In a preferred embodiment of the present invention, the information acquisition module 303 is further configured to obtain the attribute features of the rich media information found by the search; to look up and judge, according to the attribute features of the rich media information and the attributes of the user, whether the rich media belongs to the blocked information preset for this user attribute; and, if so, to filter out the rich media information.
Fig. 4 is a diagram of another embodiment of the rich media display system based on voice interaction provided by the present invention. The rich media display system based on voice interaction includes: the data acquisition module 301, the attribute lookup module 302, the information acquisition module 303, the information display module 304, the text display module 305, the user judgment module 306, and the guidance module 307.
Those of ordinary skill in the art should understand that the discussion of any of the above embodiments is merely exemplary and is not intended to imply that the scope of the present disclosure (including the claims) is limited to these examples. Within the spirit of the present invention, the technical features of the above embodiments or of different embodiments may also be combined, the steps may be carried out in any order, and there are many other variations of the different aspects of the present invention as described above, which are not provided in detail for the sake of brevity.
In addition, to simplify the description and discussion, and so as not to obscure the present invention, well-known power/ground connections to integrated circuit (IC) chips and other components may or may not be shown in the provided drawings. Furthermore, devices may be shown in block diagram form to avoid obscuring the present invention, which also takes into account the fact that the details of implementing such block diagram devices depend heavily on the platform on which the present invention is to be implemented (that is, these details should be well within the understanding of those skilled in the art). Where specific details (for example, circuits) are set forth to describe exemplary embodiments of the present invention, it will be apparent to those skilled in the art that the present invention can be practiced without these specific details or with variations of them. These descriptions should therefore be regarded as illustrative rather than restrictive.
Although the present invention has been described in conjunction with specific embodiments, many replacements, modifications, and variations of these embodiments will be apparent to those of ordinary skill in the art in light of the foregoing description. For example, other memory architectures (such as dynamic RAM (DRAM)) may use the embodiments discussed.
The embodiments of the present invention are intended to cover all such replacements, modifications, and variations that fall within the broad scope of the appended claims. Therefore, any omission, modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.
Claims (12)
1. A rich media display method based on voice interaction, characterized by comprising:
obtaining speech data input by a user;
applying frequency-domain transform processing to the speech data to obtain spectral features of the speech data, looking up a preset user attribute list according to the spectral features, and obtaining an attribute of the user;
performing semantic recognition on the speech data to obtain semantic information of the speech data, and searching for and obtaining rich media information related to the semantic information according to the semantic information;
displaying the rich media information according to a preset display rule corresponding to the user attribute.
2. The method according to claim 1, characterized in that, after the step of obtaining the semantic information of the speech data, the method further comprises:
converting the speech data into text information, and displaying the text information in the interface for a preset time.
3. The method according to claim 1, characterized in that, before the step of applying frequency-domain transform processing to the speech data to obtain the spectral features of the speech data, the method further comprises:
judging, according to the speech data, whether the current user is a system user;
if the current user is a system user, obtaining preset user information corresponding to the current user; performing semantic recognition on the speech data to obtain the semantic information of the speech data, and searching for and obtaining the rich media information related to the semantic information according to the semantic information; and displaying the rich media information according to a preset display rule corresponding to the user information;
if the current user is not a system user, performing the step of applying frequency-domain transform processing to the speech data to obtain the spectral features of the speech data.
4. The method according to claim 1, characterized in that the step of displaying the rich media information according to the preset display rule corresponding to the user attribute comprises:
displaying rich media files according to a display rule in the rich media information, wherein the rich media information contains rich media file information and corresponding display rule information, and the display rule includes the order in which the rich media files are displayed and the manner in which they are displayed.
5. The method according to claim 1, characterized in that, after the step of displaying the rich media information according to the preset display rule corresponding to the user attribute, the method further comprises:
playing preset voice guidance information according to the rich media information;
obtaining new speech data input by the user;
displaying, according to the new speech data, the rich media information corresponding to the new speech data.
6. The method according to claim 1, characterized in that the step of searching for and obtaining the rich media information related to the semantic information according to the semantic information further comprises:
obtaining attribute features of the rich media information found by the search;
looking up and judging, according to the attribute features of the rich media information and the attribute of the user, whether the rich media belongs to blocked information preset for this user attribute;
if so, filtering out the rich media information.
7. A rich media display system based on voice interaction, characterized by comprising:
a data acquisition module, configured to obtain speech data input by a user and to send the obtained speech data to an attribute lookup module and an information acquisition module;
the attribute lookup module, configured to receive the speech data sent by the data acquisition module, apply frequency-domain transform processing to the speech data to obtain spectral features of the speech data, look up a preset user attribute list according to the spectral features to obtain an attribute of the user, and send the obtained user attribute information to an information display module;
the information acquisition module, configured to receive the speech data sent by the data acquisition module, perform semantic recognition on the speech data to obtain semantic information of the speech data, search for and obtain rich media information related to the semantic information according to the semantic information, and send the obtained rich media information to the information display module;
the information display module, configured to receive the user attribute information sent by the attribute lookup module and the rich media information sent by the information acquisition module, and to display the rich media information according to a preset display rule corresponding to the user attribute.
8. The system according to claim 7, characterized in that the system further comprises a text display module;
the data acquisition module is further configured to send the obtained speech data to the text display module;
the text display module is configured to receive the speech data sent by the data acquisition module, convert the speech data into text information, and display the text information in the interface for a preset time.
9. The system according to claim 7, characterized by further comprising a user judgment module;
the data acquisition module is further configured to send the obtained speech data to the user judgment module;
the user judgment module is configured to receive the speech data sent by the data acquisition module and to judge, according to the speech data, whether the current user is a system user; if the current user is a system user, to obtain preset user information corresponding to the current user and send the preset user information to the information display module; if the current user is not a system user, to send the speech data to the attribute lookup module;
the information display module is further configured to receive the preset user information sent by the user judgment module and to display the rich media information according to a preset display rule corresponding to the preset user information.
10. The system according to claim 7, characterized in that the information display module is further configured to
display rich media files according to a display rule in the rich media information, wherein the rich media information contains rich media file information and corresponding display rule information, and the display rule includes the order in which the rich media files are displayed and the manner in which they are displayed.
11. The system according to claim 7, characterized by further comprising a guidance module;
the information display module is further configured to send a guidance instruction to the guidance module according to the rich media information;
the guidance module is configured to receive the guidance instruction sent by the information display module and to play preset voice guidance information;
the data acquisition module is further configured to obtain new speech data input by the user and to send the new speech data to the information display module;
the information display module is further configured to receive the new speech data sent by the data acquisition module and, according to the new speech data, to display the rich media information corresponding to the new speech data.
12. The system according to claim 7, characterized in that the information acquisition module is further configured to:
obtain attribute features of the rich media information found by the search;
look up and judge, according to the attribute features of the rich media information and the attribute of the user, whether the rich media belongs to blocked information preset for this user attribute;
if so, filter out the rich media information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610279818.0A CN106027485A (en) | 2016-04-28 | 2016-04-28 | Rich media display method and system based on voice interaction |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610279818.0A CN106027485A (en) | 2016-04-28 | 2016-04-28 | Rich media display method and system based on voice interaction |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106027485A true CN106027485A (en) | 2016-10-12 |
Family
ID=57081725
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610279818.0A Pending CN106027485A (en) | 2016-04-28 | 2016-04-28 | Rich media display method and system based on voice interaction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106027485A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108132805A (en) * | 2017-12-20 | 2018-06-08 | 深圳Tcl新技术有限公司 | Voice interactive method, device and computer readable storage medium |
CN109147800A (en) * | 2018-08-30 | 2019-01-04 | 百度在线网络技术(北京)有限公司 | Answer method and device |
CN109165336A (en) * | 2018-08-23 | 2019-01-08 | 广东小天才科技有限公司 | A kind of information output controlling method and private tutor's equipment |
CN111081248A (en) * | 2019-12-27 | 2020-04-28 | 安徽仁昊智能科技有限公司 | Artificial intelligence speech recognition device |
CN111638789A (en) * | 2020-05-29 | 2020-09-08 | 广东小天才科技有限公司 | Data output method and terminal equipment |
CN112458703A (en) * | 2019-08-19 | 2021-03-09 | 青岛海尔洗衣机有限公司 | Information display processing method and device, washing machine and storage medium |
WO2022041192A1 (en) * | 2020-08-29 | 2022-03-03 | 深圳市永兴元科技股份有限公司 | Voice message processing method and device, and instant messaging client |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN201733370U (en) * | 2010-05-05 | 2011-02-02 | 康佳集团股份有限公司 | Set-top box with children mode |
CN102708174A (en) * | 2012-05-04 | 2012-10-03 | 奇智软件(北京)有限公司 | Method and device for displaying rich media information in browser |
CN103677516A (en) * | 2013-11-27 | 2014-03-26 | 青岛海信电器股份有限公司 | Interface generating method and device of terminal |
CN104795067A (en) * | 2014-01-20 | 2015-07-22 | 华为技术有限公司 | Voice interaction method and device |
CN105095406A (en) * | 2015-07-09 | 2015-11-25 | 百度在线网络技术(北京)有限公司 | Method and apparatus for voice search based on user feature |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108132805A (en) * | 2017-12-20 | 2018-06-08 | 深圳Tcl新技术有限公司 | Voice interactive method, device and computer readable storage medium |
CN109165336A (en) * | 2018-08-23 | 2019-01-08 | 广东小天才科技有限公司 | A kind of information output controlling method and private tutor's equipment |
CN109147800A (en) * | 2018-08-30 | 2019-01-04 | 百度在线网络技术(北京)有限公司 | Answer method and device |
US11475897B2 (en) | 2018-08-30 | 2022-10-18 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and apparatus for response using voice matching user category |
CN112458703A (en) * | 2019-08-19 | 2021-03-09 | 青岛海尔洗衣机有限公司 | Information display processing method and device, washing machine and storage medium |
CN112458703B (en) * | 2019-08-19 | 2024-03-12 | 青岛海尔洗衣机有限公司 | Information display processing method and device, washing machine and storage medium |
CN111081248A (en) * | 2019-12-27 | 2020-04-28 | 安徽仁昊智能科技有限公司 | Artificial intelligence speech recognition device |
CN111638789A (en) * | 2020-05-29 | 2020-09-08 | 广东小天才科技有限公司 | Data output method and terminal equipment |
WO2022041192A1 (en) * | 2020-08-29 | 2022-03-03 | 深圳市永兴元科技股份有限公司 | Voice message processing method and device, and instant messaging client |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20161012 |