CN110113646B - AI voice-based intelligent interactive processing method, system and storage medium - Google Patents
- Publication number: CN110113646B (application CN201910239885.3A)
- Authority: CN (China)
- Prior art keywords: user, intelligent, voiceprint, voice, scene
- Prior art date
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L15/26—Speech-to-text systems
- G10L17/00—Speaker identification or verification techniques
- G10L17/22—Interactive procedures; man-machine interfaces
- G10L25/63—Speech or voice analysis specially adapted for estimating an emotional state
- G10L2015/223—Execution procedure of a spoken command
- H04N21/42203—Input-only peripherals: sound input device, e.g. microphone
- H04N21/4223—Input-only peripherals: cameras
- H04N21/439—Processing of audio elementary streams
- H04N21/44218—Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
- H04N21/4532—Management of client or end-user data involving end-user characteristics, e.g. viewer profile, preferences
- H04N21/4667—Processing of monitored end-user data, e.g. trend analysis based on the log file of viewer selections
- H04N21/472—End-user interface for requesting content, additional data or services
Abstract
The invention discloses an AI voice-based intelligent interactive processing method, system and storage medium. The method comprises the following steps: an intelligent camera carrying a far-field voice module with voiceprint recognition is connected to the smart television in advance, so that the user can interact with the smart television through the camera's far-field voice module; the intelligent camera captures the user's voice and image information in real time, and analyzes the information using a pre-constructed AI home intelligent interaction scene database corresponding to the user behavior data; according to the analysis result, the smart television pre-judges the user's behavior habits and makes a corresponding interactive response. The invention provides an AI voice-based intelligent interaction processing method and system that facilitate intelligent recognition and interactive recommendation, so that the smart television gains a better intelligent interaction function and is more convenient for the user to use.
Description
Technical Field
The invention relates to the technical field of smart homes, and in particular to an AI voice-based intelligent interactive processing method, system and storage medium.
Background
With the development of science and technology, intelligent consumer electronics are becoming popular. Voiceprint recognition, one of the AI speech technologies, is currently a leading technology: it can identify a speaker's voice attributes (gender and age) and voiceprint affiliation (which user spoke a given sentence).
Current voiceprint recognition applications remain in their infancy: they can essentially identify only basic voiceprint attributes (e.g. male, female, old, young, and whose voiceprint it is), and AI home scenario applications built on voiceprint recognition technology are lacking.
Smart televisions in the prior art do not yet have a good intelligent interaction function and are sometimes inconvenient for users to use.
Accordingly, the prior art is yet to be improved and developed.
Disclosure of Invention
In view of the above defects of the prior art, the invention aims to provide an AI voice-based intelligent interactive processing method, system and storage medium that facilitate intelligent recognition and interactive recommendation, so that a better intelligent interaction function is added to the smart television and the user's use is facilitated.
To achieve this purpose, the invention adopts the following technical scheme:
An AI voice-based intelligent interactive processing method comprises the following steps:
A. an intelligent camera carrying a far-field voice module with voiceprint recognition is connected to the smart television in advance, so that the user can interact with the smart television through the camera's far-field voice module;
B. the intelligent camera captures the user's voice and image information in real time, and analyzes the information using a pre-constructed AI home intelligent interaction scene database corresponding to the user behavior data;
C. according to the analysis result, the smart television pre-judges the user's behavior habits and makes a corresponding interactive response.
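Steps A to C above can be sketched as a minimal data flow (an illustrative Python sketch; all class names, method names and database contents are hypothetical assumptions, not part of the claimed method):

```python
from dataclasses import dataclass, field

@dataclass
class SceneDatabase:
    """Pre-constructed AI home intelligent interaction scene database (step A1)."""
    behavior_map: dict = field(default_factory=dict)

    def analyze(self, utterance: str) -> str:
        # Map a captured utterance to a pre-judged interactive response.
        return self.behavior_map.get(utterance, "no suggestion")

class SmartCamera:
    """Far-field voice camera connected to the television (step A)."""
    def capture(self, utterance: str, frame: str) -> dict:
        # Step B: capture the user's voice and image information in real time.
        return {"voice": utterance, "image": frame}

class SmartTV:
    def __init__(self, db: SceneDatabase):
        self.camera = SmartCamera()
        self.db = db

    def interact(self, utterance: str, frame: str = "") -> str:
        info = self.camera.capture(utterance, frame)
        # Step C: respond according to the analysis result.
        return self.db.analyze(info["voice"])

db = SceneDatabase({"what is fun": "recommend: games you often play"})
tv = SmartTV(db)
print(tv.interact("what is fun"))  # prints: recommend: games you often play
```

A real database would hold learned behavior records rather than a fixed dictionary; the fixed mapping here only illustrates the lookup step.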
In the AI voice-based intelligent interactive processing method, step A further comprises: A1, constructing in advance an AI home intelligent interaction scene database corresponding to the user behavior data.
In the AI voice-based intelligent interactive processing method, step B comprises:
when the smart television is turned on, the intelligent camera is in a working state;
the intelligent camera captures the user's voice and image information in real time, and intercepts and records the user's speech for AI home intelligent interaction processing;
the AI home intelligent interaction processing analyzes the user's voice and image information using the pre-constructed AI home intelligent interaction scene database corresponding to the user behavior data;
pre-judgments are made according to the user's behavior habits, and are continuously learned and corrected according to the user's interactive behaviors.
In the AI voice-based intelligent interactive processing method, analyzing the user's voice and image information in step B using the pre-constructed AI home intelligent interaction scene database corresponding to the user behavior data comprises:
performing semantic recognition and scene construction of voice instructions;
performing voiceprint attribute analysis, voiceprint emotion feature analysis, face recognition analysis, user family scene analysis, user emotion analysis and scene history analysis for the current user;
intelligently creating user system big data, and analyzing the user's voice instructions by constructing an AI home intelligent interaction scene.
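The first of these stages, deciding whether an utterance belongs to the instruction class or the scene construction class, can be illustrated with a keyword heuristic (the keyword list and function name are invented for illustration; a real system would use a trained semantic model):

```python
# Verbs that mark an utterance as a definite instruction (illustrative list).
INSTRUCTION_VERBS = {"play", "open", "switch", "volume", "search"}

def classify_utterance(utterance: str) -> str:
    """Split an utterance into the instruction class or scene construction class."""
    words = utterance.lower().split()
    if any(w in INSTRUCTION_VERBS for w in words):
        return "instruction"
    # Everything else is treated as describing the current scene.
    return "scene_construction"

print(classify_utterance("Play the next episode"))  # instruction
print(classify_utterance("we are having a party"))  # scene_construction
```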
In the AI voice-based intelligent interactive processing method, performing semantic recognition and scene construction of voice instructions comprises:
performing semantic recognition by voice instruction decomposition: analyzing whether the user's utterance belongs to the instruction class or the scene construction class.
Performing voiceprint attribute analysis for the current user comprises:
identifying the voiceprint attributes of the current user: which voiceprint users have appeared at the same time.
The voiceprint emotion feature analysis comprises: in what scene each voiceprint appears, what each person's voiceprint scene is, and what the combined scene is.
The face recognition analysis comprises: who appeared together, what their expressions were, and at what time.
The user family scene analysis is performed from the intelligent camera's framing according to a preset template.
The user emotion analysis is performed from the voiceprint, voiceprint emotion features, facial expression and scene.
The scene history analysis comprises: which voiceprint-scene combinations have had which processing events, when they occurred, and what interactions the user performed afterwards; through historical data analysis, the user's next action is pre-judged and some preprocessing output is performed.
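The scene history analysis just described (logging which interaction followed each voiceprint-scene combination, then pre-judging the most likely next action) can be sketched as follows; the class name and sample data are illustrative assumptions:

```python
from collections import Counter

class SceneHistory:
    """Log of (voiceprint-scene combination, follow-up interaction) pairs."""
    def __init__(self):
        self.log = []

    def record(self, scene_key: tuple, interaction: str) -> None:
        self.log.append((scene_key, interaction))

    def predict_next(self, scene_key: tuple):
        # Pre-judge the user's next action as the most frequent follow-up
        # interaction seen for this voiceprint-scene combination.
        follow_ups = [i for k, i in self.log if k == scene_key]
        if not follow_ups:
            return None
        return Counter(follow_ups).most_common(1)[0][0]

h = SceneHistory()
h.record(("dad", "mom"), "open cooking show")
h.record(("dad", "mom"), "open cooking show")
h.record(("dad", "mom"), "open news")
print(h.predict_next(("dad", "mom")))  # open cooking show
```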
In the AI voice-based intelligent interactive processing method, step C comprises:
the smart television creates an attribute record for the user according to the analysis result, takes the user's ID, voiceprint attribute and face attribute as the user's identification values, and locates the user by any one of these three attributes;
when an unfamiliar voiceprint or face is detected, a user attribute record is created by default, and the voiceprint attribute of the corresponding user is added through subsequent interactive intelligence; if a user ID was first recorded with a voiceprint attribute, the user's face attribute is added through subsequent interactive intelligence;
after a user is successfully created, a big data table based on the user ID is automatically created; this table records the user's various behavior records, interaction records and user-association records;
pre-judgments are made according to the user's behavior habits, and are continuously learned and corrected according to the user's interactive behaviors;
after AI home intelligent interactive decomposition of the user's voice and image information, the user's pre-execution operation is obtained, or the user's best interaction scene is recommended with a corresponding prompt.
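The user record handling of step C (locating a user by any one of ID, voiceprint attribute or face attribute, and creating a default record for an unfamiliar voiceprint or face) can be sketched as follows; the registry class and attribute names are hypothetical:

```python
import itertools

class UserRegistry:
    def __init__(self):
        self.users = []               # attribute records
        self._ids = itertools.count(1)

    def locate(self, voiceprint=None, face=None):
        # A user is located by any one of the identification values.
        for u in self.users:
            if ((voiceprint and u["voiceprint"] == voiceprint)
                    or (face and u["face"] == face)):
                # Subsequent interactions fill in the missing attribute.
                u["voiceprint"] = u["voiceprint"] or voiceprint
                u["face"] = u["face"] or face
                return u
        # Unfamiliar voiceprint or face: create a default record, with an
        # empty history table keyed by the new user ID.
        u = {"id": next(self._ids), "voiceprint": voiceprint,
             "face": face, "history": []}
        self.users.append(u)
        return u

reg = UserRegistry()
u1 = reg.locate(voiceprint="vp-A")              # new default record
u2 = reg.locate(voiceprint="vp-A", face="f-A")  # same user; face learned
print(u1 is u2, u2["face"])  # True f-A
```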
An AI voice-based intelligent interactive processing system comprises: a processor, a memory and a communication bus;
the memory stores an AI voice-based intelligent interactive processing program executable by the processor;
the communication bus realizes connection and communication between the processor and the memory;
when the processor executes the AI voice-based intelligent interactive processing program, the following steps are realized:
A. an intelligent camera carrying a far-field voice module with voiceprint recognition is connected to the smart television in advance, so that the user can interact with the smart television through the camera's far-field voice module;
B. the intelligent camera captures the user's voice and image information in real time, and analyzes the information using a pre-constructed AI home intelligent interaction scene database corresponding to the user behavior data;
C. according to the analysis result, the smart television pre-judges the user's behavior habits and makes a corresponding interactive response.
In the AI voice-based intelligent interactive processing system, when the processor executes the AI voice-based intelligent interactive processing program, the following steps are further realized:
A1, constructing in advance an AI home intelligent interaction scene database corresponding to the user behavior data;
when the smart television is turned on, the intelligent camera is in a working state;
the intelligent camera captures the user's voice and image information in real time, and intercepts and records the user's speech for AI home intelligent interaction processing;
the AI home intelligent interaction processing analyzes the user's voice and image information using the pre-constructed AI home intelligent interaction scene database corresponding to the user behavior data;
pre-judgments are made according to the user's behavior habits, and are continuously learned and corrected according to the user's interactive behaviors.
In the AI voice-based intelligent interactive processing system, when the processor executes the AI voice-based intelligent interactive processing program, the following steps are further realized:
performing semantic recognition and scene construction of voice instructions;
performing voiceprint attribute analysis, voiceprint emotion feature analysis, face recognition analysis, user family scene analysis, user emotion analysis and scene history analysis for the current user;
intelligently creating user system big data, and analyzing the user's voice instructions by constructing an AI home intelligent interaction scene;
performing semantic recognition by voice instruction decomposition: analyzing whether the user's utterance belongs to the instruction class or the scene construction class.
Performing voiceprint attribute analysis for the current user comprises:
identifying the voiceprint attributes of the current user: which voiceprint users have appeared at the same time.
The voiceprint emotion feature analysis comprises: in what scene each voiceprint appears, what each person's voiceprint scene is, and what the combined scene is.
The face recognition analysis comprises: who appeared together, what their expressions were, and at what time.
The user family scene analysis is performed from the intelligent camera's framing according to a preset template.
The user emotion analysis is performed from the voiceprint, voiceprint emotion features, facial expression and scene.
The scene history analysis comprises: which voiceprint-scene combinations have had which processing events, when they occurred, and what interactions the user performed afterwards; through historical data analysis, the user's next action is pre-judged and some preprocessing output is performed.
The smart television creates an attribute record for the user according to the analysis result, takes the user's ID, voiceprint attribute and face attribute as the user's identification values, and locates the user by any one of these three attributes.
When an unfamiliar voiceprint or face is detected, a user attribute record is created by default, and the voiceprint attribute of the corresponding user is added through subsequent interactive intelligence; if a user ID was first recorded with a voiceprint attribute, the user's face attribute is added through subsequent interactive intelligence.
After a user is successfully created, a big data table based on the user ID is automatically created; this table records the user's various behavior records, interaction records and user-association records.
Pre-judgments are made according to the user's behavior habits, and are continuously learned and corrected according to the user's interactive behaviors.
After AI home intelligent interactive decomposition of the user's voice and image information, the user's pre-execution operation is obtained, or the user's best interaction scene is recommended with a corresponding prompt.
A storage medium stores one or more programs executable by one or more processors to implement the steps of any one of the above AI voice-based intelligent interactive processing methods.
Compared with the prior art, in the AI voice-based intelligent interactive processing method, system and storage medium provided by the invention, the smart television carries an intelligent camera with far-field voice voiceprint recognition, the user interacts with the television through the camera's far-field voice, and every sentence of the user's voice interaction is analyzed by the AI home intelligent interaction system block. The analyzed content comprises: semantic recognition of voice instructions (voice instruction decomposition, which splits instructions into a definite instruction class and a scene construction class; new field classifications can be added as the analysis system improves); voiceprint attributes of the current user (voiceprint recognition of gender and age; voiceprint emotion features such as excited, worried or calm; face recognition of user and expression attributes; and user system association); user family scene analysis (one person, several persons, personnel combinations, and family scenes such as a party, a dinner or leisure, analyzed from the intelligent camera's framing according to a preset template); user emotion analysis (from voiceprint + voiceprint emotion features + facial expression + scene); and scene history analysis (which voiceprint-scene combinations have had which processing events, when they occurred, and what interactions the user performed afterwards; through historical data analysis, the user's next action is pre-judged and some preprocessing output is performed). User system big data (user ID, user attributes, user interaction records and user-association records of interactions between users) are created intelligently, and the user's voice instructions are further analyzed by constructing an AI home intelligent interaction scene, so that the scene construction capability and the emotion interaction capability of AI voice are improved. All of the above data are stored in the cloud.
The invention provides a deep emotional interaction experience for smart home and AI voice intelligent interaction, improves the experience and interest of products, improves the intelligent experience of a television-centered smart home, and provides a companion-like home experience. The invention adds a better intelligent interaction function to the smart television and makes it more convenient for users to use.
Drawings
Fig. 1 is a flowchart of an AI speech-based intelligent interactive processing method according to the present invention.
Fig. 2 is a functional block diagram of the AI voice-based intelligent interactive processing system according to a preferred embodiment of the invention.
Detailed Description
To make the objects, technical solutions and effects of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit it.
Referring to fig. 1, the AI voice-based intelligent interactive processing method provided by the invention includes the following steps:
S100, an intelligent camera carrying a far-field voice module with voiceprint recognition is connected to the smart television in advance, and the user interacts with the smart television through the camera's far-field voice module.
In the embodiment of the invention, the intelligent camera with far-field voice voiceprint recognition must be connected to the smart television in advance so that the user can interact with the television through the camera's far-field voice module. The smart television carries the intelligent camera, the user interacts with the television through the camera's far-field voice, and every sentence of the user's voice interaction is analyzed by the AI (artificial intelligence) home intelligent interaction system block.
Step S100 further comprises: A1, constructing in advance an AI home intelligent interaction scene database corresponding to the user behavior data. For example, when the user says "what is fun", the system correspondingly recommends games the user often plays or travel items.
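The database entry in this example can be illustrated as a simple lookup (the table contents echo the example in the text; the function name and list items are assumptions):

```python
# One entry of the pre-constructed scene database: the utterance
# "what is fun" maps to recommendations drawn from the user's
# frequently played games or browsed travel items.
scene_db = {
    "what is fun": ["a game you often play", "a travel item you browsed"],
}

def recommend(utterance: str) -> list:
    """Return the recommendations stored for an utterance, if any."""
    return scene_db.get(utterance.lower(), [])

print(recommend("What is fun"))  # ['a game you often play', 'a travel item you browsed']
```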
S200, the intelligent camera captures the user's voice and image information in real time, and analyzes the information using the pre-constructed AI home intelligent interaction scene database corresponding to the user behavior data.
Step S200 specifically comprises:
when the smart television is turned on, the intelligent camera is in a working state;
the intelligent camera captures the user's voice and image information in real time, and intercepts and records the user's speech for AI home intelligent interaction processing;
the AI home intelligent interaction processing analyzes the user's voice and image information using the pre-constructed AI home intelligent interaction scene database corresponding to the user behavior data;
pre-judgments are made according to the user's behavior habits, and are continuously learned and corrected according to the user's interactive behaviors.
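The capture flow of step S200 (camera active only while the television is on, with each utterance intercepted and queued for AI home interaction processing) can be sketched as follows; all names are illustrative:

```python
class CaptureLoop:
    def __init__(self):
        self.tv_on = False
        self.queue = []   # utterances recorded for AI processing

    def power(self, on: bool) -> None:
        # The camera's working state follows the television's power state.
        self.tv_on = on

    def on_audio(self, utterance: str) -> bool:
        if not self.tv_on:
            return False              # camera idle while the TV is off
        self.queue.append(utterance)  # intercept and record the speech
        return True

loop = CaptureLoop()
loop.on_audio("hello")       # ignored: TV is off
loop.power(True)
loop.on_audio("play music")  # queued for processing
print(loop.queue)            # ['play music']
```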
Analyzing the user's voice and image information in step S200 using the pre-constructed AI home intelligent interaction scene database corresponding to the user behavior data comprises:
performing semantic recognition and scene construction of voice instructions;
performing voiceprint attribute analysis, voiceprint emotion feature analysis, face recognition analysis, user family scene analysis, user emotion analysis and scene history analysis for the current user;
intelligently creating user system big data, and analyzing the user's voice instructions by constructing an AI home intelligent interaction scene.
The steps of performing semantic recognition and scene construction of the voice command comprise:
performing semantic recognition of voice instruction decomposition: analyzing whether the speaking of the user belongs to an instruction class or a scene construction class;
the step of performing voiceprint attribute analysis of the current user comprises:
performing voiceprint attribute identification of the current user: identifying which voiceprint users are present at the same time;
the voiceprint emotion characteristic analysis includes: in what scene each voiceprint appears, what each person's voiceprint scene is, and what the comprehensive scene is;
the face recognition analysis includes: who appeared together with whom, what their expressions were, and at what time;
the user family scene analysis is performed according to a preset template through the framing of the intelligent camera;
the emotion analysis of the user is performed through voiceprints, voiceprint emotion characteristics, facial expressions, and scenes;
the scene history analysis includes: which voiceprint scene combinations have occurred, what processing events happened, when they happened, and what interaction the user performed afterwards; this is used to predict the user's next action through historical data analysis and to output some preprocessing.
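The scene history analysis above amounts to predicting the most likely follow-up interaction for a given voiceprint scene combination. A minimal frequency-based sketch, assuming a hypothetical (combination, action) record layout:

```python
from collections import Counter

def predict_next_action(history, current_combo):
    """history: list of (combo, action) pairs, where combo is a frozenset of
    the user IDs present together; returns the most frequent past action for
    the current combination, or None if it has never been seen."""
    counts = Counter(action for combo, action in history if combo == current_combo)
    return counts.most_common(1)[0][0] if counts else None
```

A deployed system would weight recency and context rather than raw frequency; this only illustrates the "predict from historical combinations" idea.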
In this step S200, the user interacts with the television through the far-field voice of the intelligent camera, and each sentence of the user's voice interaction is analyzed and processed by the AI home intelligent interaction system module. The analyzed and processed content includes: semantic recognition of voice commands (voice command decomposition, which decomposes commands into a definite command class and a scene construction class; new field classifications can be added as the analysis system improves); voiceprint attributes of the current user (voiceprint recognition of gender and age); voiceprint emotion characteristics (excited, worried, calm, and the like); face recognition (user attributes, expression attributes, user system association); user family scene analysis (one person, multiple persons, personnel combinations, and family scenes such as party, dinner, or leisure, analyzed according to a preset template through the intelligent camera's framing); emotion analysis of the user (through voiceprint + voiceprint emotion characteristics + facial expression + scene); and scene history analysis (which voiceprint scene combinations have occurred, what processing events happened, and what interaction the users performed afterwards; through historical data analysis, the user's next behavior is pre-judged and some preprocessing outputs are produced). User system big data (user ID, user attributes, user interaction records, and user association records, i.e., interactions between users) are created intelligently, and the user's voice commands are further analyzed and processed by constructing an AI family intelligent interaction scene, so that the scene construction capability and emotion interaction capability of AI voice are improved. All the data mentioned above are stored in the cloud.
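The emotion analysis step combines voiceprint emotion characteristics, facial expression, and scene cues. The patent does not specify a fusion rule, so the weighted-sum sketch below, including its emotion labels and weights, is purely an illustrative assumption:

```python
def fuse_emotion(voiceprint_emotion, face_emotion, scene_emotion,
                 weights=(0.4, 0.4, 0.2)):
    """Each input maps an emotion label to a score in [0, 1]; combine the
    three sources with the given weights and return the top-scoring label."""
    combined = {}
    sources = (voiceprint_emotion, face_emotion, scene_emotion)
    for source, w in zip(sources, weights):
        for label, score in source.items():
            combined[label] = combined.get(label, 0.0) + w * score
    return max(combined, key=combined.get)
```

Any real implementation would learn the fusion from data; the point here is only that voiceprint, face, and scene signals are merged into a single emotion estimate.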
S300, the smart television pre-judges the behavior habits of the user and carries out corresponding interactive response according to the analysis processing result.
The step S300 specifically includes:
the smart television creates an attribute record of the user according to the analysis processing result, takes the ID, the voiceprint attribute and the face attribute of the user as the identification value of the user, and locates the user according to any one of the three attributes;
when an unfamiliar voiceprint or face is detected, an attribute record of the user is created by default, and the voiceprint attribute of the user corresponding to that voiceprint is added through subsequent interaction; conversely, if the user's ID was first recorded with the voiceprint attribute, the user's face attribute is added through subsequent interaction;
after a user is successfully created, a big data table based on the user ID is automatically created, and the data table records the user's various behavior records and interaction records;
pre-judging according to the behavior habits of the users, and continuously learning and correcting according to the interaction behaviors of the users;
and after AI family intelligent interactive decomposition of the user's voice image information, the user's pre-execution operation is obtained, or the user's best interactive scene is recommended and a corresponding prompt is given.
For example: user A and user B send the command [What shall we do today?] to the camera. The AI family intelligent interaction system analyzes whether users A and B have appeared in front of the television together before. If they have, it recalls interactive memories of things they have done together and gives current opinions and recommendations according to the present family scene. The opinions and recommendations are diverse and may draw on application data in the television (such as watching TV, playing games, and cooking), shopping (new style recommendations, shopping discounts), travel (travel recommendations), and other operation data. These recommendations are pre-judged according to the users' behaviors and continuously learned and corrected according to the users' interaction behaviors, so that the AI family intelligent interaction system comes closer to the users' habits and preferences.
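The user identification and record-creation logic of step S300 can be sketched as follows. The class and field names are hypothetical; the point the code illustrates is that any one of the three identification values (user ID, voiceprint attribute, face attribute) locates the user, and a record created from an unfamiliar voiceprint or face is enriched with the missing attribute through later interactions.

```python
class UserRegistry:
    """Hypothetical store of user attribute records keyed by ID/voiceprint/face."""

    def __init__(self):
        self._users = []      # list of user record dicts
        self._next_id = 1

    def locate(self, user_id=None, voiceprint=None, face=None):
        """Locate a user by any one of the three identification values."""
        for u in self._users:
            if (user_id is not None and u["id"] == user_id) \
               or (voiceprint is not None and u.get("voiceprint") == voiceprint) \
               or (face is not None and u.get("face") == face):
                return u
        return None

    def get_or_create(self, voiceprint=None, face=None):
        user = self.locate(voiceprint=voiceprint, face=face)
        if user is None:
            # unfamiliar voiceprint/face: create a default attribute record
            user = {"id": self._next_id, "voiceprint": voiceprint, "face": face}
            self._next_id += 1
            self._users.append(user)
        else:
            # fill in the attribute learned through subsequent interaction
            if voiceprint is not None and user.get("voiceprint") is None:
                user["voiceprint"] = voiceprint
            if face is not None and user.get("face") is None:
                user["face"] = face
        return user
```

For instance, a user first seen only by voice gets a record with an empty face attribute, which is filled in the first time the camera associates a face with that voiceprint.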
The invention is described in further detail below by way of a specific application example:
S11, the smart television is equipped with an intelligent camera having a far-field voice module with voiceprint recognition.
S12, when the smart television is turned on, the intelligent camera is in a working state.
S13, the intelligent camera monitors the user's speech and transmits the user's speech recording to the AI home intelligent interactive system.
S14, the AI home intelligent interactive system analyzes and processes the user's speech; the content of the analysis includes semantic recognition of voice commands (voice command decomposition): analyzing whether the user's speech belongs to the command class (where the user's intention is very clear and the command can be executed without scene analysis, such as "I want to watch a Liu De movie", "play a nice song", or "I want to eat braised pork") or the scene construction class (such as "the weather is so hot", "what shall we do now", "I feel bored", or "what shall we eat at noon").
Voiceprint attributes of the current user (voiceprint recognition of gender, age, etc.): which voiceprint users are present at the same time.
Voiceprint emotion characteristics (excited, sad, calm, etc.): in what scene each voiceprint appears, what each person's voiceprint scene is, and what the comprehensive scene is (the scene is analyzed from the voiceprint; default definitions include excited, warm, happy, lively, etc.).
Face recognition (user attributes, expression attributes, user system association): who appeared together with whom, what their expressions were, and at what time.
User family scene analysis (one person, multiple persons, personnel combinations, and family scenes such as party, dinner party, or leisure, analyzed according to the preset template through the intelligent camera's framing).
Emotion analysis of the user (through voiceprint + voiceprint emotion characteristics + facial expression + scene).
Scene history analysis (which voiceprint scene combinations have occurred, what processing events happened, when they happened, and what interaction the user performed afterwards; the user's next behavior is predicted through historical data analysis and some preprocessing is output).
The voice instruction of the user is further analyzed and processed by constructing an AI family intelligent interaction scene, and the scene construction capability and the emotion interaction capability of AI voice are improved.
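The command-class versus scene-construction-class decision in S14 could be sketched with a toy keyword rule. A real system would use a trained language-understanding model; the verb list and rule below are invented placeholders for illustration only.

```python
# Hypothetical command verbs; anything matching these at the start of an
# utterance (optionally after "I want to") is treated as a direct command.
COMMAND_VERBS = ("watch", "play", "listen", "open", "turn")

def classify_utterance(utterance: str) -> str:
    """Crude rule: command-verb openings -> command class; everything else
    falls into the scene construction class for further scene analysis."""
    text = utterance.lower().strip()
    for verb in COMMAND_VERBS:
        if text.startswith(verb) or text.startswith("i want to " + verb):
            return "command"
    return "scene_construction"
```

Under this sketch, "I want to watch a movie" is executed directly, while "the weather is so hot" is routed to scene construction, matching the two classes described in S14.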
S15, when the intelligent camera detects the user's voice data and transmits it to the AI home intelligent interactive system, the system creates an attribute record of the user, takes the user's ID, voiceprint attribute, and face attribute as identification values, and can locate the user through any one of these three attributes.
S16, when the AI family intelligent interactive system detects an unfamiliar voiceprint or face, it creates an attribute record for the user by default and adds the voiceprint attribute corresponding to that voiceprint through subsequent interaction. Conversely, if the user's ID was first recorded with the voiceprint attribute, the user's face attribute is added through subsequent interaction.
S17, after the user is successfully created, a big data table based on the user ID is automatically created; the data table records the user's various behavior records, interaction records, and the like (including the history of commands sent by the user, records of command execution, and subsequent interactions the user performed on those commands; the user's basic data is listed in items 6, 7, 8, 9, 10, and 11 but is not limited to the listed data records).
S18, the AI family intelligent interactive system decomposes the speech sent by the user to obtain the user's pre-execution operation or to recommend the user's best interactive scene.
For example: user A and user B send the command [What shall we do today?] to the camera. The AI family intelligent interaction system analyzes whether users A and B have appeared in front of the television together before. If they have, it recalls interactive memories of things they have done together and gives current opinions and recommendations according to the present family scene. The opinions and recommendations are diverse and may draw on application data in the television (such as watching TV, playing games, and cooking), shopping (new style recommendations, shopping discounts), travel (travel recommendations), and other operation data. These recommendations are pre-judged according to the users' behaviors and continuously learned and corrected according to the users' interaction behaviors, so that the AI family intelligent interaction system comes closer to the users' habits and preferences.
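The multi-user example above can be sketched as a lookup of joint interaction memories keyed by the set of users present, with general scene recommendations appended either way. All record structures below are hypothetical.

```python
def respond_to_group(group, joint_history, scene_recommendations):
    """group: frozenset of user IDs present together;
    joint_history: dict mapping a group to their past shared activities;
    returns the memories to recall plus current recommendations."""
    memories = joint_history.get(group, [])
    # If the group has shared history, remind them of it; otherwise fall
    # back to recommendations derived from the current family scene alone.
    return {"memories": memories, "recommendations": scene_recommendations}
```

So for users A and B with a shared record, the response interleaves "things you did together" with fresh suggestions; for a never-seen combination it degrades gracefully to scene-based recommendations.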
Therefore, the invention provides an AI voice-based intelligent interactive processing method and system that facilitate intelligent recognition and interactive recommendation, adding a better intelligent interaction function to the smart television and making it more convenient for users.
As shown in FIG. 2, based on the above AI voice-based intelligent interactive processing method, the present invention further provides an AI voice-based intelligent interactive processing system, which may be a smart television, a desktop computer, a notebook computer, a palmtop computer, or a smart device with a smart speaker. The AI voice-based intelligent interactive processing system comprises a processor 10, a memory 20, and a display screen 30, wherein the processor 10 is connected with the memory 20 through a communication bus 50, and the display screen 30 is connected with the processor 10 through the communication bus 50. FIG. 2 shows only some of the components of the AI voice-based intelligent interactive processing system; it is to be understood that not all of the shown components are required, and more or fewer components may be implemented instead.
The memory 20 may, in some embodiments, be an internal storage unit of the AI voice-based intelligent interactive processing system, for example, the internal memory of the system. In other embodiments, the memory 20 may be an external storage device of the system, such as a plug-in USB flash drive, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the system. Further, the memory 20 may include both an internal storage unit and an external storage device. The memory 20 is used for storing the application software installed in the AI voice-based intelligent interactive processing system and various types of data, such as the program code of the system, and may also be used to temporarily store data that has been output or is to be output. In an embodiment, the memory 20 stores an AI voice-based intelligent interactive processing method program 40, which can be executed by the processor 10 to implement the AI voice-based intelligent interactive processing method of the present application.
The processor 10 may be, in some embodiments, a Central Processing Unit (CPU), a microprocessor, a mobile phone baseband processor or other data Processing chip, and is configured to run program codes stored in the memory 20 or process data, such as executing the AI voice-based intelligent interactive Processing method.
The display 30 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch panel, or the like in some embodiments. The display screen 30 is used for displaying information in the AI voice-based intelligent interactive processing system and for displaying a visual user interface.
In one embodiment, when the processor 10 executes the AI voice-based intelligent interactive processing method program 40 in the memory 20, the following steps are implemented:
A. the method comprises the following steps that an intelligent camera with far-field voice module voiceprint recognition is connected and arranged on the intelligent television in advance and used for interacting with the intelligent television through the far-field voice module of the intelligent camera;
B. the intelligent camera shoots and acquires the voice image information of the user in real time, and analyzes and processes the voice image information of the user by utilizing an AI family intelligent interaction scene database which is constructed in advance and corresponds to the user behavior data;
C. and the smart television pre-judges the behavior habits of the user and performs corresponding interactive response according to the analysis processing result, which is specifically described above.
When the processor executes the AI-voice-based intelligent interactive processing program, the following steps are also realized:
a1, an AI family intelligent interaction scene database corresponding to the user behavior data is pre-constructed;
when the intelligent television is started, the intelligent camera is in a working state;
the intelligent camera shoots and acquires voice image information of a user in real time, intercepts the speaking voice of the user and records the speaking voice of the user for AI family intelligent interactive processing;
the AI family intelligent interaction processing utilizes an AI family intelligent interaction scene database which is constructed in advance and corresponds to the user behavior data to analyze and process the voice image information of the user;
and prejudging according to the behavior habits of the users, and continuously learning and correcting according to the interactive behaviors of the users.
When the processor executes the AI-voice-based intelligent interactive processing program, the following steps are also realized:
performing semantic recognition and scene construction of voice commands;
performing voiceprint attribute analysis, voiceprint emotion characteristic analysis, face recognition analysis, user family scene analysis, user emotion analysis and scene history analysis of the current user;
intelligently creating user system big data, and analyzing and processing a voice instruction of a user by constructing an AI home intelligent interaction scene;
performing semantic recognition of voice instruction decomposition: analyzing whether the speaking of the user belongs to an instruction class or a scene construction class;
the step of performing voiceprint attribute analysis of the current user comprises:
performing voiceprint attribute identification of the current user: which voiceprint users have appeared at the same time;
the voiceprint emotional characteristic analysis comprises the following steps: what the scene of the voiceprint appears, what the voiceprint scene of each person is, what the comprehensive scene is;
the face recognition analysis comprises: who and who appeared at the same time, what the expression was, what the time was;
the user family scene analysis is analyzed according to a preset template through the view finding of the intelligent camera;
the emotion analysis of the user is carried out through voiceprints, voiceprint emotion characteristics, human face expressions and scenes;
the scene history analysis includes: which voiceprint scene combinations have occurred with what processing event, when, what interaction the user has performed after the occurrence, for predicting the next action of the user through historical data analysis, and outputting some preprocessing;
the smart television creates an attribute record of the user according to the analysis processing result, takes the ID, the voiceprint attribute and the face attribute of the user as the identification value of the user, and locates the user according to any one of the three attributes;
when an unfamiliar voiceprint or face is detected, an attribute record of the user is created by default, and the voiceprint attribute of the user corresponding to that voiceprint is added through subsequent interaction; conversely, if the user's ID was first recorded with the voiceprint attribute, the user's face attribute is added through subsequent interaction;
after a user is successfully created, a big data table based on the user ID is automatically created, and the data table records the user's various behavior records and interaction records;
pre-judging according to the behavior habits of the users, and continuously learning and correcting according to the interaction behaviors of the users;
after performing AI home intelligent interactive decomposition on the voice image information of the user, obtaining a pre-execution operation of the user, or recommending the best interactive scene of the user and performing corresponding prompting, as described above.
Based on the foregoing embodiments, the present invention further provides a computer-readable storage medium, where one or more programs are stored, and the one or more programs are executable by one or more processors to implement the steps in the AI voice-based intelligent interactive processing method according to any item above, specifically as described above.
In summary, according to the AI voice-based intelligent interactive processing method, system, and storage medium provided by the present invention, an intelligent camera with a far-field voice module and voiceprint recognition is mounted on the smart television, the user interacts with the television through the far-field voice of the intelligent camera, and each sentence of the user's voice interaction is analyzed and processed by the AI family intelligent interaction system module. The analyzed and processed content includes: semantic recognition of voice commands (voice command decomposition into a definite command class and a scene construction class; new field classifications can be added as the analysis system improves); voiceprint attributes of the current user (voiceprint recognition of gender and age); voiceprint emotion characteristics (excited, worried, calm, and the like); face recognition (user attributes, expression attributes, user system association); user family scene analysis (one person, multiple persons, personnel combinations, and family scenes such as party, dinner, or leisure, analyzed according to a preset template through the intelligent camera's framing); emotion analysis of the user (through voiceprint + voiceprint emotion characteristics + facial expression + scene); and scene history analysis (which voiceprint scene combinations have occurred, what processing events happened, and what interaction the users performed afterwards; through historical data analysis, the user's next behavior is pre-judged and some preprocessing outputs are produced). User system big data (user ID, user attributes, user interaction records, and user association records, i.e., interactions between users) are created intelligently, and the user's voice commands are further analyzed and processed by constructing an AI family intelligent interaction scene, so that the scene construction capability and emotion interaction capability of AI voice are improved. All the data mentioned above are stored in the cloud.
The invention provides a deep emotional interaction experience for smart home and AI voice intelligent interaction, improves the experience and appeal of the product, enhances the intelligent experience of a household smart home centered on the television, and provides a companionship-style home experience. The invention adds a better intelligent interaction function to the smart television and is convenient for users to use.
Of course, it will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program instructing relevant hardware (such as a processor, a controller, etc.), and the program may be stored in a computer readable storage medium, and when executed, the program may include the processes of the above method embodiments. The storage medium may be a memory, a magnetic disk, an optical disk, etc.
It is to be understood that the invention is not limited to the examples described above, but that modifications and variations may be effected thereto by those of ordinary skill in the art in light of the foregoing description, and that all such modifications and variations are intended to be within the scope of the invention as defined by the appended claims.
Claims (4)
1. An intelligent interactive processing method based on AI voice is characterized by comprising the following steps:
A. the method comprises the following steps that an intelligent camera with far-field voice module voiceprint recognition is connected and arranged on the intelligent television in advance and used for interacting with the intelligent television through the far-field voice module of the intelligent camera;
B. the intelligent camera shoots and acquires the voice image information of the user in real time, and analyzes and processes the voice image information of the user by utilizing an AI family intelligent interaction scene database which is constructed in advance and corresponds to the user behavior data;
the step B comprises the following steps:
when the intelligent television is started, the intelligent camera is in a working state;
the intelligent camera shoots and acquires voice image information of a user in real time, intercepts the speaking voice of the user and records the speaking voice of the user for AI family intelligent interactive processing;
the AI family intelligent interaction processing utilizes an AI family intelligent interaction scene database which is constructed in advance and corresponds to the user behavior data to analyze and process the voice image information of the user;
pre-judging according to the behavior habits of the users, and continuously learning and correcting according to the interaction behaviors of the users;
the step B of analyzing and processing the voice image information of the user by utilizing an AI family intelligent interactive scene database which is pre-constructed and corresponds to the user behavior data comprises the following steps:
performing semantic recognition and scene construction of voice commands;
performing voiceprint attribute analysis, voiceprint emotion characteristic analysis, face recognition analysis, user family scene analysis, user emotion analysis and scene history analysis of the current user;
intelligently creating user system big data, and analyzing and processing a voice instruction of a user by constructing an AI home intelligent interaction scene;
the step of performing semantic recognition and scene construction of the voice instruction comprises the following steps:
performing semantic recognition of voice instruction decomposition: analyzing whether the speaking of the user belongs to an instruction class or a scene construction class;
the step of performing voiceprint attribute analysis of the current user comprises:
performing voiceprint attribute identification of the current user: which voiceprint users have appeared at the same time;
the voiceprint emotional characteristic analysis comprises the following steps: what the scene of the voiceprint appears, what the voiceprint scene of each person is, what the comprehensive scene is;
the face recognition analysis comprises: who and who appeared at the same time, what the expression was, what the time was;
the user family scene analysis is analyzed according to a preset template through the view finding of the intelligent camera;
the emotion analysis of the user is carried out through voiceprints, voiceprint emotion characteristics, human face expressions and scenes;
the scene history analysis includes: which voiceprint scene combinations have occurred with what processing event, when, what interaction the user has performed after the occurrence, for predicting the next action of the user through historical data analysis, and outputting some preprocessing;
C. the intelligent television pre-judges the behavior habits of the user and carries out corresponding interactive response according to the analysis and processing result;
the step C comprises the following steps:
the smart television creates an attribute record of the user according to the analysis processing result, takes the ID, the voiceprint attribute and the face attribute of the user as the identification value of the user, and locates the user according to any one of the three attributes;
when an unfamiliar voiceprint or face is detected, an attribute record of the user is created by default, and the voiceprint attribute of the user corresponding to that voiceprint is added through subsequent interaction; conversely, if the user's ID was first recorded with the voiceprint attribute, the user's face attribute is added through subsequent interaction;
after a user is successfully created, a big data table based on the user ID is automatically created, and the data table records the user's various behavior records and interaction records;
pre-judging according to the behavior habits of the users, and continuously learning and correcting according to the interaction behaviors of the users;
and after carrying out AI family intelligent interactive decomposition on the voice image information of the user, obtaining the pre-execution operation of the user, or recommending the best interactive scene of the user and carrying out corresponding prompt.
2. The AI voice-based intelligent interactive processing method according to claim 1, wherein the step a further comprises: and A1, constructing an AI family intelligent interaction scene database corresponding to the user behavior data in advance.
3. An intelligent interactive processing system based on AI speech, comprising:
a processor, a memory, and a communication bus;
the memory stores an AI voice based intelligent interactive processing program executable by the processor;
the communication bus realizes connection communication between the processor and the memory;
when the processor executes the AI voice-based intelligent interactive processing program, the following steps are realized:
A. the method comprises the following steps that an intelligent camera with far-field voice module voiceprint recognition is connected and arranged on the intelligent television in advance and used for interacting with the intelligent television through the far-field voice module of the intelligent camera;
B. the intelligent camera shoots and acquires the voice image information of the user in real time, and analyzes and processes the voice image information of the user by utilizing an AI family intelligent interaction scene database which is constructed in advance and corresponds to the user behavior data;
C. the intelligent television pre-judges the behavior habits of the user and carries out corresponding interactive response according to the analysis and processing result;
when the processor executes the AI voice-based intelligent interaction processing program, the following steps are also realized:
a1, an AI family intelligent interaction scene database corresponding to the user behavior data is pre-constructed;
when the intelligent television is started, the intelligent camera is in a working state;
the intelligent camera shoots and acquires voice image information of a user in real time, intercepts the speaking voice of the user and records the speaking voice of the user for AI family intelligent interactive processing;
the AI family intelligent interaction processing utilizes an AI family intelligent interaction scene database which is constructed in advance and corresponds to the user behavior data to analyze and process the voice image information of the user;
pre-judging according to the behavior habits of the users, and continuously learning and correcting according to the interaction behaviors of the users;
when the processor executes the AI voice-based intelligent interaction processing program, the following steps are also realized:
performing semantic recognition and scene construction of voice commands;
performing voiceprint attribute analysis, voiceprint emotion characteristic analysis, face recognition analysis, user family scene analysis, user emotion analysis and scene history analysis of the current user;
intelligently creating user system big data, and analyzing and processing a voice instruction of a user by constructing an AI home intelligent interaction scene;
performing semantic recognition of voice instruction decomposition: analyzing whether the speaking of the user belongs to an instruction class or a scene construction class;
the step of performing voiceprint attribute analysis of the current user comprises:
performing voiceprint attribute identification of the current user: determining which voiceprint users have appeared at the same time;
the voiceprint emotional characteristic analysis comprises: in what scene the voiceprint appears, what each person's voiceprint scene is, and what the comprehensive scene is;
the face recognition analysis comprises: who appeared together with whom, what their expressions were, and at what time;
the user family scene analysis is performed according to a preset template based on the framing of the smart camera;
the user emotion analysis is performed through the voiceprint, voiceprint emotional characteristics, facial expression and scene;
the scene history analysis comprises: which voiceprint-scene combinations have occurred, with what processing event and at what time, and what interaction the user performed afterwards; it is used to predict the user's next action through historical data analysis and to output certain preprocessing;
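The scene history analysis can be sketched as a lookup table keyed by voiceprint combination and time slot, recording what the user did afterwards. All names here are illustrative assumptions:

```python
from collections import Counter, defaultdict

# (voiceprint combination, time slot) -> Counter of follow-up actions.
scene_history = defaultdict(Counter)

def record_event(voiceprints, time_slot, followup_action):
    # log which interaction the user performed after this scene occurred
    key = (frozenset(voiceprints), time_slot)
    scene_history[key][followup_action] += 1

def predict_preprocessing(voiceprints, time_slot):
    """Predict the user's next action for this scene combination and
    return it as a preprocessing suggestion, if history supports one."""
    counts = scene_history.get((frozenset(voiceprints), time_slot))
    return counts.most_common(1)[0][0] if counts else None

record_event({"dad", "child"}, "saturday_morning", "open_cartoon_channel")
record_event({"dad", "child"}, "saturday_morning", "open_cartoon_channel")
print(predict_preprocessing({"dad", "child"}, "saturday_morning"))
```

Using a `frozenset` makes the voiceprint combination order-independent, so "dad and child" matches "child and dad".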
the smart television creates an attribute record of the user according to the analysis and processing result, takes the user's ID, voiceprint attribute and face attribute as the user's identification values, and can locate the user by any one of the three attributes;
when an unfamiliar voiceprint or face is detected, a default user attribute record is created, and the voiceprint attribute of the corresponding user is added intelligently through subsequent interaction; for a user whose ID was first recorded with the voiceprint attribute, the face attribute is likewise added intelligently through subsequent interaction;
after a user is successfully created, a big data table is automatically created based on the user ID, in which the user's various behavior records and interaction records are stored;
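The three-key identification scheme above (user ID, voiceprint, face, any of which locates the user) can be sketched as a registry with secondary indexes. The class and field names are hypothetical:

```python
class UserRegistry:
    """Users identified by any of: user ID, voiceprint, or face
    (a sketch of the three-attribute lookup described above)."""

    def __init__(self):
        self._users = {}          # user_id -> attribute record
        self._by_voiceprint = {}  # voiceprint -> user_id
        self._by_face = {}        # face -> user_id
        self._next_id = 1

    def _new_record(self):
        user_id = f"user{self._next_id}"
        self._next_id += 1
        record = {"id": user_id, "voiceprint": None, "face": None,
                  "behavior_log": []}  # per-user big data table
        self._users[user_id] = record
        return record

    def observe_voiceprint(self, voiceprint):
        # unfamiliar voiceprint -> create a default attribute record
        if voiceprint not in self._by_voiceprint:
            record = self._new_record()
            record["voiceprint"] = voiceprint
            self._by_voiceprint[voiceprint] = record["id"]
        return self._users[self._by_voiceprint[voiceprint]]

    def attach_face(self, user_id, face):
        # a later interaction adds the face attribute to the same user
        self._users[user_id]["face"] = face
        self._by_face[face] = user_id

    def find(self, *, user_id=None, voiceprint=None, face=None):
        # locate the user by any one of the three identification values
        if user_id:
            return self._users.get(user_id)
        if voiceprint and voiceprint in self._by_voiceprint:
            return self._users[self._by_voiceprint[voiceprint]]
        if face and face in self._by_face:
            return self._users[self._by_face[face]]
        return None

registry = UserRegistry()
user = registry.observe_voiceprint("vp_001")
registry.attach_face(user["id"], "face_001")
print(registry.find(face="face_001")["id"])  # same user, found by face
```

The secondary indexes mean a returning user is recognized whichever modality is available first, which is the property the claim relies on.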
the user's behavior habits are pre-judged, and the pre-judgment is continuously learned and corrected from the user's interaction behavior;
and after AI home intelligent interaction decomposition is performed on the user's voice and image information, the user's pre-execution operation is obtained, or the user's best interaction scene is recommended with a corresponding prompt.
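The final decomposition step, which yields either a pre-execution operation or a recommended interaction scene with a prompt, can be sketched as follows. The dictionary keys and scene names are illustrative assumptions:

```python
def decompose(utterance_class, predicted_action, best_scene):
    """Combine the analyses into either a pre-execution operation
    or a recommended interaction scene with a prompt (sketch)."""
    if utterance_class == "instruction" and predicted_action:
        return {"type": "pre_execute", "operation": predicted_action}
    if best_scene:
        return {"type": "recommend",
                "scene": best_scene,
                "prompt": f"Would you like to start {best_scene}?"}
    return {"type": "none"}

result = decompose("scene_construction", None, "movie night")
print(result["prompt"])  # Would you like to start movie night?
```

Pre-execution is preferred when a concrete operation can be predicted; otherwise the system falls back to recommending a scene and prompting the user, matching the either/or wording of the claim.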
4. A storage medium, characterized in that the computer-readable storage medium stores one or more programs which are executable by one or more processors to implement the steps in the AI voice-based intelligent interactive processing method according to any one of claims 1-2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910239885.3A CN110113646B (en) | 2019-03-27 | 2019-03-27 | AI voice-based intelligent interactive processing method, system and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110113646A CN110113646A (en) | 2019-08-09 |
CN110113646B true CN110113646B (en) | 2021-09-21 |
Family
ID=67484676
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910239885.3A Active CN110113646B (en) | 2019-03-27 | 2019-03-27 | AI voice-based intelligent interactive processing method, system and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110113646B (en) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110750773B (en) * | 2019-09-16 | 2023-08-18 | 康佳集团股份有限公司 | Image recognition method based on voiceprint attribute, intelligent terminal and storage medium |
CN110931011A (en) * | 2020-01-07 | 2020-03-27 | 杭州凯旗科技有限公司 | AI intelligent voice interaction method applied to intelligent retail equipment |
CN111326158A (en) * | 2020-01-23 | 2020-06-23 | 深圳市安顺康医疗电子有限公司 | Voice control method based on intelligent terminal |
CN111324202A (en) * | 2020-02-19 | 2020-06-23 | 中国第一汽车股份有限公司 | Interaction method, device, equipment and storage medium |
CN111901672A (en) * | 2020-06-12 | 2020-11-06 | 深圳市京华信息技术有限公司 | Artificial intelligence image processing method |
CN111967380A (en) * | 2020-08-16 | 2020-11-20 | 云知声智能科技股份有限公司 | Content recommendation method and system |
CN112203144A (en) * | 2020-10-12 | 2021-01-08 | 广州欢网科技有限责任公司 | Intelligent television program recommendation method and device and intelligent television |
CN112261289B (en) * | 2020-10-16 | 2022-08-26 | 海信视像科技股份有限公司 | Display device and AI algorithm result acquisition method |
CN112383748B (en) * | 2020-11-02 | 2023-05-02 | 中国联合网络通信集团有限公司 | Video information storage method and device |
CN112397061B (en) * | 2020-11-04 | 2023-10-27 | 中国平安人寿保险股份有限公司 | Online interaction method, device, equipment and storage medium |
CN112651334B (en) * | 2020-12-25 | 2023-05-23 | 三星电子(中国)研发中心 | Robot video interaction method and system |
CN115689810B (en) * | 2023-01-04 | 2023-04-04 | 深圳市人马互动科技有限公司 | Data processing method based on man-machine conversation and related device |
CN116453549A (en) * | 2023-05-05 | 2023-07-18 | 广西牧哲科技有限公司 | AI dialogue method based on virtual digital character and online virtual digital system |
CN116913277B (en) * | 2023-09-06 | 2023-11-21 | 北京惠朗时代科技有限公司 | Voice interaction service system based on artificial intelligence |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104038836A (en) * | 2014-06-03 | 2014-09-10 | 四川长虹电器股份有限公司 | Television program intelligent pushing method |
CN106682090A (en) * | 2016-11-29 | 2017-05-17 | 上海智臻智能网络科技股份有限公司 | Active interaction implementing device, active interaction implementing method and intelligent voice interaction equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||