CN110113646B - AI voice-based intelligent interactive processing method, system and storage medium - Google Patents

AI voice-based intelligent interactive processing method, system and storage medium

Info

Publication number
CN110113646B
CN110113646B CN201910239885.3A CN201910239885A
Authority
CN
China
Prior art keywords
user
intelligent
voiceprint
voice
scene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910239885.3A
Other languages
Chinese (zh)
Other versions
CN110113646A (en)
Inventor
周胜杰 (Zhou Shengjie)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Konka Electronic Technology Co Ltd
Original Assignee
Shenzhen Konka Electronic Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Konka Electronic Technology Co Ltd filed Critical Shenzhen Konka Electronic Technology Co Ltd
Priority to CN201910239885.3A priority Critical patent/CN110113646B/en
Publication of CN110113646A publication Critical patent/CN110113646A/en
Application granted granted Critical
Publication of CN110113646B publication Critical patent/CN110113646B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L15/26 Speech to text systems
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
    • G10L2015/223 Execution procedure of a spoken command
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/422 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42203 Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
    • H04N21/4223 Cameras
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/442 Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44213 Monitoring of end-user related data
    • H04N21/44218 Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/4508 Management of client data or end-user data
    • H04N21/4532 Management of client data or end-user data involving end-user characteristics, e.g. viewer profile, preferences
    • H04N21/466 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4667 Processing of monitored end-user data, e.g. trend analysis based on the log file of viewer selections
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Social Psychology (AREA)
  • Hospice & Palliative Care (AREA)
  • Psychiatry (AREA)
  • Child & Adolescent Psychology (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses an AI voice-based intelligent interactive processing method, system, and storage medium. The method comprises the following steps: an intelligent camera with a far-field voice module and voiceprint recognition is connected to the smart television in advance, so that the user interacts with the smart television through the camera's far-field voice module; the intelligent camera captures the user's voice and image information in real time and analyzes it using a pre-built AI home intelligent interaction scene database corresponding to user behavior data; and the smart television predicts the user's behavior habits and makes a corresponding interactive response according to the analysis result. The invention provides an AI voice-based intelligent interactive processing method and system that facilitate intelligent recognition and interactive recommendation, giving the smart television a better intelligent interaction function and making it more convenient for users.

Description

AI voice-based intelligent interactive processing method, system and storage medium
Technical Field
The invention relates to the technical field of smart homes, and in particular to an AI voice-based intelligent interactive processing method, system, and storage medium.
Background
With the development of science and technology, intelligent consumer electronics are becoming popular. Voiceprint recognition, one of the AI speech technologies, is currently a leading technology that can identify a speaker's voice attributes (gender, age) and voiceprint affiliation (which user spoke a given sentence).
Current voiceprint recognition applications remain in their infancy: they are essentially limited to identifying basic voiceprint attributes (e.g., male/female, old/young, whose voiceprint it is) and lack application-level development of AI home scenarios based on voiceprint recognition technology.
The smart television in the prior art thus lacks a better intelligent interaction function and is sometimes inconvenient for users to use.
Accordingly, the prior art still needs improvement and development.
Disclosure of Invention
In view of the above defects of the prior art, the invention aims to provide an AI voice-based intelligent interactive processing method, system, and storage medium that facilitate intelligent recognition and interactive recommendation, thereby adding a better intelligent interaction function to the smart television and making it more convenient for users to use.
To achieve this purpose, the invention adopts the following technical scheme:
an AI voice-based intelligent interactive processing method comprises the following steps:
A. the method comprises the following steps that an intelligent camera with far-field voice module voiceprint recognition is connected and arranged on the intelligent television in advance and used for interacting with the intelligent television through the far-field voice module of the intelligent camera;
B. the intelligent camera shoots and acquires the voice image information of the user in real time, and analyzes and processes the voice image information of the user by utilizing an AI family intelligent interaction scene database which is constructed in advance and corresponds to the user behavior data;
C. and the intelligent television pre-judges the behavior habits of the user and carries out corresponding interactive response according to the analysis and processing result.
In the AI voice-based intelligent interactive processing method, step A further comprises: A1, pre-building an AI home intelligent interaction scene database corresponding to user behavior data.
In the AI voice-based intelligent interactive processing method, step B comprises:
when the smart television is turned on, the intelligent camera is in a working state;
the intelligent camera captures the user's voice and image information in real time, intercepts the user's speech, and records it for AI home intelligent interactive processing;
the AI home intelligent interactive processing analyzes the user's voice and image information using the pre-built AI home intelligent interaction scene database corresponding to user behavior data;
predictions are made according to the user's behavior habits, with continuous learning and correction according to the user's interactive behavior.
In the AI voice-based intelligent interactive processing method, analyzing the user's voice and image information in step B using the pre-built AI home intelligent interaction scene database corresponding to user behavior data comprises:
performing semantic recognition and scene construction on voice commands;
performing voiceprint attribute analysis, voiceprint emotion characteristic analysis, face recognition analysis, user family scene analysis, user emotion analysis, and scene history analysis for the current user;
intelligently creating user-system big data and analyzing the user's voice instructions by constructing an AI home intelligent interaction scene.
In the AI voice-based intelligent interactive processing method, performing semantic recognition and scene construction on voice commands comprises:
performing semantic recognition by decomposing the voice instruction: analyzing whether the user's utterance belongs to the instruction class or the scene construction class.
Performing voiceprint attribute analysis for the current user comprises:
identifying the voiceprint attributes of the current user: which voiceprint users have appeared at the same time.
The voiceprint emotion characteristic analysis covers: in what scene each voiceprint appears, what each person's voiceprint scene is, and what the combined scene is.
The face recognition analysis covers: who appeared together, with what expressions, and at what time.
The user family scene analysis is performed on the intelligent camera's framing according to preset templates.
The user emotion analysis is performed by combining voiceprints, voiceprint emotion characteristics, facial expressions, and scenes.
The scene history analysis covers: which voiceprint-scene combinations were accompanied by which processing events, when they occurred, and what interactions the user performed afterwards; historical data analysis is used to predict the user's next action and output some preprocessing.
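The user-emotion analysis described above combines voiceprint emotion characteristics, facial expression, and scene context. A minimal sketch of such a fusion is shown below; the emotion labels, scores, and weights are illustrative assumptions, not values specified in the patent:

```python
# Minimal sketch of fusing voiceprint emotion, facial expression, and scene
# context into one user-emotion estimate. Labels, scores, and weights are
# illustrative assumptions only.

EMOTIONS = ("excited", "worried", "calm")

def fuse_emotion(voiceprint_scores, face_scores, scene_bias, weights=(0.5, 0.3, 0.2)):
    """Weighted average of per-source emotion scores; returns the top label."""
    fused = {}
    for emotion in EMOTIONS:
        fused[emotion] = (weights[0] * voiceprint_scores.get(emotion, 0.0)
                          + weights[1] * face_scores.get(emotion, 0.0)
                          + weights[2] * scene_bias.get(emotion, 0.0))
    return max(fused, key=fused.get)

# Example: the voice sounds calm, but the face and a "party" scene template
# both lean toward excitement, so the fused estimate is "excited".
label = fuse_emotion(
    voiceprint_scores={"calm": 0.7, "excited": 0.2},
    face_scores={"excited": 0.8},
    scene_bias={"excited": 0.6},
)  # → "excited"
```

A real system would replace the hand-set scores with outputs of voiceprint and face-expression classifiers, but the fusion step itself can stay this simple.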
In the AI voice-based intelligent interactive processing method, step C comprises:
the smart television creates an attribute record for the user according to the analysis result, uses the user's ID, voiceprint attribute, and face attribute as the user's identification values, and can locate the user by any one of these three attributes;
when an unfamiliar voiceprint or face is detected, a default user attribute record is created, and the voiceprint attribute of the corresponding user is filled in through subsequent intelligent interaction; if the user's ID was first recorded with a voiceprint attribute, the user's face attribute is filled in through subsequent intelligent interaction;
after a user is successfully created, a big-data table based on the user ID is automatically created, recording the user's various behavior records and interaction records;
predictions are made according to the user's behavior habits, with continuous learning and correction according to the user's interactive behavior;
after the AI home intelligent interactive decomposition of the user's voice and image information, the user's pre-execution operation is obtained, or the user's best interaction scene is recommended with a corresponding prompt.
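The user record described in step C, identifiable by user ID, voiceprint attribute, or face attribute, with a default record created for strangers, can be sketched as a small registry. All field and method names here are hypothetical, chosen only to mirror the text above:

```python
# Sketch of the step-C user record: a user is located by ID, voiceprint, or
# face attribute, and an unknown voiceprint/face creates a default record
# that later interactions fill in. Field names are illustrative assumptions.
import itertools

class UserRegistry:
    def __init__(self):
        self._users = {}                  # user_id -> record
        self._ids = itertools.count(1)

    def locate(self, user_id=None, voiceprint=None, face=None):
        """Find a user by any one of the three identifying attributes."""
        for record in self._users.values():
            if (user_id == record["id"]
                    or (voiceprint and voiceprint == record["voiceprint"])
                    or (face and face == record["face"])):
                return record
        return None

    def observe(self, voiceprint=None, face=None):
        """Return the matching user, creating a default record for strangers."""
        record = self.locate(voiceprint=voiceprint, face=face)
        if record is None:
            record = {"id": next(self._ids), "voiceprint": voiceprint,
                      "face": face, "history": []}  # per-user big-data table
            self._users[record["id"]] = record
        else:
            # Subsequent interactions fill in whichever attribute was missing.
            record["voiceprint"] = record["voiceprint"] or voiceprint
            record["face"] = record["face"] or face
        return record
```

For example, a stranger first observed only by voiceprint gets a default record; when the same voiceprint later appears together with a face, the face attribute is added to the same record.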
An AI voice-based intelligent interactive processing system comprises: a processor, a memory, and a communication bus;
the memory stores an AI voice-based intelligent interactive processing program executable by the processor;
the communication bus realizes connection and communication between the processor and the memory;
when the processor executes the AI voice-based intelligent interactive processing program, the following steps are realized:
A. an intelligent camera with a far-field voice module and voiceprint recognition is connected to the smart television in advance, so that the user interacts with the smart television through the camera's far-field voice module;
B. the intelligent camera captures the user's voice and image information in real time and analyzes it using a pre-built AI home intelligent interaction scene database corresponding to user behavior data;
C. the smart television predicts the user's behavior habits and makes a corresponding interactive response according to the analysis result.
In the AI voice-based intelligent interactive processing system, when the processor executes the AI voice-based intelligent interactive processing program, the following steps are also realized:
A1, an AI home intelligent interaction scene database corresponding to user behavior data is pre-built;
when the smart television is turned on, the intelligent camera is in a working state;
the intelligent camera captures the user's voice and image information in real time, intercepts the user's speech, and records it for AI home intelligent interactive processing;
the AI home intelligent interactive processing analyzes the user's voice and image information using the pre-built AI home intelligent interaction scene database corresponding to user behavior data;
predictions are made according to the user's behavior habits, with continuous learning and correction according to the user's interactive behavior.
In the AI voice-based intelligent interactive processing system, when the processor executes the AI voice-based intelligent interactive processing program, the following steps are also realized:
performing semantic recognition and scene construction on voice commands;
performing voiceprint attribute analysis, voiceprint emotion characteristic analysis, face recognition analysis, user family scene analysis, user emotion analysis, and scene history analysis for the current user;
intelligently creating user-system big data and analyzing the user's voice instructions by constructing an AI home intelligent interaction scene;
performing semantic recognition by decomposing the voice instruction: analyzing whether the user's utterance belongs to the instruction class or the scene construction class;
performing voiceprint attribute analysis for the current user, which comprises:
identifying the voiceprint attributes of the current user: which voiceprint users have appeared at the same time;
the voiceprint emotion characteristic analysis covers: in what scene each voiceprint appears, what each person's voiceprint scene is, and what the combined scene is;
the face recognition analysis covers: who appeared together, with what expressions, and at what time;
the user family scene analysis is performed on the intelligent camera's framing according to preset templates;
the user emotion analysis is performed by combining voiceprints, voiceprint emotion characteristics, facial expressions, and scenes;
the scene history analysis covers: which voiceprint-scene combinations were accompanied by which processing events, when they occurred, and what interactions the user performed afterwards; historical data analysis is used to predict the user's next action and output some preprocessing;
the smart television creates an attribute record for the user according to the analysis result, uses the user's ID, voiceprint attribute, and face attribute as the user's identification values, and can locate the user by any one of these three attributes;
when an unfamiliar voiceprint or face is detected, a default user attribute record is created, and the voiceprint attribute of the corresponding user is filled in through subsequent intelligent interaction; if the user's ID was first recorded with a voiceprint attribute, the user's face attribute is filled in through subsequent intelligent interaction;
after a user is successfully created, a big-data table based on the user ID is automatically created, recording the user's various behavior records and interaction records;
predictions are made according to the user's behavior habits, with continuous learning and correction according to the user's interactive behavior;
after the AI home intelligent interactive decomposition of the user's voice and image information, the user's pre-execution operation is obtained, or the user's best interaction scene is recommended with a corresponding prompt.
A storage medium stores one or more programs executable by one or more processors to implement the steps of any one of the above AI voice-based intelligent interactive processing methods.
Compared with the prior art, in the AI voice-based intelligent interactive processing method, system, and storage medium provided by the invention, the smart television carries an intelligent camera with a far-field voice module and voiceprint recognition; the user interacts with the television through the camera's far-field voice, and every sentence of the user's voice interaction is analyzed by the AI home intelligent interaction system block. The analyzed content includes: semantic recognition of voice commands (voice command decomposition, which splits commands into a definite instruction class and a scene construction class; new field classifications can be added as the analysis system improves); the current user's voiceprint attributes (voiceprint recognition of gender and age, voiceprint emotion characteristics such as excitement, worry, or calm, face recognition of user and expression attributes, and user-system association); user family scene analysis (one person, multiple people, personnel combinations, and family scenes such as a party, dinner, or leisure, analyzed from the intelligent camera's framing according to preset templates); user emotion analysis (combining voiceprint, voiceprint emotion characteristics, facial expression, and scene); and scene history analysis (which voiceprint-scene combinations were accompanied by which processing events, what happened, and what interactions the user performed afterwards; historical data analysis is used to predict the user's next action and perform some preprocessing outputs). User-system big data (user ID, user attributes, user interaction records, and user-association records of interactions between users) are created intelligently, and the user's voice instructions are further analyzed by constructing an AI home intelligent interaction scene, which improves the scene construction capability and emotional interaction capability of AI voice. All the data mentioned above are stored in the cloud.
The invention provides a deep emotional interaction experience for smart-home and AI voice intelligent interaction, improves the experience and interest of the product, enhances the intelligent experience of the television-centered smart home, and offers a companion-like home experience. The invention adds a better intelligent interaction function to the smart television and makes it more convenient for users to use.
Drawings
Fig. 1 is a flowchart of an AI speech-based intelligent interactive processing method according to the present invention.
Fig. 2 is a functional block diagram of the AI voice-based intelligent interactive processing system according to a preferred embodiment of the invention.
Detailed Description
To make the objects, technical solutions, and effects of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit it.
Referring to Fig. 1, the AI voice-based intelligent interactive processing method provided by the invention includes the following steps:
S100. An intelligent camera with a far-field voice module and voiceprint recognition is connected to the smart television in advance, so that the user interacts with the smart television through the camera's far-field voice module.
In the embodiment of the invention, the intelligent camera with the far-field voice module and voiceprint recognition must be connected to the smart television in advance, so that the user can interact with the smart television through the camera's far-field voice module. The smart television carries this intelligent camera; the user interacts with the television through the camera's far-field voice, and every sentence of the user's voice interaction is analyzed by the AI (artificial intelligence) home intelligent interaction system block.
Step S100 further includes: A1, pre-building an AI home intelligent interaction scene database corresponding to user behavior data. For example, when the user's speech contains the behavior data "what is fun", the system correspondingly recommends games the user frequently plays or travel items the user likes.
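The scene database of step A1 can be pictured as a mapping from recognized utterance patterns to recommendations, ranked against the user's recorded behavior data. The phrases and recommendation lists below are hypothetical examples, not entries defined by the patent:

```python
# Illustrative sketch of the pre-built AI home interaction scene database:
# a mapping from recognized utterances to recommendations, with items the
# user has interacted with before ranked first. All entries are made up.

SCENE_DB = {
    "what is fun": ["frequently played games", "favorite travel items"],
    "i am bored":  ["recently watched shows"],
}

def recommend(utterance: str, user_history: list) -> list:
    """Look up the utterance in the scene database, preferring candidates
    that already appear in the user's behavior history."""
    candidates = SCENE_DB.get(utterance.strip().lower(), [])
    # sorted() is stable: history matches (key False) come before the rest.
    return sorted(candidates, key=lambda c: c not in user_history)

print(recommend("What is fun", ["favorite travel items"]))
# → ['favorite travel items', 'frequently played games']
```

A production database would live in the cloud and be keyed on semantic intents rather than literal strings, but the lookup-then-rank shape stays the same.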
S200. The intelligent camera captures the user's voice and image information in real time and analyzes it using the pre-built AI home intelligent interaction scene database corresponding to user behavior data.
Step S200 specifically includes:
when the smart television is turned on, the intelligent camera is in a working state;
the intelligent camera captures the user's voice and image information in real time, intercepts the user's speech, and records it for AI home intelligent interactive processing;
the AI home intelligent interactive processing analyzes the user's voice and image information using the pre-built AI home intelligent interaction scene database corresponding to user behavior data;
predictions are made according to the user's behavior habits, with continuous learning and correction according to the user's interactive behavior.
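The capture-and-analyze cycle of step S200 can be sketched as a simple loop. Every object and method here (`camera`, `analyzer`, `tv`, and their calls) is a stand-in stub; the patent does not specify concrete device APIs:

```python
# Hedged sketch of the S200 loop: while the TV is on, the camera streams
# audio and video, speech segments are intercepted and recorded, and each
# segment is analyzed against the scene database before the TV responds.
# All objects here are hypothetical stubs, not a real device API.

def capture_loop(camera, analyzer, tv):
    while tv.is_on():                            # camera works while TV is on
        frame = camera.read_frame()              # real-time image capture
        audio = camera.read_audio_chunk()        # far-field microphone input
        segment = analyzer.detect_speech(audio)  # intercept the user's speech
        if segment is not None:
            analyzer.log(segment)                # record for later learning
            result = analyzer.process(segment, frame)  # scene-DB analysis
            tv.respond(result)                   # interactive response (step C)
```

In practice `detect_speech` would be a voice-activity detector on the far-field audio stream, and `process` would bundle the voiceprint, face, and scene analyses described below.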
Analyzing the user's voice and image information in step S200 using the pre-built AI home intelligent interaction scene database corresponding to user behavior data includes:
performing semantic recognition and scene construction on voice commands;
performing voiceprint attribute analysis, voiceprint emotion characteristic analysis, face recognition analysis, user family scene analysis, user emotion analysis, and scene history analysis for the current user;
intelligently creating user-system big data and analyzing the user's voice instructions by constructing an AI home intelligent interaction scene.
Performing semantic recognition and scene construction on voice commands comprises:
performing semantic recognition by decomposing the voice instruction: analyzing whether the user's utterance belongs to the instruction class or the scene construction class.
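The instruction-vs-scene-construction decomposition above can be sketched as a tiny classifier. A real system would use a trained semantic model; the keyword rule and cue words below are illustrative assumptions only:

```python
# Minimal sketch of the semantic-recognition decomposition: decide whether
# an utterance is an instruction (a direct command) or scene construction
# (context the system should remember). The cue words are made up.

INSTRUCTION_CUES = ("play", "open", "switch", "turn", "search")

def classify_utterance(text: str) -> str:
    words = text.lower().split()
    if any(cue in words for cue in INSTRUCTION_CUES):
        return "instruction"
    return "scene_construction"

print(classify_utterance("Play some music"))        # → instruction
print(classify_utterance("We are having a party"))  # → scene_construction
```

As the document notes, new field classifications could be added as the analysis system improves, which here would simply mean returning more classes.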
the step of performing voiceprint attribute analysis of the current user comprises:
performing voiceprint attribute identification of the current user: which voiceprint users have appeared at the same time;
the voiceprint emotional characteristic analysis comprises the following steps: what the scene of the voiceprint appears, what the voiceprint scene of each person is, what the comprehensive scene is;
the face recognition analysis comprises: who and who appeared at the same time, what the expression was, what the time was;
the user family scene analysis is analyzed according to a preset template through the view finding of the intelligent camera;
the emotion analysis of the user is carried out through voiceprints, voiceprint emotion characteristics, human face expressions and scenes;
the scene history analysis includes: and what processing event happens in what voiceprint scene combination, what interaction happens when the voiceprint scene combination happens and what interaction is carried out by the user after the voiceprint scene combination happens are used for predicting the next action of the user through historical data analysis and outputting some preprocessing.
In step S200, the user interacts with the television through the intelligent camera's far-field voice module, and every sentence of the user's voice interaction is analyzed and processed by the AI home intelligent interaction system. The analyzed content includes: semantic recognition of voice commands (voice command decomposition, which separates commands into a definite instruction class and a scene construction class; new classifications can be added as the analysis system improves); voiceprint attributes of the current user (voiceprint recognition of gender and age, voiceprint emotional characteristics such as excited, sad or calm, face recognition of user attributes and expression attributes, and user system association); user home scene analysis (one person, multiple persons, personnel combinations, and home scenes such as party, dinner or leisure, analyzed from the intelligent camera's framing according to preset templates); emotion analysis of the user (combining voiceprint, voiceprint emotional characteristics, facial expression and scene); and scene history analysis (which voiceprint-scene combinations have occurred, what events happened, when, and what interaction the user performed afterwards; historical data analysis is used to pre-judge the user's next behavior and produce some preprocessing output). User system big data (user ID, user attributes, user interaction records and user association records, i.e. interactions between users) are created intelligently, and the user's voice command is further analyzed by constructing an AI home intelligent interaction scene, improving the scene construction and emotional interaction capabilities of AI voice. All of the above data are stored in the cloud.
S300, the smart television pre-judges the behavior habits of the user and carries out corresponding interactive response according to the analysis processing result.
The step S300 specifically includes:
the smart television creates an attribute record of the user according to the analysis processing result, takes the ID, the voiceprint attribute and the face attribute of the user as the identification value of the user, and locates the user according to any one of the three attributes;
when an unfamiliar voiceprint or face is detected, an attribute record for the user is created by default, and the voiceprint attribute corresponding to that user is added intelligently through subsequent interaction; conversely, if the user ID was first recorded with the voiceprint attribute, the user's face attribute is added intelligently through subsequent interaction;
after the user is successfully created, a big data table based on the user ID is automatically created, and the data table records the user's various behavior records and interaction records;
pre-judging according to the behavior habits of the users, and continuously learning and correcting according to the interaction behaviors of the users;
and after carrying out AI family intelligent interactive decomposition on the voice image information of the user, obtaining the pre-execution operation of the user, or recommending the best interactive scene of the user and carrying out corresponding prompt.
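The identification scheme in the steps above — any one of user ID, voiceprint attribute or face attribute locates the user — can be sketched as a simple lookup. Exact equality matching is an assumption for illustration; a real system would score voiceprint and face similarity.

```python
def locate_user(users, user_id=None, voiceprint=None, face=None):
    """Locate a user record by any one of the three identification values
    (ID, voiceprint attribute, face attribute), as the method specifies.
    Matching is exact equality here, purely for illustration."""
    for u in users:
        if user_id is not None and u["id"] == user_id:
            return u
        if voiceprint is not None and u.get("voiceprint") == voiceprint:
            return u
        if face is not None and u.get("face") == face:
            return u
    return None  # unfamiliar user: caller creates a new attribute record
```

When the lookup returns None, the method's default behavior applies: a new attribute record is created and the missing voiceprint or face attribute is filled in through later interactions.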
For example: user A and user B issue the instruction [what shall we do today] to the camera. The AI home intelligent interaction system analyzes whether users A and B have appeared in front of the television together before. If they have, the system brings up interactive memories of things they have done together and gives current opinions and recommendations according to the present home scene. These opinions and recommendations are diverse and may draw on application data in the television (such as watching TV, playing games or cooking), shopping data (new-product recommendations, shopping discounts), travel data (trip recommendations) and other operational data. The recommendations are pre-judged from the users' behaviors and continuously learned and corrected from the users' interaction behaviors, so that the AI home intelligent interaction system comes ever closer to the users' habits and preferences.
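The A + B example can be sketched as a small recommendation routine: shared memories of the present user combination are surfaced first, followed by suggestions for the current home scene. The suggestion table and record format are illustrative assumptions, not part of the patent.

```python
def recommend(users_present, history, home_scene):
    """Sketch of the example above: if the present users have appeared in
    front of the television together before, surface their shared memories
    first, then add suggestions for the current home scene. The suggestion
    table and record format are illustrative assumptions."""
    key = frozenset(users_present)
    shared = [rec for rec in history if frozenset(rec["users"]) == key]
    scene_suggestions = {
        "party": ["play games", "shopping discounts"],
        "leisure": ["watch TV", "travel recommendations"],
    }
    recs = [f"replay: {rec['activity']}" for rec in shared]
    return recs + scene_suggestions.get(home_scene, [])
```

The continuous learning-and-correction loop described in the text would then adjust the suggestion table from how the users react to these recommendations.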
The invention is described in further detail below by way of a specific application example:
s11, the smart television is provided with an intelligent camera with a far-field voice module for voiceprint recognition.
S12, when the intelligent television is turned on, the intelligent camera is in a working state.
And S13, the intelligent camera monitors the user 'S speaking and transmits the user' S speaking record to the AI home intelligent interactive system.
S14, the AI home intelligent interaction system analyzes and processes the user's speech. The analyzed content includes semantic recognition of voice commands (voice command decomposition): analyzing whether the user's speech belongs to the instruction class (the user's intention is completely clear and the command can be executed without scene analysis, for example "I want to watch a Liu Dehua movie", "play a nice song" or "I want to eat braised pork") or the scene construction class (for example "the weather is so hot", "what are you doing now", "I'm bored" or "what shall we eat at noon").
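A toy version of this two-way decomposition can be written as a prefix match against cue phrases. The cue list is an illustrative assumption; a real system would use full semantic recognition rather than string matching.

```python
def classify_speech(text):
    """Toy classifier separating the two classes named above. The cue
    phrases are illustrative assumptions; a real system would use full
    semantic recognition rather than prefix matching."""
    instruction_cues = ("i want to watch", "i want to eat", "play", "listen to")
    t = text.lower()
    if any(t.startswith(cue) for cue in instruction_cues):
        return "instruction"        # intention clear, execute without scene analysis
    return "scene_construction"     # needs scene analysis before responding
```

Utterances falling into the scene construction class are the ones routed through the home scene, emotion and history analyses described in the following steps.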
Voiceprint attributes of the current user (voiceprint recognition: gender, age, etc.): which voiceprint users have appeared at the same time.
Voiceprint emotional characteristics (excited, sad, calm, etc.): in what scene the voiceprint appears, what each person's voiceprint scene is, and what the comprehensive scene is (the scene is analyzed from the voiceprints; default definitions: excited, warm, happy, heated, etc.).
Face recognition (user attributes, expression attributes, user system association): who appeared together with whom, what their expressions were, and what the time was.
User home scene analysis (one person, multiple persons, personnel combination, home scene such as party, dinner party or leisure, analyzed from the intelligent camera's framing according to the preset templates).
Emotion analysis of the user (through voiceprint + voiceprint emotional characteristics + facial expression + scene).
Scene history analysis (which voiceprint-scene combinations have occurred, what processing events occurred, when they occurred, and what interaction the user performed after the occurrence; the user's next behavior is pre-judged through historical data analysis and some preprocessing is output).
The voice instruction of the user is further analyzed and processed by constructing an AI family intelligent interaction scene, and the scene construction capability and the emotion interaction capability of AI voice are improved.
And S15, when the intelligent camera detects the voice data of the user and transmits the voice data to the AI home intelligent interactive system, the AI home intelligent interactive system creates an attribute record of the user, takes the ID, the voiceprint attribute and the face attribute of the user as the identification values of the user, and can locate the user through any one of the three attributes.
And S16, when the AI home intelligent interaction system detects an unfamiliar voiceprint or face, it creates an attribute record for the user by default, and the voiceprint attribute corresponding to that user is added intelligently through subsequent interaction. Conversely, if the user ID was first recorded with the voiceprint attribute, the user's face attribute is added intelligently through subsequent interaction.
And S17, after the user is successfully created, a big data table based on the user ID is automatically created. The data table records the user's various behavior records, interaction records and the like (including the history of instructions issued by the user and records of their execution, the user's subsequent interactions with those instructions, and so on; the user's basic data include, but are not limited to, the data records listed in items 6, 7, 8, 9, 10 and 11).
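The per-user table of step S17 can be sketched with a relational store. SQLite and this schema are illustrative assumptions; the text only says the records are kept in a big data table in the cloud, keyed by user ID.

```python
import sqlite3

def create_user_table(conn, user_id):
    """Create (if needed) a record store keyed by user ID and log the
    creation event. SQLite and this schema are illustrative assumptions;
    the patent only says records are kept in a big data table in the cloud."""
    conn.execute("""
        CREATE TABLE IF NOT EXISTS user_records (
            user_id TEXT NOT NULL,            -- identification value of the user
            ts      TEXT DEFAULT CURRENT_TIMESTAMP,
            kind    TEXT NOT NULL,            -- behavior / instruction / interaction
            detail  TEXT
        )""")
    conn.execute(
        "INSERT INTO user_records (user_id, kind, detail) VALUES (?, ?, ?)",
        (user_id, "created", ""),
    )
    conn.commit()
```

Later steps (instruction history, execution records, follow-up interactions) would append further rows under the same user ID, which is what makes the scene history analysis possible.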
And S18, decomposing the voice sent by the user through the AI family intelligent interactive system to obtain the pre-execution operation of the user or recommend the best interactive scene of the user.
Such as: user A and user B issue the instruction [what shall we do today] to the camera. The AI home intelligent interaction system analyzes whether users A and B have appeared in front of the television together before. If they have, the system brings up interactive memories of things they have done together and gives current opinions and recommendations according to the present home scene. These opinions and recommendations are diverse and may draw on application data in the television (such as watching TV, playing games or cooking), shopping data (new-product recommendations, shopping discounts), travel data (trip recommendations) and other operational data. The recommendations are pre-judged from the users' behaviors and continuously learned and corrected from the users' interaction behaviors, so that the AI home intelligent interaction system comes ever closer to the users' habits and preferences.
Therefore, the present invention provides an AI voice-based intelligent interactive processing method and system that facilitate intelligent recognition and interactive recommendation, adding a better intelligent interaction function to the smart television and making it more convenient for users.
As shown in fig. 2, based on the above AI voice-based intelligent interactive processing method, the present invention further provides an AI voice-based intelligent interactive processing system, which may be a smart television, a desktop computer, a notebook computer, a palmtop computer, or another intelligent device with a smart speaker. The AI voice-based intelligent interactive processing system comprises a processor 10, a memory 20 and a display screen 30, where the processor 10 is connected with the memory 20 through a communication bus 50, and the display screen 30 is connected with the processor 10 through the communication bus 50. FIG. 2 shows only some of the components of the AI voice-based intelligent interactive processing system, but it is to be understood that not all of the shown components are required, and more or fewer components may alternatively be implemented.
The memory 20 may be an internal storage unit of the AI voice-based intelligent interactive processing system in some embodiments, for example, a memory of the AI voice-based intelligent interactive processing system. The memory 20 may also be an external storage device of the AI voice-based intelligent interactive processing system in other embodiments, such as a plug-in USB flash drive, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the AI voice-based intelligent interactive processing system. Further, the memory 20 may include both an internal storage unit and an external storage device of the AI voice-based intelligent interactive processing system. The memory 20 is used for storing application software installed in the AI voice-based intelligent interactive processing system and various types of data, such as the program code of the AI voice-based intelligent interactive processing system. The memory 20 may also be used to temporarily store data that has been output or is to be output. In an embodiment, the memory 20 stores an AI voice-based intelligent interactive processing method program 40, and the AI voice-based intelligent interactive processing method program 40 can be executed by the processor 10 to implement the AI voice-based intelligent interactive processing method of the present application.
The processor 10 may be, in some embodiments, a Central Processing Unit (CPU), a microprocessor, a mobile phone baseband processor or other data Processing chip, and is configured to run program codes stored in the memory 20 or process data, such as executing the AI voice-based intelligent interactive Processing method.
The display 30 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch panel, or the like in some embodiments. The display screen 30 is used for displaying information in the AI voice-based intelligent interactive processing system and for displaying a visual user interface.
In one embodiment, when the processor 10 executes the AI voice-based intelligent interactive processing method program 40 in the memory 20, the following steps are implemented:
A. the method comprises the following steps that an intelligent camera with far-field voice module voiceprint recognition is connected and arranged on the intelligent television in advance and used for interacting with the intelligent television through the far-field voice module of the intelligent camera;
B. the intelligent camera shoots and acquires the voice image information of the user in real time, and analyzes and processes the voice image information of the user by utilizing an AI family intelligent interaction scene database which is constructed in advance and corresponds to the user behavior data;
C. and the smart television pre-judges the behavior habits of the user and performs corresponding interactive response according to the analysis processing result, which is specifically described above.
When the processor executes the AI-voice-based intelligent interactive processing program, the following steps are also realized:
a1, an AI family intelligent interaction scene database corresponding to the user behavior data is pre-constructed;
when the intelligent television is started, the intelligent camera is in a working state;
the intelligent camera shoots and acquires voice image information of a user in real time, intercepts the speaking voice of the user and records the speaking voice of the user for AI family intelligent interactive processing;
the AI family intelligent interaction processing utilizes an AI family intelligent interaction scene database which is constructed in advance and corresponds to the user behavior data to analyze and process the voice image information of the user;
and prejudging according to the behavior habits of the users, and continuously learning and correcting according to the interactive behaviors of the users.
When the processor executes the AI-voice-based intelligent interactive processing program, the following steps are also realized:
performing semantic recognition and scene construction of voice commands;
performing voiceprint attribute analysis, voiceprint emotion characteristic analysis, face recognition analysis, user family scene analysis, user emotion analysis and scene history analysis of the current user;
intelligently creating user system big data, and analyzing and processing a voice instruction of a user by constructing an AI home intelligent interaction scene;
performing semantic recognition of voice instruction decomposition: analyzing whether the speaking of the user belongs to an instruction class or a scene construction class;
the step of performing voiceprint attribute analysis of the current user comprises:
performing voiceprint attribute identification of the current user: which voiceprint users have appeared at the same time;
the voiceprint emotional characteristic analysis comprises the following steps: what the scene of the voiceprint appears, what the voiceprint scene of each person is, what the comprehensive scene is;
the face recognition analysis comprises: who and who appeared at the same time, what the expression was, what the time was;
the user family scene analysis is analyzed according to a preset template through the view finding of the intelligent camera;
the emotion analysis of the user is carried out through voiceprints, voiceprint emotion characteristics, human face expressions and scenes;
the scene history analysis includes: which voiceprint scene combinations have occurred with what processing event, when, what interaction the user has performed after the occurrence, for predicting the next action of the user through historical data analysis, and outputting some preprocessing;
the smart television creates an attribute record of the user according to the analysis processing result, takes the ID, the voiceprint attribute and the face attribute of the user as the identification value of the user, and locates the user according to any one of the three attributes;
when an unfamiliar voiceprint or face is detected, an attribute record for the user is created by default, and the voiceprint attribute corresponding to that user is added intelligently through subsequent interaction; conversely, if the user ID was first recorded with the voiceprint attribute, the user's face attribute is added intelligently through subsequent interaction;
after the user is successfully created, a big data table based on the user ID is automatically created, and the data table records the user's various behavior records and interaction records;
pre-judging according to the behavior habits of the users, and continuously learning and correcting according to the interaction behaviors of the users;
after performing AI home intelligent interactive decomposition on the voice image information of the user, obtaining a pre-execution operation of the user, or recommending the best interactive scene of the user and performing corresponding prompting, as described above.
Based on the foregoing embodiments, the present invention further provides a computer-readable storage medium, where one or more programs are stored, and the one or more programs are executable by one or more processors to implement the steps in the AI voice-based intelligent interactive processing method according to any item above, specifically as described above.
In summary, in the AI voice-based intelligent interactive processing method, system and storage medium provided by the present invention, an intelligent camera with a far-field voice module and voiceprint recognition is mounted on the smart television, and the user interacts with the television through the camera's far-field voice module. Every sentence of the user's voice interaction is analyzed and processed by the AI home intelligent interaction system. The analyzed content includes: semantic recognition of voice commands (voice command decomposition, separating commands into a definite instruction class and a scene construction class; new classifications can be added as the analysis system improves); voiceprint attributes of the current user (voiceprint recognition of gender and age, voiceprint emotional characteristics such as excited, sad or calm, face recognition of user attributes and expression attributes, and user system association); user home scene analysis (one person, multiple persons, personnel combinations, and home scenes such as party, dinner or leisure, analyzed from the intelligent camera's framing according to preset templates); emotion analysis of the user (combining voiceprint, voiceprint emotional characteristics, facial expression and scene); and scene history analysis (which voiceprint-scene combinations have occurred, what events happened, and what interaction the user performed afterwards; historical data analysis is used to pre-judge the user's next behavior and produce some preprocessing output). User system big data (user ID, user attributes, user interaction records and user association records, i.e. interactions between users) are created intelligently, and the user's voice command is further analyzed by constructing an AI home intelligent interaction scene, thereby
improving the scene construction and emotional interaction capabilities of AI voice. All of the above data are stored in the cloud.
The invention provides a deep emotional interaction experience for smart home and AI voice intelligent interaction, enhances the experience and appeal of the product, improves the intelligent experience of the television-centered household smart home, and provides a companionship-style home experience. The invention adds a better intelligent interaction function to the smart television and makes it more convenient for users.
Of course, it will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program instructing relevant hardware (such as a processor, a controller, etc.), and the program may be stored in a computer readable storage medium, and when executed, the program may include the processes of the above method embodiments. The storage medium may be a memory, a magnetic disk, an optical disk, etc.
It is to be understood that the invention is not limited to the examples described above, but that modifications and variations may be effected thereto by those of ordinary skill in the art in light of the foregoing description, and that all such modifications and variations are intended to be within the scope of the invention as defined by the appended claims.

Claims (4)

1. An intelligent interactive processing method based on AI voice is characterized by comprising the following steps:
A. the method comprises the following steps that an intelligent camera with far-field voice module voiceprint recognition is connected and arranged on the intelligent television in advance and used for interacting with the intelligent television through the far-field voice module of the intelligent camera;
B. the intelligent camera shoots and acquires the voice image information of the user in real time, and analyzes and processes the voice image information of the user by utilizing an AI family intelligent interaction scene database which is constructed in advance and corresponds to the user behavior data;
the step B comprises the following steps:
when the intelligent television is started, the intelligent camera is in a working state;
the intelligent camera shoots and acquires voice image information of a user in real time, intercepts the speaking voice of the user and records the speaking voice of the user for AI family intelligent interactive processing;
the AI family intelligent interaction processing utilizes an AI family intelligent interaction scene database which is constructed in advance and corresponds to the user behavior data to analyze and process the voice image information of the user;
pre-judging according to the behavior habits of the users, and continuously learning and correcting according to the interaction behaviors of the users;
the step B of analyzing and processing the voice image information of the user by utilizing an AI family intelligent interactive scene database which is pre-constructed and corresponds to the user behavior data comprises the following steps:
performing semantic recognition and scene construction of voice commands;
performing voiceprint attribute analysis, voiceprint emotion characteristic analysis, face recognition analysis, user family scene analysis, user emotion analysis and scene history analysis of the current user;
intelligently creating user system big data, and analyzing and processing a voice instruction of a user by constructing an AI home intelligent interaction scene;
the step of performing semantic recognition and scene construction of the voice instruction comprises the following steps:
performing semantic recognition of voice instruction decomposition: analyzing whether the speaking of the user belongs to an instruction class or a scene construction class;
the step of performing voiceprint attribute analysis of the current user comprises:
performing voiceprint attribute identification of the current user: which voiceprint users have appeared at the same time;
the voiceprint emotional characteristic analysis comprises the following steps: what the scene of the voiceprint appears, what the voiceprint scene of each person is, what the comprehensive scene is;
the face recognition analysis comprises: who and who appeared at the same time, what the expression was, what the time was;
the user family scene analysis is analyzed according to a preset template through the view finding of the intelligent camera;
the emotion analysis of the user is carried out through voiceprints, voiceprint emotion characteristics, human face expressions and scenes;
the scene history analysis includes: which voiceprint scene combinations have occurred with what processing event, when, what interaction the user has performed after the occurrence, for predicting the next action of the user through historical data analysis, and outputting some preprocessing;
C. the intelligent television pre-judges the behavior habits of the user and carries out corresponding interactive response according to the analysis and processing result;
the step C comprises the following steps:
the smart television creates an attribute record of the user according to the analysis processing result, takes the ID, the voiceprint attribute and the face attribute of the user as the identification value of the user, and locates the user according to any one of the three attributes;
when an unfamiliar voiceprint or face is detected, an attribute record for the user is created by default, and the voiceprint attribute corresponding to that user is added intelligently through subsequent interaction; conversely, if the user ID was first recorded with the voiceprint attribute, the user's face attribute is added intelligently through subsequent interaction;
after the user is successfully created, a big data table based on the user ID is automatically created, and the data table records the user's various behavior records and interaction records;
pre-judging according to the behavior habits of the users, and continuously learning and correcting according to the interaction behaviors of the users;
and after carrying out AI family intelligent interactive decomposition on the voice image information of the user, obtaining the pre-execution operation of the user, or recommending the best interactive scene of the user and carrying out corresponding prompt.
2. The AI voice-based intelligent interactive processing method according to claim 1, wherein the step a further comprises: and A1, constructing an AI family intelligent interaction scene database corresponding to the user behavior data in advance.
3. An intelligent interactive processing system based on AI speech, comprising:
a processor, a memory, and a communication bus;
the memory stores an AI voice based intelligent interactive processing program executable by the processor;
the communication bus realizes connection communication between the processor and the memory;
when the processor executes the AI voice-based intelligent interactive processing program, the following steps are realized:
A. the method comprises the following steps that an intelligent camera with far-field voice module voiceprint recognition is connected and arranged on the intelligent television in advance and used for interacting with the intelligent television through the far-field voice module of the intelligent camera;
B. the intelligent camera shoots and acquires the voice image information of the user in real time, and analyzes and processes the voice image information of the user by utilizing an AI family intelligent interaction scene database which is constructed in advance and corresponds to the user behavior data;
C. the intelligent television pre-judges the behavior habits of the user and carries out corresponding interactive response according to the analysis and processing result;
when the processor executes the AI voice-based intelligent interaction processing program, the following steps are also realized:
a1, an AI family intelligent interaction scene database corresponding to the user behavior data is pre-constructed;
when the intelligent television is started, the intelligent camera is in a working state;
the intelligent camera shoots and acquires voice image information of a user in real time, intercepts the speaking voice of the user and records the speaking voice of the user for AI family intelligent interactive processing;
the AI family intelligent interaction processing utilizes an AI family intelligent interaction scene database which is constructed in advance and corresponds to the user behavior data to analyze and process the voice image information of the user;
pre-judging according to the behavior habits of the users, and continuously learning and correcting according to the interaction behaviors of the users;
when the processor executes the AI voice-based intelligent interaction processing program, the following steps are also realized:
performing semantic recognition and scene construction of voice commands;
performing voiceprint attribute analysis, voiceprint emotion characteristic analysis, face recognition analysis, user family scene analysis, user emotion analysis and scene history analysis of the current user;
intelligently creating user system big data, and analyzing and processing a voice instruction of a user by constructing an AI home intelligent interaction scene;
performing semantic recognition of voice instruction decomposition: analyzing whether the speaking of the user belongs to an instruction class or a scene construction class;
the step of performing voiceprint attribute analysis of the current user comprises:
performing voiceprint attribute identification of the current user: which voiceprint users have appeared at the same time;
the voiceprint emotional characteristic analysis comprises the following steps: what the scene of the voiceprint appears, what the voiceprint scene of each person is, what the comprehensive scene is;
the face recognition analysis comprises: who and who appeared at the same time, what the expression was, what the time was;
the user family scene analysis is analyzed according to a preset template through the view finding of the intelligent camera;
the emotion analysis of the user is carried out through voiceprints, voiceprint emotion characteristics, human face expressions and scenes;
the scene history analysis includes: which voiceprint scene combinations have occurred with what processing event, when, what interaction the user has performed after the occurrence, for predicting the next action of the user through historical data analysis, and outputting some preprocessing;
the smart television creates an attribute record for the user according to the analysis result, takes the user's ID, voiceprint attribute and face attribute as the user's identification values, and locates the user by any one of these three attributes;
when an unfamiliar voiceprint or face is detected, a user attribute record is created by default, and the voiceprint attribute of the user corresponding to that voiceprint is supplemented intelligently through subsequent interactions; if the user ID was first recorded with the voiceprint attribute, the user's face attribute is supplemented intelligently through subsequent interactions;
after a user is successfully created, a big data table based on the user ID is automatically created, wherein the data table records the user's various behavior records and interaction records;
pre-judging is performed according to the user's behavior habits, with continuous learning and correction according to the user's interaction behaviors;
and after performing AI home intelligent interaction decomposition of the user's voice and image information, the user's pre-execution operation is obtained, or the user's best interaction scene is recommended with a corresponding prompt.
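The user attribute record described in the claim (one record per user, identifiable by any of ID, voiceprint attribute, or face attribute, with a default record created on an unfamiliar voiceprint or face and the missing attribute supplemented by later interactions) can be sketched as follows. This is an illustrative sketch only: all class, method and field names (`UserRecord`, `UserRegistry`, `observe`, `locate`) are assumptions, not identifiers from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class UserRecord:
    # The three identification values named in the claim; any one locates the user.
    user_id: str
    voiceprint: Optional[str] = None   # voiceprint attribute (e.g. an embedding key)
    face: Optional[str] = None         # face attribute
    history: List[dict] = field(default_factory=list)  # per-user "big data table"

class UserRegistry:
    def __init__(self) -> None:
        self._users: Dict[str, UserRecord] = {}
        self._by_voiceprint: Dict[str, str] = {}
        self._by_face: Dict[str, str] = {}

    def locate(self, user_id: Optional[str] = None,
               voiceprint: Optional[str] = None,
               face: Optional[str] = None) -> Optional[UserRecord]:
        """Locate a user by any one of the three identification values."""
        if user_id in self._users:
            return self._users[user_id]
        if voiceprint in self._by_voiceprint:
            return self._users[self._by_voiceprint[voiceprint]]
        if face in self._by_face:
            return self._users[self._by_face[face]]
        return None

    def observe(self, voiceprint: Optional[str] = None,
                face: Optional[str] = None) -> UserRecord:
        """On an unfamiliar voiceprint or face, create a default record;
        attributes missing at creation are supplemented by later observations."""
        user = self.locate(voiceprint=voiceprint, face=face)
        if user is None:
            user = UserRecord(user_id=f"user-{len(self._users) + 1}")
            self._users[user.user_id] = user
        # Supplement whichever identification value this interaction provides.
        if voiceprint and user.voiceprint is None:
            user.voiceprint = voiceprint
            self._by_voiceprint[voiceprint] = user.user_id
        if face and user.face is None:
            user.face = face
            self._by_face[face] = user.user_id
        return user
```

For example, a first observation carrying only a voiceprint creates a default record; a later observation carrying the same voiceprint plus a face resolves to the same record and fills in the face attribute, matching the supplement-through-subsequent-interaction behavior the claim describes.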
4. A storage medium, characterized in that the computer-readable storage medium stores one or more programs which are executable by one or more processors to implement the steps of the AI voice-based intelligent interactive processing method according to any one of claims 1-2.
CN201910239885.3A 2019-03-27 2019-03-27 AI voice-based intelligent interactive processing method, system and storage medium Active CN110113646B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910239885.3A CN110113646B (en) 2019-03-27 2019-03-27 AI voice-based intelligent interactive processing method, system and storage medium

Publications (2)

Publication Number Publication Date
CN110113646A CN110113646A (en) 2019-08-09
CN110113646B true CN110113646B (en) 2021-09-21

Family

ID=67484676

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910239885.3A Active CN110113646B (en) 2019-03-27 2019-03-27 AI voice-based intelligent interactive processing method, system and storage medium

Country Status (1)

Country Link
CN (1) CN110113646B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110750773B (en) * 2019-09-16 2023-08-18 康佳集团股份有限公司 Image recognition method based on voiceprint attribute, intelligent terminal and storage medium
CN110931011A (en) * 2020-01-07 2020-03-27 杭州凯旗科技有限公司 AI intelligent voice interaction method applied to intelligent retail equipment
CN111326158A (en) * 2020-01-23 2020-06-23 深圳市安顺康医疗电子有限公司 Voice control method based on intelligent terminal
CN111324202A (en) * 2020-02-19 2020-06-23 中国第一汽车股份有限公司 Interaction method, device, equipment and storage medium
CN111901672A (en) * 2020-06-12 2020-11-06 深圳市京华信息技术有限公司 Artificial intelligence image processing method
CN111967380A (en) * 2020-08-16 2020-11-20 云知声智能科技股份有限公司 Content recommendation method and system
CN112203144A (en) * 2020-10-12 2021-01-08 广州欢网科技有限责任公司 Intelligent television program recommendation method and device and intelligent television
CN112261289B (en) * 2020-10-16 2022-08-26 海信视像科技股份有限公司 Display device and AI algorithm result acquisition method
CN112383748B (en) * 2020-11-02 2023-05-02 中国联合网络通信集团有限公司 Video information storage method and device
CN112397061B (en) * 2020-11-04 2023-10-27 中国平安人寿保险股份有限公司 Online interaction method, device, equipment and storage medium
CN112651334B (en) * 2020-12-25 2023-05-23 三星电子(中国)研发中心 Robot video interaction method and system
CN115689810B (en) * 2023-01-04 2023-04-04 深圳市人马互动科技有限公司 Data processing method based on man-machine conversation and related device
CN116453549A (en) * 2023-05-05 2023-07-18 广西牧哲科技有限公司 AI dialogue method based on virtual digital character and online virtual digital system
CN116913277B (en) * 2023-09-06 2023-11-21 北京惠朗时代科技有限公司 Voice interaction service system based on artificial intelligence

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104038836A (en) * 2014-06-03 2014-09-10 四川长虹电器股份有限公司 Television program intelligent pushing method
CN106682090A (en) * 2016-11-29 2017-05-17 上海智臻智能网络科技股份有限公司 Active interaction implementing device, active interaction implementing method and intelligent voice interaction equipment

Also Published As

Publication number Publication date
CN110113646A (en) 2019-08-09

Similar Documents

Publication Publication Date Title
CN110113646B (en) AI voice-based intelligent interactive processing method, system and storage medium
US10762299B1 (en) Conversational understanding
CN107632706B (en) Application data processing method and system of multi-modal virtual human
US10360265B1 (en) Using a voice communications device to answer unstructured questions
CN107481720B (en) Explicit voiceprint recognition method and device
US11132547B2 (en) Emotion recognition-based artwork recommendation method and device, medium, and electronic apparatus
CN107704169B (en) Virtual human state management method and system
CN112997171A (en) Analyzing web pages to facilitate automated navigation
CN108353103A (en) Subscriber terminal equipment and its method for recommendation response message
US20130143185A1 (en) Determining user emotional state
US10719695B2 (en) Method for pushing picture, mobile terminal, and storage medium
WO2021056837A1 (en) Customization platform and method for service quality evaluation product
US11392213B2 (en) Selective detection of visual cues for automated assistants
US20220234593A1 (en) Interaction method and apparatus for intelligent cockpit, device, and medium
US20180272240A1 (en) Modular interaction device for toys and other devices
Lopatovska et al. User recommendations for intelligent personal assistants
WO2017157174A1 (en) Information processing method, device, and terminal device
CN111797249A (en) Content pushing method, device and equipment
CN112867985A (en) Determining whether to automatically resume a first automated assistant session after interrupting suspension of a second session
CN111797304A (en) Content pushing method, device and equipment
CN109325173B (en) Reading content personalized recommendation method and system based on AI open platform
KR20220155601A (en) Voice-based selection of augmented reality content for detected objects
CN113703585A (en) Interaction method, interaction device, electronic equipment and storage medium
US20230376328A1 (en) Personalized user interface
CN112135170A (en) Display device, server and video recommendation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant