WO2019056267A1

WO2019056267A1 - Hierarchical interactive decision making method, interactive terminal, and cloud server

Info

Publication number: WO2019056267A1
Application number: PCT/CN2017/102746
Authority: WO
Inventors: 廉士国; 刘兆祥; 王宁
Original assignee: 达闼科技（北京）有限公司
Priority date: 2017-09-21
Filing date: 2017-09-21
Publication date: 2019-03-28
Also published as: CN107820619A; CN107820619B

Abstract

A hierarchical interactive decision making method comprises the following steps: obtaining information about a target object (101); identifying a feature of the target object (102); obtaining a corresponding attribute grade according to the feature of the target object, and prioritizing the attribute grade (103); performing, according to a priority sequence of the attribute grade, a grade-by-grade attribute determination on the feature of the target object (104); if the feature of the target object satisfies a grading standard of the current attribute grade, migrating the feature of the target object to a higher attribute grade (105); and if the feature of the target object does not satisfy a grading standard of the current attribute grade, outputting a determination result of the current attribute grade and an attribute grade of a lower priority as a basis for making an interactive decision (106).

Description

Hierarchical interactive decision making method, interactive terminal and cloud server

Technical field

The present application relates to the field of robot interaction, and specifically relates to a hierarchical interaction decision method, an interactive terminal, and a cloud server.

Background art

With the development of network transmission and big data technology and the improvement of hardware processing capabilities, more and more robots have entered family life. Current human-computer interaction is basically question and answer: a person asks and the machine replies. Although the answering methods are varied and increasingly intelligent, most of them passively receive the user's question, screen candidate information, and respond directly to the user. The screened pieces of information lack correlation with one another.

With the emergence and popularity of smart devices, interaction between smart devices and people has become more and more frequent, and the demand for a natural human-computer interaction experience has grown accordingly. Examples include the interaction between a smart guide device and a blind person, or between a welcome robot and a guest.

For example, in a blind-guidance scene, if the guide device detects a person, image analysis can determine further characteristics of that person. Depending on which characteristics are recognized, the blind person is given a voice reminder; for example, when the name or gender is recognized, the voice prompt is “Your friend Xiao Ming is in front of you” or “There is a woman in front of you”. In the welcome robot scene, the welcome robot recognizes guests by machine vision and greets them by voice, for example, “Hello! Dear VIP customer Zhang Xiaoming” or “Hello, Ms!”. This kind of interaction brings a user-friendly experience and improves service quality.

However, a prior-art interactive terminal interacts directly on whatever object features machine vision recognizes, so in some scenes the interaction may rest on erroneous recognition results. For example, because of lighting, angle, or occlusion, the machine may be unable to tell male from female, or unable to detect a person's expression or age. If the interacting user is female and the interactive terminal says "Hello, sir!", the gender recognition error spoils the interaction.

As mentioned above, how to use machine intelligence to achieve a compromise between user-friendly experience and reliability is an urgent problem to be solved.

Therefore, the prior art robot interaction technology has yet to be improved.

Summary of the invention

The present application provides a hierarchical interaction decision method, an interactive terminal, and a cloud server. The attribute gradings of the target object are set in advance and prioritized to form a multi-level hierarchical neural network. Attribute judgment is performed step by step on the target object features recognized by machine vision and robot semantic understanding, and the interaction decision basis for the current target object information is output. The more detailed attributes of persons and objects are identified as far as possible, which makes the interactive terminal more intelligent and flexible and improves the user experience.

To solve the above technical problem, the present application provides the following technical solutions.

In a first aspect, an embodiment of the present application provides a hierarchical interaction decision method, including the following steps:

Obtaining target object information and identifying the target object feature;

Obtaining a corresponding attribute ranking according to the target object feature, and prioritizing the attribute ranking;

Performing a stepwise attribute determination on the target object feature according to the priority order of the attribute ranking, and migrating the target object feature to the attribute grading of higher priority when it satisfies the grading standard of the current attribute grading;

When the target object feature does not satisfy the grading standard of the current attribute grading, outputting the judgment results of the current attribute grading and of the attribute gradings of lower priority as the basis of the interactive decision.

In a second aspect, an embodiment of the present application further provides an interaction terminal, including an information acquisition module, an identification module, an attribute module, a determination module, and an output module.

The information acquiring module is configured to acquire target object information, and the identifying module is configured to identify the target object feature;

The attribute module is configured to obtain a corresponding attribute classification according to the target object feature, and prioritize the attribute classification;

The determining module is configured to perform a stepwise attribute determination on the target object feature according to the priority order of the attribute ranking, the target object feature being migrated to the attribute grading of higher priority when it satisfies the grading standard of the current attribute grading;

When the target object feature does not meet the grading standard of the current attribute grading, the output module is configured to output the judgment results of the current attribute grading and of the subordinate attribute gradings as the basis of the interaction decision.

In a third aspect, the embodiment of the present application further provides a cloud server, including a receiving module, an attribute module, a determining module, an output module, and a sending module.

The receiving module is configured to receive a target object feature that is sent by the interaction terminal and is identified according to the acquired target object information;

The judging module is configured to perform a stepwise attribute determination on the target object feature according to the priority order of the attribute ranking, the target object feature being migrated to the attribute grading of higher priority when it satisfies the grading standard of the current attribute grading;

The output module is configured to output, when the target object feature does not meet the grading standard of the current attribute grading, the judgment results of the current attribute grading and of the subordinate attribute gradings as the basis of the interactive decision;

The sending module is configured to send the basis.

In a fourth aspect, the embodiment of the present application further provides an electronic device, including:

At least one processor; and,

a memory communicatively coupled to the at least one processor, a communication component, an audio data collector, and a video data collector; wherein

The memory stores instructions executable by the at least one processor. When executed by the at least one processor, the instructions invoke data of the audio data collector and the video data collector and establish a connection with the cloud server through the communication component, so that the at least one processor is capable of performing the method described above.

In a fifth aspect, the embodiment of the present application further provides a non-transitory computer-readable storage medium storing computer-executable instructions, where the computer-executable instructions are used to cause a computer to execute the method described above.

In a sixth aspect, the embodiment of the present application further provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the method described above.

The beneficial effects of the present application are as follows. The hierarchical interaction decision method, the interaction terminal, and the cloud server provided by the embodiments of the present application set a reasonable attribute grading of the target object in advance and prioritize the attribute gradings to form a multi-level hierarchical neural network. The target object features recognized by machine vision and robot semantic understanding are judged step by step, and the optimal decision basis for the current target object information is output, so that the interactive terminal is more intelligent and flexible and the user experience is improved. In the embodiments, the attributes of the target object, such as a person or an object, are determined as fully as possible, providing more interaction decision basis for actual interactive application scenarios and thereby deepening the user's interactive experience.

Description of the drawings

One or more embodiments are illustrated by way of example in the accompanying drawings. The figures in the drawings do not constitute a scale limitation unless otherwise stated.

FIG. 1 is a system architecture diagram of hierarchical interaction decision provided by an embodiment of the present application;

FIG. 2 is a block diagram of an interaction terminal provided by an embodiment of the present application;

FIG. 3 is a main flowchart of hierarchical interaction decision of an interactive terminal according to an embodiment of the present application;

FIG. 4 is an overall flowchart of hierarchical interaction decision of an interactive terminal according to an embodiment of the present application;

FIG. 5 is a flowchart of step-by-step determination of a face recognition embodiment of an interactive terminal according to an embodiment of the present application;

FIG. 6 is a multi-level hierarchical neural network diagram of an interaction terminal according to an embodiment of the present application;

FIG. 7 is a flowchart of step-by-step determination of a vehicle identification embodiment of an interactive terminal according to an embodiment of the present application;

FIG. 8 is a flowchart of step-by-step judgment of an audio recognition embodiment of an interactive terminal according to an embodiment of the present application;

FIG. 9 is a schematic diagram of a cloud server module according to an embodiment of the present application;

FIG. 10 is a flowchart of a method for implementing hierarchical interaction decision on a cloud server side provided by an embodiment of the present application;

FIG. 11 is a schematic diagram showing the hardware structure of an electronic device according to the hierarchical interaction decision method provided by the embodiment of the present application.

Detailed description

In order to make the objects, technical solutions, and advantages of the present application more comprehensible, the present application will be further described in detail below with reference to the accompanying drawings and embodiments. It is understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

The hierarchical interaction decision method, the interaction terminal, and the cloud server provided by the embodiments of the present application set the attribute gradings of the target object in advance and prioritize them to form a multi-level hierarchical neural network. Attribute judgment is performed step by step on the target object features recognized by machine vision and robot semantic understanding, and the interaction decision basis carrying the most information about the current target object is output, so that the interactive terminal is more intelligent and flexible and the user experience is improved.

The hierarchical interaction decision of the present application includes: acquiring target object information and identifying the target object feature; acquiring the corresponding attribute grading according to the target object feature and prioritizing the attribute grading; performing step-by-step attribute judgment on the target object feature in the priority order of the attribute grading; when the grading standard of the current attribute grading is satisfied, migrating to the attribute grading of higher priority; and when the grading standard of the current attribute grading is not satisfied, outputting the judgment results of the current attribute grading and of the attribute gradings of lower priority as the basis for interactive decision making.

The attribute grading may be prioritized from the primary attribute to the advanced attribute, in which case the progressive attribute determination on the target object feature starts from the primary attribute and proceeds to the advanced attribute. It can be understood that the stepwise judgment may instead proceed from the advanced attribute to the primary attribute, as long as the deepest recognizable result is output as the result of the system, so that the more detailed attributes of persons and objects are identified as far as possible to support deep interaction.
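The two traversal orders can be compared in a minimal sketch. The level names, the threshold values, and both function names below are illustrative assumptions and are not taken from the application; the sketch only shows that either order can yield the same deepest recognizable grade.

```python
from typing import Dict, List, Optional

# Hypothetical grades, primary attribute first, with made-up thresholds.
LEVELS: List[str] = ["person", "gender", "name"]
THRESHOLDS: Dict[str, float] = {"person": 0.5, "gender": 0.7, "name": 0.9}


def deepest_bottom_up(conf: Dict[str, float]) -> Optional[str]:
    """Walk from the primary attribute upward; stop at the first failed grade."""
    passed = None
    for level in LEVELS:
        if conf.get(level, 0.0) < THRESHOLDS[level]:
            break
        passed = level
    return passed


def deepest_top_down(conf: Dict[str, float]) -> Optional[str]:
    """Walk from the most advanced attribute downward; return the first grade
    that passes together with every grade below it."""
    for i in range(len(LEVELS) - 1, -1, -1):
        if all(conf.get(lv, 0.0) >= THRESHOLDS[lv] for lv in LEVELS[: i + 1]):
            return LEVELS[i]
    return None


scores = {"person": 0.95, "gender": 0.8, "name": 0.4}
print(deepest_bottom_up(scores), deepest_top_down(scores))
```

In this sketch the top-down variant still requires every lower grade to pass, which reflects the inclusion relationship between attributes; whether a real system enforces that is a design choice.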

The hierarchical interaction decision method, the interactive terminal, and the cloud server of the present application implement a confidence-priority machine-optimized decision method that achieves optimal recognition of persons and objects and identifies their more detailed attributes as far as possible, thereby realizing friendly interaction between the terminal and the person.

Referring to FIG. 1, in the hierarchical interaction decision system of the present application, each interactive terminal is connected to the cloud server 100. The interactive terminal can be smart glasses 110, a robot 120, a smart terminal 130, a smart helmet 140, or the like.

When working, the interactive terminal collects information about the target object in front of it, such as image information or audio information; hierarchically and adaptively identifies the target object and its features from that information; and, based on the identified target object and features, sends the corresponding interaction information to the person. The user then responds to the interactive terminal, and so on.

In order to ensure recognition accuracy and prevent unnecessary errors, in the present application the attribute gradings are evaluated in priority order when attribute determination is performed step by step on the target object feature, and a confidence analysis is performed on each judgment result. Taking the identification of a person as an example, the attribute gradings are prioritized according to how difficult each attribute is for the interactive terminal to identify: L0 (person) > L1 (gender) > L2 (person name) > L3 (expression). That is, it is first judged whether a person can be identified; if not, the result that no person is present is output; if so, the gender is further recognized, and once the gender is confirmed, the name is recognized in turn.

The difficulty of identifying different attributes can be ranked by the performance of the respective recognition algorithms under the same conditions, such as their recognition rates on the same input image data; for example, identifying a person's name is usually harder than gender recognition, and gender recognition is harder than face detection. Alternatively, the attributes can be ranked by their mutual inclusion relationships; for example, to identify a gender it is necessary first to detect the presence of a face. The judgment of each attribute depends on the corresponding confidence. For example, when a face is detected at L0, the person is considered recognized only if the confidence exceeds W0; when the gender is identified at L1, the gender is considered recognized only if the confidence exceeds W1; and when the name is identified at L2, the name recognition succeeds only if the confidence exceeds W2.
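The grade-by-grade judgment with thresholds W0, W1, and W2 can be sketched as follows. This is a minimal illustration under stated assumptions: the recognizers are stand-in lambdas, the confidence values and thresholds are made up, and the names `Grade` and `hierarchical_decision` are hypothetical rather than part of the application.

```python
from typing import Callable, List, Sequence, Tuple

# A recognizer takes the extracted feature and returns (label, confidence).
Recognizer = Callable[[object], Tuple[str, float]]
# One grade = (grade name, recognizer, confidence threshold such as W0, W1, W2).
Grade = Tuple[str, Recognizer, float]


def hierarchical_decision(
    feature: object, grades: Sequence[Grade]
) -> List[Tuple[str, str, float]]:
    """Judge grade by grade; stop at the first grade whose confidence does not
    exceed its threshold and return the results of all grades passed so far."""
    results = []
    for name, recognize, threshold in grades:
        label, confidence = recognize(feature)
        if confidence <= threshold:  # the text says "exceeds", so equality fails
            break
        results.append((name, label, confidence))
    return results


# Illustrative stand-ins for real recognizers; all numbers are assumptions.
grades = [
    ("L0 person", lambda f: ("face", 0.98), 0.60),     # W0
    ("L1 gender", lambda f: ("female", 0.85), 0.70),   # W1
    ("L2 name", lambda f: ("Xiao Ming", 0.40), 0.80),  # W2
]

basis = hierarchical_decision(feature=None, grades=grades)
print(basis)
```

With these made-up numbers the name grade fails its threshold, so the decision basis contains only the person and gender results, and an empty list corresponds to the case where no person is identified.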

Example 1

Referring to FIG. 2, the embodiment relates to an interactive terminal. The attribute module, the judgment module, and the confidence module for implementing the hierarchical interaction decision are set in the interaction terminal.

The interaction terminal includes an information acquisition module 20, an identification module 22, an attribute module 30, a determination module 40, a confidence module 42, an output module 50, and an interaction module 60.

The information acquisition module 20 acquires target object information, and the recognition module 22 identifies the target object feature. The attribute module 30 obtains a corresponding attribute ranking according to the target object feature, and the attribute ranking is prioritized from the primary attribute to the advanced attribute.

The judging module 40 starts from the primary attribute to the advanced attribute, and performs stepwise attribute determination based on the target object feature, and the target object feature is hierarchically migrated to the higher priority attribute when the current attribute classification is satisfied. When the target object feature does not satisfy the grading standard of the current attribute grading, the output module 50 outputs the current attribute grading and the subordinate attribute grading judgment result as the basis of the interaction decision.

The confidence module is configured to perform a confidence analysis on the judgment result of each attribute classification.

When judging a superior attribute grading, the judging module relies on the judgment results of all subordinate attribute gradings.

The target object information includes image information as well as audio information.

The attribute module 30 classifies the attributes of different groups.

Referring to FIG. 5 and FIG. 6, in an embodiment, the attribute module 30 is a face attribute module. The information acquisition module 20 acquires image information of a face, and the recognition module 22 identifies a face feature based on the image information. The face attribute module is configured to obtain a corresponding attribute ranking according to the facial feature. For example, the priority order of the attribute ranking is: person, gender, person name, and expression.

The determining module 40 performs stepwise attribute determination on the facial feature, starting from the person attribute and proceeding to the expression attribute. The confidence module 42 likewise performs level-by-level confidence determination on the facial feature from the person attribute to the expression attribute. When the facial feature satisfies the grading standard of the current attribute grading, it is migrated to the attribute grading of higher priority.

When the facial feature does not satisfy the grading standard of the current attribute grading, the output module 50 outputs all the judgment results of the current attribute grading and of the attribute gradings of lower priority as the basis of the interactive decision.

In this embodiment, the person and the person's attributes, such as presence, name, gender, and expression, are identified from the image according to the confidence level. On the basis of confidence, the processing unit of the interactive terminal analyzes the image and makes decisions as follows:

Detecting whether a person is present is equivalent to detecting whether the person is paying attention to the interactive device. Human body detection or face detection is used to detect the presence of a person, and face pose estimation is used to detect the pose of the face, such as its spatial orientation in degrees. It is then judged whether the face has held its pose for a duration t >= T, where T is a threshold that can be set by experience, for example T = 2 seconds, and whether the angular difference between the face orientation and the direction of the interactive terminal satisfies d <= D, where D is a threshold that can be set empirically, for example D = 20 degrees. T and D are detection thresholds that serve as the confidence level for judging the existence of a human face.
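The attention test with thresholds T and D might be sketched as below. The `PoseSample` type and the function name are hypothetical, and reading "t >= T" as one continuous stretch of samples is an assumption made for this sketch; the application does not say how the duration is measured.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class PoseSample:
    timestamp_s: float            # capture time in seconds
    angle_to_terminal_deg: float  # angular difference d toward the terminal


def is_paying_attention(
    samples: Iterable[PoseSample], T: float = 2.0, D: float = 20.0
) -> bool:
    """Return True if the face stayed within D degrees of the terminal for a
    continuous stretch of at least T seconds. Samples must be time-ordered."""
    start: Optional[float] = None
    for s in samples:
        if s.angle_to_terminal_deg <= D:
            if start is None:
                start = s.timestamp_s  # a new stretch of attention begins
            if s.timestamp_s - start >= T:
                return True
        else:
            start = None  # the person looked away; reset the stretch
    return False


steady = [PoseSample(0.5 * i, 10.0) for i in range(5)]  # 0.0 s to 2.0 s
print(is_paying_attention(steady))
```

The defaults T = 2 seconds and D = 20 degrees are the example values given in the text.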

Other attributes of the person, such as gender, are then detected. An attribute detection algorithm identifies attributes such as the gender of the person corresponding to the face image; gender identification is taken as the example here. Because the attribute recognition algorithm is not fully accurate, the decision is based on two parameters, the “male” confidence Na and the “female” confidence Nv: if Na - Nv ≥ R, the decision result is male; if Nv - Na ≥ R, the decision result is female; otherwise the output is <0, Face>, indicating that the face is not recognized. R is the threshold on the difference between the two gender confidences and serves as the confidence level for discriminating gender. It can be selected by experience, for example R = 20 (with Nv + Na = 100).
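The margin rule on Na and Nv can be written as a short function. The function name and the use of None for the undetermined case are assumptions of this sketch; R = 20 is the example value from the text.

```python
from typing import Optional


def decide_gender(na: float, nv: float, r: float = 20.0) -> Optional[str]:
    """na is the "male" confidence Na and nv the "female" confidence Nv, with
    Na + Nv = 100. Returns "Male", "Female", or None when the margin between
    the two confidences is smaller than R."""
    if na - nv >= r:
        return "Male"
    if nv - na >= r:
        return "Female"
    return None  # undetermined: the text falls back to the <0, Face> output


print(decide_gender(70, 30), decide_gender(35, 65), decide_gender(55, 45))
```

Requiring a margin rather than simply picking the larger confidence is what keeps the terminal from announcing a gender when the two scores are close.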

The name is detected by face recognition, which determines whether the detected face is a pre-stored or registered face, that is, whether the face similarity S ≥ C, where C is a similarity threshold that serves as the confidence level for determining the name.
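A minimal sketch of the S ≥ C test against registered faces follows. The application does not state how the similarity S is computed; cosine similarity between embedding vectors is used here purely as an assumption, and the function names, the registry layout, and the default C = 0.8 are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_name(
    embedding: Sequence[float],
    registered: Dict[str, Sequence[float]],
    c: float = 0.8,
) -> Optional[str]:
    """Return the registered name with the highest similarity S, provided
    S >= C; otherwise return None (name not recognized)."""
    best_name, best_s = None, -1.0
    for name, reference in registered.items():
        s = cosine_similarity(embedding, reference)
        if s > best_s:
            best_name, best_s = name, s
    return best_name if best_s >= c else None


registered = {"Xiao Ming": [1.0, 0.0], "Zhang Xiaoming": [0.0, 1.0]}
print(identify_name([0.9, 0.1], registered))
```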

Referring to FIG. 6, the features for discriminating "face", "gender", and "person name" can be computed iteratively from an image, for example on the principle of a multi-layer neural network. The bottom layer of the network computes face feature 1, which is used to discriminate whether a face is present; the middle layer computes face feature 2 and face feature 3 from face feature 1; and each upper layer can be computed from the features of the layer below it. For example, face feature 2 is used to discriminate the person's gender, and face feature 3 is used to discriminate a specific "person name".

Based on the identified person and the person's attributes, the terminal device sends a corresponding interaction signal to the person. That is, the intelligent machine responds according to the output of the identification and decision step above.

Take the blind guide helmet as an example:

If the output is <1, Face>, the smart helmet says "Someone is in front of you";

If the output is <1, Male>, the smart helmet says "There is a man in front of you";

If the output is <1, Female>, the smart helmet says "There is a lady in front of you";

If the output is <2, NameInfo>, the smart helmet says "NameInfo is in front of you".

Take the welcome robot as an example:

If the output is <1, Face>, the welcome robot says "Hello! How can I help you?";

If the output is <1, Male>, the welcome robot says "Hello! Sir!";

If the output is <1, Female>, the welcome robot says "Hello! Ms!";

If the output is <2, NameInfo>, the welcome robot says "Hello! VIP Customer NameInfo!".
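The mapping from an output pair to a spoken response in the two examples above can be sketched as a lookup table per device. The tuple representation of the <level, label> output, the table names, and the `respond` function are assumptions of this sketch, and the helmet phrases are paraphrased.

```python
from typing import Dict, Tuple

# One template table per device; "{name}" is filled in for name-level output.
HELMET: Dict[str, str] = {
    "Face": "Someone is in front of you",
    "Male": "There is a man in front of you",
    "Female": "There is a lady in front of you",
    "NameInfo": "{name} is in front of you",
}
ROBOT: Dict[str, str] = {
    "Face": "Hello! How can I help you?",
    "Male": "Hello! Sir!",
    "Female": "Hello! Ms!",
    "NameInfo": "Hello! VIP Customer {name}!",
}


def respond(output: Tuple[int, str], templates: Dict[str, str]) -> str:
    """Turn a (level, label) decision output into the device's utterance.
    A level of 2 means the label carries the recognized name."""
    level, label = output
    if level == 2:
        return templates["NameInfo"].format(name=label)
    return templates[label]


print(respond((1, "Male"), HELMET))
print(respond((2, "Xiao Ming"), ROBOT))
```

Keeping the decision output separate from the phrasing lets the same hierarchical result drive different devices, which is how the helmet and the welcome robot differ in the examples.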

The intelligent identification and decision-making process can be implemented in the interactive terminal or on the cloud server side. For the specific solution implemented in the cloud server, refer to Example 2. In that embodiment, the interactive terminal needs to transmit the collected object information, such as image or audio data, to the cloud server.

In another embodiment, the attribute module 30 is a vehicle attribute module.

The information acquisition module 20 acquires image information of the vehicle, and the recognition module 22 identifies the vehicle feature based on the image information. The vehicle attribute module obtains a corresponding attribute ranking based on the vehicle characteristics. For example, the prioritization of the attribute hierarchy is: car, color, model, brand, and style.

The judging module 40 performs stepwise attribute determination on the vehicle feature, starting from the car attribute and proceeding to the style attribute, and the confidence module 42 likewise performs stepwise confidence determination on the vehicle feature from the car attribute to the style attribute. When the vehicle feature meets the grading standard of the current attribute grading, it is migrated to the attribute grading of higher priority.

When the vehicle feature does not meet the grading standard of the current attribute grading, the output module 50 outputs all the judgment results of the current attribute grading and of the attribute gradings of lower priority as the basis for the interactive decision.

Referring to FIG. 7, taking vehicle identification as an example, it is determined in sequence whether a vehicle is present, then the vehicle color, the vehicle model, the vehicle brand, the vehicle style, and so on. An optimized decision can be made from the judgment results on the basis of the recognition rate of the algorithm, and the corresponding interactive output is given. The decision sequence is shown in FIG. 7.

In this embodiment, the hierarchical decision method can work on different target object features: different features are extracted from the original input image or audio for the decisions at different levels. The specific attribute grading groups can be pre-stored in the interactive terminal.
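Pre-stored attribute grading groups can be pictured as a simple lookup keyed by target type. The dictionary layout, the key names, and the `grades_for` function are hypothetical; the grade orders are the ones listed in the text for faces, vehicles, and audio.

```python
from typing import Dict, List

# Priority order for each group, primary attribute first.
ATTRIBUTE_GROUPS: Dict[str, List[str]] = {
    "face": ["person", "gender", "name", "expression"],
    "vehicle": ["car", "color", "model", "brand", "style"],
    "audio": ["human voice", "language", "keywords", "semantics"],
}


def grades_for(target_type: str) -> List[str]:
    """Look up the pre-stored attribute grades for a recognized target type."""
    try:
        return ATTRIBUTE_GROUPS[target_type]
    except KeyError:
        raise ValueError(f"no attribute group stored for {target_type!r}") from None


print(grades_for("vehicle"))
```

Other groups mentioned in the text, such as fruit or animal recognition, would be added as further entries; their grade orders are not given in the application, so none are filled in here.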

The image information may be various, and may include vehicle recognition, fruit recognition, animal recognition, and the like in addition to face recognition.

In yet another embodiment, the attribute module 30 is a sound attribute module.

The information acquisition module 20 acquires audio information of the target object, and the recognition module 22 identifies an audio feature of the target object based on the audio information. The sound attribute module obtains a corresponding attribute grading according to the audio feature of the target object. For example, the priority order of the attribute grading is: human voice, language, keywords, and semantics.

The determining module 40 starts from the vocal attribute to the semantic attribute, and performs stepwise attribute determination based on the audio feature of the target object. The confidence module 42 starts from the vocal attribute to the semantic attribute, and performs a step-by-step confidence judgment based on the audio feature of the target object. When the audio feature satisfies the grading standard of the current attribute grading, the attribute is hierarchically migrated to the attribute with higher priority. When the audio feature does not meet the grading criteria of the current attribute grading, the output module 50 outputs all the judgment results of the current attribute grading and the attribute grading of the following priority levels as the basis for the interactive decision.

Referring to FIG. 8, it is determined in sequence whether a human voice is present, then the language of the voice, the extracted keywords, the recognized semantics, and so on. An optimized decision can be made from the judgment results on the basis of the recognition rate of the algorithm, and the corresponding interactive output is given. The order of decision making can be: human voice, language, keywords, and semantics. For example, if a human voice is recognized, the interactive terminal can say "Hello!"; if the language is recognized, the interactive terminal can say "Hello!" in the corresponding language; if the keyword "finance" is recognized, the interactive terminal can say "We have the latest financial information from the Bank of China. Would you be interested?"; and if the customer's intention "I want to learn about high-interest financial management" is identified, the interactive terminal can say "The high-interest financial information is as follows...".

Example 2

As shown in FIG. 2, the embodiment relates to a cloud server 100, wherein an attribute module, a determining module, and a confidence module for implementing hierarchical interaction decision are disposed in the cloud server 100.

Referring to FIG. 9 , the cloud server includes a receiving module 102 , a sending module 104 , an attribute module 130 , a determining module 140 , an output module 150 , and a confidence module 142 .

The receiving module 102 receives the target object feature identified by the interactive terminal according to the acquired target object information. The attribute module 130 acquires a corresponding attribute ranking according to the target object feature, and the attribute ranking is prioritized from the primary attribute to the advanced attribute.

The determining module 140 performs stepwise attribute determination on the target object feature, starting from the primary attribute and proceeding to the advanced attribute. When the target object feature satisfies the grading standard of the current attribute grading, it is migrated to the attribute grading of higher priority; when it does not, the output module 150 outputs all the judgment results of the current attribute grading and of the subordinate attribute gradings as the basis of the interactive decision.

The sending module 104 sends the basis to the connected interactive terminal, and the interactive terminal carries out deeper interaction with the user based on the received interaction decision basis.

The confidence module 142 performs a confidence analysis on the determination result of each attribute ranking.

In the cloud server, when judging a superior attribute grading, the judging module 140 relies on the judgment results of all subordinate attribute gradings.

Referring to FIG. 10, a flow chart of implementing a hierarchical interaction decision method on the cloud server 100 side is shown.

Step 301: The receiving module receives, from the interaction terminal, a target object feature identified according to the acquired target object information.

Step 302: The attribute module acquires a corresponding attribute classification according to the target object feature, and the attribute level is prioritized from the primary attribute to the advanced attribute.

Step 303: The judging module starts from the primary attribute to the advanced attribute, and performs stepwise attribute determination based on the target object feature.

Step 304: The confidence module performs a confidence analysis on the judgment result of each attribute classification.

Step 305: Determine whether the confidence threshold is met;

Step 306: Determine whether the grading standard of the current attribute grading is satisfied;

Step 307: The target object feature is hierarchically migrated to a higher priority attribute when the grading standard of the current attribute grading is satisfied;

Step 308: When the target object feature does not meet the grading standard of the current attribute grading, the judgment result of the current attribute grading and the following attribute grading is output as the basis of the interactive decision;

Step 309: The sending module is configured to send the basis.
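The server-side flow of steps 301 to 309 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the JSON message layout, the names `RANKINGS`, `handle_request`, and `toy_decide`, and the `send` callback are all hypothetical. Only the three attribute orderings are taken from this application.

```python
import json
from typing import Any, Callable, Dict, List

# Attribute rankings named in this application, primary attribute first.
RANKINGS: Dict[str, List[str]] = {
    "face": ["person", "gender", "name", "expression"],
    "vehicle": ["car", "color", "model", "brand", "style"],
    "audio": ["voice", "language", "keyword", "semantics"],
}

Decide = Callable[[Dict[str, Any], List[str]], List[List[Any]]]


def handle_request(raw: str, decide: Decide, send: Callable[[str], None]) -> None:
    """Receive a feature, judge it grade by grade, and send back the basis."""
    message = json.loads(raw)                    # Step 301 (assumed JSON layout)
    ranking = RANKINGS[message["kind"]]          # Step 302: attribute module
    basis = decide(message["feature"], ranking)  # Steps 303-308: judging module
    send(json.dumps({"basis": basis}))           # Step 309: sending module


def toy_decide(feature: Dict[str, Any], ranking: List[str]) -> List[List[Any]]:
    """Hypothetical stand-in for the judging module: stop at the first
    attribute whose value is missing from the feature."""
    basis: List[List[Any]] = []
    for attribute in ranking:
        value = feature.get(attribute)
        basis.append([attribute, value])
        if value is None:  # grading standard not met: stop and output
            break
    return basis


# Usage: a vehicle whose color was recognized but whose model was not.
outbox: List[str] = []
request = json.dumps({"kind": "vehicle", "feature": {"car": True, "color": "red"}})
handle_request(request, toy_decide, outbox.append)
```

Passing the judging logic in as a callable keeps the receiving, attribute, and sending roles separate, which mirrors the module split described for the cloud server.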

Example 3

Referring to FIG. 3, the embodiment relates to a hierarchical interaction decision method, which mainly includes the following steps:

Step 101: Acquire target object information, where the target object information includes image information and audio information;

Step 102: Identify the target object feature.

Step 103: Acquire a corresponding attribute ranking according to the target object feature, where the attribute ranking is prioritized from the primary attribute to the advanced attribute;

Step 104: Starting from the primary attribute and proceeding to the advanced attribute, perform stepwise attribute determination based on the target object feature, where the judgment of a superior attribute grading relies on the judgment results of all lower attribute gradings;

Step 105: When the target object feature satisfies the grading standard of the current attribute grading, it is hierarchically migrated to a higher priority attribute;

Step 106: When the target object feature does not meet the grading standard of the current attribute grading, the judgment results of the current attribute grading and of the lower-priority attribute gradings are output as the basis of the interaction decision.
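Steps 103 to 106 can be illustrated with a short Python sketch. It is a hedged example only: the function `hierarchical_decision`, the `Judge` signature, the toy judges, and the `KNOWN` table are hypothetical names introduced here, and a result of `None` is an assumed convention for "grading standard not met".

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# A judge receives the target object feature and the results of all lower
# gradings; it returns a result, or None when the grading standard is not met.
Judge = Callable[[Dict[str, Any], Dict[str, Any]], Optional[Any]]


def hierarchical_decision(
    feature: Dict[str, Any], ranking: List[Tuple[str, Judge]]
) -> Dict[str, Any]:
    """Walk the ranking from the primary attribute to the advanced attribute
    and return the decision basis collected up to the first failed grading."""
    basis: Dict[str, Any] = {}
    for name, judge in ranking:               # Step 104: grade by grade
        result = judge(feature, dict(basis))  # higher grades see lower results
        basis[name] = result
        if result is None:                    # Step 106: standard not met
            break
        # Step 105: standard met, migrate to the next higher grading
    return basis


KNOWN = {"Alice": "female", "Bob": "male"}  # hypothetical enrolled persons


def judge_person(feature: Dict[str, Any], lower: Dict[str, Any]) -> Optional[bool]:
    return True if feature.get("face_detected") else None


def judge_gender(feature: Dict[str, Any], lower: Dict[str, Any]) -> Optional[str]:
    return feature.get("gender")


def judge_name(feature: Dict[str, Any], lower: Dict[str, Any]) -> Optional[str]:
    guess = feature.get("name_guess")
    # The lower-level gender result is used to reject an inconsistent guess.
    if guess in KNOWN and KNOWN[guess] == lower["gender"]:
        return guess
    return None


ranking = [("person", judge_person), ("gender", judge_gender), ("name", judge_name)]
basis = hierarchical_decision(
    {"face_detected": True, "gender": "female", "name_guess": "Bob"}, ranking
)
```

In this run the name guess conflicts with the gender result, so the walk stops at the name grading and the basis holds the person and gender results plus the failed name judgment.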

Referring to FIG. 4, the hierarchical interaction decision method further includes a confidence analysis step when performing stepwise attribute determination based on the target object feature.

Step 201: Perform a confidence analysis on the judgment result of each attribute classification to ensure the recognition accuracy rate;

Step 203: Determine whether the grading standard of the current attribute grading is satisfied, for example, whether a face is recognized; if a face is recognized, the person's gender and further attributes are judged next;

Step 205: Determine whether the confidence threshold grading standard is met, for example, whether the face meets the threshold grading standard for face image data, or whether the gender judgment satisfies a set image feature threshold, such as hair length;

Step 207: When the target object feature satisfies the grading standard of the current attribute grading, it is hierarchically migrated to a higher priority attribute;

Step 209: When the target object feature does not satisfy the grading standard of the current attribute grading, the judgment results of the current attribute grading and of the lower-priority attribute gradings are output as the basis of the interaction decision.
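The confidence check of steps 201 to 209 can be sketched in Python as below. This is an illustrative sketch under assumptions: the `Grading` class, the per-grading `threshold` values, the score field names, and the `accepted` flag are hypothetical and are not specified by this application.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Grading:
    name: str
    judge: Callable[[Dict[str, Any]], Tuple[Any, float]]  # -> (value, confidence)
    threshold: float  # hypothetical confidence threshold for this grading


def decide_with_confidence(
    feature: Dict[str, Any], gradings: List[Grading]
) -> List[Dict[str, Any]]:
    """Judge each grading in priority order and stop at the first grading
    whose standard or confidence threshold is not met."""
    basis: List[Dict[str, Any]] = []
    for grading in gradings:
        value, confidence = grading.judge(feature)   # Step 201
        standard_met = value is not None             # Step 203
        confident = confidence >= grading.threshold  # Step 205
        accepted = standard_met and confident
        basis.append({"attribute": grading.name, "value": value,
                      "confidence": confidence, "accepted": accepted})
        if not accepted:                             # Step 209: stop and output
            break
        # Step 207: accepted, migrate to the next higher grading
    return basis


gradings = [
    Grading("person", lambda f: (True, f["face_score"]), 0.9),
    Grading("gender", lambda f: (f["gender"], f["gender_score"]), 0.8),
    Grading("name", lambda f: (f.get("name"), f.get("name_score", 0.0)), 0.8),
]
feature = {"face_score": 0.97, "gender": "male", "gender_score": 0.55}
basis = decide_with_confidence(feature, gradings)
```

Here the gender judgment is produced but falls below its assumed threshold, so it is reported as not accepted and the name grading is never attempted.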

The hierarchical interaction decision method, the interactive terminal, and the cloud server of this embodiment preset the attribute gradings of the target object, prioritize those gradings, and judge the attributes step by step according to the target object features recognized through machine vision and robot semantic understanding. They then output the interaction decision basis for the current target object information, which makes the interactive terminal more intelligent and flexible and improves the user experience. Based on this decision basis, the content of the interaction between the interactive terminal and the user is richer and more interesting. Taking person recognition as an example: when the person faces the camera at close range under good lighting, the person's name can be recognized; when only half of the face is visible, or the person is side-on to the camera, only the person's gender can be identified; and when the person faces away from the camera, it can only be determined that the target is a person. Taking vehicle recognition as an example: when a blind person wearing a guide helmet walks along the roadside, sometimes both the model and the color of a vehicle can be identified, and sometimes only the color.
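How an interactive terminal might turn such a decision basis into interaction content can be sketched as follows. The functions `face_reply` and `vehicle_reply`, the dictionary keys, and the reply wording are hypothetical and serve only to illustrate that a deeper basis allows a more specific reply.

```python
from typing import Any, Dict


def face_reply(basis: Dict[str, Any]) -> str:
    """Pick the most specific greeting the decision basis supports."""
    if basis.get("name"):
        return f"Hello, {basis['name']}!"
    if basis.get("gender") == "female":
        return "Hello, madam!"
    if basis.get("gender") == "male":
        return "Hello, sir!"
    if basis.get("person"):
        return "Hello there!"
    return ""  # nothing recognized: no greeting


def vehicle_reply(basis: Dict[str, Any]) -> str:
    """Describe a vehicle as precisely as the decision basis allows."""
    if not basis.get("car"):
        return ""
    color = basis.get("color")
    model = basis.get("model")
    if color and model:
        return f"A {color} {model} is nearby."
    if color:
        return f"A {color} car is nearby."
    return "A car is nearby."
```

For instance, a basis that stops at the gender grading yields a gendered greeting, while a basis that reaches the name grading yields a greeting by name.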

Example 4

FIG. 11 is a schematic diagram of the hardware structure of an electronic device 600 for the hierarchical interaction decision method according to an embodiment of the present application. As shown in FIG. 11, the electronic device 600 includes:

One or more processors 610, a memory 620, an audio data collector 630, a video data collector 640, a communication component 650, and a display unit 660. One processor 610 is taken as an example in FIG. 11. The output of the audio data collector is the input of an audio recognition module, and the output of the video data collector is the input of a video recognition module. The memory 620 stores instructions executable by the at least one processor 610. When executed by the at least one processor, the instructions invoke the data of the audio data collector and the video data collector and establish a connection with the cloud server through the communication component 650, so that the at least one processor can perform the hierarchical interaction decision method.

The processor 610, the memory 620, the display unit 660, and the human-machine interaction unit 630 may be connected by a bus or by other means; connection by a bus is taken as an example in FIG. 11.

The memory 620 is a non-volatile computer readable storage medium and can be used for storing non-volatile software programs, non-volatile computer executable programs, and modules, such as the program instructions/modules corresponding to the hierarchical interaction decision method in the embodiments of the present application (for example, the identification module 22, the attribute module 30, the determination module 40, the confidence module 42, and the interaction module 60 shown in FIG. 2). The processor 610 executes the various functional applications and data processing of the server by running the non-volatile software programs, instructions, and modules stored in the memory 620, thereby implementing the hierarchical interaction decision method of the above method embodiments.

The memory 620 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application required for at least one function, and the storage data area may store data created according to the usage of the interactive terminal, and the like. Moreover, the memory 620 can include high speed random access memory, and can also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 620 can optionally include memory remotely located relative to the processor 610, which can be connected to the robotic interactive electronic device via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The one or more modules are stored in the memory 620 and, when executed by the one or more processors 610, perform the hierarchical interaction decision method of any of the above method embodiments, for example, performing method steps 101 to 106 in FIG. 3 and method steps 201 to 209 in FIG. 4 described above, and implementing the functions of the identification module 22, the attribute module 30, the determination module 40, the confidence module 42, and the interaction module 60 in FIG. 2, as well as the functions of the attribute module 130, the determination module 140, the confidence module 142, and the sending module 104 in FIG. 9.

The above products can perform the methods provided by the embodiments of the present application, and have the corresponding functional modules and beneficial effects of the execution method. For technical details that are not described in detail in this embodiment, reference may be made to the method provided by the embodiments of the present application.

The electronic device of the embodiment of the present application exists in various forms, including but not limited to:

(1) Mobile communication equipment: This type of equipment is characterized by its mobile communication function.

(2) Three-dimensional display devices: These devices can display and play multimedia content. Such devices include: virtual reality helmets, enhanced display helmets, or enhanced display glasses.

(3) Server: A device that provides computing services. A server consists of a processor, a hard disk, a memory, a system bus, and so on. Its architecture is similar to that of a general-purpose computer, but because it must provide highly reliable services, it has higher requirements in terms of processing power, stability, reliability, security, scalability, and manageability.

(4) Robots and blind-guiding devices.

Embodiments of the present application provide a non-transitory computer readable storage medium storing computer-executable instructions that are executed by one or more processors, for example, to perform method steps 101 to 106 in FIG. 3 and method steps 201 to 209 in FIG. 4 described above, and to implement the functions of the identification module 22, the attribute module 30, the determination module 40, the confidence module 42, and the interaction module 60 in FIG. 2, as well as the functions of the attribute module 130, the determination module 140, the confidence module 142, and the sending module 104 in FIG. 9.

The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.

Through the description of the above embodiments, those skilled in the art can clearly understand that the various embodiments can be implemented by means of software plus a general hardware platform, and of course also by hardware. A person skilled in the art can understand that all or part of the processes of the above method embodiments can be completed by a computer program instructing the related hardware. The program can be stored in a computer readable storage medium and, when executed, may include the flows of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application and do not limit them. Within the idea of the present application, the technical features of the above embodiments or of different embodiments may also be combined, the steps may be carried out in any order, and many other variations of the different aspects of the present application exist as described above, which are not detailed here for the sake of brevity. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims

A hierarchical interactive decision making method, comprising the steps of:

Obtaining target object information and identifying a target object feature;

Obtaining a corresponding attribute ranking according to the target object feature, and prioritizing the attribute ranking;

Performing stepwise attribute determination on the target object feature according to the prioritized order of the attribute ranking, wherein the target object feature is hierarchically migrated to a higher priority attribute when it satisfies the grading standard of the current attribute grading;

When the target object feature does not satisfy the grading standard of the current attribute grading, outputting the judgment results of the current attribute grading and of the lower-priority attribute gradings as the basis of the interactive decision.
The method according to claim 1, wherein the stepwise attribute determination based on the target object feature further comprises:

Performing confidence analysis on the judgment result of each attribute grading.
The method according to claim 2, wherein the judgment of a superior attribute grading is based on the judgment results of all lower attribute gradings.
The method according to claim 3, wherein the target object information comprises image information and audio information.
A method according to any one of claims 1 to 4, characterized in that

Obtaining image information of a face, and identifying a face feature according to the image information;

Obtaining a corresponding attribute ranking according to the facial feature, the prioritized order of the attribute ranking is: a person, a gender, a person name, and an expression;

Starting from the person attribute and proceeding to the expression attribute, performing stepwise attribute judgment and confidence judgment on the facial feature, wherein the facial feature is hierarchically migrated to a higher priority attribute when it satisfies the grading standard of the current attribute grading;

When the facial feature does not satisfy the grading standard of the current attribute grading, outputting the judgment results of the current attribute grading and of the lower-priority attribute gradings as the basis of the interactive decision.
A method according to any one of claims 1 to 4, characterized in that

Acquiring image information of the vehicle, and identifying the vehicle feature according to the image information;

Obtaining, according to the vehicle feature, a corresponding vehicle attribute classification and a classification criterion corresponding to the vehicle attribute classification, wherein the priority order of the attribute classification is: a car, a color, a model, a brand, and a style;

Starting from the vehicle attribute and proceeding to the model attribute, performing stepwise attribute determination and confidence determination on the vehicle feature, wherein the vehicle feature is hierarchically migrated to a higher priority attribute when it satisfies the grading standard of the current attribute grading;

When the vehicle feature does not satisfy the grading standard of the current attribute grading, outputting the judgment results of the current attribute grading and of the lower-priority attribute gradings as the basis of the interactive decision.
A method according to any one of claims 1 to 4, characterized in that

Obtaining target object audio information, and identifying an audio feature of the target object according to the audio information;

Acquiring corresponding attribute rankings according to the audio features of the target object, the priority order of the attribute rankings is: vocals, languages, keywords, and semantics;

Starting from the vocal attribute and proceeding to the semantic attribute, performing stepwise attribute judgment and confidence judgment on the audio feature of the target object, wherein the audio feature is hierarchically migrated to a higher priority attribute when it satisfies the grading standard of the current attribute grading;

When the audio feature does not satisfy the grading standard of the current attribute grading, outputting the judgment results of the current attribute grading and of the lower-priority attribute gradings as the basis of the interactive decision.
An interactive terminal, comprising: an information acquisition module, an identification module, an attribute module, a judgment module, and an output module,

The information acquiring module is configured to acquire target object information, and the identifying module is configured to identify a target object feature;

The attribute module is configured to obtain a corresponding attribute classification according to the target object feature, and prioritize the attribute classification;

The determining module is configured to perform stepwise attribute determination on the target object feature according to the prioritized order of the attribute ranking, wherein the target object feature is hierarchically migrated to a higher priority attribute when it satisfies the grading standard of the current attribute grading;

When the target object feature does not meet the grading standard of the current attribute grading, the output module is configured to output the judgment results of the current attribute grading and of the lower-priority attribute gradings as the basis of the interaction decision.
The interactive terminal according to claim 8, wherein the determining module further comprises a confidence module, wherein the confidence module is configured to perform a confidence analysis on the determination result of each attribute ranking.
The interactive terminal according to claim 9, wherein, when judging a superior attribute grading, the judging module relies on the judgment results of all subordinate attribute gradings.
The interactive terminal according to claim 10, wherein the target object information comprises image information and audio information, and the interactive terminal is a robot, a wearable display device, a mobile terminal, or a blind-guiding device.
The interactive terminal according to any one of claims 8 to 11, wherein the attribute module is a face attribute module;

The information acquiring module is configured to acquire image information of a human face, and the identifying module is configured to identify a facial feature according to the image information;

The face attribute module is configured to obtain a corresponding attribute ranking according to the facial feature, and the priority order of the attribute ranking is: a person, a gender, a person name, and an expression;

The determining module includes a confidence module; the determining module is configured to perform stepwise attribute determination based on the facial feature, starting from the person attribute and proceeding to the expression attribute, and the confidence module is configured to perform stepwise confidence determination based on the facial feature, starting from the person attribute and proceeding to the expression attribute, wherein the facial feature is hierarchically migrated to a higher priority attribute when it satisfies the grading standard of the current attribute grading;

When the facial feature does not meet the grading standard of the current attribute grading, the output module is configured to output the judgment results of the current attribute grading and of the lower-priority attribute gradings as the basis of the interactive decision.
The interactive terminal according to any one of claims 8 to 11, wherein the attribute module is a vehicle attribute module;

The information acquiring module is configured to acquire image information of a vehicle, and the identifying module is configured to identify a vehicle feature according to the image information;

The vehicle attribute module is configured to obtain a corresponding attribute ranking according to the vehicle feature, and the priority order of the attribute ranking is: a car, a color, a model, a brand, and a style;

The judging module includes a confidence module; the judging module is configured to perform stepwise attribute determination based on the vehicle feature, starting from the vehicle attribute and proceeding to the style attribute, and the confidence module is configured to perform stepwise confidence determination based on the vehicle feature, starting from the vehicle attribute and proceeding to the style attribute, wherein the vehicle feature is hierarchically migrated to a higher priority attribute when it satisfies the grading standard of the current attribute grading;

When the vehicle feature does not meet the grading standard of the current attribute grading, the output module is configured to output the judgment results of the current attribute grading and of the lower-priority attribute gradings as the basis of the interaction decision.
The interactive terminal according to any one of claims 8 to 11, wherein the attribute module is a sound attribute module;

The information acquiring module is configured to acquire target object audio information, and the identifying module is configured to identify an audio feature of the target object according to the audio information;

The sound attribute module is configured to obtain a corresponding attribute ranking according to an audio feature of the target object, and the priority order of the attribute ranking is: a voice, a language, a keyword, and a semantic;

The determining module includes a confidence module; the determining module is configured to perform stepwise attribute determination based on the audio feature of the target object, starting from the vocal attribute and proceeding to the semantic attribute, and the confidence module is configured to perform stepwise confidence determination based on the audio feature of the target object, starting from the vocal attribute and proceeding to the semantic attribute, wherein the audio feature is hierarchically migrated to a higher priority attribute when it satisfies the grading standard of the current attribute grading;

When the audio feature does not meet the grading standard of the current attribute grading, the output module is configured to output the judgment results of the current attribute grading and of the lower-priority attribute gradings as the basis of the interaction decision.
A cloud server, comprising: a receiving module, an attribute module, a determining module, an output module, and a sending module,

The receiving module is configured to receive a target object feature that an interactive terminal has identified according to the acquired target object information;

The attribute module is configured to obtain a corresponding attribute classification according to the target object feature, and prioritize the attribute classification;

The determining module is configured to perform stepwise attribute determination based on the target object feature according to the priority order of the attribute ranking, wherein the target object feature is hierarchically migrated to a higher priority attribute when it satisfies the grading standard of the current attribute grading;

When the target object feature does not meet the grading standard of the current attribute grading, the output module is configured to output the judgment results of the current attribute grading and of the subordinate attribute gradings as the basis of the interaction decision;

The sending module is configured to send the basis to the interactive terminal.
The cloud server according to claim 15, wherein the determining module further comprises a confidence module, wherein the confidence module is configured to perform a confidence analysis on the judgment result of each attribute ranking.
The cloud server according to claim 16, wherein, when judging a superior attribute grading, the judging module relies on the judgment results of all subordinate attribute gradings.
An electronic device, comprising:

At least one processor; and,

a memory communicatively coupled to the at least one processor, a communication component, an audio data collector, and a video data collector; wherein

The memory stores instructions executable by the at least one processor; when executed by the at least one processor, the instructions invoke data of the audio data collector and the video data collector and establish a connection with a cloud server through the communication component, so that the at least one processor can perform the method of any one of claims 1-7.
A non-transitory computer readable storage medium, wherein the computer readable storage medium stores computer executable instructions for causing a computer to perform the method of any one of claims 1-7.
A computer program product, comprising: a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the method of any one of claims 1-7.