WO2019051814A1 - Target recognition method and apparatus, and intelligent terminal

Info

Publication number: WO2019051814A1
Application number: PCT/CN2017/101967
Authority: WO
Inventors: 廉士国; 刘兆祥; 王宁
Original assignee: 达闼科技（北京）有限公司
Priority date: 2017-09-15
Filing date: 2017-09-15
Publication date: 2019-03-21
Also published as: CN107995982A; CN107995982B

Abstract

A target recognition method and apparatus, and an intelligent terminal. The method comprises: collecting information about a target to be detected (110), wherein the target to be detected comprises at least two attribute types, and a priority relationship is set between the at least two attribute types; and outputting, based on the information, a recognition result of the target to be detected (120), wherein the recognition result is a determination result corresponding to one of the attribute types, the degree of confidence of the determination result satisfies a pre-set condition, and the attribute type corresponding to the recognition result has the highest priority level among the attribute types corresponding to the determination result, the degree of confidence of which satisfies the pre-set condition. The method can ensure the reliability of the output recognition result under different recognition scenarios, and also output more detailed recognition results as far as possible, thereby improving the user experience.

Description

Target recognition method, device and intelligent terminal

Technical field

The embodiments of the present invention relate to the field of intelligent identification technologies, and in particular, to a target identification method, apparatus, and intelligent terminal.

Background art

As machine intelligence advances, interaction between people and intelligent terminals is becoming more frequent, and a natural human-computer interaction experience is becoming more important. One important factor affecting that experience is how detailed and how reliable the intelligent terminal's identification of the target to be tested is.

Currently, most smart terminals are expected to output a higher-level target recognition result, such as a person's name, a car's model (or series), a license plate number, or a cat's breed, to enhance the human-computer interaction experience.

However, in actual scenarios the environment varies and the recognition capability of the smart terminal is limited. In some scenarios the smart terminal may be unable to identify the target to be tested accurately. For example, because of lighting, angle, or occlusion, it cannot determine who a person is; or, because of distance and angle, it cannot be sure of the brand or model of a car. In such cases, if the intelligent terminal is required to output a higher-level recognition result, a recognition error may cause embarrassment; and if no result is output at all because a detailed result could not be obtained, the user experience also suffers.

Therefore, how to achieve a compromise between the reliability and the level of detail of the target recognition is an urgent problem to be solved by the existing intelligent identification technology.

Summary of the invention

The embodiment of the present application provides a target recognition method, device, and intelligent terminal, which can solve the problem of how to achieve a compromise between reliability and detail level of target recognition.

In a first aspect, an embodiment of the present application provides a target identification method, which is applied to an intelligent terminal and includes:

collecting information about a target to be tested, wherein the target to be tested includes at least two attribute types, and a priority relationship is set between the at least two attribute types; and

outputting, based on the information, a recognition result of the target to be tested, wherein the recognition result is a determination result corresponding to one of the attribute types, the confidence of the determination result satisfies a preset condition, and the attribute type corresponding to the recognition result has the highest priority among the attribute types corresponding to the determination results whose confidence satisfies the preset condition.

In a second aspect, an embodiment of the present application provides a target identification apparatus, including:

an information collecting unit, configured to collect information about a target to be tested, wherein the target to be tested includes at least two attribute types, and a priority relationship is set between the at least two attribute types; and

a recognition unit, configured to output a recognition result of the target to be tested based on the information, wherein the recognition result is a determination result corresponding to one of the attribute types, the confidence of the determination result satisfies a preset condition, and the attribute type corresponding to the recognition result has the highest priority among the attribute types corresponding to the determination results whose confidence satisfies the preset condition.

In a third aspect, an embodiment of the present application provides an intelligent terminal, including:

At least one processor; and,

a memory communicatively coupled to the at least one processor; wherein

The memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the target recognition method as described above.

In a fourth aspect, an embodiment of the present application provides a non-transitory computer readable storage medium, where the non-transitory computer readable storage medium stores computer executable instructions for causing a smart terminal to execute the target recognition method described above.

In a fifth aspect, the embodiment of the present application further provides a computer program product, the computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by the smart terminal, cause the smart terminal to execute the target recognition method described above.

The beneficial effects of the embodiments of the present application are as follows. The target identification method, apparatus, and intelligent terminal provided by the embodiments divide the attributes of the target to be tested into multiple attribute types with a priority order, according to the level of detail with which each describes the target. During recognition, the confidence of the judgment result under each attribute type is obtained, and then, according to the actual recognition situation, the judgment result corresponding to the attribute type with the highest priority among the judgment results that satisfy the preset condition is output as the recognition result of the target to be tested. This ensures the reliability of the output recognition result under different recognition scenarios while outputting as detailed a recognition result as possible; that is, the final recognition result strikes a compromise between reliability and level of detail, which improves the user experience.

DRAWINGS

One or more embodiments are illustrated by way of example in the accompanying drawings. Unless otherwise stated, the figures in the drawings do not constitute a limitation of scale.

FIG. 1 is a schematic flowchart diagram of a target recognition method according to an embodiment of the present application;

FIG. 2 is a schematic flowchart of another target recognition method provided by an embodiment of the present application;

FIG. 3 is a schematic structural diagram of a target recognition apparatus according to an embodiment of the present application;

FIG. 4 is a schematic structural diagram of hardware of an intelligent terminal according to an embodiment of the present application.

Detailed description

In order to make the objects, technical solutions, and advantages of the present application more comprehensible, the present application will be further described in detail below with reference to the accompanying drawings and embodiments. It is understood that the specific embodiments described herein are merely illustrative of the application and are not intended to be limiting.

It should be noted that, provided there is no conflict, the various features in the embodiments of the present application may be combined with each other, and all such combinations fall within the protection scope of the present application. In addition, although functional modules are divided in the device schematic and a logical sequence is shown in the flowchart, in some cases the steps shown or described may be performed with a module division different from that of the device, or in an order different from that of the flowchart.

The embodiment of the present application provides a target recognition method, device, and intelligent terminal, which can be applied to any application field related to target recognition, such as intelligent blind guidance, greeting robots, service robots, intruding-object detection, semantic recognition, and the like.

The target recognition method provided by the embodiment of the present application is an intelligent, optimized identification method based on the “priority” of the attribute types of the target to be tested and the “confidence” of the judgment result under each attribute type. According to the level of detail with which they describe the target, the attributes of the target to be tested are divided into a plurality of attribute types with a priority order (the higher the priority of an attribute type, the more detailed the corresponding judgment result). During recognition, the judgment result under each attribute type is given a confidence that evaluates its reliability. Then, according to the actual recognition situation, the judgment result corresponding to the attribute type with the highest priority among the judgment results that satisfy the preset condition is output as the recognition result of the target to be tested. This ensures the reliability of the output recognition result under different recognition scenarios while outputting as detailed a recognition result as possible; that is, the final recognition result strikes a compromise between reliability and level of detail, which enhances the user experience.

Therefore, when the target recognition method, apparatus, and intelligent terminal provided by the embodiment of the present application identify the same person or object (the target to be tested), recognition results of different levels may be output in different recognition environments. For example, when identifying a person: when the light is good, the distance is short, and the person is facing the machine's camera, the person's "name" can be identified; when the person covers half of the face with a hand, or is turned sideways to the camera, only the person's "gender" can be identified; and when the person has their back to the camera, only whether the target is a "person" can be identified.

The target recognition method and apparatus provided by the embodiments of the present application can be applied to any type of smart terminal, such as a robot, blind-guiding glasses, a smart helmet, a smartphone, a tablet computer, a server, and the like. The smart terminal can include any suitable type of storage medium for storing data, such as a magnetic disk, a compact disc (CD-ROM), a read-only memory, or a random access memory. The smart terminal may also include one or more logical operation modules that perform any suitable type of function or operation, such as querying a database or image processing, in a single thread or in multiple threads in parallel. The logical operation module may be any suitable type of electronic circuit or chip-type electronic device capable of performing logical operations, such as a single-core processor, a multi-core processor, or a graphics processing unit (GPU).

Specifically, the embodiments of the present application are further described below in conjunction with the accompanying drawings.

Embodiment 1

FIG. 1 is a schematic flowchart of a target identification method according to an embodiment of the present application. Referring to FIG. 1, the method includes but is not limited to:

110. Collect information about the target to be tested.

In this embodiment, the target to be tested may include, but is not limited to, a person, an animal, an object, and the like. According to the level of detail with which they describe the target, at least two attribute types of different levels may be defined for the target to be tested, and a priority relationship is set between the attribute types according to that level of detail. An attribute type that is harder to recognize can be regarded as more detailed. Recognition difficulty can be ranked by the recognition rates of the different attribute types under the same conditions (for example, given the same input picture); for example, person-name recognition is usually harder than gender recognition, and gender recognition is harder than face/human-body recognition. Alternatively, the attribute types can be ranked by their mutual inclusion relationship (for example, to identify gender, the presence of a face must be recognized first).

For example, assuming that the target to be tested is a person, the attribute types of the person may be set, according to the level of detail of the description, to include "person name", "gender", and "whether it is a person", and according to recognition difficulty the priority order of these attribute types may be set to: L1 (person name) > L2 (gender) > L3 (whether it is a person). For another example, if the target to be tested is a vehicle, the attribute types of the vehicle may be set to include “license plate”, “vehicle model”, “vehicle color”, and “whether it is a vehicle”, and according to recognition difficulty the priority order may be set to: L1 (license plate) > L2 (vehicle model) > L3 (vehicle color) > L4 (whether it is a vehicle).
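As a rough illustration of ranking attribute types by recognition difficulty, the following Python sketch orders the attribute types of a person by a measured recognition rate, treating the lowest rate as the hardest and therefore the highest priority. The rate values and the names `recognition_rate` and `priority_order` are assumptions made for this example only; the application does not specify them.

```python
# Hypothetical recognition rates measured on the same input pictures.
# The numbers are illustrative assumptions, not values from the application.
recognition_rate = {
    "person name": 0.62,
    "gender": 0.91,
    "whether it is a person": 0.99,
}


def priority_order(rates):
    """Return attribute types from highest priority (hardest) to lowest."""
    # A lower recognition rate means the attribute is harder to recognize,
    # so it is treated as more detailed and given a higher priority.
    return sorted(rates, key=lambda attr: rates[attr])


order = priority_order(recognition_rate)
levels = {attr: f"L{i}" for i, attr in enumerate(order, start=1)}
print(levels)
# {'person name': 'L1', 'gender': 'L2', 'whether it is a person': 'L3'}
```

A ranking based on the mutual inclusion relationship would replace the sort key with a fixed, hand-written order.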

In addition, in this embodiment, the “information” may be any basis for judgment that can reflect the attributes of the target to be tested. The types of information may include, but are not limited to, image information, sound information, thermal infrared images, near-infrared images, ultrasonic signals, electromagnetic reflection signals, and the like.

When performing step 110, information about the target to be tested may be collected by one or more sensors, for example, collecting image information of the target through a camera, collecting sound information of the target through a microphone, or collecting a thermal infrared image of the target through a thermal infrared sensor.

120. Output a recognition result of the object to be tested based on the information.

In this embodiment, when the target to be tested is identified based on the collected information, each attribute type of the target corresponds to one judgment result, and each judgment result has a confidence (or credibility) that characterizes its reliability. For example, based on the collected image information of a person, the judgment results obtained for the target to be tested include: “Zhang San” (confidence 70%), “male” (confidence 89%), and “person” (confidence 100%). Here “Zhang San”, “male”, and “person” are the judgment results corresponding to the attribute types “person name”, “gender”, and “whether it is a person”, respectively. The confidence of a judgment result can be determined by the degree of similarity in feature comparison: the higher the similarity, the higher the confidence.

In particular, in this embodiment, the output recognition result is the judgment result corresponding to one of the attribute types of the target to be tested; the confidence of that judgment result satisfies a preset condition, and the attribute type corresponding to the recognition result has the highest priority among the attribute types corresponding to the judgment results whose confidence satisfies the preset condition.

The “preset condition” may be set according to the actual application scenario and is used to assess the reliability of a judgment result. Specifically, the preset condition may be that the confidence of the judgment result is greater than or equal to the confidence threshold of the corresponding attribute type. The confidence thresholds of the attribute types may all be the same. For example, suppose the confidence thresholds for the attribute types “person name”, “gender”, and “whether it is a person” are all 70%, and the judgment results obtained for the target to be tested are “Zhang San” (confidence 70%), “male” (confidence 89%), and “person” (confidence 100%). Then the confidences of “Zhang San”, “male”, and “person” all satisfy the preset condition, and the recognition result of the target to be tested is “Zhang San”, the judgment result corresponding to “person name”, the attribute type with the highest priority among the three. Alternatively, in other embodiments, the confidence thresholds of the attribute types may differ. For example, the confidence threshold may be preset to 75% for “person name”, 85% for “gender”, and 95% for “whether it is a person”. If the judgment results are again “Zhang San” (70%), “male” (89%), and “person” (100%), then only “male” and “person” satisfy the preset condition, and the recognition result is “male”, the judgment result corresponding to “gender”, the higher-priority attribute type of the two.
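The two threshold examples above can be checked with a short Python sketch. It assumes the preset condition is "confidence greater than or equal to the threshold of the attribute type", as described; the function name `select_result`, the data layout, and the choice to return `None` when no judgment result qualifies are assumptions of this illustration.

```python
from typing import Dict, List, Optional, Tuple

# Attribute types listed from highest priority (L1) to lowest (L3).
PRIORITY: List[str] = ["person name", "gender", "whether it is a person"]


def select_result(
    judgments: Dict[str, Tuple[str, float]],
    thresholds: Dict[str, float],
    priority: List[str] = PRIORITY,
) -> Optional[str]:
    """Return the judgment result of the highest-priority attribute type
    whose confidence meets its threshold, or None if none qualifies."""
    for attr in priority:
        if attr not in judgments:
            continue
        result, confidence = judgments[attr]
        if confidence >= thresholds[attr]:
            return result
    return None  # assumption: the source does not say what happens here


judgments = {
    "person name": ("Zhang San", 0.70),
    "gender": ("male", 0.89),
    "whether it is a person": ("person", 1.00),
}

same_thresholds = dict.fromkeys(PRIORITY, 0.70)
different_thresholds = {
    "person name": 0.75,
    "gender": 0.85,
    "whether it is a person": 0.95,
}

print(select_result(judgments, same_thresholds))       # Zhang San
print(select_result(judgments, different_thresholds))  # male
```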

Specifically, in this embodiment, outputting the recognition result of the target to be tested based on the collected information may include, but is not limited to, the following two implementations:

In the first implementation, the judgment result corresponding to each attribute type of the target to be tested, and the confidence of each judgment result, are first obtained based on the collected information; then, among the judgment results whose confidence satisfies the preset condition, the one corresponding to the attribute type with the highest priority is output as the recognition result of the target to be tested.

In this implementation, the judgment result corresponding to each attribute type of the target to be tested may be obtained from the collected information by a suitable algorithm (for example, a neural network). For example, if the target to be tested is a person and the information collected by the smart terminal is image information of the person, the smart terminal may iteratively calculate from the image the judgment results for the attribute types “whether it is a person”, “gender”, and “person name”. For instance, the bottom layer of the neural network first calculates feature 1 for discriminating "whether it is a person", and the judgment result for "whether it is a person" and its confidence are obtained from feature 1; then the middle layer of the neural network calculates, based on feature 1, feature 2 for discriminating "gender", and the judgment result for "gender" and its confidence are obtained from feature 2; finally, the uppermost layer of the neural network calculates, based on feature 2, feature 3 for discriminating the "person name", and the judgment result for "person name" and its confidence are obtained from feature 3. After all the judgment results and their confidences have been obtained, the judgment results whose confidence satisfies the preset condition are first screened out, and then, from among them, the judgment result that describes the target in the most detail (that is, whose attribute type has the highest priority) is selected as the recognition result of the target to be tested.
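The staged computation in this first implementation can be sketched in Python as follows. The three layer functions are stand-ins for the bottom, middle, and uppermost layers of a neural network: they return fixed, illustrative judgments instead of running a real model, and all function names and the feature representation are assumptions of this sketch.

```python
def bottom_layer(image):
    """Placeholder: compute feature 1 and judge 'whether it is a person'."""
    feature1 = ("feature1", image)
    return feature1, ("person", 1.00)


def middle_layer(feature1):
    """Placeholder: compute feature 2 from feature 1 and judge 'gender'."""
    feature2 = ("feature2", feature1)
    return feature2, ("male", 0.89)


def top_layer(feature2):
    """Placeholder: compute feature 3 from feature 2 and judge 'person name'."""
    feature3 = ("feature3", feature2)
    return feature3, ("Zhang San", 0.70)


def judge_all(image):
    """Obtain every judgment result first, each feature built on the previous one."""
    feature1, is_person = bottom_layer(image)
    feature2, gender = middle_layer(feature1)
    _, name = top_layer(feature2)
    # Listed from highest priority to lowest.
    return [
        ("person name", name),
        ("gender", gender),
        ("whether it is a person", is_person),
    ]


def screen_and_pick(judgments, thresholds):
    """Screen by confidence, then keep the most detailed remaining result."""
    qualified = [
        result
        for attr, (result, confidence) in judgments
        if confidence >= thresholds[attr]
    ]
    return qualified[0] if qualified else None


thresholds = {"person name": 0.75, "gender": 0.85, "whether it is a person": 0.95}
print(screen_and_pick(judge_all("image.jpg"), thresholds))  # male
```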

In the second implementation, based on the collected information, the judgment result corresponding to each attribute type of the target to be tested and the confidence of each judgment result are obtained level by level, in order of priority from high to low, until the first judgment result whose confidence satisfies the preset condition appears; that judgment result is then output as the recognition result of the target to be tested. That is, once the information about the target to be tested has been collected, the first-level judgment result corresponding to the attribute type with the highest priority, and its first-level confidence, are obtained first. If the first-level confidence satisfies the preset condition (for example, the first-level confidence is greater than or equal to the first-level confidence threshold), the first-level judgment result is output directly as the recognition result of the target to be tested. Otherwise, the second-level judgment result corresponding to the attribute type of the next level, and its second-level confidence, are obtained based on the collected information. If the second-level confidence satisfies the preset condition (for example, the second-level confidence is greater than or equal to the second-level confidence threshold), the second-level judgment result is output as the recognition result of the target to be tested. Otherwise, the judgment result corresponding to the attribute type of the next level and its confidence are obtained, and so on, until a judgment result whose confidence satisfies the preset condition is obtained.

In this implementation, different features may be extracted from the collected information for the judgments at different levels. For example, if the target to be tested is a vehicle and the collected information is image information of the vehicle, feature a can be extracted from the image information to identify whether there is a car in the image, feature b can be extracted to identify the color of the car, feature c can be extracted to identify the type of the car (car, truck, bus, etc.), and so on.

In this implementation, the judgment result corresponding to each attribute type of the target to be tested and its confidence are obtained level by level in order of priority from high to low, and as soon as the first judgment result whose confidence satisfies the preset condition appears, it is output directly. Not every attribute type needs to be identified and judged, which reduces the amount of data processing and improves recognition efficiency without affecting the level of detail or the reliability.
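A minimal Python sketch of this level-by-level procedure is given below. Each classifier is a hypothetical callable that returns a (result, confidence) pair; here they are fixed stubs so the example is self-contained. The function also reports which attribute types were actually evaluated, to show that lower levels are skipped once a level qualifies.

```python
def recognize_step_by_step(info, classifiers, thresholds):
    """Evaluate attribute types from highest to lowest priority and stop
    at the first judgment whose confidence meets its threshold."""
    evaluated = []
    for attr, classify in classifiers:
        result, confidence = classify(info)
        evaluated.append(attr)
        if confidence >= thresholds[attr]:
            return result, evaluated
    return None, evaluated  # assumption: no level qualified


# Stub classifiers, ordered L1 > L2 > L3; the outputs are illustrative.
classifiers = [
    ("person name", lambda info: ("Zhang San", 0.70)),
    ("gender", lambda info: ("male", 0.89)),
    ("whether it is a person", lambda info: ("person", 1.00)),
]
thresholds = {"person name": 0.75, "gender": 0.85, "whether it is a person": 0.95}

result, evaluated = recognize_step_by_step("image.jpg", classifiers, thresholds)
print(result)     # male
print(evaluated)  # ['person name', 'gender']
```

In this run the "whether it is a person" classifier is never called, which is where the saving in data processing comes from.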

In addition, for different application scenarios and application requirements, further extensions can be made on the basis of the above steps 110 and 120.

For example, in some application scenarios that involve human-computer interaction, such as intelligent blind guidance, greeting robots, service robots, and the like, the target recognition method may further include: sending an interaction signal corresponding to the recognition result.

For example, if the recognition result output in step 120 is “Zhang San”, smart glasses or a smart helmet used for blind guidance can give the user the voice prompt “Your friend Zhang San”, and a robot used for greeting or providing services can say "Hello! VIP customer Zhang San!" to the target and/or adopt gestures reserved for VIP customers. For another example, if the recognition result output in step 120 is “male”, the smart glasses or smart helmet used for blind guidance can give the user the voice prompt “There is a man in front”, and the robot used for greeting or providing services can say "Hello! Sir!" to the target.
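One simple way to derive such an interaction signal is a lookup keyed on the device and on the attribute type of the recognition result, as in the Python sketch below. The function name `interaction_signal`, the device labels "guide" and "robot", and the decision to return `None` for unhandled cases are assumptions of this illustration; the prompt texts are taken from the examples above.

```python
def interaction_signal(device, attribute_type, result):
    """Map a recognition result to a spoken prompt for a given device."""
    if device == "guide":  # blind-guiding glasses or helmet (hypothetical label)
        if attribute_type == "person name":
            return f"Your friend {result}"
        if attribute_type == "gender" and result == "male":
            return "There is a man in front"
    elif device == "robot":  # greeting or service robot (hypothetical label)
        if attribute_type == "person name":
            return f"Hello! VIP customer {result}!"
        if attribute_type == "gender" and result == "male":
            return "Hello! Sir!"
    return None  # assumption: no prompt defined for other cases


print(interaction_signal("guide", "person name", "Zhang San"))  # Your friend Zhang San
print(interaction_signal("robot", "gender", "male"))            # Hello! Sir!
```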

According to the foregoing technical solution, the target recognition method provided by the embodiment of the present application divides the attributes of the target to be tested into multiple attribute types with a priority order, according to the level of detail with which each describes the target. During recognition, the confidence of the judgment result under each attribute type is obtained, and then, according to the actual recognition situation, the judgment result corresponding to the attribute type with the highest priority among the judgment results that satisfy the preset condition is output as the recognition result of the target to be tested. This ensures the reliability of the output recognition result under different recognition scenarios while outputting as detailed a recognition result as possible; that is, the final recognition result strikes a compromise between reliability and level of detail, thereby improving the user experience.

Embodiment 2

Further, in order to improve the efficiency and the level of detail of the target recognition, the second embodiment of the present application further provides another target identification method. In this embodiment, the collected information includes at least two types of information sources.

Specifically, referring to FIG. 2, the method includes but is not limited to:

210. Collect at least two information sources for the target to be tested.

In the present embodiment, the "information source" refers to a source of information capable of reflecting an attribute of the target to be tested. The “at least two information sources” may be at least two different types of information, such as any two or more of image information, sound information, thermal infrared images, near-infrared images, ultrasonic signals, or electromagnetic reflection signals. Alternatively, the “at least two information sources” may be one type of information collected from at least two angles or at at least two moments; for example, image information (or sound information) of the target to be tested is collected from multiple angles, and the image information (or sound information) collected from each angle serves as one information source. Of course, it can be understood that the “at least two information sources” may also be a combination of the above two forms; for example, the information collected for the target to be tested includes image information collected from multiple angles and sound information collected from one angle.

In this embodiment, the specific implementation manner of collecting each information source may refer to step 110 in the foregoing embodiment 1, and will not be described in detail herein.

220. Output a recognition result of the target to be tested based on the at least two information sources.

In this embodiment, the recognition result of the target to be tested is obtained by means of multi-information fusion.

Specifically, in this embodiment, the specific implementation manner of outputting the recognition result of the object to be tested based on the collected at least two types of information sources may include, but is not limited to, the following three implementation manners:

In the first embodiment, the target to be tested may be identified by means of “sub-mode fusion”, that is, firstly, the sub-recognition results of the object to be tested are acquired based on the collected at least two information sources, and then the sub-identifications are determined according to the sub-identifications. The result outputs the recognition result of the object to be tested. The “sub-recognition result” refers to a recognition result obtained based on only one information source, and each information source corresponds to one sub-recognition result. Therefore, in this embodiment, the sub-recognition result also includes at least two, and each sub-recognition result has a corresponding confidence level for evaluating the reliability of the sub-recognition result.

Specifically, in this embodiment manner, the sub-recognition result corresponding to each information source may be separately obtained by the target identification method (shown in FIG. 1) provided in the first embodiment, and then selected from the sub-recognition results. The most detailed sub-identification result is used as the recognition result of the object to be tested. The level of detail of the sub-recognition result may be determined by the priority of the attribute type corresponding to the sub-recognition result, and the higher the priority of the corresponding attribute type, the higher the level of detail, for example, the obtained sub-recognition result includes “ "person" and "girl", wherein the attribute type corresponding to the child recognition result "person" is "whether it is a person", the attribute type of the child recognition result "girl" is "gender", and the priority of "gender" is higher than " If the person is a person, the sub-recognition result "girl" is higher in detail than the sub-recognition result "person", so that the sub-recognition result "girl" can be used as the recognition result of the object to be tested.

For example, if the collected information includes the image information and the sound information, the steps 110 to 120 in the first embodiment may be performed based on the collected image information; and the first embodiment is executed based on the collected sound information. Steps 110 to 120. It is assumed that the sub-recognition result output based on the acquired image information is “person”, and the sub-recognition result output based on the collected sound information is “Li Si”, and the sub-recognition result “Li Si” with higher level of detail can be output. "As the result of the identification of the target to be tested.

In addition, in practical applications, there may be cases where the obtained sub-recognition result with the highest degree of detail is included, and there are contradictions between the plurality of sub-recognition results, for example, the sub-recognition result with the highest level of detail obtained. Including "boys" and "girls", the sub-recognition results "boys" and "girls" correspond to the attribute types are "gender", but only one recognition result can be output under the same attribute type. At this time, the sub-recognition result with the highest degree of confidence can be selected from the sub-identification result with the highest degree of detail as the recognition result of the object to be tested, for example, the confidence of the sub-recognition result “boy” is 70%, and the sub-recognition result “ The confidence of the girl is 90% (>70%), so that the sub-recognition result “girl” can be selected as the target of the object to be tested. result.

In this implementation, the recognition result of the object to be tested is generated by fusing the sub-recognition results obtained from the at least two information sources, which can further improve the level of detail of the target recognition.

In a second implementation, the target to be tested may be identified by means of "hierarchical decision fusion". That is, based on each of the at least two information sources separately, the sub-judgment result corresponding to each attribute type of the target to be tested and the sub-confidence of each sub-judgment result are obtained step by step in order of priority from high to low, until the first sub-judgment result whose sub-confidence satisfies the preset condition appears; that sub-judgment result is then output as the recognition result of the object to be tested.

The "sub-judgment result" refers to a judgment result of the object to be tested under a certain attribute type obtained by analyzing only one information source; each information source corresponds to one sub-judgment result under each attribute type. The "sub-confidence" refers to the degree of credibility of a sub-judgment result and is used to characterize its reliability.

For example, assume that the target to be tested is a person, the attribute types include "person name", "gender" and "whether it is a person", the priority relationship is L1 (person name) > L2 (gender) > L3 (whether it is a person), and the collected information includes image information and sound information. The sub-judgment result corresponding to "person name" and its sub-confidence are first obtained based on the image information and the sound information respectively. Suppose the sub-judgment result obtained based on the image information is "Li Si" and the sub-confidence of "Li Si" satisfies the first preset condition, while the sub-judgment result obtained based on the sound information is "Zhang San" but the sub-confidence of "Zhang San" does not satisfy the second preset condition, so that the "person name" cannot be recognized based on the sound information. In this case, the sub-judgment result "Li Si" is the first sub-judgment result whose sub-confidence satisfies the preset condition (that is, the first preset condition or the second preset condition), and "Li Si" can therefore be output as the recognition result of the object to be tested.

In addition, in some embodiments, when a plurality of sub-confidences satisfy the preset condition at the same time and the corresponding sub-judgment results differ, that is, when there is more than one first sub-judgment result whose sub-confidence satisfies the preset condition, the sub-judgment result with the highest sub-confidence among them may be selected as the recognition result of the target to be tested.
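The hierarchical decision fusion described above can be sketched roughly as below. The sketch assumes that each information source has its own classifier per attribute type and that each preset condition is a simple per-source confidence threshold; the function name, the argument layout, and the threshold form are all illustrative assumptions rather than details given in the application.

```python
def hierarchical_decision_fusion(attribute_order, classifiers, thresholds, inputs):
    """Walk the attribute types from highest to lowest priority.

    attribute_order: attribute types, highest priority first.
    classifiers[source][attribute]: callable(data) -> (label, sub_confidence).
    thresholds[source]: minimum sub-confidence for that source
        (assumed form of the per-source preset condition).
    inputs[source]: raw data collected from that information source.

    Returns (attribute, label, sub_confidence) for the first level at which
    at least one source passes its threshold, or None if no level passes.
    """
    for attribute in attribute_order:
        passing = []
        for source, data in inputs.items():
            label, conf = classifiers[source][attribute](data)
            if conf >= thresholds[source]:
                passing.append((conf, label))
        if passing:
            # Several sources may pass with different labels; keep the one
            # with the highest sub-confidence.
            conf, label = max(passing, key=lambda item: item[0])
            return attribute, label, conf
    return None
```

Lower-priority attribute types are never evaluated once a level produces a passing sub-judgment result, which is where the efficiency gain of this approach comes from.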

In this implementation, the target to be tested is hierarchically identified based on each of the at least two information sources separately. As soon as the analysis of any one of the information sources yields an optimal recognition result (ie, the most reliable and detailed recognition result), that optimal recognition result can be output directly, which can improve the efficiency of target recognition.

In a third implementation, the target to be tested may be identified by means of a "hierarchical fusion decision". That is, in order of priority from high to low, the feature corresponding to each attribute type of the target to be tested is extracted step by step from the at least two information sources, and the judgment result corresponding to each attribute type and the confidence of that judgment result are obtained according to the feature corresponding to that attribute type, until the first judgment result whose confidence satisfies the preset condition appears; that judgment result is then output as the recognition result of the object to be tested.

For example, assume that the target to be tested is a person, the attribute types include "person name", "gender" and "whether it is a person", the priority relationship is L1 (person name) > L2 (gender) > L3 (whether it is a person), and the collected information includes image information and sound information. A first-level feature A1 for identifying the "person name" may first be extracted from the collected image information, and a first-level feature A2 for identifying the "person name" may be extracted from the sound information. The two first-level features A1 and A2 are then fused (for example, A1 and A2 are spliced together by a neural network that combines the two types of features) to generate feature A, and the first-level judgment result for "person name" and the first-level confidence of that judgment result are obtained according to feature A. If the first-level confidence satisfies the first-level preset condition, the first-level judgment result is output. Otherwise, a second-level feature B1 for identifying "gender" is extracted from the collected image information and a second-level feature B2 for identifying "gender" is extracted from the sound information; B1 and B2 are fused to generate feature B, and the second-level judgment result for "gender" and the second-level confidence of that judgment result are obtained according to feature B. If the second-level confidence satisfies the second-level preset condition, the second-level judgment result is output; otherwise, the judgment result corresponding to the attribute type of the next level and its confidence are obtained, and so on, until a judgment result whose confidence satisfies the preset condition is obtained.
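The level-by-level feature fusion in this example can be sketched as follows. Plain list concatenation stands in for the fusion step, and every extractor and classifier is a caller-supplied function; the `levels` layout, the per-level numeric threshold, and the function name are assumptions made for illustration and are not specified in the application.

```python
def hierarchical_feature_fusion(levels, inputs):
    """Fuse per-source features level by level, highest priority first.

    levels: list of dicts ordered from highest to lowest priority, each with
        "attribute":  name of the attribute type,
        "extractors": {source: callable(data) -> list of floats},
        "classify":   callable(fused_feature) -> (label, confidence),
        "threshold":  minimum confidence (assumed form of the preset condition).
    inputs: {source: raw data collected from that information source}.

    Returns (attribute, label, confidence) for the first level whose
    confidence reaches its threshold, or None if no level does.
    """
    for level in levels:
        fused = []
        for source, extract in level["extractors"].items():
            # Concatenation is used here as the simplest possible fusion.
            fused.extend(extract(inputs[source]))
        label, confidence = level["classify"](fused)
        if confidence >= level["threshold"]:
            return level["attribute"], label, confidence
    return None
```

Unlike the decision-level variant, each level here produces a single judgment from the combined feature (A from A1 and A2, B from B1 and B2), so there is no need to reconcile conflicting per-source results.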

In this implementation, hierarchically fusing the features collected from at least two information sources enriches the information on which the target recognition decision is based, which can improve not only the level of detail of target recognition but also its efficiency.

According to the foregoing technical solution, the target recognition method provided by the embodiment of the present application can improve the level of detail and the efficiency of target recognition by collecting at least two types of information sources and outputting the recognition result of the object to be tested according to the at least two information sources.

Embodiment 3

FIG. 3 is a schematic structural diagram of a target recognition apparatus according to an embodiment of the present application. Referring to FIG. 3, the target identification apparatus 3 includes:

The information collecting unit 31 is configured to collect information about a target to be tested, where the object to be tested includes at least two types of attributes, and a priority relationship is set between the at least two types of attributes;

The identification unit 32 is configured to output a recognition result of the object to be tested based on the information, where the recognition result is a determination result corresponding to one of the attribute types, the confidence of the determination result satisfies a preset condition, and The attribute type corresponding to the recognition result has the highest priority among the attribute types corresponding to the determination result that the confidence degree satisfies the preset condition.

In the present embodiment, when the information collecting unit 31 collects information for the object to be tested, the recognition unit 32 outputs the recognition result of the object to be tested based on the information. The object to be tested includes at least two attribute types, and a priority relationship is set between the at least two attribute types; the recognition result is a determination result corresponding to one of the attribute types, and the determination result is The confidence level satisfies the preset condition, and the attribute type corresponding to the recognition result has the highest priority among the attribute types corresponding to the determination result that the confidence degree satisfies the preset condition.

In some embodiments, the identifying unit 32 is specifically configured to: obtain, according to the information, a determination result corresponding to each attribute type of the object to be tested and a confidence level of each determination result; and output, as the recognition result of the object to be tested, the determination result corresponding to the attribute type having the highest priority among the determination results whose confidence satisfies the preset condition.
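As a rough sketch of this "evaluate every attribute type, then pick" behaviour: given a determination result and confidence for each attribute type, the output is the passing result whose attribute type has the highest priority. The function name, the dictionary layout, and the use of a numeric threshold per attribute type as the preset condition are assumptions for illustration only.

```python
def select_recognition_result(judgments, priority, thresholds):
    """Pick the highest-priority determination result that is reliable enough.

    judgments:  {attribute: (label, confidence)} for every attribute type.
    priority:   attribute types ordered from highest to lowest priority.
    thresholds: {attribute: minimum confidence} (assumed preset conditions).

    Returns (attribute, label), or None if no determination result passes.
    """
    for attribute in priority:
        if attribute not in judgments:
            continue
        label, confidence = judgments[attribute]
        if confidence >= thresholds[attribute]:
            return attribute, label
    return None
```

In contrast to the step-by-step variants, all determination results are assumed to be available before selection, so the loop only filters and ranks them.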

In some embodiments, the identification unit 32 includes an analysis module 321 and an output module 322. The analysis module 321 is configured to obtain, based on the information and step by step in order of priority from high to low, the determination result corresponding to each attribute type of the object to be tested and the confidence of each determination result. The output module 322 is configured to output, when the first determination result whose confidence satisfies the preset condition appears, that determination result as the recognition result of the object to be tested. Further, in other embodiments, if the information collected by the information collection unit 31 includes at least two types of information sources, the analysis module 321 is specifically configured to: extract, step by step in order of priority from high to low, the feature corresponding to each attribute type of the object to be tested from the at least two information sources, and obtain the determination result corresponding to each attribute type and the confidence of that determination result according to the feature corresponding to that attribute type.

In addition, in some embodiments, when the information collected by the information collecting unit 31 includes at least two types of information sources, the identifying unit 32 is specifically configured to: obtain, based on each of the at least two information sources separately and step by step in order of priority from high to low, the sub-judgment result corresponding to each attribute type of the object to be tested and the sub-confidence of each sub-judgment result, until the first sub-judgment result whose sub-confidence satisfies the preset condition appears, and output that sub-judgment result as the recognition result of the object to be tested; or, obtain sub-recognition results of the object to be tested based on the at least two information sources respectively, where each information source corresponds to one sub-recognition result, and output the recognition result of the object to be tested according to the sub-recognition results.

Moreover, in some embodiments, the target recognition device 3 further includes:

The interaction unit 33 is configured to send an interaction signal corresponding to the recognition result.
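The division of the apparatus into units can be pictured with a small sketch in which each unit is a caller-supplied function. The class name and method name are hypothetical, and the comments only indicate which unit each callable stands in for; the application does not prescribe any particular software structure.

```python
class TargetRecognitionDevice:
    """Wires together stand-ins for units 31, 32 and (optionally) 33."""

    def __init__(self, collect, recognize, interact=None):
        self.collect = collect      # stands in for information collecting unit 31
        self.recognize = recognize  # stands in for recognition unit 32
        self.interact = interact    # stands in for optional interaction unit 33

    def run(self):
        """Collect information, recognize the target, then send the interaction signal."""
        info = self.collect()
        result = self.recognize(info)
        if self.interact is not None and result is not None:
            self.interact(result)
        return result
```

For instance, passing `sent.append` as the `interact` callable records each recognition result in a list `sent`, which is enough to check the wiring without any real sensor or output hardware.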

It should be noted that, since the target recognition device and the target recognition method in the foregoing method embodiments are based on the same inventive concept, the corresponding content and the beneficial effects of the foregoing method embodiments are also applicable to the device embodiment and are not described in detail here.

According to the foregoing technical solution, the target recognition apparatus provided by the embodiment of the present application divides the attributes of the object to be tested into attribute types with a priority order, according to how detailed a description of the object to be tested each attribute provides. During recognition, the judgment result under each attribute type and its confidence are obtained, and then, according to the actual recognition situation, the judgment result corresponding to the attribute type with the highest priority among the judgment results whose confidence satisfies the preset condition is output as the recognition result of the object to be tested. This ensures the reliability of the output recognition result under different recognition scenarios while outputting as detailed a recognition result as possible; that is, the final recognition result strikes a compromise between reliability and level of detail, which improves the user experience.

Embodiment 4

FIG. 4 is a schematic diagram of the hardware structure of an intelligent terminal according to an embodiment of the present application. The smart terminal 400 can be any type of smart terminal, such as a robot, guide glasses for the blind, a smart helmet, a smart phone, a tablet, or a server, and can perform the target recognition method provided by the first embodiment and/or the second embodiment.

Specifically, referring to FIG. 4, the smart terminal 400 includes:

One or more processors 401 and a memory 402; one processor 401 is taken as an example in FIG. 4.

The processor 401 and the memory 402 can be connected by a bus or by other means; connection by a bus is taken as an example in FIG. 4.

The memory 402, as a non-transitory computer readable storage medium, can be used for storing non-transitory software programs, non-transitory computer executable programs, and modules, such as the program instructions/modules corresponding to the target recognition method in the embodiment of the present application (for example, the information collecting unit 31, the identifying unit 32, and the interaction unit 33 shown in FIG. 3). The processor 401 performs the various functional applications and data processing of the target recognition device by running the non-transitory software programs, instructions, and modules stored in the memory 402, that is, it implements the target recognition method of any of the above method embodiments.

The memory 402 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application required for at least one function; the storage data area may store data created according to usage of the target identification device, and the like. Moreover, memory 402 can include high speed random access memory, and can also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 402 can optionally include memory remotely located relative to processor 401, which can be connected to smart terminal 400 over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The one or more modules are stored in the memory 402 and, when executed by the one or more processors 401, perform the target recognition method in any of the above method embodiments, for example, performing method steps 110 to 120 in FIG. 1 and method steps 210 to 220 in FIG. 2 described above, and implementing the functions of units 31-33 in FIG. 3.

Embodiment 5

The embodiment of the present application further provides a non-transitory computer readable storage medium. The non-transitory computer readable storage medium stores computer executable instructions that, when executed by one or more processors, for example by one processor 401 in FIG. 4, cause the one or more processors to perform the target recognition method in any of the above method embodiments, for example, to perform method steps 110 to 120 in FIG. 1 and method steps 210 to 220 in FIG. 2 described above, and to implement the functions of units 31-33 in FIG. 3.

The device embodiments described above are merely illustrative, wherein the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, ie they may be located in one place or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.

Through the description of the above embodiments, those skilled in the art can clearly understand that the various embodiments can be implemented by means of software plus a general hardware platform, and of course, by hardware. A person skilled in the art can understand that all or part of the process of implementing the above embodiments can be completed by a computer program to instruct related hardware, and the program can be stored in a non-transitory computer readable storage medium. The program, when executed, may include the flow of an embodiment of the methods as described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).

The above product can perform the target recognition method provided by the embodiment of the present application, and has the corresponding functional modules and beneficial effects of performing the target recognition method. For the technical details that are not described in detail in this embodiment, refer to the object recognition method provided by the embodiment of the present application.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application and are not intended to limit them. Within the idea of the present application, the technical features in the above embodiments or in different embodiments may also be combined, the steps may be carried out in any order, and there are many other variations of the different aspects of the present application as described above, which are not provided in detail for the sake of brevity. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of the technical features therein may be equivalently replaced, and that such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims

A target recognition method, applied to an intelligent terminal, characterized by comprising:

collecting information for the object to be tested, wherein the object to be tested includes at least two attribute types, and a priority relationship is set between the at least two attribute types;

and outputting, based on the information, the recognition result of the object to be tested, wherein the recognition result is a determination result corresponding to one of the attribute types, the confidence of the determination result satisfies a preset condition, and the attribute type corresponding to the recognition result has the highest priority among the attribute types corresponding to the determination results whose confidence satisfies the preset condition.
The object recognition method according to claim 1, wherein the outputting the recognition result of the object to be tested based on the information comprises:

Obtaining, according to the information, a determination result corresponding to each attribute type of the object to be tested and a confidence level of each determination result;

The determination result corresponding to the attribute type having the highest priority among the determination results satisfying the preset condition is used as the recognition result of the object to be tested.
The object recognition method according to claim 1, wherein the outputting the recognition result of the object to be tested based on the information comprises:

Obtaining, based on the information and step by step in order of priority from high to low, the determination result corresponding to each attribute type of the object to be tested and the confidence of each determination result, until the first determination result whose confidence satisfies the preset condition appears, and outputting that determination result as the recognition result of the object to be tested.
The object recognition method according to claim 3, wherein the information includes at least two types of information sources, and the obtaining, based on the information and step by step in order of priority from high to low, the determination result corresponding to each attribute type of the object to be tested and the confidence of each determination result comprises:

Extracting, step by step in order of priority from high to low, the feature corresponding to each attribute type of the object to be tested, and obtaining the determination result corresponding to each attribute type and the confidence of that determination result according to the feature corresponding to that attribute type.
The object recognition method according to claim 1, wherein the information includes at least two types of information sources, and the outputting the recognition result of the object to be tested based on the information includes:

Obtaining, according to the at least two information sources, the sub-judgment result corresponding to each attribute type of the object to be tested and the sub-confidence of each sub-judgment result, according to the order of priority from high to low, respectively, until the first When the sub-judgment result whose sub-confidence satisfies the preset condition appears, the sub-judgment result whose first sub-confidence satisfies the preset condition is output as the recognition result of the object to be tested.
The object recognition method according to claim 1, wherein the information includes at least two types of information sources, and the outputting the recognition result of the object to be tested based on the information includes:

Obtaining a sub-recognition result of the object to be tested, respectively, based on the at least two information sources, where each information source corresponds to one sub-recognition result;

And outputting the recognition result of the object to be tested according to the sub-identification result.
The object recognition method according to any one of claims 1 to 6, wherein the method further comprises:

An interaction signal corresponding to the recognition result is transmitted.
A target recognition device is applied to a smart terminal, and includes:

An information collecting unit, configured to collect information about a target to be tested, where the object to be tested includes at least two types of attributes, and a priority relationship is set between the at least two types of attributes;

a recognition unit, configured to output a recognition result of the object to be tested based on the information, where the recognition result is a determination result corresponding to one of the attribute types, the confidence of the determination result satisfies a preset condition, and The attribute type corresponding to the recognition result has the highest priority among the attribute types corresponding to the determination result that the confidence degree satisfies the preset condition.
The object recognition device according to claim 8, wherein the identification unit is specifically configured to:

Obtaining, based on the information, a determination result corresponding to each attribute type of the object to be tested and a confidence of each determination result;

The determination result corresponding to the attribute type having the highest priority among the determination results satisfying the preset condition is used as the recognition result of the object to be tested.
The object recognition device according to claim 8, wherein the identification unit comprises:

An analysis module, configured to obtain, according to the information, a judgment result corresponding to each attribute type of the object to be tested and a confidence level of each determination result according to a priority from high to low;

And an output module, configured to output, when the first determination result whose confidence satisfies the preset condition appears, that determination result as the recognition result of the object to be tested.
The target recognition device according to claim 10, wherein the information comprises at least two types of information sources, and the analysis module is specifically configured to:

Extracting, step by step in order of priority from high to low, the feature corresponding to each attribute type of the object to be tested, and obtaining the determination result corresponding to each attribute type and the confidence of that determination result according to the feature corresponding to that attribute type.
The object recognition apparatus according to claim 8, wherein the information includes at least two types of information sources, and the identification unit is specifically configured to:

Obtaining, according to the at least two information sources, the sub-judgment result corresponding to each attribute type of the object to be tested and the sub-confidence of each sub-judgment result, according to the order of priority from high to low, respectively, until the first When the sub-judgment result whose sub-confidence satisfies the preset condition appears, the sub-judgment result whose first sub-confidence satisfies the preset condition is output as the recognition result of the object to be tested.
The object recognition apparatus according to claim 8, wherein the information includes at least two types of information sources, and the identification unit is specifically configured to:

Obtaining a sub-recognition result of the object to be tested, respectively, based on the at least two information sources, where each information source corresponds to one sub-recognition result;

And outputting the recognition result of the object to be tested according to the sub-identification result.
The object recognition device according to any one of claims 8 to 13, wherein the object recognition device further comprises:

And an interaction unit, configured to send an interaction signal corresponding to the recognition result.
An intelligent terminal, comprising:

At least one processor; and,

a memory communicatively coupled to the at least one processor; wherein

The memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor to enable the at least one processor to perform the target recognition method of any one of claims 1-7.
A non-transitory computer readable storage medium, characterized in that the non-transitory computer readable storage medium stores computer executable instructions for causing a smart terminal to perform the target recognition method according to any one of claims 1-7.
A computer program product, comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions that, when executed by a smart terminal, cause the smart terminal to perform the object recognition method according to any one of claims 1-7.