CN112085063A - Target identification method and device, terminal equipment and storage medium


Info

Publication number
CN112085063A
CN112085063A
Authority
CN
China
Prior art keywords
target object
target
classification label
neural network
network model
Prior art date
Legal status
Granted
Application number
CN202010797205.2A
Other languages
Chinese (zh)
Other versions
CN112085063B (en)
Inventor
李扬
程骏
庞建新
Current Assignee
Beijing Youbixuan Intelligent Robot Co ltd
Original Assignee
Ubtech Robotics Corp
Priority date
Filing date
Publication date
Application filed by Ubtech Robotics Corp
Priority to CN202010797205.2A
Publication of CN112085063A
Application granted
Publication of CN112085063B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Abstract

The application is applicable to the technical field of machine vision, and provides a target identification method, apparatus, terminal device and storage medium, wherein the method comprises the following steps: detecting a target image through a first neural network model to obtain a target object in the target image and a first classification label of the target object; classifying the target object through a second neural network model to obtain a second classification label of the target object; and determining the category of the target object according to the first classification label and the second classification label. According to the embodiments of the application, the first neural network model detects the target object and classifies it in terms of the first classification label, while the second neural network model only needs to classify the target object in terms of the second classification label, so that the classification burden is balanced between the two models and the target identification efficiency is improved.

Description

Target identification method and device, terminal equipment and storage medium
Technical Field
The present application belongs to the technical field of machine vision, and in particular, to a target identification method, apparatus, terminal device, and storage medium.
Background
With the development of machine vision technology, object recognition based on machine vision (such as vehicle type recognition and garbage type recognition) has received increasing attention. Object recognition means detecting an object in an image or video and classifying it to determine which category it belongs to.
Deep learning is a relatively new direction in machine vision, and in recent years target recognition based on deep learning has been widely applied because of its high precision. At present, however, deep-learning-based target recognition mainly detects a target first and then classifies the detected target with a single classifier. When too many classes need to be distinguished, this classifier is overloaded, resulting in inefficient recognition.
Disclosure of Invention
The embodiments of the application provide a target identification method, a target identification apparatus, a terminal device and a storage medium, aiming to solve the problem that existing target identification is inefficient.
In a first aspect, an embodiment of the present application provides a target identification method, including:
detecting a target image through a first neural network model to obtain a target object in the target image and a first classification label of the target object;
classifying the target object through a second neural network model to obtain a second classification label of the target object;
and determining the category of the target object according to the first classification label and the second classification label.
In a second aspect, an embodiment of the present application provides an object recognition apparatus, including:
the first obtaining module is used for detecting a target image through a first neural network model to obtain a target object in the target image and a first classification label of the target object;
a second obtaining module, configured to classify the target object through a second neural network model, and obtain a second classification label of the target object;
and the determining module is used for determining the category of the target object according to the first classification label and the second classification label.
In a third aspect, an embodiment of the present application provides a terminal device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the object identification method when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium, which stores a computer program, and the computer program, when executed by a processor, implements the steps of the object recognition method.
In a fifth aspect, embodiments of the present application provide a computer program product, which, when run on an electronic device, causes the electronic device to perform the steps of the above object recognition method.
Compared with the prior art, the embodiments of the application have the following advantages: a target image is detected through a first neural network model to obtain a target object in the target image and a first classification label of the target object; the target object is classified through a second neural network model to obtain a second classification label of the target object; and the category of the target object is determined according to the first classification label and the second classification label. The first neural network model detects the target object and classifies it in terms of the first classification label, and the second neural network model only needs to classify the target object in terms of the second classification label, so that the classification burden is balanced between the two models and the target identification efficiency is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a target identification method according to an embodiment of the present application;
fig. 2 is a schematic specific flowchart of step S103 according to an embodiment of the present application;
fig. 3 is a schematic diagram of a pre-constructed matrix in a specific application scenario according to an embodiment of the present application;
FIG. 4 is a diagram illustrating an output result in a specific application scenario according to an embodiment of the present application;
FIG. 5 is a schematic diagram of an object recognition device according to another embodiment of the present application;
fig. 6 is a schematic structural diagram of a terminal device according to another embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon", "in response to determining" or "in response to detecting". Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted contextually to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]" or "in response to detecting [the described condition or event]".
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
The target identification method provided by the embodiment of the application can be applied to terminal devices such as a robot, a mobile phone, a tablet computer, a wearable device, a vehicle-mounted device, an Augmented Reality (AR)/Virtual Reality (VR) device, a notebook computer, a super-mobile personal computer (UMPC), a netbook, and a Personal Digital Assistant (PDA), and the embodiment of the application does not limit the specific types of the terminal devices at all.
In order to explain the technical means described in the present application, the following examples are given below.
Example one
Referring to fig. 1, a target identification method provided in an embodiment of the present application includes:
step S101, detecting a target image through a first neural network model, and obtaining a target object in the target image and a first classification label of the target object.
Specifically, the first neural network model is a pre-constructed and trained neural network model, and is configured to detect whether the target object exists in the target image, and output a first classification tag of the target object when the target object exists.
In one particular application scenario, the target object may be a vehicle, and the first classification label indicates the size of the vehicle, which may take several attribute values such as large-size vehicle, medium-size vehicle and small-size vehicle. The first neural network model outputs the size attribute of the vehicle when a vehicle is detected in the target image. In practical applications, the first classification label may be one or more other classification labels indicating characteristics of the target object; for example, if the target object to be detected is a vehicle, the first classification label may instead be the brand of the vehicle. These application scenarios are only examples; the method may also be applied to garbage recognition, face recognition and the like, which are not limited here.
The first neural network model may be a neural network model constructed in advance from a lightweight network. A large number of images containing the target object are prepared according to the target object to be recognized; for example, if the target object to be recognized is a vehicle, pictures containing various types of vehicles are prepared, and a first classification label is applied to each picture. The first neural network model is trained on this large number of images containing the target object until its preset loss function converges, at which point the first neural network model is considered trained. The preset loss function of the first neural network model may be, for example, a cross-entropy loss function or a mean-square-error loss function. A training sketch is given below.
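As a rough illustration of the training procedure just described, the following sketch drives a cross-entropy loss until a simple convergence check is met. It is only an assumption of how such training might be coded: PyTorch is one possible framework, the model, data loader and label encoding are hypothetical, and only the classification branch of the first-stage model is shown (a real detector would also learn to output the coordinates of the target object).

```python
import torch
from torch import nn

def train_first_stage(model: nn.Module, loader, epochs: int = 50, lr: float = 1e-3, tol: float = 1e-4):
    """Train a hypothetical first-stage model until its preset loss function converges.

    `loader` is assumed to yield (image_batch, first_label_batch) pairs, where the labels
    are integer indices of the first classification label
    (e.g. 0 = small-size vehicle, 1 = medium-size vehicle, 2 = large-size vehicle).
    """
    criterion = nn.CrossEntropyLoss()                       # the "preset loss function" of this sketch
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    previous = float("inf")
    for _ in range(epochs):
        total = 0.0
        for images, first_labels in loader:
            optimizer.zero_grad()
            logits = model(images)                          # (batch, number of first labels)
            loss = criterion(logits, first_labels)
            loss.backward()
            optimizer.step()
            total += loss.item()
        if abs(previous - total) < tol:                     # crude stand-in for "loss has converged"
            break
        previous = total
    return model
```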
In one embodiment, the detecting a target image through a first neural network model to obtain a target object in the target image and a first classification label of the target object includes: detecting a target image through a first neural network model, and when a target object in the target image is detected, obtaining the coordinates of the target object in the target image and a first classification label of the target object. Specifically, when the target object is detected, the position information of the target object in the image and the first classification tag of the target object may be output.
And S102, classifying the target object through a second neural network model to obtain a second classification label of the target object.
Specifically, the second neural network model is a pre-constructed and trained neural network model, and is used for classifying the target object detected by the first neural network model, and the output result of the second neural network model is a second classification label of the target object.
In a specific application scenario, the target object may be a vehicle, the first classification label indicates the size of the vehicle, and the second classification label indicates the model of the vehicle, where the model may take several attribute values such as car, coach and van. The second neural network model outputs the model attribute of the vehicle. In practical applications, the second classification label may be one or more other classification labels indicating characteristics of the target object; the first classification label and the second classification label classify the target object along different feature dimensions. The application scenario and the two feature dimensions above are only examples and are not limiting; the method may equally be applied to scenarios such as garbage recognition and face recognition.
The second neural network model may likewise be a neural network model constructed in advance from a lightweight network. A large number of target object images are prepared according to the target object to be identified; for example, if the target object to be identified is a vehicle, pictures of various types of vehicles are prepared, and each picture is annotated with a second classification label. The second neural network model is trained on this large number of target object images until its preset loss function converges, at which point the second neural network model is considered trained. The preset loss function of the second neural network model may be a cross-entropy loss function, a mean-square-error loss function or another type of loss function. Its use on a single extracted target object is sketched below.
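The following is only a sketch of how a trained second-stage classifier might be applied to one extracted target object; the model, the preprocessing of the crop and the label list are assumptions for illustration, not part of this disclosure.

```python
import torch

@torch.no_grad()
def classify_target(second_model, crop_tensor, second_labels=("car", "coach", "van")):
    """Return the second classification label for one extracted target object.

    `crop_tensor` is assumed to be a preprocessed image tensor of shape (1, C, H, W),
    and `second_labels` is an illustrative listing of the second classification dimension.
    """
    logits = second_model(crop_tensor)               # (1, number of second labels)
    index = int(torch.argmax(logits, dim=1).item())
    return second_labels[index]
```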
In one embodiment, before the classifying the target object by the second neural network model and obtaining the second classification label of the target object, the method includes: and extracting the target object according to the coordinates of the target object in the target image. Specifically, when the position of the target object in the target image is determined through the first neural network model, the target object is extracted according to the position in the target image, and the extracted target object is input into the second neural network model, so that the second neural network model only needs to classify the target object and does not need to detect, and the classification efficiency can be improved.
In one embodiment, the extracting the target object according to the coordinates of the target object in the target image includes: determining a corresponding rectangular frame according to the coordinates of the target object in the target image; and extracting the target object according to the coordinates of the rectangular frame in the target image. In a specific application, because the picture input to a neural network is generally of a regular shape, a rectangular frame containing the target object is drawn in the target image according to the coordinates of the target object. Specifically, the upper side of the rectangular frame is set a first preset number of pixels above the uppermost pixel of the target object in the target image; the lower side is set a second preset number of pixels below the lowermost pixel of the target object; the left side is set a third preset number of pixels to the left of the leftmost pixel of the target object; and the right side is set a fourth preset number of pixels to the right of the rightmost pixel of the target object. After the rectangular frame is determined, the target image is cropped according to the coordinates of the rectangular frame to extract a regular image of the target object, as sketched below.
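The sketch below is one possible realization of this extraction, assuming the target image is held as a NumPy array; the function name, the padding defaults standing in for the four preset pixel numbers, and the coordinate convention are illustrative assumptions.

```python
import numpy as np

def extract_target(image: np.ndarray, top: int, bottom: int, left: int, right: int,
                   pad: tuple = (4, 4, 4, 4)) -> np.ndarray:
    """Cut a regular rectangular image of the target object out of the target image.

    top/bottom/left/right are the extremal pixel coordinates of the target object;
    pad holds the first to fourth preset pixel numbers (top, bottom, left, right).
    """
    pad_top, pad_bottom, pad_left, pad_right = pad
    height, width = image.shape[:2]
    y0 = max(top - pad_top, 0)                    # upper side of the rectangular frame
    y1 = min(bottom + pad_bottom, height - 1)     # lower side
    x0 = max(left - pad_left, 0)                  # left side
    x1 = min(right + pad_right, width - 1)        # right side
    return image[y0:y1 + 1, x0:x1 + 1]            # crop according to the frame coordinates
```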
Step S103, determining the category of the target object according to the first classification label and the second classification label.
Specifically, the category of the target object is determined according to the first classification label and the second classification label. For example, if the first classification label of the target object is determined to be a large-size vehicle through the first neural network and the second classification label is determined to be a passenger car through the second neural network, the category of the target object is determined to be a large-size passenger car. If the first classification label is determined to be a small-size vehicle and the second classification label to be a passenger car, the category of the target object is determined to be a small-size passenger car.
In one embodiment, before inputting the target image into the first neural network model for target detection, the method comprises: constructing a matrix containing M × M elements, wherein M is an integer greater than or equal to 2; storing category information of one class of objects in each element of the matrix; establishing and storing an association between each element in the ith column of the matrix and the first classification label of the ith attribute, wherein i is greater than or equal to 1 and less than or equal to M, and i is an integer; and establishing and storing an association between each element in the ith row of the matrix and the second classification label of the ith attribute. The ith column is any column of the matrix, and the ith row is any row of the matrix.
In practical applications, the first classification label and the second classification label describe features of the target object along two different dimensions, and the category of the target object may not be obtainable directly from these two features. When N categories need to be distinguished (N equal to M × M), a matrix of M × M elements can therefore be established in advance, with category information of one category of objects stored in each element of the matrix; all elements in the same column of the matrix correspond to the same first classification label, and elements in different columns correspond to different first classification labels; all elements in the same row of the matrix correspond to the same second classification label, and elements in different rows correspond to different second classification labels. A minimal sketch of such a matrix is given below.
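A minimal sketch of such a pre-constructed matrix, using the vehicle example of this embodiment with M = 3; the concrete label names and the category strings stored in the cells are assumptions for illustration only.

```python
# Column i is associated with the i-th first classification label,
# row i with the i-th second classification label.
first_labels = ["small-size", "medium-size", "large-size"]   # columns
second_labels = ["truck", "bus", "car"]                       # rows

# M x M matrix of category information; None would mark a cell in which no
# category is stored (used when N is not a perfect square).
category_matrix = [
    ["small truck", "medium truck", "large truck"],   # row 1: trucks
    ["minibus",     "medium bus",   "large bus"],     # row 2: buses
    ["sedan",       "medium car",   "large car"],     # row 3: cars
]

column_of = {label: i for i, label in enumerate(first_labels)}   # first label  -> column coordinate
row_of = {label: i for i, label in enumerate(second_labels)}     # second label -> row coordinate
```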
In an embodiment, as shown in fig. 2, step S103 specifically includes steps S1031 to S1034:
step S1031, determining column coordinates of the target object in the matrix according to the first classification label;
specifically, because different column elements in the matrix correspond to different first classification tags, which column of the matrix the target object belongs to can be determined according to the first classification tags output by the first neural network.
Step S1032, determining the row coordinate of the target object in the matrix according to the second classification label;
specifically, different row elements in the matrix correspond to different second classification labels, and the target object is determined to which row in the matrix the target object belongs according to the second classification label output by the second neural network.
Step S1033, determining the coordinate of the target object in the matrix according to the row coordinate and the column coordinate;
specifically, the coordinate position of the target object in the matrix can be determined according to the row coordinate and the column coordinate to which the target object belongs.
Step S1034, determining the category of the target object according to the coordinates of the target object in the matrix.
Specifically, the category information of the object stored corresponding to the coordinate position of the target object in the matrix is obtained.
For better understanding of the embodiment of the present application, please refer to fig. 3, which is a schematic diagram of a matrix in a specific application scenario. For example, the matrix of M × M elements is a 3 × 3 matrix, and each element stores corresponding category information. All elements in the same column of the matrix correspond to the same first classification label (e.g., all elements in the first column correspond to small-size vehicles and all elements in the second column to medium-size vehicles), elements in different columns correspond to different first classification labels, all elements in the same row correspond to the same second classification label (e.g., all elements in the first row correspond to trucks and all elements in the second row to buses), and elements in different rows correspond to different second classification labels. When the first classification label output by the first neural network is a small-size vehicle, the column coordinate of the target object in the matrix is determined to be the first column; when the second classification label output by the second neural network is a car, the row coordinate of the target object in the matrix is determined to be the third row. From the column coordinate and the row coordinate it can be determined that the target object belongs to the third row and first column of the matrix, and from the category information stored there (sedan) the category of the target object is determined to be a sedan. Therefore, when N categories need to be distinguished (N equal to M × M), each neural network model only needs to classify M classes, and M is smaller than N; in particular, when the number of categories is large, M is far smaller than N, and a neural network with fewer classes achieves a better classification effect, so the target identification efficiency can be improved. When the number N of categories to be classified is not a perfect square, one or more elements of the established matrix may store no category information of any target object; when an object is identified as belonging to a matrix coordinate in which no category information is stored, a recognition error is output as the result, as in the lookup sketch below. The examples are only for ease of understanding; the classification is made according to the practical application and is not limited here.
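Continuing the sketch above (same assumed labels and matrix), steps S1031 to S1034 then reduce to a single table lookup, with an unmapped cell reported as a recognition error:

```python
def determine_category(first_label: str, second_label: str) -> str:
    """Steps S1031-S1034: map the two classification labels to a matrix cell and read the category."""
    column = column_of[first_label]            # S1031: column coordinate from the first label
    row = row_of[second_label]                 # S1032: row coordinate from the second label
    category = category_matrix[row][column]    # S1033: coordinates of the target object in the matrix
    return category if category is not None else "recognition error"   # S1034

print(determine_category("small-size", "car"))   # -> "sedan" (third row, first column)
```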
In one embodiment, after determining the category of the target object according to the first classification label and the second classification label, the method comprises: displaying the rectangular frame and the category information of the target object in the target image in a preset manner. For example, the rectangular frame may be displayed in a preset colour, and the category information of the target object may be displayed at a preset distance from the rectangular frame, as in the sketch below. In one application scenario, as shown in fig. 4, which is a schematic diagram of an output result, the target image contains a target vehicle, and the rectangular frame of the target vehicle and its category information (car) are output.
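One possible way to display the rectangular frame and the category information is sketched below; OpenCV is only an example library here, and the colour and text offset stand in for the "preset" colour and distance.

```python
import cv2

def draw_result(image, x0: int, y0: int, x1: int, y1: int, category: str):
    """Draw the rectangular frame in a preset colour and write the category label
    a preset distance above it, similar to the output illustrated by fig. 4."""
    cv2.rectangle(image, (x0, y0), (x1, y1), (0, 255, 0), 2)      # frame in preset colour
    cv2.putText(image, category, (x0, max(y0 - 10, 0)),           # label above the frame
                cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
    return image
```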
According to the embodiments of the application, the first neural network model detects the target object and classifies it in terms of the first classification label, and the second neural network model only needs to classify the target object in terms of the second classification label, so that the classification burden is balanced between the two models and the target identification efficiency is improved.
Fig. 5 shows a block diagram of a target recognition apparatus provided in the embodiment of the present application, corresponding to the target recognition method described in the above embodiment, and only the relevant parts of the embodiment of the present application are shown for convenience of description. Referring to fig. 5, the object recognition apparatus 500 includes:
a first obtaining module 501, configured to detect a target image through a first neural network model, and obtain a target object in the target image and a first classification label of the target object;
a second obtaining module 502, configured to classify the target object through a second neural network model, and obtain a second classification label of the target object;
a determining module 503, configured to determine the category of the target object according to the first classification tag and the second classification tag.
In one embodiment, the object recognition apparatus further comprises:
a building module for building a matrix comprising M x M elements; wherein M is an integer greater than or equal to 2;
the storage module is used for storing the category information of a class of objects in each element of the matrix;
the first establishing module is used for establishing and storing an association between each element in the ith column of the matrix and the first classification label of the ith attribute; wherein i is greater than or equal to 1 and less than or equal to M, and i is an integer;
and the second establishing module is used for establishing and storing an association between each element in the ith row of the matrix and the second classification label of the ith attribute.
In one embodiment, the determining module 503 includes:
the first determining unit is used for determining the column coordinates of the target object in the matrix according to the first classification label;
the second determining unit is used for determining the row coordinates of the target object in the matrix according to the second classification label;
a third determining unit, configured to determine coordinates of the target object in the matrix according to the row coordinates and the column coordinates;
and the fourth determining unit is used for determining the category of the target object according to the coordinates of the target object in the matrix.
In one embodiment, the first obtaining module is specifically configured to:
detecting a target image through a first neural network model, and when a target object in the target image is detected, obtaining the coordinates of the target object in the target image and a first classification label of the target object.
In one embodiment, the object recognition apparatus further comprises:
and the target extracting module is used for extracting the target object according to the coordinates of the target object in the target image before the second obtaining module is triggered.
In one embodiment, the target extraction module is specifically configured to: determining a corresponding rectangular frame according to the coordinates of the target object in the target image; and extracting the target object according to the coordinates of the rectangular frame in the target image.
In one embodiment, the object recognition apparatus further comprises:
and the display module is used for displaying the rectangular frame and the category information of the target object in a preset mode in the target image.
The first neural network model detects the target object and classifies it in terms of the first classification label, and the second neural network model only needs to classify the target object in terms of the second classification label, so that the classification burden is balanced between the two models and the target identification efficiency is improved.
As shown in fig. 6, an embodiment of the present invention further provides a terminal device 600 including: a processor 601, a memory 602 and a computer program 603, such as an object recognition program, stored in the memory 602 and executable on the processor 601. The processor 601, when executing the computer program 603, implements the steps in the various embodiments of the object recognition method described above. The processor 601, when executing the computer program 603, also implements the functions of the modules in the above-described device embodiments, such as the functions of the modules 501 to 503 shown in fig. 5.
Illustratively, the computer program 603 may be partitioned into one or more modules that are stored in the memory 602 and executed by the processor 601 to implement the present invention. The one or more modules may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program 603 in the terminal device 600. For example, the computer program 603 may be divided into a first obtaining module, a second obtaining module and a determining module, and specific functions of the modules are described in the foregoing embodiments, and are not described herein again.
The terminal device 600 may be a robot, a desktop computer, a notebook, a palm computer, or other computing devices. The terminal device may include, but is not limited to, a processor 601, a memory 602. Those skilled in the art will appreciate that fig. 6 is merely an example of a terminal device 600 and does not constitute a limitation of terminal device 600 and may include more or fewer components than shown, or some components may be combined, or different components, e.g., the terminal device may also include input-output devices, network access devices, buses, etc.
The Processor 601 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 602 may be an internal storage unit of the terminal device 600, such as a hard disk or a memory of the terminal device 600. The memory 602 may also be an external storage device of the terminal device 600, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the terminal device 600. Further, the memory 602 may also include both an internal storage unit and an external storage device of the terminal device 600. The memory 602 is used for storing the computer programs and other programs and data required by the terminal device. The memory 602 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated module, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (10)

1. A method of object recognition, comprising:
detecting a target image through a first neural network model to obtain a target object in the target image and a first classification label of the target object;
classifying the target object through a second neural network model to obtain a second classification label of the target object;
and determining the category of the target object according to the first classification label and the second classification label.
2. The target recognition method of claim 1, prior to inputting the target image to the first neural network model for target detection, comprising:
constructing a matrix containing M multiplied by M elements; wherein M is an integer greater than or equal to 2;
storing category information of a class of objects in each element of the matrix;
establishing and storing an association between each element in the ith column of the matrix and the first classification label of the ith attribute; wherein i is greater than or equal to 1 and less than or equal to M, and i is an integer;
and establishing and storing an association between each element in the ith row of the matrix and the second classification label of the ith attribute.
3. The object identifying method of claim 2, wherein determining the class of the object from the first class label and the second class label comprises:
determining column coordinates of the target object in the matrix according to the first classification label;
determining the row coordinates of the target object in the matrix according to the second classification label;
determining the coordinates of the target object in the matrix according to the row coordinates and the column coordinates;
and determining the category of the target object according to the coordinates of the target object in the matrix.
4. The target recognition method of any one of claims 1 to 3, wherein the detecting a target image through a first neural network model to obtain a target object in the target image and a first classification label of the target object comprises:
detecting a target image through a first neural network model, and when a target object in the target image is detected, obtaining the coordinates of the target object in the target image and a first classification label of the target object.
5. The object recognition method of claim 4, wherein before the classifying the object by the second neural network model and obtaining the second classification label of the object, the method comprises:
and extracting the target object according to the coordinates of the target object in the target image.
6. The object recognition method according to claim 5, wherein the extracting the target object according to the coordinates of the target object in the target image comprises:
determining a corresponding rectangular frame according to the coordinates of the target object in the target image;
and extracting the target object according to the coordinates of the rectangular frame in the target image.
7. The object identifying method according to claim 6, comprising, after determining the class of the object from the first class label and the second class label:
and displaying the rectangular frame and the category information of the target object in a preset mode in the target image.
8. An object recognition apparatus, comprising:
the first obtaining module is used for detecting a target image through a first neural network model to obtain a target object in the target image and a first classification label of the target object;
a second obtaining module, configured to classify the target object through a second neural network model, and obtain a second classification label of the target object;
and the determining module is used for determining the category of the target object according to the first classification label and the second classification label.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
CN202010797205.2A 2020-08-10 2020-08-10 Target identification method, device, terminal equipment and storage medium Active CN112085063B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010797205.2A CN112085063B (en) 2020-08-10 2020-08-10 Target identification method, device, terminal equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010797205.2A CN112085063B (en) 2020-08-10 2020-08-10 Target identification method, device, terminal equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112085063A (en) 2020-12-15
CN112085063B CN112085063B (en) 2023-10-13

Family

ID=73736046

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010797205.2A Active CN112085063B (en) 2020-08-10 2020-08-10 Target identification method, device, terminal equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112085063B (en)

Citations (8)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106875424A (en) * 2017-01-16 2017-06-20 西北工业大学 A kind of urban environment driving vehicle Activity recognition method based on machine vision
CN108776819A (en) * 2018-06-05 2018-11-09 Oppo广东移动通信有限公司 A kind of target identification method, mobile terminal and computer readable storage medium
CN109344884A (en) * 2018-09-14 2019-02-15 腾讯科技(深圳)有限公司 The method and device of media information classification method, training picture classification model
WO2020119527A1 (en) * 2018-12-11 2020-06-18 中国科学院深圳先进技术研究院 Human action recognition method and apparatus, and terminal device and storage medium
CN111310775A (en) * 2018-12-11 2020-06-19 Tcl集团股份有限公司 Data training method and device, terminal equipment and computer readable storage medium
CN109766822A (en) * 2019-01-07 2019-05-17 山东大学 Gesture identification method neural network based and system
CN110427802A (en) * 2019-06-18 2019-11-08 平安科技(深圳)有限公司 AU detection method, device, electronic equipment and storage medium
CN111399638A (en) * 2020-02-29 2020-07-10 浙江工业大学 Blind computer and intelligent mobile phone auxiliary control method adapted to same

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李良福 (Li Liangfu) et al.: "基于深度学习的桥梁裂缝检测算法研究" [Research on bridge crack detection algorithm based on deep learning], 《自动化学报》 (Acta Automatica Sinica), vol. 45, no. 9, pages 1727-1742 *

Also Published As

Publication number Publication date
CN112085063B (en) 2023-10-13

Similar Documents

Publication Publication Date Title
CN110197146B (en) Face image analysis method based on deep learning, electronic device and storage medium
CN107944450B (en) License plate recognition method and device
CN110751037A (en) Method for recognizing color of vehicle body and terminal equipment
CN112084856A (en) Face posture detection method and device, terminal equipment and storage medium
CN111080660A (en) Image segmentation method and device, terminal equipment and storage medium
CN111191582B (en) Three-dimensional target detection method, detection device, terminal device and computer readable storage medium
CN110852311A (en) Three-dimensional human hand key point positioning method and device
CN112348778B (en) Object identification method, device, terminal equipment and storage medium
CN111309825A (en) Data clustering storage method and device and computer readable storage medium
CN111127516A (en) Target detection and tracking method and system without search box
CN114970705A (en) Driving state analysis method, device, equipment and medium based on multi-sensing data
CN113887438A (en) Watermark detection method, device, equipment and medium for face image
CN111612000A (en) Commodity classification method and device, electronic equipment and storage medium
CN111860219A (en) High-speed road occupation judging method and device and electronic equipment
CN108960246B (en) Binarization processing device and method for image recognition
CN112488054B (en) Face recognition method, device, terminal equipment and storage medium
US11250551B2 (en) Devices, systems, and methods for limited-size divisive clustering
CN112085063B (en) Target identification method, device, terminal equipment and storage medium
CN111368709A (en) Picture text recognition method, device and equipment and readable storage medium
CN111340139A (en) Method and device for judging complexity of image content
CN114724128B (en) License plate recognition method, device, equipment and medium
CN113888760A (en) Violation information monitoring method, device, equipment and medium based on software application
CN112633134A (en) In-vehicle face recognition method, device and medium based on image recognition
CN113190703A (en) Intelligent retrieval method and device for video image, electronic equipment and storage medium
CN112561893A (en) Picture matching method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231210

Address after: Room 601, 6th Floor, Building 13, No. 3 Jinghai Fifth Road, Beijing Economic and Technological Development Zone (Tongzhou), Tongzhou District, Beijing, 100176

Patentee after: Beijing Youbixuan Intelligent Robot Co.,Ltd.

Address before: 518000 16th and 22nd Floors, C1 Building, Nanshan Zhiyuan, 1001 Xueyuan Avenue, Nanshan District, Shenzhen City, Guangdong Province

Patentee before: Shenzhen Youbixuan Technology Co.,Ltd.
