CN112085063B - Target identification method, device, terminal equipment and storage medium - Google Patents


Info

Publication number
CN112085063B
CN112085063B (application CN202010797205.2A)
Authority
CN
China
Prior art keywords: target object, target, coordinates, matrix, neural network
Prior art date
Legal status: Active
Application number
CN202010797205.2A
Other languages
Chinese (zh)
Other versions
CN112085063A (en)
Inventor
李扬
程骏
庞建新
Current Assignee: Beijing Youbixuan Intelligent Robot Co., Ltd.
Original Assignee
Ubtech Robotics Corp
Priority date
Filing date
Publication date
Application filed by Ubtech Robotics Corp
Priority to CN202010797205.2A
Publication of CN112085063A
Application granted
Publication of CN112085063B
Current status: Active

Classifications

    • G06F18/24 Classification techniques (G06F Electric digital data processing; G06F18/00 Pattern recognition; G06F18/20 Analysing)
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data (G06F18/25 Fusion techniques)
    • G06V2201/07 Target detection (G06V Image or video recognition or understanding; G06V2201/00 Indexing scheme relating to image or video recognition or understanding)

Abstract

The application is applicable to the technical field of machine vision and provides a target identification method, a device, terminal equipment and a storage medium. The method comprises the following steps: detecting a target image through a first neural network model to obtain a target object in the target image and a first classification label of the target object; classifying the target object through a second neural network model to obtain a second classification label of the target object; and determining the category of the target object according to the first classification label and the second classification label. In the embodiment of the application, the first neural network model both detects the target object and classifies it in the dimension of the first classification label, while the second neural network model only needs to classify the target object in the dimension of the second classification label. The classification burden of the two models is thus balanced, which improves the efficiency of target recognition.

Description

Target identification method, device, terminal equipment and storage medium
Technical Field
The application belongs to the technical field of machine vision, and particularly relates to a target identification method, a device, terminal equipment and a storage medium.
Background
With the development of machine vision technology, machine vision-based object recognition (such as vehicle type recognition, garbage type recognition, etc.) has attracted increasing attention. Its goal is to detect objects in images or videos and classify them in order to determine which category each object belongs to.
Deep learning is a relatively new area of machine vision, and deep-learning-based target recognition has been widely applied in recent years because of its higher accuracy. However, current deep-learning-based target recognition typically detects targets first and then classifies the detected targets with a classifier. When the number of categories to be distinguished is too large, the classifier is overloaded, resulting in low recognition efficiency.
Disclosure of Invention
The embodiment of the application provides a target identification method, a device, terminal equipment and a storage medium, aiming at solving the problem of low efficiency of target identification in the prior art.
In a first aspect, an embodiment of the present application provides a target recognition method, including:
detecting a target image through a first neural network model to obtain a target object in the target image and a first classification label of the target object;
classifying the target object through a second neural network model to obtain a second classification label of the target object;
and determining the category of the target object according to the first classification label and the second classification label.
In a second aspect, an embodiment of the present application provides an object recognition apparatus, including:
the first obtaining module is used for detecting a target image through a first neural network model and obtaining a target object in the target image and a first classification label of the target object;
the second obtaining module is used for classifying the target object through a second neural network model to obtain a second classification label of the target object;
and the determining module is used for determining the category of the target object according to the first classification label and the second classification label.
In a third aspect, an embodiment of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the above object recognition method when executing the computer program.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the target recognition method described above.
In a fifth aspect, embodiments of the present application provide a computer program product for causing an electronic device to carry out the steps of the above-mentioned object recognition method when the computer program product is run on the electronic device.
Compared with the prior art, the embodiment of the application has the following beneficial effects: the target image is detected through the first neural network model to obtain the target object in the target image and the first classification label of the target object; the target object is classified through the second neural network model to obtain its second classification label; and the category of the target object is determined according to the first classification label and the second classification label. Because the first neural network model both detects the target object and classifies it in the dimension of the first classification label, the second neural network model only needs to classify the target object in the dimension of the second classification label; the classification burden of the two models is balanced, which improves target recognition efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments or the description of the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a target recognition method according to an embodiment of the application;
FIG. 2 is a schematic diagram showing a specific flow of step S103 according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a pre-constructed matrix in a specific application scenario according to an embodiment of the present application;
FIG. 4 is a schematic diagram of an output result in a specific application scenario according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a target recognition device according to another embodiment of the present application;
fig. 6 is a schematic structural diagram of a terminal device according to another embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in the present specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to determining" or "in response to detecting". Similarly, the phrases "if it is determined" or "if [a described condition or event] is detected" may be interpreted, depending on the context, as "upon determining", "in response to determining", "upon detecting [the described condition or event]" or "in response to detecting [the described condition or event]".
Furthermore, the terms "first," "second," "third," and the like in the description of the present specification and in the appended claims, are used for distinguishing between descriptions and not necessarily for indicating or implying a relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
The target recognition method provided by the embodiment of the application can be applied to terminal devices such as robots, mobile phones, tablet computers, wearable devices, vehicle-mounted devices, augmented reality (AR)/virtual reality (VR) devices, notebook computers, ultra-mobile personal computers (UMPC), netbooks and personal digital assistants (PDA); the embodiment of the application does not limit the specific type of the terminal device.
In order to explain the technical scheme of the application, the following examples are used for illustration.
Example 1
Referring to fig. 1, a target recognition method provided in an embodiment of the present application includes:
step S101, detecting a target image through a first neural network model to obtain a target object in the target image and a first classification label of the target object.
Specifically, the first neural network model is a neural network model which is built in advance and is already trained, and is used for detecting whether the target object exists in the target image, and outputting a first classification label of the target object when the target object exists.
In one specific application scenario, the target object may be a vehicle, and the first classification label indicates the size of the vehicle, which may take several values, such as large-sized vehicle, medium-sized vehicle and small-sized vehicle. When the first neural network model detects a vehicle in the target image, it outputs the vehicle's size attribute. In practical applications, the first classification label may be one or more other labels describing features of the target object; for example, if the target object to be detected is a vehicle, the first classification label may instead be the brand of the vehicle. The application scenario may likewise be garbage recognition, face recognition and the like, which is not limited here.
The first neural network model may be constructed in advance from a lightweight network architecture. A large number of images containing the target object are prepared according to the object to be identified; for example, if the target object is a vehicle, pictures of various types of vehicles are collected, and each picture is annotated with a first classification label. The first neural network model is trained on these images until its preset loss function converges, at which point the model is considered trained. The preset loss function of the first neural network model may be a cross-entropy loss function, a mean-square-error loss function or the like.
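As a non-limiting illustration only, the sketch below shows roughly how such a lightweight detection model could be trained with a cross-entropy classification loss. The framework (PyTorch), the model and dataset interfaces, and the bounding-box regression term are assumptions made for this sketch and are not specified by the embodiment.

```python
# Hedged sketch: one possible training loop for the first neural network model.
# Assumes `model(images)` returns (predicted boxes, first-label logits) and that
# `dataset` yields (image, box, first_label) tuples; neither is prescribed by the patent.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train_first_model(model, dataset, epochs=50, lr=1e-3, device="cpu"):
    loader = DataLoader(dataset, batch_size=32, shuffle=True)
    cls_loss_fn = nn.CrossEntropyLoss()   # first-label classification loss (per the embodiment)
    box_loss_fn = nn.SmoothL1Loss()       # bounding-box regression loss (assumed)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)

    model.to(device).train()
    for epoch in range(epochs):
        running = 0.0
        for images, boxes, first_labels in loader:
            images, boxes = images.to(device), boxes.to(device)
            first_labels = first_labels.to(device)

            pred_boxes, pred_logits = model(images)
            loss = box_loss_fn(pred_boxes, boxes) + cls_loss_fn(pred_logits, first_labels)

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            running += loss.item()
        print(f"epoch {epoch}: mean loss {running / len(loader):.4f}")
```

Training of the second neural network model described below would follow the same pattern, with the second classification label replacing the first and without the box term.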
In one embodiment, the detecting the target image through the first neural network model, to obtain the target object in the target image and the first classification label of the target object, includes: and detecting a target image through a first neural network model, and when a target object in the target image is detected, obtaining coordinates of the target object in the target image and a first classification label of the target object. Specifically, when the target object is detected, the position information of the target object in the image and the first classification label of the target object are output.
Step S102, classifying the target object through a second neural network model to obtain a second classification label of the target object.
Specifically, the second neural network model is a neural network model which is built in advance and is already trained, and is used for classifying the target object detected by the first neural network model, and the output result of the second neural network model is a second classification label of the target object.
In a specific application scenario, the target object may be a vehicle, the first classification label indicates the size of the vehicle, and the second classification label indicates the type of the vehicle, which may take several values, such as sedan, bus, truck and the like. The second neural network model outputs this vehicle-type attribute. In practical applications, the second classification label may be one or more other labels describing features of the target object; the first classification label and the second classification label classify the target object along different feature dimensions. The application scenario and the two feature dimensions above are only examples and are not limiting; the scheme may equally be applied to garbage recognition, face recognition and the like.
The second neural network model may likewise be constructed in advance from a lightweight network architecture. A large number of images of the target object are prepared according to the object to be identified; for example, if the target object is a vehicle, pictures of various types of vehicles are collected, and each picture is annotated with a second classification label. The second neural network model is trained on these images until its preset loss function converges, at which point the model is considered trained. The preset loss function of the second neural network model may be a cross-entropy loss function, a mean-square-error loss function or the like.
In one embodiment, before classifying the target object through the second neural network model to obtain the second classification label of the target object, the method includes: extracting the target object according to the coordinates of the target object in the target image. Specifically, once the position of the target object in the target image has been determined through the first neural network model, the target object is extracted at that position and the extracted patch is input into the second neural network model. The second neural network model therefore only needs to classify the target object, without detecting it, which improves classification efficiency.
In one embodiment, extracting the target object according to the coordinates of the target object in the target image includes: determining a corresponding rectangular frame according to the coordinates of the target object in the target image; and extracting the target object according to the coordinates of the rectangular frame in the target image. In a specific application, since the picture input to a neural network is generally of regular shape, a rectangular frame containing the target object is drawn in the target image according to the coordinates of the target object. Specifically, the upper edge of the rectangular frame lies a first preset number of pixels above the uppermost pixel of the target object, the lower edge lies a second preset number of pixels below the lowermost pixel of the target object, the left edge lies a third preset number of pixels to the left of the leftmost pixel of the target object, and the right edge lies a fourth preset number of pixels to the right of the rightmost pixel of the target object. After the rectangular frame is determined, the target image is cropped according to the coordinates of the rectangular frame, so that a regular image of the target object is extracted.
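A minimal sketch of this rectangular-frame extraction is shown below, assuming NumPy arrays, an (x right, y down) pixel coordinate convention, and example values for the four preset pixel numbers; none of these specifics are prescribed by the embodiment.

```python
# Hedged sketch of extracting the target object via a padded rectangular frame.
import numpy as np

def crop_target(image: np.ndarray, obj_xs, obj_ys,
                pad_top=5, pad_bottom=5, pad_left=5, pad_right=5):
    """image: H x W x C array; obj_xs / obj_ys: pixel coordinates belonging to the target object."""
    h, w = image.shape[:2]
    # Frame edges lie a preset number of pixels beyond the extreme object pixels,
    # clamped to the image bounds.
    top    = max(min(obj_ys) - pad_top, 0)
    bottom = min(max(obj_ys) + pad_bottom, h - 1)
    left   = max(min(obj_xs) - pad_left, 0)
    right  = min(max(obj_xs) + pad_right, w - 1)
    rect = (left, top, right, bottom)                 # rectangular frame in image coordinates
    patch = image[top:bottom + 1, left:right + 1]     # regular image of the target object
    return patch, rect
```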
Step S103, determining the category of the target object according to the first classification label and the second classification label.
Specifically, the category of the target object is determined according to the first classification label and the second classification label. For example, if the first neural network model determines that the first classification label of the target object is large-sized vehicle and the second neural network model determines that the second classification label is bus, the category of the target object is determined to be a large-sized bus. If the first classification label is small-sized vehicle and the second classification label is bus, the category of the target object is determined to be a small-sized bus.
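The overall flow of steps S101 to S103 can be sketched as follows; the model and helper interfaces are placeholders assumed for illustration, and the category lookup is detailed in the matrix-based embodiment described next.

```python
# Hedged sketch of the two-stage recognition flow (steps S101 to S103).
def recognize(image, first_model, second_model, crop, lookup_category):
    """first_model(image) -> (box, first_label) or None; second_model(patch) -> second_label;
    crop(image, box) -> (patch, rect); lookup_category(first, second) -> category.
    All of these interfaces are assumptions of this sketch."""
    detection = first_model(image)              # step S101: detect target + first classification label
    if detection is None:
        return None                             # no target object in the target image
    box, first_label = detection
    patch, rect = crop(image, box)              # extract the target object from the image
    second_label = second_model(patch)          # step S102: second classification label only
    category = lookup_category(first_label, second_label)   # step S103: combine both labels
    return category, rect
```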
In one embodiment, before inputting the target image into the first neural network model for target detection, the method includes: constructing a matrix containing M×M elements, wherein M is greater than or equal to 2 and is an integer; storing category information of one category of objects in each element of the matrix; establishing and storing an association between each element in the ith column of the matrix and the first classification label of the ith attribute, wherein i is greater than or equal to 1 and less than or equal to M, and i is an integer; and establishing and storing an association between each element in the ith row of the matrix and the second classification label of the ith attribute. Here the ith column may be any column of the matrix, and the ith row may be any row of the matrix.
In practical applications, the first classification label and the second classification label describe two different feature dimensions of the target object, and the category of the target object may not be directly obtainable from either dimension alone. When N categories need to be distinguished (with N equal to M×M), a matrix of M×M elements can be established in advance, and the category information of one category of objects is stored in each element of the matrix. All elements in the same column of the matrix correspond to the same first classification label, and elements in different columns correspond to different first classification labels; all elements in the same row of the matrix correspond to the same second classification label, and elements in different rows correspond to different second classification labels.
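A minimal sketch of constructing such a matrix and its label associations is given below, assuming plain Python lists and dictionaries for the row and column associations; the data structures are an illustrative choice, not part of the disclosure.

```python
# Hedged sketch of the pre-constructed M x M category matrix and its label associations.
def build_category_matrix(first_labels, second_labels, categories):
    """first_labels: M column keys (first classification labels);
    second_labels: M row keys (second classification labels);
    categories: M x M nested list, categories[row][col] -> stored category info (or None)."""
    m = len(first_labels)
    assert len(second_labels) == m and all(len(row) == m for row in categories)
    col_of = {label: j for j, label in enumerate(first_labels)}    # column association (ith column)
    row_of = {label: i for i, label in enumerate(second_labels)}   # row association (ith row)
    return categories, row_of, col_of
```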
In one embodiment, as shown in fig. 2, the step S103 specifically includes steps S1031 to S1034:
step S1031, determining column coordinates of the target object in the matrix according to the first classification label;
specifically, since different column elements in the matrix correspond to different first classification labels, according to the first classification labels output by the first neural network, it can be determined which column in the matrix the target object belongs to.
Step S1032, determining row coordinates of the target object in the matrix according to the second classification labels;
specifically, as different row elements in the matrix correspond to different second classification labels, according to the second classification labels output by the second neural network, determining which row in the matrix the target object belongs to.
Step S1033, determining coordinates of the target object in the matrix according to the row coordinates and the column coordinates;
specifically, the coordinate position of the target object in the matrix can be determined according to the obtained row coordinates and column coordinates to which the target object belongs.
Step S1034, determining the category of the target object according to the coordinates of the target object in the matrix.
Specifically, once the coordinate position of the target object in the matrix is obtained, the category information stored at that coordinate position is read out as the category of the target object.
For a better understanding of the embodiments of the present application, please refer to fig. 3, a schematic diagram of the matrix in a specific application scenario. For example, the matrix of M×M elements is a 3×3 matrix, and each element stores corresponding category information. All elements in the same column of the matrix correspond to the same first classification label (e.g., all elements of the first column correspond to small-sized vehicles and all elements of the second column to medium-sized vehicles), and elements in different columns correspond to different first classification labels; all elements in the same row correspond to the same second classification label (e.g., all elements of the first row correspond to trucks and all elements of the second row to buses), and elements in different rows correspond to different second classification labels. When the first classification label output by the first neural network model is small-sized vehicle, the target object is determined to belong to the first column of the matrix; when the second classification label output by the second neural network model is sedan, the target object is determined to belong to the third row. From the column coordinate and the row coordinate, the target object is located at the element in the third row, first column of the matrix, and according to the category information stored there (small sedan), the target object is determined to be a small sedan. In this way, when N categories need to be distinguished (N equal to M×M), each neural network model only needs to distinguish M categories, and M is smaller than N (much smaller when the total number of categories is large). The fewer categories a neural network has to distinguish, the better its classification performance, so target recognition efficiency can be improved. When the number N of categories to be distinguished is not a perfect square, one or more elements of the matrix may store no category information; when the determined coordinates point to such an element, the output result is a recognition error. This example is provided only to aid understanding and does not limit the practical application.
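The following sketch mirrors the 3×3 example above and steps S1031 to S1034; the concrete label and category strings are illustrative assumptions, and a cell left as None models the "not a perfect square" case that yields a recognition error.

```python
# Hedged sketch of the matrix lookup in steps S1031 to S1034, using the 3 x 3 example.
FIRST_LABELS  = ["small-sized vehicle", "medium-sized vehicle", "large-sized vehicle"]  # columns
SECOND_LABELS = ["truck", "bus", "sedan"]                                               # rows
CATEGORIES = [
    ["small truck", "medium truck", "large truck"],   # row 1: trucks
    ["small bus",   "medium bus",   "large bus"],     # row 2: buses
    ["small sedan", "medium sedan", "large sedan"],   # row 3: sedans
]

def lookup_category(first_label, second_label):
    col = FIRST_LABELS.index(first_label)     # S1031: column coordinate from the first label
    row = SECOND_LABELS.index(second_label)   # S1032: row coordinate from the second label
    cell = CATEGORIES[row][col]               # S1033: element at (row, col)
    return cell if cell is not None else "recognition error"   # S1034

print(lookup_category("small-sized vehicle", "sedan"))   # -> "small sedan"
```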
In one embodiment, after determining the category of the target object according to the first classification label and the second classification label, the method includes: displaying the rectangular frame and the category information of the target object in the target image in a preset manner. For example, the rectangular frame may be drawn in a preset color, and the category information of the target object may be displayed at a position a preset distance away from the rectangular frame. In one application scenario, as shown in fig. 4, which is a schematic diagram of an output result, the target image contains a target vehicle, and the rectangular frame of the target vehicle and its category information, sedan, are output.
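As one possible way to display the result, the sketch below draws the rectangular frame and the category text with OpenCV; the color, thickness and text offset are assumptions of the sketch, not values prescribed by the embodiment.

```python
# Hedged sketch of displaying the rectangular frame and category information.
import cv2

def draw_result(image, rect, category, color=(0, 255, 0), offset=10):
    """rect: (left, top, right, bottom) in pixel coordinates; draws in place and returns the image."""
    left, top, right, bottom = rect
    cv2.rectangle(image, (left, top), (right, bottom), color, 2)        # frame in a preset color
    cv2.putText(image, category, (left, max(top - offset, 0)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)                # category text near the frame
    return image
```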
According to the embodiment of the application, the first neural network model both detects the target object and classifies it in the dimension of the first classification label, while the second neural network model only needs to classify the target object in the dimension of the second classification label; the classification burden of the two models is thus balanced, which improves target recognition efficiency.
Corresponding to the object recognition method described in the above embodiments, fig. 5 shows a block diagram of the object recognition device according to the embodiment of the present application, and for convenience of explanation, only the portion related to the embodiment of the present application is shown. Referring to fig. 5, the object recognition apparatus 500 includes:
a first obtaining module 501, configured to detect a target image through a first neural network model, and obtain a target object in the target image and a first classification label of the target object;
a second obtaining module 502, configured to classify the target object through a second neural network model, and obtain a second classification label of the target object;
a determining module 503, configured to determine a category of the target object according to the first classification tag and the second classification tag.
In one embodiment, the object recognition apparatus further includes:
a construction module for constructing a matrix containing m×m elements; wherein M is more than or equal to 2 and is an integer;
the storage module is used for storing category information of one type of object in each element of the matrix;
the first establishing module is used for establishing and storing the association relation between each element in the ith column of the matrix and the first classification label of the ith attribute; wherein i is more than or equal to 1 and less than or equal to M, and i is an integer;
and the second establishing module is used for establishing and storing the association relation between each element in the ith row of the matrix and the second classification label of the ith attribute.
In one embodiment, the determining module 503 includes:
a first determining unit, configured to determine column coordinates of the target object in the matrix according to the first classification tag;
a second determining unit, configured to determine row coordinates of the target object in the matrix according to the second classification tag;
a third determining unit configured to determine coordinates of the target object in the matrix according to the row coordinates and the column coordinates;
and a fourth determining unit, configured to determine a category of the target object according to coordinates of the target object in the matrix.
In one embodiment, the first obtaining module is specifically configured to:
and detecting a target image through a first neural network model, and when a target object in the target image is detected, obtaining coordinates of the target object in the target image and a first classification label of the target object.
In one embodiment, the object recognition apparatus further includes:
and the target extraction module is used for extracting the target object according to the coordinates of the target object in the target image before the second obtaining module is triggered.
In one embodiment, the target extraction module is specifically configured to: determining a corresponding rectangular frame according to the coordinates of the target object in the target image; and extracting the target object according to the coordinates of the rectangular frame in the target image.
In one embodiment, the object recognition apparatus further includes:
and the display module is used for displaying the rectangular frame and the category information of the target object in the target image in a preset mode.
Because the first neural network model both detects the target object and classifies it in the dimension of the first classification label, the second neural network model only needs to classify the target object in the dimension of the second classification label; the classification burden of the two models is balanced, which improves target recognition efficiency.
As shown in fig. 6, an embodiment of the present application further provides a terminal device 600 including: a processor 601, a memory 602 and a computer program 603, e.g. an object recognition program, stored in the memory 602 and executable on the processor 601. The processor 601, when executing the computer program 603, implements the steps of the various target recognition method embodiments described above. The processor 601, when executing the computer program 603, also performs the functions of the modules of the device embodiments described above, such as the functions of the modules 501 to 503 shown in fig. 5.
Illustratively, the computer program 603 may be partitioned into one or more modules that are stored in the memory 602 and executed by the processor 601 to perform the present application. The one or more modules may be a series of computer program instruction segments capable of performing specific functions for describing the execution of the computer program 603 in the terminal device 600. For example, the computer program 603 may be divided into a first obtaining module, a second obtaining module and a determining module, where specific functions of each module are described in the above embodiments, and are not described herein.
The terminal device 600 may be a robot, a mobile terminal device, a desktop computer, a notebook computer, a palm computer, or other computing devices. The terminal device may include, but is not limited to, a processor 601, a memory 602. It will be appreciated by those skilled in the art that fig. 6 is merely an example of a terminal device 600 and is not limiting of the terminal device 600, and may include more or fewer components than shown, or may combine certain components, or different components, e.g., the terminal device may also include input and output devices, network access devices, buses, etc.
The processor 601 may be a central processing unit (Central Processing Unit, CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field-programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 602 may be an internal storage unit of the terminal device 600, for example, a hard disk or a memory of the terminal device 600. The memory 602 may also be an external storage device of the terminal device 600, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the terminal device 600. Further, the memory 602 may also include both an internal storage unit and an external storage device of the terminal device 600. The memory 602 is used for storing the computer program and other programs and data required by the terminal device. The memory 602 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts not detailed or described in a particular embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other manners. For example, the apparatus/terminal device embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical function division, and there may be additional divisions in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated modules, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on this understanding, the present application may implement all or part of the flow of the methods of the above embodiments by instructing related hardware through a computer program, which may be stored in a computer readable storage medium; when the computer program is executed by a processor, the steps of each of the method embodiments described above may be implemented. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so on. It should be noted that the content contained in the computer readable medium may be adjusted as appropriate according to the requirements of legislation and patent practice in each jurisdiction; for example, in certain jurisdictions, in accordance with legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.
The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.

Claims (8)

1. A method of target identification, comprising:
detecting a target image through a first neural network model to obtain a target object in the target image and a first classification label of the target object;
classifying the target object through a second neural network model to obtain a second classification label of the target object;
determining the category of the target object according to the first classification tag and the second classification tag;
before inputting the target image into the first neural network model for target detection, the method comprises the following steps:
constructing a matrix containing M×M elements; wherein M is more than or equal to 2 and is an integer;
storing class information of a class of objects in each element of the matrix;
establishing and storing an association relation between each element in the ith column of the matrix and a first classification label of the ith attribute; wherein i is more than or equal to 1 and less than or equal to M, and i is an integer;
establishing and storing an association relation between each element in the ith row of the matrix and a second classification label of the ith attribute;
determining the category of the target object according to the first classification tag and the second classification tag, including:
determining column coordinates of the target object in the matrix according to the first classification label;
determining row coordinates of the target object in the matrix according to the second classification labels;
determining coordinates of the target object in the matrix according to the row coordinates and the column coordinates;
and determining the category of the target object according to the coordinates of the target object in the matrix.
2. The method for identifying a target according to claim 1, wherein the detecting the target image by the first neural network model to obtain the target object in the target image and the first classification tag of the target object includes:
and detecting a target image through a first neural network model, and when a target object in the target image is detected, obtaining coordinates of the target object in the target image and a first classification label of the target object.
3. The method for identifying a target according to claim 2, wherein before classifying the target object by the second neural network model to obtain the second classification label of the target object, the method comprises:
and extracting the target object according to the coordinates of the target object in the target image.
4. The target recognition method according to claim 3, wherein the extracting the target object according to coordinates of the target object in the target image comprises:
determining a corresponding rectangular frame according to the coordinates of the target object in the target image;
and extracting the target object according to the coordinates of the rectangular frame in the target image.
5. The target recognition method according to claim 4, wherein after determining the category of the target object from the first classification tag and the second classification tag, comprising:
and displaying the rectangular frame and the category information of the target object in the target image in a preset mode.
6. An object recognition apparatus, comprising:
a construction module for constructing a matrix containing m×m elements; wherein M is more than or equal to 2 and is an integer;
the storage module is used for storing category information of one type of object in each element of the matrix;
the first establishing module is used for establishing and storing the association relation between each element in the ith column of the matrix and the first classification label of the ith attribute; wherein i is more than or equal to 1 and less than or equal to M, and i is an integer;
the second establishing module is used for establishing and storing the association relation between each element in the ith row of the matrix and the second classification label of the ith attribute;
the first obtaining module is used for detecting a target image through a first neural network model and obtaining a target object in the target image and a first classification label of the target object;
the second obtaining module is used for classifying the target object through a second neural network model to obtain a second classification label of the target object;
the determining module is used for determining the category of the target object according to the first classification label and the second classification label;
the determining module includes:
a first determining unit, configured to determine column coordinates of the target object in the matrix according to the first classification tag;
a second determining unit, configured to determine row coordinates of the target object in the matrix according to the second classification tag;
a third determining unit configured to determine coordinates of the target object in the matrix according to the row coordinates and the column coordinates;
and a fourth determining unit, configured to determine a category of the target object according to coordinates of the target object in the matrix.
7. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 5 when executing the computer program.
8. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the method according to any one of claims 1 to 5.
CN202010797205.2A 2020-08-10 2020-08-10 Target identification method, device, terminal equipment and storage medium Active CN112085063B (en)

Priority Applications (1)

Application Number: CN202010797205.2A (CN112085063B); Priority Date: 2020-08-10; Filing Date: 2020-08-10; Title: Target identification method, device, terminal equipment and storage medium

Publications (2)

CN112085063A (en), published 2020-12-15
CN112085063B (en), published 2023-10-13

Family

ID=73736046

Family Applications (1)

CN202010797205.2A (granted as CN112085063B, Active): Target identification method, device, terminal equipment and storage medium

Country Status (1)

CN: CN112085063B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106875424A (en) * 2017-01-16 2017-06-20 Northwestern Polytechnical University Urban-environment driving vehicle behaviour recognition method based on machine vision
CN108776819A (en) * 2018-06-05 2018-11-09 Guangdong OPPO Mobile Telecommunications Corp., Ltd. Target identification method, mobile terminal and computer readable storage medium
CN109344884A (en) * 2018-09-14 2019-02-15 Tencent Technology (Shenzhen) Co., Ltd. Media information classification method, and method and device for training a picture classification model
CN109766822A (en) * 2019-01-07 2019-05-17 Shandong University Gesture recognition method and system based on neural networks
CN110427802A (en) * 2019-06-18 2019-11-08 Ping An Technology (Shenzhen) Co., Ltd. AU detection method, device, electronic equipment and storage medium
WO2020119527A1 * 2018-12-11 2020-06-18 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences Human action recognition method and apparatus, and terminal device and storage medium
CN111310775A (en) * 2018-12-11 2020-06-19 TCL Corporation Data training method and device, terminal equipment and computer readable storage medium
CN111399638A (en) * 2020-02-29 2020-07-10 Zhejiang University of Technology Blind-accessible computer and smartphone auxiliary control method adapted to the same

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on a bridge crack detection algorithm based on deep learning; Li Liangfu et al.; Acta Automatica Sinica; Vol. 45, No. 9; pp. 1727-1742 *

Also Published As

Publication number Publication date
CN112085063A (en) 2020-12-15

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right
Effective date of registration: 20231210
Address after: Room 601, 6th Floor, Building 13, No. 3 Jinghai Fifth Road, Beijing Economic and Technological Development Zone (Tongzhou), Tongzhou District, Beijing, 100176
Patentee after: Beijing Youbixuan Intelligent Robot Co., Ltd.
Address before: 518000 16th and 22nd Floors, C1 Building, Nanshan Zhiyuan, 1001 Xueyuan Avenue, Nanshan District, Shenzhen City, Guangdong Province
Patentee before: Shenzhen Youbixuan Technology Co., Ltd.