CN116152573A

CN116152573A - Image recognition method, device, electronic equipment and computer readable storage medium

Info

Publication number: CN116152573A
Application number: CN202310323650.9A
Authority: CN
Inventors: 张子越; 房浩
Original assignee: BOE Technology Group Co Ltd
Current assignee: BOE Technology Group Co Ltd
Priority date: 2023-03-29
Filing date: 2023-03-29
Publication date: 2023-05-23

Abstract

The embodiments of the present application provide an image recognition method, an image recognition device, an electronic device and a computer-readable storage medium, and relate to the field of computer technologies. In the method, an image to be recognized is processed by a preset recognizer to determine parameter information of a target object. The preset recognizer is obtained by training based on an initial recognition model and a guiding model, where the guiding model is a large model with good performance and generalization capability. The knowledge learned by the guiding model is used to guide the training of the initial recognition model, so that the initial recognition model, as the small model, learns the good performance and generalization capability of the guiding model, and the preset recognizer meeting the training conditions is finally obtained. The present application thereby improves the performance of the preset recognizer, reduces its number of parameters, makes it lightweight and convenient, and increases the speed of image recognition.

Description

Image recognition method, device, electronic equipment and computer readable storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to an image recognition method, an image recognition device, an electronic device, and a computer readable storage medium.

Background

With the continuous development of artificial intelligence technology, image recognition has become an important field of artificial intelligence and is widely applied in scenarios such as security checks, identity verification, mobile payment, unmanned shelves and intelligent retail.

Image recognition techniques use computer vision and deep learning algorithms or models to process, analyze and understand images in order to identify targets and objects of various kinds. In practical applications, however, the models used for image recognition are usually large, so image recognition is inconvenient and slow.

Disclosure of Invention

The object of the present application is to solve at least one of the above technical drawbacks, in particular the low convenience and low recognition speed of image recognition.

According to one aspect of the present application, there is provided an image recognition method, the method comprising:

acquiring an image to be identified comprising a target object;

performing recognition processing on the image to be identified through a preset recognizer, and determining parameter information of the target object;

the preset recognizer is obtained by training based on an initial recognition model and a guiding model;

in the training process, a sample image is input into the initial recognition model to obtain recognition image features, and the initial recognition model is trained according to the difference between the guide image features and the recognition image features to obtain the preset recognizer meeting training conditions; the guiding image features are obtained by inputting the sample image into the guiding model;

the feature dimension of the guiding image features is the same as that of the recognition image features; the number of parameters of the guiding model is greater than the number of parameters of the initial recognition model.

Optionally, the acquiring the image to be identified including the target object includes:

capturing an image frame containing the target object from a target multimedia file;

or,

acquiring an image frame containing the target object from a preset database.

Optionally, the identifying the image to be identified by a preset identifier, determining parameter information of the target object includes:

calculating the object feature of the target object in the image to be identified through the preset identifier;

and comparing the object features with preset object features in a feature database to determine the parameter information of the target object.

Optionally, the comparing the object feature with a preset object feature in a feature database, and determining parameter information of the target object includes:

determining the similarity between the object features and the preset object features;

and determining the parameter information of the target object according to the preset object characteristics with the maximum similarity.

Optionally, before the identifying the image to be identified by the preset identifier, the method further includes:

acquiring a training sample set; the training sample set comprises sample images; the sample image comprises a sample object;

inputting the sample images into an initial guiding model to obtain a corresponding identification result of each sample image; the identification result comprises initial guiding characteristics and identification parameter information of the sample object;

determining a training loss value according to the identification parameter information and the reference parameter information;

training the initial guiding model based on the training loss value to obtain the guiding model conforming to training conditions;

inputting a sample image into the initial recognition model to obtain recognition image characteristics;

downsampling the initial guiding feature to obtain the guiding image feature; the feature dimension of the guiding image feature is the same as that of the identification image feature;

determining a training loss value according to the guiding image characteristics and the identifying image characteristics;

and based on the training loss value, repeating training on the initial recognition model until the preset recognizer meeting the training ending condition is obtained.

Optionally, the acquiring a training sample set includes:

acquiring a first image of a first object under different illumination intensities;

labeling the first image, and labeling object parameters of the first object to obtain the sample image;

the training sample set is obtained based on the sample image.

Optionally, after determining the parameter information of the target object, the method further includes:

labeling the image to be identified according to the parameter information;

and taking the image to be identified after the labeling processing as the sample image.

Optionally, the downsampling process includes at least one of:

maximum sampling processing;

average sampling processing.

According to another aspect of the present application, there is provided an image recognition apparatus including:

the image acquisition module is used for acquiring an image to be identified comprising a target object;

the image recognition module is used for recognizing the image to be recognized through a preset recognizer and determining the parameter information of the target object;

According to another aspect of the present application, there is provided an electronic device including:

a memory, a processor and a computer program stored on the memory, characterized in that the processor executes the computer program to carry out the steps of the image recognition method according to the first aspect of the present application.

For example, in a third aspect of the present application, there is provided a computing device comprising: the device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface are communicated with each other through the communication bus;

the memory is configured to store at least one executable instruction, where the executable instruction causes the processor to perform operations corresponding to the image recognition method according to the first aspect of the present application.

According to a further aspect of the present application, there is provided a computer readable storage medium having stored thereon a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the image recognition method according to the first aspect of the present application.

For example, in a fourth aspect of the embodiments of the present application, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the image recognition method shown in the first aspect of the present application.

According to one aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions are read from a computer-readable storage medium by a processor of a computer device, which executes the computer instructions, causing the computer device to perform the methods provided in the various alternative implementations of the first aspect described above.

The technical solutions provided by the present application bring the following beneficial effects:

In the embodiments of the present application, the image to be recognized is processed by a preset recognizer to determine parameter information of the target object. The preset recognizer is obtained by training based on an initial recognition model and a guiding model, where the guiding model is a large model with good performance and generalization capability. The knowledge learned by the guiding model is used to guide the training of the initial recognition model, so that the initial recognition model, as the small model, learns the good performance and generalization capability of the guiding model, and the preset recognizer meeting the training conditions is finally obtained. The present application thereby improves the performance of the preset recognizer, reduces its number of parameters, makes it lightweight and convenient, and increases the speed of image recognition.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings that are required to be used in the description of the embodiments of the present application will be briefly described below.

Fig. 1 is a schematic diagram of the overall architecture of an image recognition method according to an embodiment of the present application;

Fig. 2 is a first schematic flow chart of an image recognition method according to an embodiment of the present application;

Fig. 3 is a first schematic diagram of a model structure of an image recognition method according to an embodiment of the present application;

Fig. 4 is a second schematic diagram of a model structure of an image recognition method according to an embodiment of the present application;

Fig. 5 is a third schematic diagram of a model structure of an image recognition method according to an embodiment of the present application;

Fig. 6 is a second schematic flow chart of an image recognition method according to an embodiment of the present application;

Fig. 7 is a schematic structural diagram of an image recognition device according to an embodiment of the present application;

Fig. 8 is a schematic structural diagram of an electronic device for image recognition according to an embodiment of the present application.

Detailed Description

Embodiments of the present application are described below with reference to the drawings in the present application. It should be understood that the embodiments described below with reference to the drawings are exemplary descriptions for explaining the technical solutions of the embodiments of the present application, and do not limit those technical solutions.

As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and "comprising," when used in this application, specify the presence of stated features, information, data, steps, operations, elements, and/or components, but do not preclude the presence or addition of other features, information, data, steps, operations, elements, components, and/or groups thereof, all of which may be included in the present application. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein indicates at least one of the items defined by the term; for example, "A and/or B" may be implemented as "A", as "B", or as "A and B".

For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail below with reference to the accompanying drawings.

At least part of the image recognition method provided by the embodiments of the present application relates to fields such as machine learning within artificial intelligence, and also relates to fields of cloud technology (Cloud technology), such as cloud computing and cloud services, and to related data computing and processing in the big data field.

Artificial intelligence (AI) is the theory, methods, techniques and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.

Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.

Machine learning (ML) is an interdisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It studies how a computer can simulate or implement human learning behavior in order to acquire new knowledge or skills, and how it can reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way of giving computers intelligence, and it is applied throughout the various areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.

In order to further explain the technical solutions provided in the embodiments of the present application, the following details are described with reference to the accompanying drawings and the detailed description. Although the embodiments of the present application provide the method operational steps as shown in the following embodiments or figures, more or fewer operational steps may be included in the method based on routine or non-inventive labor. In steps where there is logically no necessary causal relationship, the execution order of the steps is not limited to the execution order provided by the embodiments of the present application.

Fig. 1 is a system architecture diagram of an image recognition method according to an embodiment of the present application. The system may comprise a server 101 and a terminal cluster, where the server 101 may be regarded as a background server for the image recognition process.

The terminal cluster may include terminal 102, terminal 103, terminal 104, and so on, where a client supporting image recognition processing may be installed in each terminal. There may be communication connections between terminals, for example between terminal 102 and terminal 103, and between terminal 103 and terminal 104.

Meanwhile, the server 101 may provide services for the terminal cluster through communication connections. Any terminal in the terminal cluster may have a communication connection with the server 101; for example, a communication connection exists between terminal 102 and the server 101, and between terminal 103 and the server 101. The manner of connection is not limited: the connection may be direct or indirect, wired or wireless, or made in another manner.

The network carrying the communication connections may be a wide area network, a local area network, or a combination of both. The application is not limited herein.

The method provided by the embodiments of the present application may be performed by a computer device, including but not limited to a terminal (including the terminals described above) or a server (including the server 101 described above). The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN), and big data and artificial intelligence platforms. The terminal may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc. The terminal and the server may be directly or indirectly connected through wired or wireless communication, which is not limited herein.

The functions that each device in the application scenario shown in Fig. 1 can implement are described together with the following method embodiments and are not detailed here.

The embodiments of the present application provide a possible implementation that can be executed by any electronic device. Optionally, the electronic device may be a server device with image recognition capability, or an apparatus or chip integrated in such a device. Fig. 2 is a schematic flow chart of an image recognition method according to an embodiment of the present application, where the method includes the following steps:

Step S201: an image to be identified including a target object is acquired.

Optionally, the image recognition method of the embodiments of the present application may be applied to scenarios in which an image to be identified is processed in order to recognize object parameters of a target object.

The target object may include a person object, an animal object, an article object, and the like, which is not limited in the embodiments of the present application.

The image to be identified is an image that includes the target object. In some alternative embodiments, the image to be identified may include a complete image of the target object, or it may include only a partial image of the target object. For example, taking a person object as the target object, the image to be identified may include a complete image of the whole body of the person object, or it may include an image of the upper body of the person object, and so on.

In an actual implementation scenario, the image to be identified may be obtained from a multimedia file, for example, at least one frame of the image to be identified may be obtained from a video file. As an example, in a scene such as pedestrian re-recognition, an image to be recognized including a pedestrian may be acquired from an acquired video stream including the pedestrian. In other scenarios, the image to be identified may also be obtained from an image file. For example, the image to be identified may be obtained from a preset database in which image files are stored, and so on.

Step S202: recognition processing is performed on the image to be recognized through a preset recognizer, and the parameter information of the target object is determined.

The preset recognizer is obtained by training based on an initial recognition model and a guiding model.

In particular, the parameter information may comprise object characteristic parameters of the target object. For example, when the target object is a person object, the parameter information may include a sex parameter, an age parameter, a height parameter and an identity parameter of the person object, where the identity parameter is, for example, the name of the target object. As another example, when the target object is an animal object, the parameter information may include a category parameter, an age parameter, a height parameter, a body weight parameter, and the like, where the category parameter characterizes which type of animal the target object belongs to; for example, the category parameter of the target object may be fish, amphibian, bird, insect, etc.

In the embodiments of the present application, the image to be recognized may be processed by the preset recognizer to determine the parameter information of the target object.

The preset recognizer may extract an object feature of the target object from the image to be recognized, and then recognize the object feature to determine an object parameter of the target object.

As an example, take the case where the target object is a person object and the parameter information includes an identity parameter of the target object. In an actual implementation scenario, the preset recognizer may extract person features of the target object from the image to be recognized, where the person features may include facial features, head features, limb features, clothing features, and the like. The extracted person features are then compared with preset features in a pre-configured feature database, and the identity parameter of the target object is determined from the resulting feature similarity. Specifically, the feature database may store person features of person objects whose identity parameters are known, so that, by comparing feature similarity, the identity parameter of the person object whose features are most similar to those of the target object can be taken as the identity parameter of the target object. For example, in a pedestrian re-identification scenario, a person image may be captured from an acquired video stream containing pedestrians, and the person features in the person image may be extracted by the preset recognizer of the present application. The extracted person features are compared with the preset features in the feature database, the feature distances between the person features and the preset features are calculated, and the feature distances are ranked. The smaller the feature distance, the higher the similarity between the person features and the preset features, so the identity of the person can be determined from the preset feature with the smallest feature distance to the person features.
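The distance-ranking comparison described above can be sketched as follows. This is a minimal illustration only: the Euclidean distance, the function names and the three-dimensional example features are assumptions made for the sketch, and the application does not specify a particular distance function.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def identify(query_feature, feature_database):
    """Rank database entries by distance to the query and return the closest one.

    feature_database maps a known identity to its preset feature vector.
    """
    ranked = sorted(
        feature_database.items(),
        key=lambda item: euclidean_distance(query_feature, item[1]),
    )
    best_identity, best_feature = ranked[0]
    return best_identity, euclidean_distance(query_feature, best_feature)


# Hypothetical 3-dimensional features; real features would be much longer.
feature_database = {
    "person_a": [0.9, 0.1, 0.0],
    "person_b": [0.1, 0.8, 0.3],
}
identity, distance = identify([0.85, 0.15, 0.05], feature_database)
```

Here the query lies closest to the stored feature of `person_a`, so that identity is returned together with its feature distance.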

In the embodiments of the present application, the preset recognizer is obtained by training based on an initial recognition model and a guiding model.

In some optional embodiments, the guiding model may be a large model with many network layers, a large number of parameters, and good performance and generalization capability; the initial recognition model may be a small model with a smaller network scale and limited expressive capacity. The number of parameters of the guiding model is greater than the number of parameters of the initial recognition model.

In this case, the knowledge learned by the guiding model can be used to guide the training of the initial recognition model, so that the initial recognition model achieves performance comparable to that of the guiding model with a greatly reduced number of parameters, thereby realizing model compression and acceleration. The initial recognition model is trained in this way, and the preset recognizer meeting the training conditions is finally obtained.

Specifically, in the training process, a sample image can be input into the initial recognition model to obtain recognition image characteristics; and continuously carrying out iterative correction on the initial recognition model according to the difference between the guide image features and the recognition image features until the preset recognizer meeting the training conditions is obtained.

The recognition image features are obtained by the initial recognition model and include object features of the sample object in the sample image. The guiding image features are obtained by inputting the sample image into the guiding model and likewise include object features of the sample object in the sample image.

It can be appreciated that, in the training process, the initial recognition model and the guiding model receive the same sample image, and training drives the recognition image features produced by the initial recognition model to be as similar as possible to the guiding image features produced by the guiding model, so that the initial recognition model learns the good performance and generalization capability of the guiding model.
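As a rough illustration of a training signal based on the difference between the two sets of features, the sketch below computes a mean squared difference between a guiding feature and a recognition feature of the same dimension. The choice of mean squared error, the function name and the example values are assumptions; the application only states that training uses the difference between the features.

```python
def feature_difference_loss(guiding_feature, recognition_feature):
    """Mean squared difference between two feature vectors of equal dimension."""
    if len(guiding_feature) != len(recognition_feature):
        raise ValueError("feature dimensions must match")
    n = len(guiding_feature)
    squared = ((g - r) ** 2 for g, r in zip(guiding_feature, recognition_feature))
    return sum(squared) / n


# Hypothetical features for one sample image.
guiding = [0.2, 0.8, 0.5, 0.1]
recognition_early = [0.0, 0.0, 0.0, 0.0]  # small model before training
recognition_late = [0.2, 0.7, 0.5, 0.1]   # small model after some training

loss_early = feature_difference_loss(guiding, recognition_early)
loss_late = feature_difference_loss(guiding, recognition_late)
```

The loss shrinks as the recognition features approach the guiding features, which is the behavior that iterative correction of the initial recognition model aims for.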

In addition, it should be noted that the feature dimension of the initial guiding feature produced by the guiding model is larger than the feature dimension of the recognition image features. To achieve model compression, the initial guiding feature may therefore be downsampled during training to obtain guiding image features with the same feature dimension as the recognition image features. Optionally, the downsampling may include maximum sampling processing, average sampling processing, and the like.

In an optional embodiment of the present application, the acquiring an image to be identified includes:

capturing an image frame containing the target object from a target multimedia file;

or,

acquiring an image frame containing the target object from a preset database.

The embodiment of the application can be described by taking the target object as a person object as an example: in an actual implementation scenario, the image to be identified may be obtained from a multimedia file, for example, at least one frame of the image to be identified may be obtained from a video file. As an example, in a scene such as pedestrian re-recognition, an image to be recognized including a pedestrian may be acquired from an acquired video stream including the pedestrian. In other scenarios, the image to be identified may also be obtained from an image file. For example, the image to be identified may be obtained from a preset database in which image files are stored, and so on.

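In an optional embodiment of the present application, the identifying, by a preset identifier, the image to be identified, and determining parameter information of the target object includes:

calculating the object feature of the target object in the image to be identified through the preset identifier;

and comparing the object feature with preset object features in a feature database to determine the parameter information of the target object.

In an optional embodiment of the present application, the comparing the object feature with a preset object feature in a feature database, and determining parameter information of the target object includes:

determining the similarity between the object feature and the preset object features;

and determining the parameter information of the target object according to the preset object feature with the maximum similarity.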

As an example, taking the target object as a human object, the parameter information includes an identity parameter of the target object as an example: in an actual implementation scenario, the character features (i.e., the object features in the embodiments of the present application) of the target object may be extracted from the image to be identified by a preset identifier, where the character features may include facial features, head features, limb features, wearing features, and so on. And then, comparing the extracted character features with preset object features in a pre-configured feature database, and determining identity parameters of the target object through feature similarity obtained by comparison.

Specifically, the character characteristics of the character object with known identity parameters can be stored in the characteristic database, so that the identity parameter of the character object with the highest similarity with the target character characteristic can be determined as the identity parameter of the target object by comparing the characteristic similarity.

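In an optional embodiment of the present application, before the identifying the image to be identified by the preset identifier, the method further includes:

acquiring a training sample set; the training sample set comprises sample images; the sample image comprises a sample object;

inputting the sample images into an initial guiding model to obtain a corresponding identification result of each sample image; the identification result comprises initial guiding characteristics and identification parameter information of the sample object;

determining a training loss value according to the identification parameter information and the reference parameter information;

training the initial guiding model based on the training loss value to obtain the guiding model conforming to training conditions;

inputting a sample image into the initial recognition model to obtain recognition image characteristics;

downsampling the initial guiding feature to obtain the guiding image feature; the feature dimension of the guiding image feature is the same as that of the identification image feature;

determining a training loss value according to the guiding image characteristics and the identifying image characteristics;

and based on the training loss value, repeating training on the initial recognition model until the preset recognizer meeting the training ending condition is obtained.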

Optionally, in the embodiments of the present application, the initial guiding model may use a ResNet50 convolutional neural network model. Fig. 3 is a schematic structural diagram of ResNet50. ResNet50 includes 5 stages, where the structure of Stage 0 is relatively simple and can be regarded as a preprocessing stage for the INPUT. The remaining 4 stages (Stage 1 to Stage 4) are each composed of Bottleneck structures, which can be divided into BTNK1 and BTNK2; the specific structures of BTNK1 and BTNK2 are shown in Fig. 4. Stage 1 contains 3 Bottlenecks, Stage 2 contains 4 Bottlenecks, Stage 3 contains 6 Bottlenecks, and Stage 4 contains 3 Bottlenecks.
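The stage layout above can be written down as a small configuration and checked against the "50" in the model's name. This sketch assumes the standard ResNet50 layout, in which each Bottleneck has three convolutional layers on its main path, Stage 0 contributes one convolution, and the network ends with one fully connected layer; the variable names are illustrative.

```python
# Number of Bottleneck blocks in Stage 1 to Stage 4, as listed above.
BOTTLENECKS_PER_STAGE = {"Stage 1": 3, "Stage 2": 4, "Stage 3": 6, "Stage 4": 3}

# Assumption: three convolutional layers on the main path of each Bottleneck.
CONVS_PER_BOTTLENECK = 3

total_bottlenecks = sum(BOTTLENECKS_PER_STAGE.values())
bottleneck_convs = total_bottlenecks * CONVS_PER_BOTTLENECK

# Assumption: one convolution in Stage 0 plus one final fully connected layer.
weighted_layers = bottleneck_convs + 2
```

Under these assumptions the 16 Bottlenecks account for 48 convolutional layers, and the total of 50 weighted layers matches the model name.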

In the embodiment of the present application, the training loss can use both a triplet loss (Triplet Loss) function and a classification loss (Classification Loss) function.

The Triplet Loss is computed over triplets, each consisting of an anchor, a positive sample (positive) and a negative sample (negative). For a triplet (a, p, n), the Triplet Loss can be expressed as: $L = \max(d(a, p) - d(a, n) + \mathrm{margin},\ 0)$.

Here d(a, p) and d(a, n) are values of a user-defined distance function d. Minimizing the loss reduces the distance between the positive sample and the anchor and enlarges the distance between the negative sample and the anchor. margin represents a distance.

Based on a triplet, a positive pair <a, p> and a negative pair <a, n> can be constructed. The purpose of the Triplet Loss is to separate the positive pair from the negative pair by a distance (margin).
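As a minimal sketch of the formula above: the text leaves the distance function d user-defined, so the Euclidean distance used here is an assumption, as are the function name `triplet_loss`, the margin value, and the toy vectors.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """L = max(d(a, p) - d(a, n) + margin, 0) with Euclidean d (an assumption)."""
    d_ap = np.linalg.norm(anchor - positive)   # distance anchor to positive
    d_an = np.linalg.norm(anchor - negative)   # distance anchor to negative
    return float(max(d_ap - d_an + margin, 0.0))

a = np.array([0.0, 0.0])
near = np.array([3.0, 4.0])   # Euclidean distance 5 from the anchor
far = np.array([6.0, 8.0])    # Euclidean distance 10 from the anchor

# Positive is closer than negative by more than the margin: no loss.
satisfied = triplet_loss(a, near, far, margin=1.0)
# Roles swapped: the positive is farther than the negative, so the loss is positive.
violated = triplet_loss(a, far, near, margin=1.0)
print(satisfied, violated)
```

The loss is zero once d(a, n) exceeds d(a, p) by at least the margin, so triplets that are already well separated stop contributing to the gradient.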

The Classification Loss is used to distinguish the IDs of different identities. The Classification Loss can be expressed specifically as:

where y' represents the output of the activation function, which lies between 0 and 1, and y represents a recognition result parameter: y = 1 if the recognized identity is the same as the true identity, and y = 0 otherwise. Under this classification loss, for positive samples, the larger the output probability, the smaller the loss; for negative samples, the smaller the output probability, the smaller the loss.
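The formula itself is not reproduced in this text. One standard loss that matches the description of y and y' above is binary cross-entropy; taking it as the intended form is an assumption and is not confirmed by the text:

```latex
L_{\mathrm{cls}} = -\bigl[\, y \log y' + (1 - y) \log (1 - y') \,\bigr]
```

With y = 1 this reduces to -log y', which decreases as y' grows; with y = 0 it reduces to -log(1 - y'), which decreases as y' falls. Both behaviours agree with the description above.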

In the training process of the present application, the initial guiding feature in the recognition result is the object feature of the sample object recognized by the initial guiding model. The recognition parameter information is the object parameter of the sample object output by the initial guiding model. The reference parameter information is the real object parameter of the sample object.

Optionally, in the embodiment of the present application, the initial recognition model may use a MobileNet convolutional neural network model. The model structure of MobileNet is shown in fig. 5.

In the training process, the feature dimension of the initial guiding feature obtained through the guiding model is larger than the feature dimension of the recognition image feature. To realize compression of the model, the initial guiding feature may therefore be downsampled during training to obtain a guiding image feature with the same feature dimension as the recognition image feature. Optionally, the sampling process may include maximum sampling (max pooling), average sampling (average pooling), and the like.

For example, the feature dimension of the initial guiding feature is (N, N/2, D), and the feature dimension of the recognition image feature is (N, N/2, D/2). For the initial guiding feature, a sliding window (Sliding Window) sampling method may be used with Window Size set to 2 and Stride set to 2: the maximum sampling feature F_max and the average sampling feature F_avg of the initial guiding feature are calculated, the two are added element-wise, and the element-wise average of the sum is taken to obtain the guiding image feature F_T. The feature dimension of the guiding image feature F_T is the same as that of the recognition image feature, i.e., (N, N/2, D/2).
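The sampling step in this example can be sketched as follows. The sketch assumes that the window slides along the last axis (the only one whose size changes, from D to D/2) and reads "accumulate and then average" as F_T = (F_max + F_avg) / 2; the function name `guide_feature` and the toy sizes are hypothetical.

```python
import numpy as np

def guide_feature(f_guide, window=2, stride=2):
    """Downsample the last axis with a sliding window.

    Returns (F_max + F_avg) / 2, where F_max and F_avg are the per-window
    maximum and mean. Pooling along the last axis is an assumption.
    """
    d = f_guide.shape[-1]
    starts = range(0, d - window + 1, stride)
    # Shape (..., num_windows, window): one slice of the last axis per window.
    windows = np.stack([f_guide[..., s:s + window] for s in starts], axis=-2)
    f_max = windows.max(axis=-1)    # maximum sampling feature
    f_avg = windows.mean(axis=-1)   # average sampling feature
    return (f_max + f_avg) / 2.0    # element-wise sum, then average

N, D = 4, 8
f_guide = np.arange(N * (N // 2) * D, dtype=float).reshape(N, N // 2, D)
f_t = guide_feature(f_guide)
print(f_guide.shape, "->", f_t.shape)
print(f_t[0, 0])
```

With Window Size 2 and Stride 2 the windows do not overlap, so each output element summarizes exactly two input elements and the last dimension is halved.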

In the training process, a training loss value can be determined according to the difference between the guiding image feature and the recognition image feature, and the initial recognition model is iteratively corrected based on the training loss value until a preset recognizer meeting the training condition is obtained.

The training loss can adopt an L2-norm constraint, so that the guiding image features constrain and guide the recognition image features, and the output of each layer of the initial recognition model is as close as possible to the output of the corresponding layer of the guiding model, thereby achieving a better feature extraction effect.
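A minimal sketch of such an L2-norm feature constraint is given below. The text does not say whether the norm is squared or how the per-layer terms are combined, so the plain (unsquared) norm and the simple sum over layers are assumptions, as are the function names.

```python
import numpy as np

def l2_feature_loss(f_guide, f_student):
    """L2 norm of the difference between a guiding feature and a recognition feature."""
    f_guide = np.asarray(f_guide, dtype=float)
    f_student = np.asarray(f_student, dtype=float)
    if f_guide.shape != f_student.shape:
        raise ValueError("features must have the same dimensions; downsample the guiding feature first")
    return float(np.linalg.norm((f_guide - f_student).ravel()))

def total_feature_loss(layer_pairs):
    """Sum the per-layer losses over (guiding, recognition) feature pairs (an assumption)."""
    return sum(l2_feature_loss(g, s) for g, s in layer_pairs)

f_t = np.array([[3.0, 0.0], [0.0, 4.0]])   # toy guiding image feature
f_s = np.zeros((2, 2))                     # toy recognition image feature
single = l2_feature_loss(f_t, f_s)
total = total_feature_loss([(f_t, f_s), (f_t, f_t)])
print(single, total)
```

The shape check reflects why the guiding feature is downsampled beforehand: the difference is only defined when both features have the same dimensions.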

In one embodiment of the present application, the acquiring a training sample set includes:

the training sample set is obtained based on the sample image.

In this embodiment, to improve the recognition capability of the model, a first image of a first object under different illumination intensities may be acquired when the training sample set is constructed. Taking the first object as a pedestrian, for example, pedestrian images under different illumination intensities may be acquired in an actual scene: real-world illumination may be simulated by multiple light sources, and pedestrian images at different angles may be acquired by multiple cameras.

For illumination adjustment, a linear dimming mode can be adopted, with a 0-10 V analog signal controlling the light intensity. For different illumination intensities, a GAMMA value calculation can be used to match the real external light intensity, so as to adapt to real illumination conditions.

For labeling, the initial ID of a pedestrian can be marked first. A tracking-by-detection algorithm is then adopted, with YOLOv5 as the baseline network, to track the pedestrian; data are collected through a bounding-box regression algorithm, and the multi-view images of the pedestrian are labeled.

For the case where multiple people appear in a picture, the embodiment of the present application can take one view angle as a reference view angle and apply multi-view positioning to the other view angles, so as to distinguish the identities of different pedestrians across different cameras and achieve automatic labeling of multiple people. Automatic labeling greatly reduces the data acquisition cost and the data labeling cost, and is reusable and extensible.

In one embodiment of the present application, after the determining the parameter information of the target object, the method further includes:

labeling the image to be identified according to the parameter information;

In some optional embodiments, the training sample set may also be expanded as the preset recognizer is used. For example, after the parameter information of the target object is recognized, the image to be identified may be labeled according to the parameter information, that is, the parameter information is attached to the image as its label, and the labeled image is then used as a sample image. After the training sample set is expanded in this way, the guiding model is fine-tuned on the expanded training sample set, and the fine-tuned guiding model then guides the training of the initial recognition model, thereby improving the accuracy of the model.

The application scenario and the overall implementation flow of the present application are described below with reference to fig. 6:

The embodiment of the present application can be applied to a pedestrian re-identification scenario. In the implementation process, a Teacher Model (the guiding model of the present application) is first trained using the current training data set and a ResNet50 model. A Student Model (the preset recognizer of the present application) is then trained using the current training data set, the trained Teacher Model, and a MobileNet model. The image recognition platform performs recognition through the Student Model: the Student Model receives the image to be identified and outputs a recognition result, and the offline pedestrian data set is expanded with these results. The Teacher Model is fine-tuned offline using the expanded data set, the Student Model is updated offline using only the guidance of the Teacher Model, and the Student Model on the platform is iteratively updated.

An embodiment of the present application provides an image recognition apparatus, as shown in fig. 7, the image recognition apparatus 70 may include: an image acquisition module 701, an image recognition module 702, wherein,

an image acquisition module 701, configured to acquire an image to be identified including a target object;

the image recognition module 702 is configured to perform recognition processing on the image to be recognized by using a preset recognizer, and determine parameter information of the target object;

In one embodiment of the present application, the image acquisition module is specifically configured to capture, from a target multimedia file, an image frame containing the target object;

or,

acquire, from a preset database, an image frame containing the target object.

In one embodiment of the present application, the image recognition module is specifically configured to calculate, by using the preset identifier, an object feature of the target object in the image to be recognized;

In one embodiment of the present application, the image recognition module is specifically configured to determine a similarity between the image feature and the preset object feature;

In one embodiment of the present application, the apparatus further includes a first training module, configured to obtain a training sample set before the identifying, by a preset identifier, the image to be identified; the training sample set comprises sample images; the sample image comprises a sample object;

In one embodiment of the present application, the apparatus further includes a second training module, configured to, before the identifying the image to be identified by the preset identifier,

In one embodiment of the present application, the first training module is specifically configured to obtain a first image of a first object under different illumination intensities;

The training sample set is obtained based on the sample image.

In one embodiment of the present application, the apparatus further includes a labeling module, configured to label the image to be identified according to the parameter information after the parameter information of the target object is determined;

In one embodiment of the present application, the downsampling process includes at least one of:

maximum sampling processing;

average sampling processing.

The apparatus of the embodiments of the present application may perform the method provided by the embodiments of the present application, and its implementation principle is similar. The actions performed by each module of the apparatus correspond to the steps of the method in the embodiments of the present application; for a detailed functional description of each module, refer to the description of the corresponding method above, which is not repeated here.

An embodiment of the present application provides an electronic device, including a memory, a processor, and at least one program stored in the memory for execution by the processor. When executed by the processor, the program implements the following: the parameter information of the target object may be determined by performing recognition processing on the image to be recognized using a preset recognizer. The preset recognizer is obtained by training based on an initial recognition model and a guiding model, where the guiding model is a large model with good performance and generalization capability. The knowledge learned by the guiding model can be used to guide the training of the initial recognition model, so that the initial recognition model, as the small model, learns the good performance and generalization capability of the guiding model, and the preset recognizer meeting the training conditions is finally obtained. The embodiments of the present application therefore improve the performance of the preset recognizer, reduce its parameter count, make it lightweight and convenient, and increase the speed of image recognition.

An optional embodiment provides an electronic device. As shown in fig. 8, the electronic device 4000 includes a processor 4001 and a memory 4003. The processor 4001 is coupled to the memory 4003, for example via a bus 4002. Optionally, the electronic device 4000 may further include a transceiver 4004, which may be used for data interaction between the electronic device and other electronic devices, such as transmitting and/or receiving data. It should be noted that, in practical applications, the transceiver 4004 is not limited to one, and the structure of the electronic device 4000 does not constitute a limitation on the embodiments of the present application.

The processor 4001 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various exemplary logic blocks, modules, and circuits described in connection with this disclosure. The processor 4001 may also be a combination that implements computing functions, e.g., a combination of one or more microprocessors, a combination of a DSP and a microprocessor, etc.

Bus 4002 may include a path for transferring information between the aforementioned components. Bus 4002 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 4002 can be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in fig. 8, but this does not mean that there is only one bus or one type of bus.

Memory 4003 may be, but is not limited to, a ROM (Read Only Memory) or other type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or other type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.

The memory 4003 is used for storing the application program code (computer programs) for executing the solutions of the present application, and execution is controlled by the processor 4001. The processor 4001 is configured to execute the application program code stored in the memory 4003 to implement what is shown in the foregoing method embodiments.

The electronic devices include, but are not limited to: mobile phones, notebook computers, multimedia players, desktop computers, etc.

The present application provides a computer readable storage medium having a computer program stored thereon, which when run on a computer, causes the computer to perform the corresponding method embodiments described above.

The terms "first," "second," "third," "fourth," "1," "2," and the like in the description and in the claims of this application and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the present application described herein may be implemented in other sequences than those illustrated or otherwise described.

It should be understood that, although the flowcharts of the embodiments of the present application indicate the respective operation steps by arrows, the order of implementation of these steps is not limited to the order indicated by the arrows. In some implementations of embodiments of the present application, the implementation steps in the flowcharts may be performed in other orders as desired, unless explicitly stated herein. Furthermore, some or all of the steps in the flowcharts may include multiple sub-steps or multiple stages based on the actual implementation scenario. Some or all of these sub-steps or phases may be performed at the same time, or each of these sub-steps or phases may be performed at different times, respectively. In the case of different execution time, the execution sequence of the sub-steps or stages may be flexibly configured according to the requirement, which is not limited in the embodiment of the present application.

The foregoing is merely an optional implementation of some implementation scenarios of the present application. It should be noted that, for those skilled in the art, other similar implementations based on the technical ideas of the present application, adopted without departing from those technical ideas, also fall within the protection scope of the embodiments of the present application.

Claims

1. An image recognition method, comprising:

acquiring an image to be identified comprising a target object;

2. The image recognition method according to claim 1, wherein the acquiring the image to be recognized including the target object includes:

or,

acquiring an image frame containing the target object in a preset database.

3. The image recognition method according to claim 1, wherein the performing recognition processing on the image to be recognized by a preset recognizer, determining parameter information of the target object, includes:

4. The image recognition method according to claim 1, wherein the comparing the object feature with a preset object feature in a feature database, and determining parameter information of the target object, includes:

5. The image recognition method according to claim 1, wherein before the recognition processing of the image to be recognized by a preset recognizer, the method further comprises:

6. The image recognition method according to claim 5, wherein before the recognition processing of the image to be recognized by a preset recognizer, the method further comprises:

7. The method of image recognition according to claim 5, wherein the acquiring a training sample set comprises:

the training sample set is obtained based on the sample image.

8. The image recognition method according to claim 5, wherein after the determining of the parameter information of the target object, the method further comprises:

labeling the image to be identified according to the parameter information;

9. The image recognition method of claim 6, wherein the downsampling process comprises at least one of:

maximum sampling processing;

average sampling processing.

10. An image recognition apparatus, comprising:

11. An electronic device comprising a memory, a processor and a computer program stored on the memory, characterized in that the processor executes the computer program to carry out the steps of the image recognition method according to any one of claims 1-9.

12. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the image recognition method according to any one of claims 1-9.