CN116958777A - Image recognition method and device, storage medium and electronic equipment - Google Patents

Info

Publication number
CN116958777A
Authority
CN
China
Prior art keywords
target
sample
recognition model
samples
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310239686.9A
Other languages
Chinese (zh)
Inventor
张博深
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The application discloses an image recognition method, an image recognition device, a storage medium and electronic equipment. The method comprises: acquiring a target image to be recognized; inputting the target image into an image recognition model, wherein the image recognition model is a neural network model for recognizing images that is obtained by performing primary training and secondary training with a plurality of target samples, each target sample is a clean sample or a noise sample, the noise samples are obtained by screening the plurality of target samples with a target loss curve during the secondary training, and the update weight of the noise samples on the image recognition model is smaller than that of the clean samples on the image recognition model; and obtaining an image recognition result output by the image recognition model, wherein the image recognition result indicates the image type of the target image. The method can be applied to artificial intelligence scenarios and involves machine learning and related technologies. The application solves the technical problem of low image recognition accuracy.

Description

Image recognition method and device, storage medium and electronic equipment
Technical Field
The present application relates to the field of computers, and in particular, to an image recognition method, an image recognition device, a storage medium, and an electronic apparatus.
Background
In image recognition scenarios, images are generally recognized by means of an image recognition model. However, the differences between many types of images may be slight, which leads to low image recognition accuracy.
In view of the above problems, no effective solution has been proposed at present.
Disclosure of Invention
The embodiment of the application provides an image recognition method, an image recognition device, a storage medium and electronic equipment, which are used for at least solving the technical problem of low image recognition accuracy.
According to an aspect of an embodiment of the present application, there is provided an image recognition method including: acquiring a target image to be identified; inputting the target image into an image recognition model, wherein the image recognition model is a neural network model which is obtained by performing primary training and secondary training by utilizing a plurality of target samples, the target samples are clean samples or noise samples, the noise samples are samples obtained by screening the plurality of target samples by utilizing a target loss curve in the training process of the secondary training, the target loss curve is a loss curve corresponding to each target sample in the plurality of target samples obtained in the training process of the primary training, and the update weight of the noise samples to the image recognition model is smaller than that of the clean samples to the image recognition model; and acquiring an image recognition result output by the image recognition model, wherein the image recognition result is used for indicating the image type of the target image.
According to another aspect of the embodiment of the present application, there is also provided an image recognition apparatus including: the first acquisition unit is used for acquiring a target image to be identified; the first input unit is used for inputting the target image into an image recognition model, wherein the image recognition model is a neural network model which is obtained by performing primary training and secondary training on a plurality of target samples and is used for recognizing the image, the target samples are clean samples or noise samples, the noise samples are samples obtained by screening the plurality of target samples by using target loss curves in the training process of the secondary training, the target loss curves are loss curves corresponding to each target sample in the plurality of target samples obtained in the training process of the primary training, and the update weight of the noise samples on the image recognition model is smaller than that of the clean samples on the image recognition model; and the second acquisition unit is used for acquiring an image recognition result output by the image recognition model, wherein the image recognition result is used for indicating the image type of the target image.
As an alternative, the apparatus further comprises: a third acquisition unit, configured to acquire the plurality of target samples before the target image is input into the image recognition model, where each target sample carries any one of at least two labels; a second input unit, configured to, before the target image is input into the image recognition model, input the plurality of target samples into an initial first recognition model to perform the primary training until a trained second recognition model is obtained, and acquire the target loss curve corresponding to each of the plurality of target samples during the primary training; a third input unit, configured to, before the target image is input into the image recognition model, input the plurality of target samples into the second recognition model and screen each target sample by using the target loss curve to obtain sample reference information, where the sample reference information indicates whether each target sample is a clean sample or a noise sample; and a training unit, configured to, before the target image is input into the image recognition model, iteratively update the first recognition model based on the sample reference information until the trained image recognition model is obtained.
As an alternative, the second acquisition unit includes: an acquisition module, configured to acquire the image recognition result output by the image recognition model, where the image recognition result indicates that the target image is of the image type matching a target label, and the target label is one of the at least two labels.
As an alternative, the third input unit includes: the first input module is used for inputting the target samples into the second recognition model for multiple rounds of iterative training; the first screening module is configured to screen each target sample by using the target loss curve in the process of multiple rounds of iterative training, so as to obtain sample reference information of each round of the target sample in the process of multiple rounds of iterative training, where the sample reference information of each round is used to indicate that each target sample belongs to the clean sample or the noise sample in each round of multiple rounds of iterative training.
As an alternative, the training unit includes: an execution module, configured to execute the following steps until the image recognition model is obtained: determining the first recognition model of the current round and the current-round sample reference information corresponding to each target sample, where the current-round sample reference information indicates whether each target sample is a clean sample or a noise sample in the current round of the multiple rounds of iterative training; determining, from the plurality of target samples, a first target sample that the current-round sample reference information indicates is a noise sample and a second target sample that the current-round sample reference information indicates is a clean sample; when the current-round training result corresponding to the first recognition model of the current round does not satisfy a convergence condition, performing back propagation with the second target sample to update the model weights of the first recognition model, and taking the updated first recognition model as the first recognition model of the current round; and when the current-round training result satisfies the convergence condition, determining the first recognition model of the current round as the trained image recognition model.
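The round-by-round procedure above can be illustrated with a minimal sketch. It is not the implementation of the application: a one-parameter linear model stands in for the first recognition model, a mean-squared-error threshold stands in for the unspecified convergence condition, and the names `train_with_clean_samples`, `info_for_round`, `lr`, `tol` and `max_rounds` are hypothetical.

```python
def train_with_clean_samples(samples, info_for_round, lr=0.05, tol=1e-8, max_rounds=1000):
    """Toy stand-in for the iterative update: fit y ~ w * x using clean samples only.

    samples:        {sample_id: (x, y)}
    info_for_round: callable returning {sample_id: "clean" | "noise"} for a round
                    (plays the role of the current-round sample reference information)
    """
    w = 0.0  # model weight of the toy "first recognition model"
    for round_idx in range(max_rounds):
        info = info_for_round(round_idx)
        # Second target samples: those indicated as clean in this round.
        clean = [samples[sid] for sid in samples if info[sid] == "clean"]
        if not clean:
            break
        # Current-round training result: mean squared error on the clean samples.
        loss = sum((w * x - y) ** 2 for x, y in clean) / len(clean)
        if loss < tol:  # assumed convergence condition
            break
        # Gradient step computed from the clean samples only; noise samples
        # contribute nothing to the update in this sketch.
        grad = sum(2 * (w * x - y) * x for x, y in clean) / len(clean)
        w -= lr * grad
    return w


# Hypothetical data: "s3" carries a wrong label and is marked as noise.
samples = {"s1": (1.0, 2.0), "s2": (2.0, 4.0), "s3": (3.0, -6.0)}
fixed_info = {"s1": "clean", "s2": "clean", "s3": "noise"}
w_final = train_with_clean_samples(samples, lambda round_idx: fixed_info)
```

In this sketch the noise sample is excluded from the update entirely, which is the limiting case of giving it a smaller update weight than the clean samples.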
As an alternative, the apparatus further comprises: a first training module, configured to train the first recognition model of the current round with the second target sample to obtain the current-round training result, before the second target sample is used for back propagation to update the model weights of the first recognition model and the updated first recognition model is taken as the first recognition model of the current round; or, the apparatus further comprises: a second training module, configured to train the first recognition model of the current round with the plurality of target samples to obtain the current-round training result, before the first target sample that the current-round sample reference information indicates is a noise sample and the second target sample that the current-round sample reference information indicates is a clean sample are determined from the plurality of target samples; the apparatus further comprises: a first determining unit, configured to, after the first recognition model of the current round is trained with the plurality of target samples to obtain the current-round training result, determine the first target sample and the second target sample from the plurality of target samples when the current-round training result corresponding to the first recognition model of the current round does not satisfy the convergence condition.
As an alternative, the third input unit includes: the second input module is used for inputting the target samples into the second recognition model for multiple rounds of iterative training; the statistics module is used for counting loss change curves respectively corresponding to the target samples in the multi-round iterative training process after the multi-round iterative training is finished; and the combination module is used for combining the loss change curve and the target loss curve to screen each target sample so as to obtain overall sample reference information, wherein the overall sample reference information is used for indicating that each target sample belongs to the clean sample or the noise sample in the overall round of the multi-round iterative training.
As an alternative, the training unit includes: a second determining unit configured to determine, from the plurality of target samples, a third target sample belonging to the noise sample indicated by the whole sample reference information, and a fourth target sample belonging to the clean sample indicated by the whole sample reference information; and the third training module is used for carrying out iterative training on the first recognition model by utilizing the fourth target sample, and updating the model weight of the first recognition model in the iterative training process until the trained image recognition model is obtained.
As an alternative, the apparatus further comprises: the calculating unit is used for calculating primary reference values corresponding to each sample category by using the target loss curve before the target samples are input into the second recognition model and the target loss curve is used for screening each target sample to obtain sample reference information, wherein the sample category is used for representing the category corresponding to the carried label; the third input unit includes: the third input module is used for inputting the plurality of target samples into the second recognition model to obtain a secondary loss curve corresponding to each target sample; the first calculation module is used for calculating a secondary reference value corresponding to each target sample by using the secondary loss curve; the comparison module is used for comparing the primary reference value and the secondary reference value which belong to the same sample category to obtain a target comparison result; and the second screening module is used for screening each target sample according to the target comparison result to obtain the sample reference information.
As an alternative, the computing unit includes: the second calculation module is used for calculating a first loss mean value and a first loss variance corresponding to each sample category by using the target loss curve, wherein the first loss mean value is the loss mean value corresponding to all target samples belonging to the same sample category, and the first loss variance is the loss variance corresponding to all target samples belonging to the same sample category; the first computing module includes: the calculation sub-module is used for calculating a second loss mean value and a second loss variance corresponding to each target sample by using the secondary loss curve; the comparison module comprises: the comparison sub-module is used for comparing the first loss average value and the second loss average value which belong to the same sample category to obtain a first comparison result; and comparing the first loss variance and the second loss variance belonging to the same sample category to obtain a second comparison result; the second screening module includes: and the screening submodule is used for screening each target sample according to the first comparison result and the second comparison result to obtain the sample reference information.
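The comparison of loss means and variances described above can be sketched as follows. The text does not state the exact decision rule, so the rule used here (a sample is treated as a noise sample when both its second loss mean and its second loss variance exceed the first loss mean and first loss variance of its category) is only an assumption, and the function names and sample data are hypothetical.

```python
from statistics import fmean, pvariance


def class_reference(primary_curves, sample_class):
    """First loss mean and first loss variance per sample category, pooled over
    the primary-training loss values of all samples in that category."""
    pooled = {}
    for sid, curve in primary_curves.items():
        pooled.setdefault(sample_class[sid], []).extend(curve)
    return {cls: (fmean(vals), pvariance(vals)) for cls, vals in pooled.items()}


def screen_by_reference(secondary_curves, sample_class, reference):
    """Compare each sample's second loss mean/variance with its category's
    first loss mean/variance (assumed rule: both higher -> noise)."""
    info = {}
    for sid, curve in secondary_curves.items():
        ref_mean, ref_var = reference[sample_class[sid]]
        mean_high = fmean(curve) > ref_mean      # first comparison result
        var_high = pvariance(curve) > ref_var    # second comparison result
        info[sid] = "noise" if (mean_high and var_high) else "clean"
    return info


# Hypothetical loss curves for three samples that carry the same label.
sample_class = {"a": "scratch", "b": "scratch", "c": "scratch"}
primary_curves = {"a": [1.0, 0.6, 0.2], "b": [1.0, 0.5, 0.3], "c": [1.2, 1.0, 1.1]}
secondary_curves = {"a": [0.3, 0.2, 0.1], "b": [0.4, 0.3, 0.2], "c": [1.6, 0.4, 1.5]}

reference = class_reference(primary_curves, sample_class)
sample_reference_info = screen_by_reference(secondary_curves, sample_class, reference)
```

Other rules, such as flagging a sample when either comparison fails or using a margin expressed in standard deviations, would fit the same structure.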
According to still another aspect of an embodiment of the present application, there is provided a computer-readable storage medium, wherein the computer-readable storage medium includes a computer program, and wherein the computer program is executed by an electronic device (e.g., a user device or a server) to perform an image recognition method as above.
According to still another aspect of the embodiment of the present application, there is further provided an electronic device including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the image recognition method described above through the computer program.
In the embodiments of the application, a target image to be recognized is acquired; the target image is input into an image recognition model, wherein the image recognition model is a neural network model for recognizing images that is obtained by performing primary training and secondary training with a plurality of target samples, each target sample is a clean sample or a noise sample, the noise samples are obtained by screening the plurality of target samples with a target loss curve during the secondary training, the target loss curve is the loss curve corresponding to each of the plurality of target samples obtained during the primary training, and the update weight of the noise samples on the image recognition model is smaller than that of the clean samples on the image recognition model; and an image recognition result output by the image recognition model is acquired, wherein the image recognition result indicates the image type of the target image. Image recognition is performed by the image recognition model; the loss curve obtained in the primary training stage is used to screen the samples during the secondary training stage, and the update weight of the screened-out noise samples on the image recognition model is reduced. This reduces the negative influence of the noise samples on the image recognition model, thereby improving image recognition accuracy and solving the technical problem of low image recognition accuracy.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a schematic illustration of an application environment of an alternative image recognition method according to an embodiment of the present application;
FIG. 2 is a schematic illustration of a flow of an alternative image recognition method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an alternative image recognition method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of another alternative image recognition method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of another alternative image recognition method according to an embodiment of the present application;
FIG. 6 is a schematic diagram of another alternative image recognition method according to an embodiment of the present application;
FIG. 7 is a schematic diagram of another alternative image recognition method according to an embodiment of the present application;
FIG. 8 is a schematic diagram of another alternative image recognition method according to an embodiment of the present application;
FIG. 9 is a schematic diagram of another alternative image recognition method according to an embodiment of the present application;
FIG. 10 is a schematic diagram of an alternative image recognition device according to an embodiment of the present application;
FIG. 11 is a schematic structural view of an alternative electronic device according to an embodiment of the present application.
Detailed Description
To help those skilled in the art better understand the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort shall fall within the scope of protection of the present application.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
For ease of understanding, the following terms are explained:
Artificial intelligence (AI) is the theory, methods, technology, and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Computer vision (CV) is the science of how to make machines "see". More specifically, it uses cameras and computers in place of human eyes to recognize, track, and measure targets, and further performs graphics processing so that the result is an image better suited for human observation or for transmission to an instrument for detection. As a scientific discipline, computer vision studies related theories and technologies in an attempt to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision technologies typically include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric recognition technologies such as face recognition and fingerprint recognition.
Machine learning (ML) is a multi-disciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It studies how computers simulate or implement human learning behavior to acquire new knowledge or skills, and how they reorganize existing knowledge structures to continuously improve their own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied throughout all areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
With the research and progress of artificial intelligence technology, it has been studied and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, autonomous driving, unmanned aerial vehicles, robots, smart healthcare, and smart customer service. As the technology develops, artificial intelligence is expected to be applied in more fields and to play an increasingly important role.
Cloud technology refers to a hosting technology that unifies hardware, software, network, and other resources within a wide area network or a local area network to realize the computation, storage, processing, and sharing of data.
Cloud technology is also a general term for the network technology, information technology, integration technology, management platform technology, application technology, and so on that are applied under the cloud computing business model. These resources can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support. The background services of technical network systems, such as video websites, picture websites, and other portal websites, require large amounts of computing and storage resources. With the rapid development and application of the internet industry, every item may in the future have its own identification mark, which needs to be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data need strong system backing, which can only be achieved through cloud computing.
Artificial intelligence cloud services are also commonly referred to as AIaaS (AI as a Service). This is currently the mainstream service mode of artificial intelligence platforms. Specifically, an AIaaS platform splits several common AI services and provides them independently or as packages in the cloud. This service mode is similar to an AI-themed mall: all developers can access one or more of the artificial intelligence services provided by the platform through an API, and some experienced developers can also use the AI framework and AI infrastructure provided by the platform to deploy and operate their own dedicated cloud artificial intelligence services.
The solution provided by the embodiments of the present application involves artificial intelligence technologies such as computer vision and machine learning, and is described in detail through the following embodiments:
According to an aspect of the embodiments of the present application, an image recognition method is provided. As an optional implementation, the image recognition method may be applied, but is not limited to being applied, in the environment shown in FIG. 1. The environment may include, but is not limited to, a user device 102 and a server 112. The user device 102 may include, but is not limited to, a display 104, a processor 106, and a memory 108; the server 112 includes a database 114 and a processing engine 116.
The specific process comprises the following steps:
Step S102, the user device 102 acquires a target image to be recognized;
Steps S104-S106, the target image is sent to the server 112 through the network 110;
Steps S108-S110, the server 112 inputs the target image into the image recognition model through the processing engine 116 and obtains the image recognition result output by the image recognition model;
Steps S112-S114, the image recognition result is sent to the user device 102 through the network 110, and the user device 102 displays the image recognition result on the display 104 through the processor 106 and stores the image recognition result in the memory 108.
In addition to the example shown in FIG. 1, the above steps may be performed by the user device or the server independently, or by the user device and the server in cooperation. For example, the user device 102 may itself perform steps such as inputting the target image into the image recognition model and acquiring the image recognition result output by the image recognition model, thereby reducing the processing load on the server 112. The user device 102 includes, but is not limited to, a handheld device (e.g., a mobile phone), a notebook computer, a tablet computer, a desktop computer, a vehicle-mounted device, a smart television, and the like; the application does not limit the specific implementation of the user device 102. The server 112 may be a single server, a server cluster composed of a plurality of servers, or a cloud server.
Optionally, as an optional implementation, as shown in FIG. 2, the image recognition method may be performed by an electronic device, such as the user device or the server shown in FIG. 1, and the specific steps include:
S202, acquiring a target image to be recognized;
S204, inputting the target image into an image recognition model, wherein the image recognition model is a neural network model for recognizing images that is obtained by performing primary training and secondary training with a plurality of target samples, each target sample is a clean sample or a noise sample, the noise samples are obtained by screening the plurality of target samples with a target loss curve during the secondary training, the target loss curve is the loss curve corresponding to each of the plurality of target samples obtained during the primary training, and the update weight of the noise samples on the image recognition model is smaller than that of the clean samples on the image recognition model;
S206, obtaining an image recognition result output by the image recognition model, wherein the image recognition result indicates the image type of the target image.
Optionally, in this embodiment, the above image recognition method may be applied, but is not limited to being applied, to industrial defect quality inspection, that is, quality inspection of industrial products during manufacturing. Conventional industrial quality inspection is generally performed by quality inspectors through manual visual inspection. In recent years, with the rise of AI technology, machine-vision-based AI quality inspection can greatly improve inspection accuracy and save labor costs.
Specifically, this technology trains an image recognition model with manually labeled samples. The input of the trained image recognition model is a picture taken of the surface of an industrial product, and the output is a defect confidence, which distinguishes defective images from non-defective images. However, industrial defect quality inspection is not a simple multi-class classification task: in many defective images the defect is so slight that the image could even be classified as defect-free, and different defects may be so similar that they are labeled incorrectly. Manual labels may therefore be highly subjective, which introduces noise into the manual defect annotations. Training a model on noisy labels degrades its performance, which in turn lowers image recognition accuracy.
To address the above problems, this embodiment uses multiple training passes to screen out samples with noisy labels and reduces the update weight of those samples on the image recognition model. This reduces the influence of noisily labeled samples on model performance and thereby improves image recognition accuracy.
Optionally, in this embodiment, the target sample may be, but not limited to, a sample image marked by various labels, where marking of the labels may have a problem of marking errors, so that marking information of the sample image may be noisy, and further the target sample may be further classified into a clean sample or a noise sample, where the clean sample may be, but not limited to, a sample image with no noise marking information, and the noise sample may be, but not limited to, a sample image with noise marking information.
Optionally, in this embodiment, the target loss curve is the loss curve corresponding to each of the plurality of target samples obtained during the primary training, where a loss curve may be, but is not limited to being, used to represent how the loss value of a sample changes during iterative training of the model. Taking the target loss curve as an example, during the multiple rounds of iterative training in the primary training, the loss value of each target sample in each round is recorded, and the recorded loss values are then collected to obtain the target loss curve of each target sample for the primary training.
Optionally, in this embodiment, the target loss curve is used to screen the plurality of target samples during the secondary training. If the target loss curve indicates how the loss value of each target sample changes, then a larger change in loss value may, but does not necessarily, indicate that the corresponding target sample differs more from the others, and one possible, but not the only, reason for such a difference is that the target sample carries noisy labeling information. Target samples with noisy labeling information (noise samples) are then filtered during the secondary training (for example, by reducing the update weight of the noise samples on the image recognition model), which reduces the influence of samples with noisy labels on model performance and further improves the accuracy of image recognition.
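The per-sample bookkeeping described above can be sketched as follows. This is a minimal illustration only: the function name `record_loss_curves`, the sample identifiers, and the loss values are hypothetical, and the per-round losses are assumed to have already been produced by some training loop.

```python
from collections import defaultdict


def record_loss_curves(per_round_losses):
    """Collect one loss curve per sample from per-round loss records.

    per_round_losses: list with one dict per training round, each
    mapping a sample id to that sample's loss value in that round.
    """
    curves = defaultdict(list)
    for round_losses in per_round_losses:
        for sample_id, loss in round_losses.items():
            curves[sample_id].append(loss)
    return dict(curves)


# Hypothetical losses over three rounds of the primary training.
history = [
    {"a": 1.2, "b": 1.5},
    {"a": 0.6, "b": 1.4},
    {"a": 0.2, "b": 1.6},
]
loss_curves = record_loss_curves(history)
```

Each entry of `loss_curves` is then the loss curve of one target sample across the rounds of the primary training.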
Optionally, in this embodiment, the update weight of a noise sample on the image recognition model is smaller than the update weight of a clean sample on the image recognition model. The update weight may be understood as, but is not limited to, the weight with which each target sample participates in the weight update of the image recognition model. The weight update may be understood as, but is not limited to, subtracting the back-propagated error from the original weight of the model. Because the back-propagated error may be positive or negative, the weight value decreases when the back-propagated error is positive and increases when it is negative. When a sample is a noise sample, its update weight is reduced (or set to zero, indicating that the noise sample does not participate in the weight update of the subsequent or current round);
further illustratively, as shown in fig. 3, the image recognition model is optionally secondarily trained using sample A, sample B, and sample C, and the clean samples and noise samples among them are determined using the target loss curve obtained from the primary training; for example, sample A is a clean sample, sample B is a noise sample, and sample C is a clean sample. In the weight update stage, the update weight of the noise sample is reduced or the update weight of the clean samples is increased, so that the update weight of the clean samples is larger than that of the noise sample; alternatively, the noise sample is simply excluded from the weight update of the subsequent or current round and only the clean samples are used, yielding an updated image recognition model. For example, the model weight of the image recognition model is 1 before the update and 2 after the update.
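The effect of giving noise samples a smaller update weight can be illustrated with a single scalar model weight. This is only a sketch under stated assumptions: the function `weighted_update`, the learning rate, the per-sample gradients, and the choice of a weighted average are all hypothetical and are not taken from the application.

```python
def weighted_update(weight, grads, is_noise, lr=0.1, noise_weight=0.0):
    """One gradient step in which noise samples contribute less.

    grads: per-sample gradients of the loss w.r.t. the model weight.
    is_noise: per-sample flags produced by the screening step.
    noise_weight: update weight of a noise sample (0.0 excludes it);
    clean samples always use an update weight of 1.0.
    """
    sample_weights = [noise_weight if flag else 1.0 for flag in is_noise]
    total = sum(sample_weights)
    if total == 0:
        return weight  # nothing may contribute, so skip the update
    avg_grad = sum(w * g for w, g in zip(sample_weights, grads)) / total
    # Subtract the (weighted) back-propagated error from the old weight.
    return weight - lr * avg_grad


# Samples A, B, C, where B has been flagged as a noise sample.
new_weight = weighted_update(
    1.0, grads=[0.5, -4.0, 1.5], is_noise=[False, True, False]
)
```

With `noise_weight=0.0` sample B is ignored entirely; a value between 0 and 1 would instead only damp its contribution.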
Optionally, in this embodiment, the image recognition result is used to indicate an image type of the target image, for example, the image recognition result may be, but not limited to, a confidence level of the target image, where the image type of the target image is a quality-acceptable image type if the confidence level is greater than or equal to a confidence threshold; otherwise, if the confidence coefficient is smaller than the confidence coefficient threshold value, the image type of the target image is the image type with unqualified quality; in addition, to improve the image recognition diversity, the image types may also include, but are not limited to, multiple types, such as a first image type, a second image type, a third image type, and the like, and the image recognition result may be, but is not limited to, confidence of various types of images.
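The threshold rule above can be written as a short sketch. The function names, the label strings, and the threshold value of 0.5 are assumptions made for illustration; the application does not fix a particular threshold.

```python
def classify_by_confidence(confidence, threshold=0.5):
    """Map a confidence to an image type using the rule described above."""
    return "qualified" if confidence >= threshold else "unqualified"


def classify_multi(confidences):
    """For several image types, pick the type with the highest confidence."""
    return max(confidences, key=confidences.get)


result_a = classify_by_confidence(0.8)
result_b = classify_by_confidence(0.2)
result_c = classify_multi({"first": 0.1, "second": 0.7, "third": 0.2})
```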
It should be noted that image recognition is performed by the image recognition model, the loss curve obtained in the primary training stage of the image recognition model is used to screen the samples used in the secondary training stage, and the update weight of the screened noise samples on the image recognition model is reduced. This reduces the negative influence of the noise samples on the image recognition model and thereby achieves the technical effect of improving image recognition accuracy.
As a further example, as shown in FIG. 4, a target image 402 to be identified is optionally acquired; the target image 402 is input into an image recognition model 404, wherein the image recognition model 404 is a neural network model for recognizing images, obtained by performing primary training and secondary training using a plurality of target samples 404-1, each target sample 404-1 is a clean sample or a noise sample, the noise samples are samples obtained by screening the plurality of target samples 404-1 using a target loss curve 404-3 during the secondary training, the target loss curve 404-3 is the loss curve corresponding to each target sample 404-1 obtained during the primary training, and the update weight of the noise samples on the image recognition model 404 is smaller than the update weight of the clean samples on the image recognition model 404; an image recognition result 406 output by the image recognition model 404 is obtained, wherein the image recognition result 406 is used to indicate the image type of the target image 402.
According to the embodiment provided by the application, the target image to be identified is obtained; the target image is input into an image recognition model, wherein the image recognition model is a neural network model obtained by performing primary training and secondary training using a plurality of target samples, each target sample is a clean sample or a noise sample, the noise samples are samples obtained by screening the plurality of target samples using target loss curves during the secondary training, the target loss curves are the loss curves corresponding to each of the plurality of target samples obtained during the primary training, and the update weight of the noise samples on the image recognition model is smaller than that of the clean samples on the image recognition model; and an image recognition result output by the image recognition model is acquired, wherein the image recognition result is used for indicating the image type of the target image. Image recognition is performed by the image recognition model, the loss curve obtained in the primary training stage of the image recognition model is used to screen the samples used in the secondary training stage, and the update weight of the screened noise samples on the image recognition model is reduced, so that the negative influence of the noise samples on the image recognition model is reduced and the technical effect of improving image recognition accuracy is achieved.
As an alternative, before inputting the target image into the image recognition model, the method further comprises:
s1-1, acquiring a plurality of target samples, wherein the target samples carry any one of at least two labels;
s1-2, inputting a plurality of target samples into an initial first recognition model for training for one time until a trained second recognition model is obtained, and acquiring a target loss curve corresponding to each target sample in the plurality of target samples in the training process of one time of training;
s1-3, inputting a plurality of target samples into a second recognition model, and screening each target sample by utilizing a target loss curve to obtain sample reference information, wherein the sample reference information is used for indicating that each target sample belongs to a clean sample or a noise sample;
s1-4, iteratively updating the first recognition model based on the sample reference information until a trained image recognition model is obtained.
Alternatively, in the present embodiment, as shown in fig. 4, the image recognition model 404-2 (the second recognition model) obtained by performing the first training on the initial first recognition model may be, but is not limited to, a training basis for the second training, and the image recognition model 404-4 obtained by the second training is determined as the trained image recognition model 404.
According to the embodiment provided by the application, a plurality of target samples are obtained, wherein the target samples carry any one of at least two labels; inputting a plurality of target samples into an initial first recognition model for one-time training until a trained second recognition model is obtained, and obtaining a corresponding target loss curve of each target sample in the plurality of target samples in a training process of one-time training; inputting a plurality of target samples into a second recognition model, and screening each target sample by utilizing a target loss curve to obtain sample reference information, wherein the sample reference information is used for indicating that each target sample belongs to a clean sample or a noise sample; and iteratively updating the first recognition model based on the sample reference information until a trained image recognition model is obtained, so that the aim of reducing the negative influence of the noise sample on the image recognition model is fulfilled, and the technical effect of improving the image recognition accuracy is realized.
As an alternative, obtaining the image recognition result output by the image recognition model includes:
and obtaining an image recognition result output by the image recognition model, wherein the image recognition result is used for indicating that the target image is the image type matched with the target label, and the target label is one of at least two labels.
Optionally, in this embodiment, the target sample may be, but not limited to, a sample image marked by any one of at least two labels, and marking of each label may have a problem of marking errors, so that marking information of the sample image is noisy, and the target sample may be further classified into a clean sample or a noisy sample.
As an alternative, inputting a plurality of target samples into the second recognition model, and screening each target sample by using a target loss curve to obtain sample reference information, including:
s2-1, inputting a plurality of target samples into a second recognition model to perform multi-round iterative training;
s2-2, screening each target sample by utilizing a target loss curve in the process of multiple rounds of iterative training to obtain sample reference information of each round of sample in the process of multiple rounds of iterative training, wherein the sample reference information of each round is used for indicating that each target sample belongs to a clean sample or a noise sample in each round of multiple rounds of iterative training.
Optionally, in this embodiment, screening each target sample using the target loss curve may be, but is not limited to, determining whether each target sample is a clean sample or a noise sample in each round of iterative training; for example, target sample 1 may be a clean sample in round 1 of iterative training but is not necessarily a clean sample in round 2, where it may instead be a noise sample.
It should be noted that, the sample reference information of each target sample in the multiple iterative training process changes along with the change of the iterative rounds and is not a fixed value, so that the noise sample screening process is an adaptive transformation along with the iterative rounds, and has stronger flexibility compared with the process of directly judging according to the fixed sample reference information.
According to the embodiment provided by the application, a plurality of target samples are input into a second recognition model for multiple rounds of iterative training; in the process of multiple rounds of iterative training, each target sample is screened by utilizing the target loss curve, so that each round of sample reference information of each target sample in the process of multiple rounds of iterative training is obtained, wherein each round of sample reference information is used for indicating that each target sample belongs to a clean sample or a noise sample in each round of multiple rounds of iterative training, and the aim of stronger flexibility of screening each target sample by utilizing the target loss curve is achieved, and the technical effect of improving the flexibility of sample screening is achieved.
As an alternative, the iterative updating of the first recognition model based on the sample reference information until a trained image recognition model is obtained includes:
The following steps are performed until an image recognition model is obtained:
s3-1, determining a first identification model of a current round and current round sample reference information corresponding to each target sample, wherein the current round sample reference information is used for indicating that each target sample belongs to a clean sample or a noise sample in a current round of multi-round iterative training;
s3-2, determining a first target sample which belongs to a noise sample and is indicated by the current round sample reference information and a second target sample which belongs to a clean sample and is indicated by the current round sample reference information from a plurality of target samples;
s3-3, in the case that the training result of the current round corresponding to the first recognition model of the current round does not meet the convergence condition, performing back propagation using the second target sample to update the model weight of the first recognition model, and determining the updated first recognition model as the first recognition model of the current round;
s3-4, in the case that the training result of the current round meets the convergence condition, determining the first recognition model of the current round as the trained image recognition model.
It should be noted that, in order to reduce the negative effect of the noise sample on the image recognition model, the method may, but is not limited to, performing the update of the recognition model only by using the clean sample after screening and determining the clean sample and the noise sample of the current round, and not using the noise sample, so that the negative effect caused by the noise sample is weakened by the updated recognition model.
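The round-by-round procedure of steps S3-1 to S3-4 can be sketched on a toy one-parameter model. Everything concrete here is an assumption for illustration: the model predicts `w * x`, the loss is squared error, convergence is taken to mean a near-zero gradient, and `screen_by_residual` is a hypothetical stand-in for screening with the target loss curve.

```python
def screen_by_residual(w, samples, limit=5.0):
    """Hypothetical per-round screening: flag samples with a large error."""
    return [abs(w * x - y) > limit for x, y in samples]


def train_clean_only(samples, w=0.0, lr=0.5, tol=1e-6, max_rounds=100):
    """Update the model each round using only samples flagged as clean."""
    for _ in range(max_rounds):
        # S3-1: sample reference information for the current round.
        is_noise = screen_by_residual(w, samples)
        # S3-2: split into noise samples and clean samples.
        clean = [s for s, flag in zip(samples, is_noise) if not flag]
        if not clean:
            break
        grad = sum(2 * (w * x - y) * x for x, y in clean) / len(clean)
        # S3-4: stop once the (assumed) convergence condition is met.
        if abs(grad) < tol:
            break
        # S3-3: back-propagate with the clean samples only.
        w -= lr * grad
    return w


# Two consistent samples and one sample with a wrong label.
samples = [(1.0, 2.0), (1.0, 2.0), (1.0, -10.0)]
trained_w = train_clean_only(samples)
```

Because the flags are recomputed every round, a sample that is excluded in one round may take part in a later one, which matches the adaptive behaviour described for the per-round sample reference information.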
According to the embodiment provided by the application, the following steps are performed until an image recognition model is obtained: determining a first recognition model of a current round and current-round sample reference information corresponding to each target sample, wherein the current-round sample reference information is used for indicating that each target sample belongs to a clean sample or a noise sample in the current round of the multi-round iterative training; determining, from the plurality of target samples, a first target sample that the current-round sample reference information indicates is a noise sample and a second target sample that the current-round sample reference information indicates is a clean sample; in the case that the training result of the current round corresponding to the first recognition model of the current round does not meet the convergence condition, performing back propagation using the second target sample to update the model weight of the first recognition model, and determining the updated first recognition model as the first recognition model of the current round; and in the case that the training result of the current round meets the convergence condition, determining the first recognition model of the current round as the trained image recognition model, so that the negative influence of noise samples on the image recognition model is reduced and the technical effect of improving the training quality of the recognition model is achieved.
As an alternative, before back-propagating with the second target sample to update the model weight of the first recognition model and determining the updated first recognition model as the first recognition model of the current round, the method further includes: training the first recognition model of the current round using the second target sample to obtain the training result of the current round;
in order to improve the training quality of the model, the noise samples and the clean samples are screened first, and the clean samples are then used directly for training to obtain the training result of the current round.
As an alternative, before determining, from the plurality of target samples, a first target sample that the current-round sample reference information indicates is a noise sample and a second target sample that the current-round sample reference information indicates is a clean sample, the method further includes: training the first recognition model of the current round using the plurality of target samples to obtain the training result of the current round;
after training the first recognition model of the current round using the plurality of target samples to obtain the training result of the current round, the method further includes: determining the first target sample and the second target sample from the plurality of target samples in the case that the training result of the current round corresponding to the first recognition model of the current round does not meet the convergence condition.
It should be noted that, in order to improve the training efficiency of the model, a plurality of target samples may be used directly for training without distinction, and then the noise samples and the clean samples are screened under the condition of non-convergence, and under the condition of convergence, the step of screening the noise samples and the clean samples may be omitted, thereby saving unnecessary steps in the model training process.
According to the embodiment provided by the application, the first recognition model of the current round is trained using the second target sample to obtain the training result of the current round; or the first recognition model of the current round is trained using the plurality of target samples to obtain the training result of the current round, and in the case that the training result of the current round corresponding to the first recognition model of the current round does not meet the convergence condition, the first target sample and the second target sample are determined from the plurality of target samples. This provides multiple model training modes for different requirements and achieves the technical effect of improving the diversity of model training.
As an alternative, inputting a plurality of target samples into the second recognition model, and screening each target sample by using a target loss curve to obtain sample reference information, including:
S4-1, inputting a plurality of target samples into a second recognition model to perform multi-round iterative training;
s4-2, after the multi-round iterative training is finished, calculating loss change curves respectively corresponding to each target sample in the multi-round iterative training process;
s4-3, screening each target sample by combining the loss change curve and the target loss curve to obtain overall sample reference information, wherein the overall sample reference information is used for indicating that each target sample belongs to a clean sample or a noise sample in an overall round of multi-round iterative training.
Optionally, in this embodiment, iteratively updating the first recognition model based on the sample reference information until a trained image recognition model is obtained includes:
s5-1, determining a third target sample which is indicated by the whole sample reference information and belongs to a noise sample and a fourth target sample which is indicated by the whole sample reference information and belongs to a clean sample from a plurality of target samples;
s5-2, performing iterative training on the first recognition model by using a fourth target sample, and updating the model weight of the first recognition model in the iterative training process until a trained image recognition model is obtained.
It should be noted that, to improve the completeness of model training, the noise samples and the clean samples may be, but are not limited to being, screened based on the target loss curve obtained from the primary training after the secondary training of the model, and a complete secondary training is then performed using the clean samples.
According to the embodiment provided by the application, a third target sample which is indicated by the whole sample reference information and belongs to a noise sample and a fourth target sample which is indicated by the whole sample reference information and belongs to a clean sample are determined from a plurality of target samples; and carrying out iterative training on the first recognition model by using a fourth target sample, and updating the model weight of the first recognition model in the iterative training process until a trained image recognition model is obtained, thereby achieving the aim of improving the model training integrity and further achieving the technical effect of improving the training quality of the model.
As an alternative, before inputting the plurality of target samples into the second recognition model and screening each target sample by using the target loss curve to obtain the sample reference information, the method further includes: calculating primary reference values corresponding to each sample category by using a target loss curve, wherein the sample category is used for representing the category corresponding to the carried label;
inputting a plurality of target samples into a second recognition model, and screening each target sample by utilizing a target loss curve to obtain sample reference information, wherein the method comprises the following steps:
s6-1, inputting a plurality of target samples into a second recognition model to obtain a secondary loss curve corresponding to each target sample;
S6-2, calculating a secondary reference value corresponding to each target sample by using a secondary loss curve;
s6-3, comparing the primary reference value and the secondary reference value belonging to the same sample category to obtain a target comparison result;
s6-4, screening each target sample according to the target comparison result to obtain sample reference information.
It should be noted that, in order to improve the accuracy of sample screening, the loss curve obtained from the primary training is used to calculate a primary reference value for each sample category, giving an average value for each category. The loss curve obtained from the secondary training is then used to calculate a secondary reference value for the sample to be judged, and this secondary reference value is compared with the primary reference value of the category to which the sample belongs, in order to determine whether the sample to be judged is below the average value. If so, the sample to be judged is below the average level of its category and is determined to be a noise sample.
According to the embodiment provided by the application, the primary reference value corresponding to each sample category is calculated by utilizing the target loss curve, wherein the sample category is used for representing the category corresponding to the carried label; inputting a plurality of target samples into a second recognition model to obtain a secondary loss curve corresponding to each target sample; calculating a secondary reference value corresponding to each target sample by using the secondary loss curve; comparing the primary reference value and the secondary reference value belonging to the same sample category to obtain a target comparison result; and screening each target sample according to the target comparison result to obtain sample reference information, thereby achieving the aim of improving the accuracy of sample screening and further achieving the technical effect of improving the training quality of the model.
As an alternative, calculating the primary reference value corresponding to each sample class by using the target loss curve includes: calculating a first loss mean value and a first loss variance corresponding to each sample category by using a target loss curve, wherein the first loss mean value is the loss mean value corresponding to all target samples belonging to the same sample category, and the first loss variance is the loss variance corresponding to all target samples belonging to the same sample category;
calculating a secondary reference value corresponding to each target sample by using the secondary loss curve, including: calculating a second loss mean value and a second loss variance corresponding to each target sample by using the secondary loss curve;
comparing the primary reference value and the secondary reference value belonging to the same sample category to obtain a target comparison result, wherein the method comprises the following steps: comparing the first loss average value and the second loss average value belonging to the same sample class to obtain a first comparison result; and comparing the first loss variance and the second loss variance belonging to the same sample class to obtain a second comparison result;
screening each target sample according to the target comparison result to obtain sample reference information, wherein the method comprises the following steps: and screening each target sample according to the first comparison result and the second comparison result to obtain sample reference information.
Optionally, in this embodiment, the mean may, but is not limited to, refer to the sum of all values in a set of data divided by the number of values in the set, and is an indicator of the central tendency of the data. The variance may be, but is not limited to, a measure of dispersion used in probability theory and statistics for a random variable or a set of data: in probability theory, the variance measures the degree of deviation between a random variable and its mathematical expectation (i.e., its mean); in statistics, the variance (sample variance) is the average of the squared differences between each sample value and the mean of all sample values.
According to the embodiment provided by the application, a first loss mean value and a first loss variance corresponding to each sample category are calculated by utilizing the target loss curve, wherein the first loss mean value is the loss mean value corresponding to all target samples belonging to the same sample category, and the first loss variance is the loss variance corresponding to all target samples belonging to the same sample category; calculating a second loss mean value and a second loss variance corresponding to each target sample by using the secondary loss curve; comparing the first loss average value and the second loss average value belonging to the same sample class to obtain a first comparison result; and comparing the first loss variance and the second loss variance belonging to the same sample class to obtain a second comparison result; and screening each target sample according to the first comparison result and the second comparison result to obtain sample reference information, so that the purpose of refining the screening granularity of the sample is achieved, and the technical effect of improving the screening fineness of the sample is achieved.
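The mean-and-variance comparison can be sketched as follows. The application does not spell out the exact comparison rule, so the rule used here is an assumption: a sample is flagged as a noise sample when both the mean and the variance of its secondary loss curve exceed the reference values of its category, in line with the observation that noise samples tend to show larger loss values and larger variance. The function names and all loss values are hypothetical.

```python
from statistics import mean, pvariance


def class_reference(primary_curves, labels):
    """Primary reference values: per-category loss mean and variance."""
    by_class = {}
    for sample_id, curve in primary_curves.items():
        by_class.setdefault(labels[sample_id], []).extend(curve)
    return {c: (mean(v), pvariance(v)) for c, v in by_class.items()}


def screen_samples(secondary_curves, labels, reference):
    """Compare each sample's secondary values with its category reference."""
    info = {}
    for sample_id, curve in secondary_curves.items():
        ref_mean, ref_var = reference[labels[sample_id]]
        # Assumed rule: both comparisons must indicate an outlier.
        noisy = mean(curve) > ref_mean and pvariance(curve) > ref_var
        info[sample_id] = "noise" if noisy else "clean"
    return info


labels = {"a": "ok", "b": "ok", "c": "ok"}
primary = {"a": [1.0, 0.5, 0.2], "b": [1.1, 0.4, 0.1], "c": [1.5, 2.0, 0.9]}
secondary = {"a": [0.3, 0.2, 0.1], "b": [0.2, 0.2, 0.1], "c": [2.5, 0.5, 3.0]}

reference = class_reference(primary, labels)
sample_info = screen_samples(secondary, labels, reference)
```

Here `sample_info` plays the role of the sample reference information, marking each target sample as a clean sample or a noise sample.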
As an alternative, to facilitate understanding, the above image recognition method is described as applied to an industrial defect quality inspection scenario. A conventional machine-vision-based industrial defect quality inspection algorithm extracts hand-crafted features from the input image, including gradient features, texture features, and the like, and then trains an SVM classifier (or a tree-based classifier, such as a random forest) on the extracted hand-crafted features to perform defect classification on the current picture.
However, the above method has at least two problems. First, the extracted hand-crafted features generalize poorly, and for diversified video data they often include harmful features that confuse the subsequent classifier. Second, feature extraction and classifier training are performed independently, so the training cost of the model is relatively high.
To address these two problems, the original image can be fed directly into a CNN network structure for feature extraction and then classified by a fully connected layer, with the model trained end to end using a softmax loss function. No manual feature design is needed at any step: the features best suited to the current classification task are learned automatically during model training, and feature extraction and classification do not need to be trained separately.
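For reference, the softmax loss mentioned above is commonly computed as the cross-entropy of the softmax of the classifier outputs. The sketch below assumes that standard form; the function names and the example logits are hypothetical, and the CNN itself is not modeled.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss(logits, label):
    """Cross-entropy between the softmax output and the labeled class."""
    return -math.log(softmax(logits)[label])


# Hypothetical scores for two classes, e.g. defect-free and defective.
probs = softmax([2.0, 0.5])
loss_value = softmax_loss([2.0, 0.5], label=0)
```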
However, the above-mentioned whole-image classification based on CNN still has some problems, such as quality inspection of industrial defects is not a simple multi-classification task, the degree of many defect images is slight (even can be classified as OK images), and there may be a problem that similarity between different defects is high, so that labeling errors are caused, at this time, an artificial label may have many subjectivity, and further, such defect artificial labeling information is caused to have noise, and training of a model in such noisy label may affect the performance of the model.
To address the above problems, the present embodiment provides an industrial defect detection algorithm that performs adaptive noise data screening based on loss curves. A two-stage training process is designed: the first stage obtains the training loss change curve of each individual sample, and the second stage adaptively filters noise samples according to the sample loss curves, thereby reducing the influence of the noise samples. After the second-stage training is finished, a more accurate and robust model is obtained, which improves the model prediction results in the actual deployment stage.
Optionally, in this embodiment, the confidence that a single image is defective is output directly, which yields the final detection result (defect or normal) for that image. As shown in fig. 5, the image with a defect confidence of 0.05 is an OK image, and the image with a defect confidence of 0.95 is a defect image.
It should be noted that industrial data generally contains label noise. Because its label is wrong, a noise sample gives the model ambiguous supervision information in the training stage, so the model's predictions for it vary considerably (poor consistency), and its loss value is generally larger and has a larger variance. The present embodiment exploits this characteristic of noise samples and combines it with the loss curve to perform adaptive noise data screening.
To illustrate further, the present embodiment is optionally divided into three phases, namely a primary training phase, a secondary training phase, and a testing phase, which are specifically as follows:
For the primary training stage, as shown in fig. 6, training data is input, features are extracted by a depth model f(·; θ), and the predicted target probability p is output:
p=f(x;θ)
wherein θ represents the weight parameters of the depth model. The loss is then calculated from the predicted probability p and the manual label y corresponding to the input data:
l=CE(p,y)
where CE(·) represents the cross entropy loss function. The model parameters are then iteratively updated by gradient descent using the loss l, and the loss value of each sample is recorded at each iteration.
After training is finished, the model parameters θ* and a loss variation curve for each sample are obtained, as shown in fig. 7, in which the dotted line represents a noise sample, the solid line represents a clean sample, the ordinate represents the loss value, and the abscissa represents the iteration round. The loss curve of a single sample can be expressed as l_i = [l_{i,0}, l_{i,1}, …, l_{i,N}], where N represents the total number of iteration rounds of training.
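The first stage can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the embodiment's implementation: a one-feature logistic model stands in for the depth model f(·; θ), the loss is binary cross entropy, and the names `train_and_record`, `xs`, and `ys` are hypothetical. The point is only that each sample's loss is appended to its own curve in every round.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def cross_entropy(p, y, eps=1e-12):
    # Binary cross entropy between predicted defect probability p and label y.
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def train_and_record(xs, ys, num_rounds=20, lr=0.5):
    """Train a toy model (assumed stand-in for f) and record per-sample losses."""
    w, b = 0.0, 0.0                      # theta = (w, b)
    curves = [[] for _ in xs]            # curves[i] holds the loss curve l_i
    for _ in range(num_rounds):
        grad_w = grad_b = 0.0
        for i, (x, y) in enumerate(zip(xs, ys)):
            p = sigmoid(w * x + b)                   # p = f(x; theta)
            curves[i].append(cross_entropy(p, y))    # l = CE(p, y), recorded
            grad_w += (p - y) * x
            grad_b += (p - y)
        w -= lr * grad_w / len(xs)       # full-batch gradient descent step
        b -= lr * grad_b / len(xs)
    return (w, b), curves

xs = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0, 1.8]
ys = [0, 0, 0, 1, 1, 1, 0]               # the last label is deliberately wrong
theta_star, loss_curves = train_and_record(xs, ys)
```

In this toy run the mislabeled last sample keeps a higher loss than the correctly labeled ones, which is the kind of signal the second stage relies on.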
After the loss curves of all samples are obtained, the loss statistics are calculated per category, and the loss mean and variance of each category at time t are obtained:
wherein D(·) represents the Dirichlet function, which acts here as an indicator: a sample participates in the calculation only when its label is class c:
N_c indicates the total number of samples whose label is class c, and N_t represents the length of a period of time before and after time t (e.g., calculating the intra-class variance for the previous and subsequent 0 iteration rounds).
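The formulas themselves are not reproduced in this text. One plausible form, consistent with the symbols defined above, is given below purely as an assumption: the window W_t (the set of N_t rounds around t), the label symbol y_i, and the exact normalization are illustrative choices and are not taken from the source.

```latex
\mu_{c,t} = \frac{1}{N_c N_t} \sum_{i} D(y_i = c) \sum_{\tau \in W_t} l_{i,\tau},
\qquad
\sigma_{c,t} = \frac{1}{N_c N_t} \sum_{i} D(y_i = c) \sum_{\tau \in W_t} \left( l_{i,\tau} - \mu_{c,t} \right)^2,
\qquad
D(y_i = c) =
\begin{cases}
1, & y_i = c \\
0, & \text{otherwise}
\end{cases}
```

Under this reading, only samples labeled c contribute, and both statistics depend on t through the window W_t.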
For the secondary training phase, the loss curves are optionally used to screen out noise samples, as shown in fig. 8. Specifically, in the second training stage, the present embodiment first loads the first-stage model weights θ* and continues training from them. For each batch of training samples {x_1, x_2, …, x_B} of size B at time t, the present embodiment uses the class-wise loss mean and variance obtained in the first stage to identify noise samples. For each sample x_i within the current batch, the present embodiment first calculates its current loss:
l'_{i,t} = CE(p, y)
The mean μ'_t and variance σ'_t of the current sample over the time length N_t are then calculated and compared with the class mean and variance corresponding to the current sample:
when the following condition is satisfied, the current sample is determined to be a noise sample:
At this time, sample x_i is rejected and does not participate in the backpropagation of the loss or the gradient update.
It should be noted that the mean μ'_t and the variance σ'_t are recalculated as the iteration round t changes and are not fixed values. The noise sample screening process therefore adapts to the iteration round t, which is more flexible than judging against a single fixed (scalar) mean and variance.
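The screening step can be sketched as follows. The decision condition is not reproduced in this text, so the rule used here is an assumption: a sample is flagged when both its recent loss mean and its recent loss variance exceed k times the statistics of its class. The function names, the factor `k`, and the windowing by `half_width` are all hypothetical.

```python
from statistics import fmean, pvariance

def class_stats(first_stage_curves, labels, t, half_width):
    """Per-class loss mean and variance in a window of rounds around t."""
    lo, hi = max(0, t - half_width), t + half_width + 1
    stats = {}
    for c in set(labels):
        values = [v for curve, y in zip(first_stage_curves, labels) if y == c
                  for v in curve[lo:hi]]
        stats[c] = (fmean(values), pvariance(values))
    return stats

def is_noise(recent_losses, label, stats, k=1.0):
    """Assumed rule: recent mean AND variance both exceed k x class statistics."""
    mu_c, var_c = stats[label]
    return fmean(recent_losses) > k * mu_c and pvariance(recent_losses) > k * var_c

def batch_loss(losses, noise_flags):
    # Rejected samples contribute nothing to the loss that is backpropagated.
    kept = [l for l, noisy in zip(losses, noise_flags) if not noisy]
    return fmean(kept) if kept else 0.0

# First-stage loss curves of three samples that all carry label 0.
first_stage = [[0.7, 0.5, 0.3, 0.2, 0.1],
               [0.7, 0.4, 0.3, 0.2, 0.1],
               [0.7, 0.5, 0.4, 0.2, 0.2]]
stats = class_stats(first_stage, [0, 0, 0], t=3, half_width=1)

# Second-stage losses of two batch samples over their most recent rounds.
recent = [[0.15, 0.12, 0.10],    # steadily low: treated as clean
          [1.50, 0.40, 1.30]]    # high and erratic: treated as noise
flags = [is_noise(r, 0, stats) for r in recent]
loss = batch_loss([r[-1] for r in recent], flags)
```

Because `class_stats` is re-evaluated for each t, the threshold moves with the iteration round rather than staying at one fixed value.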
After training is finished, the final model weight θ*′ is obtained.
For the test phase, the depth model optionally performs feature extraction on the input image directly and outputs the probability p, and the defect determination is then made against a threshold. For example, an image with p greater than 0.5 is a defect image, and an image with p less than 0.5 is an OK image, as shown in fig. 9.
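The decision rule of the test phase amounts to a single comparison. The sketch below uses the 0.5 threshold from the example above; the function name `classify` is hypothetical, and treating a score of exactly 0.5 as OK is an assumption, since the text does not say which side the boundary falls on.

```python
def classify(defect_confidence, threshold=0.5):
    # Scores above the threshold are reported as defects, all others as OK.
    return "defect" if defect_confidence > threshold else "OK"

# The two confidence values used as examples for fig. 5.
results = [classify(p) for p in (0.05, 0.95)]
```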
According to the embodiment provided by the application, the defect degree of an input image can be accurately detected for industrial defect detection tasks. An industrial defect detection algorithm with adaptive noise data screening based on loss curves is designed, with a two-stage training process: the first stage obtains the training loss change curve of each individual sample, and the second stage adaptively filters noise samples according to the sample loss curves, thereby reducing the influence of the noise samples. After the second-stage training is finished, a more accurate and robust model is obtained, which improves the model prediction results in the actual deployment stage and provides reliable technical support for industrial AI defect quality inspection.
It will be appreciated that in the specific embodiments of the present application, related data such as user information is involved, and when the above embodiments of the present application are applied to specific products or technologies, user permissions or consents need to be obtained, and the collection, use and processing of related data need to comply with related laws and regulations and standards of related countries and regions.
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present application is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present application.
According to another aspect of the embodiment of the present application, there is also provided an image recognition apparatus for implementing the above image recognition method. As shown in fig. 10, the apparatus includes:
a first obtaining unit 1002, configured to obtain a target image to be identified;
a first input unit 1004, configured to input a target image into an image recognition model, where the image recognition model is a neural network model for recognizing an image, the neural network model being obtained by performing primary training and secondary training with a plurality of target samples, the target samples being clean samples or noise samples, the noise samples being samples obtained by screening the plurality of target samples with a target loss curve in a training process of the secondary training, the target loss curve being a loss curve corresponding to each target sample in the plurality of target samples obtained in the training process of the primary training, and an update weight of the noise samples to the image recognition model being smaller than an update weight of the clean samples to the image recognition model;
A second obtaining unit 1006, configured to obtain an image recognition result output by the image recognition model, where the image recognition result is used to indicate an image type of the target image.
Specific embodiments may refer to examples shown in the above image recognition method, and in this example, details are not described herein.
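The statement that a noise sample has a smaller update weight than a clean sample can be illustrated with a weighted batch loss. This is only a sketch: the function name `weighted_loss` and the default weights are assumptions. A noise weight of 0 corresponds to rejecting the sample outright; any value below the clean weight also satisfies the stated condition.

```python
def weighted_loss(losses, is_noise, noise_weight=0.0, clean_weight=1.0):
    """Weighted mean of per-sample losses; noise samples count for less."""
    weights = [noise_weight if noisy else clean_weight for noisy in is_noise]
    total = sum(weights)
    if total == 0:
        return 0.0                       # every sample in the batch was rejected
    return sum(w * l for w, l in zip(weights, losses)) / total

losses = [0.2, 2.0, 0.4]
flags = [False, True, False]             # the second sample is flagged as noise
hard = weighted_loss(losses, flags)                      # noise sample dropped
soft = weighted_loss(losses, flags, noise_weight=0.5)    # noise sample down-weighted
```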
As an alternative, the apparatus further includes:
a third obtaining unit, configured to obtain a plurality of target samples before inputting the target image into the image recognition model, where the target samples carry any one of at least two labels;
the second input unit is used for inputting a plurality of target samples into the initial first recognition model for primary training before inputting the target image into the image recognition model, until a trained second recognition model is obtained, and obtaining the target loss curve corresponding to each target sample in the plurality of target samples in the training process of the primary training;
the third input unit is used for inputting a plurality of target samples into the second recognition model before inputting the target image into the image recognition model, and screening each target sample by utilizing a target loss curve to obtain sample reference information, wherein the sample reference information is used for indicating that each target sample belongs to a clean sample or a noise sample;
And the training unit is used for iteratively updating the first recognition model based on the sample reference information until the trained image recognition model is obtained before the target image is input into the image recognition model.
Specific embodiments may refer to examples shown in the above image recognition method, and in this example, details are not described herein.
As an alternative, the second obtaining unit 1006 includes:
the image recognition module is used for obtaining an image recognition result output by the image recognition model, wherein the image recognition result is used for indicating that the target image is the image type matched with the target label, and the target label is one of at least two labels.
Specific embodiments may refer to examples shown in the above image recognition method, and in this example, details are not described herein.
As an alternative, the third input unit includes:
the first input module is used for inputting a plurality of target samples into the second recognition model to perform multi-round iterative training;
the first screening module is used for screening each target sample by utilizing the target loss curve in the process of multiple rounds of iterative training to obtain sample reference information of each round of the target sample in the process of multiple rounds of iterative training, wherein the sample reference information of each round is used for indicating that each target sample belongs to a clean sample or a noise sample in each round of multiple rounds of iterative training respectively.
Specific embodiments may refer to examples shown in the above image recognition method, and in this example, details are not described herein.
As an alternative, the training unit comprises:
the execution module is used for executing the following steps until an image recognition model is obtained:
determining a first recognition model of a current round and current round sample reference information corresponding to each target sample, wherein the current round sample reference information is used for indicating that each target sample belongs to a clean sample or a noise sample in the current round of multi-round iterative training;
determining a first target sample which belongs to a noise sample and is indicated by the sample reference information of the current round and a second target sample which belongs to a clean sample and is indicated by the sample reference information of the current round from a plurality of target samples;
in the case that the current-round training result corresponding to the first recognition model of the current round does not meet the convergence condition, back propagation is performed by using the second target sample to update the model weight of the first recognition model, and the updated first recognition model is determined as the first recognition model of the current round;
and in the case that the current-round training result meets the convergence condition, the first recognition model of the current round is determined as the trained image recognition model.
Specific embodiments may refer to examples shown in the above image recognition method, and in this example, details are not described herein.
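The per-round procedure above can be sketched as a generic loop. Everything named here is hypothetical: `screen`, `update`, and `converged` are caller-supplied callbacks standing in for the screening, back propagation, and convergence check, and the toy usage replaces a real model with a single number so that the control flow can be followed end to end.

```python
def train_rounds(weights, samples, screen, update, converged, max_rounds=100):
    """Repeat: screen samples, stop if converged, else update on clean samples."""
    for round_idx in range(max_rounds):
        flags = screen(weights, samples, round_idx)      # True marks a noise sample
        clean = [s for s, noisy in zip(samples, flags) if not noisy]
        if converged(weights, clean):
            break                         # the current model is the trained model
        weights = update(weights, clean)  # only clean samples drive the update
    return weights

# Toy usage: the "model" is one number that should settle on the clean mean.
samples = [1.0, 1.2, 0.8, 9.0]            # 9.0 plays the role of a noise sample

def screen(w, xs, round_idx):
    return [x > 5.0 for x in xs]

def update(w, clean):
    return w + 0.5 * (sum(clean) / len(clean) - w)

def converged(w, clean):
    return abs(w - sum(clean) / len(clean)) < 1e-3

final_w = train_rounds(0.0, samples, screen, update, converged)
```

Passing the round index to `screen` leaves room for a rule whose thresholds change from round to round.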
As an alternative, the apparatus further includes: the first training module is used for training the first recognition model of the current round by using the second target sample, so as to obtain the current-round training result, before back propagation is performed by using the second target sample to update the model weight of the first recognition model and the updated first recognition model is determined as the first recognition model of the current round; or,
the apparatus further comprises: the second training module is used for training the first recognition model of the current round by using the plurality of target samples, so as to obtain the current-round training result, before the first target sample which is indicated by the current-round sample reference information and belongs to the noise sample and the second target sample which is indicated by the current-round sample reference information and belongs to the clean sample are determined from the plurality of target samples;
the apparatus further comprises: the first determining unit is used for, after the first recognition model of the current round is trained by using the plurality of target samples to obtain the current-round training result, determining the first target sample and the second target sample from the plurality of target samples in the case that the current-round training result corresponding to the first recognition model of the current round does not meet the convergence condition.
Specific embodiments may refer to examples shown in the above image recognition method, and in this example, details are not described herein.
As an alternative, the third input unit includes:
the second input module is used for inputting a plurality of target samples into the second recognition model to perform multi-round iterative training;
the statistics module is used for counting loss change curves respectively corresponding to each target sample in the multi-round iterative training process after the multi-round iterative training is finished;
and the combination module is used for combining the loss change curve and the target loss curve to screen each target sample so as to obtain overall sample reference information, wherein the overall sample reference information is used for indicating that each target sample belongs to a clean sample or a noise sample in the overall turn of the multi-turn iterative training.
Specific embodiments may refer to examples shown in the above image recognition method, and in this example, details are not described herein.
As an alternative, the training unit comprises:
a second determining unit configured to determine, from among the plurality of target samples, a third target sample belonging to the noise sample indicated by the whole-sample reference information, and a fourth target sample belonging to the clean sample indicated by the whole-sample reference information;
And the third training module is used for carrying out iterative training on the first recognition model by utilizing the fourth target sample, and updating the model weight of the first recognition model in the iterative training process until a trained image recognition model is obtained.
Specific embodiments may refer to examples shown in the above image recognition method, and in this example, details are not described herein.
As an alternative, the apparatus further includes: the calculating unit is used for inputting a plurality of target samples into the second recognition model, screening each target sample by utilizing a target loss curve, and calculating a primary reference value corresponding to each sample category by utilizing the target loss curve before sample reference information is obtained, wherein the sample category is used for representing the category corresponding to the carried label;
a third input unit comprising:
the third input module is used for inputting a plurality of target samples into the second recognition model to obtain a secondary loss curve corresponding to each target sample;
the first calculation module is used for calculating a secondary reference value corresponding to each target sample by using the secondary loss curve;
the comparison module is used for comparing the primary reference value and the secondary reference value belonging to the same sample category to obtain a target comparison result;
And the second screening module is used for screening each target sample according to the target comparison result to obtain sample reference information.
Specific embodiments may refer to examples shown in the above image recognition method, and in this example, details are not described herein.
As an alternative, the computing unit includes: the second calculation module is used for calculating a first loss mean value and a first loss variance corresponding to each sample category by using the target loss curve, wherein the first loss mean value is the loss mean value corresponding to all target samples belonging to the same sample category, and the first loss variance is the loss variance corresponding to all target samples belonging to the same sample category;
a first computing module comprising: the calculation sub-module is used for calculating a second loss mean value and a second loss variance corresponding to each target sample by using the secondary loss curve;
a comparison module, comprising: the comparison sub-module is used for comparing the first loss average value and the second loss average value belonging to the same sample category to obtain a first comparison result; and comparing the first loss variance and the second loss variance belonging to the same sample class to obtain a second comparison result;
a second screening module comprising: and the screening submodule is used for screening each target sample according to the first comparison result and the second comparison result to obtain sample reference information.
Specific embodiments may refer to examples shown in the above image recognition method, and in this example, details are not described herein.
According to a further aspect of the embodiment of the present application, there is also provided an electronic device for implementing the above-mentioned image recognition method, which may be, but is not limited to, the user device 102 or the server 112 shown in fig. 1, the embodiment taking the electronic device as the user device 102 as an example, and further as shown in fig. 11, the electronic device includes a memory 1102 and a processor 1104, the memory 1102 having stored therein a computer program, the processor 1104 being configured to execute the steps of any of the above-mentioned method embodiments by means of the computer program.
Alternatively, in this embodiment, the electronic device may be located in at least one network device of a plurality of network devices of the computer network.
Alternatively, in the present embodiment, the above-described processor may be configured to execute the following steps by a computer program:
s1, acquiring a target image to be identified;
s2, inputting a target image into an image recognition model, wherein the image recognition model is a neural network model which is obtained by performing primary training and secondary training by using a plurality of target samples, the target samples are clean samples or noise samples, the noise samples are samples obtained by screening the plurality of target samples by using target loss curves in the training process of the secondary training, the target loss curves are loss curves corresponding to each target sample in the plurality of target samples obtained in the training process of the primary training, and the updating weight of the noise samples to the image recognition model is smaller than that of the clean samples to the image recognition model;
S3, obtaining an image recognition result output by the image recognition model, wherein the image recognition result is used for indicating the image type of the target image.
Alternatively, it will be appreciated by those skilled in the art that the structure shown in fig. 11 is merely illustrative, and fig. 11 is not intended to limit the structure of the electronic device. For example, the electronic device may also include more or fewer components (e.g., network interfaces, etc.) than shown in FIG. 11, or have a different configuration than shown in FIG. 11.
The memory 1102 may be used to store software programs and modules, such as program instructions/modules corresponding to the image recognition method and apparatus in the embodiment of the present application, and the processor 1104 executes the software programs and modules stored in the memory 1102 to perform various functional applications and data processing, that is, implement the image recognition method described above. Memory 1102 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, memory 1102 may further include memory remotely located relative to processor 1104, which may be connected to the user device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 1102 may be used for storing information such as a target image, an image recognition model, and an image recognition result, but is not limited to the above. As an example, as shown in fig. 11, the memory 1102 may include, but is not limited to, a first acquiring unit 1002, a first input unit 1004, and a second acquiring unit 1006 in the image recognition device. In addition, other module units in the image recognition apparatus may be included, but are not limited to, and are not described in detail in this example.
Optionally, the transmission device 1106 is used to receive or transmit data via a network. Specific examples of the network described above may include wired networks and wireless networks. In one example, the transmission device 1106 includes a network adapter (Network Interface Controller, NIC) that may be connected to other network devices and routers via a network cable to communicate with the internet or a local area network. In one example, the transmission device 1106 is a Radio Frequency (RF) module for communicating wirelessly with the internet.
In addition, the electronic device further includes: a display 1108 for displaying the target image, the image recognition model, and the image recognition result; and a connection bus 1110 for connecting the respective module parts in the above-described electronic apparatus.
In other embodiments, the user device or the server may be a node in a distributed system, where the distributed system may be a blockchain system, and the blockchain system may be a distributed system formed by connecting the plurality of nodes through a network communication. Among them, the nodes may form a Peer-To-Peer (P2P) network, and any type of computing device, such as a server, a user device, and other electronic devices, may become a node in the blockchain system by joining the Peer-To-Peer network.
According to one aspect of the present application, a computer program product is provided, comprising a computer program that contains program code for performing the method shown in the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via a communication portion, and/or installed from a removable medium. When executed by a central processing unit, the computer program performs the various functions provided by embodiments of the present application.
The foregoing embodiment numbers of the present application are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
It should be noted that the computer system of the electronic device is only an example, and should not impose any limitation on the functions and the application scope of the embodiments of the present application.
The computer system includes a central processing unit (Central Processing Unit, CPU) which can execute various appropriate actions and processes according to a program stored in a Read-Only Memory (ROM) or a program loaded from a storage section into a random access Memory (Random Access Memory, RAM). In the random access memory, various programs and data required for the system operation are also stored. The CPU, the ROM and the RAM are connected to each other by bus. An Input/Output interface (i.e., I/O interface) is also connected to the bus.
The following components are connected to the input/output interface: an input section including a keyboard, a mouse, etc.; an output section including a Cathode Ray Tube (CRT), a liquid crystal display (Liquid Crystal Display, LCD), and the like, and a speaker, and the like; a storage section including a hard disk or the like; and a communication section including a network interface card such as a local area network card, a modem, and the like. The communication section performs communication processing via a network such as the internet. A drive is also connected to the input/output interface as needed. Removable media such as magnetic disks, optical disks, magneto-optical disks, semiconductor memories, and the like are mounted on the drive as needed, so that a computer program read therefrom is installed into the storage section as needed.
In particular, the processes described in the various method flowcharts may be implemented as computer software programs according to embodiments of the application. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such embodiments, the computer program may be downloaded and installed from a network via a communication portion, and/or installed from a removable medium. The computer program, when executed by a central processing unit, performs the various functions defined in the system of the application.
The integrated units in the above embodiments may be stored in the above-described computer-readable storage medium if implemented in the form of software functional units and sold or used as separate products. Based on such understanding, the technical solution of the present application, in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing one or more computer devices (which may be personal computers, servers or network devices, etc.) to perform all or part of the steps of the method described in the embodiments of the present application.
In the foregoing embodiments of the present application, each embodiment is described with its own emphasis, and for a portion that is not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In several embodiments provided by the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described embodiments of the apparatus are merely exemplary. For example, the division of the units is merely a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection shown or discussed between the parts may be implemented through some interfaces, units or modules, and may be in electrical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The foregoing is merely a preferred embodiment of the present application. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present application, and such modifications and adaptations are intended to fall within the scope of protection of the present application.

Claims (14)

1. An image recognition method, comprising:
acquiring a target image to be identified;
inputting the target image into an image recognition model, wherein the image recognition model is a neural network model which is obtained by performing primary training and secondary training by utilizing a plurality of target samples, the target samples are clean samples or noise samples, the noise samples are samples obtained by screening the plurality of target samples by utilizing a target loss curve in the training process of the secondary training, the target loss curve is a loss curve corresponding to each target sample in the plurality of target samples obtained in the training process of the primary training, and the update weight of the noise samples to the image recognition model is smaller than that of the clean samples to the image recognition model;
And acquiring an image recognition result output by the image recognition model, wherein the image recognition result is used for indicating the image type of the target image.
2. The method of claim 1, wherein prior to said inputting the target image into an image recognition model, the method further comprises:
acquiring the plurality of target samples, wherein the target samples carry any one of at least two labels;
inputting the plurality of target samples into an initial first recognition model for the primary training until a trained second recognition model is obtained, and acquiring the target loss curve corresponding to each target sample in the plurality of target samples in the training process of the primary training;
inputting the plurality of target samples into the second recognition model, and screening each target sample by utilizing the target loss curve to obtain sample reference information, wherein the sample reference information is used for indicating that each target sample belongs to the clean sample or the noise sample;
and iteratively updating the first recognition model based on the sample reference information until the trained image recognition model is obtained.
3. The method according to claim 2, wherein the obtaining the image recognition result output by the image recognition model includes:
and acquiring the image recognition result output by the image recognition model, wherein the image recognition result is used for indicating that the target image is the image type matched with the target label, and the target label is one of the at least two labels.
4. The method of claim 2, wherein inputting the plurality of target samples into the second recognition model and screening the respective target samples using the target loss curve to obtain sample reference information comprises:
inputting the target samples into the second recognition model for multiple rounds of iterative training;
and screening each target sample by utilizing the target loss curve in the multi-round iterative training process to obtain each round of sample reference information of each target sample in the multi-round iterative training process, wherein the each round of sample reference information is used for indicating that each target sample belongs to the clean sample or the noise sample in each round of the multi-round iterative training.
5. The method of claim 4, wherein iteratively updating the first recognition model based on the sample reference information until a trained image recognition model is obtained, comprises:
the following steps are executed until the image recognition model is obtained:
determining a first recognition model of a current round and current round sample reference information corresponding to each target sample, wherein the current round sample reference information is used for indicating that each target sample belongs to the clean sample or the noise sample in the current round of the multi-round iterative training;
determining a first target sample which belongs to the noise sample and is indicated by the current round sample reference information and a second target sample which belongs to the clean sample and is indicated by the current round sample reference information from the plurality of target samples;
in the case that the current round training result corresponding to the first recognition model of the current round does not meet the convergence condition, performing back propagation by using the second target sample so as to update the model weight of the first recognition model, and determining the updated first recognition model as the first recognition model of the current round;
and in the case that the current round training result meets the convergence condition, determining the first recognition model of the current round as the trained image recognition model.
6. The method of claim 5, wherein:
before the back-propagating with the second target sample to update the model weight of the first recognition model and determining the updated first recognition model as the first recognition model of the current round, the method further comprises: training the first recognition model of the current round by using the second target sample to obtain the current round training result; or, before the determining, from the plurality of target samples, of a first target sample belonging to the noise sample indicated by the current round sample reference information and a second target sample belonging to the clean sample indicated by the current round sample reference information, the method further comprises: training the first recognition model of the current round by using the plurality of target samples to obtain the current round training result;
after the training of the first recognition model of the current round by using the plurality of target samples to obtain the current round training result, the method further comprises: determining the first target sample and the second target sample from the plurality of target samples in the case that the current round training result corresponding to the first recognition model of the current round does not meet the convergence condition.
7. The method of claim 2, wherein inputting the plurality of target samples into the second recognition model and screening the respective target samples using the target loss curve to obtain sample reference information comprises:
inputting the target samples into the second recognition model for multiple rounds of iterative training;
after the multi-round iterative training is finished, calculating loss change curves respectively corresponding to the target samples in the multi-round iterative training process;
and screening each target sample by combining the loss change curve and the target loss curve to obtain overall sample reference information, wherein the overall sample reference information is used for indicating that each target sample belongs to the clean sample or the noise sample in the overall round of the multi-round iterative training.
8. The method of claim 7, wherein iteratively updating the first recognition model based on the sample reference information until the trained image recognition model is obtained, comprises:
determining a third target sample which belongs to the noise sample and is indicated by the whole sample reference information and a fourth target sample which belongs to the clean sample and is indicated by the whole sample reference information from the plurality of target samples;
and carrying out iterative training on the first recognition model by using the fourth target sample, and updating the model weight of the first recognition model in the iterative training process until the trained image recognition model is obtained.
9. The method according to any one of claims 2 to 8, wherein,
before the inputting of the plurality of target samples into the second recognition model and the screening of each target sample by using the target loss curve to obtain the sample reference information, the method further comprises: calculating a primary reference value corresponding to each sample category by using the target loss curve, wherein the sample category represents the category corresponding to the label carried by the target sample;
the inputting of the plurality of target samples into the second recognition model and the screening of each target sample by using the target loss curve to obtain the sample reference information comprises:
inputting the target samples into the second recognition model to obtain a secondary loss curve corresponding to each target sample;
calculating a secondary reference value corresponding to each target sample by using the secondary loss curve;
comparing the primary reference value and the secondary reference value belonging to the same sample category to obtain a target comparison result;
and screening each target sample according to the target comparison result to obtain the sample reference information.
10. The method of claim 9, wherein:
the calculating the primary reference value corresponding to each sample category by using the target loss curve comprises the following steps: calculating a first loss mean value and a first loss variance corresponding to each sample category by using the target loss curve, wherein the first loss mean value is the loss mean value corresponding to all target samples belonging to the same sample category, and the first loss variance is the loss variance corresponding to all target samples belonging to the same sample category;
the calculating the secondary reference value corresponding to each target sample by using the secondary loss curve includes: calculating a second loss mean value and a second loss variance corresponding to each target sample by using the secondary loss curve;
the comparing the primary reference value and the secondary reference value belonging to the same sample category to obtain a target comparison result comprises: comparing the first loss average value and the second loss average value belonging to the same sample category to obtain a first comparison result; and comparing the first loss variance and the second loss variance belonging to the same sample category to obtain a second comparison result;
the screening of each target sample according to the target comparison result to obtain the sample reference information comprises: screening each target sample according to the first comparison result and the second comparison result to obtain the sample reference information.
11. An image recognition apparatus, comprising:
the first acquisition unit is used for acquiring a target image to be identified;
the first input unit is used for inputting the target image into an image recognition model, wherein the image recognition model is a neural network model which is obtained by performing primary training and secondary training on a plurality of target samples and is used for recognizing the image, the target samples are clean samples or noise samples, the noise samples are samples obtained by screening the plurality of target samples by using target loss curves in the training process of the secondary training, the target loss curves are loss curves corresponding to each target sample in the plurality of target samples obtained in the training process of the primary training, and the update weight of the noise samples on the image recognition model is smaller than that of the clean samples on the image recognition model;
and the second acquisition unit is used for acquiring an image recognition result output by the image recognition model, wherein the image recognition result is used for indicating the image type of the target image.
12. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a computer program, wherein the computer program, when run by an electronic device, performs the method of any one of claims 1 to 10.
13. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the steps of the method as claimed in any one of claims 1 to 10.
14. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method according to any of the claims 1 to 10 by means of the computer program.
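
As a non-authoritative illustration of the screening recited in claims 9 and 10, the following Python sketch computes per-category loss statistics from the primary-training loss curves, compares them with per-sample statistics from the secondary loss curves, and assigns a smaller update weight to samples flagged as noise. The function names, the tolerance parameters `mean_tol` and `var_tol`, the pooling of all losses of a category into one mean and one variance, and the rule that flags a sample when either statistic exceeds the category reference are assumptions made for illustration only; the claims do not fix a particular comparison rule.

```python
from statistics import fmean, pvariance


def class_reference_values(first_curves, labels):
    """Primary reference values: (loss mean, loss variance) per sample category.

    first_curves[i] is the list of per-epoch losses recorded for sample i
    during the primary training; labels[i] is the label that sample carries.
    Assumption: losses of all samples in a category are pooled together.
    """
    pooled = {}
    for curve, label in zip(first_curves, labels):
        pooled.setdefault(label, []).extend(curve)
    return {c: (fmean(v), pvariance(v)) for c, v in pooled.items()}


def screen_samples(second_curves, labels, class_refs, mean_tol=1.0, var_tol=1.0):
    """Return one flag per sample: True for a noise sample, False for clean.

    Hypothetical rule: a sample is treated as noise when its secondary loss
    mean OR its secondary loss variance exceeds the tolerance times the
    reference value of its own category.
    """
    flags = []
    for curve, label in zip(second_curves, labels):
        ref_mean, ref_var = class_refs[label]
        mean_high = fmean(curve) > mean_tol * ref_mean      # first comparison
        var_high = pvariance(curve) > var_tol * ref_var     # second comparison
        flags.append(mean_high or var_high)
    return flags


def update_weights(noise_flags, noise_weight=0.0):
    """Per-sample update weights; noise samples get the smaller weight."""
    return [noise_weight if is_noise else 1.0 for is_noise in noise_flags]
```

With `noise_weight=0.0` the noise samples are excluded from the weight update altogether, which corresponds to back-propagating with the clean samples only; any value between 0 and 1 would instead reduce, rather than remove, their influence.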

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310239686.9A CN116958777A (en) 2023-03-03 2023-03-03 Image recognition method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310239686.9A CN116958777A (en) 2023-03-03 2023-03-03 Image recognition method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN116958777A true CN116958777A (en) 2023-10-27

Family

ID=88451737

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310239686.9A Pending CN116958777A (en) 2023-03-03 2023-03-03 Image recognition method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN116958777A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination