CN113505859B - Model training method and device, and image recognition method and device

Info

Publication number
CN113505859B
Authority
CN
China
Prior art keywords
category
image
images
data corresponding
determining
Prior art date
Legal status
Active
Application number
CN202111035290.XA
Other languages
Chinese (zh)
Other versions
CN113505859A
Inventor
蔡鑫
崔亚轩
邱慎杰
Current Assignee
Zhejiang Taimei Medical Technology Co Ltd
Original Assignee
Zhejiang Taimei Medical Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Taimei Medical Technology Co Ltd
Priority to CN202111035290.XA
Publication of CN113505859A
Application granted
Publication of CN113505859B
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a model training method and device and an image recognition method and device, relating to the technical field of neural networks. The model training method comprises the following steps: determining, by using a first recognition model, a first category identification result corresponding to each of M images based on first category distribution weight data corresponding to each of the M images; determining category coincidence rate data corresponding to each of the M images based on the category label data corresponding to the M images and the first category identification results; determining second category distribution weight data corresponding to each of the M images based on the category coincidence rate data corresponding to the M images; and training the first recognition model based on the second category distribution weight data corresponding to each of the M images to obtain the image recognition model. With the method and device, the learning difficulty and weight of wrongly predicted image samples can be redefined, so that the model pays more attention to those samples, and the recognition accuracy of the trained image recognition model is greatly improved.

Description

Model training method and device, and image recognition method and device
Technical Field
The application relates to the technical field of neural networks, in particular to a model training method and device and an image recognition method and device.
Background
Typically, a three-dimensional medical image sequence covers a plurality of body parts. To facilitate the work of the reading physician, it is necessary to determine the part to which each image in the three-dimensional medical image sequence belongs (i.e., to determine the category corresponding to each image in the sequence), so that the images corresponding to a given part can be extracted by part.
However, existing part identification and part extraction methods have low accuracy.
Disclosure of Invention
The present application is proposed to solve the above-mentioned technical problems. The embodiment of the application provides a model training method and device and an image recognition method and device.
In a first aspect, an embodiment of the present application provides a model training method, comprising: determining, by using a first recognition model, a first category identification result corresponding to each of M images based on first category distribution weight data corresponding to each of the M images; determining category coincidence rate data corresponding to each of the M images based on the category label data corresponding to the M images and the first category identification results; determining second category distribution weight data corresponding to each of the M images based on the category coincidence rate data corresponding to the M images; and training the first recognition model based on the second category distribution weight data corresponding to each of the M images to obtain an image recognition model, where the image recognition model is used for determining the category identification results corresponding to the images in an image sequence to be recognized.
With reference to the first aspect, in some implementations of the first aspect, determining the category coincidence rate data corresponding to each of the M images based on the category label data and the first category identification result corresponding to each of the M images includes: for each of the M images, determining boundary coincidence rate data corresponding to the image based on the category label data and the first category identification result corresponding to the image; and determining the category coincidence rate data corresponding to the image based on the boundary coincidence rate data corresponding to the image.
With reference to the first aspect, in certain implementations of the first aspect, determining the boundary coincidence rate data corresponding to the image based on the category label data and the first category identification result corresponding to the image includes: determining sequence information of adjacent prediction categories corresponding to the image based on the first category identification result corresponding to the image; determining sequence information of adjacent real categories corresponding to the image based on the category label data corresponding to the image; and determining the boundary coincidence rate data corresponding to the image based on the sequence information of the adjacent prediction categories and the sequence information of the adjacent real categories corresponding to the image.
With reference to the first aspect, in certain implementations of the first aspect, the sequence information of the adjacent prediction categories includes middle sequence information and end sequence information of the prediction category to which the image belongs, and middle sequence information and start sequence information of the next prediction category adjacent to the prediction category to which the image belongs; the sequence information of the adjacent real categories includes middle sequence information and end sequence information of the real category to which the image belongs, and middle sequence information and start sequence information of the next real category adjacent to the real category to which the image belongs. Determining the boundary coincidence rate data corresponding to the image based on the sequence information of the adjacent prediction categories and the sequence information of the adjacent real categories corresponding to the image includes: determining first offset data based on the start sequence information of the next prediction category adjacent to the prediction category to which the image belongs, the end sequence information of the prediction category to which the image belongs, the start sequence information of the next real category adjacent to the real category to which the image belongs, and the end sequence information of the real category to which the image belongs; determining second offset data based on the middle sequence information of the next prediction category adjacent to the prediction category to which the image belongs, the middle sequence information of the next real category adjacent to the real category to which the image belongs, and the middle sequence information of the real category to which the image belongs; and determining the boundary coincidence rate data corresponding to the image based on the first offset data and the second offset data.
With reference to the first aspect, in some implementations of the first aspect, determining the category coincidence rate data corresponding to the image based on the boundary coincidence rate data corresponding to the image includes: determining coincidence rate classification interval information corresponding to the image sequence sample to be identified based on layer thickness information and/or layer spacing information corresponding to the image sequence sample to be identified; and determining the category coincidence rate data corresponding to the image based on the coincidence rate classification interval information and the boundary coincidence rate data corresponding to the image.
With reference to the first aspect, in some implementations of the first aspect, determining, based on the class coincidence rate data corresponding to each of the M images, second class assignment weight data corresponding to each of the M images includes: determining Gaussian distribution probability weight data corresponding to the M images respectively; and determining second category distribution weight data corresponding to the M images based on the category coincidence rate data and the Gaussian distribution probability weight data corresponding to the M images.
With reference to the first aspect, in some implementations of the first aspect, determining the Gaussian distribution probability weight data corresponding to each of the M images includes: for each image in the M images, determining a Gaussian radius based on the number of images contained in the true category to which the image belongs, and obtaining the Gaussian distribution probability weight data by taking the central sequence information of the true category to which the image belongs as the mean of the distribution.
With reference to the first aspect, in some implementations of the first aspect, determining the second category distribution weight data corresponding to each of the M images based on the category coincidence rate data and the Gaussian distribution probability weight data corresponding to each of the M images includes: determining first incremental weight data corresponding to each of the M images based on the category coincidence rate data and the Gaussian distribution probability weight data corresponding to each of the M images; and determining the second category distribution weight data corresponding to each of the M images based on the first incremental weight data and the first category distribution weight data corresponding to each of the M images.
With reference to the first aspect, in some implementations of the first aspect, training the first recognition model based on second class assignment weight data corresponding to each of the M images to obtain an image recognition model includes: training the first recognition model based on second class distribution weight data corresponding to the M images respectively to obtain a second recognition model; determining a second category identification result corresponding to each of the M images by using a second identification model; obtaining current category distribution weight data corresponding to the M images based on the category label data corresponding to the M images and the second category identification result; determining third category distribution weight data corresponding to the M images based on the first category distribution weight data, the second category distribution weight data and the current category distribution weight data corresponding to the M images; and training a second recognition model based on the third class distribution weight data corresponding to the M images to obtain an image recognition model.
In a second aspect, an embodiment of the present application provides an image recognition method, including: determining an image recognition model, where the image recognition model is obtained by training based on the model training method mentioned in the first aspect; and determining, by using the image recognition model, the category identification results corresponding to the images in the image sequence to be recognized.
In a third aspect, an embodiment of the present application provides a model training apparatus, including: the first determining module is used for determining a first class identification result corresponding to each of the M images based on the first class distribution weight data corresponding to each of the M images by using the first identification model; the second determining module is used for determining category consistency rate data corresponding to the M images based on the category label data corresponding to the M images and the first category identification result; the third determining module is used for determining second category distribution weight data corresponding to the M images based on the category consistency rate data corresponding to the M images; and the training module is used for training the first recognition model based on the second class distribution weight data corresponding to the M images to obtain an image recognition model, wherein the image recognition model is used for determining the class recognition results corresponding to the images in the image sequence to be recognized.
In a fourth aspect, an embodiment of the present application provides an image recognition apparatus, including: a model determining module, configured to determine an image recognition model, where the image recognition model is obtained by training based on the model training method mentioned in the first aspect; and the identification result determining module is used for determining the category identification results corresponding to the images in the image sequence to be identified by using the image identification model.
In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium, where the storage medium stores a computer program for executing the model training method mentioned in the first aspect and/or the image recognition method mentioned in the second aspect.
In a sixth aspect, an embodiment of the present application provides an electronic device, including: a processor; a memory for storing processor-executable instructions; the processor is configured to perform the model training method of the first aspect and/or the image recognition method of the second aspect.
The model training method provided by the embodiments of the present application takes into account how difficult each image sample is for the model to learn. In addition, in the embodiments of the present application, the category coincidence rate data corresponding to each of the M images is determined using the category identification results output by the model in the current training round, and the category distribution weight data for the next training round is then determined based on that category coincidence rate data. The determined weight data therefore fully accounts for the prediction offset, specifically from the perspective of the sequence information of the image sequence (i.e., the positional relationship between images). With this arrangement, the learning difficulty and weight of wrongly predicted image samples can be redefined, so that the model pays more attention to those samples, and the recognition accuracy of the trained image recognition model is greatly improved.
Drawings
Fig. 1 is a schematic flow chart of a model training method according to an embodiment of the present application.
Fig. 2 is a schematic flow chart illustrating a process of determining category coincidence rate data corresponding to M images based on category label data corresponding to the M images and a first category identification result according to an embodiment of the present application.
Fig. 3 is a schematic flow chart illustrating a process of determining boundary coincidence rate data corresponding to an image based on class label data corresponding to the image and a first class identification result according to an embodiment of the present application.
Fig. 4 is a schematic flow chart illustrating a process of determining boundary coincidence rate data corresponding to an image based on sequence information of adjacent prediction categories and sequence information of adjacent real categories corresponding to the image according to an embodiment of the present application.
Fig. 5 is a schematic flowchart illustrating a process of determining category matching rate data corresponding to an image based on boundary matching rate data corresponding to the image according to an embodiment of the present application.
Fig. 6 is a schematic flow chart illustrating a process of determining second category distribution weight data corresponding to M images based on category coincidence rate data corresponding to the M images according to an embodiment of the present application.
Fig. 7 is a schematic flow chart illustrating a process of determining second class distribution weight data corresponding to M images based on class coincidence rate data and gaussian distribution probability weight data corresponding to the M images according to an embodiment of the present application.
Fig. 8 is a schematic flow chart illustrating that the first recognition model is trained based on second class distribution weight data corresponding to each of M images to obtain an image recognition model according to an embodiment of the present application.
Fig. 9 is a schematic flowchart of an image recognition method according to an embodiment of the present application.
Fig. 10 is a schematic structural diagram of a model training apparatus according to an embodiment of the present application.
Fig. 11 is a schematic structural diagram of an image recognition apparatus according to an embodiment of the present application.
Fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
As is well known, a three-dimensional medical image sequence typically covers a plurality of body parts. For example, a medical image sequence of the whole human body includes 13 parts: brain, brain-nasopharynx, nasopharynx, nasopharynx-neck, neck, cervicothoracic region, chest, thoracoabdominal region, abdomen, abdominopelvic region, pelvic cavity, pelvis-lower limb, and lower limb. For the convenience of subsequent processing, it is necessary to determine the part to which each image in the three-dimensional medical image sequence belongs (i.e., to determine the category corresponding to each image), so that the images corresponding to each part can be extracted by part. For example, the reading physician can then conveniently perform subsequent operations, such as Quality Control (QC), based on the per-part images.
Existing part recognition models can extract the images corresponding to a specific part from a three-dimensional medical image sequence, but their accuracy is low. One reason is class imbalance; the other is the problem of difficult and easy samples.
Regarding class imbalance: in general, lesions tend to appear in a concentrated set of body parts, so images of those parts, such as the head, chest, and abdomen, are collected far more often, while far fewer images are collected of parts where lesions rarely appear (such as the feet and lower legs). Because the amount of image data differs so much between parts, the part recognition model is biased during training toward learning the features of the parts with more data, while the parts with less data contribute too little to the model update for their features to be learned, which greatly harms the final performance of the model.
The problem of difficult and easy samples can be illustrated with the whole-body medical image sequence containing 13 parts mentioned above. Specifically, the 13 parts comprise 7 specific parts and 6 transition parts. The 7 specific parts are the brain, nasopharynx, neck, chest, abdomen, pelvic cavity, and lower limbs; the 6 transition parts are brain-nasopharynx, nasopharynx-neck, cervicothoracic, thoracoabdominal, abdominopelvic, and pelvis-lower limb. That is, there is a transition part between every two adjacent specific parts. As the middle region between two adjacent specific parts, a transition part has image structures carrying features of both the preceding and the following specific part, which makes it harder to learn. In addition, a transition part is only a short transition between two specific parts, so its data volume is relatively small and it also suffers from class imbalance. Thus, in general, a specific part may be regarded as an easy sample and a transition part as a difficult sample.
In order to solve the technical problem of poor recognition accuracy of a part recognition model, embodiments of the present application provide a model training method and apparatus, and an image recognition method and apparatus, so as to achieve the purpose of improving the recognition accuracy of image recognition. It should be noted that the method provided by the embodiment of the present application is not limited to the three-dimensional medical image sequence in the medical scene, and can also be applied to video data in a natural scene. The model training method and the image recognition method according to the embodiment of the present application are described in detail below with reference to fig. 1 to 9.
Fig. 1 is a schematic flow chart of a model training method according to an embodiment of the present application. As shown in fig. 1, the model training method provided in the embodiment of the present application includes the following steps.
Step S100, determining respective first class identification results of the M images based on respective first class distribution weight data of the M images by using the first identification model. The M images belong to an image sequence sample to be identified, the M images correspond to N categories, and M and N are positive integers greater than 1.
For example, the meaning of the M images corresponding to the N classes is that the image sequence sample to be recognized relates to N classes (for example, N part classes), and then each of the M images belongs to one of the N classes.
Illustratively, the category distribution weight data corresponding to an image is weight data that is assigned to the image according to how difficult the image actually is for the model to learn, and that therefore represents the learning difficulty of the image. Correspondingly, the first category distribution weight data mentioned in step S100 refers to the initial category distribution weight data corresponding to the image. In some embodiments, before the model begins training, a uniform initial category distribution weight, such as a value of 1, is assigned to each of the M images, because the actual learning difficulty of each of the M images for the model is not yet known.
Illustratively, the first recognition model is an initially constructed recognition model. In some embodiments, the first recognition model is a multi-class convolutional neural network model. A multi-class convolutional neural network learns by passing information from layer to layer, storing intermediate representations and continuing to learn in subsequent layers, and it does not require separate feature extraction before the model is used, which makes it well suited to this image-processing scenario.
It is understood that even though the first recognition model is an initial or intermediate model that has not fully converged, it can still output category identification results (i.e., the first category identification results). Then, for each of the M images, the category label data corresponding to the image can be compared with the first category identification result to verify whether the identification result is correct.
Step S200, determining category coincidence rate data corresponding to each of the M images based on the category label data corresponding to the M images and the first category identification results.
Illustratively, the category coincidence rate data is used to characterize the degree of offset between the first category identification results and the category label data corresponding to the M images. Specifically, the category coincidence rate corresponding to an image is the coincidence rate of the category to which the image belongs (i.e., the coincidence rate of its true category). It can be understood that, for the category to which the image belongs, a real image set corresponding to that category is determined based on the category label data of the M images, a predicted image set corresponding to that category is determined based on the first category identification results of the M images, and the coincidence rate of the category, i.e., the category coincidence rate corresponding to the image, is then determined based on the offset data (also called deviation data) between the real image set and the predicted image set. Each of the N categories thus has its own coincidence rate, and the category coincidence rate data corresponding to an image can be understood as the coincidence rate data of the category to which the image belongs.
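As a rough analogy only, the per-category real and predicted image sets can be pictured as index sets built from the label data and the identification results. The Python sketch below uses a simple set-overlap measure; the function names and the overlap measure are illustrative assumptions, not the patent's definition, which is instead expressed through the boundary coincidence rates described later.

```python
# Illustrative only: build per-category index sets from ground-truth labels and
# predicted labels, then compare them. The patent derives its coincidence rate
# from boundary offsets (see below); plain set overlap is shown here purely as
# an analogy for "offset between the real image set and the predicted image set".
from typing import Dict, List, Set

def category_index_sets(labels: List[int]) -> Dict[int, Set[int]]:
    """Map each category id to the set of sequence positions labelled with it."""
    sets: Dict[int, Set[int]] = {}
    for position, category in enumerate(labels):
        sets.setdefault(category, set()).add(position)
    return sets

def overlap_rate(true_set: Set[int], pred_set: Set[int]) -> float:
    """Intersection-over-union style agreement between the two sets."""
    if not true_set and not pred_set:
        return 1.0
    return len(true_set & pred_set) / len(true_set | pred_set)

# Usage: one value per category; every image of that category shares it.
gt = [1, 1, 1, 2, 2, 3, 3, 3]        # category label data for M = 8 images
pred = [1, 1, 2, 2, 2, 2, 3, 3]      # first category identification results
true_sets, pred_sets = category_index_sets(gt), category_index_sets(pred)
rates = {c: overlap_rate(true_sets[c], pred_sets.get(c, set())) for c in true_sets}
print(rates)
```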
Step S300, based on the category coincidence rate data corresponding to the M images, determining second category distribution weight data corresponding to the M images.
Illustratively, for each of the M images, the second category distribution weight data corresponding to the image is determined based on the first category distribution weight data and the category coincidence rate data corresponding to the image. That is, the determined second category distribution weight data depends on the category coincidence rate data.
And S400, training the first recognition model based on the second class distribution weight data corresponding to the M images to obtain the image recognition model. The image recognition model is used for determining the category recognition results corresponding to the images in the image sequence to be recognized.
Illustratively, the image recognition model mentioned in step S400 is a trained, converged image recognition model.
Although step S400 refers to training the first recognition model based on the second class distribution weight data corresponding to each of the M images, it is understood that other data from the image sequence sample to be identified are also required for training. For example, in some embodiments, the image sequence sample to be identified includes at least one CT image sequence, and each image in each CT image sequence is represented as x(series, slice, data, gt, a, w), where series is the sequence id of the CT image sequence, slice is the id of the image within the sequence, data is the image pixel data, gt is the true label of the image (i.e., the category label data mentioned above), a is the class weight of the category to which the image corresponds, and w is the difficulty weight reassigned to the image during training (i.e., the class distribution weight data), with an initial value of 1 (i.e., the first class distribution weight data is 1).
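For illustration, assuming Python and NumPy, such a record x(series, slice, data, gt, a, w) could be held in a structure like the following; the field names simply mirror the description above, and the container itself is an assumption rather than part of the patent.

```python
# Hypothetical container mirroring the x(series, slice, data, gt, a, w) record
# described above.
from dataclasses import dataclass
import numpy as np

@dataclass
class CtImageSample:
    series: str        # sequence id of the CT image sequence
    slice_id: int      # id of the image within the CT image sequence
    data: np.ndarray   # image pixel data
    gt: int            # true label of the image (category label data)
    a: float           # class weight of the category the image corresponds to
    w: float = 1.0     # reassigned difficulty weight (class distribution weight data), initially 1
```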
In the actual training process, the order of the M images in the image sequence sample to be identified is first shuffled. Then, the first category identification results corresponding to the M images are generated based on the first class distribution weight data corresponding to the M images and the first recognition model; the category coincidence rate data corresponding to the M images are determined based on the category label data corresponding to the M images and the first category identification results; the second class distribution weight data corresponding to the M images are determined based on the category coincidence rate data; and the first recognition model is trained based on the second class distribution weight data corresponding to the M images. After n rounds of such training, the image recognition model is obtained. In other words, the first recognition model and the image recognition model differ only in certain model parameters. It should be noted that the image recognition model can be regarded as the model that finally converges, and that between the first recognition model and the image recognition model a number of intermediate models may be generated, one per training round. The purpose of shuffling the order of the M images is to prevent model jitter and speed up model convergence.
It should be noted that, as mentioned above, the order of the M images in the image sequence sample to be identified is shuffled only during training (for example, during the current training round); when data such as the category coincidence rate are calculated, the original order of the M images needs to be restored.
In some embodiments, the condition for determining whether the model converges is that the recognition accuracy of the model reaches a preset accuracy and there is no fluctuation exceeding a fluctuation threshold value in a plurality of iteration cycles.
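A minimal sketch of such a convergence check is given below; the window length and the concrete accuracy and fluctuation values are illustrative assumptions, not values stated in the patent.

```python
# Illustrative convergence check: accuracy has reached a preset value and has not
# fluctuated beyond a threshold over the last few iteration cycles.
from typing import List

def has_converged(accuracy_history: List[float],
                  target_accuracy: float = 0.95,
                  fluctuation_threshold: float = 0.005,
                  window: int = 5) -> bool:
    if len(accuracy_history) < window:
        return False
    recent = accuracy_history[-window:]
    reached_target = recent[-1] >= target_accuracy
    stable = (max(recent) - min(recent)) <= fluctuation_threshold
    return reached_target and stable

# Usage
print(has_converged([0.951, 0.952, 0.953, 0.952, 0.953]))  # True
```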
The model training method provided by the embodiments of the present application takes into account how difficult each image sample is for the model to learn. In addition, in the embodiments of the present application, the category coincidence rate data corresponding to each of the M images is determined using the category identification results output by the model in the current training round, and the category distribution weight data for the next training round is then determined based on that category coincidence rate data. The determined weight data therefore fully accounts for the prediction offset, specifically from the perspective of the sequence information of the image sequence (i.e., the positional relationship between images). With this arrangement, the learning difficulty and weight of wrongly predicted image samples can be redefined, so that the model pays more attention to those samples, and the recognition accuracy of the trained image recognition model is greatly improved.
Further, another embodiment of the present application extends the embodiment shown in fig. 1. Specifically, in this embodiment, the following step is additionally performed before step S100: determining the first class distribution weight data corresponding to each of the M images based on the image sequence sample to be identified that comprises the M images. That is, before step S100, a step of assigning first class distribution weight data to the M images is performed.
Illustratively, in some embodiments, the loss function used in the first recognition model is given by equation (1). In equation (1), a denotes the class weight corresponding to the current image, p denotes the predicted probability, and w denotes the assigned difficulty weight (i.e., the class distribution weight data mentioned in the above embodiments, such as the second class distribution weight data).
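The exact expression of equation (1) is given in the original publication as an image and is not reproduced here. The sketch below is only one plausible reading, assuming a cross-entropy term scaled by the class weight a and the difficulty weight w; it should not be taken as the patent's actual loss.

```python
# Assumed form only: a per-sample cross-entropy scaled by the class weight a and
# the assigned difficulty weight w, where p is the predicted probability of the
# sample's true category.
import math

def weighted_loss(a: float, w: float, p: float) -> float:
    return -a * w * math.log(max(p, 1e-12))  # clamp p to avoid log(0)

print(weighted_loss(a=1.0, w=1.3, p=0.8))
```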
Fig. 2 is a schematic flow chart illustrating a process of determining category coincidence rate data corresponding to M images based on category label data corresponding to the M images and a first category identification result according to an embodiment of the present application. The embodiment shown in fig. 2 is extended based on the embodiment shown in fig. 1, and the differences between the embodiment shown in fig. 2 and the embodiment shown in fig. 1 will be emphasized below, and the descriptions of the same parts will not be repeated.
As shown in fig. 2, in the embodiment of the present application, the step of determining the category matching rate data corresponding to each of the M images based on the category label data corresponding to each of the M images and the first category identification result includes the following steps.
Step S210, for each image of the M images, based on the category label data corresponding to the image and the first category identification result, determining boundary coincidence rate data corresponding to the image.
Illustratively, the boundary coincidence rate data refers to a degree of deviation between the predicted category boundary position and the corresponding true category boundary position.
In step S220, the category coincidence rate data corresponding to the image is determined based on the boundary coincidence rate data corresponding to the image.
For example, the boundary coincidence rate data corresponding to the image includes the boundary coincidence rate of the start boundary of the category to which the image belongs and the boundary coincidence rate of the end boundary of that category; the category coincidence rate data corresponding to the image can then be determined based on these two boundary coincidence rates.
Determining the category coincidence rate data corresponding to an image from its boundary coincidence rate data allows the offset at the boundaries to be fully taken into account. Because image samples at a boundary are harder for the model to learn than image samples away from the boundaries, the embodiment of the present application makes the model pay more attention to boundary image samples, further improving the recognition accuracy of the trained image recognition model.
Fig. 3 is a schematic flow chart illustrating a process of determining boundary coincidence rate data corresponding to an image based on class label data corresponding to the image and a first class identification result according to an embodiment of the present application. The embodiment shown in fig. 3 is extended based on the embodiment shown in fig. 2, and the differences between the embodiment shown in fig. 3 and the embodiment shown in fig. 2 will be emphasized below, and the descriptions of the same parts will not be repeated.
As shown in fig. 3, in the embodiment of the present application, the step of determining the boundary coincidence rate data corresponding to the image based on the category label data corresponding to the image and the first category identification result includes the following steps.
Step S211, determining sequence information of the adjacent prediction category corresponding to the image based on the first category identification result corresponding to the image.
Illustratively, the sequence information of the neighboring prediction category to which the image corresponds refers to the sequence information of the neighboring prediction category to which the image belongs.
Step S212, based on the category label data corresponding to the image, determines the sequence information of the adjacent real category corresponding to the image.
Illustratively, the sequence information of the neighboring real category to which the image corresponds refers to the sequence information of the neighboring real category to which the image belongs.
In step S213, based on the sequence information of the adjacent prediction class and the sequence information of the adjacent real class corresponding to the image, the boundary matching rate data corresponding to the image is determined.
A specific implementation manner of determining the boundary matching rate data corresponding to the image based on the sequence information of the adjacent prediction category and the sequence information of the adjacent real category corresponding to the image is illustrated in fig. 4.
Fig. 4 is a schematic flow chart illustrating a process of determining boundary coincidence rate data corresponding to an image based on sequence information of adjacent prediction categories and sequence information of adjacent real categories corresponding to the image according to an embodiment of the present application. Specifically, in the embodiment of the present application, the sequence information of the adjacent prediction classes includes intermediate sequence information and end sequence information of a prediction class to which an image belongs, intermediate sequence information and start sequence information of a next prediction class adjacent to the prediction class to which the image belongs. The sequence information of the adjacent real category includes intermediate sequence information and ending sequence information of the real category to which the image belongs, intermediate sequence information and starting sequence information of the next real category adjacent to the real category to which the image belongs.
As shown in fig. 4, the step of determining the boundary matching rate data corresponding to the image based on the sequence information of the adjacent prediction class and the sequence information of the adjacent real class corresponding to the image includes the following steps.
Step S2131 is a step of determining first offset data based on start sequence information of a next prediction class adjacent to a prediction class to which an image belongs, end sequence information of the prediction class to which the image belongs, start sequence information of a next real class adjacent to a real class to which the image belongs, and end sequence information of the real class to which the image belongs.
Step S2132 is a step of determining second offset data based on the intermediate sequence information of the next prediction category adjacent to the prediction category to which the image belongs, the intermediate sequence information of the next real category adjacent to the real category to which the image belongs, and the intermediate sequence information of the real category to which the image belongs.
Step S2133 is to determine boundary coincidence rate data corresponding to the image based on the first offset data and the second offset data.
Illustratively, the boundary coincidence rate data may be calculated according to equation (2).
In equation (2), k denotes the category index, i.e., the k-th category. If the image sequence sample to be identified contains four categories in total, with n denoting the starting part category of the image sequence and m denoting the ending part category, then n = 1, m = 4, and k is a positive integer greater than or equal to 1 and less than 4.
The two adjacent part sample sets of the prediction result can be understood as the part sample sets of the adjacent prediction categories corresponding to the category to which the image belongs, and the two adjacent part sample sets of the true result can be understood as the part sample sets of the adjacent real categories corresponding to the category to which the image belongs.
Specifically, equation (2) involves: the start sequence information of the next prediction category adjacent to the prediction category to which the image belongs (i.e., the sequence position at which the next part starts in the prediction); the end sequence information of the prediction category to which the image belongs (i.e., the sequence position at which the k-th part ends in the prediction); the start sequence information of the next real category adjacent to the real category to which the image belongs; the end sequence information of the real category to which the image belongs; the middle sequence information of the next prediction category adjacent to the prediction category to which the image belongs; the middle sequence information of the prediction category to which the image belongs (i.e., the sequence position of the middle of the k-th part in the prediction); the middle sequence information of the next real category adjacent to the real category to which the image belongs; and the middle sequence information of the real category to which the image belongs.
It can be seen that equation (2) characterizes the degree of deviation between the predicted boundary position of the two adjacent categories and the actual boundary position.
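Equation (2) itself is likewise given as an image in the original and is not reproduced here. The sketch below is one plausible instantiation of the quantities it relates, comparing the predicted boundary between category k and category k+1 with the real boundary and normalizing by the spacing between the real category middles; it simplifies the second offset data (which in the patent also involves the middle sequence information of the next predicted category), so it should be read as an assumption rather than as the patent's formula.

```python
# Assumed form only: a boundary coincidence measure in [0, 1], where 1 means the
# predicted boundary between category k and category k+1 coincides exactly with
# the real boundary.
def boundary_coincidence(pred_end_k: int, pred_start_next: int,
                         true_end_k: int, true_start_next: int,
                         true_mid_k: float, true_mid_next: float) -> float:
    pred_boundary = (pred_end_k + pred_start_next) / 2.0   # predicted boundary position
    true_boundary = (true_end_k + true_start_next) / 2.0   # real boundary position
    first_offset = abs(pred_boundary - true_boundary)      # assumed "first offset data"
    second_offset = abs(true_mid_next - true_mid_k)        # assumed "second offset data"
    if second_offset == 0:
        return 1.0
    return max(0.0, 1.0 - first_offset / second_offset)

# Usage: a predicted boundary that is off by 2 slices, with category middles 20 slices apart.
print(boundary_coincidence(48, 49, 50, 51, true_mid_k=40.0, true_mid_next=60.0))  # 0.9
```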
This way of determining the boundary coincidence rate data makes full use of the difference between the first category identification result and the category label data corresponding to the image, so boundary coincidence rate data that accurately characterizes the boundary offset between adjacent categories can be obtained, providing the precondition for the subsequent model to pay more attention to boundary image samples.
Fig. 5 is a schematic flowchart illustrating a process of determining category matching rate data corresponding to an image based on boundary matching rate data corresponding to the image according to an embodiment of the present application. The embodiment shown in fig. 5 is extended based on the embodiment shown in fig. 2, and the differences between the embodiment shown in fig. 5 and the embodiment shown in fig. 2 will be emphasized below, and the descriptions of the same parts will not be repeated.
As shown in fig. 5, in the embodiment of the present application, the step of determining the category matching rate data corresponding to the image based on the boundary matching rate data corresponding to the image includes the following steps.
Step S221, determining coincidence rate classification interval information corresponding to the image sequence samples to be identified based on layer thickness information and/or layer spacing information corresponding to the image sequence samples to be identified.
In step S222, the category matching rate data corresponding to the image is determined based on the matching rate classification section information and the boundary matching rate data corresponding to the image.
Illustratively, based on the layer thickness information and/or the layer spacing information corresponding to the image sequence sample to be identified, four coincidence rate classification intervals are determined for the image sequence sample: a severe inconsistency interval (0, a], a general inconsistency interval (a, b], a slight coincidence interval (b, 1), and a complete coincidence interval [1]. The modulation coefficients C used for calculating the image sample weights corresponding to these four intervals are 1, 0.5, 0.1, and 0, respectively, where a and b may be determined from empirical values.
For example, with a = 1/8 and b = 1/4, the intervals are the severe inconsistency interval (0, 1/8], the general inconsistency interval (1/8, 1/4], the slight coincidence interval (1/4, 1), and the complete coincidence interval [1]. Further, if the boundary coincidence rate calculated from equation (2) falls within (1/8, 1/4], i.e., into the general inconsistency interval, then the boundary coincidence rate data corresponds to a boundary increment coefficient C = 0.5. In some embodiments, the boundary increment coefficient may also be expressed directly as a function of the boundary coincidence rate data.
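A small sketch of this interval-to-coefficient mapping, using the example thresholds a = 1/8 and b = 1/4 (which the text above notes are empirical), might look as follows:

```python
# Map a boundary coincidence rate p in (0, 1] to the modulation coefficient C of
# its coincidence rate classification interval, following the example above.
def boundary_increment_coefficient(p: float, a: float = 1/8, b: float = 1/4) -> float:
    if p <= a:
        return 1.0      # severe inconsistency interval (0, a]
    if p <= b:
        return 0.5      # general inconsistency interval (a, b]
    if p < 1.0:
        return 0.1      # slight coincidence interval (b, 1)
    return 0.0          # complete coincidence interval [1]

print(boundary_increment_coefficient(0.2))   # 0.5: falls in the general inconsistency interval
```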
It will be appreciated that once the boundary increment coefficients are obtained, the category coincidence rate data can be determined. For example, the category coincidence rate data can be determined using equations (3) to (5). In equations (4) and (5), one of the terms denotes the category coincidence rate data. Which of the three equations applies depends on a condition on the category to which the image belongs: if the first condition holds, the category coincidence rate data is determined by equation (3); if the second condition holds, by equation (4); and if the third condition holds, by equation (5).
According to the model training method provided by the embodiments of the present disclosure, the coincidence rate classification interval information corresponding to the image sequence sample to be identified is determined from the layer thickness information and/or layer spacing information of that sample, and the category coincidence rate data corresponding to the image is then determined from the coincidence rate classification interval information and the boundary coincidence rate data corresponding to the image. This achieves the purpose of determining the category coincidence rate data of an image from its boundary coincidence rate data. Because the coincidence rate classification intervals are derived from the layer thickness and/or layer spacing of the image sequence sample, the embodiments of the present disclosure establish, through these intervals, an association between the layer thickness and/or layer spacing information and the boundary coincidence rate data of the image. The determined category coincidence rate data therefore incorporates a weight factor reflecting the layer thickness and/or layer spacing, which further improves its accuracy.
Fig. 6 is a schematic flow chart illustrating a process of determining second category distribution weight data corresponding to M images based on category coincidence rate data corresponding to the M images according to an embodiment of the present application. The embodiment shown in fig. 6 is extended based on the embodiment shown in fig. 1, and the differences between the embodiment shown in fig. 6 and the embodiment shown in fig. 1 will be emphasized below, and the descriptions of the same parts will not be repeated.
As shown in fig. 6, in the embodiment of the present application, the step of determining, based on the class coincidence rate data corresponding to each of the M images, second class assignment weight data corresponding to each of the M images includes the following steps.
Step S310, determining Gaussian distribution probability weight data corresponding to the M images.
Illustratively, the gaussian distribution probability weight data corresponding to an image is determined based on the gaussian distribution probabilities of all image samples contained in the class to which the image belongs.
For example, the Gaussian distribution probability weight data B corresponding to an image is determined according to equation (6). In equation (6), x_i denotes an image sample, and f(x_i) denotes the Gaussian distribution probability computed for that image sample.
Further, in some embodiments, for each of the M images, a Gaussian radius is determined based on the number of images contained in the true category to which the image belongs, and the central sequence information of that true category is used as the mean of the distribution to obtain the Gaussian distribution probability weight data. Illustratively, the Gaussian distribution probability weight data B corresponding to the image is determined according to equation (7). In equation (7), x denotes the distance, in the true labels, between the current sample and the middle of the category it belongs to; the mean of the distribution is given by the central sequence information of the true category to which the image (also called the image sample) belongs; and the Gaussian radius is taken as half of the total number of images contained in the true category to which the image belongs. It should be noted that, in combination with the above equations, the mean of the distribution is related to the two boundary coincidence rates corresponding to the current category. If the two boundary coincidence rates of the current category are the same, the center of the normal distribution curve is not shifted. If they differ, the center of the normal distribution curve is shifted toward the side with the larger boundary coincidence rate, so as to ultimately reduce the assigned weights of the image samples on that side and make the wrongly predicted samples better learned.
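Under the stated choices (mean at the central sequence position of the image's true category, Gaussian radius equal to half the number of images in that category), the Gaussian weight of equation (7) could be sketched as below. The unnormalized exponential form is an assumption about the exact expression, and the boundary-dependent shift of the center discussed above is omitted.

```python
# Assumed form of the Gaussian distribution probability weight: highest at the
# middle of the true category, decaying toward its edges.
import math

def gaussian_weight(position: int, category_center: float, category_size: int) -> float:
    sigma = category_size / 2.0              # Gaussian radius: half the number of images in the category
    x = position - category_center           # distance from the middle of the category
    return math.exp(-(x ** 2) / (2.0 * sigma ** 2))

# Usage: a category of 20 images centered at sequence position 40.
print(gaussian_weight(40, 40.0, 20))   # 1.0 at the category center
print(gaussian_weight(50, 40.0, 20))   # ~0.61 near the category edge
```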
Step S320, determining second category distribution weight data corresponding to the M images based on the category coincidence rate data and the gaussian distribution probability weight data corresponding to the M images.
A specific implementation manner of determining the second category distribution weight data corresponding to the M images based on the category coincidence rate data and the gaussian distribution probability weight data corresponding to the M images is further illustrated in combination with fig. 7.
Fig. 7 is a schematic flow chart illustrating a process of determining second class distribution weight data corresponding to M images based on class coincidence rate data and gaussian distribution probability weight data corresponding to the M images according to an embodiment of the present application. The embodiment shown in fig. 7 is extended based on the embodiment shown in fig. 6, and the differences between the embodiment shown in fig. 7 and the embodiment shown in fig. 6 will be emphasized below, and the descriptions of the same parts will not be repeated.
As shown in fig. 7, in the embodiment of the present application, the step of determining, based on the class coincidence rate data and the gaussian distribution probability weight data corresponding to each of the M images, second class distribution weight data corresponding to each of the M images includes the following steps.
Step S321, determining first incremental weight data corresponding to each of the M images based on the class coincidence rate data and the gaussian distribution probability weight data corresponding to each of the M images.
Step S322, determining second category distribution weight data corresponding to the M images based on the first increment weight data and the first category distribution weight data corresponding to the M images.
Illustratively, the class distribution weight data of the current training round (which may also be referred to as the difficulty weight increment data of the current training round) may be determined according to equation (8). It is understood that the second category distribution weight data mentioned in step S320 is this weight data. In equation (8), one term denotes the first incremental weight data, and 1 denotes the first class distribution weight data (it is understood that the initial class distribution weight data is 1).
Relatively speaking, images closer to the edge of the category to which they belong are more difficult for the model to learn well. Therefore, based on the Gaussian distribution probability weight data, an image close to the edge of its category is given a larger increment value and an image close to the center of its category is given a smaller increment value, so that the model can better learn the characteristics of the images close to the category edges. That is, the determined second category distribution weight data corresponding to each of the M images sufficiently takes into account the time-series characteristics of the image sequence.
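As a rough illustration of formula (8) and of steps S321 and S322, the sketch below combines the category coincidence rate and the Gaussian weight into an increment and adds it to the initial weight of 1. The exact rule for combining the two quantities into the first incremental weight data is not spelled out in the text, so the product used below is only an assumption.

```python
def second_class_weight(coincidence_rate, gaussian_weight, gaussian_peak):
    """Hedged sketch of formula (8): new weight = first category weight (1) + increment.

    coincidence_rate : category coincidence rate of the image's category, in [0, 1]
    gaussian_weight  : Gaussian distribution probability weight of the image
    gaussian_peak    : Gaussian weight at the category centre (the maximum value)
    The combination rule below is an assumption: a low coincidence rate and a
    position far from the category centre both enlarge the increment.
    """
    first_class_weight = 1.0
    increment = (1.0 - coincidence_rate) * (1.0 - gaussian_weight / gaussian_peak)
    return first_class_weight + increment
```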
Fig. 8 is a schematic flow chart illustrating training of the first recognition model based on second category distribution weight data corresponding to each of the M images to obtain an image recognition model according to an embodiment of the present application. The embodiment shown in fig. 8 is extended based on the embodiment shown in fig. 1; the differences between the embodiment shown in fig. 8 and the embodiment shown in fig. 1 are emphasized below, and the descriptions of the same parts are not repeated.
As shown in fig. 8, in the embodiment of the present application, the step of training the first recognition model based on the second category distribution weight data corresponding to each of the M images to obtain the image recognition model includes the following steps.
Step S410, training the first recognition model based on the second category distribution weight data corresponding to the M images respectively to obtain a second recognition model.
Step S420, determining a second category identification result corresponding to each of the M images by using the second identification model.
Step S430, obtaining current category distribution weight data corresponding to each of the M images based on the category label data corresponding to each of the M images and the second category identification result.
Step S440, determining third category distribution weight data corresponding to the M images based on the first category distribution weight data, the second category distribution weight data and the current category distribution weight data corresponding to the M images.
For each of the M images, the third category assignment weight data corresponding to the image is obtained by taking the historical average of the first category assignment weight data, the second category assignment weight data and the current category assignment weight data corresponding to the image. In other words, the third category assignment weight data corresponding to the image is obtained by averaging the historical category assignment weight data corresponding to the image. That is, the category assignment weight data for the next training round is obtained by averaging the category assignment weight data of all training rounds (including the current training round) that precede the next training round.
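A minimal sketch of this historical averaging, assuming the per-image weights of the earlier rounds are simply kept in a list:

```python
def next_round_weight(weight_history):
    """Average of the category assignment weights of all rounds so far for one
    image, e.g. the first, second and current round weights."""
    return sum(weight_history) / len(weight_history)

# Example: a first-round weight of 1.0, a second-round weight of 1.6 and a
# current-round weight of 1.3 give a third category assignment weight of
# (1.0 + 1.6 + 1.3) / 3 = 1.3.
```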
Step S450, training the second recognition model based on the third category distribution weight data corresponding to the M images to obtain the image recognition model.
According to the embodiment of the present application, the influence of the randomness of individual training rounds on the weight distribution during weight redistribution can be suppressed, which further improves the robustness of the obtained image recognition model.
It should be noted that, in practical applications, a specific implementation of step S450 may be to successively obtain a third recognition model, a fourth recognition model, a fifth recognition model, and so on. The category coincidence rate data of the current training round is then calculated from the category recognition results determined by each obtained intermediate model, and the category distribution weight data corresponding to each of the M images is corrected using the category coincidence rate data of the current training round, until a converged image recognition model is obtained.
In addition, it can be understood that, in the step of training the second recognition model based on the third category distribution weight data corresponding to each of the M images to obtain the image recognition model, each training round produces a new recognition model (for example, a fourth recognition model). Based on the category recognition results output by this new model for the M images and the category label data corresponding to the M images, new category coincidence rate data corresponding to each of the M images (which can be regarded as the category coincidence rate data of the next training round) is obtained. Training then continues in the next round based on this new category coincidence rate data, and the iteration is repeated in this way until a converged image recognition model is obtained.
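Putting the above steps together, the following Python sketch shows the overall shape of the iterative re-weighting loop. All helper callables (fit_one_round, predict, coincidence_rates, update_weights) and the convergence test are assumptions standing in for details given elsewhere in the description; they are passed in as parameters so the sketch stays self-contained.

```python
def train_image_recognition_model(images, labels, fit_one_round, predict,
                                  coincidence_rates, update_weights,
                                  max_rounds=50, tol=1e-4):
    """Sketch of the iterative training with per-image weight redistribution."""
    m = len(images)
    weight_history = [[1.0] * m]          # first category assignment weights are all 1
    weights = list(weight_history[0])
    model, prev_loss = None, float("inf")
    for _ in range(max_rounds):
        model, loss = fit_one_round(model, images, labels, weights)  # weighted training round
        preds = predict(model, images)                               # category recognition results
        rates = coincidence_rates(preds, labels)                     # category coincidence rate data
        current = update_weights(rates, images, labels)              # weights of the current round
        weight_history.append(current)
        # Average over all rounds so far to damp the randomness of single rounds.
        weights = [sum(col) / len(col) for col in zip(*weight_history)]
        if abs(prev_loss - loss) < tol:   # stand-in convergence criterion
            break
        prev_loss = loss
    return model
```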
Fig. 9 is a schematic flowchart of an image recognition method according to an embodiment of the present application. As shown in fig. 9, the image recognition method provided in the embodiment of the present application includes the following steps.
Step S500, determining an image recognition model.
Illustratively, the image recognition model is trained based on the model training method mentioned in the above embodiments.
Step S600, determining respective corresponding category identification results of the images in the image sequence to be identified by using the image identification model.
Illustratively, the image sequence to be recognized is a three-dimensional medical image sequence, and correspondingly, the category recognition result is a part recognition result.
According to the image recognition method and apparatus provided in the embodiments of the present application, the category recognition results corresponding to the images in the image sequence to be recognized can be determined accurately by using the image recognition model.
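As an illustration only, applying the trained model to a sequence could look like the following sketch; the callable interface of the model is an assumption.

```python
def recognize_sequence(image_recognition_model, image_sequence):
    """Apply the trained model to every image in the sequence to be recognized
    (e.g. a three-dimensional medical image sequence) and collect one category
    recognition result, such as a part label, per image."""
    return [image_recognition_model(image) for image in image_sequence]
```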
Method embodiments of the present application are described in detail above in conjunction with fig. 1-9, and apparatus embodiments of the present application are described in detail below in conjunction with fig. 10-12. It is to be understood that the description of the method embodiments corresponds to the description of the apparatus embodiments, and therefore reference may be made to the preceding method embodiments for parts not described in detail.
Fig. 10 is a schematic structural diagram of a model training apparatus according to an embodiment of the present application. As shown in fig. 10, the model training apparatus provided in the embodiment of the present application includes a first determining module 100, a second determining module 200, a third determining module 300, and a training module 400. Specifically, the first determining module 100 is configured to determine, by using the first identification model, a first class identification result corresponding to each of the M images based on the first class assignment weight data corresponding to each of the M images. The second determining module 200 is configured to determine category coincidence rate data corresponding to each of the M images based on the category label data corresponding to each of the M images and the first category identification result. The third determining module 300 is configured to determine second category distribution weight data corresponding to each of the M images based on the category coincidence rate data corresponding to each of the M images. The training module 400 is configured to train the first recognition model based on the second class distribution weight data corresponding to each of the M images, so as to obtain an image recognition model.
In some embodiments, the second determining module 200 is further configured to, for each of the M images, determine boundary coincidence rate data corresponding to the image based on the category label data corresponding to the image and the first category identification result, and determine category coincidence rate data corresponding to the image based on the boundary coincidence rate data corresponding to the image.
In some embodiments, the second determining module 200 is further configured to determine, based on the first class identification result corresponding to the image, sequence information of a neighboring prediction class corresponding to the image, determine, based on class tag data corresponding to the image, sequence information of a neighboring real class corresponding to the image, and determine, based on the sequence information of the neighboring prediction class corresponding to the image and the sequence information of the neighboring real class, boundary coincidence rate data corresponding to the image.
In some embodiments, the sequence information of the neighboring prediction classes comprises mid sequence information and end sequence information of a prediction class to which the image belongs, mid sequence information and start sequence information of a next prediction class adjacent to the prediction class to which the image belongs, and the sequence information of the neighboring real classes comprises mid sequence information and end sequence information of a real class to which the image belongs, mid sequence information and start sequence information of a next real class adjacent to the real class to which the image belongs. The second determining module 200 is further configured to determine first offset data based on the start sequence information of the next prediction category adjacent to the prediction category to which the image belongs, the end sequence information of the prediction category to which the image belongs, the start sequence information of the next real category adjacent to the real category to which the image belongs, and the end sequence information of the real category to which the image belongs, determine second offset data based on the intermediate sequence information of the next prediction category adjacent to the prediction category to which the image belongs, the intermediate sequence information of the next real category adjacent to the real category to which the image belongs, and the intermediate sequence information of the real category to which the image belongs, and determine boundary coincidence rate data corresponding to the image based on the first offset data and the second offset data.
In some embodiments, the second determining module 200 is further configured to determine consistency rate classification interval information corresponding to the image sequence samples to be recognized based on layer thickness information and/or layer spacing information corresponding to the image sequence samples to be recognized, and determine category consistency rate data corresponding to the images based on the consistency rate classification interval information and boundary consistency rate data corresponding to the images.
In some embodiments, the third determining module 300 is further configured to determine gaussian distribution probability weight data corresponding to each of the M images, and determine second class assignment weight data corresponding to each of the M images based on the class coincidence rate data and the gaussian distribution probability weight data corresponding to each of the M images.
In some embodiments, the third determining module 300 is further configured to determine first incremental weight data corresponding to each of the M images based on the class coincidence rate data and the gaussian distribution probability weight data corresponding to each of the M images, and determine second class assignment weight data corresponding to each of the M images based on the first incremental weight data and the first class assignment weight data corresponding to each of the M images.
In some embodiments, the training module 400 is further configured to train the first recognition model based on the second class assignment weight data corresponding to each of the M images to obtain a second recognition model, determine a second class recognition result corresponding to each of the M images by using the second recognition model, obtain the current class assignment weight data corresponding to each of the M images based on the class label data and the second class recognition result corresponding to each of the M images, and determine the third class assignment weight data corresponding to each of the M images based on the first class assignment weight data, the second class assignment weight data, and the current class assignment weight data corresponding to each of the M images.
Fig. 11 is a schematic structural diagram of an image recognition apparatus according to an embodiment of the present application. As shown in fig. 11, the image recognition apparatus provided in the embodiment of the present application includes a model determination module 500 and a recognition result determination module 600. In particular, the model determination module 500 is used to determine an image recognition model. The recognition result determining module 600 is configured to determine, by using the image recognition model, a category recognition result corresponding to each of the images in the image sequence to be recognized.
Next, an electronic apparatus according to an embodiment of the present application is described with reference to fig. 12. Fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
As shown in fig. 12, the electronic device 70 includes one or more processors 710 and memory 720.
Processor 710 may be a Central Processing Unit (CPU) or another form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 70 to perform desired functions.
Memory 720 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. Volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory. Non-volatile memory may include, for example, Read-Only Memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by processor 710 to implement the model training method, the image recognition method and/or other desired functions of the various embodiments of the present application mentioned above. Various contents, such as samples of the image sequence to be recognized, may also be stored in the computer-readable storage medium.
In one example, the electronic device 70 may further include: an input device 730 and an output device 740, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
The input device 730 may include, for example, a keyboard, a mouse, and the like.
The output device 740 may output various information including the result of the category recognition to the outside. The output devices 740 may include, for example, a display, speakers, a printer, and a communication network and its connected remote output devices, among others.
Of course, for the sake of simplicity, only some of the components of the electronic device 70 relevant to the present application are shown in fig. 12, and components such as buses, input/output interfaces, and the like are omitted. In addition, the electronic device 70 may include any other suitable components, depending on the particular application.
In addition to the above methods and apparatus, embodiments of the present application may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps of the model training method, the image recognition method according to various embodiments of the present application described above in this specification.
The computer program product may include program code for carrying out operations of embodiments of the present application, written in any combination of one or more programming languages, including an object-oriented programming language such as Java or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present application may also be a computer-readable storage medium having stored thereon computer program instructions, which, when executed by a processor, cause the processor to perform the steps in the model training method, the image recognition method according to various embodiments of the present application described above in the present specification.
A computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash Memory), an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present application in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present application are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present application. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the foregoing disclosure is not intended to be exhaustive or to limit the disclosure to the precise details disclosed.
The block diagrams of devices, apparatuses and systems referred to in this application are only given as illustrative examples and are not intended to require or imply that the connections, arrangements, configurations, etc. must be made in the manner shown in the block diagrams. These devices, apparatuses and systems may be connected, arranged and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The word "or" as used herein means, and is used interchangeably with, the word "and/or," unless the context clearly dictates otherwise. The word "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
It should also be noted that in the devices, apparatuses, and methods of the present application, the components or steps may be decomposed and/or recombined. These decompositions and/or recombinations are to be considered as equivalents of the present application.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the application to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (13)

1. A method of model training, comprising:
determining a first class identification result corresponding to each of the M images based on first class distribution weight data corresponding to each of the M images by using a first identification model;
determining category consistency rate data corresponding to the M images based on the category label data corresponding to the M images and the first category identification result;
determining second category distribution weight data corresponding to the M images based on the category coincidence rate data corresponding to the M images;
training the first recognition model based on second category distribution weight data corresponding to the M images respectively, and obtaining an image recognition model after training n times, wherein the image recognition model is used for determining category recognition results corresponding to the images in the image sequence to be recognized respectively;
wherein the determining, based on the class coincidence rate data corresponding to each of the M images, second class distribution weight data corresponding to each of the M images includes: determining Gaussian distribution probability weight data corresponding to the M images respectively; and determining second category distribution weight data corresponding to the M images respectively based on the category coincidence rate data and the Gaussian distribution probability weight data corresponding to the M images respectively.
2. The model training method according to claim 1, wherein the determining the class coincidence rate data corresponding to each of the M images based on the class label data corresponding to each of the M images and the first class recognition result comprises:
for each of the M images,
determining boundary consistency rate data corresponding to the image based on the category label data corresponding to the image and a first category identification result;
and determining the category matching rate data corresponding to the image based on the boundary matching rate data corresponding to the image.
3. The model training method according to claim 2, wherein the determining the boundary consistency rate data corresponding to the image based on the class label data corresponding to the image and the first class recognition result comprises:
determining sequence information of adjacent prediction categories corresponding to the images based on the first category identification results corresponding to the images;
determining sequence information of adjacent real categories corresponding to the images based on the category label data corresponding to the images;
and determining boundary consistency rate data corresponding to the image based on the sequence information of the adjacent prediction type and the sequence information of the adjacent real type corresponding to the image.
4. The model training method according to claim 3, wherein the sequence information of the neighboring prediction classes comprises middle sequence information and end sequence information of a prediction class to which the image belongs, middle sequence information and start sequence information of a next prediction class adjacent to the prediction class to which the image belongs, and the sequence information of the neighboring real classes comprises middle sequence information and end sequence information of a real class to which the image belongs, middle sequence information and start sequence information of a next real class adjacent to the real class to which the image belongs,
wherein the determining the boundary coincidence rate data corresponding to the image based on the sequence information of the adjacent prediction category and the sequence information of the adjacent real category corresponding to the image comprises:
determining first offset data based on the starting sequence information of the next prediction category adjacent to the prediction category to which the image belongs, the ending sequence information of the prediction category to which the image belongs, the starting sequence information of the next real category adjacent to the real category to which the image belongs, and the ending sequence information of the real category to which the image belongs;
determining second offset data based on the intermediate sequence information of the next prediction category adjacent to the prediction category to which the image belongs, the intermediate sequence information of the next real category adjacent to the real category to which the image belongs, and the intermediate sequence information of the real category to which the image belongs;
and determining boundary consistency rate data corresponding to the image based on the first offset data and the second offset data.
5. The model training method according to any one of claims 2 to 4, wherein determining the class coincidence rate data corresponding to the image from the boundary coincidence rate data corresponding to the image comprises:
determining consistency rate classification interval information corresponding to the image sequence samples to be identified based on layer thickness information and/or layer spacing information corresponding to the image sequence samples to be identified;
and determining the category matching rate data corresponding to the image based on the matching rate classification section information and the boundary matching rate data corresponding to the image.
6. The model training method of claim 1, wherein the determining the gaussian distribution probability weight data corresponding to each of the M images comprises:
and determining the Gaussian radius of each image in the M images based on the number of images contained in the real category to which the image belongs, and obtaining the Gaussian distribution probability weight data by taking the central sequence information of the real category to which the image belongs as the mean value of distribution.
7. The model training method according to claim 1, wherein determining second class distribution weight data corresponding to each of the M images based on the class coincidence rate data and the gaussian distribution probability weight data corresponding to each of the M images comprises:
determining first incremental weight data corresponding to the M images respectively based on the class consistency rate data and the Gaussian distribution probability weight data corresponding to the M images respectively;
and determining second category distribution weight data corresponding to the M images respectively based on the first increment weight data and the first category distribution weight data corresponding to the M images respectively.
8. The model training method according to any one of claims 1 to 4, wherein the training of the first recognition model based on the second class assignment weight data corresponding to each of the M images, and the training of n rounds to obtain the image recognition model, comprises:
training the first recognition model based on second class distribution weight data corresponding to the M images to obtain a second recognition model;
determining a second category identification result corresponding to each of the M images by using the second identification model;
obtaining current category distribution weight data corresponding to the M images based on category label data corresponding to the M images and a second category identification result;
determining third category distribution weight data corresponding to the M images based on the first category distribution weight data, the second category distribution weight data and the current category distribution weight data corresponding to the M images;
and training the second recognition model based on the third class distribution weight data corresponding to the M images to obtain the image recognition model.
9. An image recognition method, comprising:
determining an image recognition model, wherein the image recognition model is obtained by training based on the model training method of any one of the claims 1 to 8;
and determining the category identification results corresponding to the images in the image sequence to be identified by using the image identification model.
10. A model training apparatus, comprising:
the first determining module is used for determining a first class identification result corresponding to each of the M images based on first class distribution weight data corresponding to each of the M images by using a first identification model;
a second determining module, configured to determine category consistency rate data corresponding to each of the M images based on the category label data corresponding to each of the M images and the first category identification result;
a third determining module, configured to determine second category assignment weight data corresponding to each of the M images based on category coincidence rate data corresponding to each of the M images, where the determining the second category assignment weight data corresponding to each of the M images based on category coincidence rate data corresponding to each of the M images includes: determining Gaussian distribution probability weight data corresponding to the M images respectively; determining second category distribution weight data corresponding to the M images respectively based on the category consistency rate data and the Gaussian distribution probability weight data corresponding to the M images respectively;
and the training module is used for training the first recognition model based on second category distribution weight data corresponding to the M images respectively, and obtaining an image recognition model after n rounds of training, wherein the image recognition model is used for determining respective category recognition results corresponding to the images in the image sequence to be recognized.
11. An image recognition apparatus, comprising:
a model determination module, configured to determine an image recognition model, where the image recognition model is obtained by training based on the model training method according to any one of claims 1 to 8;
and the identification result determining module is used for determining the category identification results corresponding to the images in the image sequence to be identified by using the image identification model.
12. A computer-readable storage medium, characterized in that the storage medium stores a computer program for performing the method of any of the preceding claims 1 to 9.
13. An electronic device, characterized in that the electronic device comprises:
a processor;
a memory for storing the processor-executable instructions;
the processor configured to perform the method of any of the preceding claims 1 to 9.
CN202111035290.XA 2021-09-06 2021-09-06 Model training method and device, and image recognition method and device Active CN113505859B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111035290.XA CN113505859B (en) 2021-09-06 2021-09-06 Model training method and device, and image recognition method and device


Publications (2)

Publication Number Publication Date
CN113505859A CN113505859A (en) 2021-10-15
CN113505859B true CN113505859B (en) 2021-12-28

Family

ID=78016316

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111035290.XA Active CN113505859B (en) 2021-09-06 2021-09-06 Model training method and device, and image recognition method and device

Country Status (1)

Country Link
CN (1) CN113505859B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116028880B (en) * 2023-02-07 2023-07-04 支付宝(杭州)信息技术有限公司 Method for training behavior intention recognition model, behavior intention recognition method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110909780A (en) * 2019-11-14 2020-03-24 腾讯科技(深圳)有限公司 Image recognition model training and image recognition method, device and system
CN111523621A (en) * 2020-07-03 2020-08-11 腾讯科技(深圳)有限公司 Image recognition method and device, computer equipment and storage medium
CN111814835A (en) * 2020-06-12 2020-10-23 理光软件研究所(北京)有限公司 Training method and device of computer vision model, electronic equipment and storage medium


Also Published As

Publication number Publication date
CN113505859A (en) 2021-10-15


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant