WO2024083152A1

WO2024083152A1 - Pathological image recognition method, pathological image recognition model training method and system therefor, and storage medium

Info

Publication number: WO2024083152A1
Application number: PCT/CN2023/125221
Authority: WO
Inventors: 张楚康; 张皓; 张行
Original assignee: 安翰科技(武汉)股份有限公司
Priority date: 2022-10-18
Filing date: 2023-10-18
Publication date: 2024-04-25
Also published as: CN115346076B; CN115346076A

Abstract

Disclosed in the present invention are a pathological image recognition method, a pathological image recognition model training method and system therefor, and a storage medium. The model training method comprises: receiving a sample image set; calling a first neural network model to perform traversal reasoning, calling a second neural network model to perform supervised training on the basis of a reasoning result, and calculating a first loss function; calling the second neural network model to perform traversal reasoning, calling the first neural network model to perform supervised training on the basis of a reasoning result, and calculating a second loss function; and performing iterative training according to the first loss function and the second loss function to obtain at least one of a first model training parameter and a second model training parameter. The model training method provided by the present invention can reduce the dependence on limited data, and enhance the stability and performance of models.

Description

Pathological image recognition method and model training method, system and storage medium thereof

This application claims priority to a Chinese patent application filed on October 18, 2022, with application number 202211272240.8, and invention name “Pathological image recognition method and model training method, system and storage medium thereof”, the entire contents of which are incorporated by reference into this application.

Technical Field

The present invention relates to the field of image processing technology, and in particular to a pathological image recognition method and a model training method, system and storage medium thereof.

Background technique

How to efficiently and accurately analyze pathological image data, especially the pathological image data of digestive tract malignant tumors, has always been a topic of great concern in the medical field. At present, the application of artificial intelligence in pathological image analysis can be roughly divided into two major directions: qualitative diagnosis and lesion identification. Due to the limitation of computer load capacity, the modeling idea is usually implemented based on the supervised learning model framework. In this case, the training and optimization of the model algorithm requires a large amount of rich labeled data; if the target lesion area is to be accurately segmented and predicted, clinical professionals are required to perform pixel-level fine annotation of the training data specimen image, which requires a lot of manpower and time costs. As a result, the application of artificial intelligence and deep learning technology in pathological image data analysis is hindered, especially in the scenarios of identifying lesions that have already occurred in the digestive system, identifying and warning precancerous lesions in the digestive system, etc., it is difficult for existing technologies to build a suitable model and quickly output the calculation results.

In addition, in the prior art, multiple sets of labeled data that are costly to establish are usually only used in one round of iterative training of the model. However, in order to improve the accuracy of model training, the number of iterations of model training has to be increased, resulting in a single set of labeled data having a weak contribution and influence on the model training process and limited model quality. At this time, in order to improve the quality of the model, workers are required to feed labeled data to the model again, creating a vicious cycle.

Summary of the invention

One of the purposes of the present invention is to provide a pathological image recognition model training method to solve the technical problems in the prior art that model training is too dependent on labeled data for supervised training, has low utilization rate, cannot fully utilize data for training, has poor training effect and high cost.

One of the purposes of the present invention is to provide a pathological image recognition model training system.

One of the objectives of the present invention is to provide a storage medium.

One of the objectives of the present invention is to provide a pathological image recognition method.

To achieve one of the above-mentioned purposes of the invention, an embodiment of the present invention provides a pathological image recognition model training method, the method comprising: receiving a sample image set; based on the sample image set, calling a first neural network model to perform supervised training and traversal reasoning in sequence, and calling a second neural network model to perform supervised training based on the reasoning result, and calculating a first loss function; based on the sample image set, calling the second neural network model to perform supervised training and traversal reasoning in sequence, and calling the first neural network model to perform supervised training based on the reasoning result, and calculating a second loss function; iteratively training the first neural network model and the second neural network model according to the first loss function and the second loss function to obtain at least one of the first model training parameters and the second model training parameters.

As a further improvement of an embodiment of the present invention, the sample image set includes a labeled sample image set and an unlabeled sample image set.

As a further improvement of an embodiment of the present invention, the "calling the first neural network model to perform supervised training and traversal reasoning in sequence according to the sample image set, and calling the second neural network model to perform supervised training based on the reasoning result to calculate the first loss function" specifically includes: calling the first neural network model to perform supervised training according to the labeled sample image set, and then calling the first neural network model to perform traversal reasoning according to the unlabeled sample image set to obtain a first recognition pseudo-label set corresponding to the unlabeled sample image set; calling the second neural network model to perform supervised training according to the unlabeled sample image set and the first recognition pseudo-label set, and calculating the The first loss function; the "according to the sample image set, calling the second neural network model to perform supervised training and traversal reasoning in sequence, and calling the first neural network model to perform supervised training based on the reasoning result, and calculating the second loss function" specifically includes: according to the labeled sample image set, calling the second neural network model to perform supervised training, and then according to the unlabeled sample image set, calling the second neural network model to perform traversal reasoning to obtain a second recognition pseudo-label set corresponding to the unlabeled sample image set; according to the unlabeled sample image set and the second recognition pseudo-label set, calling the first neural network model to perform supervised training, and calculating the second loss function.

As a further improvement of an embodiment of the present invention, before the "receiving the sample image set", the method also includes: receiving a reference pathology image set; performing size normalization processing and color migration normalization processing on the reference pathology image set in sequence, and calculating to obtain a standard pathology image set; wherein the standard pathology image set includes a labeled pathology image set and an unlabeled pathology image set; grouping the labeled pathology image set, combining the first labeled image set with the unlabeled pathology image set to form a sample image training set, and forming a sample image verification set based on the second labeled image set; generating the sample image set based on the sample image training set and the sample image verification set.

As a further improvement of an embodiment of the present invention, before the “receiving a reference pathological image set”, the method specifically includes: receiving a precancerous lesion specimen image and a non-precancerous lesion specimen image; performing pixel annotation on some precancerous lesion specimen images to obtain a lesion annotation mask; generating the reference pathological image set according to the precancerous lesion specimen image, the corresponding lesion annotation mask, and the non-precancerous lesion specimen image; the “performing size normalization processing and color migration normalization processing on the reference pathological image set in sequence to obtain a standard pathological image set” specifically includes: performing size normalization processing and color migration normalization processing on all annotated lesion specimen images in sequence; The size standardization processing and the color migration standardization processing are performed, and the set of annotated pathological images is obtained by calculation based on the annotated lesion specimen images after the processing; wherein the annotated lesion specimen images correspond to the precancerous lesion specimen images with the corresponding lesion annotation masks; for all unannotated lesion specimen images and all non-precancerous lesion specimen images, the size standardization processing and the color migration standardization processing are performed in sequence, and the set of unannotated pathological images is obtained by calculation based on the unannotated lesion specimen images and the non-precancerous lesion specimen images after the processing; wherein the unannotated lesion specimen images correspond to the precancerous lesion specimen images without the corresponding lesion annotation masks.

As a further improvement of an embodiment of the present invention, the number of the labeled lesion specimen images accounts for 30% of the number of all precancerous lesion specimen images; the number of all non-precancerous lesion specimen images accounts for 20% of the number of all precancerous lesion specimen images.

As a further improvement of an embodiment of the present invention, the "obtaining a set of standard pathological images by calculation" specifically includes: performing sliding window area segmentation on a reference pathological image that has completed size standardization and color migration standardization, and obtaining and calculating the set of standard pathological images based on multiple groups of sliding window area image groups; wherein the sliding window area segmentation specifically includes: constructing an image area sliding window of a preset size, and causing the image area sliding window to perform traversal segmentation on the annotated standardized image and the corresponding lesion annotation mask according to a preset step size, and obtaining multiple groups of annotated sliding window image groups and annotated sliding window mask groups; wherein the annotated standardized image is an annotated lesion specimen image that has completed standardization processing; traversal, segmentation The method comprises the following steps: analyzing and screening and updating the annotated sliding window image and the corresponding annotated sliding window mask according to the proportion of the lesion area of all the annotated sliding window masks in the annotated sliding window mask group; making the image area sliding window perform traversal segmentation on the unannotated standardized image and the non-lesion standardized image according to the preset step size to obtain multiple groups of unannotated sliding window image groups and non-lesion sliding window image groups; wherein the unannotated standardized image is an unannotated lesion specimen image after standardization processing, and the non-lesion standardized image is an image of a non-precancerous lesion specimen after standardization processing; traversing, analyzing and screening and updating the unannotated sliding window image and the non-lesion sliding window image according to the proportion of the tissue area of the unannotated sliding window image and the non-lesion sliding window image.

As a further improvement of an embodiment of the present invention, after "traversing, analyzing and screening and updating the annotated sliding window image and the corresponding annotated sliding window mask according to the proportion of the lesion area of all annotated sliding window masks in the annotated sliding window mask group", the method specifically includes: performing random data augmentation processing on the annotated sliding window image and the corresponding annotated sliding window mask to obtain the set of annotated pathological images; after "traversing, analyzing and screening and updating the unannotated sliding window image and the non-lesion sliding window image according to the proportion of the tissue area of the unannotated sliding window image and the non-lesion sliding window image", the method specifically includes: performing random data augmentation processing on the unannotated sliding window image and the non-lesion sliding window image to obtain the set of unannotated pathological images; wherein the random data augmentation specifically includes: performing at least one of horizontal flipping, vertical flipping, preset angle rotation and transposition on the image matrix according to a preset probability.

As a further improvement of one embodiment of the present invention, the lesion annotation mask includes a one-hot encoding label corresponding to each pixel in the precancerous lesion specimen image, and the one-hot encoding label includes a first encoding bit, a second encoding bit and a third encoding bit that respectively represent the background judgment label, the intraepithelial neoplasia judgment label and the intestinal metaplasia judgment label.

As a further improvement of an embodiment of the present invention, the size standardization processing specifically includes: performing size standardization processing on the reference pathology image set to unify all reference pathology images to a preset magnification; the color migration standardization processing specifically includes: receiving a baseline staining image, performing color space conversion on it, and calculating a baseline staining vector matrix; receiving a reference pathology image, performing color space conversion on it, and calculating a reference color density matrix; generating a color migration image corresponding to the reference pathology image based on the baseline staining vector matrix and the reference color density matrix.

As a further improvement of an embodiment of the present invention, the "receiving a reference staining image, performing color space conversion on it, and calculating a reference staining vector matrix" specifically includes: receiving a reference staining image, performing optical density matrix conversion processing, and obtaining a reference optical density matrix; performing singular value decomposition on the reference optical density matrix, selecting the first singular extreme value and the second singular extreme value to create a projection plane; determining at least one reference singular value and its reference plane axis on the projection plane, projecting the reference optical density matrix to the projection plane, fitting the connecting line between all numerical points on the projected reference optical density matrix and the origin of the projection plane, and calculating the angle between the connecting line and the reference plane axis, finding the maximum value among all angles, and obtaining maximum angle data; calculating the optical density matrix corresponding to the maximum angle data, and performing a normalization operation on the optical density matrix to obtain the reference staining vector matrix.

As a further improvement of an embodiment of the present invention, the "receiving a reference pathological image, performing color space conversion on it, and calculating a reference color density matrix" specifically includes: receiving a reference pathological image, performing optical density matrix conversion, singular value decomposition, plane projection and maximum angle data acquisition on it in sequence, and calculating a reference optical density matrix and a reference staining vector matrix corresponding to the reference pathological image; based on the reference staining vector matrix and the reference optical density matrix, calculating the reference color density matrix corresponding to the reference pathological image.

As a further improvement of an embodiment of the present invention, the method specifically comprises: performing downsampling interpolation on the reference pathological image, setting the The magnification of the reference pathological image is 10 times; wherein the downsampling interpolation is the nearest neighbor interpolation.

As a further improvement of an embodiment of the present invention, before the step of "after calling the first neural network model to perform supervised training according to the set of labeled sample images, calling the first neural network model to perform traversal reasoning according to the set of unlabeled sample images to obtain a first set of identification pseudo-labels corresponding to the set of unlabeled sample images", the method also includes: selecting a semantic segmentation backbone model based on a fully convolutional network as a basic backbone model; performing model initialization based on the basic backbone model according to first weight configuration parameters and second weight configuration parameters, respectively, to obtain the first neural network model and the second neural network model; wherein the first neural network model and the second neural network model are both equipped with a softmax activation function and are configured to have the same optimizer and learning rate adjustment strategy.

As a further improvement of one embodiment of the present invention, the basic backbone model is configured based on a U-Net network architecture, the first weight configuration parameter is set to be generated based on a Xavier parameter initialization strategy, and the second weight configuration parameter is set to be generated based on a Kaiming parameter initialization strategy; the first neural network model and the second neural network model are configured to include a stochastic gradient descent optimizer, and the learning rate adjustment strategy is configured so that the model learning rate value decreases with an increase in the number of iterations.

As a further improvement of an embodiment of the present invention, the model learning rate value is equal to the product of a preset exponential power of the ratio of the remaining number of iterations to the total number of iterations and a basic learning rate value.

As a further improvement of one embodiment of the present invention, the first loss function is configured as a weighted sum of a first supervised loss function and a first pseudo-label loss function, wherein the first supervised loss function refers to a supervised training process of the first neural network model based on a sample image set, and the first pseudo-label loss function refers to a supervised training process of the second neural network model based on an inference result; the second loss function is configured as a weighted sum of a second supervised loss function and a second pseudo-label loss function, wherein the second supervised loss function refers to a supervised training process of the second neural network model based on the sample image set, and the second pseudo-label loss function refers to a supervised training process of the first neural network model based on the inference result.

As a further improvement of an embodiment of the present invention, the first supervised loss function is configured as the sum of a first supervised cross entropy loss function and a first supervised intersection-over-union loss function; wherein, the first supervised cross entropy loss function represents the gap between the known label data in the sample image set and the corresponding inference classification probability, and the first supervised intersection-over-union loss function represents the gap between the known label data in the sample image set and the corresponding inference classification category; the first pseudo-label loss function includes a first pseudo-label cross entropy loss function; wherein, the first pseudo-label cross entropy loss function represents the difference between the inference classification probability of the sample image set by the first neural network model and the inference classification category of the sample image set by the second neural network model. The gap between class categories; the second supervised loss function is configured as the sum of a second supervised cross entropy loss function and a second supervised intersection-over-union loss function; wherein the second supervised cross entropy loss function characterizes the gap between the known label data in the sample image set and the corresponding inference classification probability, and the second supervised intersection-over-union loss function characterizes the gap between the known label data in the sample image set and the corresponding inference classification category; the second pseudo-label loss function includes a second pseudo-label cross entropy loss function; wherein the second pseudo-label cross entropy loss function characterizes the gap between the inference classification probability of the sample image set by the second neural network model and the inference classification category of the sample image set by the first neural network model.

As a further improvement of one embodiment of the present invention, the sample image characterizes the intraepithelial neoplasia and intestinal metaplasia; the first supervised cross entropy loss function, the first pseudo-label cross entropy loss function, the second supervised cross entropy loss function and the second pseudo-label cross entropy loss function point to the background area, intraepithelial neoplasia area and intestinal metaplasia area in the sample image; the first supervised intersection-over-union loss function and the second supervised intersection-over-union loss function point to the intraepithelial neoplasia area and intestinal metaplasia area in the sample image.

As a further improvement of an embodiment of the present invention, the first pseudo-label loss function and the second pseudo-label loss function have equal preset weight values, and the preset weight values are configured to increase with an increase in the number of iterations.

As a further improvement of an embodiment of the present invention, the preset weight value is equal to the product of the maximum weight value and a preset increasing function, and the preset increasing function is configured so that the function value approaches 1 infinitely.

As a further improvement of an embodiment of the present invention, the sample image represents the intraepithelial neoplasia and intestinal metaplasia.

To achieve one of the above-mentioned purposes of the invention, one embodiment of the present invention provides a pathological image recognition model training system, comprising: one or more processors; a memory for storing one or more computer programs, which, when the one or more computer programs are executed by the one or more processors, are configured to execute the pathological image recognition model training method described in any of the above-mentioned technical solutions.

To achieve one of the above-mentioned purposes of the invention, one embodiment of the present invention provides a storage medium on which a computer program is stored. When the computer program is executed by a processor, the pathological image recognition model training method described in any of the above-mentioned technical solutions is implemented.

To achieve one of the above-mentioned purposes of the invention, an embodiment of the present invention provides a pathological image recognition method, which includes: executing the pathological image recognition model training method described in any of the above-mentioned technical solutions to obtain at least one of the first model training parameters and the second model training parameters; carrying the model training parameters into the corresponding neural network model to construct a pathological image recognition model; receiving the pathological image data to be tested and preprocessing it, and inputting the preprocessed pathological image data to be tested into the pathological image recognition model for traversal prediction to obtain pathological recognition data.

As a further improvement of an embodiment of the present invention, the "receiving pathological image data to be tested and preprocessing it, inputting the preprocessed pathological image data to be tested into the pathological image recognition model for traversal prediction, and obtaining pathological recognition data" specifically includes: performing size standardization processing and color migration standardization processing on the pathological image data to be tested in sequence, and calculating to obtain a set of pathological images to be tested; inputting the set of pathological images to be tested into the pathological image recognition model for traversal prediction, and obtaining a pathological recognition pixel area; and superimposing and displaying the pathological recognition pixel area on the pathological image to be tested to form a pathological judgment image.

As a further improvement of one embodiment of the present invention, the "calculation to obtain a set of pathological images to be tested" specifically includes: performing sliding window area segmentation on the pathological image data to be tested that has completed size standardization and color migration standardization, and screening to obtain the set of pathological images to be tested based on the proportion of low grayscale value areas in the sliding window image to be tested.

As a further improvement of an embodiment of the present invention, the pathology identification data includes precancerous lesion determination information, and the "receiving the pathology image data to be tested and preprocessing it, inputting the preprocessed pathology image data to be tested into the pathology image recognition model for traversal prediction, and obtaining the pathology identification data" specifically includes: arranging the pixel values in the pathology identification pixel area respectively pointing to intraepithelial neoplasia and intestinal metaplasia in descending order, calculating the pixel average value within a preset number range, obtaining a first average value and a second average value, and judging the numerical relationship between the first average value and the second average value and a preset precancerous lesion determination threshold; if one of the first average value and the second average value is greater than the precancerous lesion determination threshold, it is judged that a precancerous lesion occurs at the position represented by the pathology image to be tested corresponding to the pathology identification pixel area, and outputting the precancerous lesion determination information.

Compared with the prior art, the pathological image recognition model training method provided by the present invention constructs two parallel learning models, a first neural network model and a second neural network model, and uses the two sets of loss functions generated to train and optimize the models in comparison, thereby making full use of limited image data for training and making the performance of the neural network model more stable; using a sample image set to sequentially train the previous model to the next model, and using a sample image set to sequentially train the next model to the previous model, combining general supervised training and pseudo-label-based supervised training, can reduce dependence on scarce data types such as labeled data, and make unlabeled data equivalent to labeled data and participate in the model training process, thereby greatly improving the performance of the trained model, reducing costs and increasing training speed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG1 is a schematic diagram of the structure of a pathological image recognition model training system according to an embodiment of the present invention.

FIG. 2 is a schematic diagram of the steps of a pathological image recognition model training method according to an embodiment of the present invention.

FIG. 3 is a schematic diagram of the steps of a first embodiment of a pathological image recognition model training method according to an embodiment of the present invention.

FIG. 4 is a schematic diagram of some steps of a pathological image recognition model training method in another embodiment of the present invention.

FIG. 5 is a schematic diagram of some steps of a first embodiment of a pathological image recognition model training method in another embodiment of the present invention.

FIG. 6 is a schematic diagram of some steps of a specific example of the first embodiment of the pathological image recognition model training method in another embodiment of the present invention.

FIG. 7 is a schematic diagram of some steps of a pathological image recognition model training method in yet another embodiment of the present invention.

FIG8 is a schematic diagram of some steps of a first embodiment of a pathological image recognition model training method in yet another embodiment of the present invention.

FIG9 is a schematic diagram of the image data conversion process when executing the pathological image recognition model training method in another embodiment of the present invention.

FIG. 10 is a schematic diagram of the steps of a pathological image recognition method and a first embodiment thereof in one embodiment of the present invention.

Detailed ways

The present invention will be described in detail below in conjunction with the specific embodiments shown in the accompanying drawings. However, these embodiments do not limit the present invention, and any structural, methodological, or functional changes made by a person skilled in the art based on these embodiments are all within the scope of protection of the present invention.

It should be noted that the term "comprises" or any other variation thereof is intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements, but also includes other elements not explicitly listed, or also includes elements inherent to such process, method, article or device. In addition, the terms "first", "second", "third", etc. are used for descriptive purposes only and cannot be understood as indicating or implying relative importance.

The core technical route of the present invention is to construct two sets of parallel neural network models, alternately perform supervised training and supervised training based on the inference results after supervised training, so as to achieve the technical effects of making full use of the content of the sample image set, stabilizing the quality of the output training parameters, and improving the prediction accuracy of the model. At the same time, the additional technical features proposed in the following text of the present invention, such as image standardization, grouping, sliding window segmentation, etc., can also further optimize the model training method from the aspects of the quality of the sample image set itself, the construction of the image set used for training, and resource occupation. It is worth emphasizing that the various implementation methods, embodiments or specific examples below can be combined with each other, and the new technical scheme formed thereby can be included in the protection scope of the present invention.

In order to solve the technical problem and achieve the technical effect, an embodiment of the present invention provides a storage medium, which can be specifically a computer-readable storage medium or a computer-readable signal medium or any combination of the above two, so that the storage medium can be set in a computer and store a computer program. The computer storage medium can be any available medium that can be accessed by a computer, or can be a storage device such as a server or a data center that includes one or more available media. The available medium can be a magnetic medium such as a floppy disk, a hard disk, a magnetic tape, or a DVD (Digital Video Recorder). Video Disc, high-density digital video disc) and other optical media, or semiconductor media such as SSD (Solid State Disk, solid state hard disk), or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium containing or storing a program, which can be used by or in combination with an instruction execution system, device or device. In the present application, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, which carries a computer-readable program code. This propagated data signal may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which may send, propagate or transmit a program for use by or in combination with an instruction execution system, device or device. The program code contained on the computer-readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, etc., or any suitable combination of the above. When the computer program is executed by any processor in a computer, a pathological image recognition model training method is implemented to at least perform: receiving a set of sample images, calling and training a first neural network model and a second neural network model, calculating a first loss function and a second loss function, and generating at least one of a first model training parameter and a second model training parameter.

One embodiment of the present invention further provides a pathological image recognition model training system 100 as shown in FIG1 , and the pathological image recognition model training system 100 includes a processor 11, a communication interface 12, a memory 13, and a communication bus 14. The processor 11, the communication interface 12, and the memory 13 communicate with each other through the communication bus 14. The following components are connected to the communication interface 12: input components including a keyboard, a mouse, etc.; output components including a cathode ray tube (CRT, Cathode Ray Tube), a liquid crystal display (LCD, Liquid Crystal Display), etc., and a speaker, etc.; a memory 13 including a hard disk, etc.; and a communication component including a network interface card such as a local area network card, a modem, etc. The communication component performs communication processing via a network such as the Internet. A drive can be connected to the communication interface 12 as needed. Removable media, such as magnetic disks, optical disks, magneto-optical disks, semiconductor media, etc., are installed on the drive as needed so that the computer program read therefrom can be installed into the memory 13 as needed.

In particular, according to an embodiment of the present application, the process described in each method flow chart can be implemented as a computer software program. For example, an embodiment of the present application includes a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program includes a program code for executing the method shown in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through a communication component, and/or installed from a removable medium. When the computer program is executed by the processor 11, various functions defined in the system of the present application are executed.

The pathological image recognition model training system 100 is trained based on the pathological image recognition model training method provided below.

The memory 13 is used to store an application program; the processor 11 is used to execute the application program stored in the memory 13, and the application program may be the application program stored in the storage medium as described above, that is, the storage medium may be included in the memory 13. When executing the application program, the functions and steps as described above may also be implemented, and the corresponding technical effects may be achieved.

Other structural features, such as possible functional partitions and module adjustments, can be determined according to the application program it carries. Specifically, in the pathological image recognition model training system 100, or in a pathological image recognition model training device, it can include a data acquisition module for acquiring a sample image set, a model construction module for constructing a first neural network model and a second neural network model, a data operation module for calculating a first loss function and a second loss function, and an iterative training module for iteratively training the first neural network model and the second neural network model.

One embodiment of the present invention provides a pathological image recognition model training method as shown in FIG2. The program or instruction used in the method can be carried in the above-mentioned storage medium and/or the above-mentioned pathological image recognition model training system and/or the above-mentioned pathological image recognition model training device to achieve the technical effect of training the pathological image recognition model. The pathological image recognition model training method specifically includes the following steps.

Step 21: Receive a sample image set.

Step 22, according to the sample image set, call the first neural network model to perform supervised training and traversal reasoning in sequence, and call the second neural network model to perform supervised training based on the reasoning result to calculate the first loss function.

Step 23, according to the sample image set, call the second neural network model to perform supervised training and traversal reasoning in sequence, and call the first neural network model to perform supervised training based on the reasoning result to calculate the second loss function.

Step 24, iteratively training the first neural network model and the second neural network model according to the first loss function and the second loss function to obtain at least one of the first model training parameters and the second model training parameters.

The sample image set can be specifically interpreted as an image set or image data set for training a pathological image recognition model, and its content can point to any part that needs to be recognized and analyzed for pathological images. For example, the sample image set can point to the stomach, intestines, and other parts of the digestive system. Based on the limitations of its own use, the sample image set can at least include some images pointing to the lesion site or the pre-lesion site.

In the scenario where the trained pathological image recognition model is used for early warning of digestive system cancer, at least some of the sample images in the sample image set can be configured to characterize intraepithelial neoplasia and intestinal metaplasia. Among them, intestinal metaplasia is generally considered to be an early manifestation of cancer, which can be divided into two types: small intestinal metaplasia and colonic metaplasia. Furthermore, considering that colonic metaplasia has a higher risk of malignant cancer, the number of sample images representing colonic metaplasia in the sample image set can be configured to be larger, or it can be given a higher weight in training.

Preferably, the sample image set includes a set of labeled sample images and a set of unlabeled sample images. For the set of labeled sample images, the present invention does not limit the way of labeling sample images, which may be to provide a unified label for partial areas. Similarly, the present invention does not limit the form of sample image labeling. As a preferred implementation, the labeling of sample images may be to classify each pixel, and finally form a mask that is adapted to the size of the sample image, so that the sample image and the corresponding mask together constitute the set of labeled sample images. The set of labeled sample images should at least include some sample images pointing to the lesion site and the pre-lesion site, while for the set of unlabeled sample images, it may include sample images pointing to the lesion site or the pre-lesion site, and may also include sample images that do not include lesion or pre-lesion features.

The first neural network model and the second neural network model can be any neural network model that can support supervised training and inference prediction. The first loss function represents the deviation between the model inference and the actual classification in the process of calling the first neural network model and the second neural network model for training in sequence. The second loss function represents the deviation between the model inference and the actual classification in the process of calling the second neural network model and the first neural network model for training in sequence.

Based on this, the present invention provides a preferred embodiment, which aims to build a better neural network model to adapt to the application scenario of pathological image recognition and improve the efficiency of model training. This embodiment specifically includes the steps of: selecting a semantic segmentation backbone model based on a fully convolutional network as a basic backbone model; performing model initialization based on the basic backbone model according to the first weight configuration parameter and the second weight configuration parameter, respectively, to obtain the first neural network model and the second neural network model.

In this way, with the fully convolutional network (FCN) as the structural basis, the deconvolution operation can replace the last fully connected layer of the traditional convolutional neural network (CNN), so as to maintain the consistency of the image output size with the input size during training, inference and prediction to meet the needs of refined prediction (for example, prediction for each pixel).

In addition, the selection of a backbone model that supports semantic segmentation as the basic backbone model can achieve pixel-level classification, so that when dealing with diverse classification needs, it can accurately segment the lesion or pre-lesion area from the background area, providing medical workers with a more accurate and reliable reference.

The first weight parameter and the second weight parameter are preferably configured to be generated based on different parameter initialization strategies, so that the corresponding first neural network model and the second neural network model have independent internal characteristics on the basis of maintaining parallel training. In this way, the generalization ability of the first model training parameters or the second model training parameters finally generated is improved. Because the first neural network model and the second neural network model are configured to be built based on the same basic backbone model, there is no need to make adaptive adjustments to the input sample image set for the model, and the form of the output data information is also similar, which makes it easier to compare with each other and calculate the overall loss function for performance evaluation.

Preferably, the first neural network model and the second neural network model are both equipped with a softmax activation function and are configured to have the same optimizer and learning rate adjustment strategy. Thus, it is possible to further ensure that the basic configurations of the two neural network models remain consistent and can be trained in parallel in comparison with each other. Among them, the softmax activation function is used to adapt to a larger number of classification requirements. For example, a single pixel or pixel area can be identified and determined in three categories: background, intraepithelial neoplasia, and intestinal metaplasia. The determination information can be in the form of a classification probability value.

In a preferred specific example, the basic backbone model is configured based on the U-Net network architecture. Compared with the traditional full convolutional network's skip connection, the U-Net network architecture chooses to superimpose features when resizing, thereby doubling the number of channels and taking into account both global and local features, thereby adapting to multi-scale prediction and deep supervision.

The first weight configuration parameter is preferably set to be generated based on the Xavier parameter initialization strategy, and the second weight configuration parameter is preferably set to be generated based on the Kaiming parameter initialization strategy. The former performs better when applied to the tanh activation function operation scenario, and can solve the problem of gradient disappearance caused by the Gaussian distribution as the depth of the neural network increases to a certain extent. The latter focuses more on the ability of nonlinear activation functions such as the relu activation function, and can also improve the problem of data variance decreasing layer by layer to a certain extent. In an application scenario, the above parameter initialization strategy can be implemented based on the PyTorch learning library, and the above first weight configuration parameter and the second weight configuration parameter can be interpreted as having different tensor parameters (tensor). It can be seen that the two weight parameters are not necessarily limited to being generated using the above two parameter initialization strategies.

In addition, the first neural network model and the second neural network model can be configured to have the same stochastic gradient descent (SGD) optimizer, so that the performance of the neural network model can be evaluated in real time and give it a faster learning speed. Of course, the present invention does not exclude the use of batch gradient descent, mini-batch gradient descent and other methods to construct the optimizer. The learning rate adjustment strategy is configured so that the model learning rate value decreases with the increase in the number of iterations, so that the performance of the neural network model gradually tends to be stable. For the model learning rate value, a maximum model learning rate value can be set as the basic learning rate value at the time of initialization. The basic learning rate value is preferably 0.01.

Preferably, in order to enhance the stability of the learning rate change during the iteration process, the model learning rate value can be specifically configured to be equal to the product of the preset exponential power of the ratio of the remaining number of iterations to the total number of iterations and the basic learning rate value. Define the current number of iterations as n, the total number of iterations as max_iter, the preset exponent value as i, and the basic learning rate value as Li, then the model learning rate value is at least configured to satisfy:

Specifically, the basic learning rate value can be configured as 0.01, the preset exponent value can be configured as 0.9, and the model learning rate value can be configured as Minimum configuration is required to meet:

In the first embodiment provided based on the above implementation, the present invention configures different training strategies for the first neural network model and the second neural network model according to the labeled sample image set and the unlabeled sample image set in the sample image set, mainly using supervised training and reasoning, taking the pseudo-label reasoning result as the basis for the second-level supervised training, and further making full use of the sample image set, especially the relatively scarce labeled sample image set, to improve the generalization recognition ability and prediction accuracy of the model. As shown in Figure 3, the first embodiment specifically includes the following steps.

Step 21: Receive a sample image set.

Step 221, after calling the first neural network model to perform supervised training according to the labeled sample image set, calling the first neural network model to perform traversal reasoning according to the unlabeled sample image set to obtain a first recognition pseudo-label set corresponding to the unlabeled sample image set.

Step 222: Based on the unlabeled sample image set and the first recognition pseudo-label set, call the second neural network model to perform supervised training and calculate the first loss function.

Step 231, after calling the second neural network model to perform supervised training according to the labeled sample image set, calling the second neural network model to perform traversal reasoning according to the unlabeled sample image set to obtain a second recognition pseudo-label set corresponding to the unlabeled sample image set.

Step 232: Based on the unlabeled sample image set and the second recognition pseudo-label set, call the first neural network model to perform supervised training, and calculate the second loss function.

In this way, on the one hand, model training can be performed in two directions: "from the first neural network model to the second neural network model" and "from the second neural network model to the first neural network model"; on the other hand, the unlabeled sample image set can be inferred through the supervised training model, and the identified pseudo-labels and the unlabeled sample image set can be used as the "labeled sample image set" for further supervised training, thereby improving the performance of the model, and obtaining model training parameters with better accuracy and stability through iteration, and reducing the dependence on the demand for a large number of labeled sample image sets.

Preferably, for this embodiment or any embodiment under this embodiment, or for any embodiment mentioned below, the first loss function and the second loss function may also have the following configuration. First, the first loss function is configured as the weighted sum of the first supervised loss function and the first pseudo-label loss function, and the second loss function is configured as the weighted sum of the second supervised loss function and the second pseudo-label loss function. In this way, the first loss function and the second loss function can be used as overall model evaluation parameters for step 22 and step 23, respectively, covering the entire process of training in the above two directions and enhancing the effect of iterative training.

Specifically, the first supervised loss function refers to the supervised training process of the first neural network model based on the sample image set, and the first pseudo-label loss function refers to the supervised training process of the second neural network model based on the reasoning result. Preferably, the "based on the reasoning result" in the first embodiment of the above-mentioned implementation mode can be specifically interpreted as "based on the first identification pseudo-label set". The second supervised loss function refers to the supervised training process of the second neural network model based on the sample image set, and the second pseudo-label loss function refers to the supervised training process of the first neural network model based on the reasoning result. The "based on the reasoning result" in the first embodiment of the above-mentioned implementation mode can be specifically interpreted as "based on the second identification pseudo-label set".

In this way, based on the loss function corresponding to the overall construction step as an evaluation parameter, the supervised training process based on labeled data and pseudo-labeled data can be included to improve the generalization recognition ability of the model and reduce the demand for labeled data.

The technical solution provided above is intended to correspond to the training process corresponding to the loss function, and for the specific type of loss function, in a specific example, any of the above loss functions can be configured as a cross entropy loss function, or a combination of a cross entropy loss function and an intersection-over-union loss function. Preferably, in the case where the overall effectiveness of the model is given priority, the latter combination scheme can be used to configure the loss function, and in the case where the stability and certainty of the training process are given priority, the former single scheme can be used to configure the loss function.

Based on this, the present invention provides a preferred solution to configure the loss function type according to the effects of the above-mentioned different loss functions. In this preferred solution, the first supervised loss function is configured as the sum of the first supervised cross entropy loss function and the first supervised intersection-over-union loss function. Among them, the first supervised cross entropy loss function characterizes the gap between the known label data in the sample image set and the corresponding inference classification probability, and the first supervised intersection-over-union loss function characterizes the gap between the known label data in the sample image set and the corresponding inference classification category.

In an embodiment where the sample image set includes a labeled sample image set, the known label data may be label data such as a mask in the labeled sample image set. The inference classification probability corresponding to the known label data may be the inference classification probability of all pixels in the labeled sample image by the first neural network model. The inference classification category corresponding to the known label data may be the inference classification category of all pixels in the labeled sample image by the first neural network model.

The first pseudo label loss function is configured to include a first pseudo label cross entropy loss function. The invention relates to a method for characterizing the gap between the inference classification probability of the first neural network model for the sample image set and the inference classification category of the second neural network model for the sample image set.

In an embodiment where the sample image set includes an unlabeled sample image set, the first pseudo-label cross entropy loss function can characterize the gap between the inferred classification probability of all pixels in the unlabeled sample images by the first neural network model and the inferred classification category of all pixels in the unlabeled sample images by the second neural network model.

The second supervised loss function is configured as the sum of a second supervised cross entropy loss function and a second supervised intersection-over-union loss function. Among them, the second supervised cross entropy loss function characterizes the gap between the known label data in the sample image set and the corresponding inference classification probability. The second supervised intersection-over-union loss function characterizes the gap between the known label data in the sample image set and the corresponding inference classification category. Specifically, the inference classification probability corresponding to the known label data may be the inference classification probability of the second neural network model for all pixels in the labeled sample image. The inference classification category corresponding to the known label data may be the inference classification category of the second neural network model for all pixels in the labeled sample image.

The second pseudo-label loss function includes a second pseudo-label cross entropy loss function. The second pseudo-label cross entropy loss function represents the difference between the inference classification probability of the second neural network model for the sample image set and the inference classification category of the first neural network model for the sample image set. Specifically, the second pseudo-label cross entropy loss function can represent the difference between the inference classification probability of all pixels in the unlabeled sample image by the second neural network model and the inference classification category of all pixels in the unlabeled sample image by the first neural network model.

Define the known label data corresponding to the labeled sample image (which may be the lesion annotation mask corresponding to the image, or the classification coding label corresponding to each pixel on the mask) as label ^L ; define the inference classification probability of all pixels in the labeled sample image by the first neural network model as out_prob_1 ^L , the inference classification category of all pixels in the labeled sample image by the first neural network model as out_class_1 ^L , the inference classification probability of all pixels in the unlabeled sample image by the first neural network model as out_prob_1 ^U , and the inference classification category of all pixels in the unlabeled sample image by the first neural network model as pseudo_label_1 ^U (that is, the first recognition pseudo label set); define the inference classification probability of all pixels in the labeled sample image by the second neural network model as out_prob_2 ^L , the inference classification category of all pixels in the labeled sample image by the first neural network model as out_class_2 ^L , the inference classification probability of all pixels in the unlabeled sample image by the first neural network model as out_prob_2 ^U , and the inference classification category of all pixels in the unlabeled sample image by the first neural network model as pseudo_label_2 ^U (That is, the second identification pseudo-label set). Then, the first supervised loss function at least satisfies:

supervised_loss_1＝ce_loss(out_prob_1 ^L ,label ^L )+dice_loss(out_class_1 ^L ,label ^L );

Wherein, the ce_loss(out_prob_1 ^L ,label ^L ) is the first supervised cross entropy loss function, and the dice_loss(out_class_1 ^L ,label ^L ) is the first supervised intersection-over-union loss function.

The first pseudo label loss function at least satisfies: pseudo_loss_1=ce_loss(out_prob_1 ^U ,pseu_label_2 ^U );

Among them, the ce_loss(out_prob_1 ^U ,pseudo_label_2 ^U ) is the first pseudo-label cross entropy loss function.

The second supervised loss function at least satisfies:

supervised_loss_2=ce_loss(out_prob_2 ^L ,label ^L )+dice_loss(out_class_2 ^L ,label ^L );

Wherein, the ce_loss(out_prob_2 ^L ,label ^L ) is the second supervised cross entropy loss function, and the dice_loss(out_class_2 ^L ,label ^L ) is the second supervised intersection-over-union loss function.

The second pseudo label loss function at least satisfies: pseudo_loss_2=ce_loss(out_prob_2 ^U ,pseu_label_1 ^U );

Wherein, the ce_loss(out_prob_1 ^U ,pseudo_label_2 ^U ) is the second pseudo-label cross entropy loss function.

In one embodiment mentioned above, the sample images in the sample image set characterize the intraepithelial neoplasia and intestinal metaplasia. Thus, the first supervised cross entropy loss function, the first pseudo-label cross entropy loss function, the second supervised cross entropy loss function and the second pseudo-label cross entropy loss function point to the background area, the intraepithelial neoplasia area and the intestinal metaplasia area in the sample image, that is, they are configured as a three-class average cross entropy loss (cross-entropy loss) function. The first supervised intersection-of-union loss function and the second supervised intersection-of-union loss function point to the intraepithelial neoplasia area and the intestinal metaplasia area in the sample image, that is, they are configured as a two-class average intersection-of-union loss (dice loss) function.

For the weight of the first pseudo-label loss function in the first loss function, and the weight of the second pseudo-label loss function in the second loss function, it is preferred to configure them to have equal preset weight values, so as to enhance the consistency of model evaluation under the two training directions. Furthermore, the preset weight value can be configured to increase with the increase in the number of training iterations, that is, the two are configured to be positively correlated. In this way, the confidence in the pseudo-label loss function is gradually improved, so that the model training process gradually tends to be stable. Of course, the present invention does not exclude the technical solution of configuring the preset weight value as a fixed value, so that the degree of participation of the pseudo-label loss function in the model evaluation process can be kept within a stable range.

Specifically, for the technical solution of configuring the preset weight value as a dynamically changing value, the present invention provides a preferred configuration mode, wherein the preset weight value is configured to be equal to the product of the maximum weight value and a preset increasing function, and the preset increasing function is configured so that the function value infinitely approaches 1. Preferably, the preset increasing function is configured to increase smoothly from 0 and infinitely approaches 1 with a smaller slope.

Based on this, in one implementation, the Euler number can be used as the base to construct an exponential function that changes with the number of iterations, thereby realizing the above configuration. Define the maximum weight value as λ _max , the current number of iterations is n, and the preset weight value at least satisfies:

The symbol "//" represents the integer division downwards, and is used to return the integer part of the integer division result. Based on the above configuration, the preset weight value can have a more gentle change trend. The maximum weight value λ _max is preferably 0.1.

Of course, the present invention can also use a linear function as the above-mentioned preset increasing function, define the current number of iterations as n, and the total number of iterations as max_iter, then the preset weight value at least satisfies:

In this way, the first 80% of the training steps can be configured incrementally for the preset weight values, and the last 20% can keep the preset weight values unchanged.

It is understandable that although the above-mentioned embodiment of the present invention takes "obtaining at least one of the first model training parameter and the second model training parameter" as the last step, it does not mean that only the model training parameters can be obtained by executing the technical solution provided by the present invention. Those skilled in the art can certainly generate a corresponding neural network model based on the model training parameters for pathological image recognition. Based on this, the present invention may also include a supplementary step after step 24: loading the first model training parameters to initialize the first neural network model, and/or loading the second model training parameters to initialize the second neural network model to obtain a pathological image recognition model.

It can be understood that the termination condition of the above iterative training process can be specifically configured to stop when the loss function is reduced and stabilized within a preset range.

It should be noted that the reasoning test process of the pathological image recognition model training method provided by the present invention can be performed on a separate verification set, and is configured to verify the trained neural network model after completing each round of training, so as to obtain the loss function index corresponding to the above, so as to select the optimal node (that is, the model training parameters described above). It can be seen that the pathological image recognition model training method provided by the present invention is not only included in the iterative process on the training set, but also includes the process of model evaluation and selection on the verification set. In addition, the total number of iterations is defined as epoch, and the total number of iterations max_iter can correspond to the product of the total number of iterations epoch and the number of iterations required to traverse all the data in the sample image set.

Similarly, the present invention does not limit the step 21 to include other pre-steps. For example, in another embodiment of the present invention, a generation process of the sample image set is provided, and a reference pathological image set with different morphological characteristics is standardized and grouped into a training set and a validation set, so as to facilitate the subsequent training process. In combination with FIG. 2 and FIG. 4, the other embodiment specifically includes the following steps.

Step 31: Receive a reference pathology image set.

Step 32, performing size standardization processing and color migration standardization processing on the reference pathology image set in sequence, and calculating to obtain a standard pathology image set. The standard pathology image set includes a labeled pathology image set and an unlabeled pathology image set.

Step 33, grouping the annotated pathological image sets, combining the first annotated image set with the unannotated pathological image set to form a sample image training set, and forming a sample image verification set based on the second annotated image set.

Step 34: Generate a sample image set based on the sample image training set and the sample image verification set.

Step 21: Receive a sample image set.

In this way, each group of images or image data in the reference pathological image set can be processed into images or image data with uniform size and staining conditions, avoiding the influence of external factors such as staining on the accuracy of subsequent model training process and model training parameters. At the same time, the sample image training set is configured to include both annotated pathological images and unannotated pathological images, which can adapt to the special configuration of the subsequent training process and reduce the demand for annotated pathological images.

The reference pathological images in the reference pathological image set can be interpreted as a reference image set that at least contains some labeled specimen images. The reference image set is used to generate the sample image set and is put into model training. Based on the fact that the sample image set includes a sample image training set and a sample image verification set, any of the steps mentioned above regarding iterative training on the training set can be configured to be performed on the sample image training set, and any of the steps regarding evaluation and selection on the verification set can be configured to be performed on the sample image verification set, which is not described in detail in the present invention.

The present invention provides a first embodiment based on the above another embodiment, wherein the reference pathology image set is configured to be generated by selectively annotating pixels according to different types of lesion specimen images, and the multiple images thus formed are respectively standardized to obtain different components of the standard pathology image set. In conjunction with FIG. 2 and FIG. 5 , the first embodiment specifically includes the following steps.

Step 301: receiving a precancerous lesion specimen image and a non-precancerous lesion specimen image.

Step 302 : pixel-annotate some precancerous lesion specimen images to obtain lesion annotation masks.

Step 303 : generating a reference pathology image set according to the precancerous lesion specimen image, the corresponding lesion annotation mask, and the non-precancerous lesion specimen image.

Step 31: Receive a reference pathology image set.

Step 32, performing size standardization and color migration standardization on the reference pathology image set in sequence, and calculating to obtain a standard pathology image set. The step 32 specifically includes:

Step 321, performing size standardization processing and color migration standardization processing on all labeled lesion specimen images in sequence, and obtaining a set of labeled pathology images based on the processed labeled lesion specimen images; wherein the labeled lesion specimen images correspond to precancerous lesion specimen images with corresponding lesion annotation masks;

Step 322, size normalization processing and color migration normalization processing are performed on all unlabeled lesion specimen images and all non-precancerous lesion specimen images in sequence, and a set of unlabeled pathological images is obtained by calculation based on the processed unlabeled lesion specimen images and non-precancerous lesion specimen images; wherein the unlabeled lesion specimen images correspond to precancerous lesion specimen images without corresponding lesion annotation masks.

Step 21: Receive a sample image set.

Step 23, based on the sample image set, call the second neural network model to perform supervised training and traversal reasoning in sequence, and call the first neural network The network model is supervisedly trained based on the inference results, and the second loss function is calculated.

When the above-mentioned pathological image recognition model training method is applied to the scenario of early warning of the digestive system, the precancerous lesion specimen image can be interpreted as: a specimen image with intraepithelial neoplasia or intestinal metaplasia. The non-precancerous lesion specimen image can be correspondingly interpreted as: a specimen image that does not contain the above-mentioned phenomenon. By implementing the above-mentioned technical solution, only some precancerous lesion specimen images can be pixel-annotated, thereby reducing cost consumption.

The configuration of the number or amount of data of the above-mentioned images or image data can be specifically: the number of the labeled lesion specimen images accounts for 30% of the number of all precancerous lesion specimen images. Compared with the 100% required for fully supervised training, it can greatly reduce costs, improve efficiency and utilization of labeled data. In addition, the number of all non-precancerous lesion specimen images accounts for 20% of the number of all precancerous lesion specimen images, which can enhance the generalization recognition ability of the model.

In order to ensure the smooth progress of subsequent model training, the present invention further provides a specific example for step 321 and step 322 based on the above first embodiment. In this specific example, step 32 specifically includes the steps of: performing sliding window region segmentation on the reference pathological image that has completed the size standardization processing and color migration standardization processing, and obtaining and calculating a standard pathological image set based on multiple groups of sliding window region image groups. In this way, the reference pathological image can be cut into a size suitable for model input, thereby facilitating the traversal and iterative training of the model.

The sliding window region images in the sliding window region image group have a size of 256*256. The step size for performing sliding window region segmentation can be any pixel size between 0.25 and 0.5 times of any side of the sliding window region image, for example, 128 pixels. Thus, a 50% overlap is formed during the traversal process, so that various edge features are effectively covered.

Specifically, as shown in FIG6 , the “sliding window region segmentation” may include:

Step 3211, constructing an image area sliding window of a preset size, and making the image area sliding window perform traversal segmentation on the annotated standardized image and the corresponding lesion annotated mask according to a preset step length, to obtain multiple groups of annotated sliding window image groups and annotated sliding window mask groups;

Step 3212, traverse, analyze and filter and update the labeled sliding window image and the corresponding labeled sliding window mask according to the proportion of the lesion area of all labeled sliding window masks in the labeled sliding window mask group;

Step 3221, causing the image region sliding window to perform traversal segmentation on the unlabeled standardized image and the non-lesion standardized image according to a preset step length, to obtain multiple groups of unlabeled sliding window image groups and non-lesion sliding window image groups;

Step 3222, traverse, analyze and filter and update the unlabeled sliding window images and the non-lesion sliding window images according to the tissue area ratio of the unlabeled sliding window images and the non-lesion sliding window images.

The annotated standardized image is an annotated lesion specimen image after standardization. The unannotated standardized image is an unannotated lesion specimen image after standardization. The non-lesion standardized image is a non-precancerous lesion specimen image after standardization. In this way, a sliding window region image group corresponding to annotated lesion specimen images, a sliding window region mask group corresponding to lesion annotation masks, a sliding window region image group corresponding to unannotated lesion specimen images, and a sliding window region image group corresponding to non-precancerous lesion specimen images can be obtained to serve as data inputs for the model respectively.

On the one hand, the sliding window area image in the above sliding window area image group can be an RGB image. Therefore, the data type input into the neural network model for iteration can be an RGB matrix corresponding to the RGB image, and specifically can be a multi-channel RGB matrix of (256, 256, T). The number of channels T can be determined according to the number of categories to be identified. For the above application scenario of early warning of digestive system cancer, the number of channels T = 3, which respectively refer to the background, intraepithelial neoplasia and intestinal metaplasia.

Furthermore, in the annotation sliding window mask, the blue color pointed to by the RGB value (0, 0, 255) can be used to represent the background, the red color pointed to by the RGB value (255, 0, 0) can be used to represent the intraepithelial neoplasia, and the green color pointed to by the RGB value (0, 255, 0) can be used to represent the intestinal metaplasia. In addition, the above specimen image can be specifically made by a unified staining method (for example, Hematoxylin-Hosin Staining) and saved in a unified format (for example, svs format or kfb format, etc.). The corresponding generated annotation sliding window mask can be configured as a PNG (Portable Network Graphics) file. The way of annotation can be specifically annotated by tools such as ASAP (Automated Slide Analysis Platform) or labelme.

On the other hand, the process of updating and screening the annotated sliding window image and the corresponding annotated sliding window mask can be specifically configured to screen according to the coverage of the lesion site in the central area, and screen and retain the annotated sliding window image and the annotated sliding window mask with a coverage higher than a preset percentage. In the above-mentioned digestive system scenario, when the sliding window image size is 256*256, an area of 64*64 pixels in size at the center position of the annotated sliding window mask can be intercepted. When the coverage area of any lesion is greater than or equal to one-third of the area, the annotated sliding window image and the annotated sliding window mask corresponding to the area are retained. Any of the lesions can be interpreted as one of intraepithelial neoplasia or intestinal metaplasia. In this way, the amount of data processing in the screening and updating process can be reduced, and the central area that can better summarize the content of the annotated sliding window image can be selected for analysis, thereby speeding up the overall work efficiency.

The process of updating and screening the unlabeled sliding window images and the non-lesion sliding window images can be specifically configured to be performed according to the overall tissue area ratio, and the unlabeled sliding window images and the non-lesion sliding window images whose resistance area ratio is higher than a preset percentage are screened and retained. In the above-mentioned digestive system scenario, The area with a lower grayscale value (such as a grayscale value lower than 210) is calculated as the tissue area, and the proportion of the area in the overall image is calculated and compared with a preset 30% or other value. If it is greater than 30%, it is retained. It can be understood that since this part does not contain lesions or other features that need to be classified, the unlabeled sliding window image and the non-lesion sliding window image can be set as a background color (for example, blue) as a whole.

On the other hand, in addition to directly using the above-mentioned sliding window area image group and the annotated sliding window mask group for training (that is, using the updated annotated sliding window image and the corresponding annotated sliding window mask directly as the annotated pathology image set, and using the updated unannotated sliding window image and the non-lesion sliding window image directly as the unannotated pathology image set), in this specific example, the above-mentioned data can also be augmented to further enhance the generalization recognition ability of the model. Specifically, as shown in Figure 6, the step 3212 may include step 3213: performing random data augmentation processing on the annotated sliding window image and the corresponding annotated sliding window mask to obtain a set of annotated pathology images. The step 3222 may include step 3223: performing random data augmentation processing on the unannotated sliding window image and the non-lesion sliding window image to obtain a set of unannotated pathology images.

Specifically, the "random data augmentation" may include the steps of: performing at least one of horizontal flipping, vertical flipping, rotation at a preset angle, and transposition on the image matrix according to a preset probability. Thus, by adjusting the image morphology, different data based on the image is generated. The preset probability is preferably 50%. The preset angle is preferably 90°.

For the form of the above-mentioned lesion annotation mask, in addition to being configured as a PNG file format, the content used to annotate pixels thereon can be specifically configured in the form of a unique hot coding label. In other words, the lesion annotation mask includes a unique hot coding label corresponding to each pixel in the precancerous lesion specimen image. Specifically, the unique hot coding label includes a first coding bit, a second coding bit, and a third coding bit that respectively characterize the background judgment label, the intraepithelial neoplasia judgment label, and the intestinal metaplasia judgment label. For example, when the unique hot coding label corresponding to a certain pixel is (0, 0, 1), it represents that the pixel belongs to the background part, if it is (1, 0, 0), it represents that the pixel belongs to the intraepithelial neoplasia part, and if it is (0, 1, 0), it represents that the pixel belongs to the intestinal metaplasia part.

This implementation does not conflict with the technical solution of classifying by different colors formed by RGB values in the previous text. The above-mentioned one-hot encoding label can be interpreted as obtained after normalization of the RGB image or RGB matrix. Based on this, the corresponding position of the present invention can also include a step of normalizing the lesion annotation mask or the annotation sliding window mask.

For the size standardization and color migration standardization processes proposed in step 32 and its derivative steps, the present invention provides the following preferred scheme in another embodiment.

First, the size normalization process can be specifically configured to adjust the magnification of the reference pathology image, that is, the step 32 and its derivative steps can specifically include the steps of: performing size normalization on the reference pathology image set, unifying all reference pathology images to a preset magnification. Preferably, the preset magnification is 10 times, and the initial magnification of the reference pathology image may be 5 times, 10 times, 20 times or 40 times.

Furthermore, when the reference pathology image is configured as an RGB image, in order to ensure that the processed lesion annotation mask corresponding to the precancerous lesion specimen image in the reference pathology image only contains pixel values of predetermined categories (blue, red, and green at the RGB level), the downsampling interpolation method may use the nearest neighbor interpolation method.

Secondly, the color migration standardization process may include the refinement steps shown in FIG. 7 , that is, step 32 in FIG. 4 and its derivative steps, which may specifically include the following steps.

Step 41, receiving a reference dyeing image, performing color space conversion on it, and calculating a reference dyeing vector matrix.

Step 42: Receive a reference pathological image, perform color space conversion on it, and calculate a reference color density matrix.

Step 43: Generate a color migration image corresponding to the reference pathological image according to the reference staining vector matrix and the reference color density matrix.

In this way, there is no need to perform complex migration coefficient calculations, and the color migration process can be completed directly according to the reference coloring vector matrix, which has a better color migration effect, does not excessively increase the amount of calculation, and simplifies the calculation logic.

In a first embodiment based on the above further implementation manner, the above step 41 may specifically include the following steps shown in FIG. 8 .

Step 411, receiving a reference staining image, performing optical density matrix conversion processing, and obtaining a reference optical density matrix.

Step 412, performing singular value decomposition on the reference optical density matrix, selecting the first singular extremum and the second singular extremum to create a projection plane.

Step 413, determine at least one reference singular value and its reference plane axis on the projection plane, project the reference optical density matrix onto the projection plane, fit the connecting lines of all numerical points on the projected reference optical density matrix and the origin of the projection plane, calculate the angle between the connecting line and the reference plane axis, find the maximum value among all the angles, and obtain the maximum angle data.

Step 414, calculate the optical density matrix corresponding to the maximum angle data, and perform a normalization operation on the optical density matrix to obtain a reference staining vector matrix.

In this way, the reference staining image formed by the hematoxylin-eosin staining method can be separated on the staining level with high efficiency, and the reference staining vector matrix representing the staining degree can be extracted, so as to be directly replaced in the subsequent steps to achieve the effect of color migration.

The reference staining image can be interpreted as a reference pathological image with better staining quality. Therefore, it can be used as a benchmark to perform color migration standardization processing on other reference pathological images. The optical density matrix conversion processing can be interpreted as: converting the reference staining image in the RGB color domain into The optical density matrix is converted to a reference optical density matrix in the OD (Optical Density) optical density domain. In this process, the process of removing pixels whose optical density values are less than a preset optical density threshold value may also be included.

The singular value decomposition can be interpreted as: decomposing the reference optical density matrix into a unitary matrix U, an eigenvalue square root Σ and the transposed product of another unitary matrix V. Based on this, the present invention uses the eigenvalue square root Σ to establish a projection plane, and specifically, uses the more typical eigenvalues therein to characterize the staining tendency of the two dyes, thereby extracting the reference staining vector matrix. At this time, the two largest vectors in the singular value vector, that is, the first singular extreme value and the second singular extreme value, can be used as a reference for calculating the more typical eigenvalue.

The “projecting the reference optical density matrix onto the projection plane” may also include: normalizing the projected values. Calculating the angle extreme value thereafter can simplify the operation steps and reduce errors to a certain extent. The “at least one reference singular value” may be any singular value on the projection plane, preferably one of the first singular extreme value and the second singular extreme value, and the “reference plane axis on the projection plane” may correspond to the number axis formed by the first singular extreme value on the projection plane or the number axis formed by the second singular extreme value on the projection plane.

The final generated reference staining vector matrix records the staining tendency of the reference staining image and removes other tissue region contents. At this time, the vector elements in the reference staining vector matrix reflect the staining degree of the two staining agents, hematoxylin and eosin.

In other words, the reference optical density matrix satisfies OD _target =C _target ×S _target , where C _target is the reference color density matrix of the reference staining image, and S _target is the reference staining vector matrix of the reference staining image. After the above steps, the reference staining vector matrix can be extracted.

In a first embodiment based on the above further implementation manner, the above step 42 may specifically include the following steps shown in FIG. 8 .

Step 421, receiving a reference pathological image, and sequentially performing optical density matrix conversion, singular value decomposition, plane projection, and maximum angle data acquisition on it, to calculate a reference optical density matrix and a reference staining vector matrix corresponding to the reference pathological image.

Step 422: Calculate a reference color density matrix corresponding to the reference pathological image based on the reference staining vector matrix and the reference optical density matrix.

The parts of "optical density matrix conversion", "singular value decomposition", "plane projection" and "maximum angle data acquisition" in step 421 can be replaced by implementing the technical solutions and related explanations of the above steps 411 to 414, which will not be repeated here.

For the reference pathological image, its reference optical density matrix also satisfies OD _source = C _source × S _source . Among them, C _source is the reference color density matrix of the reference staining image, and S _source is the reference staining vector matrix of the reference staining image. After step 421, the reference staining vector matrix can be extracted, and after step 422, the reference color density matrix can be calculated according to the above-mentioned operation relationship. Thus, the reference color density matrix and the benchmark staining vector matrix can be reorganized by "cross multiplication" to generate an optical density matrix after color migration (that is, OD _{source_norm} = C _source × S _target ). Thus, an inverse transformation relative to the color space conversion of step 41 is performed to restore the optical density matrix after color migration to the RGB color domain, and finally the color migration image is obtained.

The above-mentioned various implementations, embodiments or specific examples provided by the present invention can be combined with each other to finally form multiple better implementations. FIG9 shows the conversion process of related images or image data when executing one of the better implementations.

After receiving the precancerous lesion specimen image, the labeled lesion specimen image and the unlabeled lesion specimen image are correspondingly formed through the lesion area marking. The labeled lesion specimen image also includes a corresponding lesion annotation mask. The labeled lesion specimen image is processed by size standardization, color migration standardization, etc. to generate the labeled standardized image, and further subjected to sliding window area segmentation to generate the labeled sliding window image group. In this process, the lesion annotation mask also undergoes the above corresponding steps to finally generate a labeled sliding window mask group corresponding to the labeled sliding window image group, and the two together constitute the labeled pathological image set. Continuing, through the preset training set and verification set ratio relationship, the labeled pathological image set can be divided into a first labeled image set (or, labeled sample image set) and a second labeled image set, the former participating in the composition of the sample image training set, and the latter participating in the evaluation and selection link of the model as the sample image verification set.

The unlabeled lesion specimen image generated based on the precancerous lesion specimen image is processed by size standardization, color migration standardization, etc. to generate the unlabeled standardized image, and further processed by sliding window area segmentation to generate the unlabeled sliding window image group. In addition, after receiving the non-precancerous lesion specimen image, the non-lesion standardized image is generated after size standardization, color migration standardization, etc., and further processed by sliding window area segmentation to generate the non-lesion sliding window image group. The unlabeled sliding window image group and the non-lesion sliding window image group together constitute the unlabeled pathology image set (or, unlabeled sample image set), thereby, together with the first labeled image set, constituting the sample image training set.

In order to accurately identify and classify different regions in a pathological image, an embodiment of the present invention provides a pathological image recognition system and a pathological image recognition method as shown in FIG10 .

Corresponding to the above-mentioned pathological image recognition method, the present invention first provides a storage medium, which can have a corresponding pathological image recognition model training method. The configuration scheme of the pathological image recognition system can be the same or similar to that of the pathological image recognition model training system, and even the application programs of the pathological image recognition method and the pathological image recognition model training method can be set in the same storage medium. Similarly, the configuration scheme of the pathological image recognition system can also have the same or similar configuration scheme as that of the pathological image recognition model training system, which will not be repeated here.

Correspondingly, the pathological image recognition method provided in one embodiment of the present invention can also be installed in the above storage medium and/or the above pathological image recognition system. The pathological image recognition method specifically includes the following steps.

Step 51, executing a pathological image recognition model training method to obtain at least one of a first model training parameter and a second model training parameter.

Step 52, load the model training parameters into the corresponding neural network model to build a pathological image recognition model.

Step 53, receiving the pathological image data to be tested and preprocessing it, inputting the preprocessed pathological image data to be tested into the pathological image recognition model for traversal prediction, and obtaining pathological recognition data.

The pathological image recognition model training method can be the model training method provided by any of the above-mentioned implementation modes, embodiments or specific examples. Those skilled in the art can refer to the above-mentioned description and generate a variety of derived implementation modes based on steps 51 to 53, which will not be repeated here.

The corresponding neural network model can be interpreted as: a neural network model corresponding to at least one of the first model training parameters and the second model training parameters. For example, the neural network model can be the first neural network model, then the first model training parameters obtained by training are loaded into the first neural network model to construct a pathological image recognition model. The same is true when the neural network model is the second neural network model. However, it can be understood that the pathological image recognition model can also be configured to include a parallel first neural network model and a second neural network model at the same time.

It is worth noting that the pathological image data to be tested may have a format and content configuration similar to that of the sample images in the sample image set, and may especially have a form similar to that of the unlabeled sample images in the unlabeled sample image set, which will not be described in detail here.

Based on the above implementation, the present invention provides a first embodiment thereof, and the first embodiment provides a preferred technical solution for step 53. Step 53 in the first embodiment may specifically include the following steps.

Step 531 , performing size standardization processing and color migration standardization processing on the pathological image data to be tested in sequence, and obtaining a set of pathological images to be tested by calculation.

Step 532: input the pathological image set to be tested into the pathological image recognition model for traversal prediction to obtain the pathological recognition pixel area.

Step 533 , superimposing the pathology recognition pixel area on the pathology image to be detected to form a pathology judgment image.

The size standardization processing and the color migration standardization processing can refer to the technical solution provided above, and preferably adjust the size magnification ratio of the pathological image to be tested or its data, and unify the staining style tendency, so as to achieve the effect of improving the prediction accuracy.

The pathology identification pixel area can be interpreted as: the distribution of judgment results corresponding to each pixel on the pathology image to be tested. The pathology identification pixel area can specifically include a background identification pixel area, an intraepithelial neoplasia identification pixel area and an intestinal metaplasia pixel area, and each pixel has a background judgment annotation, an intraepithelial neoplasia judgment annotation and an intestinal metaplasia judgment annotation in sequence. The pathology identification pixel area can be expressed in the form of a mask similar to the lesion annotation mask, an image corresponding to the pathology image to be tested, or a data group that simply points to certain specific areas on the pathology image to be tested.

In this embodiment, although a pathology judgment image is generated, it is not limited to the pathology judgment image as the pathology identification data, and it can be presented as intermediate data. Of course, the setting of step 533 can also be cancelled and replaced by other technical solutions.

Preferably, segmentation and screening can also be performed on the pathological image data to be tested that has completed the standardization process, so as to obtain the pathological image set to be tested as the input of the neural network model. In other words, the "obtaining the pathological image set to be tested by calculation" can specifically include the steps of: performing sliding window area segmentation on the pathological image data to be tested that has completed the size standardization process and the color migration standardization process, and screening the pathological image set to be tested according to the proportion of low gray value areas in the sliding window image to be tested. The screening rules can refer to the technical solution for screening and updating unlabeled sliding window images and non-lesion sliding window images in the previous text.

Of course, the present invention does not exclude the difference from the technical solution provided above in the above-mentioned features. The resulting technical solution should also be considered to fall within the scope protected by the present invention. For example, the step length for performing sliding window region segmentation can be configured to be equal to the side length of the image region sliding window, and preferably 256 pixels.

In order to avoid the influence of false positives in the segmentation results on the qualitative diagnosis of the specimen, the pathological identification data in the first embodiment can be specifically configured to include precancerous lesion determination information. Based on this, the step 53 can further include the following steps.

Step 534, arrange the pixel values pointing to intraepithelial neoplasia and intestinal metaplasia in the pathological identification pixel area in descending order, calculate the average value of pixels within a preset number range, obtain a first average value and a second average value, and determine the numerical relationship between the first average value and the second average value and the preset precancerous lesion determination threshold.

Step 535: if one of the first average value and the second average value is greater than the precancerous lesion determination threshold, it is determined that a precancerous lesion occurs at the position represented by the pathological image to be detected corresponding to the pathological identification pixel area, and precancerous lesion determination information is output.

Preferably, the precancerous lesion determination threshold is 0.5, and the preset number range is within 10,000 pixels or within 15,000 pixels. Of course, the present invention also implicitly includes the step of: if both the first average value and the second average value are not greater than the precancerous lesion determination threshold, then the pathological recognition image is determined to be No precancerous lesions occur at the position represented by the pathological image to be tested corresponding to the element area, and precancerous lesion determination information is output.

In summary, the pathological image recognition model training method provided by the present invention constructs two parallel learning models, the first neural network model and the second neural network model, and uses the two sets of loss functions generated to train and optimize the models in comparison, thereby making full use of limited image data for training and making the performance of the neural network model more stable; using a sample image set to sequentially train the previous model to the next model, and using a sample image set to sequentially train the next model to the previous model, combined with general supervised training and pseudo-label-based supervised training, it can reduce the dependence on scarce data types such as labeled data, and implicitly treat unlabeled data as labeled data to participate in the model training process, thereby greatly improving the performance of the trained model, reducing costs and increasing training speed.

Similarly, the pathological image recognition method constructed by the pathological image recognition model (or model training data) generated based on the above training process can naturally take into account the advantages of high generalization recognition rate, low dependence on scarce data, as well as high cost and low performance.

It should be understood that although this specification is described according to implementation modes, not every implementation mode contains only one independent technical solution. This description of the specification is only for the sake of clarity. Those skilled in the art should regard the specification as a whole. The technical solutions in each implementation mode may also be appropriately combined to form other implementation modes that can be understood by those skilled in the art.

The series of detailed descriptions listed above are only specific descriptions of feasible implementation methods of the present invention. They are not intended to limit the scope of protection of the present invention. Any equivalent implementation methods or changes that do not deviate from the technical spirit of the present invention should be included in the scope of protection of the present invention.

Claims

A pathological image recognition model training method, characterized in that the method comprises:

receiving a sample image set;

According to the sample image set, calling the first neural network model to perform supervised training and traversal reasoning in sequence, and calling the second neural network model to perform supervised training based on the reasoning result, and calculating a first loss function;

According to the sample image set, calling the second neural network model to perform supervised training and traversal reasoning in sequence, and calling the first neural network model to perform supervised training based on the reasoning result, and calculating a second loss function;

The first neural network model and the second neural network model are iteratively trained according to the first loss function and the second loss function to obtain at least one of a first model training parameter and a second model training parameter.
The pathological image recognition model training method according to claim 1 is characterized in that the sample image set includes a labeled sample image set and an unlabeled sample image set.
The pathological image recognition model training method according to claim 2 is characterized in that the step of "calling the first neural network model to sequentially perform supervised training and traversal reasoning according to the sample image set, and calling the second neural network model to perform supervised training based on the reasoning result, and calculating the first loss function" specifically includes:

After calling the first neural network model to perform supervised training according to the set of labeled sample images, calling the first neural network model to perform traversal reasoning according to the set of unlabeled sample images to obtain a first recognition pseudo-label set corresponding to the set of unlabeled sample images;

According to the unlabeled sample image set and the first recognition pseudo-label set, calling the second neural network model to perform supervised training, and calculating the first loss function;

The “according to the sample image set, calling the second neural network model to sequentially perform supervised training and traversal reasoning, and calling the first neural network model to perform supervised training based on the reasoning result, and calculating the second loss function” specifically includes:

After calling the second neural network model to perform supervised training according to the set of labeled sample images, calling the second neural network model to perform traversal reasoning according to the set of unlabeled sample images to obtain a second set of identification pseudo labels corresponding to the set of unlabeled sample images;

According to the unlabeled sample image set and the second identification pseudo-label set, the first neural network model is called to perform supervised training, and the second loss function is calculated.
The pathological image recognition model training method according to claim 1 is characterized in that, before the "receiving the sample image set", the method further comprises:

receiving a reference pathology image set;

Performing size standardization processing and color migration standardization processing on the reference pathology image set in sequence, and obtaining a standard pathology image set by calculation; wherein the standard pathology image set includes a labeled pathology image set and an unlabeled pathology image set;

Grouping the annotated pathological image sets, combining the first annotated image set with the unannotated pathological image set to form a sample image training set, and forming a sample image verification set based on the second annotated image set;

The sample image set is generated according to the sample image training set and the sample image verification set.
The pathological image recognition model training method according to claim 4 is characterized in that, before the “receiving a reference pathological image set”, the method specifically comprises:

Receiving images of precancerous lesion specimens and images of non-precancerous lesion specimens;

Perform pixel annotation on some precancerous lesion specimen images to obtain lesion annotation masks;

generating the reference pathological image set according to the precancerous lesion specimen image, the corresponding lesion annotation mask, and the non-precancerous lesion specimen image;

The “performing size standardization processing and color migration standardization processing on the reference pathological image set in sequence to obtain a standard pathological image set” specifically includes:

Performing size standardization processing and color migration standardization processing on all labeled lesion specimen images in sequence, and calculating and obtaining the labeled pathology image set according to the processed labeled lesion specimen images; wherein the labeled lesion specimen images correspond to the precancerous lesion specimen images with corresponding lesion annotation masks;

Size standardization and color migration standardization are performed in sequence on all unlabeled lesion specimen images and all non-precancerous lesion specimen images, and the unlabeled pathological image set is obtained by calculation based on the processed unlabeled lesion specimen images and non-precancerous lesion specimen images; wherein the unlabeled lesion specimen images correspond to precancerous lesion specimen images without corresponding lesion annotation masks.
The pathological image recognition model training method according to claim 5 is characterized in that the number of the labeled lesion specimen images accounts for all 30% of the number of precancerous lesion specimen images; the number of all non-precancerous lesion specimen images accounts for 20% of the number of all precancerous lesion specimen images.
The pathological image recognition model training method according to claim 5 is characterized in that the "operating to obtain a standard pathological image set" specifically includes:

Performing sliding window region segmentation on the reference pathological image that has completed the size standardization processing and the color migration standardization processing to obtain and obtain the standard pathological image set based on multiple groups of sliding window region image groups; wherein the sliding window region segmentation specifically includes:

Constructing an image area sliding window of a preset size, and making the image area sliding window perform traversal segmentation on the annotated standardized image and the corresponding lesion annotated mask according to a preset step length, to obtain multiple groups of annotated sliding window image groups and annotated sliding window mask groups; wherein the annotated standardized image is an annotated lesion specimen image after the standardization process is completed;

Traversing, analyzing, and screening and updating the annotated sliding window image and the corresponding annotated sliding window mask according to the proportion of the lesion area of all annotated sliding window masks in the annotated sliding window mask group;

The image area sliding window is used to perform traversal segmentation on the unlabeled standardized image and the non-lesion standardized image according to the preset step length to obtain multiple groups of unlabeled sliding window image groups and non-lesion sliding window image groups; wherein the unlabeled standardized image is an unlabeled lesion specimen image after standardization processing, and the non-lesion standardized image is a non-precancerous lesion specimen image after standardization processing;

Traverse, analyze, and filter and update the unlabeled sliding window image and the non-lesion sliding window image according to the tissue area ratio of the unlabeled sliding window image and the non-lesion sliding window image.
The pathological image recognition model training method according to claim 7 is characterized in that after "traversing, analyzing and screening and updating the annotated sliding window image and the corresponding annotated sliding window mask according to the lesion area proportion of all annotated sliding window masks in the annotated sliding window mask group", the method specifically includes:

Performing random data augmentation processing on the annotated sliding window image and the corresponding annotated sliding window mask to obtain the annotated pathological image set;

After “traversing, analyzing, and screening and updating the unlabeled sliding window image and the non-lesion sliding window image according to the tissue area ratio of the unlabeled sliding window image and the non-lesion sliding window image”, the method specifically includes:

Performing random data augmentation processing on the unlabeled sliding window image and the non-lesion sliding window image to obtain the unlabeled pathological image set;

The random data augmentation specifically includes:

The image matrix is at least one of horizontally flipped, vertically flipped, rotated at a preset angle, and transposed according to a preset probability.
The pathological image recognition model training method according to claim 5 is characterized in that the lesion annotation mask includes a one-hot encoding label corresponding to each pixel in the precancerous lesion specimen image, and the one-hot encoding label includes a first encoding bit, a second encoding bit, and a third encoding bit that respectively characterize the background judgment label, the intraepithelial neoplasia judgment label, and the intestinal metaplasia judgment label.
The pathological image recognition model training method according to claim 4 is characterized in that the size standardization process specifically includes:

Performing size standardization processing on the reference pathology image set to unify all reference pathology images to a preset magnification;

The color migration standardization process specifically includes:

receiving a reference dyed image, performing a color space conversion on the image, and calculating a reference dyed vector matrix;

receiving a reference pathological image, performing color space conversion on the image, and calculating a reference color density matrix;

A color migration image corresponding to the reference pathological image is generated according to the reference staining vector matrix and the reference color density matrix.
The pathological image recognition model training method according to claim 10 is characterized in that the step of "receiving a reference stained image, performing a color space conversion on the image, and calculating a reference stained vector matrix" specifically includes:

receiving a reference staining image, performing optical density matrix conversion processing, and obtaining a reference optical density matrix;

Performing singular value decomposition on the reference optical density matrix, selecting a first singular extreme value and a second singular extreme value to create a projection plane;

Determine at least one reference singular value and its reference plane axis on the projection plane, project the reference optical density matrix onto the projection plane, fit the connecting line between all numerical points on the projected reference optical density matrix and the origin of the projection plane, calculate the angle between the connecting line and the reference plane axis, find the maximum value among all the angles, and obtain maximum angle data;

The optical density matrix corresponding to the maximum angle data is calculated, and after performing a normalization operation on the optical density matrix, the reference staining vector matrix is obtained.
The pathological image recognition model training method according to claim 11 is characterized in that the step of "receiving a reference pathological image, performing color space conversion on the reference pathological image, and calculating a reference color density matrix" specifically includes:

Receive a reference pathological image, and sequentially perform optical density matrix conversion, singular value decomposition, plane projection, and maximum angle data acquisition on the reference pathological image to calculate a reference optical density matrix and a reference staining vector matrix corresponding to the reference pathological image;

The reference color density matrix corresponding to the reference pathological image is calculated based on the reference staining vector matrix and the reference optical density matrix.
The pathological image recognition model training method according to claim 10 is characterized in that the method specifically comprises:

Downsampling interpolation is performed on the reference pathological image, and the magnification of the reference pathological image is set to 10 times; wherein the downsampling interpolation is nearest neighbor interpolation.
The pathological image recognition model training method according to claim 1 is characterized in that before the step of “after calling the first neural network model to perform supervised training according to the labeled sample image set, calling the first neural network model to perform traversal reasoning according to the unlabeled sample image set to obtain a first recognition pseudo-label set corresponding to the unlabeled sample image set”, the method further comprises:

The semantic segmentation backbone model based on the fully convolutional network structure is selected as the basic backbone model;

According to the first weight configuration parameter and the second weight configuration parameter, model initialization is performed based on the basic backbone model to obtain the first neural network model and the second neural network model; wherein the first neural network model and the second neural network model are both equipped with a softmax activation function and are configured to have the same optimizer and learning rate adjustment strategy.
The pathological image recognition model training method according to claim 14 is characterized in that the basic backbone model is configured based on a U-Net network architecture, the first weight configuration parameter is set to be generated based on a Xavier parameter initialization strategy, and the second weight configuration parameter is set to be generated based on a Kaiming parameter initialization strategy;

The first neural network model and the second neural network model are configured to include a stochastic gradient descent optimizer, and the learning rate adjustment strategy is configured so that the model learning rate value decreases as the number of iterations increases.
The pathological image recognition model training method according to claim 15 is characterized in that the model learning rate value is equal to the product of a preset exponential power of the ratio of the remaining number of iterations to the total number of iterations and a basic learning rate value.
The pathological image recognition model training method according to claim 1 is characterized in that the first loss function is configured as a weighted sum of a first supervised loss function and a first pseudo-label loss function, wherein the first supervised loss function refers to a supervised training process of the first neural network model based on a sample image set, and the first pseudo-label loss function refers to a supervised training process of the second neural network model based on an inference result;

The second loss function is configured as a weighted sum of a second supervised loss function and a second pseudo-label loss function, wherein the second supervised loss function refers to a supervised training process performed by the second neural network model based on the sample image set, and the second pseudo-label loss function refers to a supervised training process performed by the first neural network model based on the inference result.
The pathological image recognition model training method according to claim 17 is characterized in that the first supervised loss function is configured as the sum of a first supervised cross entropy loss function and a first supervised intersection-over-union loss function; wherein the first supervised cross entropy loss function represents the gap between the known label data in the sample image set and the corresponding inference classification probability, and the first supervised intersection-over-union loss function represents the gap between the known label data in the sample image set and the corresponding inference classification category;

The first pseudo-label loss function includes a first pseudo-label cross entropy loss function; wherein the first pseudo-label cross entropy loss function represents the gap between the inference classification probability of the first neural network model for the sample image set and the inference classification category of the second neural network model for the sample image set;

The second supervised loss function is configured as the sum of a second supervised cross entropy loss function and a second supervised intersection-over-union loss function; wherein the second supervised cross entropy loss function represents the gap between the known label data in the sample image set and the corresponding inference classification probability, and the second supervised intersection-over-union loss function represents the gap between the known label data in the sample image set and the corresponding inference classification category;

The second pseudo-label loss function includes a second pseudo-label cross-entropy loss function; wherein the second pseudo-label cross-entropy loss function represents the gap between the inference classification probability of the second neural network model for the sample image set and the inference classification category of the first neural network model for the sample image set.
The pathological image recognition model training method according to claim 18 is characterized in that the sample image represents intraepithelial neoplasia and intestinal metaplasia;

The first supervised cross entropy loss function, the first pseudo-label cross entropy loss function, the second supervised cross entropy loss function and the second pseudo-label cross entropy loss function point to the background area, intraepithelial neoplasia area and intestinal metaplasia area in the sample image; the first supervised intersection-over-union loss function and the second supervised intersection-over-union loss function point to the intraepithelial neoplasia area and intestinal metaplasia area in the sample image.
The pathological image recognition model training method according to claim 17 is characterized in that the first pseudo-label loss function and the second pseudo-label loss function have equal preset weight values, and the preset weight values are configured to increase with the increase in the number of iterations.
The pathological image recognition model training method according to claim 20 is characterized in that the preset weight value is equal to the product of the maximum weight value and a preset increasing function, and the preset increasing function is configured so that the function value infinitely approaches 1.
The pathological image recognition model training method according to claim 1 is characterized in that the sample images represent intraepithelial neoplasia and intestinal metaplasia.
A pathological image recognition model training system, characterized by comprising: one or more processors; a memory for storing one or more computer programs, wherein when the one or more computer programs are executed by the one or more processors, the system is configured to execute a pathological image recognition model training method; the pathological image recognition model training method comprises:

receiving a sample image set;

According to the sample image set, calling the first neural network model to perform supervised training and traversal reasoning in sequence, and calling the second neural network model to perform supervised training based on the reasoning result, and calculating a first loss function;

According to the sample image set, calling the second neural network model to perform supervised training and traversal reasoning in sequence, and calling the first neural network model to perform supervised training based on the reasoning result, and calculating a second loss function;

The first neural network model and the second neural network model are iteratively trained according to the first loss function and the second loss function to obtain at least one of a first model training parameter and a second model training parameter.
A storage medium having a computer program stored thereon, characterized in that when the computer program is executed by a processor, a pathological image recognition model training method is implemented; the pathological image recognition model training method comprises:

receiving a sample image set;

According to the sample image set, calling the first neural network model to perform supervised training and traversal reasoning in sequence, and calling the second neural network model to perform supervised training based on the reasoning result, and calculating a first loss function;

According to the sample image set, calling the second neural network model to perform supervised training and traversal reasoning in sequence, and calling the first neural network model to perform supervised training based on the reasoning result, and calculating a second loss function;

The first neural network model and the second neural network model are iteratively trained according to the first loss function and the second loss function to obtain at least one of a first model training parameter and a second model training parameter.
A pathological image recognition method, characterized in that the method comprises:

Executing a pathological image recognition model training method to obtain at least one of a first model training parameter and a second model training parameter;

Load the model training parameters into the corresponding neural network model to build a pathological image recognition model;

Receiving and preprocessing the pathological image data to be tested, and inputting the preprocessed pathological image data to be tested into the pathological image recognition model for traversal prediction to obtain pathological recognition data;

Wherein, the pathological image recognition model training method includes:

receiving a sample image set;

According to the sample image set, calling the first neural network model to perform supervised training and traversal reasoning in sequence, and calling the second neural network model to perform supervised training based on the reasoning result, and calculating a first loss function;

According to the sample image set, calling the second neural network model to perform supervised training and traversal reasoning in sequence, and calling the first neural network model to perform supervised training based on the reasoning result, and calculating a second loss function;

The first neural network model and the second neural network model are iteratively trained according to the first loss function and the second loss function to obtain at least one of a first model training parameter and a second model training parameter.
The pathological image recognition method according to claim 25 is characterized in that the "receiving the pathological image data to be tested and preprocessing it, inputting the preprocessed pathological image data to be tested into the pathological image recognition model for traversal prediction, and obtaining the pathological recognition data" specifically includes:

Performing size standardization processing and color migration standardization processing on the pathological image data to be tested in sequence, and obtaining a set of pathological images to be tested by calculation;

Inputting the pathological image set to be tested into the pathological image recognition model for traversal prediction to obtain a pathological recognition pixel area;

The pathology recognition pixel area is superimposed and displayed on the pathology image to be detected to form a pathology judgment image.
The pathological image recognition method according to claim 26 is characterized in that the "operation to obtain a set of pathological images to be tested" specifically includes: performing sliding window area segmentation on the pathological image data to be tested that has completed size standardization and color migration standardization, and screening to obtain the set of pathological images to be tested based on the proportion of low grayscale value areas in the sliding window image to be tested.
The pathological image recognition method according to claim 26 is characterized in that the pathological recognition data includes precancerous lesion determination information, and the "receiving the pathological image data to be tested and preprocessing it, inputting the preprocessed pathological image data to be tested into the pathological image recognition model for traversal prediction, and obtaining the pathological recognition data" specifically includes:

Arrange the pixel values pointing to intraepithelial neoplasia and intestinal metaplasia in the pathological identification pixel area in descending order, calculate the average value of pixels within a preset number range, obtain a first average value and a second average value, and determine the numerical relationship between the first average value and the second average value and a preset precancerous lesion determination threshold;

If one of the first average value and the second average value is greater than the precancerous lesion determination threshold, it is determined that a precancerous lesion occurs at the position represented by the pathological image to be detected corresponding to the pathological identification pixel area, and precancerous lesion determination information is output.