CN111291825A - Focus classification model training method and device, computer equipment and storage medium - Google Patents

Focus classification model training method and device, computer equipment and storage medium

Info

Publication number
CN111291825A
Authority
CN
China
Prior art keywords
image information
focal zone
focus
target
classification model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010115041.0A
Other languages
Chinese (zh)
Inventor
甘伟焜
詹维伟
张璐
陈超
黄凌云
刘玉宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202010115041.0A priority Critical patent/CN111291825A/en
Publication of CN111291825A publication Critical patent/CN111291825A/en
Priority to PCT/CN2020/099473 priority patent/WO2021169126A1/en
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10132 Ultrasound image
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing

Abstract

The invention provides a focus classification model training method and device, computer equipment and a storage medium. The training method comprises the following steps: acquiring a sample data set, wherein the sample data set comprises a plurality of sample pictures in each of which a focus area is marked, and each focus area is marked with a corresponding classification label; performing local image information acquisition processing on each focus area to obtain the local image information of each focus area; performing global image information acquisition processing on each focus area to obtain the global image information of each focus area; and training a pre-established focus classification model with dual input channels by using the local image information and global image information of each focus area together with the corresponding classification labels, so as to obtain a target focus classification model. The invention reduces the requirement on sample data size while still achieving accurate focus classification.

Description

Focus classification model training method and device, computer equipment and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a device for training a focus classification model, computer equipment and a storage medium.
Background
Conventionally, diseases are diagnosed by doctors through manual judgment. Manual judgment is time-consuming and involves many steps: a comprehensive diagnosis must be made from multiple indicators of a focus, such as whether calcification is present, whether cystic change is present, and the echo level. The diagnostic result is difficult to quantify, depends heavily on the doctor's experience, and misjudgments easily occur in actual diagnostic practice.
With the rapid development of medical and computer image processing technologies, automatic identification and diagnosis of medical images has become a research hotspot at the intersection of computer image technology and medical imaging. Computer-aided diagnosis of ultrasound images mainly consists in constructing a fast and accurate classifier that assists doctors in judging whether a focus area is benign or malignant.
To some extent, the diagnosis of benign and malignant disease is a classification problem, and existing focus classification methods work by feeding an image with a marked focus area into a pre-trained focus classification model. However, focus sizes vary greatly: the number of pixels a focus area covers in an ultrasound image can range from a few hundred to tens of thousands (while the benignity or malignancy of a focus area has no direct relation to its size). The existing remedy is to build a separate model for each focus-size range, but each separate model must then be trained on samples of the corresponding scale, which increases the amount of sample data required.
Disclosure of Invention
In view of the above-mentioned deficiencies of the prior art, an object of the present invention is to provide a method and an apparatus for training a lesion classification model, a computer device and a storage medium, so as to reduce the requirement for sample data size on the premise of implementing accurate classification of lesions.
In order to achieve the above object, the present invention provides a method for training a lesion classification model, comprising:
acquiring a sample data set, wherein the sample data set comprises a plurality of sample pictures respectively marked with a focus area, and each focus area is respectively marked with a corresponding classification label;
local image information acquisition processing is carried out on each focal zone to obtain local image information of each focal zone;
carrying out global image information acquisition processing on each focal zone to obtain global image information of each focal zone;
and training a pre-established focus classification model with double input channels by using the local image information and the global image information of each focus area and the classification labels corresponding to each focus area to obtain a target focus classification model.
In one embodiment of the present invention, the local image information acquisition processing includes:
constructing a focus rectangular frame by taking the uppermost point, the lowermost point, the leftmost point and the rightmost point of the focus area as references;
and randomly moving the focus rectangular frame for multiple times in the neighborhood of the focus area, and taking the image information in the focus rectangular frame after moving every time as the local image information of the focus area.
In an embodiment of the present invention, before randomly moving the lesion rectangular frame a plurality of times in the neighborhood of the lesion region, the local image information collecting process further includes:
and magnifying the focus rectangular frame according to a preset magnification ratio by taking the center of the focus rectangular frame as a reference.
In an embodiment of the present invention, the global image information acquisition processing includes:
dividing each focus area into a first type focus area, a second type focus area and a third type focus area according to the size, wherein the sizes of the first type focus area, the second type focus area and the third type focus area are reduced in sequence;
performing down-sampling processing on each focal zone in the first type of focal zone, and taking image information subjected to down-sampling processing as global image information of a corresponding focal zone in the first type of focal zone;
taking the image information of each focal zone in the second type of focal zone as the global image information of the corresponding focal zone in the second type of focal zone;
and performing interpolation processing on each focal zone in the third type of focal zone, and taking the image information subjected to the interpolation processing as the global image information of the corresponding focal zone in the third type of focal zone.
In an embodiment of the present invention, the lesion classification model includes a first convolution network, a second convolution network, a splicing layer and a fully connected classification layer, wherein an input end of the first convolution network is configured to receive local image information of each lesion area, an input end of the second convolution network is configured to receive global image information of each lesion area, an input end of the splicing layer is connected to output ends of the first convolution network and the second convolution network, respectively, and an input end of the fully connected classification layer is connected to an output end of the splicing layer.
In one embodiment of the present invention, after obtaining the target lesion classification model, the method further comprises:
acquiring a target case picture, wherein a target focal zone is marked in the target case picture;
acquiring and processing the local image information of the target focal zone to obtain local image information of the target focal zone;
acquiring and processing the global image information of the target focal zone to obtain the global image information of the target focal zone;
and processing the local image information and the global image information of the target focal zone by using the target focal classification model to obtain a classification result of the target focal zone.
In order to achieve the above object, the present invention further provides a lesion classification model training device, including:
a sample acquisition module, configured to acquire a sample data set, wherein the sample data set comprises a plurality of sample pictures in each of which a focus area is marked, and each focus area is marked with a corresponding classification label;
the first local image information acquisition module is used for acquiring and processing local image information of each focal zone to obtain local image information of each focal zone;
the first global image information acquisition module is used for carrying out global image information acquisition processing on each focal zone to obtain global image information of each focal zone;
and the model training module is used for training a pre-established focus classification model with double input channels by utilizing the local image information and the global image information of each focus area and the classification labels corresponding to each focus area to obtain a target focus classification model.
In an embodiment of the present invention, the first local image information acquisition module is specifically configured to:
constructing a focus rectangular frame by taking the uppermost point, the lowermost point, the leftmost point and the rightmost point of the focus area as references;
and randomly moving the focus rectangular frame for multiple times in the neighborhood of the focus area, and taking the image information in the focus rectangular frame after moving every time as the local image information of the focus area.
In an embodiment of the present invention, the first local image information collecting module is further configured to: before randomly moving the focus rectangular frame for a plurality of times in the neighborhood of the focus area,
and magnifying the focus rectangular frame according to a preset magnification ratio by taking the center of the focus rectangular frame as a reference.
In an embodiment of the present invention, the first global image information acquisition module is specifically configured to:
dividing each focus area into a first type focus area, a second type focus area and a third type focus area according to the size, wherein the sizes of the first type focus area, the second type focus area and the third type focus area are reduced in sequence;
performing down-sampling processing on each focal zone in the first type of focal zone, and taking image information subjected to down-sampling processing as global image information of a corresponding focal zone in the first type of focal zone;
taking the image information of each focal zone in the second type of focal zone as the global image information of the corresponding focal zone in the second type of focal zone;
and performing interpolation processing on each focal zone in the third type of focal zone, and taking the image information subjected to the interpolation processing as the global image information of the corresponding focal zone in the third type of focal zone.
In an embodiment of the present invention, the lesion classification model includes a first convolution network, a second convolution network, a splicing layer and a fully connected classification layer, wherein an input end of the first convolution network is configured to receive local image information of each lesion area, an input end of the second convolution network is configured to receive global image information of each lesion area, an input end of the splicing layer is connected to output ends of the first convolution network and the second convolution network, respectively, and an input end of the fully connected classification layer is connected to an output end of the splicing layer.
In one embodiment of the invention, the apparatus further comprises:
a case image acquisition module, configured to acquire a target case image after the model training module obtains the target lesion classification model, where the target case image is marked with a target lesion area;
the second local image information acquisition module is used for acquiring and processing the local image information of the target focal zone to obtain the local image information of the target focal zone;
the second global image information acquisition module is used for acquiring and processing the global image information of the target focal zone to obtain the global image information of the target focal zone;
and the model processing module is used for processing the local image information and the global image information of the target lesion area by using the target lesion classification model to obtain a classification result of the target lesion area.
In order to achieve the above object, the present invention also provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the aforementioned method when executing the computer program.
In order to achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the aforementioned method.
By adopting the technical scheme, the invention has the following beneficial effects:
according to the invention, a focus classification model with double input channels is designed, local image information and global image information of a focus area are respectively received through the two input channels, and the focus classification model obtained through training has higher accuracy compared with the existing single-input focus classification model because the local image information contains more detailed information of the focus area and the global image information can better reflect the global characteristics of the focus area. Meanwhile, the invention does not need to respectively establish models aiming at different focus size ranges, thereby reducing the requirement of model training on sample data size.
Drawings
FIG. 1 is a flowchart of a lesion classification model training method according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a lesion classification model according to the present invention;
FIG. 3 is a schematic structural diagram of a first convolutional network and a second convolutional network in the present invention;
FIG. 4 is a schematic structural diagram of a residual module according to the present invention;
FIG. 5 is a schematic diagram of an attention network according to the present invention;
FIG. 6 is a flowchart illustrating a method for training a lesion classification model according to another embodiment of the present invention;
FIG. 7 is a block diagram of a lesion classification model training apparatus according to an embodiment of the present invention;
FIG. 8 is a block diagram of another embodiment of the lesion classification model training apparatus according to the present invention;
fig. 9 is a hardware architecture diagram of the computer apparatus of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
Example one
This embodiment provides a method for training a lesion classification model, as shown in fig. 1, the method includes the following steps:
s11, a sample data set is obtained, the sample data set comprises a plurality of sample pictures respectively marked with focus areas, and each focus area is respectively marked with a corresponding classification label.
Specifically, in this step, sample pictures can be obtained from a hospital's sample database. The outline of the focal region in each sample picture is marked in advance, either manually by a doctor or automatically/semi-automatically by an existing detection or segmentation algorithm, and the classification label of each focal region, such as the benign label [1, 0] or the malignant label [0, 1], is marked in advance by a doctor. In this embodiment, the sample pictures are mainly ultrasound images, in particular ultrasound images marked with thyroid or breast focal regions.
S12, local image information collection processing is carried out on each focus area to obtain local image information of each focus area. In this embodiment, the local image information acquisition process may be implemented by:
and S121, constructing a focus rectangular frame by taking the uppermost point, the lowermost point, the leftmost point and the rightmost point of the focus area as references.
S122, the lesion rectangular frame is randomly moved multiple times in the neighborhood of the focal region, and the image information inside the frame after each move is collected as local image information of the corresponding focal region. The frame is moved for two reasons. First, in practical application, ultrasound images suffer from low resolution, artifacts and indistinct focal-zone boundaries, so the marked position of the focal zone is not always accurate; moving the frame multiple times compensates for this. Second, the random moves augment the data, which further reduces the required amount of sample data to some extent.
Preferably, before step S122 is performed, the local image information acquisition process may further include the following step: with the center of the lesion rectangular frame as reference, enlarge the frame by a preset magnification ratio N%, so that the side lengths of the enlarged frame equal those of the original frame multiplied by (1 + N%). The frame is enlarged because classifying a focal zone together with an image of its surrounding area gives a better classification result.
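For illustration, the following is a minimal sketch of steps S121-S122 together with the optional enlargement, assuming the focal zone is given as a binary mask over the ultrasound image; the default ratio N, the jitter range (up to roughly 10% of the box size) and the number of crops are illustrative choices, not values fixed by this embodiment.

    import numpy as np

    def local_crops(image, mask, n_percent=20, num_crops=5, rng=None):
        """Collect randomly shifted crops of the (enlarged) lesion bounding box."""
        rng = np.random.default_rng() if rng is None else rng
        ys, xs = np.nonzero(mask)                 # pixels belonging to the focal zone
        top, bottom = ys.min(), ys.max()          # uppermost / lowermost points
        left, right = xs.min(), xs.max()          # leftmost / rightmost points
        h, w = bottom - top + 1, right - left + 1
        pad_y = int(h * n_percent / 200)          # enlarging by N% about the centre
        pad_x = int(w * n_percent / 200)          # adds half of N% on each side
        crops = []
        for _ in range(num_crops):
            dy = int(rng.integers(-(h // 10) - 1, h // 10 + 2))   # random shift in the
            dx = int(rng.integers(-(w // 10) - 1, w // 10 + 2))   # lesion neighbourhood
            y0, x0 = max(top - pad_y + dy, 0), max(left - pad_x + dx, 0)
            y1 = min(bottom + pad_y + dy + 1, image.shape[0])
            x1 = min(right + pad_x + dx + 1, image.shape[1])
            crops.append(image[y0:y1, x0:x1])     # one local-image sample per move
        return crops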
And S13, carrying out global image information acquisition processing on each focal zone to obtain global image information of each focal zone. In this embodiment, the global image information acquisition processing includes:
s131, dividing each focus area into a first focus area, a second focus area and a third focus area according to the size, wherein the sizes of the first focus area, the second focus area and the third focus area are sequentially reduced. Wherein, the classification process is as follows: firstly, counting the size of a long axis (a long axis is the longest axis passing through the center of a focal zone) of each focal zone in a sample data set, acquiring the median M of all the long axes, taking the focal zone with the long axis larger than M as a large focal zone, and taking the focal zone with the long axis smaller than M as a small focal zone; then, acquiring median M1 of the long axes of all the large focal zones, and taking the large focal zone with the long axis larger than M1 as a first type of focal zone; then obtaining the average value M2 of the long axes of all the small focal zones, and taking the small focal zone with the long axis smaller than M2 as a third type focal zone; meanwhile, a large focal zone having a major axis less than M1 and a small focal zone having a major axis less than M2 are used as the second type of focal zone.
S132, each focal zone of the first type is down-sampled so that its long axis equals M1, and the down-sampled image information is taken as the global image information of the corresponding first-type focal zone.
S133, the image information of each focal zone in the second type of focal zone is used as the global image information of the corresponding focal zone in the second type of focal zone.
S134, each focal zone of the third type is interpolated so that its long axis equals M2, and the interpolated image information is taken as the global image information of the corresponding third-type focal zone.
Through the above processing, all focal zones in the sample data are brought into a common size range (long axis between M2 and M1). Since the benignity or malignancy of a focal zone has no direct relation to its size, this resizing does not affect focus classification, and it avoids the adverse effect that large differences in focus size would otherwise have on the subsequent focus classification model. The split and the resizing are sketched together below.
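A combined sketch of steps S131-S134 follows, assuming each lesion's major-axis length has already been measured in pixels and that OpenCV is available for resizing; the handling of ties at the thresholds and the choice of interpolation filters are assumptions.

    import statistics
    import cv2  # assumed available for the resizing

    def size_thresholds(major_axes):
        """Step S131: derive M1 and M2 from the major-axis lengths of all lesions."""
        m = statistics.median(major_axes)
        large = [a for a in major_axes if a > m]      # large focal zones
        small = [a for a in major_axes if a <= m]     # small focal zones
        m1 = statistics.median(large)                 # median axis of the large group
        m2 = statistics.mean(small)                   # mean axis of the small group
        return m1, m2

    def global_view(image, major_axis, m1, m2):
        """Steps S132-S134: bring the lesion's major axis into the range [M2, M1]."""
        if major_axis > m1:          # first type: down-sample so the axis equals M1
            scale = m1 / major_axis
        elif major_axis < m2:        # third type: interpolate up so the axis equals M2
            scale = m2 / major_axis
        else:                        # second type: keep the image information as-is
            return image
        new_size = (max(int(image.shape[1] * scale), 1),
                    max(int(image.shape[0] * scale), 1))
        interp = cv2.INTER_AREA if scale < 1 else cv2.INTER_CUBIC
        return cv2.resize(image, new_size, interpolation=interp)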
And S14, training a pre-established focus classification model with double input channels by using the local image information and the global image information of each focus area and the classification labels corresponding to each focus area to obtain a target focus classification model.
Specifically, the focus classification model is trained by gradient descent, and weight normalization, a deep-learning training accelerator, can be used to speed up training. The trained model can then be validated with the common K-fold cross-validation method to measure the performance of the trained focus classification model, such as its accuracy or AUC.
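As an illustration of the gradient-descent training just referred to, the following is one mini-batch update in PyTorch, assuming the dual-input model sketched further below and integer class targets; applying the weight-normalization accelerator to the convolutional layers (e.g. via torch.nn.utils.weight_norm) is an assumption about placement, and K-fold validation would simply wrap this step across folds.

    import torch
    import torch.nn.functional as F

    def train_step(model, optimizer, x_local, x_global, labels):
        """One gradient-descent update; labels are class indices (0 benign, 1 malignant)."""
        optimizer.zero_grad()
        logits = model(x_local, x_global)         # dual-channel forward pass
        loss = F.cross_entropy(logits, labels)
        loss.backward()
        optimizer.step()
        return loss.item()

An optimizer such as torch.optim.SGD(model.parameters(), lr=1e-2) pairs with this step.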
In an embodiment, the magnification ratio N% may be adjusted several times to obtain multiple trained focus classification models, and the best-performing of these models is taken as the target focus classification model. The ratio can be adjusted by the following rule: with the range of N% capped at 100%, set the initial value of N% to 10% and then increase N% by a fixed step (e.g. 10%) until it reaches 100%. It should be understood that different values of N% yield different local image information for each focal zone and therefore different trained focus classification models, as in the sweep sketched below.
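The sweep over N% might look as follows; train_and_validate is a hypothetical helper, not part of this disclosure, that trains one model for the given ratio and returns it together with its validation score.

    # `train_and_validate(n)` is hypothetical: it trains one model for ratio n%
    # and returns (model, validation_score).
    best_model, best_score = None, float("-inf")
    for n in range(10, 101, 10):                  # N% = 10%, 20%, ..., 100%
        model, score = train_and_validate(n)      # e.g. mean K-fold accuracy or AUC
        if score > best_score:
            best_model, best_score = model, score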
Preferably, the focus classification model used in this embodiment is shown in fig. 2 and includes a first convolution network, a second convolution network, a splicing layer and a fully connected classification layer, where the first and second convolution networks are parallel networks: the input of the first convolution network receives the local image information of each focal zone, the input of the second convolution network receives the global image information of each focal zone, the input of the splicing layer is connected to the outputs of the two convolution networks, and the input of the fully connected classification layer is connected to the output of the splicing layer. The two convolution networks thus extract features from the local and global image information respectively; the splicing layer concatenates these features and feeds them to the fully connected classification layer, which performs the classification and outputs the result.
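A hedged PyTorch sketch of the Fig. 2 topology follows; the two branch networks are passed in (see the branch sketch after the next paragraph), feat_dim is the per-branch feature width, and the single fully connected layer standing in for the classification head is a simplification.

    import torch
    import torch.nn as nn

    class DualInputClassifier(nn.Module):
        """Fig. 2: two parallel branches, a splicing (concat) layer, an FC head."""
        def __init__(self, branch_local, branch_global, feat_dim, num_classes=2):
            super().__init__()
            self.branch_local = branch_local       # first convolutional network
            self.branch_global = branch_global     # second convolutional network
            self.classifier = nn.Linear(2 * feat_dim, num_classes)

        def forward(self, x_local, x_global):
            features = torch.cat([self.branch_local(x_local),
                                  self.branch_global(x_global)], dim=1)  # splicing layer
            return self.classifier(features)       # fully connected classification layer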
Specifically, the first and second convolution networks in this embodiment are deep residual networks (ResNet), preferably comprising convolutional layers, residual modules (ResBlock) and pooling layers. Fig. 3 shows their specific structure: after receiving the local/global image information, each network processes it sequentially through a first convolutional layer, three first residual modules, a second convolutional layer, three second residual modules, a third convolutional layer, three third residual modules and a pooling layer, thereby extracting the features of the local/global image information. Preferably, the parameters of the third convolutional layer, the third residual modules and the pooling layer are independent between the two networks, while the parameters of the other layers are shared.
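The following sketch assembles one branch in the spirit of Fig. 3, with illustrative channel widths and strides; block is a factory for the residual modules sketched after the next paragraph, and the partial parameter sharing between the two branches described above is not modeled here, which is a simplification.

    import torch.nn as nn

    def make_branch(block, widths=(32, 64, 128)):
        """One Fig. 3 branch: three stages of (conv + three residual modules), then pooling."""
        layers, in_ch = [], 1                       # single-channel ultrasound input
        for out_ch in widths:                       # first / second / third conv layers
            layers.append(nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1))
            layers += [block(out_ch) for _ in range(3)]       # three residual modules
            in_ch = out_ch
        layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten()]     # pooling -> feature vector
        return nn.Sequential(*layers)

With widths=(32, 64, 128), each branch emits a 128-dimensional feature vector, i.e. feat_dim=128 in the model sketch above.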
In addition, the structure of each residual module in the present invention is shown in fig. 4 and includes a residual module layer and an attention network. The residual module layer comprises a fourth, a fifth and a sixth convolutional layer. After receiving an input signal, the residual module layer processes it sequentially through these three convolutional layers, sums the output of the sixth convolutional layer with the input signal, and feeds the sum to the attention network for processing.
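A minimal sketch of this residual module, assuming 3 × 3 kernels and ReLU activations (neither is specified above) and using the AttentionBlock sketched after the next paragraph:

    import torch.nn as nn

    class ResBlock(nn.Module):
        """Fig. 4 residual module: three convolutions, a skip sum, then attention."""
        def __init__(self, channels):
            super().__init__()
            self.body = nn.Sequential(              # fourth / fifth / sixth conv layers
                nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1),
            )
            self.attention = AttentionBlock(channels)   # sketched after the next paragraph

        def forward(self, x):
            return self.attention(self.body(x) + x)     # sum with the input, then attention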
The attention network, whose role is to focus the model on the region of interest, has the structure shown in fig. 5 and comprises two paths. The first path consists of a 1 × 1 seventh convolutional layer. The second path consists of a first max-pooling layer, an eighth convolutional layer, a second max-pooling layer, a ninth convolutional layer, a sub-pixel up-sampling layer and a classification layer, and produces an attention weight matrix; this matrix is finally multiplied with the output of the first path so that the model attends more to high-weight regions. Sub-pixel up-sampling is a variant of up-sampling obtained through an interpolation adjustment; unlike plain up-sampling, it does not require reducing the model's channels to 1 during convolution, which makes it better at integrating information across multiple channels.
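A hedged sketch of this attention network: the trunk path is the 1 × 1 convolution, while the mask path pools twice, expands channels and restores resolution with PixelShuffle (sub-pixel up-sampling), ending in a sigmoid that plays the role of the classification layer producing per-pixel weights. The channel counts, the 4x shuffle factor and the sigmoid are assumptions, and feature maps are assumed to have sides divisible by 4.

    import torch.nn as nn

    class AttentionBlock(nn.Module):
        """Fig. 5 attention network: a 1x1 trunk path times a learned weight map."""
        def __init__(self, channels):
            super().__init__()
            self.trunk = nn.Conv2d(channels, channels, 1)          # seventh conv layer
            self.mask = nn.Sequential(
                nn.MaxPool2d(2),                                   # first max-pooling layer
                nn.Conv2d(channels, channels, 3, padding=1),       # eighth conv layer
                nn.MaxPool2d(2),                                   # second max-pooling layer
                nn.Conv2d(channels, 16 * channels, 3, padding=1),  # ninth conv layer
                nn.PixelShuffle(4),                                # sub-pixel up-sampling (x4)
                nn.Sigmoid(),                                      # weight map in [0, 1]
            )

        def forward(self, x):
            return self.trunk(x) * self.mask(x)                    # emphasise high-weight areas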
In summary, the lesion classification model with the dual input channels is designed in this embodiment, the local image information and the global image information of the lesion area are respectively received through the two input channels, and the local image information contains more detailed information of the lesion area, and the global image information can better reflect the global characteristics of the lesion area, so that the accuracy of the trained lesion classification model is higher compared with that of the existing single-input lesion classification model. Meanwhile, the invention does not need to respectively establish models aiming at different lesion size ranges, thereby reducing the requirement on sample data size.
Example two
The present embodiment provides a method for training a lesion classification model based on the first embodiment, which is different from the first embodiment in that after the target lesion classification model is obtained in step S14, the method shown in fig. 6 is performed, and specifically includes:
s21, acquiring a target case picture, wherein the target case picture is marked with the outline of the target focal zone.
S22, local image information acquisition processing is carried out on the target focal zone to obtain local image information of the target focal zone. The local image information collection processing in this step is the same as the local image information collection processing in step S12, and is not described herein again.
S23, global image information acquisition processing is carried out on the target focal zone to obtain the global image information of the target focal zone. The global image information acquisition processing in this step is the same as in step S13 and is not repeated here.
S24, the local image information and the global image information of the target focal zone are processed with the target focus classification model trained in the first embodiment, so as to obtain a benign/malignant classification result for the target focal zone.
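An illustrative inference pass over steps S22-S24, assuming the local crop and the size-normalised global view have already been produced by the procedures of the first embodiment and converted to (1, 1, H, W) float tensors, and that the label order matches the benign [1, 0] / malignant [0, 1] convention of step S11:

    import torch

    @torch.no_grad()
    def classify(model, local_crop, global_view):
        """Steps S22-S24 on preprocessed tensors of shape (1, 1, H, W)."""
        logits = model(local_crop, global_view)     # dual input channels (S24)
        probs = torch.softmax(logits, dim=1)
        return "benign" if probs[0, 0] >= probs[0, 1] else "malignant"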
In this embodiment, the two input channels of the target lesion classification model respectively receive the local image information and the global image information of the target lesion area, and the target lesion classification model has a more accurate classification effect, so that the target lesion area in the case image can be accurately classified.
It should be noted that, for simplicity, the first and second embodiments are both described as a series of acts, but those skilled in the art will understand that the present invention is not limited by the order of acts described, since according to the present invention some steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments and that not every act involved is required to implement the invention.
Example three
In this embodiment, there is provided a lesion classification model training apparatus 10, as shown in fig. 7, the apparatus including:
the system comprises a sample acquisition module 11, a classification module and a classification module, wherein the sample acquisition module is used for acquiring a sample data set, the sample data set comprises a plurality of sample pictures respectively marked with a focus area, and each focus area is respectively marked with a corresponding classification label;
a first local image information acquisition module 12, configured to perform local image information acquisition processing on each focal zone to obtain local image information of each focal zone;
a first global image information acquisition module 13, configured to perform global image information acquisition processing on each focal zone to obtain global image information of each focal zone;
and the model training module 14 is configured to train a pre-established lesion classification model with dual input channels by using the local image information and the global image information of each lesion area and the classification label corresponding to each lesion area, so as to obtain a target lesion classification model.
In this embodiment, the first local image information acquisition module is specifically configured to:
constructing a focus rectangular frame by taking the uppermost point, the lowermost point, the leftmost point and the rightmost point of the focus area as references;
and randomly moving the focus rectangular frame for multiple times in the neighborhood of the focus area, and taking the image information in the focus rectangular frame after moving every time as the local image information of the focus area.
In this embodiment, the first local image information acquisition module is further configured to: before randomly moving the focus rectangular frame for a plurality of times in the neighborhood of the focus area,
and magnifying the focus rectangular frame according to a preset magnification ratio by taking the center of the focus rectangular frame as a reference.
In this embodiment, the first global image information acquisition module is specifically configured to:
dividing each focus area into a first type focus area, a second type focus area and a third type focus area according to the size, wherein the sizes of the first type focus area, the second type focus area and the third type focus area are reduced in sequence;
performing down-sampling processing on each focal zone in the first type of focal zone, and taking image information subjected to down-sampling processing as global image information of a corresponding focal zone in the first type of focal zone;
taking the image information of each focal zone in the second type of focal zone as the global image information of the corresponding focal zone in the second type of focal zone;
and performing interpolation processing on each focal zone in the third type of focal zone, and taking the image information subjected to the interpolation processing as the global image information of the corresponding focal zone in the third type of focal zone.
In this embodiment, the lesion classification model includes a first convolution network, a second convolution network, a splicing layer, and a fully-connected classification layer, where an input of the first convolution network is configured to receive local image information of each lesion area, an input of the second convolution network is configured to receive global image information of each lesion area, an input of the splicing layer is connected to output terminals of the first convolution network and the second convolution network, and an input of the fully-connected classification layer is connected to an output terminal of the splicing layer.
Example four
As shown in fig. 8, the apparatus of the present embodiment is different from the third embodiment in that it further includes the following modules:
a case picture acquiring module 15, configured to acquire a target case picture, where a target focal zone is marked in the target case picture;
the second local image information acquisition module 16 is configured to perform local image information acquisition processing on the target focal zone to obtain local image information of the target focal zone;
a second global image information collecting module 17, configured to collect and process the global image information of the target focal zone to obtain global image information of the target focal zone;
a model processing module 18, configured to process the local image information and the global image information of the target focal zone using the target focal classification model obtained by the focal classification model training apparatus of the third embodiment, so as to obtain a classification result of the target focal zone.
The above device embodiment is basically similar to the method embodiment, so the description is simple, and the relevant points can be referred to the partial description of the method embodiment. Also, it should be understood by those skilled in the art that the embodiments described in the specification are preferred embodiments and the module referred to is not necessarily essential to the invention.
Example five
The present embodiment provides a computer device capable of executing programs, such as a smart phone, a tablet computer, a notebook computer, a desktop computer, a rack server, a blade server or a tower server (including an independent server or a server cluster composed of multiple servers). The computer device 20 of this embodiment comprises at least, but is not limited to: a memory 21 and a processor 22, which can be communicatively connected to each other through a system bus, as shown in fig. 9. It should be noted that fig. 9 only shows the computer device 20 with components 21-22, but it should be understood that not all of the shown components must be implemented, and more or fewer components may be implemented instead.
In this embodiment, the memory 21 (i.e., a readable storage medium) includes flash memory, a hard disk, a multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, and the like. In some embodiments, the memory 21 may be an internal storage unit of the computer device 20, such as a hard disk or internal memory of the computer device 20. In other embodiments, the memory 21 may also be an external storage device of the computer device 20, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a Flash Card provided on the computer device 20. Of course, the memory 21 may also include both internal and external storage devices of the computer device 20. In this embodiment, the memory 21 is generally used to store the operating system and the various kinds of application software installed on the computer device 20, such as the program code of the lesion classification model training apparatus 10 of the third/fourth embodiment. In addition, the memory 21 may also be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 22 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor or another data processing chip. The processor 22 is typically used to control the overall operation of the computer device 20. In this embodiment, the processor 22 is configured to run the program code stored in the memory 21 or to process data, for example to run the lesion classification model training apparatus 10, thereby implementing the lesion classification model training method of the first embodiment.
Example six
The present embodiment provides a computer-readable storage medium, such as flash memory, a hard disk, a multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, a server, an App application mall, and the like, on which a computer program is stored that implements the corresponding functions when executed by a processor. The computer-readable storage medium of this embodiment stores the lesion classification model training apparatus 10 and, when the stored program is executed by a processor, implements the lesion classification model training method of the first embodiment.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A method for training a lesion classification model is characterized by comprising the following steps:
acquiring a sample data set, wherein the sample data set comprises a plurality of sample pictures respectively marked with a focus area, and each focus area is respectively marked with a corresponding classification label;
local image information acquisition processing is carried out on each focal zone to obtain local image information of each focal zone;
carrying out global image information acquisition processing on each focal zone to obtain global image information of each focal zone;
and training a pre-established focus classification model with double input channels by using the local image information and the global image information of each focus area and the classification labels corresponding to each focus area to obtain a target focus classification model.
2. The lesion classification model training method according to claim 1, wherein the local image information acquisition processing includes:
constructing a focus rectangular frame by taking the uppermost point, the lowermost point, the leftmost point and the rightmost point of the focus area as references;
and randomly moving the focus rectangular frame for multiple times in the neighborhood of the focus area, and taking the image information in the focus rectangular frame after moving every time as the local image information of the focus area.
3. The method of claim 1, wherein the local image information collection process further comprises, before the lesion rectangular frame is randomly moved multiple times in the neighborhood of the lesion area:
and magnifying the focus rectangular frame according to a preset magnification ratio by taking the center of the focus rectangular frame as a reference.
4. The lesion classification model training method of claim 1, wherein the global image information acquisition process comprises:
dividing each focus area into a first type focus area, a second type focus area and a third type focus area according to the size, wherein the sizes of the first type focus area, the second type focus area and the third type focus area are reduced in sequence;
performing down-sampling processing on each focal zone in the first type of focal zone, and taking image information subjected to down-sampling processing as global image information of a corresponding focal zone in the first type of focal zone;
taking the image information of each focal zone in the second type of focal zone as the global image information of the corresponding focal zone in the second type of focal zone;
and performing interpolation processing on each focal zone in the third type of focal zone, and taking the image information subjected to the interpolation processing as the global image information of the corresponding focal zone in the third type of focal zone.
5. The lesion classification model training method of claim 1, wherein the lesion classification model comprises a first convolution network, a second convolution network, a splicing layer and a fully-connected classification layer, wherein an input end of the first convolution network is used for receiving local image information of each lesion area, an input end of the second convolution network is used for receiving global image information of each lesion area, an input end of the splicing layer is connected with output ends of the first convolution network and the second convolution network respectively, and an input end of the fully-connected classification layer is connected with an output end of the splicing layer.
6. The method for training a lesion classification model according to claim 1, further comprising, after obtaining the target lesion classification model:
acquiring a target case picture, wherein a target focal zone is marked in the target case picture;
acquiring and processing the local image information of the target focal zone to obtain local image information of the target focal zone;
acquiring and processing the global image information of the target focal zone to obtain the global image information of the target focal zone;
and processing the local image information and the global image information of the target focal zone by using the target focal classification model to obtain a classification result of the target focal zone.
7. A lesion classification model training apparatus, comprising:
a sample acquisition module, configured to acquire a sample data set, wherein the sample data set comprises a plurality of sample pictures in each of which a focus area is marked, and each focus area is marked with a corresponding classification label;
the first local image information acquisition module is used for acquiring and processing local image information of each focal zone to obtain local image information of each focal zone;
the first global image information acquisition module is used for carrying out global image information acquisition processing on each focal zone to obtain global image information of each focal zone;
and the model training module is used for training a pre-established focus classification model with double input channels by utilizing the local image information and the global image information of each focus area and the classification labels corresponding to each focus area to obtain a target focus classification model.
8. The apparatus for training a lesion classification model according to claim 7, further comprising:
a case image acquisition module, configured to acquire a target case image after the model training module obtains the target lesion classification model, where the target case image is marked with a target lesion area;
the second local image information acquisition module is used for acquiring and processing the local image information of the target focal zone to obtain the local image information of the target focal zone;
the second global image information acquisition module is used for acquiring and processing the global image information of the target focal zone to obtain the global image information of the target focal zone;
and the model processing module is used for processing the local image information and the global image information of the target focal zone by using the target focal classification model to obtain a classification result of the target focal zone.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 6 are implemented by the processor when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
CN202010115041.0A 2020-02-25 2020-02-25 Focus classification model training method and device, computer equipment and storage medium Pending CN111291825A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010115041.0A CN111291825A (en) 2020-02-25 2020-02-25 Focus classification model training method and device, computer equipment and storage medium
PCT/CN2020/099473 WO2021169126A1 (en) 2020-02-25 2020-06-30 Lesion classification model training method and apparatus, computer device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010115041.0A CN111291825A (en) 2020-02-25 2020-02-25 Focus classification model training method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111291825A true CN111291825A (en) 2020-06-16

Family

ID=71025672

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010115041.0A Pending CN111291825A (en) 2020-02-25 2020-02-25 Focus classification model training method and device, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN111291825A (en)
WO (1) WO2021169126A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021169126A1 (en) * 2020-02-25 2021-09-02 平安科技(深圳)有限公司 Lesion classification model training method and apparatus, computer device, and storage medium
CN114664410A (en) * 2022-03-11 2022-06-24 北京医准智能科技有限公司 Video-based focus classification method and device, electronic equipment and medium
CN114999569A (en) * 2022-08-03 2022-09-02 北京汉博信息技术有限公司 Method, device and computer readable medium for typing focus stroma
CN116342582A (en) * 2023-05-11 2023-06-27 湖南工商大学 Medical image classification method and medical equipment based on deformable attention mechanism

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113763352B (en) * 2021-09-06 2024-04-02 杭州类脑科技有限公司 Abdominal cavity hydrops image processing method and system
CN114052795B (en) * 2021-10-28 2023-11-07 南京航空航天大学 Focus imaging and anti-false-prick therapeutic system combined with ultrasonic autonomous scanning
CN114092427B (en) * 2021-11-12 2023-05-16 深圳大学 Crohn's disease and intestinal tuberculosis classification method based on multi-sequence MRI image

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109460756A (en) * 2018-11-09 2019-03-12 天津新开心生活科技有限公司 Medical image processing method, apparatus, electronic equipment and computer-readable medium
CN110008971A (en) * 2018-08-23 2019-07-12 腾讯科技(深圳)有限公司 Image processing method, device, computer readable storage medium and computer equipment
CN110580482A (en) * 2017-11-30 2019-12-17 腾讯科技(深圳)有限公司 Image classification model training, image classification and personalized recommendation method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108563982B (en) * 2018-01-05 2020-01-17 百度在线网络技术(北京)有限公司 Method and apparatus for detecting image
CN111291825A (en) * 2020-02-25 2020-06-16 平安科技(深圳)有限公司 Focus classification model training method and device, computer equipment and storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110580482A (en) * 2017-11-30 2019-12-17 腾讯科技(深圳)有限公司 Image classification model training, image classification and personalized recommendation method and device
CN110008971A (en) * 2018-08-23 2019-07-12 腾讯科技(深圳)有限公司 Image processing method, device, computer readable storage medium and computer equipment
CN109460756A (en) * 2018-11-09 2019-03-12 天津新开心生活科技有限公司 Medical image processing method, apparatus, electronic equipment and computer-readable medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021169126A1 (en) * 2020-02-25 2021-09-02 平安科技(深圳)有限公司 Lesion classification model training method and apparatus, computer device, and storage medium
CN114664410A (en) * 2022-03-11 2022-06-24 北京医准智能科技有限公司 Video-based focus classification method and device, electronic equipment and medium
CN114664410B (en) * 2022-03-11 2022-11-08 北京医准智能科技有限公司 Video-based focus classification method and device, electronic equipment and medium
CN114999569A (en) * 2022-08-03 2022-09-02 北京汉博信息技术有限公司 Method, device and computer readable medium for typing focus stroma
CN116342582A (en) * 2023-05-11 2023-06-27 湖南工商大学 Medical image classification method and medical equipment based on deformable attention mechanism
CN116342582B (en) * 2023-05-11 2023-08-04 湖南工商大学 Medical image classification method and medical equipment based on deformable attention mechanism

Also Published As

Publication number Publication date
WO2021169126A1 (en) 2021-09-02

Similar Documents

Publication Publication Date Title
CN111291825A (en) Focus classification model training method and device, computer equipment and storage medium
CN110428475B (en) Medical image classification method, model training method and server
CN110598714B (en) Cartilage image segmentation method and device, readable storage medium and terminal equipment
CN110210513B (en) Data classification method and device and terminal equipment
US20220198230A1 (en) Auxiliary detection method and image recognition method for rib fractures based on deep learning
CN109389584A (en) Multiple dimensioned rhinopharyngeal neoplasm dividing method based on CNN
CN111145181B (en) Skeleton CT image three-dimensional segmentation method based on multi-view separation convolutional neural network
CN110427970A (en) Image classification method, device, computer equipment and storage medium
CN109146891B (en) Hippocampus segmentation method and device applied to MRI and electronic equipment
CN112132827A (en) Pathological image processing method and device, electronic equipment and readable storage medium
CN112241955B (en) Broken bone segmentation method and device for three-dimensional image, computer equipment and storage medium
CN115424053B (en) Small sample image recognition method, device, equipment and storage medium
CN112348819A (en) Model training method, image processing and registering method, and related device and equipment
CN114972202A (en) Ki67 pathological cell rapid detection and counting method based on lightweight neural network
CN112420170B (en) Method for improving image classification accuracy of computer aided diagnosis system
CN114066905A (en) Medical image segmentation method, system and device based on deep learning
CN110210314B (en) Face detection method, device, computer equipment and storage medium
CN112464802A (en) Automatic identification method and device for slide sample information and computer equipment
CN111104965A (en) Vehicle target identification method and device
CN112750124B (en) Model generation method, image segmentation method, model generation device, image segmentation device, electronic equipment and storage medium
CN117036658A (en) Image processing method and related equipment
CN110276755B (en) Tumor position positioning system and related device
CN113327221A (en) Image synthesis method and device fusing ROI (region of interest), electronic equipment and medium
CN113362350A (en) Segmentation method and device for cancer medical record image, terminal device and storage medium
CN112614092A (en) Spine detection method and device

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40023091

Country of ref document: HK

SE01 Entry into force of request for substantive examination