CN113724188A - Method for processing focus image and related device


Info

Publication number
CN113724188A
CN113724188A (application CN202110285271.6A)
Authority
CN
China
Prior art keywords
image
focus
target
lesion
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110285271.6A
Other languages
Chinese (zh)
Inventor
郑瀚
常健博
王任直
冯铭
姚建华
王晓宁
裴翰奇
陈星翰
尚鸿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Peking Union Medical College Hospital Chinese Academy of Medical Sciences
Original Assignee
Tencent Technology Shenzhen Co Ltd
Peking Union Medical College Hospital Chinese Academy of Medical Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd, Peking Union Medical College Hospital Chinese Academy of Medical Sciences filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110285271.6A
Publication of CN113724188A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10072 Tomographic images
    • G06T 2207/10081 Computed x-ray tomography [CT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20112 Image segmentation details
    • G06T 2207/20132 Image cropping

Abstract

The application discloses a lesion image processing method and a related device, applied to artificial intelligence machine learning technology. An image to be predicted containing a target lesion is acquired and input into an encoder in a target model for vector representation to obtain an encoding vector; the encoding vector is then input into a mapper in the target model to obtain a target vector, the mapper having been trained based on lesion image pairs; the target vector is then input into a decoder in the target model to obtain a predicted image containing the target lesion after a change based on the temporal relationship. This realizes a process of lesion change prediction based on artificial intelligence: the target model is trained in an image-to-image manner, and the training data are acquired from the actual change process of the lesion, so that the target model gains the ability to infer future lesion images, improving the accuracy of lesion change prediction.

Description

Method for processing focus image and related device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and a related apparatus for processing a lesion image.
Background
With the rapid development of medical technology, more and more examination items appear in medical scenarios, and identifying and predicting lesions on the detection images generated during medical examination has become a difficult problem.
Generally, predicting the lesion area in a medical image requires identification based on the professional knowledge and experience of a physician, placing high demands on the physician's professional skill.
However, manual prediction is time-consuming and labor-intensive, and because it is affected by subjective factors, the lesion prediction results are unstable, which affects the accuracy of lesion change prediction.
Disclosure of Invention
In view of this, the present application provides a method for processing a lesion image, which can effectively improve the accuracy of lesion prediction.
A first aspect of the present application provides a method for processing a lesion image, which may be applied to a system or a program containing a processing function of a lesion image in a terminal device, and specifically includes:
acquiring a to-be-predicted image containing a target focus, and inputting the image into an encoder in a target model for vector representation to obtain an encoding vector;
inputting the coding vector into a mapper in the target model for vector conversion of image dimensions to obtain a target vector, wherein the mapper is obtained by training based on a focus image pair, the focus image pair is composed of training images obtained by detection based on a time sequence relation, and the training images contain the target focus;
and inputting the target vector into a decoder in the target model for image prediction to obtain a predicted image containing the target focus after the change of the time sequence relation.
Optionally, in some possible implementations of the present application, a plurality of pairs of lesion images are acquired, where the pairs of lesion images include a first image and a second image that are the training images, the first image is an image obtained by detecting a target lesion at a first time node, the second image is an image obtained by detecting the target lesion at a second time node, and the second time node is after the first time node;
training a target model based on the focus image pairs, wherein the training process of the target model comprises a reconstruction task and a prediction task, the reconstruction task is used for carrying out image reconstruction based on the first image, and the prediction task is used for learning the corresponding relation between the first image and the second image.
Optionally, in some possible implementations of the present application, the training the target model based on the lesion image includes:
performing identity mapping based on the first image in the focus image pair to obtain a reconstructed decoded image;
executing the reconstruction task according to the process of restoring the reconstructed decoded image to the first image;
if the similarity between the reconstructed decoded image and the first image reaches a threshold value, ending the reconstruction task;
and executing the prediction task based on the corresponding relation of the first image and the second image so as to train the target model.
Optionally, in some possible implementations of the present application, the performing identity mapping on the first image in the lesion image pair to obtain a reconstructed decoded image includes:
performing encoding operation on the first image in the focus image pair to compress the feature dimension of the first image to obtain a first feature vector;
performing decoding operation on the first feature vector to perform similarity constraint based on a first constraint function to obtain a reconstructed decoded image, wherein the feature dimension of the reconstructed decoded image is the feature dimension of the first feature vector restored to the first image;
performing a discrimination operation on the reconstructed decoded image and the first image to obtain a discrimination result, wherein a target of the discrimination operation is that the discrimination result output based on a second constraint function is the first image, and a target of the encoding operation and the decoding operation is that the discrimination result output based on a third constraint function is the reconstructed decoded image;
and the executing the reconstruction task according to the process of restoring the reconstructed decoded image to the first image comprises:
and executing the reconstruction task based on the adjustment process of the second constraint function and the third constraint function.
Optionally, in some possible implementations of the present application, the method further includes:
acquiring a preset number of warm-up steps;
and cyclically performing the similarity constraint process of the first constraint function based on the preset number of warm-up steps, so as to warm up the processes of the encoding operation and the decoding operation.
Optionally, in some possible implementations of the present application, the performing the prediction task based on the correspondence between the first image and the second image to train the target model includes:
performing vector representation on the first image to obtain a low-dimensional vector;
inputting the low-dimensional vector into a mapper to obtain a prediction vector;
converting to a vector corresponding to the second image based on the prediction vector to adjust a fourth constraint function;
and executing the prediction task based on the adjustment process of the fourth constraint function to train the target model.
Optionally, in some possible implementations of the present application, the method further includes:
cropping the images in the focus image pair;
aligning the images in the cropped focus image pair based on a preset angle to obtain an aligned image pair;
adjusting the images in the aligned image pair to the same feature dimension;
normalizing the images in the adjusted aligned image pair to update the pair of lesion images.
Optionally, in some possible implementations of the present application, the cropping the image in the lesion image pair includes:
determining body part information corresponding to the target focus;
determining a crop item based on the body part information;
and cropping the images in the focus image pair according to the crop item.
Optionally, in some possible implementations of the present application, the acquiring a plurality of pairs of lesion images includes:
determining type information corresponding to the target focus;
determining a lesion change period based on the type information;
and acquiring a plurality of focus image pairs, using the change duration corresponding to the focus change period as the acquisition interval.
Optionally, in some possible implementations of the present application, the method further includes:
acquiring a reference focus area;
determining a negative exemplar parameter based on a similarity of the reference lesion region to a lesion region of the first image;
determining a first difference portion of the reference lesion region from a lesion region of the first image;
determining a second difference portion of the lesion area of the second image and the lesion area of the first image;
determining a positive sample parameter from the first difference portion and the second difference portion;
determining an evaluation index based on the negative sample parameter and the positive sample parameter, the evaluation index indicating a prediction accuracy of the target model.
Optionally, in some possible implementations of the present application, the method further includes:
if the evaluation index indicates that the target model does not meet a preset index, determining difference information in the focus image pair;
reviewing the pair of focus images based on the difference information to update the pair of focus images, and training the target model based on the updated pair of focus images.
Optionally, in some possible implementations of the present application, the method further includes:
inputting the predicted image into the trained target model to obtain a circulating image;
if the cyclic image is the same as the predicted image, determining the predicted image as a prediction result;
and if the cyclic image is different from the predicted image, predicting the change condition of the target focus based on the cyclic image.
Optionally, in some possible implementations of the present application, the pair of lesion images is obtained by computed tomography, the target lesion is a cerebral hemorrhage region obtained by the computed tomography, the target model is a deep generative adversarial network, and the predicted image is used to predict the variation trend of the cerebral hemorrhage region.
A second aspect of the present application provides a lesion image processing apparatus, including:
the encoding unit is used for acquiring a to-be-predicted image containing a target focus and inputting the image into an encoder in a target model for vector representation to obtain an encoding vector;
a mapping unit, configured to input the coding vector into a mapper in the target model to perform vector transformation of image dimensions to obtain a target vector, where the mapper is obtained by training based on a pair of lesion images, the pair of lesion images is composed of training images obtained by detection based on a timing relationship, and the training images include the target lesion;
and the processing unit is used for inputting the target vector into a decoder in the target model to perform image prediction so as to obtain a predicted image containing the target focus after the change of the time sequence relation.
Optionally, in some possible implementations of the present application, the processing apparatus of the lesion image further includes:
a training unit, configured to acquire a plurality of pairs of focal images, where the pairs of focal images include a first image and a second image that are the training images, the first image is an image obtained by detecting a target focal at a first time node, the second image is an image obtained by detecting the target focal at a second time node, and the second time node is after the first time node;
and the training unit is specifically used for training a target model based on the focus image pairs, where the training process of the target model comprises a reconstruction task and a prediction task, the reconstruction task is used for carrying out image reconstruction based on the first image, and the prediction task is used for learning the corresponding relation between the first image and the second image.
Optionally, in some possible implementations of the present application, the training unit is specifically configured to perform identity mapping on the first image in the focus image pair to obtain a reconstructed decoded image;
the training unit is specifically configured to execute the reconstruction task according to a process of restoring the reconstructed decoded image to the first image;
the training unit is specifically configured to end the reconstruction task if the similarity between the reconstructed decoded image and the first image reaches a threshold;
the training unit is specifically configured to execute the prediction task based on a correspondence between the first image and the second image, so as to train the target model.
Optionally, in some possible implementations of the present application, the training unit is specifically configured to perform an encoding operation on the first image in the lesion image pair, so as to compress a feature dimension of the first image to obtain a first feature vector;
the training unit is specifically configured to perform decoding operation on the first feature vector to perform similarity constraint based on a first constraint function to obtain a reconstructed decoded image, where a feature dimension of the reconstructed decoded image is a feature dimension of the first feature vector restored to the first image;
the training unit is specifically configured to perform a discrimination operation on the reconstructed decoded image and the first image to obtain a discrimination result, where a target of the discrimination operation is that the discrimination result output based on a second constraint function is the first image, and a target of the encoding operation and the decoding operation is that the discrimination result output based on a third constraint function is the reconstructed decoded image;
the training unit is specifically configured to execute the reconstruction task based on the adjustment process of the second constraint function and the third constraint function.
Optionally, in some possible implementation manners of the present application, the training unit is specifically configured to obtain a preset number of warm-up steps;
the training unit is specifically configured to cyclically perform the similarity constraint process of the first constraint function based on the preset number of warm-up steps, so as to warm up the processes of the encoding operation and the decoding operation.
Optionally, in some possible implementations of the present application, the training unit is specifically configured to perform vector representation on the first image to obtain a low-dimensional vector;
the training unit is specifically configured to input the low-dimensional vector into a mapper to obtain a prediction vector;
the training unit is specifically configured to convert, based on the prediction vector, to a vector corresponding to the second image to adjust a fourth constraint function;
the training unit is specifically configured to execute the prediction task based on the adjustment process of the fourth constraint function to train the target model.
Optionally, in some possible implementations of the present application, the training unit is specifically configured to crop an image in the focus image pair;
the training unit is specifically used for aligning the images in the cropped focus image pair based on a preset angle to obtain an aligned image pair;
the training unit is specifically configured to adjust the images in the aligned image pair to the same feature dimension;
the training unit is specifically configured to perform normalization processing on the images in the adjusted aligned image pair to update the focus image pair.
Optionally, in some possible implementations of the present application, the training unit is specifically configured to determine body part information corresponding to the target lesion;
the training unit is specifically used for determining a crop item based on the body part information;
and the training unit is specifically used for cropping the images in the focus image pair according to the crop item.
Optionally, in some possible implementations of the present application, the training unit is specifically configured to determine type information corresponding to the target lesion;
the training unit is specifically used for determining a lesion change period based on the type information;
the training unit is specifically configured to acquire a plurality of pairs of the focus images according to a change duration corresponding to the focus change period as an acquisition interval.
Optionally, in some possible implementations of the present application, the training unit is specifically configured to acquire a reference lesion area;
the training unit is specifically configured to determine a negative sample parameter based on a similarity between the reference lesion area and the lesion area of the first image;
the training unit is specifically configured to determine a first difference between the reference lesion area and a lesion area of the first image;
the training unit is specifically configured to determine a second difference portion between a focal region of the second image and a focal region of the first image;
the training unit is specifically configured to determine a positive sample parameter according to the first difference portion and the second difference portion;
the training unit is specifically configured to determine an evaluation index based on the negative sample parameter and the positive sample parameter, where the evaluation index is used to indicate prediction accuracy of the target model.
Optionally, in some possible implementation manners of the present application, the training unit is specifically configured to determine difference information in the focus image pair if the evaluation indicator indicates that the target model does not meet a preset indicator;
the training unit is specifically configured to review the pair of focus images based on the difference information to update the pair of focus images, and train the target model based on the updated pair of focus images.
Optionally, in some possible implementations of the present application, the processing unit is specifically configured to input the predicted image into the trained target model to obtain a loop image;
the processing unit is specifically configured to determine that the predicted image is a prediction result if the loop image is the same as the predicted image;
and the processing unit is specifically configured to predict a change condition of the target lesion based on the loop image if the loop image is different from the predicted image.
A third aspect of the present application provides a computer device comprising: a memory, a processor, and a bus system; the memory is used for storing program code; the processor is configured to execute, according to instructions in the program code, the method for processing a lesion image according to the first aspect or any one of its implementations.
A fourth aspect of the present application provides a computer-readable storage medium having instructions stored therein which, when executed on a computer, cause the computer to execute the method for processing a lesion image according to the first aspect or any one of its implementations.
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method for processing a lesion image provided in the first aspect or the various alternative implementations of the first aspect.
According to the technical scheme, the embodiment of the application has the following advantages:
obtaining a to-be-predicted image containing a target focus and inputting it into an encoder in a target model for vector representation to obtain an encoding vector; then inputting the encoding vector into a mapper in the target model to obtain a target vector, wherein the mapper is obtained by training based on focus image pairs, each composed of training images obtained by detection based on a time sequence relation and containing the target focus; and then inputting the target vector into a decoder in the target model to perform image prediction, so as to obtain a predicted image containing the target focus changed based on the time sequence relation. Therefore, lesion change prediction based on artificial intelligence is realized: the target model is trained in an image-to-image manner, and the training data are acquired based on the actual change process of the focus, so that the target model gains the ability to infer focus images, and the accuracy of focus change prediction is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only embodiments of the present application, and those skilled in the art can obtain other drawings from the provided drawings without creative effort.
FIG. 1 is a diagram of a network architecture in which a lesion image processing system operates;
fig. 2 is a flowchart of a lesion image processing method according to an embodiment of the present disclosure;
fig. 3 is a flowchart of a method for processing a lesion image according to an embodiment of the present disclosure;
fig. 4 is a scene schematic diagram of a method for processing a lesion image according to an embodiment of the present disclosure;
fig. 5 is a schematic view of another exemplary embodiment of a method for processing a lesion image;
fig. 6 is a schematic view of another exemplary embodiment of a method for processing a lesion image;
fig. 7 is a schematic view of another exemplary embodiment of a method for processing a lesion image;
fig. 8 is a flowchart of another lesion image processing method according to an embodiment of the present disclosure;
fig. 9 is a schematic view of another exemplary embodiment of a method for processing a lesion image;
fig. 10 is a schematic structural diagram of a lesion image processing apparatus according to an embodiment of the present disclosure;
fig. 11 is a schematic structural diagram of a terminal device according to an embodiment of the present application;
fig. 12 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
The embodiments of the present application provide a method for processing a lesion image and a related device, which can be applied to a system or program containing a lesion image processing function in a terminal device. An image to be predicted containing a target lesion is obtained and input into an encoder in a target model for vector representation to obtain an encoding vector; the encoding vector is then input into a mapper in the target model to obtain a target vector, wherein the mapper is obtained by training based on lesion image pairs, each composed of training images obtained by detection based on a time sequence relation and containing the target lesion; the target vector is then input into a decoder in the target model for image prediction, so as to obtain a predicted image containing the target lesion changed based on the time sequence relation. This realizes lesion change prediction based on artificial intelligence: the target model is trained in an image-to-image manner, and the training data are acquired based on the actual change process of the lesion, so that the target model gains the ability to infer lesion images, improving the accuracy of lesion change prediction.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims of the present application and in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "corresponding" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, some nouns that may appear in the embodiments of the present application are explained.
Computed Tomography (CT): scanning cross-sections of a certain part of the human body one by one, using precisely collimated X-ray beams, gamma rays, ultrasonic waves, etc., together with highly sensitive detectors.
Cerebral hemorrhage: bleeding caused by non-traumatic rupture of blood vessels within the brain parenchyma.
Dice: a medical image segmentation evaluation index, generally used to calculate the similarity of two samples. Its value ranges from 0 to 1, where 1 indicates the best segmentation result and 0 the worst.
Dice of Change: a prediction accuracy index for the hematoma change region between two CT scans.
It should be understood that the method for processing a lesion image provided in the present application may be applied to a system or program containing a lesion image processing function in a terminal device, for example an auxiliary medical application. Specifically, the lesion image processing system may operate in the network architecture shown in fig. 1, which is a diagram of the network architecture in which the lesion image processing system runs. As can be seen from the figure, the system can process lesion images from multiple information sources: lesion images and image sequences at different time points are obtained through detection operations on the terminal side (a medical detection instrument) and then sent to a server for analysis, so that the server, after collecting a certain number of lesion images, performs sorting and training in order to predict the development of a lesion. It is understood that fig. 1 shows various terminal devices, which may be computer devices; in an actual scenario, more or fewer types of terminal devices may participate in the lesion image processing, the specific number and types being determined by the actual scenario and not limited herein. In addition, fig. 1 shows one server, but in an actual scenario multiple servers may participate, the specific number of servers being determined by the actual scenario.
In this embodiment, the server may be an independent physical server, a server cluster or distributed system formed by multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The terminal may be, but is not limited to, a medical detection device, a smartphone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and may be connected to form a blockchain network, which is not limited herein.
It is understood that the above-mentioned lesion image processing system may run on a personal mobile terminal, for example as an auxiliary medical application; it may also run on a server, or on a third-party device, to provide lesion image processing and obtain processing results for an information source. The specific lesion image processing system may run in the above devices in the form of a program, may run as a system component in the above devices, or may serve as one of the cloud service programs; the specific operation mode is determined by the actual scenario and is not limited herein.
With the rapid development of medical technology, more and more examination items appear in medical scenarios, and identifying and predicting lesions on the detection images generated during medical examination has become a difficult problem.
Generally, predicting the lesion area in a medical image requires identification based on the professional knowledge and experience of a physician, placing high demands on the physician's professional skill.
However, manual prediction is time-consuming and labor-intensive, and because it is affected by subjective factors, the lesion prediction results are unstable, which affects the accuracy of lesion change prediction.
In order to solve the above problems, the present application provides a method for processing lesion images through Machine Learning (ML). Machine learning is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other subjects. It specializes in studying how computers simulate or realize human learning behavior in order to acquire new knowledge or skills and to reorganize existing knowledge structures to continuously improve performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and teaching-based learning.
Specifically, the method is applied to the lesion image processing framework shown in fig. 2, which is a framework diagram of lesion image processing provided in an embodiment of the present application. A user performs an examination operation by controlling a medical device, so that the server receives lesion images and sorts them by time to obtain lesion image pairs; a model can then be trained on these lesion image pairs, and the trained model can be used for lesion change prediction.
It is understood that the method provided by the present application may be a program written as processing logic in a hardware system, or may be a lesion image processing apparatus in which the processing logic is implemented in an integrated or external manner. As one implementation, the lesion image processing apparatus acquires an image to be predicted containing a target lesion and inputs it into an encoder in a target model for vector representation to obtain an encoding vector; the encoding vector is then input into a mapper in the target model to obtain a target vector, wherein the mapper is trained based on lesion image pairs composed of training images detected according to a temporal relationship and containing the target lesion; the target vector is then input into a decoder in the target model for image prediction, so as to obtain a predicted image containing the target lesion changed based on the temporal relationship. This realizes lesion change prediction based on artificial intelligence: the target model is trained in an image-to-image manner with training data acquired from the actual change process of the lesion, so that the target model gains the ability to infer lesion images, improving the accuracy of lesion change prediction.
The scheme provided by the embodiment of the application relates to an artificial intelligence machine learning technology, and is specifically explained by the following embodiment:
With reference to the above flow architecture, the method for processing a lesion image in the present application will be introduced below. Please refer to fig. 3, which is a flowchart of a method for processing a lesion image according to an embodiment of the present application. The method may be executed by a terminal, by a server, or by both; the following description takes the terminal as an example. The embodiment of the present application includes at least the following steps:
301. Acquire an image to be predicted containing the target lesion, and input it into the encoder in the target model for vector representation to obtain an encoding vector.
In this embodiment, the target lesion may be a lesion at various positions, including but not limited to lesion types such as cerebral hematoma, tumor, and tuberculosis; the present application takes the prediction of cerebral hematoma change as an example.
Specifically, the target model is a generative adversarial network, comprising an encoder, a mapper, a decoder, and a discriminator. The encoder may be composed of several convolutional layers and a fully-connected layer, and is used to perform vector representation on the image to be predicted so as to facilitate the subsequent mapping process.
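For concreteness, the following is a minimal PyTorch-style sketch of one plausible layout of the four modules. The layer roles follow the description (encoder: convolutional layers plus a fully-connected layer; decoder: a fully-connected layer plus deconvolutional layers; discriminator: convolutional layers plus a fully-connected layer; mapper: fully-connected layers), and the 1024-dimensional embedding for a 128 × 128 CT comes from the example given later; all other choices (channel widths, strides, activations, class names) are illustrative assumptions, not the patented architecture.

```python
import torch
import torch.nn as nn

EMB = 1024  # low-dimensional embedding size (128x128 CT -> 1024-d vector, per the example below)

class Encoder(nn.Module):  # several conv layers + a fully-connected layer
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 4, 2, 1), nn.LeakyReLU(0.2),    # 128 -> 64
            nn.Conv2d(32, 64, 4, 2, 1), nn.LeakyReLU(0.2),   # 64 -> 32
            nn.Conv2d(64, 128, 4, 2, 1), nn.LeakyReLU(0.2),  # 32 -> 16
        )
        self.fc = nn.Linear(128 * 16 * 16, EMB)

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))

class Decoder(nn.Module):  # a fully-connected layer + several deconv layers
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(EMB, 128 * 16 * 16)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, 2, 1), nn.Tanh(),  # output in [-1, 1], matching the normalization
        )

    def forward(self, z):
        return self.deconv(self.fc(z).view(-1, 128, 16, 16))

class Mapper(nn.Module):  # fully-connected layers; converts CT1 to CT2 in the embedding dimension
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(EMB, EMB), nn.ReLU(), nn.Linear(EMB, EMB))

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):  # several conv layers + a fully-connected layer
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, 2, 1), nn.LeakyReLU(0.2),
        )
        self.fc = nn.Linear(64 * 32 * 32, 1)  # real/fake logit

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))
```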
302. Input the encoding vector into the mapper in the target model to perform vector conversion of image dimensions, so as to obtain a target vector.
In this embodiment, the target vector indicating the image after the target lesion has developed is obtained through vector conversion of image dimensions. This is because the target model of the present application is trained on the discrimination and correspondence of image pairs: the mapper is trained based on lesion image pairs, each composed of training images detected according to a temporal relationship and containing the target lesion. The temporal relationship may be a fixed time period (for example, two or more images 24 hours apart), a dynamic time period (for example, two or more lesion development images at different treatment stages), or a time period set according to the development of the lesion (for example, two or more images before and after the lesion changes, i.e., the interval duration is not fixed).
It will be appreciated that the mapper training process, i.e., the entire training process of the target model, may involve parameter adjustments of the encoder, the decoder, and the discriminator.
The following describes the training process of the model.
Firstly, a plurality of lesion image pairs are collected, wherein each pair comprises a first image and a second image; the first image is obtained by detecting the target lesion at a first time node, the second image is obtained by detecting the target lesion at a second time node, and the second time node is after the first time node.
In a possible scenario, as shown in fig. 4, which is a schematic view of another scenario of a method for processing a lesion image according to an embodiment of the present application, a lesion image pair is shown: a first image containing lesion area A1 at a first time node, and a second image containing lesion area A2 at a second time node. Lesion area A2 has changed from lesion area A1, thereby illustrating the change in the lesion.
The image-pair training set is constructed from CT images acquired from a large number of patients at different time points, so that the model can finally predict the possible future state of the lesion based on a single CT image. In the present application, the CT obtained at the earlier time node is defined as CT1, and the image obtained at the later time node is defined as CT2.
Optionally, because different lesions have different change periods, the period duration may be considered when collecting lesion image pairs: first determine the type information corresponding to the target lesion; then determine the lesion change period based on the type information; then collect a plurality of lesion image pairs using the change duration corresponding to the lesion change period as the collection interval. This ensures the accuracy of the acquired lesion image pairs, i.e., that their development state is relatively determined.
In a possible scenario, the lesion image pair is obtained by computed tomography, the target lesion is a cerebral hemorrhage region obtained by computed tomography, and the predicted image is used to predict the variation trend of the cerebral hemorrhage region.
The target model is then trained based on the lesion image pairs. The training process of the target model comprises a reconstruction task and a prediction task: the reconstruction task is used for image reconstruction based on the first image, and the prediction task is used for learning the correspondence between the first image and the second image. Specifically, the target model may be a deep generative adversarial network comprising four modules, namely an Encoder, a Decoder, a Discriminator, and a Mapper. The first image in the lesion image pair is subjected to identity mapping (Decoder decoding after Encoder encoding) to obtain a reconstructed decoded image; the reconstruction task is then executed according to the process in which the discriminator drives the reconstructed decoded image to restore the first image; if the similarity between the reconstructed decoded image and the first image reaches a threshold, the reconstruction task ends; the mapper then executes the prediction task based on the correspondence between the first image and the second image, so as to train the target model.
Specifically, the process of predicting the subsequent change of a CT image is divided into two subtasks. The first subtask is to reconstruct the image; once the model has the capability to completely recover a CT from the network, it proceeds to the second subtask, change prediction. Corresponding to the two subtasks, the training of the model is divided into two stages, identity mapping and target mapping; the training processes of the two stages are described below.
For the identity mapping process, refer to the scene architecture shown in fig. 5; fig. 5 is a scene schematic diagram of another method for processing a lesion image according to an embodiment of the present application. First, the encoder performs an encoding operation on the first image in the lesion image pair to compress its feature dimension and obtain a first feature vector; then the decoder performs a decoding operation on the first feature vector, under a similarity constraint based on a first constraint function, to obtain a reconstructed decoded image whose feature dimension is that of the first feature vector restored to the dimension of the first image. Specifically, the first constraint function may be:
L_rec = || x1 - D(E(x1)) ||^2
where E is the output of the encoder and D is the output of the decoder; the function is used to minimize the difference between the first image x1 and the reconstructed decoded image D(E(x1)).
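As a minimal sketch, assuming the mean-square-error form of the similarity constraint stated later in this description and the module classes sketched above, the first constraint can be computed as:

```python
import torch.nn.functional as F

def reconstruction_loss(encoder, decoder, ct1):
    # First constraint: minimize the difference between CT1 and its
    # encode-then-decode reconstruction (the identity mapping).
    recon = decoder(encoder(ct1))
    return F.mse_loss(recon, ct1)
```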
Further, the discriminator performs a discrimination operation on the reconstructed decoded image and the first image to obtain a discrimination result: under the second constraint function, the discriminator aims to single out the first image as the real one, while under the third constraint function, the encoder and decoder aim to have the reconstructed decoded image judged as real. The second constraint function may be:
L_Dis = -log(Dis(x1)) - log(1 - Dis(D(E(x1))))
where E is the output of the encoder, D is the output of the decoder, and Dis is the output of the discriminator; the function is used to drive the discriminator to distinguish the generated image from the real image (the first image) as much as possible.
The third constraint function may be:
L_G = -log(Dis(D(E(x1))))
where E is the output of the encoder, D is the output of the decoder, and Dis is the output of the discriminator; the function is used to drive the encoder and decoder to deceive the discriminator as much as possible, i.e., to make the discriminator judge the reconstructed decoded image as real.
The reconstruction task can thus be performed based on the adjustment procedure of the second constraint function and the third constraint function.
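Continuing the sketch, and assuming the standard non-saturating GAN form (the description states these objectives only in words, so the exact functional form is an assumption), the second and third constraints might look like:

```python
import torch
import torch.nn.functional as F

def discriminator_loss(discriminator, real, fake):
    # Second constraint: judge the real first image as real (label 1)
    # and the reconstructed decoded image as fake (label 0).
    real_logit = discriminator(real)
    fake_logit = discriminator(fake.detach())  # no gradient into encoder/decoder here
    return (F.binary_cross_entropy_with_logits(real_logit, torch.ones_like(real_logit))
            + F.binary_cross_entropy_with_logits(fake_logit, torch.zeros_like(fake_logit)))

def generator_loss(discriminator, fake):
    # Third constraint: the encoder and decoder try to deceive the discriminator,
    # i.e. have the reconstructed decoded image judged as real.
    fake_logit = discriminator(fake)
    return F.binary_cross_entropy_with_logits(fake_logit, torch.ones_like(fake_logit))
```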
Optionally, the identity mapping process may further warm up the encoder and the decoder, i.e., tentatively execute the process of producing a reconstructed decoded image and restoring it to the first image: a preset number of warm-up steps is first acquired, and the similarity constraint process of the first constraint function is then performed cyclically based on that number of steps, so as to warm up the encoding and decoding operations.
In one possible scenario, the identity mapping first warms up the two modules, the encoder and the decoder: the warm-up compresses a pre-processed CT1 through the encoder to a low-dimensional representation (for example, compressing a 128 × 128 CT image to a 1024-dimensional vector), then the decoder restores the corresponding low-dimensional representation to the original dimension, with a similarity constraint applied against the original image; the constraint function is the first constraint function. Specifically, the encoder in the model is composed of several convolutional layers and a fully-connected layer, and the decoder is composed of a fully-connected layer and several deconvolutional layers.
Further, after a certain number of warm-up steps, a discriminator composed of several convolutional layers and a fully-connected layer is added to the training. The image decoded by the decoder and the real image are simultaneously input into the discriminator for discrimination; the discriminator needs to distinguish the generated image from the real image as much as possible, and the encoder and decoder need to deceive the discriminator as much as possible.
Specifically, the constraint function of the discriminator is the second constraint function, and the constraint function of the encoder and decoder (the generator) is the third constraint function. During training, the generator and the discriminator are alternately trained according to their respective optimization constraint functions until the model converges.
It will be appreciated that alternating training is constantly performed by two networks with opposite objectives. At convergence, if the discriminator cannot judge the source of a sample, the generation network can generate samples that conform to the real data distribution, which ensures the authenticity of the images output by the target model, i.e., that they resemble images output by an actual medical detection instrument.
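Putting the warm-up and the alternating stage together, a hypothetical training loop for the identity-mapping stage (using the loss helpers above) could be as follows; the warm-up step count, learning rate, and choice of Adam are illustrative assumptions:

```python
import itertools
import torch
import torch.nn.functional as F

def train_identity_mapping(encoder, decoder, discriminator, loader,
                           warmup_steps=1000, total_steps=10000, lr=2e-4):
    opt_g = torch.optim.Adam(
        itertools.chain(encoder.parameters(), decoder.parameters()), lr=lr)
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=lr)
    data = itertools.cycle(loader)  # the first stage may use unpaired, unlabeled CTs
    for step in range(total_steps):
        ct1 = next(data)
        recon = decoder(encoder(ct1))  # identity mapping of CT1
        if step < warmup_steps:
            # Warm-up: only the first (similarity) constraint is optimized.
            loss = F.mse_loss(recon, ct1)
            opt_g.zero_grad()
            loss.backward()
            opt_g.step()
            continue
        # Discriminator step (second constraint) ...
        d_loss = discriminator_loss(discriminator, ct1, recon)
        opt_d.zero_grad()
        d_loss.backward()
        opt_d.step()
        # ... alternating with an encoder/decoder step (first + third constraints).
        g_loss = F.mse_loss(recon, ct1) + generator_loss(discriminator, recon)
        opt_g.zero_grad()
        g_loss.backward()
        opt_g.step()
```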
On the other hand, for the target mapping process, refer to the scene architecture shown in fig. 6; fig. 6 is a scene schematic diagram of another method for processing a lesion image according to an embodiment of the present application. First, the first image is vector-represented by the encoder to obtain a low-dimensional vector; the low-dimensional vector is then input into the mapper to obtain a prediction vector; the prediction vector is then converted toward the vector corresponding to the second image, so as to adjust a fourth constraint function; and the prediction task is executed through the adjustment process of the fourth constraint function, so as to train the target model. Specifically, the fourth constraint function may be:
L_map = L( D(M(E(x1))), x2 )
where E is the output of the encoder, D is the output of the decoder, M is the output of the mapper, and L is the similarity constraint.
In one possible scenario, the modules involved in training during the target mapping process are the mapper, the encoder, and the decoder, where the parameters of the encoder and decoder are inherited from the previous stage. During training, after CT1 is passed through the encoder to obtain its low-dimensional embedding (feature vector 1), the embedding is input into the mapper to obtain a prediction embedding; the prediction embedding is decoded by the decoder to obtain a predicted CT image CTpred, and CTpred is constrained to be close to CT2 (whose vector representation is feature vector 2). The mapper is composed of several fully-connected layers; its effect is to convert CT1 to CT2 in the embedding dimension. Throughout this training stage, the parameters of all modules except the mapper are fixed.
It is to be understood that the similarity constraint L in the above two training stages is in both cases a mean square error constraint. Since the optimization goal of the first stage is to reconstruct the image, a large amount of unlabeled data can be used in that training stage, thereby ensuring the reconstruction quality of the data.
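A sketch of one target-mapping training step under the same assumptions; per the description, only the mapper receives updates, and the similarity constraint is mean square error:

```python
import torch
import torch.nn.functional as F

def train_mapper_step(encoder, decoder, mapper, opt_m, ct1, ct2):
    # Encoder and decoder inherit their parameters from the identity-mapping
    # stage and stay fixed; only the mapper is optimized.
    with torch.no_grad():
        emb1 = encoder(ct1)            # low-dimensional embedding (feature vector 1)
    ct_pred = decoder(mapper(emb1))    # prediction embedding decoded to CTpred
    loss = F.mse_loss(ct_pred, ct2)    # fourth constraint: CTpred close to CT2
    opt_m.zero_grad()
    loss.backward()
    opt_m.step()
    return loss.item()
```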
Optionally, the first image and the second image participating in training may be preprocessed: the images in the lesion image pair are first cropped; the cropped images are then aligned based on a preset angle to obtain an aligned image pair; the images in the aligned pair are adjusted to the same feature dimension; and the adjusted images are normalized so as to update the lesion image pair. This ensures that the lesion positions of the first image and the second image are aligned for subsequent vector processing.
Optionally, because different lesions correspond to different body parts, a targeted cropping process may be performed: first determine the body part information corresponding to the target lesion; then determine the crop items based on the body part information; then crop the images in the lesion image pair according to the crop items.
In one possible scenario, the main goal of the above preprocessing is to align CT1 with CT2 as much as possible and to remove all image differences not caused by the development of the lesion itself. First, the regions other than brain tissue are removed from the CT by image cropping, which includes two parts: head cropping and skull stripping. CT1 and CT2 are then registered to the same angle using image registration techniques, so that the location of the lesion is aligned. Further, to remove differences in scan layer thickness, all images are resampled to the same layer thickness, and all data are fixed to the same dimensions by padding and cropping. Finally, the images are normalized, mapping their value range to -1 to 1, which ensures the alignment between CT1 and CT2 and improves data accuracy during model training.
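A hedged sketch of this preprocessing pipeline; `strip_skull` and `register_to` are hypothetical helpers standing in for the head-cropping/skull-stripping and registration steps, and resampling to a common layer thickness is assumed to have happened upstream:

```python
import numpy as np

def preprocess_pair(ct1, ct2, target_shape=(32, 128, 128)):
    # Hypothetical helpers: remove non-brain regions, then register CT2 onto
    # CT1 so that the lesion locations are aligned to the same angle.
    ct1, ct2 = strip_skull(ct1), strip_skull(ct2)
    ct2 = register_to(ct2, reference=ct1)

    def fix_dims(vol):
        # Pad then crop every axis so all data share the same dimensions.
        pads = [(0, max(0, t - s)) for s, t in zip(vol.shape, target_shape)]
        vol = np.pad(vol, pads)
        return vol[tuple(slice(0, t) for t in target_shape)]

    def normalize(vol):
        # Map the value range of the image to [-1, 1].
        lo, hi = float(vol.min()), float(vol.max())
        return 2.0 * (vol - lo) / (hi - lo + 1e-8) - 1.0

    return normalize(fix_dims(ct1)), normalize(fix_dims(ct2))
```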
Optionally, the effect of the model may be evaluated after the target model is trained. This is because the prediction in the present application aims to accurately depict the hematoma region on CT2: for samples without much change (negative samples), the prediction should preserve the original state as much as possible, while for samples containing much change, the changed part should be depicted as accurately as possible. A reference lesion region is therefore obtained first; a negative sample parameter is then determined based on the similarity between the reference lesion region and the lesion region of the first image; a first difference portion between the reference lesion region and the lesion region of the first image is determined; a second difference portion between the lesion region of the second image and the lesion region of the first image is further determined; a positive sample parameter is then determined from the first difference portion and the second difference portion; and an evaluation index is determined based on the negative sample parameter and the positive sample parameter, the evaluation index indicating the prediction precision of the target model. For example, the hematoma region of a generated image is first segmented by a hematoma segmentation method to obtain Mgen (the reference lesion region). For negative samples (samples not satisfying the clinical enlargement criterion), the Dice between Mgen and the hematoma region of CT1 is computed and defined as Dneg. For positive samples, the Dice between the difference of Mgen from CT1 and the difference of CT2 from CT1 is computed and defined as Dpos. Further, an average of the two is calculated by a weighting formula as the model evaluation index, where the weighting formula may be:
Avg = 2 * (Dneg * Dpos) / (Dneg + Dpos)
where Avg is the model index, Dneg is the similarity between Mgen and the hematoma region of CT1, and Dpos is the similarity between the difference of Mgen from CT1 and the difference of CT2 from CT1.
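The evaluation can be sketched as follows, assuming binary hematoma masks and interpreting the "difference" regions as set differences; `segment_hematoma` is a hypothetical stand-in for the hematoma segmentation method:

```python
import numpy as np

def dice(a, b, eps=1e-8):
    # Dice similarity of two binary masks: 1 is the best overlap, 0 the worst.
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum() + eps)

def avg_index(generated_ct, ct1_mask, ct2_mask):
    m_gen = segment_hematoma(generated_ct)  # hypothetical segmentation helper
    d_neg = dice(m_gen, ct1_mask)           # negative samples: preserve the original state
    d_pos = dice(m_gen & ~ct1_mask,         # predicted change region
                 ct2_mask & ~ct1_mask)      # real change region (Dice of Change)
    return 2.0 * d_neg * d_pos / (d_neg + d_pos + 1e-8)  # weighted average Avg
```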
Optionally, if the evaluation index indicates that the target model does not meet a preset index, the lesion image pairs participating in training may be audited: the difference information in the lesion image pairs is determined, and the pairs are reviewed based on the difference information (for example, manually reviewing the correspondence of pairs with smaller differences) so as to update the lesion image pairs; the target model is then trained based on the updated pairs.
303. Input the target vector into the decoder in the target model to perform image prediction, so as to obtain a predicted image containing the target lesion changed based on the temporal relationship.
In this embodiment, the parameter configuration of the decoder follows the model training process described above. After the image is restored, the result of the lesion development prediction process is obtained, and the prediction process is applicable to lesions of the same type. For example, in the scenario shown in fig. 7, which is a scene schematic diagram of another method for processing a lesion image provided in an embodiment of the present application, the figure shows a comparison between a predicted image obtained, based on the trained target model, from an image to be predicted containing the target lesion and the real image; the generated image is clear and consistent with the real image.
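End to end, inference under the sketched modules is simply encode, map, decode, mirroring steps 301 to 303 (assuming the input CT has been preprocessed as described above):

```python
import torch

@torch.no_grad()
def predict_lesion_change(encoder, mapper, decoder, ct):
    encoding = encoder(ct)      # step 301: vector representation -> encoding vector
    target = mapper(encoding)   # step 302: vector conversion of image dimensions
    return decoder(target)      # step 303: predicted image after the lesion changes
```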
It is understood that the image to be predicted in the present application may carry identification marks for simple analysis, or may carry none: even if identification marks (such as hematoma region annotations) are absent, the data can still be used for training, because the supervision information the model relies on is derived from objective image states rather than from subjective human annotation. The influence of annotation is therefore smaller, and the application range is wide.
Combining the above embodiments, the present application shows the change trend of a hematoma from the angle of image prediction. Specifically, an image to be predicted containing the target lesion is acquired and input into the encoder in the target model for vector representation to obtain an encoding vector; the encoding vector is then input into the mapper in the target model to obtain a target vector, the mapper having been trained based on lesion image pairs composed of training images detected according to a temporal relationship and containing the target lesion; the target vector is then input into the decoder in the target model for image prediction, so as to obtain a predicted image containing the target lesion changed based on the temporal relationship. This realizes lesion change prediction based on artificial intelligence: the target model is trained in an image-to-image manner with training data acquired from the actual change process of the lesion, so that the target model gains the ability to infer lesion images, improving the accuracy of lesion change prediction.
The above embodiments describe a single-pass prediction process. In a possible scenario, the present application may further perform a loop-identification process, which is described below. Referring to fig. 8, fig. 8 is a flowchart of another method for processing a lesion image according to an embodiment of the present disclosure, which includes the following steps:
801. A target detection image is acquired in response to a detection operation.
In this embodiment, the detection operation may be an operation of a detection person on the medical detection device, that is, after a preliminary target detection image is obtained by initiating detection, a subsequent prediction process is automatically triggered.
802. And inputting the target detection image into a target model to obtain a first prediction image.
In this embodiment, the process of predicting the target detection image is similar to step 303 in the embodiment shown in fig. 3, and is not repeated here.
803. Case information of the target detection image is judged.
In this embodiment, the target detection image is obtained, and the medical staff can obtain preliminary case information, for example: hematoma diffusion, hematoma shrinkage, etc.
804. And inputting the first prediction image into the target model to obtain a second prediction image.
In this embodiment, since it is not yet determined whether the first predicted image is the final form of the lesion, that is, whether the lesion may further expand, shrink, or remain unchanged, the first predicted image may be input into the target model for a second round of prediction, so that the second predicted image is obtained.
805. Case inference is performed based on the second prediction image, and the first prediction image and the second prediction image are presented on the basis of the target detection image.
In this embodiment, if the loop image (the second predicted image) is the same as the predicted image (the first predicted image), the lesion may have reached its current stable state, and the predicted image may be determined as the prediction result. If the loop image differs from the predicted image, the change of the target lesion is further predicted based on the loop image, or the second predicted image is determined as the prediction result; that is, the lesion is still changing, and the first predicted image and the second predicted image may be displayed on the basis of the target detection image.
In a possible scenario, as shown in fig. 9, fig. 9 is a schematic view of another scenario of a method for processing a lesion image according to an embodiment of the present application; the figure shows the process of displaying the first predicted image and the second predicted image on the basis of the target detection image, that is, the lesion gradually extends from the target detection image B1 through the first predicted image B2 to the second predicted image B3, thereby dynamically displaying the change of the lesion.
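The loop-identification logic of steps 802 to 805 can be sketched as follows, assuming a model callable that composes the encoder, mapper, and decoder; the similarity criterion and its threshold are placeholders, since the text only requires deciding whether the loop image is the same as the prediction:

    import torch

    @torch.no_grad()
    def loop_predict(detect_img, model, max_rounds=5, same_thresh=0.99):
        """Feed each prediction back into the model until it stabilizes."""
        def similarity(a, b):
            # hypothetical criterion: cosine similarity of flattened images
            a, b = a.flatten(), b.flatten()
            return float(torch.dot(a, b) / (a.norm() * b.norm() + 1e-8))

        preds = []
        current = detect_img
        for _ in range(max_rounds):
            nxt = model(current)             # first, second, ... predicted image
            preds.append(nxt)
            if len(preds) > 1 and similarity(preds[-2], nxt) >= same_thresh:
                break                        # loop image same as prediction: stable
            current = nxt
        return preds                         # shown alongside the detection image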
This embodiment is based on a generative adversarial network, and can generate, from the acquired CT image, a corresponding CT image reflecting the change trend of the target lesion, thereby providing a richer reference basis for the prognostic analysis of the patient.
In order to better implement the above-mentioned aspects of the embodiments of the present application, the following also provides related apparatuses for implementing the above-mentioned aspects. Referring to fig. 10, fig. 10 is a schematic structural diagram of a device for processing a lesion image according to an embodiment of the present application, wherein the processing device 1000 includes:
an encoding unit 1001, configured to obtain a to-be-predicted image including a target lesion, and input the to-be-predicted image into an encoder in a target model for vector representation to obtain an encoding vector;
a mapping unit 1002, configured to input the coding vector into a mapper in the target model to perform vector transformation of an image dimension to obtain a target vector, where the mapper is obtained by training based on a pair of lesion images, the pair of lesion images is composed of training images obtained by detection based on a timing relationship, and the training images include the target lesion;
a processing unit 1003, configured to input the target vector into a decoder in the target model to perform image prediction, so as to obtain a predicted image including the target lesion after being changed based on the temporal relationship.
Optionally, in some possible implementations of the present application, the processing apparatus of the lesion image further includes:
a training unit 1004, configured to acquire a plurality of pairs of lesion images, where the pairs of lesion images include a first image and a second image that are the training images, the first image is an image obtained by detecting a target lesion at a first time node, the second image is an image obtained by detecting the target lesion at a second time node, and the second time node is after the first time node;
a training unit 1004, configured to train the target model based on the lesion image pairs, where a training process of the target model includes a reconstruction task and a prediction task, the reconstruction task is configured to perform image reconstruction based on the first image, and the prediction task is configured to learn a correspondence between the first image and the second image.
Optionally, in some possible implementations of the present application, the training unit 1004 is specifically configured to perform identity mapping on the first image in the lesion image pair to obtain a reconstructed decoded image;
the training unit 1004 is specifically configured to execute the reconstruction task according to a process of restoring the reconstructed decoded image to the first image;
the training unit 1004 is specifically configured to end the reconstruction task if the similarity between the reconstructed decoded image and the first image reaches a threshold;
the training unit 1004 is specifically configured to execute the prediction task based on a correspondence between the first image and the second image, so as to train the target model.
Optionally, in some possible implementations of the present application, the training unit 1004 is specifically configured to perform an encoding operation on the first image in the lesion image pair, so as to compress a feature dimension of the first image to obtain a first feature vector;
the training unit 1004 is specifically configured to perform a decoding operation on the first feature vector to perform similarity constraint based on a first constraint function to obtain a reconstructed decoded image, where a feature dimension of the reconstructed decoded image is a feature dimension of the first feature vector restored to the first image;
the training unit 1004 is specifically configured to perform a discrimination operation on the reconstructed decoded image and the first image to obtain a discrimination result, where a target of the discrimination operation is that the discrimination result output based on a second constraint function is the first image, and a target of the encoding operation and the decoding operation is that the discrimination result output based on a third constraint function is the reconstructed decoded image;
the training unit 1004 is specifically configured to execute the reconstruction task based on the adjustment process of the second constraint function and the third constraint function.
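One way to read the three constraint functions is as an L1 reconstruction loss (first constraint) plus the two adversarial losses of a GAN (second and third constraints). A condensed PyTorch-style sketch under that assumption follows; the module names, optimizers, and loss choices are illustrative rather than the disclosed configuration, and the discriminator is assumed to output probabilities in (0, 1):

    import torch
    import torch.nn.functional as F

    def reconstruction_step(x1, encoder, decoder, discriminator,
                            opt_g, opt_d, warmup):
        """x1: the first image of a lesion image pair."""
        z = encoder(x1)                # compress the feature dimension
        x_rec = decoder(z)             # restore the feature dimension

        # First constraint: similarity between reconstruction and input.
        loss_rec = F.l1_loss(x_rec, x1)
        if warmup:                     # preheating: reconstruction loss only
            opt_g.zero_grad(); loss_rec.backward(); opt_g.step()
            return

        # Second constraint: discriminator labels x1 real, x_rec fake.
        d_real = discriminator(x1)
        d_fake = discriminator(x_rec.detach())
        loss_d = (F.binary_cross_entropy(d_real, torch.ones_like(d_real))
                  + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()

        # Third constraint: encoder/decoder try to fool the discriminator.
        d_gen = discriminator(x_rec)
        loss_g = loss_rec + F.binary_cross_entropy(d_gen, torch.ones_like(d_gen))
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()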
Optionally, in some possible implementation manners of the present application, the training unit 1004 is specifically configured to obtain a preset preheating step number;
the training unit 1004 is specifically configured to perform a similarity constraint process of the first constraint function based on the preset preheating step number cycle, so as to preheat the processes of the encoding operation and the decoding operation.
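Continuing the sketch above, the preheating can then be driven by running only the first (similarity) constraint for the preset number of steps; WARMUP_STEPS and pairs_loader below are assumed names:

    WARMUP_STEPS = 1000  # preset preheating step number (assumed value)

    for step, (x1, _x2) in enumerate(pairs_loader):
        reconstruction_step(x1, encoder, decoder, discriminator,
                            opt_g, opt_d, warmup=(step < WARMUP_STEPS))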
Optionally, in some possible implementations of the present application, the training unit 1004 is specifically configured to perform vector representation on the first image to obtain a low-dimensional vector;
the training unit 1004 is specifically configured to input the low-dimensional vector into a mapper to obtain a prediction vector;
the training unit 1004 is specifically configured to convert the prediction vector into a vector corresponding to the second image, so as to adjust a fourth constraint function;
the training unit 1004 is specifically configured to execute the prediction task based on the adjustment process of the fourth constraint function to train the target model.
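The prediction task can be read as constraining the mapped latent vector of the first image to match the latent vector of the second image. A PyTorch-style sketch under that reading follows; using an L1 loss as the fourth constraint function and freezing the encoder at this stage are both assumptions:

    import torch
    import torch.nn.functional as F

    def prediction_step(x1, x2, encoder, mapper, opt_m):
        """Train the mapper to send the first image's code to the second's."""
        with torch.no_grad():          # encoder assumed frozen here
            z1 = encoder(x1)           # low-dimensional vector of the first image
            z2 = encoder(x2)           # vector corresponding to the second image
        z_pred = mapper(z1)            # prediction vector
        loss = F.l1_loss(z_pred, z2)   # adjusts the fourth constraint function
        opt_m.zero_grad()
        loss.backward()
        opt_m.step()
        return loss.item()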
Optionally, in some possible implementations of the present application, the training unit 1004 is specifically configured to crop an image in the lesion image pair;
the training unit 1004 is specifically configured to align the images in the cut lesion image pair based on a preset angle to obtain an aligned image pair;
the training unit 1004 is specifically configured to adjust the images in the aligned image pair to the same feature dimension;
the training unit 1004 is specifically configured to perform normalization processing based on the adjusted images in the aligned image pair, so as to update the lesion image pair.
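A sketch of this crop, align, resize, and normalize preparation, assuming 2-D CT slices stored as numpy arrays; the crop box, rotation angle, and target size are placeholder values, and SciPy is used only as one possible toolkit:

    import numpy as np
    from scipy.ndimage import rotate, zoom

    def preprocess_pair(img1, img2, crop_box=(16, 16, 240, 240),
                        angle=0.0, size=(256, 256)):
        """Crop, align to a preset angle, unify dimensions, and normalize."""
        out = []
        for img in (img1, img2):
            y0, x0, y1, x1 = crop_box
            img = img[y0:y1, x0:x1]                        # crop
            img = rotate(img, angle, reshape=False)        # align by preset angle
            img = zoom(img, (size[0] / img.shape[0],
                             size[1] / img.shape[1]))      # same feature dimension
            img = (img - img.mean()) / (img.std() + 1e-8)  # normalization
            out.append(img.astype(np.float32))
        return tuple(out)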
Optionally, in some possible implementations of the present application, the training unit 1004 is specifically configured to determine body part information corresponding to the target lesion;
the training unit 1004 is specifically configured to determine a crop item based on the body part information;
the training unit 1004 is specifically configured to crop the image in the focus image pair according to the cropping item.
Optionally, in some possible implementations of the present application, the training unit 1004 is specifically configured to determine type information corresponding to the target lesion;
the training unit 1004 is specifically configured to determine a lesion change period based on the type information;
the training unit 1004 is specifically configured to acquire a plurality of pairs of the lesion images according to a variation duration corresponding to the lesion variation period as an acquisition interval.
Optionally, in some possible implementations of the present application, the training unit 1004 is specifically configured to obtain a reference lesion region;
the training unit 1004 is specifically configured to determine a negative sample parameter based on a similarity between the reference lesion region and a lesion region of the first image;
the training unit 1004 is specifically configured to determine a first difference portion between the reference lesion region and a lesion region of the first image;
the training unit 1004 is specifically configured to determine a second difference portion between the lesion region of the second image and the lesion region of the first image;
the training unit 1004 is specifically configured to determine a positive sample parameter according to the first difference portion and the second difference portion;
the training unit 1004 is specifically configured to determine an evaluation index based on the negative sample parameter and the positive sample parameter, where the evaluation index is used to indicate a prediction accuracy of the target model.
Optionally, in some possible implementations of the present application, the training unit 1004 is specifically configured to determine difference information in the focus image pair if the evaluation indicator indicates that the target model does not meet a preset indicator;
the training unit 1004 is specifically configured to review the pair of focus images based on the difference information to update the pair of focus images, and train the target model based on the updated pair of focus images.
Optionally, in some possible implementations of the present application, the processing unit 1003 is specifically configured to input the predicted image into the trained target model to obtain a loop image;
the processing unit 1003 is specifically configured to determine that the predicted image is a prediction result if the loop image is the same as the predicted image;
the processing unit 1003 is specifically configured to, if the loop image is different from the predicted image, perform prediction of a change condition of the target lesion based on the loop image.
With the above apparatus, a to-be-predicted image containing a target lesion is obtained and input into an encoder in the target model for vector representation to obtain an encoding vector; the encoding vector is then input into a mapper in the target model to obtain a target vector, where the mapper is trained based on lesion image pairs, each composed of training images detected based on a temporal relationship and containing the target lesion; the target vector is then input into a decoder in the target model for image prediction to obtain a predicted image containing the target lesion after the change based on the temporal relationship. A lesion-change prediction process based on artificial intelligence is thereby realized: because the target model is trained image-to-image on data acquired from the actual change process of the lesion, it gains the ability to extrapolate lesion images, and the accuracy of lesion-change prediction is improved.
An embodiment of the present application further provides a terminal device, as shown in fig. 11, which is a schematic structural diagram of another terminal device provided in the embodiment of the present application, and for convenience of description, only a portion related to the embodiment of the present application is shown, and details of the specific technology are not disclosed, please refer to a method portion in the embodiment of the present application. The terminal may be any terminal device including a mobile phone, a tablet computer, a Personal Digital Assistant (PDA), a point of sale (POS), a vehicle-mounted computer, and the like, taking the terminal as the mobile phone as an example:
fig. 11 is a block diagram illustrating a partial structure of a mobile phone related to a terminal provided in an embodiment of the present application. Referring to fig. 11, the cellular phone includes: radio Frequency (RF) circuitry 1110, memory 1120, input unit 1130, display unit 1140, sensors 1150, audio circuitry 1160, wireless fidelity (WiFi) module 1170, processor 1180, and power supply 1190. Those skilled in the art will appreciate that the handset configuration shown in fig. 11 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
The following describes each component of the mobile phone in detail with reference to fig. 11:
RF circuit 1110 may be used for receiving and transmitting signals during a message transmission or a call; in particular, downlink information from a base station is received and passed to the processor 1180 for processing, and uplink data is transmitted to the base station. In general, RF circuit 1110 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuitry 1110 may also communicate with networks and other devices via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), long term evolution (LTE), email, short message service (SMS), etc.
The memory 1120 may be used to store software programs and modules, and the processor 1180 may execute various functional applications and data processing of the mobile phone by operating the software programs and modules stored in the memory 1120. The memory 1120 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. Further, the memory 1120 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The input unit 1130 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the mobile phone. Specifically, the input unit 1130 may include a touch panel 1131 and other input devices 1132. The touch panel 1131, also referred to as a touch screen, can collect touch operations of a user on or near it (for example, operations performed by the user on or near the touch panel 1131 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program. Optionally, the touch panel 1131 may include two parts, namely a touch detection device and a touch controller. The touch detection device detects the touch orientation of the user, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into touch point coordinates, sends the coordinates to the processor 1180, and can receive and execute commands sent by the processor 1180. In addition, the touch panel 1131 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. The input unit 1130 may include other input devices 1132 in addition to the touch panel 1131. In particular, the other input devices 1132 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 1140 may be used to display information input by the user or information provided to the user and various menus of the cellular phone. The display unit 1140 may include a display panel 1141, and optionally, the display panel 1141 may be configured in the form of a Liquid Crystal Display (LCD), an organic light-emitting diode (OLED), or the like. Further, the touch panel 1131 can cover the display panel 1141, and when the touch panel 1131 detects a touch operation on or near the touch panel, the touch panel is transmitted to the processor 1180 to determine the type of the touch event, and then the processor 1180 provides a corresponding visual output on the display panel 1141 according to the type of the touch event. Although in fig. 11, the touch panel 1131 and the display panel 1141 are two independent components to implement the input and output functions of the mobile phone, in some embodiments, the touch panel 1131 and the display panel 1141 may be integrated to implement the input and output functions of the mobile phone.
The handset may also include at least one sensor 1150, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor, wherein the ambient light sensor may adjust the brightness of the display panel 1141 according to the brightness of ambient light, and the proximity sensor may turn off the display panel 1141 and/or the backlight when the mobile phone moves to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications of recognizing the posture of a mobile phone (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer and tapping), and the like; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile phone, further description is omitted here.
Audio circuitry 1160, speaker 1161, and microphone 1162 may provide an audio interface between the user and the mobile phone. The audio circuit 1160 may transmit the electrical signal converted from received audio data to the speaker 1161, which converts it into a sound signal for output; on the other hand, the microphone 1162 converts collected sound signals into electrical signals, which the audio circuit 1160 receives and converts into audio data; the audio data are then processed by the processor 1180 and either transmitted via the RF circuit 1110 to, for example, another mobile phone, or output to the memory 1120 for further processing.
WiFi belongs to short-distance wireless transmission technology, and the cell phone can help a user to receive and send e-mails, browse webpages, access streaming media and the like through the WiFi module 1170, and provides wireless broadband internet access for the user. Although fig. 11 shows the WiFi module 1170, it is understood that it does not belong to the essential constitution of the handset, and can be omitted entirely as needed within the scope not changing the essence of the invention.
The processor 1180 is a control center of the mobile phone, and is connected to various parts of the whole mobile phone through various interfaces and lines, and executes various functions of the mobile phone and processes data by operating or executing software programs and/or modules stored in the memory 1120 and calling data stored in the memory 1120, thereby performing overall monitoring of the mobile phone. Optionally, processor 1180 may include one or more processing units; optionally, the processor 1180 may integrate an application processor and a modem processor, wherein the application processor mainly handles operating systems, user interfaces, application programs, and the like, and the modem processor mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated within processor 1180.
The mobile phone further includes a power supply 1190 (e.g., a battery) for supplying power to each component, and optionally, the power supply may be logically connected to the processor 1180 through a power management system, so that functions of managing charging, discharging, power consumption management, and the like are implemented through the power management system.
Although not shown, the mobile phone may further include a camera, a bluetooth module, etc., which are not described herein.
In the embodiment of the present application, the processor 1180 included in the terminal further has the function of executing the steps of the above method for processing a lesion image.
Referring to fig. 12, fig. 12 is a schematic structural diagram of a server according to an embodiment of the present invention. The server 1200 may vary considerably with configuration or performance, and may include one or more central processing units (CPUs) 1222 (e.g., one or more processors), a memory 1232, and one or more storage media 1230 (e.g., one or more mass storage devices) storing an application program 1242 or data 1244. The memory 1232 and the storage medium 1230 may be transient storage or persistent storage. The program stored in the storage medium 1230 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Still further, the central processor 1222 may be configured to communicate with the storage medium 1230 to execute, on the server 1200, the series of instruction operations in the storage medium 1230.
The server 1200 may also include one or more power supplies 1226, one or more wired or wireless network interfaces 1250, one or more input-output interfaces 1258, and/or one or more operating systems 1241, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
The steps performed by the processing apparatus for a lesion image in the above-described embodiments may be based on the server structure shown in fig. 12.
Also provided in an embodiment of the present application is a computer-readable storage medium, which stores therein instructions for processing a lesion image, and when the instructions are executed on a computer, the instructions cause the computer to perform the steps performed by the processing apparatus for a lesion image in the method described in the foregoing embodiments shown in fig. 3 to 9.
Also provided in an embodiment of the present application is a computer program product including instructions for processing a lesion image, which when executed on a computer causes the computer to perform the steps performed by a device for processing a lesion image in the method as described in the embodiment of fig. 3 to 9.
The embodiment of the present application further provides a system for processing a lesion image, where the system for processing a lesion image may include a device for processing a lesion image in the embodiment described in fig. 10, a terminal device in the embodiment described in fig. 11, or a server described in fig. 12.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a processing apparatus for lesion images, or a network device) to perform all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (15)

1. A method for processing a lesion image, comprising:
acquiring a to-be-predicted image containing a target focus, and inputting the image into an encoder in a target model for vector representation to obtain an encoding vector;
inputting the coding vector into a mapper in the target model for vector conversion of image dimensions to obtain a target vector, wherein the mapper is obtained by training based on a focus image pair, the focus image pair is composed of training images obtained by detection based on a time sequence relation, and the training images contain the target focus;
and inputting the target vector into a decoder in the target model for image prediction to obtain a predicted image containing the target focus after the change of the time sequence relation.
2. The method of claim 1, further comprising:
acquiring a plurality of focus image pairs, wherein each focus image pair comprises a first image and a second image acquired based on the time sequence relation, the first image is an image obtained by detecting the target focus at a first time node, the second image is an image obtained by detecting the target focus at a second time node, and the second time node is behind the first time node;
training the target model based on the focus image pair, wherein the training process of the target model comprises a reconstruction task and a prediction task, the reconstruction task is used for carrying out image reconstruction based on the first image, and the prediction task is used for learning the corresponding relation between the first image and the second image.
3. The method of claim 2, wherein the training the target model based on the focus image pair comprises:
performing identity mapping based on the first image in the focus image pair to obtain a reconstructed decoded image;
executing the reconstruction task according to the process of restoring the reconstructed decoded image to the first image;
if the similarity between the reconstructed decoded image and the first image reaches a threshold value, ending the reconstruction task;
and executing the prediction task based on the corresponding relation of the first image and the second image so as to train the target model.
4. The method of claim 3, wherein said performing identity mapping based on the first image of the lesion image pair to obtain a reconstructed decoded image comprises:
performing encoding operation on the first image in the focus image pair to compress the feature dimension of the first image to obtain a first feature vector;
performing decoding operation on the first feature vector to perform similarity constraint based on a first constraint function to obtain a reconstructed decoded image, wherein the feature dimension of the reconstructed decoded image is the feature dimension of the first feature vector restored to the first image;
performing discrimination operation on the reconstructed decoded image and the first image to obtain a discrimination result, wherein the discrimination result output based on a second constraint function is the first image as a target of the discrimination operation, and the discrimination result output based on a third constraint function is the reconstructed decoded image as a target of the encoding operation and the decoding operation;
the process of restoring the first image according to the reconstructed decoded image performs the reconstruction task, and includes:
and executing the reconstruction task based on the adjustment process of the second constraint function and the third constraint function.
5. The method of claim 4, further comprising:
acquiring a preset preheating step number;
and circularly performing the similarity constraint process of the first constraint function based on the preset preheating step number to preheat the processes of the encoding operation and the decoding operation.
6. The method of claim 3, wherein the performing the prediction task based on the correspondence of the first image and the second image to train the target model comprises:
performing vector representation on the first image to obtain a low-dimensional vector;
inputting the low-dimensional vector into a mapper to obtain a prediction vector;
converting to a vector corresponding to the second image based on the prediction vector to adjust a fourth constraint function;
and executing the prediction task based on the adjustment process of the fourth constraint function to train the target model.
7. The method of claim 3, further comprising:
cutting the image in the focus image pair;
aligning images in the cut focus image pair based on a preset angle to obtain an aligned image pair;
adjusting the images in the aligned image pair to the same feature dimension;
normalizing the images in the adjusted aligned image pair to update the pair of lesion images.
8. The method of claim 2, wherein acquiring a plurality of pairs of lesion images comprises:
determining type information corresponding to the target focus;
determining a lesion change period based on the type information;
and acquiring a plurality of focus image pairs according to the change duration corresponding to the focus change period as an acquisition interval.
9. The method according to any one of claims 2-8, further comprising:
acquiring a reference focus area;
determining a negative exemplar parameter based on a similarity of the reference lesion region to a lesion region of the first image;
determining a first difference portion of the reference lesion region from a lesion region of the first image;
determining a second difference portion of the lesion area of the second image and the lesion area of the first image;
determining a positive sample parameter from the first difference portion and the second difference portion;
determining an evaluation index based on the negative sample parameter and the positive sample parameter, the evaluation index indicating a prediction accuracy of the target model.
10. The method of claim 9, further comprising:
if the evaluation index indicates that the target model does not meet a preset index, determining difference information in the focus image pair;
reviewing the pair of focus images based on the difference information to update the pair of focus images, and training the target model based on the updated pair of focus images.
11. The method according to any one of claims 1-8, further comprising:
inputting the predicted image into the target model to obtain a circulating image;
if the cyclic image is the same as the predicted image, determining the predicted image as a prediction result;
and if the cyclic image is different from the predicted image, predicting the change condition of the target focus based on the cyclic image.
12. The method according to claim 1, wherein the focus image pair is a computed tomography (CT) image pair, the target focus is a cerebral hemorrhage region obtained by computed tomography, the target model is a deep generative adversarial network, and the predicted image is used for predicting a change trend of the cerebral hemorrhage region.
13. An apparatus for processing a lesion image, comprising:
the encoding unit is used for acquiring a to-be-predicted image containing a target focus and inputting the image into an encoder in a target model for vector representation to obtain an encoding vector;
a mapping unit, configured to input the coding vector into a mapper in the target model to perform vector transformation of image dimensions to obtain a target vector, where the mapper is obtained by training based on a pair of lesion images, the pair of lesion images is composed of training images obtained by detection based on a timing relationship, and the training images include the target lesion;
and the processing unit is used for inputting the target vector into a decoder in the target model to perform image prediction so as to obtain a predicted image containing the target focus after the change of the time sequence relation.
14. A computer device, the computer device comprising a processor and a memory:
the memory is used for storing program codes; the processor is configured to execute the method for processing a lesion image according to any one of claims 1 to 12 according to instructions in the program code.
15. A computer-readable storage medium having stored therein instructions which, when run on a computer, cause the computer to execute the method of processing a lesion image according to any one of claims 1 to 12.
CN202110285271.6A 2021-03-17 2021-03-17 Method for processing focus image and related device Pending CN113724188A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110285271.6A CN113724188A (en) 2021-03-17 2021-03-17 Method for processing focus image and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110285271.6A CN113724188A (en) 2021-03-17 2021-03-17 Method for processing focus image and related device

Publications (1)

Publication Number Publication Date
CN113724188A true CN113724188A (en) 2021-11-30

Family

ID=78672571

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110285271.6A Pending CN113724188A (en) 2021-03-17 2021-03-17 Method for processing focus image and related device

Country Status (1)

Country Link
CN (1) CN113724188A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114418069A (en) * 2022-01-19 2022-04-29 腾讯科技(深圳)有限公司 Method and device for training encoder and storage medium
WO2023226009A1 (en) * 2022-05-27 2023-11-30 中国科学院深圳先进技术研究院 Image processing method and device

Similar Documents

Publication Publication Date Title
CN110517759B (en) Method for determining image to be marked, method and device for model training
CN110738263B (en) Image recognition model training method, image recognition method and image recognition device
CN110348543B (en) Fundus image recognition method and device, computer equipment and storage medium
CN111091576B (en) Image segmentation method, device, equipment and storage medium
CN111598900B (en) Image region segmentation model training method, segmentation method and device
CN110414631B (en) Medical image-based focus detection method, model training method and device
CN109919928B (en) Medical image detection method and device and storage medium
CN107895369B (en) Image classification method, device, storage medium and equipment
CN109949271B (en) Detection method based on medical image, model training method and device
CN110704661B (en) Image classification method and device
CN109934220B (en) Method, device and terminal for displaying image interest points
CN113610750B (en) Object identification method, device, computer equipment and storage medium
CN113177928B (en) Image identification method and device, electronic equipment and storage medium
CN110070129B (en) Image detection method, device and storage medium
CN110610181A (en) Medical image identification method and device, electronic equipment and storage medium
CN113724188A (en) Method for processing focus image and related device
CN114418069A (en) Method and device for training encoder and storage medium
CN112084959B (en) Crowd image processing method and device
CN113706441A (en) Image prediction method based on artificial intelligence, related device and storage medium
CN115082490B (en) Abnormity prediction method, and abnormity prediction model training method, device and equipment
US20230097391A1 (en) Image processing method and apparatus, electronic device, computer-readable storage medium, and computer program product
CN111598896B (en) Image detection method, device, equipment and storage medium
CN114722937A (en) Abnormal data detection method and device, electronic equipment and storage medium
CN112988984B (en) Feature acquisition method and device, computer equipment and storage medium
CN114328948A (en) Training method of text standardization model, text standardization method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination