CN116993680A - Image processing method and training method of image processing model - Google Patents

Image processing method and training method of image processing model

Info

Publication number
CN116993680A
Authority
CN
China
Prior art keywords
image processing
target
image
detection
detection area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310814888.1A
Other languages
Chinese (zh)
Inventor
高远
闫轲
郭恒
张灵
姚佳文
周靖人
吕乐
石喻
李春利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Damo Institute Hangzhou Technology Co Ltd
Original Assignee
Alibaba Damo Institute Hangzhou Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Damo Institute Hangzhou Technology Co Ltd filed Critical Alibaba Damo Institute Hangzhou Technology Co Ltd
Priority to CN202310814888.1A priority Critical patent/CN116993680A/en
Publication of CN116993680A publication Critical patent/CN116993680A/en
Pending legal-status Critical Current

Classifications

    • G06T7/0012 Image analysis; inspection of images, e.g. flaw detection; biomedical image inspection
    • G06N3/0455 Neural networks; auto-encoder networks, encoder-decoder networks
    • G06N3/0464 Neural networks; convolutional networks [CNN, ConvNet]
    • G06N3/08 Neural networks; learning methods
    • G06T7/11 Image analysis; segmentation, edge detection; region-based segmentation
    • G06V10/26 Image preprocessing; segmentation of patterns in the image field; detection of occlusion
    • G06V10/44 Extraction of image or video features; local feature extraction (edges, contours, loops, corners, strokes, intersections); connectivity analysis
    • G06V10/764 Pattern recognition or machine learning; classification, e.g. of video objects
    • G06V10/806 Fusion of extracted features
    • G06V10/82 Pattern recognition or machine learning using neural networks
    • G06T2207/10081 Image acquisition modality; computed x-ray tomography [CT]
    • G06T2207/20081 Special algorithmic details; training, learning
    • G06T2207/20084 Special algorithmic details; artificial neural networks [ANN]
    • G06T2207/30056 Subject of image; liver, hepatic

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the specification provides an image processing method and a training method of an image processing model, wherein the image processing method comprises the following steps: receiving an image processing task, wherein the image processing task carries a plurality of target images corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not; and inputting the plurality of target images into an image processing model to obtain a detection result corresponding to the target detection area, wherein the image processing model generates detection area characteristic information and detection area texture information based on each target image, and generates the detection result based on the detection area characteristic information and the detection area texture information. By combining the detection area characteristic information with the detection area texture information, the method enriches the dimensions used during image processing and thereby improves the accuracy of the detection result.

Description

Image processing method and training method of image processing model
Technical Field
The embodiment of the specification relates to the technical field of computers, in particular to an image processing method.
Background
With the improvement of people's living standards, the incidence of disease in various human organs is increasing year by year, and conditions such as fatty liver and pulmonary fibrosis have become serious hidden threats to health. At present, examinations are typically performed with the aid of ultrasound, magnetic resonance imaging, or other imaging tools, but their accuracy depends on the skill and experience of the operator. Based on this, Computed Tomography (CT) provides an operator-independent, standardized, and universal method for quantifying fat content, making the discrimination of medical images an important screening tool.
In current medical image identification, the accuracy of identification is low; for example, the slight differences in liver attenuation caused by mild or moderate fatty liver cannot be accurately identified and evaluated. How to improve the accuracy of medical image identification has therefore become an urgent problem for technicians to solve.
Disclosure of Invention
In view of this, the embodiments of the present specification provide an image processing method. One or more embodiments of the present specification further relate to an image processing apparatus, a computing device, a computer-readable storage medium, and a computer program, so as to solve the technical drawbacks of the related art.
According to a first aspect of embodiments of the present specification, there is provided an image processing method including:
receiving an image processing task, wherein the image processing task carries a plurality of target images corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not;
and inputting the plurality of target images into an image processing model to obtain a detection result corresponding to the target detection area, wherein the image processing model generates detection area characteristic information and detection area texture information based on each target image, and generates the detection result based on the detection area characteristic information and the detection area texture information.
According to a second aspect of embodiments of the present specification, there is provided a CT image processing method, including:
receiving a CT image processing task, wherein the CT image processing task carries a plurality of CT images corresponding to a target detection area, and the CT image processing task is used for detecting whether the target detection area is abnormal or not;
and inputting the CT images into a CT image processing model to obtain a detection result corresponding to the target detection region, wherein the CT image processing model generates detection region characteristic information and detection region texture information based on each CT image, and generates the detection result based on the detection region characteristic information and the detection region texture information.
According to a third aspect of embodiments of the present disclosure, there is provided a training method of an image processing model, applied to cloud-side equipment, including:
obtaining a training sample pair, wherein the training sample pair comprises a plurality of training sample images corresponding to a target detection area and sample detection results of the target detection area, and the sample detection results comprise standard sample detection results or reference sample detection results;
inputting the training sample images into an image processing model to obtain a prediction detection result corresponding to the target detection area;
calculating a model loss value according to the prediction detection result and the sample detection result;
adjusting model parameters of the image processing model according to the model loss value until model training stopping conditions are reached, and obtaining model parameters of the image processing model;
and sending the model parameters of the image processing model to end-side equipment.
According to a fourth aspect of embodiments of the present specification, there is provided an image processing method comprising:
receiving an image processing request sent by a user, wherein the image processing request comprises an image processing task, the image processing task carries a plurality of target images corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not;
Inputting the plurality of target images into an image processing model to obtain a detection result corresponding to the target detection area, wherein the image processing model generates detection area characteristic information and detection area texture information based on each target image, and generates the detection result based on the detection area characteristic information and the detection area texture information;
and sending a detection result corresponding to the target detection area to a user.
According to a fifth aspect of embodiments of the present specification, there is provided an image processing apparatus comprising:
the receiving module is configured to receive an image processing task, wherein the image processing task carries a plurality of target images corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not;
and the detection module is configured to input the plurality of target images into an image processing model to obtain a detection result corresponding to the target detection area, wherein the image processing model generates detection area characteristic information and detection area texture information based on each target image, and generates the detection result based on the detection area characteristic information and the detection area texture information.
According to a sixth aspect of embodiments of the present specification, there is provided a computing device comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions that, when executed by the processor, perform the steps of the method described above.
According to a seventh aspect of embodiments of the present description, there is provided a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the steps of the above-described method.
According to an eighth aspect of embodiments of the present specification, there is provided a computer program, wherein the computer program, when executed in a computer, causes the computer to perform the steps of the above method.
According to the image processing method provided by the embodiment of the specification, an image processing task is received, wherein the image processing task carries a plurality of target images corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not; and the plurality of target images are input into an image processing model to obtain a detection result corresponding to the target detection area, wherein the image processing model generates detection area characteristic information and detection area texture information based on each target image, and generates the detection result based on the detection area characteristic information and the detection area texture information.
According to the method provided by the embodiment of the specification, in the process of identifying a plurality of target images, detection area characteristic information is generated based on each target image, detection area texture information is extracted from the detection area characteristic information, and the final image coding features are generated based on the detection area characteristic information and the detection area texture information; distillation characteristic information and classification characteristic information are added before image decoding, so that the distillation features and classification features are referenced during image decoding, making the final detection result more accurate.
Drawings
FIG. 1 is a block diagram of an image processing system according to one embodiment of the present disclosure;
FIG. 2 is a flow chart of an image processing method provided in one embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a model structure of an image processing model according to an embodiment of the present disclosure;
FIG. 4 is a flow chart of a CT image processing method according to one embodiment of the present disclosure;
FIG. 5 is a flow chart of a training method for an image processing model provided in one embodiment of the present disclosure;
FIG. 6 is a flow chart of another image processing method provided by one embodiment of the present disclosure;
Fig. 7 is a flowchart of a processing procedure of an image processing method applied to detecting a fatty liver scene according to an embodiment of the present disclosure;
fig. 8 is a schematic structural view of an image processing apparatus according to an embodiment of the present specification;
FIG. 9 is a block diagram of a computing device provided in one embodiment of the present description.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present specification. However, this specification can be implemented in many other ways than those described herein, and those skilled in the art can make similar generalizations without departing from its spirit; therefore, this specification is not limited by the specific implementations disclosed below.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, these information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first may also be referred to as a second, and similarly, a second may also be referred to as a first, without departing from the scope of one or more embodiments of the present description. The word "if" as used herein may be interpreted as "when", "upon", or "in response to determining", depending on the context.
Furthermore, it should be noted that, user information (including, but not limited to, user equipment information, user personal information, etc.) and data (including, but not limited to, data for analysis, stored data, presented data, etc.) according to one or more embodiments of the present disclosure are information and data authorized by a user or sufficiently authorized by each party, and the collection, use, and processing of relevant data is required to comply with relevant laws and regulations and standards of relevant countries and regions, and is provided with corresponding operation entries for the user to select authorization or denial.
First, terms related to one or more embodiments of the present specification will be explained.
CT (Computed Tomography): computed tomography uses a precisely collimated X-ray beam together with a highly sensitive detector to scan cross sections of a part of the human body one by one; it is characterized by fast scan times and clear images.
CNN (convolutional neural network): the convolutional neural network is a deep learning model commonly used in the field of computer vision.
NC CT (non-contrast CT): plain-scan CT, a common CT scanning mode, typically used for routine examination of certain body parts.
CAD (computer-aided diagnosis): computer-aided diagnosis refers to finding lesions with the assistance of imaging, medical image processing techniques, and other possible physiological and biochemical means, combined with computer analysis and calculation, so as to improve diagnostic accuracy.
With the continuous development of computer technology, various learning models are gradually being applied to prediction in a variety of application scenarios; correspondingly, deep learning models have achieved remarkable success in medical image computer-aided diagnosis (CAD) tasks. Pathological analysis and classification from medical images is an important topic in computer-aided diagnosis.
At present, Computed Tomography (CT) is generally used when examining organs; CT provides a standardized, universal, operator-independent method for quantifying fat content, making it an ideal choice for pathological organ screening in various clinical situations. Taking the detection of fatty liver as an example, radiologists typically measure CT attenuation in Hounsfield Units (HU) throughout the liver and place one or more target regions over a representative portion of the liver. Physicians also compare liver attenuation using measurements such as the liver-spleen attenuation difference and the liver-spleen attenuation ratio; however, these processes are subjective, and in the process of identifying the images, some slight differences and changes in the organs cannot be identified very accurately.
Based on this, in the present specification, an image processing method is provided, and the present specification relates to an image processing apparatus, a computing device, and a computer-readable storage medium, which are described in detail one by one in the following embodiments.
Referring to fig. 1, fig. 1 illustrates an architecture diagram of an image processing system provided in one embodiment of the present disclosure, which may include a client 100 and a server 200;
The client 100 is configured to send an image processing task to the server 200, wherein the image processing task carries a plurality of target images corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not;
the server 200 is configured to input the plurality of target images to an image processing model, and obtain a detection result corresponding to the target detection area, where the image processing model generates detection area feature information and detection area texture information based on each target image, and generates the detection result based on the detection area feature information and the detection area texture information; sending a detection result of the image processing task to the client 100;
the client 100 is further configured to receive a detection result of the image processing task sent by the server 200.
By applying the scheme of the embodiment of the specification, an image processing task is received, wherein the image processing task carries a plurality of target images corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not; and the plurality of target images are input into an image processing model to obtain a detection result corresponding to the target detection area, wherein the image processing model generates detection area characteristic information and detection area texture information based on each target image, and generates the detection result based on the detection area characteristic information and the detection area texture information.
According to the scheme provided by the embodiment of the specification, the detection area characteristic information is generated according to the target image, meanwhile, the detection area texture information in the target detection area is extracted, and finally, a final detection result is generated according to the detection area characteristic information and the detection area texture information. In the process of processing the target image by the image processing model, not only the characteristic information of the detection area is considered, but also the texture information of the detection area is further extracted from the characteristic information of the detection area, and the texture information can more accurately represent the state of the target detection area, so that the characteristics of the target detection area can be enriched by extracting and analyzing the characteristics of the texture information of the target detection area, thereby improving the precision of the image processing model.
In practical applications, the image processing system may include a plurality of clients 100 and a server 200, where the clients 100 may be referred to as an end-side device, and the server 200 may be referred to as a cloud-side device. Communication connection can be established between the plurality of clients 100 through the server 200, and in an image processing scenario, the server 200 is used to provide an image processing service between the plurality of clients 100, and the plurality of clients 100 can respectively serve as a transmitting end or a receiving end, so that communication is realized through the server 200.
The user may interact with the server 200 through the client 100 to receive data transmitted from other clients 100, or transmit data to other clients 100, etc. In the image processing scenario, it may be that the user issues a data stream to the server 200 through the client 100, and the server 200 generates a detection result according to the data stream and pushes the detection result to other clients that establish communication.
Wherein, the client 100 and the server 200 establish a connection through a network. The network provides a medium for a communication link between client 100 and server 200. The network may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others. The data transmitted by the client 100 may need to be encoded, transcoded, compressed, etc. before being distributed to the server 200.
The client 100 may be a browser, an APP (Application), a web application such as an H5 (HTML5, HyperText Markup Language version 5) application, a light application (also called an applet, a lightweight application), or a cloud application, and the client 100 may be developed based on a software development kit (SDK, Software Development Kit) of the corresponding service provided by the server 200, for example an SDK based on real-time communication (RTC, Real Time Communication). The client 100 may be deployed in an electronic device and may need to run depending on the device or on some APP in the device. The electronic device may, for example, have a display screen and support information browsing, and may be a personal mobile terminal such as a mobile phone, a tablet computer, or a personal computer. Various other types of applications are also commonly deployed in electronic devices, such as human-machine conversation applications, model training applications, text processing applications, web browser applications, shopping applications, search applications, instant messaging tools, mailbox clients, and social platform software.
The server 200 may include a server that provides various services, such as a server that provides communication services for multiple clients, a server for background training that provides support for a model used on a client, a server that processes data sent by a client, and so on. It should be noted that, the server 200 may be implemented as a distributed server cluster formed by a plurality of servers, or may be implemented as a single server. The server may also be a server of a distributed system or a server that incorporates a blockchain. The server may also be a cloud server for cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (CDN, content Delivery Network), and basic cloud computing services such as big data and artificial intelligence platforms, or an intelligent cloud computing server or an intelligent cloud host with artificial intelligence technology.
It should be noted that, the image processing method provided in the embodiment of the present disclosure is generally executed by the server, but in other embodiments of the present disclosure, the client may have a similar function to the server, so as to execute the image processing method provided in the embodiment of the present disclosure. In other embodiments, the image processing method provided in the embodiments of the present disclosure may be performed by the client and the server together.
Referring to fig. 2, fig. 2 shows a flowchart of an image processing method according to an embodiment of the present disclosure, which specifically includes the following steps:
step 202: receiving an image processing task, wherein the image processing task carries a plurality of target images corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not.
In practical application, the image processing task sent by the user can be received through the server side or the client side.
Specifically, the image processing task is a task for detecting whether the target detection area is abnormal, and the image processing task carries a plurality of target images corresponding to the target detection area, further, the target detection area can be understood as a partition for predicting whether the abnormality occurs, for example, the target detection area can be any organ in a human body, such as liver, spleen, lung, stomach, and the like, and by predicting whether the target detection area is abnormal, the state of the object to be detected can be further determined in an auxiliary manner according to the prediction result, thereby providing assistance for determining the state of the object to be detected.
It should be noted that, in one or more embodiments of the present disclosure, the image processing task may be applied to identifying various medical images, and determine, according to image features, whether an abnormality occurs in a target detection area in the medical image, and, for example, in an application scenario of pulmonary fibrosis detection, whether fibrosis occurs in a lung may be predicted according to the medical image of the lung; in the application scene of fatty liver detection, whether fatty liver appears in the liver can be predicted according to the medical image of the liver; thereby helping doctors to judge whether the target detection area is abnormal or not, and further facilitating the subsequent treatment.
In a specific embodiment provided in the present disclosure, in an application scenario for detecting pulmonary fibrosis, an acquired target image is an image of a lung, specifically, an acquired plurality of target images are CT images of the lung, and the plurality of target images may form a 3D image of the lung, and the plurality of CT images of the lung are acquired, and by performing image detection processing on the plurality of CT images, whether pulmonary fibrosis occurs in the lung is detected.
The terminal receives the image processing task, and can take a plurality of target images corresponding to the target detection areas carried in the image processing task as input for detecting whether the abnormality exists in the target detection areas.
In practical applications, there may be multiple regions in the target image, where the target detection region needs to be determined in advance in multiple regions of the target image, for example, in the medical image, if the captured CT image is chest CT, the CT image may include a heart, spleen, liver, and so on; if the CT image taken is an abdominal CT, the CT image may include liver, spleen, etc. In a specific embodiment provided in this specification, an image segmentation recognition model is used, and the image segmentation recognition model is trained by supervised or semi-supervised model training to enable the determination of the target detection region from the target image. In the actual application process, a plurality of target images and target detection area identifiers are input into the image segmentation recognition model, and the image segmentation recognition model can determine the target detection area corresponding to the target detection area identifiers from the plurality of target images, so that the accuracy of segmenting the target detection area from the plurality of target images is improved.
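As a hedged illustration of this segmentation step, the following Python sketch shows how a pretrained segmentation model might be used to isolate the target detection region (for example, the liver) from a stack of CT slices before it is passed to the image processing model. The function, the model interface, and the label id are assumptions made for illustration and are not taken from the embodiments above.

```python
# Hypothetical sketch: isolating the target detection region from CT slices
# with a pretrained segmentation model. `seg_model` and `target_label` are assumptions.
import numpy as np
import torch

def crop_target_region(ct_volume: np.ndarray, seg_model: torch.nn.Module,
                       target_label: int) -> np.ndarray:
    """ct_volume: (D, H, W) stack of CT slices; returns the cropped target region."""
    with torch.no_grad():
        # Assumed model output: per-voxel class logits of shape (1, C, D, H, W).
        logits = seg_model(torch.from_numpy(ct_volume).float()[None, None])
        mask = logits.argmax(dim=1)[0].numpy() == target_label
    # Bounding box of the segmented organ; voxels outside the mask are zeroed out.
    zs, ys, xs = np.nonzero(mask)
    region = np.where(mask, ct_volume, 0)
    return region[zs.min():zs.max() + 1, ys.min():ys.max() + 1, xs.min():xs.max() + 1]
```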
Step 204: and inputting the plurality of target images into an image processing model to obtain a detection result corresponding to the target detection area, wherein the image processing model generates detection area characteristic information and detection area texture information based on each target image, and generates the detection result based on the detection area characteristic information and the detection area texture information.
In practical application, after receiving an image processing task, acquiring a plurality of target images carried by the image processing task from the image processing task, inputting the plurality of target images into an image processing model for processing, and obtaining a detection result corresponding to a target detection area output by the image processing model, wherein the detection result specifically comprises whether the target detection area is abnormal, degree information of the occurrence of the abnormality, and the like.
Specifically, the image processing model can extract detection region feature information corresponding to a target detection region and detection region texture information corresponding to the target detection region based on a plurality of input target images, wherein the detection region feature information corresponding to the target detection region specifically refers to local image feature information corresponding to the target detection region, and the detection region feature information is obtained after image feature extraction based on the plurality of target images; the texture information of the detection area specifically refers to texture information corresponding to the target detection area. After the detection region characteristic information and the detection region texture information corresponding to the target detection region are obtained, a final detection result is generated based on the two information.
In the embodiment provided in the present specification, in addition to collecting the feature information of the detection area corresponding to the target detection area, the texture information of the detection area corresponding to the target detection area is collected, and the feature of the target detection area is enriched by extracting the texture information of the target detection area, so that the detection precision of the image processing model is improved.
In particular, in one embodiment provided in the present specification, the image processing model includes an encoder, a decoder, and a classifier;
inputting the plurality of target images into an image processing model to obtain a detection result corresponding to the target detection area, wherein the detection result comprises S2042-S2046:
s2042, inputting the plurality of target images to the encoder to obtain image coding features, wherein the image coding features are determined based on detection region feature information and detection region texture information corresponding to each target image.
The encoder is specifically used for extracting feature information of a target detection area, and is used for encoding a plurality of input target images, extracting image feature information in each target image and obtaining image encoding features corresponding to the target detection area.
It should be noted that, in one or more embodiments of the present disclosure, the image coding feature corresponding to the target detection area is specifically determined according to the detection area feature information and the detection area texture information corresponding to the target detection area, where the detection area feature information is obtained by extracting features from each target image, and the detection area texture information is obtained by extracting the detection area feature information.
Specifically, the encoder comprises an image encoding unit, a texture encoding unit and a feature fusion unit;
inputting the plurality of target images to the encoder to obtain image encoding features, comprising:
inputting the plurality of target images into the image coding unit to obtain detection region characteristic information corresponding to the plurality of target images;
inputting the detection region characteristic information into the texture coding unit to obtain detection region texture information corresponding to the detection region characteristic information;
and inputting the detection region characteristic information and the detection region texture information into the characteristic fusion unit to obtain image coding characteristics.
The encoder comprises an image encoding unit, a texture encoding unit and a feature fusion unit. The image coding unit is used for extracting image features of the target image, the texture coding unit is used for extracting texture features from the extracted image feature information, and the feature fusion unit is used for fusing the image features and the texture features to finally generate image coding features output by the encoder.
Further, after obtaining the plurality of target images, the image encoding unit may, based on a feature extraction unit of a three-dimensional convolutional network, identify a three-dimensional image region of the target detection area in the plurality of target images, with dimensions H×W×D, where H represents the height information of the target detection area, W represents the width information of the target detection area, and D represents the depth information of the target detection area.
The three-dimensional image area of the target detection area is input into an image coding unit (Patch Encoder 3D CNN), wherein the three-dimensional image area is divided into a plurality of image blocks based on a preset block size, and characteristic information corresponding to each image block is respectively extracted through a three-dimensional convolution network, so that the detection area characteristic information output by the image coding unit is obtained.
In a specific embodiment provided in the present disclosure, taking the dimension of the feature information corresponding to each image block as 1×512 as an example: if the three-dimensional image region of the target detection area is divided into P image blocks, then after each image block is processed by the three-dimensional convolutional network, the feature information corresponding to each image block has dimension 1×512, and the detection region feature information formed by the P image blocks has dimension P×512.
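A minimal sketch of such a patch-based 3D CNN encoding step is given below, assuming a PyTorch implementation; the layer sizes, the patch size, and the requirement that the region dimensions be divisible by the patch size are illustrative assumptions rather than details taken from the embodiments.

```python
import torch
import torch.nn as nn

class PatchEncoder3D(nn.Module):
    """Splits the H x W x D organ region into fixed-size 3D patches and encodes
    each patch into a 512-dim vector with a small 3D CNN (illustrative sizes)."""
    def __init__(self, patch_size: int = 32, dim: int = 512):
        super().__init__()
        self.patch_size = patch_size
        self.cnn = nn.Sequential(
            nn.Conv3d(1, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),   # -> (N, 128, 1, 1, 1)
            nn.Flatten(),
            nn.Linear(128, dim),
        )

    def forward(self, region: torch.Tensor) -> torch.Tensor:
        # region: (B, 1, H, W, D); H, W, D assumed divisible by patch_size.
        p, b = self.patch_size, region.shape[0]
        patches = (region
                   .unfold(2, p, p).unfold(3, p, p).unfold(4, p, p)  # (B, 1, nH, nW, nD, p, p, p)
                   .reshape(b, 1, -1, p, p, p)
                   .permute(0, 2, 1, 3, 4, 5)                        # (B, P, 1, p, p, p)
                   .reshape(-1, 1, p, p, p))
        feats = self.cnn(patches)                                    # (B*P, 512)
        return feats.view(b, -1, feats.shape[-1])                    # (B, P, 512)
```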
After the detection region characteristic information is obtained, the detection region characteristic information is input into a texture coding unit, and the texture information of the detection region characteristic information is extracted in the texture coding unit to obtain the detection region texture information. The texture coding unit is used for calculating the similarity between the detection region characteristic information and the texture template, and assembling the detection region characteristic information into the texture template according to the similarity to obtain final detection region texture information.
And finally, the obtained detection region characteristic information and detection region texture information are processed by the feature fusion unit to obtain the image coding feature output by the feature fusion unit. In practical application, the detection region characteristic information and the detection region texture information can be fused by weighted summation, direct summation, or feature splicing. In the embodiment provided in the present specification, direct summation is preferably used for fusion; for example, if the dimension of the detection region characteristic information is P×512 and the dimension of the detection region texture information is also P×512, then after the two are fused, the dimension of the obtained image coding feature is still P×512.
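The following sketch illustrates one plausible reading of the texture encoding unit, in which each patch feature is compared with a set of learnable texture templates and the templates are aggregated according to the similarity scores, followed by fusion with the patch features by direct summation. The number of templates and the softmax-based aggregation are assumptions; the text above only specifies similarity computation against texture templates and direct summation for fusion.

```python
import torch
import torch.nn as nn

class TextureEncoder(nn.Module):
    """Compares each patch feature with K learnable texture templates and
    aggregates the templates weighted by similarity (illustrative sketch)."""
    def __init__(self, dim: int = 512, num_templates: int = 32):
        super().__init__()
        self.templates = nn.Parameter(torch.randn(num_templates, dim))

    def forward(self, patch_feats: torch.Tensor) -> torch.Tensor:
        # patch_feats: (B, P, 512) detection-region feature information.
        sim = torch.softmax(patch_feats @ self.templates.t(), dim=-1)  # (B, P, K)
        return sim @ self.templates                                    # (B, P, 512)

def fuse(patch_feats: torch.Tensor, texture_feats: torch.Tensor) -> torch.Tensor:
    # Direct summation keeps the fused encoding at the same P x 512 shape.
    return patch_feats + texture_feats
```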
In the encoder, by extracting the detection region characteristic information and the detection region texture information corresponding to the target detection region, in the subsequent processing process, besides decoding the image characteristic information from the detection region characteristic information, the texture information of the target detection region can be extracted from the detection region texture information, and the information of the target detection region is represented from the aspect of texture characteristics, so that the depth representation of the target detection region is enriched, and the analysis precision is improved.
In practical applications, reference may be made to biomarker information determined from the target image, in addition to texture features corresponding to the target detection region. In particular, in another specific embodiment provided in the present specification, the method further includes:
acquiring biomarker information corresponding to a plurality of target images;
and inputting the target images and the biomarker information into an image processing model to obtain a detection result corresponding to the target detection region.
The biomarker information specifically refers to information that marks the target detection region and the reference detection region in the target image. In the subsequent image processing process, a plurality of target images and biomarker information are input into an image processing model together, so that the image processing model outputs a detection result corresponding to the target detection region.
In a specific embodiment provided in the present specification, taking the target detection area as the liver as an example, the reference detection area may be the spleen beside the liver, and the biomarker information is obtained by performing a series of biological measurement evaluations on the liver and the spleen. Specifically, the average attenuation values of the CT values of the liver and the spleen are obtained by calculating histograms of the CT values of the liver and the spleen, and the attenuation ratio and attenuation difference of the CT values of the liver and the spleen are calculated; in addition, the regional attenuation of the liver and the spleen and the meta information of the target to be detected (such as gender, age, etc.) are evaluated respectively. Based on the above information, the biomarker information I_BIO can be obtained.
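As a concrete but hypothetical illustration, the biomarker vector I_BIO might be assembled from liver and spleen attenuation statistics and the subject's meta information as follows; the exact set of measurements and their ordering are assumptions made for this sketch.

```python
import numpy as np

def liver_spleen_biomarkers(ct_hu: np.ndarray, liver_mask: np.ndarray,
                            spleen_mask: np.ndarray, age: float, sex: int) -> np.ndarray:
    """Illustrative biomarker vector built from CT attenuation (HU) statistics of
    the target region (liver) and the reference region (spleen)."""
    liver_hu = ct_hu[liver_mask > 0]
    spleen_hu = ct_hu[spleen_mask > 0]
    liver_mean, spleen_mean = liver_hu.mean(), spleen_hu.mean()
    return np.array([
        liver_mean,                          # mean liver attenuation
        spleen_mean,                         # mean spleen attenuation
        liver_mean - spleen_mean,            # liver-spleen attenuation difference
        liver_mean / (spleen_mean + 1e-6),   # liver-spleen attenuation ratio
        age, float(sex),                     # meta information of the subject
    ], dtype=np.float32)
```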
S2044, inputting the image coding features to the decoder, and obtaining image decoding features corresponding to the image coding features.
The decoder is used for decoding the image coding features so as to obtain the image decoding features corresponding to the image coding features. Further, the decoder uses a plurality of decoding layers, each decoding layer using a multi-head self-attention mechanism; the image coding features are decoded by the multi-head self-attention mechanism to obtain the corresponding image decoding features. Preferably, the decoder includes 4 decoding layers.
In one or more specific embodiments provided in the present specification, specifically, inputting the image coding feature to the decoder, to obtain an image decoding feature corresponding to the image coding feature, includes:
adding distillation characteristic information and classification characteristic information for the image coding characteristics to obtain image coding characteristics to be processed;
and inputting the image coding feature to be processed to the decoder to obtain an image decoding feature.
In practical application, the image processing model is a pre-trained machine learning model whose performance is limited by deficiencies in the training data. Before the image is decoded, distillation characteristic information and classification characteristic information are added to the image coding features, wherein the distillation characteristic information is used for comparison with non-standard (reference) labels and the classification characteristic information is used for comparison with standard labels, thereby improving the efficiency of the model training stage.
Based on this, after the image encoding feature is obtained, distillation feature information and classification feature information are added to the image encoding feature for predicting a final detection result based on the distillation feature information and the classification feature information in a subsequent processing. Still further, distillation feature information may be added to the beginning of the image encoding feature and classification feature information may be added to the end of the image encoding feature.
For example, taking the image coding features [E_dis, E_0, E_1, ..., E_P, E_cls] as a template, distillation characteristic information E_dis and classification characteristic information E_cls are added to the image coding features [E_0, E_1, ..., E_P] to obtain the to-be-processed image coding features [E_dis, E_0, E_1, ..., E_P, E_cls], which are input into the decoder for decoding to finally obtain the image decoding features. Taking the to-be-processed image coding features [E_dis, E_0, E_1, ..., E_P, E_cls] as an example, after processing by the decoder, the image decoding features are [D_dis, D_0, D_1, ..., D_P, D_cls].
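A compact, hypothetical sketch of this step is shown below: a learnable distillation token is prepended and a learnable classification token is appended to the encoded sequence, which is then passed through a stack of multi-head self-attention layers (4 layers, matching the preferred configuration mentioned above). Using nn.TransformerEncoderLayer as the self-attention block is an implementation assumption, not something specified by the embodiments.

```python
import torch
import torch.nn as nn

class TokenDecoder(nn.Module):
    """Prepends E_dis and appends E_cls to the encoded patch sequence, then
    applies 4 multi-head self-attention layers (illustrative sketch)."""
    def __init__(self, dim: int = 512, layers: int = 4, heads: int = 8):
        super().__init__()
        self.dis_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=layers)

    def forward(self, enc: torch.Tensor) -> torch.Tensor:
        # enc: (B, P, 512) image coding features [E_0 ... E_P].
        b = enc.shape[0]
        seq = torch.cat([self.dis_token.expand(b, -1, -1), enc,
                         self.cls_token.expand(b, -1, -1)], dim=1)
        return self.blocks(seq)   # [D_dis, D_0 ... D_P, D_cls]
```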
In another specific embodiment provided in the present specification, in a case where biomarker information is input to an image processing model, inputting the image encoding feature to be processed to the decoder, obtaining an image decoding feature includes:
splicing the biomarker information to the image coding feature to be processed to obtain a spliced image coding feature;
and inputting the spliced image coding features to the decoder to obtain image decoding features.
In the above-mentioned steps, a case is also mentioned in which biomarker information is input to the image processing model. In this case, the biomarker information is added to the to-be-processed image coding features to obtain the stitched image coding features. Further, taking the to-be-processed image coding features [E_dis, E_0, E_1, ..., E_P, E_cls] and the biomarker information I_BIO as an example, the biomarker information I_BIO is converted into the corresponding biomarker feature E_BIO, and the biomarker feature information is spliced with the to-be-processed image coding features to obtain the stitched image coding features [E_dis, E_0, E_1, ..., E_P, E_BIO, E_cls]. The stitched image coding features are input into the decoder to obtain the image decoding features [D_dis, D_0, D_1, ..., D_P, D_BIO, D_cls].
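One way this splicing could look in code is sketched below: the biomarker vector is projected into the feature dimension by a linear embedding and inserted into the sequence just before the classification token. The projection layer and the insertion position immediately before E_cls are assumptions consistent with the ordering shown above.

```python
import torch
import torch.nn as nn

class BiomarkerEmbedding(nn.Module):
    """Projects the biomarker vector I_BIO into the feature dimension and splices
    it into the to-be-processed coding sequence before the CLS token (sketch)."""
    def __init__(self, bio_dim: int = 6, dim: int = 512):
        super().__init__()
        self.proj = nn.Linear(bio_dim, dim)

    def forward(self, seq: torch.Tensor, bio: torch.Tensor) -> torch.Tensor:
        # seq: (B, 2 + P, 512) = [E_dis, E_0 ... E_P, E_cls]; bio: (B, bio_dim).
        e_bio = self.proj(bio).unsqueeze(1)                          # (B, 1, 512)
        return torch.cat([seq[:, :-1], e_bio, seq[:, -1:]], dim=1)  # insert before E_cls
```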
S2046, inputting the image decoding characteristics into the classifier to obtain a detection result corresponding to the target detection area.
After the image decoding features are obtained, the image decoding features are input into a classifier, and the image decoding features are classified by the classifier, so that a detection result finally corresponding to the target detection region is obtained.
In one embodiment provided in this specification, in the case where only the target images are input to the image processing model, the image decoding features are [D_dis, D_0, D_1, ..., D_P, D_cls]. The image decoding features are input into the classifier for processing, and the classifier classifies according to the distillation decoding feature and the classification decoding feature, thereby obtaining the final detection result.
In another embodiment provided in the present specification, in the case where the target images and the biomarker information are input to the image processing model, the image decoding features are [D_dis, D_0, D_1, ..., D_P, D_BIO, D_cls]. The image decoding features are input into the classifier for processing, and the classifier classifies according to the distillation decoding feature and the classification decoding feature, thereby obtaining the final detection result.
Specifically, inputting the image decoding feature to the classifier to obtain a detection result corresponding to the target detection area, including:
inputting the image decoding characteristics into the classifier to obtain a first classification result corresponding to the distillation characteristic information and a second classification result corresponding to the classification characteristic information;
and generating the detection result according to the first classification result and the second classification result.
In the method provided by the specification, after the image decoding feature is input into the classifier, the classifier classifies according to the distillation feature information and the classification feature information to respectively obtain a first classification result corresponding to the distillation feature information and a second classification result corresponding to the classification feature information, and then the first classification result and the second classification result are weighted and averaged to obtain a final detection result.
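A hedged sketch of this two-head classification and weighted fusion is given below; the equal fusion weights and the two-class output are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class DualHeadClassifier(nn.Module):
    """Classifies from the decoded distillation token and the decoded classification
    token separately, then fuses the two results by a weighted average (sketch)."""
    def __init__(self, dim: int = 512, num_classes: int = 2, alpha: float = 0.5):
        super().__init__()
        self.head_dis = nn.Linear(dim, num_classes)
        self.head_cls = nn.Linear(dim, num_classes)
        self.alpha = alpha

    def forward(self, d_dis: torch.Tensor, d_cls: torch.Tensor) -> torch.Tensor:
        p1 = self.head_dis(d_dis).softmax(dim=-1)   # first classification result
        p2 = self.head_cls(d_cls).softmax(dim=-1)   # second classification result
        return self.alpha * p1 + (1 - self.alpha) * p2
```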
Referring to fig. 3, fig. 3 is a schematic diagram of a model structure of an image processing model according to an embodiment of the present disclosure, where, as shown in fig. 3, the image processing model includes an encoder, a decoder, and a classifier, the encoder includes an image encoding unit, a texture encoding unit, and a feature fusion unit, and the decoder includes four decoding layers based on a Multi-head Self-attention Mechanism (MSA).
A plurality of target images are input into the image processing model; image features are first extracted by the image encoding unit to obtain the detection region feature information, the detection region feature information is input into the texture encoding unit to extract texture features and obtain the detection region texture information, and the detection region feature information and the detection region texture information are then input into the feature fusion unit to obtain the image coding features [E_0, E_1, ..., E_P].
Biomarker information I_BIO is obtained from the plurality of target images by performing biomarker measurement, input into the image processing model, and converted into the biomarker feature information E_BIO through an embedding process. The image coding features [E_0, E_1, ..., E_P] are spliced with the biomarker feature E_BIO, and the distillation characteristic information E_dis and classification characteristic information E_cls are added, generating the stitched image coding features [E_dis, E_0, E_1, ..., E_P, E_BIO, E_cls]. The stitched image coding features are input into the decoder for decoding to obtain the image decoding features [D_dis, D_0, D_1, ..., D_P, D_BIO, D_cls]; the vectors D_dis and D_cls in the image decoding features are input into the classifier for classification, obtaining the first classification result corresponding to D_dis and the second classification result corresponding to D_cls. Finally, the first classification result and the second classification result are fused by weighted summation to obtain the final detection result.
In a specific embodiment provided in the present disclosure, in the process of performing recognition processing on a plurality of target images, detection region feature information is generated based on each target image, detection region texture information is then extracted from the detection region feature information, and the final image coding features are generated based on the detection region feature information and the detection region texture information; distillation characteristic information and classification characteristic information are added before image decoding, so that the distillation features and classification features are referenced in the process of image decoding, making the final detection result more accurate.
In the method provided in an embodiment of the present specification, the image processing model is a trained machine learning model, further, the image processing model is a supervised trained model, specifically, the image processing model is obtained by training in S2062 to S2068:
s2062, obtaining a training sample pair, wherein the training sample pair comprises a plurality of training sample images corresponding to a target detection area and sample detection results of the target detection area, and the sample detection results comprise standard sample detection results or reference sample detection results.
Specifically, the training method of the image processing model provided in the present specification uses supervised training, which includes a training sample pair including a plurality of training sample images and sample detection results for the target detection area, and it should be noted that in the training method of the image processing model provided in the present specification, the sample detection results include two types, one is a standard sample detection result, and the other is a reference sample detection result. The standard sample detection result is a verified sample detection result, and the reference sample detection result is a sample detection result predicted by a relevant technician according to experience.
For example, taking an image processing model for determining whether the liver has fatty liver as an example: in the model training stage of the image processing model, a plurality of training sample pairs form a training sample set, and the training sample set includes two training sample subsets. The first training sample subset contains 680 sample pairs, namely liver CT images of 680 users together with pathological results confirmed by pathological verification; the second training sample subset contains 1103 sample pairs, namely liver CT images of 1103 users together with prediction results predicted by physicians. The pathological results confirmed by pathological verification are standard sample detection results, and the prediction results predicted by physicians are reference sample detection results; both are sample detection results.
In the method provided in the present specification, a plurality of training sample pairs are input into the image processing model in preset batches. Within a batch, a training sample image and its corresponding sample detection result form a positive sample pair, while that training sample image and the sample detection results of the other training sample pairs form negative sample pairs. Continuing the fatty liver example above, suppose one training batch has 64 training sample pairs: for training sample pair 1, training sample image 1 and sample detection result 1 form a positive sample pair, while training sample image 1 and the other sample detection results form negative sample pairs; for training sample pair 2, training sample image 2 and sample detection result 2 form a positive sample pair, while training sample image 2 and the other sample detection results form negative sample pairs, and so on.
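A hedged sketch of this in-batch pairing, in plain Python with illustrative data structures (the bookkeeping shown here is an assumption about one possible implementation, not the patented one):

```python
def build_pairs(batch):
    """batch: list of (training_sample_image, sample_detection_result) pairs in one training batch."""
    positives, negatives = [], []
    for i, (image_i, result_i) in enumerate(batch):
        positives.append((image_i, result_i))             # image paired with its own label
        for j, (_, result_j) in enumerate(batch):
            if j != i:
                negatives.append((image_i, result_j))     # image paired with every other label
    return positives, negatives
```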
S2064, inputting the training sample images into an image processing model to obtain a prediction detection result corresponding to the target detection region.
After the training sample pairs are obtained, the training sample images of the plurality of training sample pairs are input into the image processing model in preset training batches; at this stage the image processing model has not yet been trained. Inside the model, detection region feature information corresponding to the target detection region is generated from each training sample image, detection region texture information is obtained from the detection region feature information, image coding features are generated from the detection region feature information and the detection region texture information, and distillation feature information and classification feature information are added to the image coding features to obtain the image coding features to be processed.
In a specific embodiment provided in the present disclosure, the biomarker information of the target detection area and of the reference detection area corresponding to the target detection area is also extracted; the biomarker information is input into the image processing model and spliced with the image coding features to be processed to generate the stitched image coding features. The stitched image coding features are then input into the decoder for decoding, and the resulting image decoding features are finally input into the classifier to obtain the prediction detection result corresponding to the target detection region output by the model.
In the method provided in the present specification, the untrained image processing model has the same model structure as the trained image processing model described above; for the data processing of a training sample image in the untrained model, reference may be made to the data processing of a target image in the trained model, which is not repeated here.
S2066, calculating a model loss value according to the prediction detection result and the sample detection result.
After the prediction detection result of the image processing model is obtained, the model loss value can be calculated from the prediction detection result and the sample detection result. In the method provided in the present specification, there are many ways to calculate the model loss value, such as a cross-entropy loss function, a maximum loss function, an average loss function, and the like; the present specification does not limit the specific form of the loss function, which is subject to actual application.
In another specific embodiment provided in the present specification, the prediction detection result includes a first prediction classification result corresponding to distillation characteristic information and a second prediction classification result corresponding to classification characteristic information;
calculating a model loss value according to the prediction detection result and the sample detection result, including:
calculating a first loss value according to the reference sample detection result and the first prediction classification result; or
calculating a second loss value according to the standard sample detection result and the second prediction classification result;
specifically, in the method provided in the present specification, distillation feature information and classification feature information are added to the coding feature information in the image processing model, where the distillation feature information corresponds to the reference sample detection result, and the classification feature information corresponds to the standard sample detection result.
In the model application stage, the detection result output by the model is determined from the first prediction classification result corresponding to the distillation feature information and the second prediction classification result corresponding to the classification feature information. In the model training stage, the loss is routed by adjusting the weights according to the label type: when the training sample pair carries a standard sample detection result, the second loss value is calculated from the standard sample detection result and the second prediction classification result; when the training sample pair carries a reference sample detection result, the first loss value is calculated from the reference sample detection result and the first prediction classification result. In other words, a standard sample detection result is paired with the prediction result of the classification feature information to compute its loss value, and a reference sample detection result is paired with the prediction result of the distillation feature information to compute its loss value.
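A minimal sketch of this loss routing, assuming cross-entropy as the concrete loss (the description above leaves the loss function open):

```python
import torch.nn.functional as F

def routed_loss(first_pred, second_pred, label, label_type):
    """first_pred:  logits from the distillation feature D_dis
    second_pred: logits from the classification feature D_cls
    label_type:  'standard' (verified result) or 'reference' (doctor-predicted result)"""
    if label_type == "standard":
        return F.cross_entropy(second_pred, label)   # second loss value
    return F.cross_entropy(first_pred, label)        # first loss value
```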
S2068, adjusting model parameters of the image processing model according to the model loss value until a model training stopping condition is reached.
After the model loss value is obtained, the model parameters of the image processing model can be adjusted according to the model loss value, and specifically, the model parameters of the image processing model can be updated by back propagation of the model loss value.
Correspondingly, adjusting the model parameters of the image processing model according to the model loss value comprises the following steps:
and adjusting model parameters of the image processing model according to the first loss value and the second loss value.
In one embodiment provided in the present disclosure, both first loss values and second loss values may occur within the same training batch; after the batch of training data has been processed, the model parameters of the image processing model are adjusted according to the first loss values and/or the second loss values of that batch.
After the model parameters are adjusted, the above steps can be repeated to continue training the image processing model until the training stop condition is reached. In practical application, the training stop condition of the image processing model includes:
The model loss value is smaller than a preset threshold value; and/or
The number of training rounds reaches a preset number of training rounds.
Specifically, in the process of training the image processing model, the training stop condition may be set such that the model loss value is smaller than a preset threshold, or set as a preset number of training rounds, for example 10 rounds. The present specification does not specifically limit the preset loss threshold and/or the preset number of training rounds, which are subject to actual application.
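Putting S2064 to S2068 together, the stop logic might look like the sketch below; the threshold, the model, optimizer and data-loader objects, and the two-headed model output are placeholders, reusing routed_loss from the earlier sketch.

```python
MAX_EPOCHS = 10         # preset number of training rounds (example value from the text)
LOSS_THRESHOLD = 0.01   # preset loss threshold (assumed placeholder)

for epoch in range(MAX_EPOCHS):
    epoch_loss = 0.0
    for images, label, label_type in train_loader:           # assumed DataLoader of sample pairs
        first_pred, second_pred = model(images)               # prediction detection results
        loss = routed_loss(first_pred, second_pred, label, label_type)
        optimizer.zero_grad()
        loss.backward()                                        # back-propagate the model loss value
        optimizer.step()                                       # adjust the model parameters
        epoch_loss += loss.item()
    if epoch_loss / len(train_loader) < LOSS_THRESHOLD:        # loss below the preset threshold
        break                                                  # training stop condition reached
```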
In the method provided by the embodiment of the present specification, both verified standard sample detection results and reference sample detection results predicted empirically by relevant technicians are used in training the image processing model. This enlarges the training sample set and provides a data basis for training the image processing model; during training, the first prediction classification result corresponding to the distillation feature information and the second prediction classification result corresponding to the classification feature information are both trained against the training sample set, which enriches the processing capability of the image processing model, gives it stronger generalization ability, and thereby further improves its prediction accuracy.
Referring to fig. 4, fig. 4 shows a flowchart of a CT image processing method according to an embodiment of the present disclosure, which specifically includes the following steps:
step 402: and receiving a CT image processing task, wherein the CT image processing task carries a plurality of CT images corresponding to a target detection area, and the CT image processing task is used for detecting whether the target detection area is abnormal or not.
Step 404: and inputting the CT images into a CT image processing model to obtain a detection result corresponding to the target detection region, wherein the CT image processing model generates detection region characteristic information and detection region texture information based on each CT image, and generates the detection result based on the detection region characteristic information and the detection region texture information.
It should be noted that, the implementation manner of step 402 and step 404 is the same as the implementation manner of steps 202 to 204, and will not be described in detail in the embodiment of the present disclosure.
For example, taking the target detection area as a liver, to detect whether a fatty liver exists in the liver, a CT image processing task is received, where the CT image processing task includes a plurality of CT images corresponding to the liver of the target user, where the plurality of CT images may form a 3D map of the liver of the target user, and the CT image processing task is used to detect whether the target user has the fatty liver.
By applying the method of the embodiment of the present disclosure, the CT image processing model is the image processing model of the above embodiment and shares its model structure, which is not described again here. By inputting the plurality of CT images corresponding to the liver into the CT image processing model, the detection result corresponding to the liver output by the model can be obtained, thereby automatically detecting whether fatty liver is present and, if so, its degree.
In the CT image processing method provided in the embodiments of the present disclosure, the CT image processing model generates detection region feature information based on each CT image, then extracts detection region texture information from the detection region feature information, and generates the final image coding features based on the detection region feature information and the detection region texture information; distillation feature information and classification feature information are added before image decoding, so that the distillation features and classification features are referenced during decoding and the final detection result is more accurate.
Referring to fig. 5, fig. 5 shows a flowchart of a training method of an image processing model according to an embodiment of the present disclosure, which is applied to cloud-side equipment, and specifically includes the following steps:
Step 502: and obtaining a training sample pair, wherein the training sample pair comprises a plurality of training sample images corresponding to the target detection area and sample detection results of the target detection area, and the sample detection results comprise standard sample detection results or reference sample detection results.
Step 504: and inputting the training sample images into an image processing model to obtain a prediction detection result corresponding to the target detection region.
Step 506: and calculating a model loss value according to the prediction detection result and the sample detection result.
Step 508: and adjusting the model parameters of the image processing model according to the model loss value until the model training stopping condition is reached, and obtaining the model parameters of the image processing model.
Step 510: and sending the model parameters of the image processing model to end-side equipment.
It should be noted that steps 502 to 508 are implemented in the same manner as S2062 to S2068 and are not repeated in the embodiment of the present disclosure.
In practical application, training the model requires a large amount of data and considerable computing resources, which the end-side device may not have; the model training process can therefore be carried out on the cloud-side device, and after obtaining the model parameters of the image processing model, the cloud-side device sends them to the end-side device. The end-side device can then construct the image processing model locally from these model parameters and use it for image processing.
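One possible concrete mechanism for this hand-off is PyTorch state_dict serialization, sketched below; the file name, the constructor and the transfer channel are assumptions, since the method itself does not prescribe them.

```python
import torch

# Cloud-side device: after training stops, serialize the model parameters.
torch.save(trained_model.state_dict(), "image_processing_model.pt")
# ... deliver the parameter file to the end-side device over any available channel ...

# End-side device: rebuild the same architecture locally and load the received parameters.
end_side_model = ImageProcessingModel()     # assumed constructor with the same model structure
end_side_model.load_state_dict(torch.load("image_processing_model.pt", map_location="cpu"))
end_side_model.eval()                        # ready for local image processing
```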
In the method provided by the embodiment of the present specification, both verified standard sample detection results and reference sample detection results predicted empirically by relevant technicians are used in training the image processing model. This enlarges the training sample set and provides a data basis for training the image processing model; during training, the first prediction classification result corresponding to the distillation feature information and the second prediction classification result corresponding to the classification feature information are both trained against the training sample set, which enriches the processing capability of the image processing model, gives it stronger generalization ability, and thereby further improves its prediction accuracy.
Referring to fig. 6, fig. 6 shows a flowchart of an image processing method according to an embodiment of the present disclosure, which specifically includes the following steps:
step 602: receiving an image processing request sent by a user, wherein the image processing request comprises an image processing task, the image processing task carries a plurality of target images corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not.
Step 604: and inputting the plurality of target images into an image processing model to obtain a detection result corresponding to the target detection area, wherein the image processing model generates detection area characteristic information and detection area texture information based on each target image, and generates the detection result based on the detection area characteristic information and the detection area texture information.
Step 606: and sending a detection result corresponding to the target detection area to a user.
It should be noted that, the specific implementation manner of the steps 602 to 604 is the same as the implementation manner of the steps 202 to 204, and will not be described in detail in the embodiment of the present disclosure.
In this embodiment, an image processing request sent by a user is received, where the image processing request includes an image processing task, and after the image processing method of the foregoing embodiment is completed to obtain a detection result, the detection result needs to be returned to the user, so that the user performs corresponding subsequent processing according to the detection result.
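A minimal sketch of the request/response flow of steps 602 to 606; the request and response dictionary layout is an assumption made only for illustration.

```python
def handle_image_processing_request(request, model):
    """request: dict carrying the image processing task for one target detection area."""
    target_images = request["target_images"]          # plurality of target images
    detection_result = model(target_images)           # image processing model inference
    return {
        "target_detection_area": request.get("target_detection_area"),
        "detection_result": detection_result,         # sent back to the requesting user
    }
```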
In the method provided by the embodiment of the present specification, in the process of identifying the plurality of target images, detection region feature information is generated based on each target image, detection region texture information is then extracted from the detection region feature information, and the final image coding features are generated based on the detection region feature information and the detection region texture information; distillation feature information and classification feature information are added before image decoding, so that the distillation features and classification features are referenced during decoding and the final detection result is more accurate.
The following describes an example of the application of the image processing method provided in the present specification to detecting whether there is a fatty liver, with reference to fig. 7. Fig. 7 shows a flowchart of a processing procedure of an image processing method according to an embodiment of the present disclosure, which specifically includes the following steps:
step 702: and receiving a CT image processing task, wherein the CT image processing task carries a plurality of CT images corresponding to the liver.
Step 704: and acquiring biomarker information corresponding to the CT images, wherein the biomarker information comprises liver CT information, spleen CT information and user attribute information.
Step 706: and inputting the CT images and the biomarker information into a CT image processing model to obtain a detection result corresponding to the liver.
Specifically, the CT image processing task is a task of detecting whether the target user has fatty liver and, if so, to what degree.
The CT image processing model is trained in advance, and the source of training data is from 680 study subjects diagnosed by pathology and 1103 study subjects predicted by doctors. Among 680 subjects with pathological diagnosis, 203 healthy subjects, 150 subjects with mild fatty liver, 138 subjects with moderate fatty liver, and 89 subjects with severe fatty liver. Among the subjects predicted by 1103 doctors, there were 438 healthy subjects, 307 mild fatty liver subjects, 112 moderate fatty liver subjects, and 246 severe fatty liver subjects.
Under the same contrast condition, the non-enhanced CT scan images of the chest and the abdomen of the study object are respectively acquired by a plurality of CT scanners. And taking the acquired image as a training sample image, and taking a diagnosis result corresponding to each study object as a sample detection result. Therefore, the CT image processing model is trained, and the specific training process is described in the above embodiment, which is not described herein.
After model training, a validation dataset was selected; its data come from 226 validation study subjects and form the validation set UNIFESP-tr. The validation set is broken down by three severity levels: mild (Mild), moderate (Moderate) and severe (Severe), where AUC denotes the area under the curve and ACC denotes the accuracy value. See Table 1 below.
TABLE 1
As shown in Table 1, in a specific embodiment provided in the present specification, three groups of ablation verification methods are used: methods based on biometric identification, methods based on the deep learning branch, and methods based on the hybrid configuration.
In the method based on biometric identification, three configurations were tested: "MAL", "MALRO" and "MALRO+". "MAL" is the configuration that applies logistic ordered regression training only to the average HU of the liver and the user attribute information; "MALRO" adds sampling of the liver target region on the basis of "MAL"; and "MALRO+" adds spleen-related biomarker information on the basis of "MALRO".
In the method based on the deep learning branch, three configurations were also tested: "3DN", "3DNT" and "3DNT+". "3DN" is the basic configuration using only 3D-ResNet34; "3DNT" adds a multi-head self-attention mechanism on the basis of "3DN"; and "3DNT+" adds texture coding on the basis of "3DNT".
In the method based on the hybrid configuration, four configurations were tested: "Bio-3DNT", "Bio-3DNT_T", "Bio-3DNT_R" and "Bio-3DNT_TR". "Bio-3DNT" is the configuration combining biometric identification and deep learning; on this basis, a teacher model "Bio-3DNT_T", a radiologist knowledge model "Bio-3DNT_R" and a teacher-radiologist knowledge combination model "Bio-3DNT_TR" are also provided, with "Bio-3DNT" chosen as the teacher model. The sample labels of the teacher model use only standard sample detection results, the sample labels of the radiologist knowledge model use only reference sample detection results, and the sample labels of the teacher-radiologist knowledge combination model "Bio-3DNT_TR" use both standard sample detection results and reference sample detection results.
Referring to Table 1, compared with the method based on biometric identification, introducing deep learning greatly improves the overall accuracy, and adding the multi-head self-attention mechanism and texture coding to the deep learning method further improves performance. When deep learning is combined with biometric identification, the overall performance improves substantially. In the teacher-radiologist knowledge combination model "Bio-3DNT_TR", which combines the standard sample detection results and the reference sample detection results, model performance is improved jointly by the radiologist indications and the refinement of the teacher model.
By the method provided in the embodiments of the present specification, a trained image processing model is obtained, and in practical application the biometric identification information corresponding to the target images is also referenced. Detection region feature information is generated based on each target image, detection region texture information is then extracted from the detection region feature information, the final image coding features are generated based on the detection region feature information, the detection region texture information and the biometric identification information, and distillation feature information and classification feature information are added before image decoding, so that the distillation features and classification features are referenced during decoding and the final detection result is more accurate.
Corresponding to the above method embodiments, the present disclosure further provides an image processing apparatus embodiment, and fig. 8 shows a schematic structural diagram of an image processing apparatus according to one embodiment of the present disclosure. As shown in fig. 8, the apparatus includes:
a receiving module 802 configured to receive an image processing task, where the image processing task carries a plurality of target images corresponding to a target detection area, and the image processing task is configured to detect whether the target detection area is abnormal;
the detection module 804 is configured to input the multiple target images into an image processing model, and obtain a detection result corresponding to the target detection area, where the image processing model generates detection area feature information and detection area texture information based on each target image, and generates the detection result based on the detection area feature information and the detection area texture information.
Optionally, the image processing model includes an encoder, a decoder, and a classifier;
the detection module 804 is further configured to:
inputting the target images into the encoder to obtain image coding features, wherein the image coding features are determined based on detection region feature information and detection region texture information corresponding to each target image;
Inputting the image coding features to the decoder to obtain image decoding features corresponding to the image coding features;
and inputting the image decoding characteristics into the classifier to obtain a detection result corresponding to the target detection area.
Optionally, the encoder comprises an image encoding unit, a texture encoding unit and a feature fusion unit;
the detection module 804 is further configured to:
inputting the plurality of target images into the image coding unit to obtain detection region characteristic information corresponding to the plurality of target images;
inputting the detection region characteristic information into the texture coding unit to obtain detection region texture information corresponding to the detection region characteristic information;
and inputting the detection region characteristic information and the detection region texture information into the characteristic fusion unit to obtain image coding characteristics.
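For the three-unit encoder just listed, the following is a minimal illustrative sketch; the specific layers (3D convolutions) and dimensions are assumptions, since the description does not fix the internals of the units.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Sketch: image coding unit -> texture coding unit -> feature fusion unit."""
    def __init__(self, dim=256):
        super().__init__()
        self.image_coding_unit = nn.Conv3d(1, dim, kernel_size=3, padding=1)       # region features
        self.texture_coding_unit = nn.Conv3d(dim, dim, kernel_size=3, padding=1)    # texture information
        self.feature_fusion_unit = nn.Conv3d(2 * dim, dim, kernel_size=1)           # fuse both

    def forward(self, target_images):
        # target_images: (B, 1, D, H, W) stacked target images of the detection area
        region_feats = self.image_coding_unit(target_images)
        texture_feats = self.texture_coding_unit(region_feats)
        fused = self.feature_fusion_unit(torch.cat([region_feats, texture_feats], dim=1))
        return fused   # image coding features
```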
Optionally, the detection module 804 is further configured to:
adding distillation characteristic information and classification characteristic information for the image coding characteristics to obtain image coding characteristics to be processed;
and inputting the image coding feature to be processed to the decoder to obtain an image decoding feature.
Optionally, the detection module 804 is further configured to:
inputting the image decoding characteristics into the classifier to obtain a first classification result corresponding to the distillation characteristic information and a second classification result corresponding to the classification characteristic information;
and generating the detection result according to the first classification result and the second classification result.
Optionally, the apparatus further includes:
an acquisition module configured to acquire biomarker information corresponding to a plurality of target images;
the detection module 804 is further configured to input the multiple target images and the biomarker information into an image processing model, so as to obtain a detection result corresponding to the target detection area.
Optionally, the detection module 804 is further configured to:
splicing the biomarker information to the image coding feature to be processed to obtain a spliced image coding feature;
and inputting the spliced image coding features to the decoder to obtain image decoding features.
Optionally, the apparatus further comprises a training module configured to:
obtaining a training sample pair, wherein the training sample pair comprises a plurality of training sample images corresponding to a target detection area and sample detection results of the target detection area, and the sample detection results comprise standard sample detection results or reference sample detection results;
Inputting the training sample images into an image processing model to obtain a prediction detection result corresponding to the target detection area;
calculating a model loss value according to the prediction detection result and the sample detection result;
and adjusting model parameters of the image processing model according to the model loss value until a model training stopping condition is reached.
Optionally, the prediction detection result includes a first prediction classification result corresponding to distillation characteristic information and a second prediction classification result corresponding to classification characteristic information;
the training module is further configured to:
calculating a first loss value according to the reference sample detection result and the first prediction classification result; or
calculating a second loss value according to the standard sample detection result and the second prediction classification result;
and adjusting model parameters of the image processing model according to the first loss value and the second loss value.
According to the device provided by the embodiment of the present specification, in the process of identifying the plurality of target images, detection area feature information is generated based on each target image, detection area texture information is extracted from the detection area feature information, and the final image coding features are generated based on the detection area feature information and the detection area texture information; distillation feature information and classification feature information are added before image decoding, so that the distillation features and classification features are referenced during decoding and the final detection result is more accurate.
The above is a schematic scheme of an image processing apparatus of the present embodiment. It should be noted that, the technical solution of the image processing apparatus and the technical solution of the image processing method belong to the same concept, and details of the technical solution of the image processing apparatus, which are not described in detail, can be referred to the description of the technical solution of the image processing method.
Fig. 9 illustrates a block diagram of a computing device 900 provided in accordance with one embodiment of the present specification. The components of computing device 900 include, but are not limited to, memory 910 and processor 920. Processor 920 is coupled to memory 910 via bus 930 with database 950 configured to hold data.
Computing device 900 also includes an access device 940, and access device 940 enables computing device 900 to communicate via one or more networks 960. Examples of such networks include public switched telephone networks (PSTN, Public Switched Telephone Network), local area networks (LAN, Local Area Network), wide area networks (WAN, Wide Area Network), personal area networks (PAN, Personal Area Network), or combinations of communication networks such as the Internet. Access device 940 may include one or more of any type of network interface, wired or wireless, such as a network interface card (NIC, Network Interface Controller), an IEEE 802.11 wireless local area network (WLAN, Wireless Local Area Network) wireless interface, a worldwide interoperability for microwave access (Wi-MAX, Worldwide Interoperability for Microwave Access) interface, an Ethernet interface, a universal serial bus (USB, Universal Serial Bus) interface, a cellular network interface, a Bluetooth interface, a near field communication (NFC, Near Field Communication) interface, and the like.
In one embodiment of the present description, the above-described components of computing device 900 and other components not shown in FIG. 9 may also be connected to each other, for example, by a bus. It should be understood that the block diagram of the computing device illustrated in FIG. 9 is for exemplary purposes only and is not intended to limit the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 900 may be any type of stationary or mobile computing device including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smart phone), wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or personal computer (PC, personal Computer). Computing device 900 may also be a mobile or stationary server.
Wherein the processor 920 is configured to execute computer-executable instructions that, when executed by the processor, implement the steps of the image processing method described above.
The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that, the technical solution of the computing device and the technical solution of the image processing method belong to the same concept, and details of the technical solution of the computing device, which are not described in detail, can be referred to the description of the technical solution of the image processing method.
An embodiment of the present disclosure also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the image processing method described above.
The above is an exemplary version of a computer-readable storage medium of the present embodiment. It should be noted that, the technical solution of the storage medium and the technical solution of the image processing method belong to the same concept, and details of the technical solution of the storage medium which are not described in detail can be referred to the description of the technical solution of the image processing method.
An embodiment of the present specification also provides a computer program, wherein the computer program, when executed in a computer, causes the computer to perform the steps of the image processing method described above.
The above is an exemplary version of a computer program of the present embodiment. It should be noted that, the technical solution of the computer program and the technical solution of the image processing method belong to the same conception, and details of the technical solution of the computer program, which are not described in detail, can be referred to the description of the technical solution of the image processing method.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code that may be in source code form, object code form, executable file or some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the embodiments are not limited by the order of actions described, as some steps may be performed in other order or simultaneously according to the embodiments of the present disclosure. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily all required for the embodiments described in the specification.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are merely used to help clarify the present specification. Alternative embodiments are not intended to be exhaustive or to limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the teaching of the embodiments. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. This specification is to be limited only by the claims and the full scope and equivalents thereof.

Claims (14)

1. An image processing method, comprising:
receiving an image processing task, wherein the image processing task carries a plurality of target images corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not;
and inputting the plurality of target images into an image processing model to obtain a detection result corresponding to the target detection area, wherein the image processing model generates detection area characteristic information and detection area texture information based on each target image, and generates the detection result based on the detection area characteristic information and the detection area texture information.
2. The method of claim 1, the image processing model comprising an encoder, a decoder, and a classifier;
inputting the plurality of target images into an image processing model to obtain a detection result corresponding to the target detection area, wherein the method comprises the following steps:
inputting the target images into the encoder to obtain image coding features, wherein the image coding features are determined based on detection region feature information and detection region texture information corresponding to each target image;
inputting the image coding features to the decoder to obtain image decoding features corresponding to the image coding features;
and inputting the image decoding characteristics into the classifier to obtain a detection result corresponding to the target detection area.
3. The method of claim 2, the encoder comprising an image encoding unit, a texture encoding unit, and a feature fusion unit;
inputting the plurality of target images to the encoder to obtain image encoding features, comprising:
inputting the plurality of target images into the image coding unit to obtain detection region characteristic information corresponding to the plurality of target images;
inputting the detection region characteristic information into the texture coding unit to obtain detection region texture information corresponding to the detection region characteristic information;
And inputting the detection region characteristic information and the detection region texture information into the characteristic fusion unit to obtain image coding characteristics.
4. The method of claim 2, inputting the image encoding features to the decoder to obtain image decoding features corresponding to the image encoding features, comprising:
adding distillation characteristic information and classification characteristic information for the image coding characteristics to obtain image coding characteristics to be processed;
and inputting the image coding feature to be processed to the decoder to obtain an image decoding feature.
5. The method of claim 4, inputting the image decoding feature to the classifier to obtain a detection result corresponding to the target detection region, comprising:
inputting the image decoding characteristics into the classifier to obtain a first classification result corresponding to the distillation characteristic information and a second classification result corresponding to the classification characteristic information;
and generating the detection result according to the first classification result and the second classification result.
6. The method of claim 4, further comprising:
acquiring biomarker information corresponding to a plurality of target images;
and inputting the target images and the biomarker information into an image processing model to obtain a detection result corresponding to the target detection region.
7. The method of claim 6, inputting the image encoding feature to be processed to the decoder to obtain an image decoding feature, comprising:
splicing the biomarker information to the image coding feature to be processed to obtain a spliced image coding feature;
and inputting the spliced image coding features to the decoder to obtain image decoding features.
8. The method of claim 1, the image processing model being obtained by training:
obtaining a training sample pair, wherein the training sample pair comprises a plurality of training sample images corresponding to a target detection area and sample detection results of the target detection area, and the sample detection results comprise standard sample detection results or reference sample detection results;
inputting the training sample images into an image processing model to obtain a prediction detection result corresponding to the target detection area;
calculating a model loss value according to the prediction detection result and the sample detection result;
and adjusting model parameters of the image processing model according to the model loss value until a model training stopping condition is reached.
9. The method of claim 8, wherein the predicted detection results comprise a first predicted classification result corresponding to distillation characteristic information and a second predicted classification result corresponding to classification characteristic information;
Calculating a model loss value according to the prediction detection result and the sample detection result, including:
calculating a first loss value according to the reference sample detection result and the first prediction classification result; or
calculating a second loss value according to the standard sample detection result and the second prediction classification result;
correspondingly, adjusting the model parameters of the image processing model according to the model loss value comprises the following steps:
and adjusting model parameters of the image processing model according to the first loss value and the second loss value.
10. A CT image processing method, comprising:
receiving a CT image processing task, wherein the CT image processing task carries a plurality of CT images corresponding to a target detection area, and the CT image processing task is used for detecting whether the target detection area is abnormal or not;
and inputting the CT images into a CT image processing model to obtain a detection result corresponding to the target detection region, wherein the CT image processing model generates detection region characteristic information and detection region texture information based on each CT image, and generates the detection result based on the detection region characteristic information and the detection region texture information.
11. A training method of an image processing model, applied to cloud-side equipment, comprising:
obtaining a training sample pair, wherein the training sample pair comprises a plurality of training sample images corresponding to a target detection area and sample detection results of the target detection area, and the sample detection results comprise standard sample detection results or reference sample detection results;
inputting the training sample images into an image processing model to obtain a prediction detection result corresponding to the target detection area;
calculating a model loss value according to the prediction detection result and the sample detection result;
adjusting model parameters of the image processing model according to the model loss value until model training stopping conditions are reached, and obtaining model parameters of the image processing model;
and sending the model parameters of the image processing model to end-side equipment.
12. An image processing method, comprising:
receiving an image processing request sent by a user, wherein the image processing request comprises an image processing task, the image processing task carries a plurality of target images corresponding to a target detection area, and the image processing task is used for detecting whether the target detection area is abnormal or not;
Inputting the plurality of target images into an image processing model to obtain a detection result corresponding to the target detection area, wherein the image processing model generates detection area characteristic information and detection area texture information based on each target image, and generates the detection result based on the detection area characteristic information and the detection area texture information;
and sending a detection result corresponding to the target detection area to a user.
13. A computing device, comprising:
a memory and a processor;
the memory is configured to store computer executable instructions, the processor being configured to execute the computer executable instructions, which when executed by the processor, implement the steps of the method of any one of claims 1 to 12.
14. A computer readable storage medium storing computer executable instructions which when executed by a processor implement the steps of the method of any one of claims 1 to 12.