CN116977338A - Chromosome case-level abnormality prompting system based on visual semantic association - Google Patents

Info

Publication number
CN116977338A
Authority
CN
China
Prior art keywords
vector
text
abnormal
type
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311235013.2A
Other languages
Chinese (zh)
Other versions
CN116977338B (en
Inventor
穆阳
张金超
高悦
汤滔
徐思
邓代华
邹磊
刘丽珏
蔡昱峰
彭伟雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Zixing Wisdom Medical Technology Co ltd
Original Assignee
Hunan Zixing Wisdom Medical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Zixing Wisdom Medical Technology Co ltd filed Critical Hunan Zixing Wisdom Medical Technology Co ltd
Priority to CN202311235013.2A priority Critical patent/CN116977338B/en
Publication of CN116977338A publication Critical patent/CN116977338A/en
Application granted granted Critical
Publication of CN116977338B publication Critical patent/CN116977338B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761Proximity, similarity or dissimilarity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10056Microscopic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Quality & Reliability (AREA)
  • Radiology & Medical Imaging (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a chromosome case-level abnormality prompting system based on visual semantic association. A karyogram preprocessing module segments the chromosomes and concatenates a position code to each, and a karyotype image encoding vector module then encodes the karyogram into an image feature vector. N abnormal karyotypes are input into a text encoder to obtain text feature vectors, from which an abnormal text vector base is constructed. Finally, the feature similarity between the image feature vector and the abnormal text vector base is calculated, and the highest similarity is compared with a set threshold: if it reaches the threshold, the designated karyotype is output; otherwise no karyotype is output. The text encoder guides the image encoder to learn important features, and numerical and structural abnormalities are detected directly end to end, so the system is simple and effective.

Description

Chromosome case-level abnormality prompting system based on visual semantic association
Technical Field
The application relates to the technical field of medical artificial intelligence, in particular to a chromosome case-level abnormality prompting system based on visual semantic association.
Background
Chromosome karyotyping is a technique for detecting numerical and structural abnormalities in human chromosomes. However, hundreds of metaphase images of varying quality are captured from a single patient's slide, and at least 30 metaphase images must be analyzed per patient to determine whether an abnormality (numerical or structural) exists. The doctor has no prior knowledge of any individual metaphase image, so misdiagnosis and missed diagnosis occur easily. To reduce the burden on professionals and improve the efficiency of karyotype analysis, a chromosome case-level abnormality prompting system based on visual semantic association is provided. The system makes a preliminary judgment on a patient case before the doctor's analysis, so that the doctor can analyze the corresponding images in a targeted manner, thereby improving the accuracy and efficiency of karyotype analysis.
Similar methods for analyzing chromosome abnormalities have been developed in the current chromosome karyotype analysis technology, mainly including the following methods.
Rule-based methods: whether a chromosome is normal is judged by hand-set rules such as the chromosome length and the position of the central point. This approach works well on images with a clear chromosome structure, but the rules are difficult to design and it performs poorly on karyotype images of poor quality.
Traditional machine learning based methods: features such as chromosome morphology and gray level are extracted by feature engineering and input into a classification model for training, to judge whether a chromosome is normal. This approach is sensitive to feature design and works poorly when the image is complex or has poor resolution.
Deep learning based methods: a convolutional neural network is trained directly on chromosome images for anomaly detection. This approach is limited in the types of abnormality it can detect, typically a specific one such as an inversion of chromosome 9 or a der(13;15) Robertsonian translocation. A detection model trained for a specific karyotype is highly limited and can only prompt that one type of abnormality. Therefore, a system that prompts at the case level from metaphase images, with no restriction on the abnormality categories prompted, is more reasonable and effective.
Disclosure of Invention
The application aims to provide a chromosome case-level abnormality prompting system based on visual semantic association that prompts case-level abnormalities with the single metaphase image as the unit. An abnormality sign shown on a case in the system indicates that the case is likely abnormal and reminds doctors to pay particular attention, and the corresponding metaphase images are also marked with abnormality signs. In short, the data of a case is segmented and recognized to obtain the corresponding karyograms. The karyograms are then automatically input into the model of the application to obtain the abnormality information of each karyogram in the case. Finally, the amount of abnormality information in the case is counted against a set threshold; if the threshold is met, a case abnormality is raised and displayed on the case, thereby prompting the abnormality of the case.
The application provides a chromosome case-level abnormality prompting system based on visual semantic association, which comprises:
the karyogram preprocessing module, used for segmenting the chromosomes of the input karyogram by category and outputting 24 chromosome images with concatenated position codes;
the model offline training module, used for training the model, reading the data of the local karyotype image database with multiple threads and performing distributed training on multiple GPUs;
the karyotype image encoding vector module, used for encoding the chromosome karyogram into an image feature vector;
the karyotype text encoding module, used for encoding abnormal karyotype text information into text feature vectors;
the abnormal karyotype text vectorization module, used for inputting N abnormal karyotypes into the text encoder to obtain abnormal text feature vectors and constructing an abnormal text vector base;
the feature vector similarity calculation module, used for calculating the feature similarity between the image feature vector and the abnormal text vector base, outputting the highest similarity and judging whether it reaches a set threshold, outputting the designated karyotype if it does and outputting no karyotype otherwise;
and the user interaction interface, used for displaying the case-level abnormality information judged by the feature vector similarity calculation module and the abnormality information of each single metaphase image in the case.
Specifically, the karyogram preprocessing module is responsible for segmenting the chromosomes of an input karyogram and outputting 24 segmented chromosome images by category; the maximum size over all chromosome blocks is taken as the standard size, and each chromosome image is padded with the pixel value 255 to reach the standard size, the standard size being defined as 128 x 128;
the karyotype image encoding vector module acts as the hub linking visual information and semantic information throughout the process, and extracts high-level semantic features of the image by means of the strong multi-modal representation capability of the model; unlike pure pixel-level information, these semantic features focus on the visual patterns of chromosomes and are associated with linguistic concepts; the encoding vector finally output fully fuses the visual and semantic information of the karyotype image; these encoding features can be used directly for abnormality diagnosis, or can be input into other diagnostic modules to enhance their effect;
the abnormal karyotype text information encoder is used for inputting N abnormal karyotype text entries into the text encoder to obtain feature vectors and constructing an abnormal text vector base;
the feature vector similarity calculation module uses cosine similarity, as between the feature vector of a reconstructed chromosome and the feature vector of the real image, to calculate the similarity between the karyotype image encoding vector and the vectors of the abnormal text vector base;
the user interaction interface is used for displaying the case-level abnormality information and the abnormality information of each single metaphase image in the case, directly prompting the abnormality identification result and accurately locating the abnormal metaphase images in the case.
Further, in segmenting the chromosomes of the karyogram, the chromosomes are divided into blocks by karyotype category; each chromosome image is resized so that its long side equals the standard length 128, the padding amount for the short side is then calculated, and pixels of value 255 are padded on both sides of the short side until it matches the long side, 128; the padding formula is as follows:
$$\mathrm{pad}_h = \frac{H - h}{2}$$
where H denotes the standard length 128, h is the size of the short side to be padded, and pad_h denotes the padding amount on each of the two sides of the short side.
Further, the karyogram preprocessing module includes: first dividing the karyogram by category into 24 patches in total (chromosomes 1-22, X and Y), obtaining 24 vectors through linear mapping, concatenating to each vector a position code recording its position, and inputting the 24 vectors into a Transformer encoder to obtain the corresponding image encoding vector.
Further, the linear mapping converts 24 patches of 128 x 128 into 24 768-dimensional vectors by convolution, pooling, activation operations, and batch normalization.
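Further, the batch normalization adjustment process is as follows:
$$\mu = \frac{1}{m}\sum_{i=1}^{m} x_i \tag{1}$$
$$\sigma^2 = \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu)^2 \tag{2}$$
$$\hat{x}_i = \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon}} \tag{3}$$
$$y_i = \gamma \hat{x}_i + \beta \tag{4}$$
where $\mu$ is the mean of the input patches, $x_i$ represents the input feature map, $\sigma^2$ is the variance of the patches, $\hat{x}_i$ is the normalized value of $x_i$, $\gamma$ is the scaling parameter, $\beta$ is the translation parameter, $y_i$ is the normalized value for each patch, and i = [1, 2, 3, ..., 24], so that m = 24. Here $\epsilon$ is the small constant that standard batch normalization adds to the variance for numerical stability.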
Further, the data of the local karyotype image database is derived from millions of metaphase images annotated with karyotype analysis results, each metaphase image corresponding to one karyotype information text.
Further, the karyotype text information is tokenized and input into the karyotype text encoder to obtain a karyotype text vector; contrastive learning is then performed against the image encoding vector obtained by the karyotype image encoding vector module, and the loss function is optimized so that the image encoding vector and the karyotype text vector tend to be consistent, the loss function being as follows:
$$L_q = -\log \frac{\exp(q \cdot k_{+} / T)}{\sum_{i=0}^{k} \exp(q \cdot k_i / T)}$$
where q represents the vector obtained by the image encoder, $k_{+}$ represents the correct text vector matching q, and $k_i$ are the candidate text vectors. The scalar k refers to the number of categories in the dataset; in contrastive learning it is the number of negative samples. The sum in the denominator runs over 1 positive sample and k negative samples, from 0 to k, so k+1 samples in total. T is a scalar hyper-parameter with default value T = 1, and $L_q$ represents the loss value of one sample.
Further, the abnormal text vector base is constructed as follows: k+1 abnormal karyotype entries are prepared in advance, their karyotype expressions being obtained through annotation; the k+1 karyotype entries are input into the trained BERT text encoder to obtain the feature vector of each entry; finally, a karyotype information feature vector library with k+1 vectors of length 768 is established.
Further, the feature vector similarity is calculated as follows: a karyogram is input into the image encoder to obtain a feature vector Q, and the cosine similarity between Q and every vector of the abnormal text vector base is calculated, giving the highest cosine similarity D in [0, 1]; when D is greater than the threshold, the matched karyotype text information is considered credible, and otherwise it is not, the threshold being the criterion for judging whether the abnormal karyotype is correct; the similarity calculation formula is as follows:
$$\cos(Q, b_j) = \frac{Q \cdot b_j}{\lVert Q \rVert \, \lVert b_j \rVert}$$
where Q is the karyogram feature vector and $b_j$ is a karyotype information feature vector stored in the abnormal text vector base.
Further, the user interaction interface marks structural anomalies by pointing with arrows, marks the quantity anomalies by circles, and prompts text information for the anomalies.
The application has the beneficial effects that:
the application takes the karyogram directly as input regardless of the slide preparation quality, and has better image modeling capability: because the karyogram carries category prior information and is divided into blocks by category, the global content and local relations of the karyogram can be represented more fully;
the application performs end-to-end training: the encoders can be jointly optimized, the text encoder guides the image encoder to learn important features, and numerical and structural abnormalities are detected directly end to end, which is simple and effective; the specific abnormality can be prompted accurately; semantic information is introduced, so the output conforms to clinical conventions of expression;
the application is highly interpretable: detection results are explained through text descriptions, so that the suspected numerical and structural abnormalities across the multiple images of a case can be seen, allowing a doctor or professional to determine quickly and efficiently whether the case is truly abnormal, which improves understandability;
the application is flexible and general: it can be extended to other pathological image classification and detection tasks, and further modal information can be added, which facilitates model transfer;
the application is easy to optimize: data can continue to be collected, the model architecture adjusted, and so on, to improve the effect iteratively.
Drawings
For a further understanding of the nature and technical aspects of the present application, reference should be made to the following detailed description of the application and to the accompanying drawings, which are provided for purposes of reference only and are not intended to limit the application.
In the drawings,
FIG. 1 is a block diagram of the application;
FIG. 2 is a flow chart of the operation of the application;
FIG. 3 is a diagram of the image encoder model of the application;
FIG. 4 is a diagram of the text encoder model of the application;
FIG. 5 is a diagram of the karyotype text vector base of the application;
FIG. 6 is a schematic diagram of the abnormality prompting system of the application.
Detailed Description
In order to further explain the technical means adopted by the present application and the effects thereof, the following detailed description is given with reference to the preferred embodiments of the present application and the accompanying drawings.
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to fall within the scope of the application.
In the description of the present application, it should be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", etc. indicate orientations or positional relationships based on the drawings are merely for convenience in describing the present application and simplifying the description, and do not indicate or imply that the apparatus or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present application. Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more features. In the description of the present application, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
In the present application, the term "exemplary" is used to mean "serving as an example, instance, or illustration." Any embodiment described as "exemplary" in this disclosure is not necessarily to be construed as preferred or advantageous over other embodiments. The following description is presented to enable any person skilled in the art to make and use the application. In the following description, details are set forth for purposes of explanation. It will be apparent to one of ordinary skill in the art that the present application may be practiced without these specific details. In other instances, well-known structures and processes have not been described in detail so as not to obscure the description of the application with unnecessary detail. Thus, the present application is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
Examples
Referring to figs. 1-6, the present application can be divided into three parts: the local karyotype image database, the model offline training module and the karyotype image encoding vector module form one part; the karyotype text encoding module, the abnormal karyotype text vectorization module and the feature vector similarity calculation module form another part; and the user interaction interface forms the third part.
Firstly, the implementation of the local karyotype image database, the model offline training module, the karyotype image encoding module and the karyotype text encoding module is described as follows:
the data of the local karyotype image database is derived from millions of metaphase images annotated with karyotype analysis results and is used as training data, with the karyotype information text of each metaphase image serving as its training label.
The left side of fig. 3 is the image encoder of the present application, which takes a karyogram as input and outputs an encoding vector. As shown in FIG. 3, the leftmost part is a complete karyogram. With the complete karyogram as input, the model can hardly distinguish which category is abnormal even when a chromosome abnormality is present. To solve this problem of category position encoding, the application first divides the complete karyogram by category into small chromosome blocks, 24 patches in all (chromosomes 1-22, X and Y). 24 vectors are obtained through the linear mapping on the right side of fig. 3, and a position code, such as (1, 2, 3, ..., 24), is concatenated to each vector to record the position of each category vector. These vectors are then input to a Transformer encoder to obtain the corresponding image encoding vector.
On the right side of fig. 3 is the linear mapping module, which converts 24 patches of 128 x 128 into 24 768-dimensional vectors by convolution, pooling, activation operations, and batch normalization. Batch normalization is added to stabilize the input data, enhance the generalization ability of the model, and reduce the dependence of the gradient on the scale of the initial parameter values.
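The position-code step described above can be illustrated with a minimal sketch. The source does not specify the form of the position code beyond the example (1, 2, 3, ..., 24), so appending the scalar class index to each 768-dimensional vector is an assumption made here for illustration; `CHROMOSOME_CLASSES` and `append_position_codes` are hypothetical names, and the linear mapping itself is replaced by placeholder vectors.

```python
# Hypothetical class order: chromosomes 1-22, then X and Y (24 categories).
CHROMOSOME_CLASSES = [str(i) for i in range(1, 23)] + ["X", "Y"]


def append_position_codes(patch_vectors):
    """Concatenate a position code (1..24) to each per-class patch vector.

    Assumption: the code is a single scalar appended at the end of the
    vector; the patent only says a position code is spliced to each vector.
    """
    if len(patch_vectors) != len(CHROMOSOME_CLASSES):
        raise ValueError("expected one vector per chromosome class (24)")
    return [
        list(vector) + [float(position)]
        for position, vector in enumerate(patch_vectors, start=1)
    ]


# Placeholder 768-dimensional vectors standing in for the linear-mapping output.
patch_vectors = [[0.0] * 768 for _ in CHROMOSOME_CLASSES]
encoded_inputs = append_position_codes(patch_vectors)
```

Because every patch is tied to a fixed chromosome class, the position code doubles as a class label, which is how the description says the model can tell which chromosome is abnormal.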
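The batch normalization is computed as follows:
$$\mu = \frac{1}{m}\sum_{i=1}^{m} x_i, \qquad \sigma^2 = \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu)^2, \qquad \hat{x}_i = \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon}}$$
In the above formulas, $\mu$ is the mean of the input patches, $x_i$ represents the input feature map, $\sigma^2$ is the variance of the patches, and $\hat{x}_i$ is the normalized value of $x_i$; m = 24 is the number of patches and $\epsilon$ is the small constant that standard batch normalization adds for numerical stability. The final normalized values are:
$$y_i = \gamma \hat{x}_i + \beta$$
The two parameters $\gamma$ and $\beta$ are introduced to perform scaling and translation; because the network can learn these two reconstruction parameters, the model can recover the feature distribution of the original network.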
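As a rough numerical illustration of the batch normalization step described above, the sketch below normalizes a list of scalar values. It assumes one scalar per patch for simplicity (a real layer would normalize per channel over feature maps), and the `eps` default and the function name `batch_norm` are assumptions rather than values taken from the source.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize values to zero mean and unit variance, then scale and shift.

    gamma (scaling) and beta (translation) are the two learnable parameters;
    eps is an assumed small constant guarding against division by zero.
    """
    m = len(values)
    mean = sum(values) / m
    variance = sum((x - mean) ** 2 for x in values) / m
    return [gamma * (x - mean) / math.sqrt(variance + eps) + beta for x in values]


normalized = batch_norm([1.0, 2.0, 3.0, 4.0])
rescaled = batch_norm([1.0, 2.0, 3.0, 4.0], gamma=2.0, beta=1.0)
```

With `gamma=1` and `beta=0` the output has approximately zero mean and unit variance; other values let the network restore whatever scale and offset suit the following layers.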
The text encoder used in the application is shown in fig. 4. It uses a BERT model as the text vector extractor; BERT combines the representation learning ability of the Transformer with large-scale pre-training, is an important model for text representation and understanding, and is widely applied to various NLP tasks. The karyotype text information corresponding to a karyogram has the form karyotype: +18, t (9, 22). The information is first input to the tokenizer for word segmentation, and the resulting tokens are then input to the embedding of the text encoder to obtain a 768-dimensional text vector.
The model offline training module is used for training the model provided by the application. The model is a multi-modal machine learning model that obtains a joint visual and semantic representation mainly through contrastive learning between images and text. The image encoder model structure of the application is shown in FIG. 3: it divides a karyogram into a plurality of patches and then generates an embedding for each patch, like word embedding in NLP. These patch embeddings are input into the Transformer encoder along with position embeddings, and global features of the image are extracted through the self-attention mechanism. A conventional ViT (Vision Transformer) divides the input image into image blocks of equal size; for example, an input 224 x 224 image is divided into 16 x 16 patches, each of fixed size 14 x 14. However, the chromosomes in the karyogram of the application are not uniform in size or in number, so they cannot be divided at a fixed size; instead, the chromosomes are divided into blocks by karyotype category. First, each chromosome image is resized so that its long side equals the standard length 128; then the padding amount for the short side is calculated, and pixels of value 255 are padded on both sides of the short side until it matches the long side, 128. Each block is then input into the Transformer encoder by category, so that the model receives the category prior information and can obtain position information from the category, and thus knows which chromosome is abnormal. The padding formula is as follows:
$$\mathrm{pad}_h = \frac{H - h}{2}$$
where H represents the standard length 128, i.e. the size of the image block to be input into the model, h is the size of the short side to be padded, and pad_h represents the padding amount on each of the two sides of the short side.
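The padding step described above can be sketched in plain Python. This is a minimal illustration that assumes the image has already been resized so that its long side is 128, represents the image as a list of pixel rows, and puts the extra pixel on the second side when the difference is odd; the source does not say how odd differences are handled, and `pad_to_square` is a hypothetical name.

```python
def pad_to_square(image, standard=128, fill=255):
    """Center a chromosome image on a standard x standard canvas of `fill`.

    `image` is a list of rows of pixel values whose sides do not exceed
    `standard`. Padding is split across the two sides of each dimension;
    when the difference is odd, the second side gets the extra pixel
    (an assumption, since the source does not cover that case).
    """
    rows, cols = len(image), len(image[0])
    if rows > standard or cols > standard:
        raise ValueError("resize the image so its long side equals the standard first")
    top = (standard - rows) // 2
    bottom = standard - rows - top
    left = (standard - cols) // 2
    right = standard - cols - left
    padded = [[fill] * standard for _ in range(top)]
    for row in image:
        padded.append([fill] * left + list(row) + [fill] * right)
    padded.extend([fill] * standard for _ in range(bottom))
    return padded


# A 128 x 41 chromosome crop (all-black pixels) padded to 128 x 128.
chromosome = [[0] * 41 for _ in range(128)]
padded = pad_to_square(chromosome)
```

White (255) padding matches the background of a typical karyogram, so the padded region carries no chromosome signal.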
The padded karyogram is input to the image encoder of the present application, as shown in fig. 3, and a 768-dimensional karyogram feature vector q is obtained.
Next, as shown in fig. 4, the text encoder of the present application tokenizes the karyotype text information and inputs the tokens into the karyotype text encoder to obtain a karyotype text vector; contrastive learning is then performed between the karyogram feature vector q obtained by the image encoder and the karyotype text vector k obtained by the text encoder, so that the karyogram feature vector and the karyotype text vector tend to be consistent.
The loss function used in the present application is given below. The motivation for this design is that treating the problem as binary classification, with only data samples and noise samples, may be unfriendly to model learning, since many noise samples may not belong to a single class at all; it is therefore more reasonable to treat it as a multi-class problem.
$$L_q = -\log \frac{\exp(q \cdot k_{+} / T)}{\sum_{i=0}^{k} \exp(q \cdot k_i / T)}$$
where q represents the vector obtained by the image encoder, $k_{+}$ represents the correct text vector matching q, and $k_i$ are the candidate text vectors. The scalar k refers to the number of categories in the dataset; in contrastive learning it is the number of negative samples. The sum in the denominator runs over 1 positive sample and k negative samples, from 0 to k, so k+1 samples in total (the number of karyotype information categories). T is a scalar hyper-parameter with default value T = 1, and $L_q$ represents the loss value of one sample.
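The contrastive loss described above has the shape of the widely used InfoNCE loss, and the sketch below computes it for a single image vector under that reading. The function names and the tiny 2-dimensional vectors are illustrative assumptions; the log-sum-exp is shifted by the maximum logit for numerical stability, which does not change the value.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce_loss(q, text_vectors, positive_index, temperature=1.0):
    """Loss for one image vector q against k+1 candidate text vectors.

    text_vectors holds 1 positive and k negatives; positive_index marks the
    text vector that truly matches q. temperature plays the role of T.
    """
    logits = [dot(q, k_i) / temperature for k_i in text_vectors]
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum_exp - logits[positive_index]


q = [1.0, 0.0]
text_vectors = [[1.0, 0.0], [0.0, 1.0]]  # index 0 matches q, index 1 is a negative
loss_matched = info_nce_loss(q, text_vectors, positive_index=0)
loss_mismatched = info_nce_loss(q, text_vectors, positive_index=1)
```

The loss is smaller when q is closer to its matching text vector than to the negatives, which is what pulls the image and text representations together during training.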
Then, to construct the abnormal karyotype text vector base, k+1 abnormal karyotype entries are prepared in advance, all of them karyotype expressions obtained through annotation by professional chromosome karyotype analysts. The k+1 karyotype entries are input into the previously trained BERT text encoder to obtain the feature vector of each entry. Finally, a karyotype information feature vector library B with k+1 vectors of length 768 is established, as shown in fig. 5.
A karyogram is input into the image encoder to obtain a feature vector Q, and the cosine similarity between Q and every vector of the karyotype text vector base is calculated, giving the highest cosine similarity D in [0, 1]. When D is greater than the threshold, the matched karyotype text information is considered credible; otherwise it is not. The threshold is the criterion for judging whether the abnormal karyotype is correct. The similarity calculation formula is as follows:
$$\cos(Q, b_j) = \frac{Q \cdot b_j}{\lVert Q \rVert \, \lVert b_j \rVert}$$
where Q is the karyogram feature vector and $b_j$ is a karyotype information feature vector stored in the abnormal text vector base.
Finally, by taking the highest similarity and judging whether it reaches the set threshold, the system assists the doctor in judging whether a case has a numerical or structural abnormality.
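The matching step can be sketched as follows: compute the cosine similarity between the image vector and every entry of the vector base, keep the best match, and report it only if it exceeds the threshold. The 2-dimensional vectors, the dictionary layout of the base and the threshold value 0.8 are assumptions for illustration only; the source uses 768-dimensional vectors and does not state a threshold value.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_karyotype(q, vector_base, threshold):
    """Return (karyotype_text, similarity) for the best match above threshold.

    vector_base maps karyotype text to its text feature vector. If the highest
    similarity does not exceed the threshold, the text is None (no output).
    """
    best_text, best_similarity = None, float("-inf")
    for text, vector in vector_base.items():
        similarity = cosine_similarity(q, vector)
        if similarity > best_similarity:
            best_text, best_similarity = text, similarity
    if best_similarity > threshold:
        return best_text, best_similarity
    return None, best_similarity


# Toy vector base with two abnormal karyotype entries (illustrative values).
vector_base = {"+18": [1.0, 0.0], "t (9, 22)": [0.0, 1.0]}
confident = match_karyotype([0.9, 0.1], vector_base, threshold=0.8)
uncertain = match_karyotype([1.0, 1.0], vector_base, threshold=0.8)
```

Returning `None` when no entry clears the threshold corresponds to the "no karyotype is output" branch of the feature vector similarity calculation module.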
Application example:
as shown in FIG. 2, after an abnormal case is subjected to a segmentation and identification algorithm in the front of the system, a metaphase map, that is, a karyotype map corresponding to each cell in the case, is obtained, and the karyotype map is a map obtained by segmenting chromosomes from the metaphase map according to categories and placing the chromosomes at corresponding positions according to identification results. Inputting each karyogram in the case into a previously trained image encoder to obtain an image feature vector Q, and then inputting the vector Q and an abnormal karyotype text vector baseCalculating cosine similarity for matching, and finding out a core text vector ++f with the maximum similarity>. Then determine this similarity +.>Whether greater than the threshold we set.
The preceding steps yield the karyotype information, normal or abnormal, for each cell in the case. The frequency of each type of abnormal karyotype is then counted: if the proportion of cells showing a certain abnormal karyotype (for example, karyotype: +18, t(9, 11)) exceeds n (here n = 0.6, which is likewise a representative threshold), the case is indicated as indeed having that abnormality (karyotype: +18, t(9, 11)). A prompt flag is displayed in the system and, when the case is opened, the cells with the abnormality are also marked, as in FIG. 6, with arrows (structural abnormality) and red circles (numerical abnormality).
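The case-level decision can be illustrated with a short counting sketch. It assumes that each cell contributes either one abnormal karyotype string or `None` (normal, or no credible match) and that the ratio is taken over all cells in the case; both conventions, and the function name, are assumptions made for this example.

```python
from collections import Counter
from typing import List, Optional


def case_level_abnormalities(
    cell_karyotypes: List[Optional[str]],
    n: float = 0.6,
) -> List[str]:
    """Return the abnormal karyotypes whose share of cells in the case exceeds n."""
    total = len(cell_karyotypes)
    if total == 0:
        return []
    # Count only cells that carry an abnormal karyotype label.
    counts = Counter(k for k in cell_karyotypes if k is not None)
    return [karyotype for karyotype, count in counts.items() if count / total > n]


# 7 of 10 cells share the same abnormal karyotype: ratio 0.7 exceeds n = 0.6.
flagged = case_level_abnormalities(["+18, t(9, 11)"] * 7 + [None] * 3)
# 6 of 10 cells: ratio 0.6 does not exceed n = 0.6, so nothing is flagged.
unflagged = case_level_abnormalities(["+18, t(9, 11)"] * 6 + [None] * 4)
```

The strict comparison mirrors the wording "exceeds n"; a case sitting exactly at the threshold is therefore not flagged.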
The embodiments of the present application described above do not limit the scope of the present application.

Claims (10)

1. A chromosome case-level abnormality prompting system based on visual semantic association, comprising:
a karyogram preprocessing module, used for segmenting the chromosomes of the input karyogram by category and outputting 24 chromosome images with concatenated position codes;
a model offline training module, used for training the model, reading data from the local karyotype image database with multiple threads, and completing distributed training on multiple GPUs;
a karyotype image encoding module, used for encoding a chromosome karyogram into an image feature vector;
a karyotype text encoding module, used for encoding abnormal karyotype text information into text feature vectors;
an abnormal karyotype text vectorization module, used for inputting N abnormal karyotypes into the text encoder to obtain abnormal text feature vectors and constructing an abnormal text vector base;
a feature vector similarity calculation module, used for calculating the similarity between the image feature vector and the abnormal text vector base, outputting the highest similarity, and judging whether it reaches a set threshold; if the threshold is reached, the specified karyotype is output, otherwise no karyotype is output;
and a user interaction interface, used for displaying the case-level abnormality information determined by the feature vector similarity calculation module and the abnormality information of each single metaphase image in the case.
2. The visual semantic association-based chromosome case-level abnormality prompting system according to claim 1, wherein segmenting the chromosomes in the karyogram comprises dividing the karyogram into blocks by chromosome category, resizing the long side of each chromosome image to the standard length 128, then calculating the padding amount for the short side and padding both sides of the short side with pixel value 255 until it matches the long side length of 128; the padding formula is as follows:
$$\text{pad\_h} = \frac{H - h}{2}$$
where H denotes the standard length 128, h is the size of the short side to be padded, and pad_h denotes the padding amount on each of the two sides of the short side.
3. The visual semantic association-based chromosome case-level abnormality prompting system according to claim 1, wherein the karyogram preprocessing module first divides the karyogram by category into 24 patches in total (chromosomes 1 to 22 plus the X and Y chromosomes), obtains 24 vectors through linear mapping, concatenates to each vector a position code recording the position of that vector, and inputs the 24 vectors into a Transformer encoder to obtain the corresponding image encoding vector.
4. The visual semantic association-based chromosome case-level abnormality prompting system according to claim 3, wherein the linear mapping converts the 24 patches of 128 × 128 into 24 vectors of 768 dimensions through convolution, pooling, activation, and batch normalization operations.
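5. The visual semantic association-based chromosome case-level abnormality prompting system according to claim 4, wherein the batch normalization is computed as follows:
$$\mu = \frac{1}{m} \sum_{i=1}^{m} x_i \qquad (1)$$
$$\sigma^2 = \frac{1}{m} \sum_{i=1}^{m} (x_i - \mu)^2 \qquad (2)$$
$$\hat{x}_i = \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon}} \qquad (3)$$
$$y_i = \gamma \hat{x}_i + \beta \qquad (4)$$
where μ is the mean of the input patches, x_i denotes an input feature map, σ² is the variance of the patches, x̂_i is the normalized value of x_i, γ is the scaling parameter, β is the translation parameter, ε is a small constant that keeps the denominator nonzero, y_i is the normalized output for each patch, i = 1, 2, 3, ..., 24, and m = 24.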
6. The visual semantic association-based chromosome case-level abnormality prompting system according to claim 1, wherein the data of the local karyotype image database are derived from millions of metaphase images annotated with karyotype analysis results, and each metaphase image corresponds to one karyotype information text.
7. The visual semantic association-based chromosome case-level abnormality prompting system according to claim 6, wherein the karyotype information text is segmented into tokens and input into the text encoder to obtain a karyotype text vector; contrastive learning is then performed against the image encoding vector obtained by the karyotype image encoding module, and the loss function is minimized so that the image encoding vector and the karyotype text vector tend to be consistent, the loss function being as follows:
$$L_q = -\log \frac{\exp(q \cdot b_{+} / T)}{\sum_{i=0}^{k} \exp(q \cdot b_i / T)}$$
where q represents the vector obtained by the image encoder; b_+ represents the correct text vector matching q; b_i ranges over the candidate text vectors; k refers to the number of categories in the dataset and, in contrastive learning, to the number of negative samples; the sum in the denominator runs over 1 positive sample and k negative samples, from 0 to k, so k+1 samples in total; T is a scalar hyper-parameter with default value T = 1; and L_q represents the loss value of one sample.
8. The visual semantic association-based chromosome case-level abnormality prompting system according to claim 6, wherein the abnormal text vector base is constructed by: preparing k+1 items of abnormal karyotype information in advance, each being a karyotype expression obtained by annotation; inputting the k+1 items of karyotype information into the trained BERT text encoder to obtain the feature vector of each item; and finally establishing a karyotype information feature vector library containing k+1 vectors, each of length 768.
9. The visual semantic association-based chromosome case-level abnormality prompting system according to claim 6, wherein the feature vector similarity calculation is performed by: inputting a karyogram into the image encoder to obtain a feature vector Q, and computing the cosine similarity D between Q and every vector in the abnormal text vector base, with D ∈ [0, 1]; when D is greater than the threshold, the matched karyotype text information is considered credible, and otherwise it is considered not credible, the threshold being the threshold for judging whether the abnormal karyotype is correct; the similarity is calculated as follows:
$$D_j = \cos(Q, b_j) = \frac{Q \cdot b_j}{\lVert Q \rVert \, \lVert b_j \rVert}$$
where Q is the feature vector of the karyogram and b_j is the j-th karyotype information feature vector stored in the abnormal text vector base.
10. The visual semantic association-based chromosome case-level abnormality prompting system according to claim 1, wherein the user interaction interface identifies structural abnormalities with arrows, identifies numerical abnormalities with circles, and displays text prompts for abnormal cells.
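The numerical steps named in claims 2, 5 and 7 can be sketched in plain Python. This is a hedged illustration, not the claimed implementation: it assumes the padding amount is half the difference between the standard length and the short side, the standard batch normalization equations with a small constant `eps`, and an InfoNCE-style contrastive loss over one positive and k negative text vectors. All function names, default values and toy inputs are assumptions introduced for the example.

```python
import math
from typing import List, Sequence

STANDARD_LENGTH = 128  # standard side length H from claim 2
PAD_VALUE = 255        # fill value from claim 2


def pad_short_side(image: List[List[int]]) -> List[List[int]]:
    """Pad the columns of an image with 255 on both sides up to 128 columns.

    Assumes the long side is vertical and already resized to 128 rows.
    """
    h = len(image[0])  # size of the short side
    if h > STANDARD_LENGTH:
        raise ValueError("short side is longer than the standard length")
    pad_left = (STANDARD_LENGTH - h) // 2
    pad_right = STANDARD_LENGTH - h - pad_left  # absorbs an odd remainder
    return [[PAD_VALUE] * pad_left + list(row) + [PAD_VALUE] * pad_right
            for row in image]


def batch_normalize(x: Sequence[float], gamma: float = 1.0,
                    beta: float = 0.0, eps: float = 1e-5) -> List[float]:
    """Normalize values to zero mean and unit variance, then scale and shift."""
    m = len(x)
    mean = sum(x) / m
    var = sum((v - mean) ** 2 for v in x) / m
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in x]


def info_nce_loss(q: Sequence[float], positive: Sequence[float],
                  negatives: Sequence[Sequence[float]],
                  temperature: float = 1.0) -> float:
    """Loss for one image vector q against 1 positive and k negative text vectors."""
    def dot(u: Sequence[float], v: Sequence[float]) -> float:
        return sum(a * b for a, b in zip(u, v))

    logits = [dot(q, positive) / temperature]
    logits += [dot(q, neg) / temperature for neg in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    max_logit = max(logits)
    log_sum = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum - logits[0]


padded = pad_short_side([[0] * 50 for _ in range(128)])
normalized = batch_normalize([1.0, 2.0, 3.0])
loss = info_nce_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
```

The loss shrinks as q aligns with its matching text vector and moves away from the negatives, which is the sense in which the image and text vectors "tend to be consistent".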
CN202311235013.2A 2023-09-25 2023-09-25 Chromosome case-level abnormality prompting system based on visual semantic association Active CN116977338B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311235013.2A CN116977338B (en) 2023-09-25 2023-09-25 Chromosome case-level abnormality prompting system based on visual semantic association

Publications (2)

Publication Number Publication Date
CN116977338A true CN116977338A (en) 2023-10-31
CN116977338B CN116977338B (en) 2023-12-12

Family

ID=88477136

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant