CN113139956A - Generation method and identification method of section identification model based on language knowledge guidance

Info

Publication number
CN113139956A
CN113139956A (application CN202110516561.7A)
Authority
CN
China
Prior art keywords
training
model
tangent plane
category
corpus
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110516561.7A
Other languages
Chinese (zh)
Other versions
CN113139956B (en)
Inventor
倪东
何双池
杨鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen University filed Critical Shenzhen University
Priority to CN202110516561.7A priority Critical patent/CN113139956B/en
Publication of CN113139956A publication Critical patent/CN113139956A/en
Application granted granted Critical
Publication of CN113139956B publication Critical patent/CN113139956B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T7/0012: Image analysis; inspection of images, e.g. flaw detection; biomedical image inspection
    • G06F18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/23: Pattern recognition; clustering techniques
    • G06F18/24: Pattern recognition; classification techniques
    • G06T2207/10132: Image acquisition modality; ultrasound image
    • G06T2207/20081: Special algorithmic details; training, learning
    • G06T2207/20084: Special algorithmic details; artificial neural networks [ANN]
    • G06T2207/30012: Biomedical image processing; spine, backbone
    • G06T2207/30044: Biomedical image processing; fetus, embryo
    • G06T2207/30092: Biomedical image processing; stomach, gastric
    • G06T2207/30101: Biomedical image processing; blood vessel, artery, vein, vascular

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)
  • Ultrasonic Diagnosis Equipment (AREA)

Abstract

The application discloses a generation method and a recognition method for a section recognition model based on language knowledge guidance. The generation method comprises: training a feature extraction model based on a training sample set; determining model parameters of an initial classification model based on a training corpus and an initial graph neural network, and determining a predicted section category and a predicted anatomical structure category based on the feature extraction model and the initial classification model; training the initial graph neural network based on the predicted section category and the predicted anatomical structure category to obtain a classification model; and connecting the feature extraction model with the classification model to obtain the section recognition model. Because the section recognition model is trained on both the image features of the training ultrasound images and the semantic features formed from the annotation categories carried by those images, it can learn the interdependence among the image feature information of an ultrasound image, the section categories and the anatomical structure categories, which improves the recognition accuracy achievable with the section recognition model.

Description

Generation method and identification method of section identification model based on language knowledge guidance
Technical Field
The application relates to the technical field of ultrasonic image processing, in particular to a generation method and an identification method of a tangent plane identification model based on language knowledge guidance.
Background
Ultrasound imaging has been widely used in routine obstetric screening to assess fetal growth and development and congenital abnormalities. During ultrasound screening, a doctor needs to select a standard section containing key anatomical structures for the measurement of relevant biological parameters. For example, as shown in fig. 3, the prenatal upper abdominal horizontal cross section contains anatomical structures such as the spine, the gastric vacuole and the umbilical vein, and the fetal abdominal circumference, which is evaluated on this section, must be measured on a standard upper abdominal horizontal cross section. However, standard sections carrying anatomical structures are currently identified manually by operators, which depends heavily on operator experience; the identified standard sections therefore vary from operator to operator, which affects their accuracy.
Thus, the prior art has yet to be improved and enhanced.
Disclosure of Invention
The technical problem to be solved by the present application is to provide, in view of the above defects of the prior art, a method for generating a tangent plane recognition model based on language knowledge guidance and a corresponding recognition method.
In order to solve the above technical problem, a first aspect of the embodiments of the present application provides a method for generating a tangent plane recognition model based on language knowledge guidance, where the method includes:
training a preset network model in a supervised contrast learning manner based on a preset training sample set to obtain a feature extraction model;
determining initial model parameters based on a preset training corpus and an initial graph neural network, and taking the initial model parameters as model parameters of an initial classification model, wherein the training corpus is determined based on labeling tangent plane classes and labeling anatomical structure classes corresponding to training ultrasonic images in the training sample set;
determining a feature map corresponding to a training ultrasonic image in the training sample set based on the feature extraction model;
determining a predicted tangent plane class and a predicted anatomical structure class corresponding to the feature map based on the initial classification model;
training the initial graph neural network based on the predicted tangent plane type, the predicted anatomical structure type, the labeled tangent plane type and the labeled anatomical structure type, and continuing to execute the step of determining initial model parameters based on a preset training corpus and the initial graph neural network until a classification model is obtained through training;
and connecting the feature extraction model with the classification model to obtain a section identification model.
In the method for generating the tangent plane recognition model based on language knowledge guidance, before training the preset network model in a supervised contrast learning manner based on the preset training sample set to obtain the feature extraction model, the method further comprises:
acquiring a plurality of training ultrasonic images, and labeling section classes and labeling anatomical structure classes corresponding to the training ultrasonic images respectively to obtain a preset training sample set;
and for each training ultrasonic image in the plurality of training ultrasonic images, taking the labeled section category and the labeled anatomical structure category corresponding to the training ultrasonic image as a training corpus, so as to obtain a preset training corpus.
In the method for generating the tangent plane recognition model based on language knowledge guidance, the determining of initial model parameters based on a preset training corpus and an initial graph neural network specifically comprises:
determining a corpus graph corresponding to the training corpus based on the word vector matrix and the adjacency matrix corresponding to the training corpus;
and carrying out graph operation on the corpus graph based on the initial graph neural network so as to determine initial model parameters.
The generating method of the tangent plane recognition model based on the language knowledge guidance, wherein the determining the corpus map corresponding to the training corpus based on the word vector matrix and the adjacency matrix corresponding to the training corpus specifically includes:
each word vector in the word vector matrix is used as a graph node, and a connecting edge between every two graph nodes is determined based on the adjacent matrix;
and constructing a corpus graph corresponding to the training corpus based on the graph nodes and the connecting edges among the graph nodes.
The method for generating the tangent plane recognition model based on the language knowledge guidance comprises the steps that each word vector in the word vector matrix corresponds to a target labeling type, the target labeling type is a labeling tangent plane type or a labeling anatomical structure type carried by a training ultrasonic image in the training sample set, and the target labeling types corresponding to the word vectors are different from one another.
In the method for generating the tangent plane recognition model based on language knowledge guidance, the training of a preset network model in a supervised contrast learning manner based on a preset training sample set to obtain a feature extraction model specifically comprises:
clustering the word vectors of the word vector matrix corresponding to the training corpus to obtain a plurality of candidate categories, and dividing the training sample set into a plurality of training batches;
for each training ultrasonic image in each training batch, determining a reference category corresponding to the training ultrasonic image based on the labeled section category, the labeled anatomical structure category and the plurality of candidate categories carried by the training ultrasonic image, to obtain a reference training batch corresponding to the training batch;
inputting each training ultrasonic image in the training batch into a preset network model respectively, to obtain a training feature map corresponding to each training ultrasonic image;
constructing a supervised contrast loss function value corresponding to the training batch based on the training feature map corresponding to each training ultrasonic image;
and training the preset network model based on the supervised contrast loss function value to obtain a feature extraction model.
In the method for generating the tangent plane recognition model based on language knowledge guidance, the constructing of the supervised contrast loss function value corresponding to the training batch based on the training feature map corresponding to each training ultrasonic image specifically comprises:
for the training feature map corresponding to each training ultrasonic image, determining a plurality of first training feature maps and a plurality of second training feature maps corresponding to the training feature map based on the reference training batch, wherein the reference category corresponding to each first training feature map is the same as the reference category corresponding to the training feature map, and the reference category corresponding to each second training feature map is different from the reference category corresponding to the training feature map;
determining a first loss value between the training feature map and each corresponding first training feature map, and a second loss value between the training feature map and each corresponding second training feature map;
weighting each first loss value and each second loss value based on preset weighting coefficients to obtain a loss value corresponding to the training feature map;
and determining the supervised contrast loss function value corresponding to the training batch based on the loss value corresponding to each training feature map.
A second aspect of the embodiments of the present application provides a method for recognizing a tangent plane based on language knowledge guidance, where the recognition method applies a tangent plane recognition model generated as described in any one of the above, and specifically includes:
acquiring an ultrasonic image to be identified, and inputting the ultrasonic image to be identified into the section identification model;
determining a target feature map corresponding to the ultrasonic image to be identified by using the feature extraction model in the section identification model;
and determining the section category and the anatomical structure category corresponding to the ultrasonic image to be identified based on the target feature map by using the classification model in the section identification model.
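For illustration only, the following is a minimal sketch of this two-stage recognition flow, assuming a PyTorch implementation; the names `feature_extractor` and `classifier`, the tuple returned by the classifier, and the multi-label thresholding of anatomical structures are all assumptions, not part of the claimed method:

```python
import torch

def recognize_section(ultrasound_image, feature_extractor, classifier):
    """Run the two-stage section recognition model on one ultrasound image.

    `feature_extractor` and `classifier` are assumed to be trained
    torch.nn.Module instances; tensor shapes are illustrative.
    """
    feature_extractor.eval()
    classifier.eval()
    with torch.no_grad():
        # (1, C, H, W) -> target feature map / feature vector
        target_features = feature_extractor(ultrasound_image.unsqueeze(0))
        # assumed joint head: section logits plus anatomical-structure logits
        section_logits, anatomy_logits = classifier(target_features)
    section_class = section_logits.argmax(dim=1)
    # anatomical structures treated as multi-label: threshold sigmoid scores
    anatomy_classes = (anatomy_logits.sigmoid() > 0.5).nonzero(as_tuple=True)[1]
    return section_class, anatomy_classes
```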
A third aspect of the embodiments of the present application provides a computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs, which are executable by one or more processors to implement the steps in the method for generating a tangent plane recognition model based on language knowledge guidance as described in any one of the above and/or to implement the steps in the method for recognizing a tangent plane based on language knowledge guidance as described in the above.
A fourth aspect of the embodiments of the present application provides a terminal device, including: a processor, a memory, and a communication bus; the memory has stored thereon a computer readable program executable by the processor;
the communication bus realizes connection communication between the processor and the memory;
the processor, when executing the computer readable program, implements the steps of the method for generating the tangent plane recognition model based on language knowledge guidance as described in any one of the above, and/or implements the steps of the method for recognizing the tangent plane based on language knowledge guidance as described in the above.
Advantageous effects: compared with the prior art, the present application provides a generation method and a recognition method for a section recognition model based on language knowledge guidance. The generation method comprises: training a feature extraction model based on a preset training sample set; determining model parameters of an initial classification model based on a preset training corpus and an initial graph neural network, and then determining a predicted section category and a predicted anatomical structure category of a training ultrasound image based on the feature extraction model and the initial classification model; training the initial graph neural network based on the predicted section category, the predicted anatomical structure category, the labeled section category and the labeled anatomical structure category to obtain a classification model; and finally, connecting the feature extraction model with the classification model to obtain the section recognition model. Because the section recognition model is trained on both the image feature information of the training ultrasound images and the semantic features formed from the annotation categories carried by those images, it can learn the interdependence among the image feature information of an ultrasound image, the section categories and the anatomical structure categories, which improves the recognition accuracy achievable with the section recognition model.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without any inventive work.
Fig. 1 is a flowchart of a method for generating a tangent plane recognition model based on language knowledge guidance according to the present application.
Fig. 2 is a schematic flowchart of a method for generating a tangent plane recognition model based on language knowledge guidance according to the present application.
Fig. 3 is a horizontal cross-sectional image of a standard upper abdomen of a fetus.
Fig. 4 is a standard four-chamber section image of a fetus.
Fig. 5 is a horizontal cross-sectional image of a non-standard upper abdomen of a fetus.
Fig. 6 is a schematic structural diagram of a terminal device provided in the present application.
Detailed Description
The present application provides a method for generating a tangent plane recognition model based on language knowledge guidance and a corresponding recognition method. In order to make the purpose, technical scheme and effects of the present application clearer, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit it.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.
It will be understood by those within the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
In particular implementations, the terminal devices described in the embodiments of the present application include, but are not limited to, portable devices such as mobile phones, laptops, or tablet computers having touch-sensitive surfaces (e.g., touch displays and/or touch pads). It should also be understood that, in some embodiments, the device may not be a portable communication device but a desktop computer having a touch-sensitive surface (e.g., a touch-sensitive display screen and/or touchpad).
In the discussion that follows, a terminal device that includes a display and a touch-sensitive surface is described. However, it should be understood that the terminal device may also include one or more other physical user interface devices such as a physical keyboard, mouse, and/or joystick.
The terminal device supports various applications, such as one or more of the following: a drawing application, a presentation application, a word processing application, a video conferencing application, a disc burning application, a spreadsheet application, a gaming application, a telephone application, an email application, an instant messaging application, an exercise support application, a photo management application, a digital camera application, a digital video camera application, a web browsing application, a digital music player application, and/or a digital video playing application.
Various applications that may be executed on the terminal device may use at least one common physical user interface device, such as a touch-sensitive surface. One or more functions of the touch-sensitive surface and the corresponding information displayed on the terminal may be adjusted and/or changed between applications and/or within a given application. In this way, a common physical framework (e.g., a touch-sensitive surface) of the terminal can support various applications with user interfaces that are intuitive and transparent to the user.
It should be understood that the sequence numbers of the steps in this embodiment do not imply an execution order; the execution order of each process is determined by its function and internal logic, and should not constitute any limitation on the implementation of this embodiment.
The inventors have found through research that ultrasound imaging has been widely used in routine obstetric screening to assess fetal growth and development and congenital malformations. During ultrasound screening, the doctor needs to select a standard section containing key anatomical structures for the measurement of relevant biological parameters. For example, as shown in fig. 3, the prenatal upper abdominal horizontal cross section contains anatomical structures such as the spine, the gastric vacuole and the umbilical vein, and the fetal abdominal circumference, which is evaluated on this section, must be measured on a standard upper abdominal horizontal cross section. However, standard sections carrying anatomical structures are currently identified manually by operators, which depends heavily on operator experience; the identified standard sections therefore vary from operator to operator, which affects their accuracy.
In recent years, deep learning-based methods have continuously driven the automatic identification of standard sections in ultrasound images. For example, Chen et al propose a neural network framework for automatically identifying 3 standard sections; Burgos-Artizzu et al evaluated the performance of a series of state-of-the-art convolutional neural networks on more than 6 standard sections; Cai et al propose the SonoEyeNet framework for standard section detection and find that the human eye tends to focus on the anatomical structures in a section. These methods can distinguish standard from non-standard sections using section-level labels. However, they do not explicitly combine the identification of key anatomical structures with standard sections, which limits their clinical interpretability and their usefulness for guiding young sonographers. In addition, Lin et al detect key anatomical structures and provide fine-grained information about standard sections, but the mere presence of key anatomical structures cannot guarantee the accuracy of a section.
In order to solve the above problem, in the embodiments of the present application, a feature extraction model is trained based on a preset training sample set; model parameters of an initial classification model are determined based on a preset training corpus and an initial graph neural network, and a predicted section category and a predicted anatomical structure category of a training ultrasound image are then determined based on the feature extraction model and the initial classification model; the initial graph neural network is trained based on the predicted section category, the predicted anatomical structure category, the labeled section category and the labeled anatomical structure category to obtain a classification model; and finally, the feature extraction model is connected with the classification model to obtain a section recognition model. Because the section recognition model is trained on both the image feature information of the training ultrasound images and the semantic features formed from the annotation categories carried by those images, it can learn the interdependence among the image feature information of an ultrasound image, the section categories and the anatomical structure categories, which improves the recognition accuracy achievable with the section recognition model.
The following further describes the content of the application by describing the embodiments with reference to the attached drawings.
The embodiment provides a method for generating a tangent plane recognition model based on language knowledge guidance, as shown in fig. 1 and fig. 2, the method includes:
S10, training the preset network model in a supervised contrast learning manner based on the preset training sample set to obtain a feature extraction model.
Specifically, the training sample set includes a plurality of training ultrasound images, each of which carries a labeled section category and labeled anatomical structure categories, where the labeled section category is the section category of the section in the training ultrasound image, each labeled anatomical structure category is the anatomical structure category of an anatomical structure included in that section, and the number of labeled anatomical structure categories carried by a training ultrasound image equals the number of anatomical structures included in its labeled section. For example, as shown in fig. 3, the section included in the training ultrasound image is a standard upper abdominal horizontal cross section containing the gastric vacuole 3, the spine 4 and the umbilical vein 5; the labeled anatomical structure categories carried by this training ultrasound image are therefore the gastric vacuole, the spine and the umbilical vein, and the labeled section category is the standard upper abdominal horizontal cross section. In addition, it is worth noting that each training ultrasound image is an ultrasound section image, i.e., every training ultrasound image in the training sample set contains a section.
The feature extraction model is used to extract an image feature map of an ultrasound image; in other words, its input item is an ultrasound image and its output item is the corresponding image feature map, where the image feature map contains the image detail information of the ultrasound image, so that the section category of the section in the ultrasound image and the anatomical structure categories of the anatomical structures included in that section can be determined from it. In an implementation of this embodiment, the feature extraction model may consist of a convolutional neural network module connected to a global max pooling layer: the input item of the convolutional neural network module is an ultrasound image and its output item is the feature map corresponding to that image; the input item of the global max pooling layer is this feature map and its output item is a feature vector corresponding to the ultrasound image, which serves as the image feature map of the ultrasound image. The network structure of the convolutional neural network module can be selected according to actual requirements, for example ResNet [10].
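As a concrete illustration, the following is a minimal PyTorch sketch of such a feature extraction model, assuming a ResNet-18 backbone; the backbone choice, the 512-dimensional output and the final projection layer are illustrative assumptions, not prescribed by this embodiment:

```python
import torch
import torch.nn as nn
from torchvision import models

class FeatureExtractor(nn.Module):
    """CNN module followed by global max pooling, as described above."""

    def __init__(self, feature_dim: int = 512):
        super().__init__()
        backbone = models.resnet18(weights=None)
        # keep the convolutional stages, drop ResNet's own pooling/fc head
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])
        self.global_max_pool = nn.AdaptiveMaxPool2d(1)
        # optional projection; 512 matches resnet18's last channel count
        self.proj = nn.Linear(512, feature_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feature_map = self.cnn(x)                   # (B, 512, h, w)
        pooled = self.global_max_pool(feature_map)  # (B, 512, 1, 1)
        return self.proj(pooled.flatten(1))         # (B, feature_dim)
```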
In an implementation manner of this embodiment, the training a preset network model in a supervised contrast learning manner based on a preset training sample set to obtain a feature extraction model specifically includes:
S11, clustering the word vectors of the word vector matrix corresponding to the training corpus to obtain a plurality of candidate categories, and dividing the training sample set into a plurality of training batches;
S12, for each training ultrasonic image in each training batch, determining a reference category corresponding to the training ultrasonic image based on the labeled section category, the labeled anatomical structure category and the plurality of candidate categories carried by the training ultrasonic image, to obtain a reference training batch corresponding to the training batch;
S13, inputting each training ultrasonic image in the training batch into a preset network model respectively, to obtain a training feature map corresponding to each training ultrasonic image;
S14, constructing a supervised contrast loss function value corresponding to the training batch based on the training feature map corresponding to each training ultrasonic image;
S15, training the preset network model based on the supervised contrast loss function value to obtain a feature extraction model.
Specifically, in step S11, the training corpus is determined based on the labeled section category and the labeled anatomical structure category corresponding to each training ultrasound image in the training sample set. The training corpus includes a plurality of training corpora, each of which corresponds to one training ultrasound image in the training sample set, with different training corpora corresponding to different training ultrasound images; that is, for each training ultrasound image in the training sample set, a corresponding training corpus can be found in the training corpus, and that training corpus is determined based on the labeled section category and the labeled anatomical structure category of the training ultrasound image.
Based on this, in an implementation manner of this embodiment, before training the preset network model based on the preset training sample set by using a supervised contrast learning manner to obtain the feature extraction model, the method further includes:
acquiring a plurality of training ultrasonic images, and labeling section classes and labeling anatomical structure classes corresponding to the training ultrasonic images respectively to obtain a preset training sample set;
and for each training ultrasonic image in the plurality of training ultrasonic images, taking the labeled section category and the labeled anatomical structure category corresponding to the training ultrasonic image as a training corpus, so as to obtain a preset training corpus.
Specifically, the plurality of training ultrasound images may be acquired by an ultrasound acquisition device, acquired via a network (e.g., from Baidu), or transmitted by another external device. The labeled section category is the section category of the section in the training ultrasound image, each labeled anatomical structure category is the anatomical structure category of an anatomical structure included in that section, and the number of labeled anatomical structure categories carried by a training ultrasound image equals the number of anatomical structures included in its labeled section. In a specific implementation of this embodiment, the training ultrasound images are acquired by an ultrasound acquisition device and then adjusted so that they all have the same image size, for example by resizing every training ultrasound image to a preset size, or by cropping an image region of the preset size out of each training ultrasound image and using the cropped region as the training ultrasound image. Of course, after the plurality of training ultrasound images are acquired, the training sample set formed by them may also be enhanced, for example by rotation and cropping, so as to improve the diversity of the training sample set and thereby the robustness of the subsequently determined section recognition model. A minimal sketch of such preprocessing follows.
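The sketch below assumes torchvision transforms; the target sizes and augmentation parameters are purely illustrative choices, not values taken from this embodiment:

```python
from torchvision import transforms

# resize every training ultrasound image to one preset size, then apply
# rotation/cropping augmentation to enrich the training sample set
train_transform = transforms.Compose([
    transforms.Resize((256, 256)),          # preset size (illustrative)
    transforms.RandomRotation(degrees=10),  # rotation enhancement
    transforms.RandomCrop((224, 224)),      # cropping enhancement
    transforms.ToTensor(),
])
```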
After the training sample set is obtained, for each training ultrasound image in it, the labeled section category and the labeled anatomical structure categories carried by that image are obtained, and the text formed by them is used as the training corpus corresponding to the image. For example, as shown in fig. 3, the standard upper abdominal horizontal cross-section image of the fetus contains the upper abdominal horizontal cross section, the spine, the gastric vacuole and the umbilical vein, so its training corpus is "upper abdominal horizontal cross section, spine, gastric vacuole, umbilical vein". Once the training corpora corresponding to all training ultrasound images have been obtained, the set formed by all of them is used as the training corpus corresponding to the training sample set. For example, suppose the training sample set includes a standard upper abdominal horizontal cross-section image, a standard four-chamber heart section image and a non-standard upper abdominal horizontal cross-section image of the fetus. As shown in fig. 3, the anatomical structures in the standard upper abdominal horizontal cross-section image include the spine 4, the gastric vacuole 3 and the umbilical vein 5; as shown in fig. 4, the anatomical structures in the standard four-chamber heart section image include the four-chamber heart 6, the aorta 7, the lung 8 and the spine 4; and as shown in fig. 5, the anatomical structures in the non-standard upper abdominal horizontal cross-section image include the spine 4 and the gastric vacuole 3. The corresponding training corpora are therefore "upper abdominal horizontal cross section, spine, gastric vacuole, umbilical vein", "four-chamber heart section, four-chamber heart, aorta, lung, spine" and "upper abdominal horizontal cross section, spine, gastric vacuole", and the training corpus consists of these three training corpora.
In an implementation of this embodiment, the word vector matrix includes a plurality of word vectors, each of which corresponds to one target annotation category, where a target annotation category is a labeled section category or a labeled anatomical structure category carried by a training ultrasound image in the training sample set, and the target annotation categories corresponding to different word vectors are different from each other. It can be understood that the number of word vectors in the word vector matrix is the sum of the number of distinct labeled section categories and the number of distinct labeled anatomical structure categories carried by all the training ultrasound images in the training sample set. In the example above, the labeled section categories carried by all the training ultrasound images are the upper abdominal horizontal cross section and the four-chamber heart section, and the labeled anatomical structure categories are the spine, the gastric vacuole, the umbilical vein, the four-chamber heart, the aorta and the lung; the word vector matrix corresponding to the training corpus therefore includes 8 word vectors, corresponding respectively to the upper abdominal horizontal cross section, the four-chamber heart section, the spine, the gastric vacuole, the umbilical vein, the four-chamber heart, the aorta and the lung.
Based on this, the process of determining the word vector matrix corresponding to the training corpus may specifically be: selecting each labeled section category and each labeled anatomical structure category included in the training corpus, and determining the word vector corresponding to each of these categories, so as to obtain the word vector matrix corresponding to the training corpus, where all these word vectors have the same vector dimension, for example 512. The word vector for each labeled section category and each labeled anatomical structure category can be determined by a trained word embedding model; in other words, each category is input into the word embedding model, which outputs its word vector.
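A sketch of building the word vector matrix follows. The embedding lookup below is a stand-in for whatever trained word embedding model is actually used; random vectors are substituted purely so the sketch runs:

```python
import numpy as np
import torch
import torch.nn as nn

# all distinct labeled section categories and anatomical-structure categories
categories = [
    "upper abdominal horizontal cross section", "four-chamber heart section",
    "spine", "gastric vacuole", "umbilical vein",
    "four-chamber heart", "aorta", "lung",
]

# Stand-in for a trained word embedding model: a lookup table mapping each
# category name to a 512-dimensional vector (randomly initialized here).
embedding = nn.Embedding(num_embeddings=len(categories), embedding_dim=512)
with torch.no_grad():
    word_vector_matrix = embedding(torch.arange(len(categories))).numpy()

word_vectors = {name: word_vector_matrix[i] for i, name in enumerate(categories)}
print(word_vector_matrix.shape)  # (8, 512)
```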
The number of candidate categories equals the number of distinct standard section categories carried by all the training ultrasound images in the training sample set. It can be understood that when the word vectors of the word vector matrix corresponding to the training corpus are clustered, the word vectors are grouped, according to the distances between them, into a plurality of cluster clusters, and the annotation category corresponding to the word vector at the cluster center of each cluster is used as a candidate category, so as to obtain the plurality of candidate categories, where a candidate category may be a labeled section category or a labeled anatomical structure category. For example, for the training corpus described above, which includes 2 labeled section categories and 6 labeled anatomical structure categories, the word vector matrix includes 8 word vectors, and these 8 word vectors are clustered into 2 cluster clusters, i.e., the number of clusters equals the number of labeled section categories included in the training corpus. In addition, the clustering may be performed with k-means: the word vectors of the word vector matrix are clustered into a plurality of cluster clusters by k-means, and the cluster center of each cluster is then used as a candidate category, so as to obtain the plurality of candidate categories.
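Continuing the previous sketch, the clustering step might look as follows, assuming scikit-learn's k-means; the number of clusters equals the number of labeled section category types:

```python
import numpy as np
from sklearn.cluster import KMeans

num_section_categories = 2  # the two section categories in the example above
kmeans = KMeans(n_clusters=num_section_categories, n_init=10).fit(word_vector_matrix)

candidate_categories = []
for center in kmeans.cluster_centers_:
    # the labeled category whose word vector lies nearest the cluster center
    nearest = np.linalg.norm(word_vector_matrix - center, axis=1).argmin()
    candidate_categories.append(categories[nearest])
```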
In one implementation of this embodiment, each training batch includes a plurality of training ultrasound images, and every training batch includes the same number of training ultrasound images. It is to be understood that the training sample set is divided equally into the several training batches, and the training ultrasound images in different training batches are different from each other. For example, if the training sample set includes a training ultrasound image A, a training ultrasound image B, a training ultrasound image C and a training ultrasound image D, it may be divided into a first training batch including training ultrasound images A and B and a second training batch including training ultrasound images C and D.
In step S12, after the training batches are acquired, the reference category of each training ultrasound image in each training batch is determined, where the reference category is one of the plurality of candidate categories. That is, determining the reference category of a training ultrasound image based on its labeled section category, its labeled anatomical structure categories and the plurality of candidate categories means selecting one candidate category from the plurality of candidate categories based on the labeled section category and labeled anatomical structure categories carried by the training ultrasound image, and using the selected candidate category as the reference category of the training ultrasound image.
In an implementation of this embodiment, the candidate category may be selected as follows: compute the average word vector of the word vectors of the labeled section category and the labeled anatomical structure categories carried by the training ultrasound image, then compute the Euclidean distance between this average word vector and the word vector of each candidate category, and select the candidate category with the smallest Euclidean distance as the reference category of the training ultrasound image. Of course, in practical applications other selection methods may be adopted; for example, the Euclidean distance between the word vector of each candidate category and the word vector of the labeled section category of the training ultrasound image may be computed, and the candidate category with the smallest distance selected as the reference category; or the Euclidean distances between the word vector of each candidate category and the word vectors of both the labeled section category and the labeled anatomical structure categories of the training ultrasound image may be computed, and the candidate category with the smallest distance selected as the reference category. In this embodiment, computing the average word vector of the labeled section category and the labeled anatomical structure categories and determining the reference category from it makes full use of the semantic correlation between section categories and anatomical structure categories, thereby improving the accuracy of the reference category; moreover, choosing the minimum Euclidean distance between the average word vector and the candidate categories' word vectors selects the reference category with the strongest semantic correlation.
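A sketch of the average-word-vector selection rule, continuing the previous sketches (`word_vectors` and `candidate_categories` as defined there; the function name is hypothetical):

```python
import numpy as np

def reference_category(image_labels, word_vectors, candidate_categories):
    """Pick the candidate category nearest (in Euclidean distance) to the
    average word vector of the image's labeled section and anatomy categories.

    `image_labels`: all labeled categories carried by one training image;
    `word_vectors`: dict mapping category name -> word vector.
    """
    avg = np.mean([word_vectors[c] for c in image_labels], axis=0)
    dists = [np.linalg.norm(avg - word_vectors[c]) for c in candidate_categories]
    return candidate_categories[int(np.argmin(dists))]

# e.g. for the standard upper-abdomen image of fig. 3:
ref = reference_category(
    ["upper abdominal horizontal cross section", "spine",
     "gastric vacuole", "umbilical vein"],
    word_vectors, candidate_categories)
```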
After the reference category of each training ultrasound image in the training batch is obtained, each training ultrasound image and its reference category are taken together as a training sample, and all the training samples thus obtained constitute the reference training batch corresponding to the training batch. It can be understood that the reference training batch contains the same training ultrasound images as the corresponding training batch; the difference is that each training ultrasound image in the reference training batch carries its reference category, whereas each training ultrasound image in the training batch carries its labeled section category and labeled anatomical structure categories.
In step S13, the preset network model is a pre-established neural network model whose model structure is the same as that of the feature extraction model; the difference between the two is that the model parameters of the preset network model are initial model parameters, whereas the model parameters of the feature extraction model are obtained by training on the training sample set. Correspondingly, the input item of the preset network model is a training ultrasound image and the output item is the training feature map corresponding to that image. Thus, for each training ultrasound image in the training batch, the corresponding training feature map can be determined by the preset network model.
In step S14, the supervised contrast loss function value is determined based on the loss value corresponding to each training ultrasound image in the training batch, and is used to train the preset network model through back propagation so as to update its model parameters. In an implementation of this embodiment, constructing the supervised contrast loss function value corresponding to the training batch based on the training feature maps corresponding to the training ultrasound images specifically includes:
for the training feature map corresponding to each training ultrasonic image, determining a plurality of first training feature maps and a plurality of second training feature maps corresponding to the training feature map based on the reference training batch;
determining a first loss value between the training feature map and each corresponding first training feature map, and a second loss value between the training feature map and each corresponding second training feature map;
weighting each first loss value and each second loss value based on preset weighting coefficients to obtain a loss value corresponding to the training feature map;
and determining the supervised contrast loss function value corresponding to the training batch based on the loss value corresponding to each training feature map.
Specifically, each of the first training feature maps and each of the second training feature maps is a feature map corresponding to a training ultrasound image in the reference training batch, where the reference category corresponding to a first training feature map is the same as the reference category corresponding to the training feature map, and the reference category corresponding to a second training feature map is different from it. For example, suppose the training batch includes a training ultrasound image A and a training ultrasound image B, so that the corresponding reference training batch also includes training ultrasound images A and B, where the reference category of image A is annotation section category a and that of image B is annotation section category b; then the first training feature map corresponding to training ultrasound image A is A's own training feature map, and the second training feature map corresponding to A is the training feature map of B. In a specific implementation of this embodiment, the training ultrasound images corresponding to the first and second training feature maps are all different from each other, and the sum of the number of first training feature maps and the number of second training feature maps equals the number of training ultrasound images in the reference training batch. It can be understood that the training ultrasound images of a training batch are divided into two image groups: the reference category of every image in one group is the same as the reference category corresponding to the training feature map, and the reference category of every image in the other group is different from it.
Of course, in practical applications, other methods may be used to determine the several first and second training feature maps. For example, after the same-category image group and the different-category image group have been determined based on the reference categories, a preset number of training ultrasound images may be selected from each group, where the preset number is at most the smaller of the two groups' image counts; or training feature maps may be selected from the two groups in proportion to their sizes, so that the ratio of the numbers of images selected from the two groups equals the ratio of the numbers of images they contain. In this embodiment, all training ultrasound images of both groups are used, so that the image feature information carried by every training ultrasound image in the training batch is fully exploited, which improves the training speed of the preset network model and the model accuracy of the resulting feature extraction model.
In an implementation of this embodiment, the preset weighting coefficients are configured in advance and are used to weight the loss value of each first training feature map and the loss value of each second training feature map corresponding to the training feature map, so as to obtain the loss value corresponding to the training feature map, where all first training feature maps share one weighting coefficient and all second training feature maps share another. In addition, since the number of second training feature maps and the number of first training feature maps are generally not the same, the two weighting coefficients need not be equal; their relative magnitudes can be set inversely to the relative numbers of first and second training feature maps. For example, if the number of first training feature maps is smaller than the number of second training feature maps, the weighting coefficient for the first training feature maps is made larger than that for the second, e.g., 0.75 for the first training feature maps and 0.25 for the second.
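The following sketch shows how the loss value of one training feature map might be assembled from its first (same reference category) and second (different reference category) counterparts. The 0.75/0.25 weights follow the illustrative values above; the pairwise terms (one minus cosine similarity for positives, clamped cosine similarity for negatives) are an assumption, since the embodiment only specifies that first and second loss values are weighted and combined:

```python
import torch
import torch.nn.functional as F

def weighted_contrast_loss(anchor, first_maps, second_maps,
                           w_first=0.75, w_second=0.25):
    """Weighted loss for one training feature map (`anchor`, a 1-D tensor).

    `first_maps` share the anchor's reference category, `second_maps` do not.
    """
    first = (torch.stack([1 - F.cosine_similarity(anchor, m, dim=0)
                          for m in first_maps]).sum()
             if first_maps else anchor.new_zeros(()))   # pull positives closer
    second = (torch.stack([F.cosine_similarity(anchor, m, dim=0).clamp(min=0)
                           for m in second_maps]).sum()
              if second_maps else anchor.new_zeros(()))  # push negatives apart
    return w_first * first + w_second * second
```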
In an implementation manner of this embodiment, after the preset network model is trained based on the supervised contrastive loss function value and its model parameters are updated, the same training process is applied to another training batch, and so on, until the preset network model meets a preset condition; the preset network model meeting the preset condition is then taken as the feature extraction model. The preset condition may be that the supervised contrastive loss function value corresponding to the training batch meets a preset requirement, or that the number of training iterations of the preset network model reaches a preset threshold, and so on. In this embodiment, the feature extraction model is trained in a supervised contrastive learning manner, so that it learns the feature differences between different classes of standard tangent planes and the feature similarities within the same class of standard tangent planes, thereby improving the model accuracy of the feature extraction model and, in turn, the accuracy of the subsequently determined tangent plane recognition model.
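As a rough illustration of this outer loop, the sketch below trains on successive batches until the preset condition is met. It assumes a helper `supervised_contrastive_loss(model, batch)` that aggregates per-anchor terms like the one sketched above, and hypothetical `max_steps` and `loss_tol` parameters standing in for the preset time threshold and preset requirement.

```python
def train_feature_extractor(model, batches, optimizer,
                            max_steps=10000, loss_tol=1e-3):
    """Train the preset network model batch by batch; stop once the
    supervised contrastive loss meets the preset requirement or the
    step budget (preset time threshold) is exhausted."""
    for step, batch in enumerate(batches):
        loss = supervised_contrastive_loss(model, batch)
        optimizer.zero_grad()
        loss.backward()     # update the preset network model's parameters
        optimizer.step()
        if loss.item() < loss_tol or step + 1 >= max_steps:
            break
    return model            # the feature extraction model
```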
S20, determining initial model parameters based on a preset training corpus and an initial graph neural network, and taking the initial model parameters as model parameters of an initial classification model.
Specifically, the training corpus is determined based on the labeled tangent plane category and the labeled anatomical structure category corresponding to each training ultrasound image in the training sample set; the specific process of determining the training corpus has already been described in step S10 and is not repeated here. The initial graph neural network is a pre-established graph neural network model configured with initial model parameters, and the model parameters of the initial classification model can be determined through the initial graph neural network.
In an implementation manner of this embodiment, the determining the initial model parameters based on the preset training corpus and the initial graph neural network specifically includes:
S21, determining a corpus graph corresponding to the training corpus based on the word vector matrix and the adjacency matrix corresponding to the training corpus;
and S22, carrying out graph operation on the corpus graph based on the initial graph neural network to determine initial model parameters.
Specifically, in step S21, the adjacency matrix is used to reflect the correlation between the labeled tangent plane categories and the labeled anatomical structure categories: for any two labeled categories, whether the two appear in the same training ultrasound image can be determined based on the adjacency matrix, where the two labeled categories are any two of all the labeled tangent plane categories and all the labeled anatomical structure categories contained in all the training ultrasound images. In one implementation manner of this embodiment, each matrix element in the adjacency matrix is 0 or 1: 0 indicates that the labeled category corresponding to the element's row and the labeled category corresponding to its column do not appear together in any one training ultrasound image, and 1 indicates that they appear together in at least one training ultrasound image. The number of rows of the adjacency matrix equals the total number of labeled tangent plane categories and labeled anatomical structure categories, each row corresponds to one of these categories, and the categories corresponding to the rows are distinct from one another; likewise, the number of columns equals the same total, each column corresponds to one of these categories, and the categories corresponding to the columns are distinct from one another.
Based on this, when determining the adjacency matrix, a co-occurrence matrix corresponding to the training corpus may first be determined, where the numbers of rows and columns of the co-occurrence matrix both equal the total number of labeled tangent plane categories and labeled anatomical structure categories, and each matrix element represents the number of times the labeled category corresponding to its row and the labeled category corresponding to its column appear together in the same training ultrasound image. For example, for the training corpus described above, in which all the labeled tangent plane categories and labeled anatomical structure categories comprise the upper abdominal horizontal cross section, the four-chamber heart section, the spine, the stomach bubble, the umbilical vein, the four-chamber heart, the aorta, and the lung, the co-occurrence matrix corresponding to the training corpus may be as shown in Table 1.
TABLE 1 Co-occurrence matrix corresponding to the training corpus
[Table 1 is reproduced as an image in the original publication; its rows and columns are the eight labeled categories listed above, and each entry is the number of training ultrasound images in which the corresponding pair of categories co-occurs.]
After the co-occurrence matrix is determined, the adjacency matrix corresponding to the training corpus is determined based on the co-occurrence matrix: wherever the co-occurrence count of two categories in the co-occurrence matrix is greater than 0, the corresponding value in the adjacency matrix is 1; wherever the co-occurrence count equals 0, the corresponding value in the adjacency matrix is 0; and the values at the positions corresponding to the same labeled category (the diagonal) are set to 1. Thus, the adjacency matrix determined based on the co-occurrence matrix shown in Table 1 may be as shown in Table 2.
TABLE 2 Adjacency matrix corresponding to the training corpus
[Table 2 is reproduced as an image in the original publication; it is the binary matrix obtained from Table 1 by the rule described above, with diagonal entries set to 1.]
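As an illustrative sketch of how Tables 1 and 2 might be derived from the training corpus, the following Python function (names hypothetical) counts pairwise co-occurrences per training ultrasound image and then binarizes them into the adjacency matrix, forcing the diagonal to 1 as described above.

```python
import numpy as np

def build_matrices(corpus, categories):
    """corpus: one set of labeled categories (tangent plane + anatomical
    structures) per training ultrasound image; categories: ordered list of
    all distinct labeled categories. Returns the co-occurrence matrix of
    counts and the binary adjacency matrix."""
    idx = {c: i for i, c in enumerate(categories)}
    n = len(categories)
    cooc = np.zeros((n, n), dtype=int)
    for labels in corpus:
        for a in labels:                 # each ordered pair is counted once
            for b in labels:             # per image, so cooc is symmetric
                if a != b:
                    cooc[idx[a], idx[b]] += 1
    adj = (cooc > 0).astype(int)         # 1 wherever the pair ever co-occurs
    np.fill_diagonal(adj, 1)             # same-category positions are 1
    return cooc, adj
```

For the eight categories of the example, calling `build_matrices` with per-image label sets such as {upper abdominal horizontal cross section, spine, stomach bubble, umbilical vein, aorta} would produce matrices of the forms shown in Table 1 and Table 2.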
In an implementation manner of this embodiment, the determining, based on the word vector matrix and the adjacency matrix corresponding to the training corpus, the corpus graph corresponding to the training corpus specifically includes:
each word vector in the word vector matrix is used as a graph node, and the connecting edges between every two graph nodes are determined based on the adjacency matrix;
and constructing a corpus graph corresponding to the training corpus based on the graph nodes and the connecting edges among the graph nodes.
Specifically, the number of graph nodes is the same as the number of word vectors in the word vector matrix, and each word vector serves as one graph node. The connecting edges are determined based on the adjacency matrix: when the matrix element determined in the adjacency matrix by the labeled categories corresponding to two graph nodes is 1, a connecting edge is set between the two graph nodes; when that matrix element is 0, no connecting edge is set between them. In this way, the connecting edges between the graph nodes are obtained. After the graph nodes and connecting edges are determined, the graph nodes are connected through the connecting edges to form the corpus graph corresponding to the training corpus.
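A minimal sketch of this construction, assuming the word vectors are given as an (n, d) array whose rows are aligned with the rows of the adjacency matrix (function and variable names hypothetical):

```python
def build_corpus_graph(word_vectors, adj):
    """Each word vector becomes one graph node; an undirected connecting
    edge is placed between two nodes exactly where the corresponding
    adjacency-matrix element is 1."""
    n = len(word_vectors)
    nodes = {i: word_vectors[i] for i in range(n)}
    edges = [(i, j) for i in range(n)
             for j in range(i + 1, n) if adj[i, j] == 1]
    return nodes, edges
```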
In step S22, after the corpus graph is determined, a graph operation is performed on the corpus graph based on the initial graph neural network to update the values of the graph nodes; the updated node values are then used as the model parameters of the initial classification model. In a specific implementation manner of this embodiment, the initial graph neural network may be an initial graph convolutional network (GCN) or the like.
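For concreteness, such a graph operation could be sketched as a two-layer graph convolution in PyTorch. The layer sizes, the ReLU nonlinearity, and the assumption that `adj` has already been normalized (e.g., symmetrically, as in standard GCN formulations) are illustrative choices, not requirements of this embodiment.

```python
import torch
import torch.nn as nn

class SimpleGCN(nn.Module):
    """Updates the corpus graph's node values as H' = ReLU(A_hat @ H @ W);
    the final node values serve as the initial classification model's
    parameters, one weight vector per labeled category."""
    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hid_dim, bias=False)
        self.w2 = nn.Linear(hid_dim, out_dim, bias=False)

    def forward(self, x, adj):
        # x: (num_categories, in_dim) word vectors; adj: normalized adjacency
        h = torch.relu(adj @ self.w1(x))
        return adj @ self.w2(h)          # (num_categories, out_dim)
```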
And S30, determining a feature map corresponding to the training ultrasonic image in the training sample set based on the feature extraction model.
Specifically, the feature extraction model is obtained through the supervised contrastive learning training described above; its input item is a training ultrasound image and its output item is the feature map corresponding to that image. In addition, because the feature extraction model has already been trained, the feature maps corresponding to the training ultrasound images can be used to train the classification model. Of course, in practical applications, the feature extraction model and the classification model may also be trained synchronously: for each training batch, the feature extraction model is first trained on that batch, the classification model is then trained using the batch and the feature extraction model just updated on it, and both models are subsequently trained on the next batch in the same way.
And S40, determining the predicted tangent plane class and the predicted anatomical structure class corresponding to the feature map based on the initial classification model.
Specifically, the initial classification model is preset, and its configuration is determined by the initial graph neural network and the corpus graph corresponding to the training corpus. The input item of the initial classification model is the feature map of a training ultrasound image determined by the feature extraction model, and the output items are a predicted tangent plane category and a predicted anatomical structure category. The initial classification model is a multi-label classification model, and it can output a plurality of category labels corresponding to a training ultrasound image, where the category labels include tangent plane categories and anatomical structure categories. In addition, the initial classification model may adopt an existing classification model, such as a DenseNet model.
S50, training the initial graph neural network based on the predicted tangent plane category, the predicted anatomical structure category, the tangent plane category label and the anatomical structure category label, and continuing to execute the step of determining initial model parameters based on a preset training corpus and the initial graph neural network until a classification model is obtained through training.
Specifically, the predicted tangent plane category and the predicted anatomical structure category are determined by the initial classification model, and the tangent plane category label and the anatomical structure category label are carried by the training ultrasound image. Thus, after the predicted tangent plane category and the predicted anatomical structure category are determined by the initial classification model, the loss function value corresponding to the initial classification model can be determined based on the predicted tangent plane category, the predicted anatomical structure category, the tangent plane category label, and the anatomical structure category label. This loss function value is then used as the loss function value corresponding to the initial graph neural network, and back-propagation is performed on the initial graph neural network based on it so as to correct the model parameters of the initial graph neural network. The corrected network is taken as the initial graph neural network, and the step of determining initial model parameters based on the preset training corpus and the initial graph neural network is executed again, so that the model parameters of the initial classification model are updated through the updated initial graph neural network and the initial classification model is trained further. Because the model parameters of the classification model are produced by the graph neural network, the classification model can learn the interrelations between the tangent plane category labels and the anatomical structure category labels, i.e., the semantic features of the training ultrasound images, while simultaneously learning the image features in the feature maps determined from the training ultrasound images. The classification model is thus trained on both the image features and the semantic features of the ultrasound images, which improves its classification accuracy.
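One training step of this scheme might look as follows, continuing the `SimpleGCN` sketch above. The global average pooling of the feature maps, the dot-product scoring, and the binary cross-entropy loss are assumptions in the style of multi-label GCN classifiers; `feats` is assumed to have shape (batch, channels, H, W) with the graph network's output dimension equal to the channel count, and `labels` is the multi-hot vector of tangent plane and anatomical structure category labels.

```python
import torch.nn.functional as F

def train_step(gcn, word_vecs, adj_hat, feats, labels, optimizer):
    """The graph network turns the corpus graph into per-category classifier
    weights, those weights score the pooled feature maps, and the multi-label
    classification loss is back-propagated to correct the graph network."""
    W = gcn(word_vecs, adj_hat)          # (num_categories, channels)
    pooled = feats.mean(dim=(2, 3))      # global average pool: (batch, channels)
    logits = pooled @ W.t()              # one score per category
    loss = F.binary_cross_entropy_with_logits(logits, labels)
    optimizer.zero_grad()
    loss.backward()                      # corrects the graph network's parameters
    optimizer.step()
    return loss.item()
```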
And S60, connecting the feature extraction model and the classification model to obtain a tangent plane identification model.
Specifically, the feature extraction model and the classification model are both network models determined in the above steps, and an output item of the feature extraction model is an input item of the classification model, so that the section identification model can identify the section category and the anatomical structure category of the ultrasound image. It is understood that the section identification model includes a feature extraction model and a classification model, an input item of the feature extraction model is an ultrasound image, an output item of the feature extraction model is an input item of the classification model, and an output item of the classification model is a section category and an anatomical structure category.
In summary, the present embodiment provides a method for generating a tangent plane recognition model based on language knowledge guidance. The method trains a feature extraction model based on a preset training sample set; determines the model parameters of an initial classification model based on a preset training corpus and an initial graph neural network; determines the predicted tangent plane category and the predicted anatomical structure category of each training ultrasound image based on the feature extraction model and the initial classification model; trains the initial graph neural network based on the predicted tangent plane category, the predicted anatomical structure category, the labeled tangent plane category, and the labeled anatomical structure category to obtain a classification model; and finally connects the feature extraction model with the classification model to obtain a tangent plane recognition model. Because the tangent plane recognition model is trained on both the image feature information of the training ultrasound images and the semantic features formed from the labeled categories they carry, it can learn the interdependence between the image feature information of ultrasound images, the tangent plane categories, and the anatomical structure categories, which improves the recognition accuracy of the tangent plane recognition model.
Based on the above method for generating a tangent plane recognition model based on language knowledge guidance, this embodiment provides a tangent plane recognition method based on language knowledge guidance, where the tangent plane recognition method applies the tangent plane recognition model as described above, and the tangent plane recognition method specifically includes:
acquiring an ultrasound image to be identified, and inputting the ultrasound image to be identified into the tangent plane recognition model;
determining a target feature map corresponding to the ultrasound image to be identified by using the feature extraction model in the tangent plane recognition model;
and determining the tangent plane category and the anatomical structure category corresponding to the ultrasound image to be identified based on the target feature map by using the classification model in the tangent plane recognition model.
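Putting the two sub-models together, inference could be sketched as below; the attribute names `feature_extractor` and `classifier`, the sigmoid scoring, and the 0.5 cutoff are hypothetical stand-ins for whatever the trained tangent plane recognition model actually exposes.

```python
import torch

def recognize(section_model, image, categories, threshold=0.5):
    """Extract the target feature map, score every category with the
    classification model, and report the categories (tangent plane class
    plus anatomical structures) whose probability exceeds the threshold."""
    with torch.no_grad():
        feat_map = section_model.feature_extractor(image)  # target feature map
        logits = section_model.classifier(feat_map)        # (1, num_categories)
        probs = torch.sigmoid(logits).squeeze(0)
    return [c for c, p in zip(categories, probs.tolist()) if p > threshold]
```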
Based on the above method for generating a tangent plane recognition model based on language knowledge guidance, the present embodiment provides a computer-readable storage medium, where one or more programs are stored, and the one or more programs are executable by one or more processors to implement the steps in the method for generating a tangent plane recognition model based on language knowledge guidance according to the above embodiment.
Based on the above method for generating the tangent plane recognition model based on the language knowledge guidance, the present application further provides a terminal device, as shown in fig. 6, which includes at least one processor (processor) 20; a display screen 21; and a memory (memory)22, and may further include a communication Interface (Communications Interface)23 and a bus 24. The processor 20, the display 21, the memory 22 and the communication interface 23 can communicate with each other through the bus 24. The display screen 21 is configured to display a user guidance interface preset in the initial setting mode. The communication interface 23 may transmit information. The processor 20 may call logic instructions in the memory 22 to perform the methods in the embodiments described above.
Furthermore, the logic instructions in the memory 22 may be implemented in software functional units and stored in a computer readable storage medium when sold or used as a stand-alone product.
The memory 22, which is a computer-readable storage medium, may be configured to store a software program, a computer-executable program, such as program instructions or modules corresponding to the methods in the embodiments of the present disclosure. The processor 20 executes the functional application and data processing, i.e. implements the method in the above-described embodiments, by executing the software program, instructions or modules stored in the memory 22.
The memory 22 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the terminal device, and the like. Further, the memory 22 may include a high-speed random access memory and may also include a non-volatile memory, for example, any of a variety of media that can store program code, such as a USB flash disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk; a transitory storage medium may also be used.
In addition, the specific processes by which the storage medium and the processor in the terminal device load and execute the instructions have been described in detail in the method above and are not repeated here.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. A method for generating a tangent plane recognition model based on language knowledge guidance, characterized by comprising the following steps:
training a preset network model by adopting a supervised contrastive learning manner based on a preset training sample set to obtain a feature extraction model;
determining initial model parameters based on a preset training corpus and an initial graph neural network, and taking the initial model parameters as model parameters of an initial classification model, wherein the training corpus is determined based on labeling tangent plane classes and labeling anatomical structure classes corresponding to training ultrasonic images in the training sample set;
determining a feature map corresponding to a training ultrasonic image in the training sample set based on the feature extraction model;
determining a predicted tangent plane class and a predicted anatomical structure class corresponding to the feature map based on the initial classification model;
training the initial graph neural network based on the predicted tangent plane type, the predicted anatomical structure type, the labeled tangent plane type and the labeled anatomical structure type, and continuing to execute the step of determining initial model parameters based on a preset training corpus and the initial graph neural network until a classification model is obtained through training;
and connecting the feature extraction model with the classification model to obtain a section identification model.
2. The method for generating a tangent plane recognition model based on language knowledge guidance according to claim 1, wherein before the training of the preset network model based on the preset training sample set by adopting a supervised contrast learning manner to obtain the feature extraction model, the method further comprises:
acquiring a plurality of training ultrasonic images, and labeling section classes and labeling anatomical structure classes corresponding to the training ultrasonic images respectively to obtain a preset training sample set;
and regarding each training ultrasonic image in the plurality of training ultrasonic images, taking the labeling section class and the anatomical structure class corresponding to the training ultrasonic image as a training corpus to obtain a preset training corpus.
3. The method of claim 1, wherein the determining initial model parameters based on the predetermined training corpus and the initial graph neural network specifically comprises:
determining a corpus graph corresponding to the training corpus based on the word vector matrix and the adjacency matrix corresponding to the training corpus;
and carrying out graph operation on the corpus graph based on the initial graph neural network so as to determine initial model parameters.
4. The method as claimed in claim 3, wherein the determining the corpus map corresponding to the training corpus based on the word vector matrix and the adjacency matrix corresponding to the training corpus specifically comprises:
each word vector in the word vector matrix is used as a graph node, and the connecting edges between every two graph nodes are determined based on the adjacency matrix;
and constructing a corpus graph corresponding to the training corpus based on the graph nodes and the connecting edges among the graph nodes.
5. The method of claim 3, wherein each word vector in the word vector matrix corresponds to a target labeled category, the target labeled category is a labeled tangent plane category or a labeled anatomical structure category carried by a training ultrasound image in the training sample set, and the target labeled categories corresponding to the word vectors are different from each other.
6. The method for generating the tangent plane recognition model based on the language knowledge guidance of claim 1, wherein the training of the preset network model by adopting a supervised contrast learning manner based on the preset training sample set to obtain the feature extraction model specifically comprises:
clustering word vectors of a word vector matrix corresponding to the training corpus to obtain a plurality of candidate categories, and dividing the training sample set into a plurality of training batches;
for each training ultrasound image in each training batch, determining a reference labeled tangent plane category corresponding to the training ultrasound image based on the labeled tangent plane category and the labeled anatomical structure category carried by the training ultrasound image and the plurality of candidate categories, so as to obtain a reference training batch corresponding to the training batch;
inputting each training ultrasonic image in the training batch into a preset network model respectively to obtain a training characteristic diagram corresponding to each training ultrasonic image;
constructing a supervision contrast loss function value corresponding to the training batch based on the training characteristic graph corresponding to each training ultrasonic image;
and training a preset network model based on the supervision comparison loss function value to obtain a feature extraction model.
7. The method of claim 6, wherein the constructing the supervised contrast loss function value corresponding to the training batch based on the training feature maps corresponding to the training ultrasound images specifically comprises:
for the training feature map corresponding to each training ultrasound image, determining a plurality of first training feature maps and a plurality of second training feature maps corresponding to the training feature map based on the reference training batch, wherein the reference labeled tangent plane category corresponding to each first training feature map is the same as that corresponding to the training feature map, and the reference labeled tangent plane category corresponding to each second training feature map is different from that corresponding to the training feature map;
determining a first loss value of the training feature map and each corresponding first training feature map, and a second loss value of the training feature map and each corresponding second training feature map;
weighting each first loss value and each second loss value based on a preset weighting coefficient to obtain a loss value corresponding to the training characteristic diagram;
and determining the supervision contrast loss function value corresponding to the training batch based on the loss value corresponding to each training feature map.
8. A method for recognizing a tangent plane based on language knowledge guidance, characterized in that the tangent plane recognition method applies a tangent plane recognition model generated by the method of any one of claims 1 to 7, and the tangent plane recognition method specifically comprises:
acquiring an ultrasound image to be identified, and inputting the ultrasound image to be identified into the tangent plane recognition model;
determining a target characteristic diagram corresponding to the ultrasonic image to be identified by using a characteristic extraction model in the section identification model;
and determining the section class and the anatomical structure class corresponding to the ultrasonic image to be identified based on the target feature map by using the classification model in the section identification model.
9. A computer-readable storage medium storing one or more programs which are executable by one or more processors to implement the steps in the method for generating a tangent plane recognition model based on language knowledge guidance according to any one of claims 1 to 7, and/or to implement the steps in the method for recognizing a tangent plane based on language knowledge guidance according to claim 8.
10. A terminal device, comprising: a processor, a memory, and a communication bus; the memory has stored thereon a computer readable program executable by the processor;
the communication bus realizes connection communication between the processor and the memory;
the processor, when executing the computer readable program, implements the steps in the method for generating a tangent plane recognition model based on language knowledge guidance according to any one of claims 1 to 7, and/or implements the steps in the method for recognizing a tangent plane based on language knowledge guidance according to claim 8.
CN202110516561.7A 2021-05-12 2021-05-12 Generation method and identification method of section identification model based on language knowledge guidance Active CN113139956B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110516561.7A CN113139956B (en) 2021-05-12 2021-05-12 Generation method and identification method of section identification model based on language knowledge guidance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110516561.7A CN113139956B (en) 2021-05-12 2021-05-12 Generation method and identification method of section identification model based on language knowledge guidance

Publications (2)

Publication Number Publication Date
CN113139956A true CN113139956A (en) 2021-07-20
CN113139956B CN113139956B (en) 2023-04-14

Family

ID=76817288

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110516561.7A Active CN113139956B (en) 2021-05-12 2021-05-12 Generation method and identification method of section identification model based on language knowledge guidance

Country Status (1)

Country Link
CN (1) CN113139956B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109255118A (en) * 2017-07-11 2019-01-22 普天信息技术有限公司 A kind of keyword extracting method and device
CN109961442A (en) * 2019-03-25 2019-07-02 腾讯科技(深圳)有限公司 Training method, device and the electronic equipment of neural network model
WO2020221278A1 (en) * 2019-04-29 2020-11-05 北京金山云网络技术有限公司 Video classification method and model training method and apparatus thereof, and electronic device
CN110464380A (en) * 2019-09-12 2019-11-19 李肯立 A kind of method that the ultrasound cross-section image of the late pregnancy period fetus of centering carries out quality control
CN112070119A (en) * 2020-08-11 2020-12-11 长沙大端信息科技有限公司 Ultrasonic tangent plane image quality control method and device and computer equipment
CN112562819A (en) * 2020-12-10 2021-03-26 清华大学 Report generation method of ultrasonic multi-section data for congenital heart disease
CN112668459A (en) * 2020-12-25 2021-04-16 合肥工业大学 Rolling bearing fault diagnosis method based on supervised contrast learning and convolutional neural network
CN112767366A (en) * 2021-01-22 2021-05-07 南京汇川图像视觉技术有限公司 Image recognition method, device and equipment based on deep learning and storage medium
CN112784918A (en) * 2021-02-01 2021-05-11 中国科学院自动化研究所 Node identification method, system and device based on unsupervised graph representation learning

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Prannay Khosla et al.: "Supervised Contrastive Learning", arXiv:2004.11362v5 *
Zhao-Min Chen et al.: "Multi-Label Image Recognition with Graph Convolutional Networks", IEEE *
Ni Dong et al.: "Automatic recognition of fetal facial standard planes in ultrasound images based on deep learning", Chinese Journal of Biomedical Engineering *
Chen Jian et al.: "Deep learning based fetal ultrasound screening planes in the second trimester", Research Articles *

Also Published As

Publication number Publication date
CN113139956B (en) 2023-04-14

Similar Documents

Publication Publication Date Title
CN109446517B (en) Reference resolution method, electronic device and computer readable storage medium
WO2022001623A1 (en) Image processing method and apparatus based on artificial intelligence, and device and storage medium
CN109189767B (en) Data processing method and device, electronic equipment and storage medium
CN108804641A (en) A kind of computational methods of text similarity, device, equipment and storage medium
CA3066029A1 (en) Image feature acquisition
CN110210286A (en) Abnormality recognition method, device, equipment and storage medium based on eye fundus image
CN109145085B (en) Semantic similarity calculation method and system
WO2021208727A1 (en) Text error detection method and apparatus based on artificial intelligence, and computer device
CN111340054A (en) Data labeling method and device and data processing equipment
KR102279126B1 (en) Image-based data processing method, device, electronic device and storage medium
CN112990318B (en) Continuous learning method, device, terminal and storage medium
CN113590796A (en) Training method and device of ranking model and electronic equipment
CN114330499A (en) Method, device, equipment, storage medium and program product for training classification model
CN111340213B (en) Neural network training method, electronic device, and storage medium
CN113127672A (en) Generation method, retrieval method, medium and terminal of quantized image retrieval model
CN109284497B (en) Method and apparatus for identifying medical entities in medical text in natural language
WO2022127333A1 (en) Training method and apparatus for image segmentation model, image segmentation method and apparatus, and device
CN111414930A (en) Deep learning model training method and device, electronic equipment and storage medium
CN111709475B (en) N-gram-based multi-label classification method and device
CN109635004A (en) A kind of object factory providing method, device and the equipment of database
CN113139956B (en) Generation method and identification method of section identification model based on language knowledge guidance
CN111597336A (en) Processing method and device of training text, electronic equipment and readable storage medium
CN115062783B (en) Entity alignment method and related device, electronic equipment and storage medium
CN115761360A (en) Tumor gene mutation classification method and device, electronic equipment and storage medium
CN116109732A (en) Image labeling method, device, processing equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant