CN118193770A

CN118193770A - Medical image retrieval method and system based on deep learning

Info

Publication number: CN118193770A
Application number: CN202410221607.6A
Authority: CN
Inventors: 裴萌; 王楠; 李玲肖
Original assignee: Beijing Capton Pharmaceutical Technology Development Co ltd
Current assignee: Beijing Capton Pharmaceutical Technology Development Co ltd
Priority date: 2024-02-28
Filing date: 2024-02-28
Publication date: 2024-06-14
Anticipated expiration: 2044-02-28
Also published as: CN118193770B

Abstract

The invention discloses a medical image retrieval method and system based on deep learning. The method first acquires a current patient matching request containing the patient's medical record and pathological image, and extracts from a medical image information blockchain the basic medical image data associated with the medical record. It also obtains the visual feature vector and the content feature vector of the current pathological image, and extracts from the medical image information blockchain the advanced medical image data associated with these feature vectors. The basic and advanced medical image data together serve as pending medical image data, for which the corresponding target demand pathological image visual feature vectors, target demand pathological image content feature vectors, and target pathology-associated content are obtained. By jointly comparing these feature vectors, a demand matching coefficient between the pending medical image data and the current patient matching request is determined. Finally, the pending medical image data is sorted by demand matching coefficient to determine the target medical image data. This design allows the image content to be understood more accurately and improves retrieval accuracy and efficiency.

Description

Medical image retrieval method and system based on deep learning

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a medical image retrieval method and system based on deep learning.

Background

Current medical image retrieval methods are based primarily on keywords and metadata, which can lead to deviations from the desired retrieval results due to label inaccuracy or lack thereof. In addition, existing medical image retrieval techniques often fail to understand the image content deeply, and it is difficult to distinguish between cases that are highly similar but differ in diagnostic result due to minor differences. Therefore, a more intelligent and accurate medical image retrieval method is needed.

Disclosure of Invention

The invention aims to provide a medical image retrieval method and a medical image retrieval system based on deep learning.

In a first aspect, an embodiment of the present invention provides a medical image retrieval method based on deep learning, including:

Acquiring a current patient matching request containing a current patient medical record and a current pathological image, and extracting basic medical image data associated with the current patient medical record from a medical image information blockchain;

Acquiring a current pathological image visual feature vector and a current pathological image content feature vector corresponding to the current pathological image, and extracting advanced medical image data associated with the current pathological image content feature vector from the medical image information blockchain;

Taking the basic medical image data and the advanced medical image data as pending medical image data, and acquiring a target demand pathological image visual feature vector, a target demand pathological image content feature vector, and target pathology-associated content corresponding to the pending medical image data;

Determining a demand matching coefficient between the pending medical image data and the current patient matching request based on the current patient medical record, the current pathological image content feature vector, the current pathological image visual feature vector, the target pathology-associated content, the target demand pathological image content feature vector, and the target demand pathological image visual feature vector;

Sorting the pending medical image data by matching degree according to the demand matching coefficient, and determining target medical image data from the sorted pending medical image data.

In a second aspect, an embodiment of the present invention provides a server system, including a server, where the server is configured to perform the method described in the first aspect.

Compared with the prior art, the invention has the following beneficial effects. With the medical image retrieval method and system based on deep learning provided by the invention, the patient's medical record and pathological image are acquired, and the basic medical image data associated with the medical record is extracted from the medical image information blockchain. The visual feature vector and the content feature vector of the current pathological image are also acquired, and the advanced medical image data associated with these feature vectors is extracted from the medical image information blockchain. The basic and advanced medical image data together serve as pending medical image data, for which the corresponding target demand pathological image visual feature vectors, target demand pathological image content feature vectors, and target pathology-associated content are obtained. By jointly comparing these feature vectors, a demand matching coefficient between the pending medical image data and the current patient matching request is determined. Finally, the pending medical image data is sorted by demand matching coefficient to determine the target medical image data. This design allows the image content to be understood more accurately and improves retrieval accuracy and efficiency.

Drawings

In order to more clearly illustrate the technical solution of the embodiments of the present invention, the drawings that are required to be used in the embodiments will be briefly described. It is appreciated that the following drawings depict only certain embodiments of the invention and are therefore not to be considered limiting of its scope. Other relevant drawings may be made by those of ordinary skill in the art without undue burden from these drawings.

Fig. 1 is a schematic flow chart of steps of a medical image retrieval method based on deep learning according to an embodiment of the present invention;

Fig. 2 is a schematic block diagram of a computer device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention more clear, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.

The following describes specific embodiments of the present invention in detail with reference to the drawings.

To solve the technical problems described in the Background, Fig. 1 shows a schematic flow chart of a medical image retrieval method based on deep learning according to an embodiment of the present disclosure. The method is described in detail below.

Step S201, acquiring a current patient matching request containing a current patient medical record and a current pathological image, and extracting basic medical image data associated with the current patient medical record from a medical image information blockchain;

Step S202, acquiring a current pathological image visual feature vector and a current pathological image content feature vector corresponding to the current pathological image, and extracting advanced medical image data associated with the current pathological image content feature vector from the medical image information blockchain;

Step S203, taking the basic medical image data and the advanced medical image data as pending medical image data, and acquiring a target demand pathological image visual feature vector, a target demand pathological image content feature vector, and target pathology-associated content corresponding to the pending medical image data;

Step S204, determining a demand matching coefficient between the pending medical image data and the current patient matching request based on the current patient medical record, the current pathological image content feature vector, the current pathological image visual feature vector, the target pathology-associated content, the target demand pathological image content feature vector, and the target demand pathological image visual feature vector;

Step S205, sorting the pending medical image data by matching degree according to the demand matching coefficient, and determining target medical image data from the sorted pending medical image data.

In an exemplary embodiment of the present invention, a server receives a current patient matching request containing a current patient medical record and a current pathological image. For example, during diagnosis a doctor may submit the patient's medical record information and a pathological image (e.g., a CT, MRI, or X-ray image) to the server and request that it retrieve medical image data associated with the medical record, in order to understand the patient's condition more accurately. The server extracts basic medical image data associated with the current patient medical record from the medical image information blockchain. That is, the server accesses a blockchain network that stores medical images and medical record information, and relies on the tamper-resistant and decentralized nature of blockchain technology to ensure the authenticity and reliability of the extracted medical image data. The basic medical image data may include medical images of other patients whose medical histories are similar to that of the current patient. The server further acquires the visual feature vector and the content feature vector corresponding to the current pathological image. The visual feature vector may include low-level image features such as color, texture, and shape, while the content feature vector may contain higher-level semantic information extracted by a deep learning model. The server also extracts from the medical image information blockchain advanced medical image data associated with the content feature vector of the current pathological image. The advanced medical image data may be other medical images that are similar or related to the current pathological image in visual appearance or content. The server merges the basic medical image data and the advanced medical image data into pending medical image data. For this data, the server obtains the target demand pathological image visual feature vector, the target demand pathological image content feature vector, and the target pathology-associated content. These target demand feature vectors and content may be defined according to specific medical requirements or criteria. Based on the current patient medical record, the current pathological image content feature vector, the current pathological image visual feature vector, the target pathology-associated content, the target demand pathological image content feature vector, and the target demand pathological image visual feature vector, the server may use an algorithm (e.g., a deep learning model) to determine a demand matching coefficient between the pending medical image data and the current patient matching request. This coefficient reflects how well the pending medical image data matches the current patient medical record and pathological image. The server then sorts the pending medical image data by matching degree according to the demand matching coefficient, so that medical image data that better matches the current patient medical record and pathological image is ranked first. Finally, the server determines target medical image data from the sorted pending medical image data. The target medical image data may be the medical images of other patients that have the greatest reference value for the physician, helping the physician diagnose and treat the current patient more accurately.
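Steps S201 to S205 can be outlined as a small orchestration function. This is only an illustrative sketch under stated assumptions: the name `retrieve_target_images`, the caller-supplied callables `fetch_basic`, `fetch_advanced`, and `score`, the dictionary record layout, and the removal of duplicate image codes during merging are all hypothetical and are not specified in the text. Blockchain access and the deep learning models are replaced by toy stand-ins.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]  # hypothetical layout: each record carries an "image_code" key


def retrieve_target_images(
    request: Dict[str, object],
    fetch_basic: Callable[[Dict[str, object]], List[Record]],
    fetch_advanced: Callable[[Dict[str, object]], List[Record]],
    score: Callable[[Dict[str, object], Record], float],
    top_k: int = 3,
) -> List[Record]:
    """Outline of S201-S205 with caller-supplied stand-ins for each stage."""
    basic = fetch_basic(request)        # S201: data associated with the medical record
    advanced = fetch_advanced(request)  # S202: data associated with the image content vector

    # S203: merge both sets into the pending set (deduplication is an assumption).
    pending: Dict[str, Record] = {}
    for record in basic + advanced:
        pending.setdefault(str(record["image_code"]), record)

    # S204: compute a demand matching coefficient for every pending record.
    scored = [(score(request, record), record) for record in pending.values()]

    # S205: sort by coefficient, highest first, and keep the top_k as target data.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored[:top_k]]


# Toy stand-ins so the outline can be run end to end.
def demo_basic(request):
    return [{"image_code": "A", "sim": 0.4}, {"image_code": "B", "sim": 0.9}]


def demo_advanced(request):
    return [{"image_code": "B", "sim": 0.9}, {"image_code": "C", "sim": 0.7}]


def demo_score(request, record):
    return float(record["sim"])


demo_request = {"medical_record": "persistent cough", "image": "xray-001"}
targets = retrieve_target_images(demo_request, demo_basic, demo_advanced, demo_score, top_k=2)
print([t["image_code"] for t in targets])
```

Passing the stages in as callables keeps the outline independent of any particular storage backend or model.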

In order to more clearly describe the solution provided by the embodiments of the present application, the following description is made in more detail.

In an embodiment of the present invention, the current patient medical record refers to the medical record of a patient currently undergoing diagnosis or treatment, including the patient's basic information, medical history, diagnosis results, treatment regimen, and the like. For example, for a patient named Zhang San who comes to the hospital because of a persistent cough, the medical record may include age, gender, allergy history, past illnesses, family history, and the duration and symptoms of the cough. The current pathological image refers to an image generated by a medical imaging examination (e.g., X-ray, CT scan, or MRI) currently performed on the patient, which is used to assist the physician in diagnosing disease and planning treatment. Zhang San undergoes a chest X-ray examination because of his cough, and the resulting X-ray image is the current pathological image. This image may show a shadow in the lungs, suggesting that an infection or other lesion may be present. The current patient matching request refers to a request issued by a physician or medical system to find medical image data matching the current patient's medical record and pathological image, in order to aid diagnosis or treatment. After reviewing Zhang San's medical record and X-ray image, the physician may find that more similar cases or images are needed to support a judgment, and then issues a matching request to the server, asking it to retrieve medical image data similar to Zhang San's. The medical image information blockchain refers to a system that stores and manages medical image information using blockchain technology. Blockchain technology ensures that data is tamper-resistant and secure, making it suitable for storing medical data that is sensitive and must be highly trustworthy. In this system, each time new medical image data (e.g., a CT scan image) is added, it is encrypted and stored in a block. This block is linked to the previous block, forming an ever-growing chain. Any modification to the data is immediately detected by the system and treated as invalid. The basic medical image data having an association relationship refers to medical image data that has a certain association or similarity with the current patient medical record. Such an association may be based on the diagnosis results, symptoms, disease types, etc. in the medical record. If Zhang San's medical record shows that he may have pneumonia, the server extracts from the medical image information blockchain the medical image data of all patients diagnosed with pneumonia as the associated basic medical image data. This data can be used for comparison and analysis to help the doctor diagnose Zhang San's condition more accurately.
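The block linking and tamper detection described above can be illustrated with a minimal hash chain. The sketch below makes several assumptions that the text does not specify: SHA-256 as the hash function, a JSON encoding of each block, and the helper names `block_hash`, `append_block`, and `is_valid`. Encryption, signatures, and distributed consensus are omitted entirely.

```python
import hashlib
import json
from typing import Dict, List


def block_hash(index: int, prev_hash: str, payload: Dict[str, str]) -> str:
    """Hash the block fields in canonical JSON form (SHA-256 is an assumption)."""
    body = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload": payload}, sort_keys=True
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain: List[dict], payload: Dict[str, str]) -> None:
    """Append a block that links to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({
        "index": index,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    })


def is_valid(chain: List[dict]) -> bool:
    """Return False if any block was altered or any link is broken."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i > 0 else "0" * 64
        if block["prev_hash"] != expected_prev:
            return False
        if block["hash"] != block_hash(block["index"], block["prev_hash"], block["payload"]):
            return False
    return True


chain: List[dict] = []
append_block(chain, {"image_code": "CT-0001", "keywords": "pneumonia"})
append_block(chain, {"image_code": "CT-0002", "keywords": "lung cancer"})
valid_before = is_valid(chain)

chain[0]["payload"]["keywords"] = "healthy"  # simulate tampering with stored data
valid_after = is_valid(chain)
print(valid_before, valid_after)
```

Because each block stores the hash of its predecessor, changing any stored record invalidates the recomputed hash of that block, which is how a modification is detected.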

Further, the current pathological image visual feature vector refers to a series of numerical features extracted from the image by image processing techniques, which describe visual properties of the image such as color, texture, and shape. For the current pathological image, the visual feature vector may include features describing the size, shape, edge sharpness, etc. of the lesion region. For example, if the current pathological image is a lung X-ray film, the server uses an image processing algorithm to analyze the lesion region (e.g., a lung shadow), extracts visual features such as shadow area, boundary smoothness, and contrast, and then converts these features into a numerical vector, i.e., the visual feature vector. The current pathological image content feature vector refers to high-level, abstract features extracted from the image through deep learning or other machine learning techniques, which describe the semantic content of the image, such as object types and scenes. For the current pathological image, the content feature vector may encode medically relevant information such as the type and severity of the lesion. For the same lung X-ray film, the server analyzes the image using a pre-trained deep learning model and extracts features describing the lesion type (e.g., lung cancer or lung inflammation) and possibly the severity (e.g., mild, moderate, or severe). These features are encoded into a numerical vector, i.e., the content feature vector. Advanced medical image data refers to medical image data that, beyond the basic medical image data, has a deeper visual or content association with the current pathological image. This data may provide more detailed information about the lesion or serve as a reference to similar cases. For example, once the server has extracted the visual and content feature vectors of the current pathological image, it searches the medical image information blockchain for medical images with similar visual or content feature vectors. The images found are the advanced medical image data, which may involve lesion types, locations, or disease courses similar to those of the current patient, thereby giving the physician a richer basis for diagnosis. The existence of an association relationship means that there is some similarity or correlation between the advanced medical image data and the content feature vector of the current pathological image. Such an association may be established by the server based on a similarity metric between image features (e.g., cosine similarity or Euclidean distance): the server calculates the similarity between the current pathological image content feature vector and the other image feature vectors stored in the medical image information blockchain. Images with higher similarity are considered to be associated with the current pathological image and are selected as advanced medical image data. In this way, the server can screen out the medical image data with the greatest reference value for the current case.
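The similarity-based selection of advanced medical image data can be sketched as follows. The functions `cosine_similarity`, `euclidean_distance`, and `select_advanced`, the in-memory dictionary of stored vectors, and the `min_similarity` cutoff of 0.9 are illustrative assumptions; the text names the two metrics but does not prescribe a cutoff or a storage layout.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_advanced(
    query: Sequence[float],
    stored: Dict[str, Sequence[float]],
    min_similarity: float = 0.9,  # hypothetical cutoff
) -> List[Tuple[str, float]]:
    """Return (image_code, similarity) pairs at or above the cutoff, best first."""
    hits = [(code, cosine_similarity(query, vec)) for code, vec in stored.items()]
    hits = [(code, sim) for code, sim in hits if sim >= min_similarity]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy content feature vectors keyed by hypothetical medical image codes.
stored_vectors = {
    "IMG-1": [1.0, 0.0, 0.0],
    "IMG-2": [0.9, 0.1, 0.0],
    "IMG-3": [0.0, 1.0, 0.0],
}
advanced = select_advanced([1.0, 0.0, 0.0], stored_vectors)
print([code for code, _ in advanced])
```

Cosine similarity ignores vector magnitude, while Euclidean distance does not, so the choice between them depends on whether the feature vectors are normalized.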

In addition, the basic medical image data is medical image data that is preliminarily extracted from the medical image information blockchain and has a direct association with the current patient medical record. This data is at a basic level and may contain basic lesion information similar to that of the current patient. If the current patient is a lung cancer patient, the basic medical image data may include CT scan images of other lung cancer patients that have been diagnosed and stored in the medical image information blockchain. The advanced medical image data refers to medical image data that has a deeper visual or content association with the current pathological image and is found through deeper analysis and feature matching on top of the basic medical image data. This data provides more specific and detailed lesion information. For a lung cancer patient, the advanced medical image data may include CT images of other patients with similar tumor size, location, morphology, or degree of infiltration. This data can provide a more accurate reference for the physician. The pending medical image data refers to the image data set to be screened and evaluated, formed by combining the basic medical image data and the advanced medical image data. The images in this data set may be related to the medical record and pathological image of the current patient, but require further analysis and validation. The pending medical image data may contain all CT images associated with lung cancer, whether basic or advanced. These images are subsequently processed and analyzed as a whole. The target demand pathological image visual feature vector is a feature vector that is extracted by image processing techniques from each image in the pending medical image data and describes the visual attributes of that image. These feature vectors reflect visual features of the image, such as color, texture, and shape. For a CT image, the target demand pathological image visual feature vector may include a series of values describing visual features such as the size, shape, and density of the tumor region. These values are organized into a vector for subsequent similarity calculation and matching. The target demand pathological image content feature vector refers to a feature vector describing the semantic content of an image, extracted from the pending medical image data through deep learning or other machine learning techniques. These feature vectors can encode high-level information in the image, such as lesion type and severity. For a CT image of a lung cancer patient, the target demand pathological image content feature vector may contain high-level features describing the tumor type (e.g., adenocarcinoma or squamous carcinoma), the stage (e.g., early or late), and possibly prognostic information. These features are encoded into a numerical vector for subsequent similarity calculation and matching. The target pathology-associated content refers to additional information associated with the pending medical image data, which may include the patient's basic information, medical history, diagnosis results, treatment regimen, etc. This information is important for understanding the medical background and context of the image. For a CT image of a lung cancer patient, the target pathology-associated content may include the patient's age, sex, smoking history, family history, previous diagnosis results, and treatment history. Such information helps the physician understand the patient's condition and the background of the image more fully.

Furthermore, the demand matching coefficient is a quantitative indicator of the degree of similarity or matching between the pending medical image data and the current patient matching request. It may be based on a comprehensive evaluation of several factors, including the similarity of medical record information, the similarity of image content, and the similarity of image visual features. The demand matching coefficient can be obtained by combining several indicators, such as the text similarity between the current patient medical record and the target pathology-associated content, and the cosine similarity between the current pathological image feature vectors and the target demand pathological image feature vectors. The higher this coefficient, the better the pending medical image data matches the current patient. Sorting by matching degree means ordering the pending medical image data by the magnitude of the demand matching coefficient, so that images that better match the current patient are placed first and the doctor can quickly find the most relevant medical image data. If there are multiple items of pending medical image data, each with a demand matching coefficient, the server sorts them by coefficient from high to low to form an ordered list. The target medical image data refers to the image data that, after the pending medical image data has been sorted by matching degree, is determined to best fit the current patient's matching request. This data can provide an important reference for the physician's diagnosis and treatment. In the ordered list of pending medical image data, the top few images may be considered the closest match to the current patient and are therefore selected as target medical image data. The physician may further view and analyze these images to support the decision-making process.
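One way to combine these indicators is a weighted sum, sketched below. The text does not fix a formula, so the weighted sum itself, the weights (0.4, 0.3, 0.3), the token-level Jaccard measure used as a stand-in for text similarity, and the dictionary keys are all assumptions made for illustration.

```python
import math
from typing import Dict, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def jaccard(text_a: str, text_b: str) -> float:
    """Token-set overlap, used here as a simple stand-in for text similarity."""
    tokens_a, tokens_b = set(text_a.lower().split()), set(text_b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def demand_matching_coefficient(
    request: Dict[str, object],
    candidate: Dict[str, object],
    weights: Tuple[float, float, float] = (0.4, 0.3, 0.3),  # hypothetical weights
) -> float:
    """Weighted sum of record, content-vector, and visual-vector similarity."""
    w_text, w_content, w_visual = weights
    return (
        w_text * jaccard(request["record"], candidate["associated_content"])
        + w_content * cosine(request["content_vec"], candidate["content_vec"])
        + w_visual * cosine(request["visual_vec"], candidate["visual_vec"])
    )


request = {"record": "persistent cough lung shadow",
           "content_vec": [1.0, 0.0], "visual_vec": [0.0, 1.0]}
candidates = [
    {"image_code": "P-1", "associated_content": "persistent cough lung shadow",
     "content_vec": [1.0, 0.0], "visual_vec": [0.0, 1.0]},
    {"image_code": "P-2", "associated_content": "knee pain",
     "content_vec": [0.0, 1.0], "visual_vec": [1.0, 0.0]},
    {"image_code": "P-3", "associated_content": "cough fever",
     "content_vec": [1.0, 0.0], "visual_vec": [1.0, 0.0]},
]

# Sort from highest to lowest coefficient and take the top two as target data.
ranked = sorted(candidates, key=lambda c: demand_matching_coefficient(request, c), reverse=True)
target = ranked[:2]
print([c["image_code"] for c in ranked])
```

In practice the weights could be tuned or replaced by a learned model, as the description allows the coefficient to be produced by a deep learning model.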

In an embodiment of the present invention, the foregoing step S201 may be implemented as in the following example.

Performing text extraction on the current patient medical record to obtain the corresponding current patient medical record pathology keywords;

Acquiring, from the medical image information blockchain, query identifier-content pairs whose query feature is a keyword feature; each query identifier-content pair comprises a query identifier determined by a medical image code and query content determined by medical image keywords;

Taking the query identifier corresponding to the query content containing the pathology keywords as a target query identifier, and acquiring the basic medical image data corresponding to the target query identifier from the medical image information blockchain.

In an embodiment of the present invention, the server first receives and loads the medical record data of the current patient. The medical record may include sections such as the patient's personal information, chief complaint, history of present illness, past history, family history, physical examination, and preliminary diagnosis. The server runs a natural language processing (NLP) algorithm to perform word segmentation, part-of-speech tagging, entity recognition, and similar processing on the medical record text in order to extract pathology-related keywords. For example, if the current patient medical record states "the patient presented with persistent cough and chest pain; CT examination shows a space-occupying lesion about 3 cm in diameter in the upper right lung lobe; lung cancer is suspected", the server may extract pathology keywords such as "cough", "chest pain", "upper right lung lobe", "space-occupying lesion", and "suspected lung cancer". The server then accesses the medical image information blockchain, which is a distributed, decentralized database that stores large amounts of medical image data and associated metadata. In the blockchain, each data block contains a number of medical image records, and each record is identified by a unique medical image code and has a keyword description associated with the image. The server uses the pathology keywords extracted in the first step as query features to search the blockchain for query identifier-content pairs containing those keywords. A query identifier-content pair is paired data consisting of a medical image code (the query identifier) and the corresponding medical image keywords (the query content). From the search results, the server screens out the query content containing the pathology keywords of the current patient medical record, and marks the corresponding medical image codes as target query identifiers. The server then accesses the medical image information blockchain again and retrieves the corresponding basic medical image data based on the target query identifiers. For example, if the server finds multiple medical image codes in the blockchain that match keywords such as "upper right lung lobe" and "space-occupying lesion", it marks those codes as target query identifiers and retrieves the basic medical image data to which they correspond. This data may include CT scan images, X-ray films, MRI images, etc. of other patients, all stored in different data blocks of the blockchain, with the integrity and security of the data ensured by encryption and signing. Finally, the server returns the basic medical image data to the requesting party (e.g., the doctor's workstation or the patient's mobile device) for the doctor to use as a reference when diagnosing the current patient. In this way, the doctor can quickly obtain basic medical image data related to the current patient medical record, which improves the accuracy and efficiency of diagnosis.
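The keyword-based lookup can be sketched as below. This is a simplified illustration, not the described NLP pipeline: a fixed vocabulary lookup stands in for word segmentation and entity recognition, an in-memory dictionary stands in for the blockchain, and the names `PATHOLOGY_TERMS`, `extract_keywords`, `find_target_ids`, and the `min_hits` parameter are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical pathology vocabulary; a real system would use an NLP pipeline instead.
PATHOLOGY_TERMS = [
    "cough",
    "chest pain",
    "upper right lung lobe",
    "space-occupying lesion",
    "lung cancer",
]


def extract_keywords(record_text: str, vocabulary: List[str] = PATHOLOGY_TERMS) -> Set[str]:
    """Return every vocabulary term that occurs in the medical record text."""
    text = record_text.lower()
    return {term for term in vocabulary if term in text}


def find_target_ids(keywords: Set[str], pairs: Dict[str, Set[str]], min_hits: int = 1) -> List[str]:
    """Return image codes whose keyword set shares at least min_hits terms with the record."""
    return sorted(code for code, content in pairs.items() if len(keywords & content) >= min_hits)


record = (
    "Patient presented with persistent cough and chest pain; CT shows a "
    "space-occupying lesion in the upper right lung lobe, lung cancer suspected."
)

# Query identifier-content pairs: medical image code -> medical image keywords.
pairs = {
    "IMG-100": {"space-occupying lesion", "upper right lung lobe"},
    "IMG-200": {"fracture", "left femur"},
    "IMG-300": {"cough", "pneumonia"},
}

keywords = extract_keywords(record)
target_ids = find_target_ids(keywords, pairs)
print(target_ids)
```

Raising `min_hits` would narrow the result to images that share several keywords with the medical record.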

Loading the current patient medical record into a semantic recognition model, and obtaining the corresponding current patient medical record content feature vector by using the semantic recognition model;

Acquiring, from the medical image information blockchain, query identifier-content pairs whose query feature is a medical record content feature; each query identifier-content pair comprises a query identifier determined by a medical image code and query content determined by a pathology-associated content feature vector;

Obtaining a matching score between the current patient medical record content feature vector and the query content, and taking the query content whose matching score is greater than a matching score threshold as target query content;

Taking the query identifier corresponding to the target query content as a target query identifier, and acquiring the basic medical image data corresponding to the target query identifier from the medical image information blockchain.

In the embodiment of the invention, the server receives the current patient medical record and loads it into a trained semantic recognition model. This model may be built on deep learning techniques, such as convolutional neural networks (CNNs) or long short-term memory (LSTM) networks, and is used specifically for understanding and representing medical text data. Through the processing of the model, the server converts the current patient medical record into a high-dimensional feature vector, namely the current patient medical record content feature vector. The vector captures the key information and semantic structure of the medical record and provides a basis for subsequent similarity matching. For example, if the medical record describes patient symptoms, diagnostic results, and treatment regimens, the semantic recognition model encodes this information into a numeric vector, each element of which represents the strength or relevance of a particular aspect of the medical record. The server then accesses the medical image information blockchain and searches for the query identification content pairs stored therein. These pairs are made up of medical image codes (query identifiers) and the pathology-associated content feature vectors (query content) associated with them. The server is specifically interested in the query identification content pairs whose query feature is the medical record content feature. These pairs are stored in the blockchain in an encrypted and signed manner, which ensures the integrity and security of the data, and the server uses appropriate decryption and authentication mechanisms to obtain them. The server performs a similarity calculation between the current patient medical record content feature vector and the query content retrieved from the blockchain. The similarity calculation may use cosine similarity, Euclidean distance, or other methods to measure how close two vectors are in the feature space. The server calculates a matching score for each pair of compared feature vectors; the score reflects the degree of similarity between the current patient medical record and the query content. The server then compares each matching score with a preset matching score threshold. Only query content whose matching score is greater than the threshold is taken as target query content. Finally, the server takes the medical image code corresponding to the target query content as the target query identifier. It then accesses the medical image information blockchain again and retrieves the corresponding basic medical image data based on these target query identifiers. The data may comprise CT scan images, MRI images, X-ray films, and the like. Because the data has an association relation with the current patient medical record, it can serve as an important reference for the doctor's diagnosis and treatment. The server returns this data to the requesting party (e.g., the physician's workstation or the patient's mobile device) to support the subsequent medical decision-making process.
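The threshold-based selection described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes cosine similarity as the matching score, represents the query identification content pairs as an in-memory list instead of blockchain records, and uses hypothetical names (`select_target_query_identifiers`, `IMG_001`, and so on) and a hypothetical threshold of 0.8.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector is treated as matching nothing
    return dot / (norm_a * norm_b)


def select_target_query_identifiers(record_vector, query_pairs, threshold):
    """Return the query identifiers whose query content scores above the threshold."""
    targets = []
    for query_id, query_content in query_pairs:
        score = cosine_similarity(record_vector, query_content)
        if score > threshold:
            targets.append(query_id)
    return targets


# Hypothetical query identification content pairs: (medical image code, feature vector).
pairs = [
    ("IMG_001", [1.0, 0.0, 1.0]),
    ("IMG_002", [0.0, 1.0, 0.0]),
    ("IMG_003", [1.0, 0.1, 0.9]),
]
record_vector = [1.0, 0.0, 1.0]  # stand-in for the medical record content feature vector
targets = select_target_query_identifiers(record_vector, pairs, threshold=0.8)
print(targets)
```

The returned identifiers would then be used to fetch the basic medical image data; that lookup, along with decryption and signature verification, is outside the scope of this sketch.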

In the embodiment of the present invention, the foregoing step S202 may be implemented by the following example execution.

Loading the current pathological image to a pathological image recognition model, and obtaining a pathological image recognition result of the current pathological image by using the pathological image recognition model;

Extracting a pathological content recognition model with an association relation with the pathological image recognition result and a pathological visual feature recognition model with an association relation with the pathological image recognition result;

Loading the current pathological image to the pathological content recognition model, and obtaining the content feature vector of the current pathological image corresponding to the current pathological image by utilizing the pathological content recognition model;

loading the current pathological image to the pathological visual feature recognition model, extracting the tissue structure features of the current pathological image and the pathological change features of the current pathological image by using the pathological visual feature recognition model, and performing depth feature fusion operation on the tissue structure features of the current pathological image and the pathological change features of the current pathological image to obtain the current pathological image visual feature vector corresponding to the current pathological image.

In an embodiment of the invention, the server first receives the pathological image data of the current patient, which may be a digitized microscopic slice image, a CT scan image, or another type of medical image. The server loads the image into a pre-trained pathological image recognition model. This model may be built on deep learning techniques, such as convolutional neural networks (CNNs), and is used specifically for analyzing and understanding medical pathological images. Through the processing of the model, the server obtains the pathological image recognition result of the current pathological image. The result may be a probability distribution representing the likelihood of different pathology types in the image, or a specific pathology type label such as "lung cancer" or "breast cancer". This result provides the basis for subsequent feature extraction and matching. According to the pathological image recognition result, the server extracts from the model library a pathological content recognition model and a pathological visual feature recognition model that are associated with the recognition result. These models may be optimized for specific pathology types or image features so that the critical information in the image is extracted and represented more accurately. For example, if the recognition result shows that the current image is a lung cancer slice image, the server selects a pathological content recognition model dedicated to analyzing the tissue structure of lung cancer and a pathological visual feature recognition model aimed at the pathological features of lung cancer. The server then loads the current pathological image into the extracted pathological content recognition model. The model further analyzes the image and extracts content features related to pathology type, tissue structure, and the like. These features are encoded as a high-dimensional feature vector, i.e., the current pathological image content feature vector. The vector captures the key pathological information in the image and provides a basis for subsequent image matching and diagnosis. Finally, the server loads the current pathological image into the pathological visual feature recognition model. This model focuses on extracting visual features of the image, such as texture, shape, and color, which are critical for describing the tissue structure and the appearance of the lesion. The model first extracts the tissue structure features and the lesion features of the current pathological image, which may include the arrangement of cells, the morphology of blood vessels, the size and shape of the lesion region, and so on. The server then performs a depth feature fusion operation that fuses these features into a more comprehensive visual feature vector, i.e., the current pathological image visual feature vector. The vector not only contains the basic visual attributes of the image but also captures the interactions and relations among different features, providing richer information for subsequent image similarity matching.
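The control flow of this step (recognize the pathology type, pick the associated models, extract features, fuse) can be sketched as below. Everything here is an assumption for illustration: the "models" are trivial stand-in functions rather than trained networks, the model library is a plain dictionary, and the depth feature fusion, which the text does not specify, is replaced by simple concatenation.

```python
def recognize_pathology(probabilities):
    """Pick the most likely pathology type from a label -> probability map."""
    return max(probabilities, key=probabilities.get)


# Hypothetical stand-ins for trained models; real models would be neural networks.
def lung_content_model(image):
    return [sum(image) / len(image)]  # toy "content feature": mean intensity


def lung_visual_model(image):
    tissue_features = [min(image)]  # toy "tissue structure feature"
    lesion_features = [max(image)]  # toy "lesion feature"
    return tissue_features, lesion_features


# Hypothetical model library keyed by pathology type.
MODEL_LIBRARY = {"lung cancer": (lung_content_model, lung_visual_model)}


def extract_feature_vectors(image, probabilities, library):
    """Select models by recognition result, then build content and visual vectors."""
    label = recognize_pathology(probabilities)
    content_model, visual_model = library[label]
    content_vector = content_model(image)
    tissue, lesion = visual_model(image)
    visual_vector = list(tissue) + list(lesion)  # concatenation as a fusion stand-in
    return label, content_vector, visual_vector


image = [1.0, 2.0, 6.0]  # toy image flattened to pixel intensities
probabilities = {"lung cancer": 0.7, "breast cancer": 0.3}
label, content_vec, visual_vec = extract_feature_vectors(image, probabilities, MODEL_LIBRARY)
print(label, content_vec, visual_vec)
```

Keying the library by the recognition result keeps the type-specific models swappable without changing the extraction routine.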

In the embodiment of the invention, the pathological content recognition model comprises a key pathological content recognition model, a detail pathological content recognition model and a global pathological content recognition model; the foregoing step of loading the current pathology image into the pathology content recognition model, and obtaining the content feature vector of the current pathology image corresponding to the current pathology image by using the pathology content recognition model may be implemented by the following example implementation.

Loading the current pathological image to the key pathological content recognition model, and extracting key content features of the current pathological image by using the key pathological content recognition model to obtain the key content feature vector;

loading the current pathological image to the detail pathological content recognition model, and extracting detail content characteristics of the current pathological image by using the detail pathological content recognition model to obtain the detail content characteristic vector;

loading the current pathological image to the global pathological content recognition model, and extracting global content features of the current pathological image by using the global pathological content recognition model to obtain the global content feature vector;

And taking the key content feature vector, the global content feature vector and the detail content feature vector as the current pathological image content feature vector.

In an embodiment of the present invention, for example, the server first loads the current pathological image into the key pathological content recognition model. This model is trained specifically to identify key pathological content in images, such as lesion areas and abnormal cells. Through the processing of the model, the server extracts the key content features of the image, which may be information closely related to lesion type, severity, and the like. These key content features are encoded as a feature vector, i.e., the key content feature vector. Next, the server loads the current pathological image into the detail pathological content recognition model. This model focuses on capturing detail information in the image, such as the morphology of cells and the division of nuclei. Such detail information is critical for pathological diagnosis and differential diagnosis. Through the processing of the model, the server extracts the detail content features of the image and encodes them into a feature vector, namely the detail content feature vector. The server then loads the current pathological image into the global pathological content recognition model. This model analyzes the image from a global perspective and extracts content features related to the entire image, such as its texture and color distribution. These global features help describe the overall properties and style of the image. Through the processing of the model, the server obtains the global content feature vector. Finally, the server combines or fuses the key content feature vector, the global content feature vector, and the detail content feature vector to form a more comprehensive current pathological image content feature vector. The vector integrates the key information, detail information, and global information of the image and provides rich basic data for subsequent image matching and diagnosis. Through this processing flow, the server can accurately extract and represent the content features of the current pathological image, which supports subsequent medical image analysis and diagnosis.
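As a small illustration of the final combining step, the three component vectors can be kept together in one structure and flattened when a single vector is needed. The type name `ContentFeatures`, the method `fused`, and the choice of concatenation are assumptions made for this sketch; the text only says the vectors are combined or fused.

```python
from typing import List, NamedTuple


class ContentFeatures(NamedTuple):
    """Hypothetical container for the three content feature vectors of one image."""

    key: List[float]
    global_: List[float]
    detail: List[float]

    def fused(self) -> List[float]:
        # Concatenate in key, global, detail order; one possible way to combine them.
        return [*self.key, *self.global_, *self.detail]


features = ContentFeatures(key=[0.9, 0.1], global_=[0.3], detail=[0.5, 0.5, 0.2])
fused_vector = features.fused()
print(fused_vector)
```

Keeping the components separate as well as fused is convenient because later steps match the key, global, and detail vectors individually.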

Acquiring a query identification content pair with query characteristics as content characteristics from the medical image information block chain; the query identification content pairs comprise query identifications correspondingly determined by medical image codes and query contents correspondingly determined by demand pathological image content feature vectors;

And extracting the advanced medical image data with association relation with the current pathological image content feature vector from the medical image information blockchain according to the current pathological image content feature vector and the query identification content pair.

In the present embodiment, the server first accesses the medical image information blockchain, which is a secure, tamper-resistant data storage network in which large amounts of medical image data and associated information are stored. The server searches the blockchain for the query identification content pairs whose query feature is the content feature. These query identification content pairs consist of medical image codes (as query identifiers) and the demand pathological image content feature vectors (as query content) associated with them. For example, the server may find a query identification content pair in which the query identifier is a particular medical image code and the query content is a content feature vector describing a certain pathology type. The server then performs similarity matching between the current pathological image content feature vector and the query content of the pairs retrieved from the blockchain. The similarity matching may use cosine similarity, Euclidean distance, or other methods to measure how close two feature vectors are in the feature space. After the server finds the query identification content pairs that are highly similar to the current pathological image content feature vector, it marks the corresponding medical image codes as target query identifiers. It then accesses the medical image information blockchain again and retrieves the corresponding advanced medical image data based on these target query identifiers. The advanced medical image data may include higher-resolution pathological images, three-dimensional reconstructed images, multi-modality fusion images, and the like, which provide richer and deeper information than the original pathological image. These data can be used by doctors for further diagnosis, for formulating treatment plans, for prognosis evaluation, and so on. For example, if the current pathological image shows a region suspected of containing a tumor, the server may extract a high-resolution image and a three-dimensional reconstructed image of the region from the blockchain, so that the doctor can judge the size, position, and degree of infiltration of the tumor more accurately. Through this processing flow, the server can extract from the medical image information blockchain the advanced medical image data that has an association relation with the current pathological image content feature vector, which supports the doctor's diagnosis and treatment.

In the embodiment of the invention, the content feature vector of the current pathological image comprises a key content feature vector, a global content feature vector and a detail content feature vector; the demand pathology image content feature vector comprises a demand key content feature vector, a demand global content feature vector and a demand detail content feature vector; the query identification content pairs comprise a query key identification content pair, a query global identification content pair and a query detail identification content pair; the query key identification content pair comprises a key query identification correspondingly determined by the medical image coding and a query key content correspondingly determined by the requirement key content feature vector; the query detail identification content pair comprises a query detail identification correspondingly determined by the medical image coding and a query detail content correspondingly determined by the demand detail content feature vector; the query global identification content pair comprises a query global identification correspondingly determined by the medical image code and a query global content correspondingly determined by the demand global content feature vector; the aforementioned step of extracting the advanced medical image data having an association relationship with the current pathological image content feature vector from the medical image information blockchain according to the current pathological image content feature vector and the query identification content pair may be implemented by the following example execution.

Acquiring a first matching score between the key content feature vector and the query key content, taking the query key content with the first matching score larger than a first matching score threshold value as target query key content, taking a key query identifier corresponding to the target query key content as a target key query identifier, and acquiring key medical image data corresponding to the target key query identifier in the medical image information block chain;

Acquiring a second matching score between the global content feature vector and the query global content, taking the query global content with the second matching score being larger than a second matching score threshold value as target query global content, taking a query global identifier corresponding to the target query global content as a target query global identifier, and acquiring global medical image data corresponding to the target query global identifier in the medical image information blockchain;

Acquiring a third matching score between the detail content feature vector and the query detail content, taking the query detail content with the third matching score larger than a third matching score threshold value as target query detail content, taking a query detail identifier corresponding to the target query detail content as a target query detail identifier, and acquiring detail medical image data corresponding to the target query detail identifier in the medical image information blockchain;

and taking the key medical image data, the detail medical image data and the global medical image data as the advanced medical image data.

In an embodiment of the present invention, for example, the server first processes the key content feature vector of the current pathological image. It uses a similarity measure (e.g., cosine similarity) to calculate the first matching score between the key content feature vector of the current pathological image and each query key content retrieved from the medical image information blockchain. The query key content is pre-stored in the blockchain and associated with a particular medical image code. For example, if the key content feature vector of the current pathological image is highly similar to a certain query key content, the matching score between them will be high. The server compares every calculated first matching score with a preset first matching score threshold. Query key content whose matching score is greater than the threshold is considered highly relevant to the current pathological image and is therefore selected as target query key content. The server then retrieves from the medical image information blockchain the key medical image data associated with the target query key content. The key medical image data may include higher-resolution pathological image slices, magnified images of specific areas, and the like, which give the physician more in-depth information about the pathological area. The server processes the global content feature vector of the current pathological image in a similar manner. It calculates a second matching score between the global content feature vector and each query global content retrieved from the blockchain. The query global content is likewise associated with a specific medical image code. The server compares each calculated second matching score with a preset second matching score threshold and selects the query global content whose matching score is greater than the threshold as target query global content. It then retrieves from the medical image information blockchain the global medical image data associated with the target query global content. These data may include an overview image of the entire pathological section, images under different staining methods, and the like, which provide context information about the lesion over a larger range. The server continues with the detail content feature vector of the current pathological image, calculating a third matching score between it and each query detail content retrieved from the blockchain. The query detail content focuses on subtle features of the image, such as the morphology and structure of cells. The server compares each third matching score with a preset third matching score threshold and selects the query detail content whose matching score is greater than the threshold as target query detail content. It then retrieves from the medical image information blockchain the detail medical image data associated with the target query detail content. These data may include images of cells under high magnification, images of specific staining markers, and the like, which give the physician detailed information at the level of the lesion cells. Finally, the server integrates the acquired key medical image data, global medical image data, and detail medical image data to form a comprehensive advanced medical image data set. The data set not only contains the information of the original pathological image but also provides multi-level additional data and analysis results, which helps doctors make more accurate diagnosis and treatment decisions.
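The three parallel retrievals (key, global, detail), each with its own threshold, can be sketched with one reusable routine. The sketch assumes cosine similarity as the matching score, uses dictionaries in place of blockchain storage, and introduces hypothetical names (`retrieve_tier`, `retrieve_advanced_data`, `K1`, and so on) and thresholds purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_tier(feature_vector, query_pairs, threshold, image_store):
    """Return image data for every pair whose matching score exceeds the threshold."""
    return [
        image_store[query_id]
        for query_id, query_content in query_pairs
        if cosine_similarity(feature_vector, query_content) > threshold
    ]


def retrieve_advanced_data(current_vectors, tiers, image_store):
    """Run the key, global, and detail retrievals and collect the results per tier."""
    return {
        name: retrieve_tier(current_vectors[name], pairs, threshold, image_store)
        for name, (pairs, threshold) in tiers.items()
    }


# Hypothetical data: per-tier (query pairs, threshold) and an image store keyed by code.
current_vectors = {"key": [1.0, 0.0], "global": [0.0, 1.0], "detail": [1.0, 1.0]}
tiers = {
    "key": ([("K1", [1.0, 0.0]), ("K2", [0.0, 1.0])], 0.9),
    "global": ([("G1", [0.0, 1.0])], 0.9),
    "detail": ([("D1", [1.0, -1.0])], 0.5),
}
image_store = {
    "K1": "key-image-1",
    "K2": "key-image-2",
    "G1": "global-image-1",
    "D1": "detail-image-1",
}
advanced = retrieve_advanced_data(current_vectors, tiers, image_store)
print(advanced)
```

Here the detail tier returns nothing because its only candidate is orthogonal to the current detail vector, which shows that each tier is filtered independently before the results are merged into the advanced data set.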

In an embodiment of the invention, the server receives a CT scan of the lung, which has been encoded with a particular medical image code (e.g., in DICOM format). At the same time, the server also receives the pathology-associated content of the image, including diagnostic reports, doctor notes, patient history, and the like. The server performs text extraction on the received pathology-associated content and uses natural language processing (NLP) techniques to identify and extract keywords related to lung cancer, such as "lung cancer", "tumor size", and "lesion location". The server takes the medical image code as the query identifier and the extracted lung cancer keywords as the query content, and constructs a query identification content pair whose query feature is the keyword feature. For example, the query identifier is "DICOM_12345" and the query content is "lung cancer, tumor size, lesion location". In addition to the keyword features, the server also needs to process the deep features of the pathology-associated content. For this purpose, the server uses deep learning techniques, such as a convolutional neural network (CNN) or a recurrent neural network (RNN), to analyze the pathology-associated content in depth and extract its intrinsic feature representation, i.e., the pathology-associated content feature vector. This feature vector is a point in a high-dimensional space that carries information about the semantics, structure, and context of the original text. The server takes the medical image code as the query identifier and the extracted pathology-associated content feature vector as the query content, and constructs a query identification content pair whose query feature is the medical record content feature. Unlike the keyword-based query, the query based on medical record content features focuses on the overall understanding and matching of the text content. When a doctor or a system needs to retrieve similar medical images according to specific pathological features (such as lung cancer type or degree of lesion), the server can quickly locate the relevant medical images and medical record information using the constructed query identification content pairs. By computing the similarity between the query content and the feature vectors of the medical images and medical record information in the database, the server can return the best matching results for the doctor's reference.
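Constructing a keyword-feature query identification content pair can be illustrated as below. This is a deliberately simple sketch: keyword extraction is reduced to case-insensitive matching against a fixed vocabulary, whereas the text refers to NLP techniques in general, and the function names, report text, and vocabulary are hypothetical.

```python
def extract_keywords(text, vocabulary):
    """Return the vocabulary terms that occur in the text (case-insensitive)."""
    lowered = text.lower()
    return [term for term in vocabulary if term in lowered]


def build_keyword_pair(image_code, report_text, vocabulary):
    """Build a (query identifier, query content) pair with keyword features."""
    return (image_code, extract_keywords(report_text, vocabulary))


# Hypothetical report text and keyword vocabulary.
vocabulary = ["lung cancer", "tumor size", "lesion location", "breast cancer"]
report = "CT shows lung cancer; tumor size 2 cm; lesion location: right upper lobe."
pair = build_keyword_pair("DICOM_12345", report, vocabulary)
print(pair)
```

A pair whose query feature is the medical record content feature would have the same shape, with the keyword list replaced by the feature vector produced by the text model.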

In the embodiment of the present invention, the foregoing step S203 may be implemented by the following example execution.

Taking the medical image code corresponding to the pending medical image data as the pending medical image data tag code;

Extracting, from the medical image information blockchain, the target demand pathological image visual feature vector, the target demand key content feature vector, the target demand global content feature vector, the target demand detail content feature vector and the target pathology associated content, each of which has a mapping relation with the pending medical image data tag code;

And taking the target requirement key content feature vector, the target requirement global content feature vector and the target requirement detail content feature vector as the target requirement pathological image content feature vector.

In an exemplary embodiment of the invention, the server receives a piece of pending medical image data, which may be from a medical imaging system of a hospital, a pathology laboratory, or other medical related device. This data is accompanied by a unique medical image code for uniquely identifying the image in the medical image information system. The server records this medical image code as a pending medical image data tag code for subsequent retrieval and mapping in the medical image information blockchain. The server accesses a medical image information blockchain, which is a distributed, non-tamperable data storage network in which a large number of medical images and their associated metadata and feature vectors are stored. The server uses the previously recorded code of the pending medical image data to retrieve various data in the blockchain that has a mapping relationship with the code. These data include: target demand pathology image visual feature vector: this is a vector describing the visual characteristics of the image and may include features such as color, texture, shape, etc. to aid in the visual classification and identification of the image. Target demand key content feature vector: this is a feature vector describing the critical pathological content in the image, and can focus on critical information such as lesion areas, abnormal cells, etc. Target demand global content feature vector: this is a feature vector that describes the overall content and style of an image, and may include global features such as layout, structure, etc. of the image. Target demand detail content feature vector: this is a feature vector describing detailed information in an image, and can focus on fine features such as morphology and structure of cells. 
Target pathology associated content: this is other information associated with the medical image in question, which may include diagnostic reports, medical history records, laboratory test results, etc., to aid the physician in comprehensive analysis and diagnosis.

In the embodiment of the invention, the total number of the undetermined medical image data is at least two, and the at least two undetermined medical image data comprise first undetermined medical image data; the target pathology associated content comprises first target pathology associated content corresponding to the first medical image data to be determined; the target demand pathological image content feature vector comprises a first target demand pathological image content feature vector corresponding to the first medical image data to be determined; the target demand pathological image visual feature vector comprises a first target demand pathological image visual feature vector corresponding to the first medical image data to be determined; the aforementioned step S204 may be implemented by the following example execution.

Obtaining matching scores between the current patient medical record and the first target pathology associated content, and taking the matching scores as text matching scores corresponding to the first medical image data to be determined;

obtaining a matching score between the content feature vector of the current pathological image and the content feature vector of the pathological image of the first target requirement, and taking the matching score as a content matching score corresponding to the first medical image data to be determined;

Obtaining a matching score between the current pathological image visual feature vector and the first target demand pathological image visual feature vector, and taking the matching score as a corresponding visual matching score of the first medical image data to be determined;

Performing a weighted average operation on the text matching score, the content matching score and the visual matching score to obtain a demand matching coefficient between the first undetermined medical image data and the current patient matching request;

And according to the corresponding requirement matching coefficient of each piece of undetermined medical image data, sequentially arranging the at least two pieces of undetermined medical image data based on the matching degree.

In an embodiment of the present invention, the server is required to process at least two pending medical image data and order them according to their degree of matching to the current patient, for example, in the present scenario. These pending medical image data may come from a medical image archiving system in a hospital or other possibly relevant medical images considered by a doctor for the current patient. The server determines the matching degree of each pending medical image data and the matching request of the current patient by using the medical record information of the current patient, the feature vector of the current pathology image, the target pathology associated content acquired from the medical image information blockchain, the feature vector of the content of the target demand pathology image and the vision feature vector of the target demand pathology image. The server first obtains medical record information of the current patient, which may include textual information of the patient's medical history, diagnostic records, laboratory test results, and the like. Then, the server matches medical record information of the current patient with first target pathology associated content corresponding to first pending medical image data acquired from a medical image information blockchain. The matching process can involve techniques of text similarity calculation, keyword extraction, etc., to quantify the correlation between medical record information and the targeted pathology-associated content. The calculated matching score is recorded as a corresponding text matching score for the first medical image data to be determined. The server then processes the content feature vector of the current pathology image. 
A similarity measurement method (such as cosine similarity) is used for calculating a matching score between the content feature vector of the current pathological image and the content feature vector of the pathological image of the first target requirement corresponding to the first medical image data to be determined obtained from the medical image information blockchain. This matching score reflects the similarity of the current pathology image to the first medical image to be determined at the content level. Similarly, the server also processes the visual feature vector of the current pathology image. It calculates a matching score between the visual feature vector of the current pathology image and the visual feature vector of the first target demand pathology image corresponding to the first medical image data to be determined. The matching score reflects the similarity of the current pathological image and the first medical image to be determined on the visual level, such as the similarity of visual characteristics of color, texture, shape and the like. The server performs weighted average operation on the text matching score, the content matching score and the visual matching score obtained by the previous calculation. The weighted average operation may assign different weights based on the importance and reliability of each matching score to comprehensively evaluate the overall degree of matching between the first pending medical image data and the current patient matching request. The operation result is the requirement matching coefficient of the first medical image data to be determined. Finally, the server repeats the above steps for each pending medical image data, calculating their respective demand matching coefficients. 
Then, the server sorts all the undetermined medical image data by demand matching coefficient from high to low, so that the medical image data that best matches the current patient matching request is ranked first. In this way, the doctor can quickly find the medical image data most relevant to the current patient from the sorted result, which supports diagnosis and treatment.
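The weighted average and the ranking can be sketched as follows. The weights are hypothetical, since the text only states that they may reflect the importance and reliability of each score, and the candidate identifiers and scores are made-up inputs. The coefficient is computed as (w_text * s_text + w_content * s_content + w_visual * s_visual) / (w_text + w_content + w_visual).

```python
def demand_matching_coefficient(text_score, content_score, visual_score, weights):
    """Weighted average of the text, content, and visual matching scores."""
    w_text, w_content, w_visual = weights
    total = w_text + w_content + w_visual
    weighted_sum = w_text * text_score + w_content * content_score + w_visual * visual_score
    return weighted_sum / total


def rank_candidates(candidates, weights):
    """Sort candidates by demand matching coefficient, highest first."""
    scored = [
        (candidate_id, demand_matching_coefficient(*scores, weights=weights))
        for candidate_id, scores in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical candidates: id -> (text score, content score, visual score).
candidates = {
    "A": (0.5, 0.5, 0.5),
    "B": (1.0, 1.0, 1.0),
    "C": (0.0, 0.0, 0.0),
}
ranking = rank_candidates(candidates, weights=(1.0, 2.0, 1.0))
print(ranking)
```

Dividing by the sum of the weights keeps the coefficient on the same scale as the individual scores, so the weights do not need to be normalized in advance.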

In the embodiment of the invention, the content feature vector of the current pathological image comprises a key content feature vector, a global content feature vector and a detail content feature vector; the first target demand pathological image content feature vector comprises a first target demand key content feature vector, a first target demand global content feature vector and a first target demand detail content feature vector; the step of obtaining the matching score between the content feature vector of the current pathological image and the content feature vector of the first target required pathological image, and taking the matching score as the content matching score corresponding to the first medical image data to be determined may be implemented through the following example execution.

Obtaining the matching score between the key content feature vector and the first target demand key content feature vector, and taking the matching score as the key content matching score corresponding to the first medical image data to be determined;

Obtaining the matching score between the global content feature vector and the first target demand global content feature vector, and taking the matching score as the global content matching score corresponding to the first medical image data to be determined;

Obtaining the matching score between the detail content feature vector and the first target demand detail content feature vector, and taking the matching score as the detail content matching score corresponding to the first medical image data to be determined;

Taking the key content matching score, the global content matching score and the detail content matching score as the content matching score.

In the embodiment of the present invention, the server needs to evaluate how well the current pathological image matches the first medical image to be determined at the content level. To analyze the image content more fully, the server divides the content feature vector of the pathology image into a key content feature vector, a global content feature vector, and a detail content feature vector. Accordingly, the first target demand pathology image content feature vector obtained from the medical image information blockchain also includes a first target demand key content feature vector, a first target demand global content feature vector, and a first target demand detail content feature vector. The server first processes the key content feature vector of the current pathology image. The key content feature vector typically carries the most important pathological information in the image, such as the location and size of the lesion area. Using a similarity measurement algorithm, the server calculates a matching score between the key content feature vector of the current pathology image and the first target demand key content feature vector of the first medical image to be determined. This matching score reflects how similar the two images are in their key pathological content. Next, the server processes the global content feature vector of the current pathology image. The global content feature vector describes the overall structure and layout of the image, including the distribution of different tissues or cells. The server likewise uses a similarity measurement algorithm to calculate a matching score between the global content feature vector of the current pathology image and the first target demand global content feature vector of the first medical image to be determined. This matching score reflects how similar the two images are in their overall content. 
The server then processes the detail content feature vector of the current pathology image. The detail content feature vector focuses on fine structure and texture information in the image, such as cell morphology and nuclear detail. The server calculates a matching score between the detail content feature vector of the current pathology image and the first target demand detail content feature vector of the first medical image to be determined. This matching score reflects how similar the two images are in their details. Finally, the server integrates the key content matching score, the global content matching score and the detail content matching score calculated above. Together, these matching scores form the content matching score corresponding to the first medical image data to be determined, which evaluates the overall degree of matching between the current pathological image and the first medical image to be determined at the content level. Depending on the particular needs, the server may apply a weighted average or another form of aggregation to these matching scores to obtain a more comprehensive content matching score.
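As a rough sketch of the three-part content comparison, the example below computes one matching score per sub-vector and optionally folds them into a single value. The `ContentFeatures` class, the function names and the default weights are hypothetical, and cosine similarity is only one of the similarity measures the description allows.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional, Sequence

@dataclass
class ContentFeatures:
    """Hypothetical container for the three content sub-vectors of one image."""
    key: Sequence[float]
    global_: Sequence[float]
    detail: Sequence[float]

def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def content_match_scores(current: ContentFeatures,
                         target: ContentFeatures) -> Dict[str, float]:
    """One matching score per sub-vector: key, global and detail."""
    return {
        "key": _cosine(current.key, target.key),
        "global": _cosine(current.global_, target.global_),
        "detail": _cosine(current.detail, target.detail),
    }

def combined_content_score(scores: Dict[str, float],
                           weights: Optional[Dict[str, float]] = None) -> float:
    """Optional weighted average of the three scores (weights are illustrative)."""
    weights = weights or {"key": 0.5, "global": 0.25, "detail": 0.25}
    total = sum(weights[name] for name in scores)
    return sum(weights[name] * scores[name] for name in scores) / total
```

Keeping the three scores separate until the last step lets a caller inspect which aspect of the content drove a match before any aggregation is applied.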

In the embodiment of the present invention, the following implementation manner is also provided.

Acquiring medical images, extracting at least two medical image shots in the medical images, and respectively performing data cleaning operation on the at least two medical image shots to obtain a target medical image shot;

Acquiring a medical image screenshot code corresponding to the target medical image screenshot, and taking the medical image screenshot code as a medical image code;

Acquiring a medical image screenshot description corresponding to the target medical image screenshot, taking the medical image screenshot description as pathology associated content, constructing a query identification content pair with query characteristics of character description characteristics according to the medical image code and the pathology associated content, and storing the query identification content pair with the query characteristics of character description characteristics in the medical image information block chain;

Acquiring a content feature vector of a required pathological image corresponding to the target medical image screenshot and a visual feature vector of the required pathological image corresponding to the target medical image screenshot;

the medical image code is used as query content, the content feature vector of the required pathological image is used as query identification, a query identification content pair with query characteristics as content features is constructed, and the query identification content pair with the query characteristics as content features is stored in the medical image information blockchain;

And constructing a mapping relation for the visual feature vector of the required pathological image and the medical image code, and storing the mapping relation in the medical image information blockchain.

In an embodiment of the present invention, the server processes the medical image data and stores the processed data, together with the related feature vectors and information, in the medical image information blockchain. The process involves extracting screenshots from medical images, cleaning the data, generating query identification content pairs, and storing mapping relations. The server first obtains medical images, which may come from a hospital image archiving system or another medical data source. The server then extracts at least two medical image shots from the medical images using image processing techniques. These shots may cover specific areas or pathological manifestations of interest to a doctor or researcher. The extracted medical image shots may contain noise, artifacts, or other unwanted information. The server therefore performs a data cleaning operation on the shots to remove these interference factors and improve image quality. Data cleaning may include denoising, contrast enhancement, brightness adjustment and similar steps, and yields the target medical image screenshot. The server obtains a unique code for each target medical image screenshot (the medical image screenshot code), which is used to identify and retrieve the image in subsequent steps. The server also acquires description information related to each screenshot (the medical image screenshot description); the description may include key information such as lesion type, position and severity, and is of significant value to doctors and researchers. The server takes the medical image screenshot code as the medical image code and the medical image screenshot description as the pathology-associated content. It then constructs, from the code and the description, a query identification content pair whose query feature is the text description feature. 
These query identification content pairs allow the user to retrieve relevant medical image shots by entering descriptive text. And finally, the server stores the query identification content pairs into a medical image information blockchain, so that the integrity and traceability of the data are ensured. In order to support a higher-level image retrieval function, the server also acquires a content feature vector and a visual feature vector of the required pathological image corresponding to the target medical image screenshot. These feature vectors are high-dimensional representations extracted from the image by deep learning or other machine learning algorithms for quantifying the content and visual characteristics of the image. The content feature vectors may focus on semantic information of the image, while the visual feature vectors focus on details such as visual appearance and texture of the image. The server takes the medical image code as query content, takes the demand pathological image content feature vector as query identification, and constructs a query identification content pair with query features as content features. These query-identified content pairs allow the user to retrieve relevant medical image shots by entering feature vectors that are similar to the image content. The server then stores these query identification content pairs also into the medical image information blockchain. And finally, the server establishes a mapping relation between the visual feature vector of the required pathological image and the medical image code, and stores the mapping relation into a medical image information blockchain. Thus, when a user inputs a visual feature vector, the server can use the mapping relation to quickly find the matched medical image screenshot code, and further search the corresponding image. This step improves the efficiency and accuracy of image retrieval.
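The three kinds of stored records can be illustrated with a small in-memory structure. `InMemoryLedger` is a hypothetical stand-in for the medical image information blockchain, and the substring text search and squared-distance nearest-neighbour lookup are assumptions chosen only to make the example runnable.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]

@dataclass
class InMemoryLedger:
    """Hypothetical stand-in for the medical image information blockchain."""
    text_pairs: List[Tuple[str, str]] = field(default_factory=list)        # (image code, description)
    content_pairs: List[Tuple[Vector, str]] = field(default_factory=list)  # (content vector, image code)
    visual_map: List[Tuple[Vector, str]] = field(default_factory=list)     # (visual vector, image code)

    def register(self, code: str, description: str,
                 content_vec: Sequence[float], visual_vec: Sequence[float]) -> None:
        """Store the text pair, the content pair and the visual mapping for one screenshot."""
        self.text_pairs.append((code, description))
        self.content_pairs.append((tuple(content_vec), code))
        self.visual_map.append((tuple(visual_vec), code))

    def find_by_text(self, term: str) -> List[str]:
        """Image codes whose description contains the term (case-insensitive)."""
        term = term.lower()
        return [code for code, desc in self.text_pairs if term in desc.lower()]

    def nearest_by_visual(self, query: Sequence[float]) -> str:
        """Image code whose stored visual vector is closest to the query.

        Raises ValueError if nothing has been registered yet.
        """
        def squared_distance(vec: Vector) -> float:
            return sum((a - b) ** 2 for a, b in zip(vec, query))
        return min(self.visual_map, key=lambda entry: squared_distance(entry[0]))[1]
```

A real deployment would replace the linear scans with an index suited to the ledger in use; the sketch only shows which key points to which value in each record type.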

In the embodiment of the invention, the at least two medical image shots include a first medical image shot; the step of performing the data cleaning operation on the at least two medical image shots to obtain the target medical image shot may be implemented by the following example.

Obtaining a matching score between the first medical image screenshot and the second medical image screenshot; the second medical image screenshot comprises medical image shots except the first medical image screenshot in the at least two medical image shots;

If the matching score is equal to or greater than a matching score threshold, taking the first medical image screenshot as a redundant medical image screenshot, and discarding the redundant medical image screenshot from the at least two medical image shots to obtain an intermediate medical image screenshot;

And performing invalid data judgment processing on the intermediate medical image screenshot to obtain an invalid local image of the intermediate medical image screenshot, and performing removal processing on the invalid local image in the intermediate medical image screenshot to obtain the target medical image screenshot.

In an embodiment of the present invention, the server needs to process at least two shots extracted from the medical image, including a first medical image shot and other shots (collectively referred to as a second medical image shot). The purpose of the processing is to remove redundant and invalid screenshots and data to ensure the quality of the data for subsequent analysis and storage. The server first obtains a matching score between the first medical image screenshot and the second medical image screenshot (i.e., other shots than the first screenshot). This match score is calculated by comparing the similarity of the two shots on content, structure or other features. If the two shots are very similar or identical, their match scores will be high. The server sets a match score threshold for determining whether the two shots are similar enough that one of them can be considered redundant. If the matching score of the first medical image capture and any of the second medical image captures is equal to or greater than the threshold, the server marks the first medical image capture as a redundant medical image capture and discards it from the collection of at least two medical image captures. The purpose of this step is to reduce data redundancy and improve the efficiency of subsequent processing. After removing redundant shots, the server further processes the remaining shots (now called intermediate medical image shots). The purpose of this step is to identify and remove invalid data in these shots. Invalid data may be poor image quality, lost information, or misleading content due to errors, artifacts, or other causes in the image acquisition process. The server uses image processing and analysis algorithms to detect invalid data in the intermediate medical image capture. These algorithms may include noise detection, edge detection, region segmentation, etc. techniques for identifying abnormal regions or features in the image. 
Upon detecting invalid data, the server marks the affected areas as invalid local images. Finally, the server removes the invalid local images from the intermediate medical image screenshot. Removal may take the form of cropping, replacement or repair, depending on the type and severity of the invalid data. After this processing, the server obtains the final target medical image screenshots, from which redundant and invalid data have been removed and which can be used for subsequent medical image analysis, storage, retrieval and other operations.
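A minimal sketch of the two cleaning stages follows, under simplifying assumptions: screenshots are lists of pixel rows, the matching score is the fraction of identical pixels, and an "invalid local image" is a row made up entirely of a blank value. The function names and the 0.95 threshold are hypothetical. Redundancy is resolved greedily, so that one copy of each near-duplicate group survives.

```python
def pixel_match_score(a, b):
    """Fraction of positions with identical pixel values; 0.0 if shapes differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return 0.0
    total = sum(len(row) for row in a)
    if total == 0:
        return 0.0
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total

def drop_redundant(shots, threshold=0.95):
    """Keep a shot only if it scores below the threshold against every kept shot."""
    kept = []
    for shot in shots:
        if all(pixel_match_score(shot, other) < threshold for other in kept):
            kept.append(shot)
    return kept

def remove_blank_rows(shot, blank_value=0):
    """Crop rows that consist entirely of the blank value."""
    return [row for row in shot if any(p != blank_value for p in row)]

def clean_shots(shots, threshold=0.95):
    """Redundancy removal followed by invalid-region removal."""
    return [remove_blank_rows(s) for s in drop_redundant(shots, threshold)]
```

Comparing each shot only against the shots already kept avoids discarding both members of a duplicate pair, which a symmetric all-pairs check would do.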

In the embodiment of the present invention, the step of acquiring the medical image screenshot description corresponding to the target medical image screenshot may be implemented by the following example execution.

Acquiring the medical image description content configured for the medical image, and taking the medical image description content as a first medical image screenshot description;

analyzing physiological structure information in the target medical image screenshot to obtain a physiological structure description, and taking the physiological structure description as a second medical image screenshot description;

extracting diagnosis remark information from the target medical image screenshot, and taking the diagnosis remark information as a third medical image screenshot description;

and taking the first medical image screenshot description, the second medical image screenshot description and the third medical image screenshot description as the medical image screenshot description.

In an embodiment of the present invention, the server needs to generate a corresponding medical image screenshot description for the target medical image screenshot. This descriptive information is critical to subsequent medical image retrieval, analysis and understanding. The server generates a comprehensive and accurate description by combining the inherent description of the medical image, the physiological structure analysis, and the diagnostic remark information. The server first accesses a configuration file or database of the medical image and extracts from it the inherent description content associated with the medical image. Such content may include basic information such as image type, acquisition device information, patient position and scan area. This information constitutes the first medical image screenshot description and provides a basic framework for the descriptions that follow. Next, the server uses image processing and analysis techniques to analyze the target medical image screenshot in depth. By identifying the different tissues, organs and anatomical structures in the image, the server can generate detailed descriptions of these physiological structures. For example, it may identify the size, shape and location of the heart, together with the adjacent vessels and lungs. These physiological structure descriptions constitute the second medical image screenshot description and give the doctor or researcher detailed information about the patient's anatomy. In addition to the inherent description and the physiological structure analysis, the server also extracts diagnostic remark information from the target medical image screenshot. These remarks may be added by the physician when reviewing the images to record abnormal findings, suspected lesions, or areas that require further attention. 
The diagnosis remark information generally contains abundant clinical findings and expertise, and has important value for subsequent image interpretation and decision support. The server collates the diagnostic remark information into a third medical image screenshot description. Finally, the server integrates the first medical image screenshot description (inherent description), the second medical image screenshot description (physiological structure description) and the third medical image screenshot description (diagnosis remark information) comprehensively. By combining this information, the server generates a comprehensive and detailed description of the medical image screenshot. The description not only contains basic information and structural characteristics of the image, but also integrates professional insight and diagnosis information of doctors, and provides powerful support for subsequent medical image application.
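The combination of the three descriptions can be pictured with a small container type. The class name, field names and the " | " separator are assumptions for illustration; the description itself does not prescribe a serialization format.

```python
from dataclasses import dataclass

@dataclass
class ScreenshotDescription:
    """Hypothetical record holding the three partial descriptions."""
    inherent: str  # first description: configured image metadata
    anatomy: str   # second description: physiological structure analysis
    remarks: str   # third description: diagnosis remark information

    def as_text(self) -> str:
        """Join the non-empty parts into one description string."""
        parts = [self.inherent, self.anatomy, self.remarks]
        return " | ".join(p.strip() for p in parts if p and p.strip())
```

Empty parts are skipped so that a screenshot without physician remarks still yields a well-formed description.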

In the embodiment of the invention, the text description features comprise keyword features and medical record content features; the foregoing step of constructing a query identification content pair whose query features are literal description features from the medical image encoding and the pathology-associated content may be performed by the following example.

Performing text extraction processing on the pathology-associated content to obtain medical image keywords corresponding to the pathology-associated content, taking the medical image codes as query identifications, taking the medical image keywords as query contents, and constructing query identification content pairs with query characteristics being the keyword characteristics;

and obtaining a pathology associated content feature vector corresponding to the pathology associated content, taking the medical image code as a query identifier, taking the pathology associated content feature vector as query content, and constructing a query identifier content pair whose query feature is the medical record content feature.

In the embodiment of the invention, the server needs to process the text description information related to the medical image and, from this information, construct query identification content pairs of two different kinds: one based on keyword features and the other based on medical record content features. These query identification content pairs are used for subsequent medical image retrieval and analysis. The server first performs text extraction on the acquired pathology-associated content. The pathology-associated content may contain rich text information such as diagnostic reports, medical records and doctor notes. Through text extraction techniques, the server can identify and extract from it the keywords that are closely related to the medical image. These keywords may include disease names, symptom descriptions, anatomical sites, pathological changes and so on, and they constitute the medical image keywords. The server then takes the medical image code as the query identifier and the extracted medical image keywords as the query content, and constructs a query identification content pair whose query feature is the keyword feature. When a user or system needs to retrieve medical images by keyword, these query identification content pairs can be used to locate the relevant medical images quickly. The keyword features make the retrieval process more intuitive and efficient. In addition to the keyword features, the server also needs to process deeper features of the pathology-associated content. For this purpose, the server uses natural language processing (NLP) or deep learning techniques to analyze the pathology-associated content in depth and extract its intrinsic feature representation, i.e. the pathology-associated content feature vector. 
This feature vector is a point in high-dimensional space that contains information about the semantics, structure, and context of the original text, and is used to quantify the characteristics of the text content. And finally, the server takes the medical image code as a query identifier, takes the extracted pathological association content feature vector as query content, and constructs a query identifier content pair with query features being medical record content features. Unlike the query approach of keyword features, the query based on medical record content features is more focused on the overall understanding and matching of text content. This query approach provides greater accuracy and flexibility in processing complex, ambiguous query requests. Through the steps, the server successfully and tightly combines the medical image and the related text description information thereof, and provides powerful support for subsequent image retrieval and analysis.
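The two text-based pairs can be sketched as below. Everything here is a simplification for illustration: the stop-word list, the frequency-based keyword ranking and the bag-of-words vector are assumptions that stand in for whatever text extraction and NLP or deep learning model an actual system would use.

```python
import re
from collections import Counter

# Assumption: a tiny stop-word list; a real system would use a medical lexicon.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "with", "is", "to"}

def extract_keywords(text, top_k=5):
    """Most frequent non-stop-word tokens, ties broken alphabetically."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    counts = Counter(tokens)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]

def bag_of_words_vector(text, vocabulary):
    """Term counts over a fixed vocabulary, standing in for a learned text embedding."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return [counts[word] for word in vocabulary]

def build_text_pairs(image_code, pathology_text, vocabulary):
    """Return the keyword pair and the medical-record-content pair for one image code."""
    keyword_pair = (image_code, extract_keywords(pathology_text))
    record_pair = (image_code, bag_of_words_vector(pathology_text, vocabulary))
    return keyword_pair, record_pair
```

Both pairs use the image code as the identifier, so a keyword hit and a vector hit resolve to the same stored screenshot.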

In the embodiment of the invention, the demand pathological image content feature vector comprises a demand key content feature vector, a demand global content feature vector and a demand detail content feature vector; the step of obtaining the content feature vector of the required pathological image corresponding to the target medical image screenshot and the visual feature vector of the required pathological image corresponding to the target medical image screenshot can be implemented through the following example implementation.

Loading the target medical image screenshot to a pathology image recognition model, and obtaining screenshot category information of the target medical image screenshot by using the pathology image recognition model;

Loading the target medical image screenshot to a key feature extraction model with association relation with the screenshot category information, and extracting key content features of the target medical image screenshot by using the key feature extraction model to obtain the required key content feature vector;

loading the target medical image screenshot to a detail feature extraction model with association relation with the screenshot category information, and extracting detail content features of the target medical image screenshot by using the detail feature extraction model to obtain the required detail content feature vector;

loading the target medical image screenshot to a global feature extraction model with association relation with the screenshot category information, and extracting global content features of the target medical image screenshot by using the global feature extraction model to obtain the required global content feature vector;

Loading the target medical image screenshot to a visual feature extraction model with association relation with the screenshot category information, extracting tissue structural features of the target medical image screenshot and lesion features of the target medical image screenshot by using the visual feature extraction model, and performing depth feature fusion operation on the tissue structural features of the target medical image screenshot and the lesion features of the target medical image screenshot to obtain the visual feature vector of the required pathological image.

In the embodiment of the invention, by way of example, the server needs to process the target medical image screenshot and extract several feature vectors from it: the required key content feature vector, the required global content feature vector, the required detail content feature vector and the required pathological image visual feature vector. These feature vectors are used for subsequent image analysis and understanding. The server first loads the target medical image screenshot into a pre-trained pathological image recognition model. Having been trained on a large amount of medical image data, the model can identify the category information of the screenshot, such as lesion type and tissue position. Through this step, the server obtains the screenshot category information of the target medical image screenshot. The server then loads the target medical image screenshot into the key feature extraction model associated with that screenshot category information. This model focuses on extracting key content features of the image, such as important lesion areas and key anatomy. Through this step, the server obtains the required key content feature vector, which reflects the most important and representative information in the image. In addition to the key content features, the server also needs to attend to detail information in the image. It therefore loads the target medical image screenshot into the detail feature extraction model associated with the screenshot category information. This model can capture subtle changes and fine structures in the image, such as cell morphology and texture changes. Through this step, the server obtains the required detail content feature vector, which provides a basis for subsequent fine-grained analysis. 
In order to fully understand the content of the medical image screenshot, the server also needs to extract global content features. It therefore loads the target medical image screenshot into the global feature extraction model associated with the screenshot category information. This model grasps the structure and layout of the image as a whole and extracts features that reflect the global information of the image. Through this step, the server obtains the required global content feature vector, which helps capture the overall characteristics and context of the image. Finally, the server loads the target medical image screenshot into the visual feature extraction model associated with the screenshot category information. This model extracts the tissue structural features and lesion features of the image, which reflect its visual properties and lesion appearance. To make full use of these features, the server performs a depth feature fusion operation on the tissue structural features and the lesion features, fusing them into a more comprehensive and representative feature vector, namely the required pathological image visual feature vector. This vector integrates the various visual information of the image and supports subsequent image analysis and identification.
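The category-dependent routing can be sketched as a lookup from screenshot category to a set of extractors. The toy extractors below (maximum, mean and range of pixel values) are placeholders for the trained models, and the names `REGISTRY`, `extract_content_features` and the `"lung_ct"` category label are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[float]]
Extractor = Callable[[Image], List[float]]

def _pixels(image: Image) -> List[float]:
    return [p for row in image for p in row]

# Toy placeholders for the trained key, global and detail extraction models.
def toy_key(image: Image) -> List[float]:
    return [max(_pixels(image))]

def toy_global(image: Image) -> List[float]:
    px = _pixels(image)
    return [sum(px) / len(px)]

def toy_detail(image: Image) -> List[float]:
    px = _pixels(image)
    return [max(px) - min(px)]

# Assumption: one set of extractors is registered per screenshot category.
REGISTRY: Dict[str, Dict[str, Extractor]] = {
    "lung_ct": {"key": toy_key, "global": toy_global, "detail": toy_detail},
}

def extract_content_features(
    image: Image,
    classify: Callable[[Image], str],
    registry: Dict[str, Dict[str, Extractor]] = REGISTRY,
) -> Tuple[str, Dict[str, List[float]]]:
    """Classify the screenshot, then run the extractors registered for its category."""
    category = classify(image)
    if category not in registry:
        raise KeyError(f"no extractors registered for category {category!r}")
    features = {name: fn(image) for name, fn in registry[category].items()}
    return category, features
```

Passing the classifier in as a function keeps the routing logic independent of any particular recognition model.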

In a more detailed embodiment, the server processes a target medical image screenshot that is a CT scan of the lung showing a lesion area suspected of being lung cancer. The goal of the server is to extract the various feature vectors of this screenshot for subsequent image analysis and understanding. The server first loads the lung CT scan image into a pre-trained pathology image recognition model. Having been trained on a large amount of lung CT image data, the model can identify the category information of the image. In this example, the model identifies that the screenshot belongs to the category "lung CT image, suspected lung cancer". The server then loads the lung CT scan image into the key feature extraction model associated with this screenshot category information. This model focuses on extracting key content features of the image, such as the location, size, and shape of the lesion area. In this example, the key feature extraction model extracts the location and size information of the suspected lung cancer lesion area and generates the required key content feature vector.

Key content feature vector extraction:

Input: a lung CT scan image showing a suspected lung cancer lesion area.

Processing: the key feature extraction model identifies and extracts key features of the lesion area, such as its position (e.g., coordinate information), size (e.g., diameter and area) and shape (e.g., circular or irregular).

Output: the required key content feature vector, which contains key information such as the position, size and shape of the lesion area.

In addition to the key content features, the server also needs to focus on detailed information in the image. In this example, the detail feature extraction model can capture subtle changes and detail structures of the suspected lung cancer lesion area, such as texture, edge features, and internal density changes of the lesion area. This detailed information is critical to the diagnosis of the physician and subsequent treatment planning. Through this step of processing, the server obtains the demand detail content feature vector.

Detail content feature vector extraction:

Input: the suspected lung cancer lesion area of the lung CT scan image.

Processing: the detail feature extraction model analyzes detail information of the lesion area, such as texture features (e.g., roughness and smoothness), edge features (e.g., sharpness and blurriness) and density changes (e.g., uniform or non-uniform).

Output: the required detail content feature vector, which contains detail information such as the texture, edges and density of the lesion area.

In order to fully understand the content of the medical image shots, the server also needs to extract global content features. In this example, the global feature extraction model can grasp the structure and layout of the lung CT scan image as a whole, and extract features reflecting global information of the image, such as the overall morphology, size, structure, and the like of the lung. This global information aids the physician in locating and qualitatively analyzing the lesion area. Through this step of processing, the server obtains the required global content feature vector.

Global content feature vector extraction:

Input: the complete lung CT scan image.

Processing: the global feature extraction model analyzes global characteristics of the whole lung, such as its shape (e.g., normal or abnormal), size (e.g., lung capacity) and structure (e.g., bronchial distribution and vascularity).

Output: the required global content feature vector, which contains global information such as the overall shape, size and structure of the lung.

Finally, the server loads the lung CT scan image into the visual feature extraction model associated with the screenshot category information. This model extracts the tissue structural features and lesion features of the image, such as the normal tissue structure of the lung and the lesion manifestation of the suspected lung cancer. To make full use of these features, the server performs a depth feature fusion operation on the tissue structural features and the lesion features, fusing them into a more comprehensive and representative feature vector, namely the required pathological image visual feature vector. This vector integrates the various visual information of the image and supports subsequent image analysis and identification.

Visual feature vector extraction and depth feature fusion:

Input: the lung CT scan image and the suspected lung cancer lesion area.

Processing 1: a visual feature extraction model extracts the normal tissue structure features of the lung (e.g., bronchial morphology and lung parenchyma texture) and the lesion features of the suspected lung cancer (e.g., the morphology, density, and edge features of the lesion region).

Processing 2: a depth feature fusion operation is performed on the extracted tissue structure features and lesion features, for example feature fusion and dimensionality reduction through a neural network model, to generate a comprehensive feature vector.

Output: the required pathological image visual feature vector, containing comprehensive information such as the normal tissue structure of the lung and the lesion features of the suspected lung cancer.
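The fusion step can be sketched as follows. The embodiment only states that fusion and dimensionality reduction are performed through a neural network model, so this example assumes the simplest such arrangement: the two feature vectors are concatenated and passed through a single dense layer with a ReLU activation whose output is shorter than its input. The function name `fuse_features` and the weight and bias values are hypothetical; in practice they would be learned.

```python
from typing import List, Sequence


def fuse_features(
    tissue: Sequence[float],
    lesion: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[float]:
    """Concatenate two feature vectors, then apply one dense layer with ReLU.

    `weights` has one row per output dimension; using fewer rows than the
    length of the concatenated vector gives the dimensionality reduction.
    """
    combined = list(tissue) + list(lesion)
    if len(weights) != len(bias) or any(len(row) != len(combined) for row in weights):
        raise ValueError("weight/bias shapes do not match the concatenated vector")

    fused: List[float] = []
    for row, b in zip(weights, bias):
        activation = sum(w * x for w, x in zip(row, combined)) + b
        fused.append(max(0.0, activation))  # ReLU
    return fused


# Illustrative values only: 2 tissue features + 2 lesion features -> 2 fused features.
tissue_features = [1.0, 2.0]
lesion_features = [3.0, 4.0]
layer_weights = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
layer_bias = [0.0, 0.5]
print(fuse_features(tissue_features, lesion_features, layer_weights, layer_bias))
```

Concatenation followed by a learned projection is one common fusion choice; the source does not say which fusion architecture is actually used.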

An embodiment of the present invention provides a computer device 100. The computer device 100 includes a processor and a nonvolatile memory storing computer instructions; when the computer instructions are executed by the processor, the computer device 100 executes the aforementioned medical image retrieval method based on deep learning. Fig. 2 is a block diagram of the computer device 100 according to an embodiment of the present invention. The computer device 100, which implements the medical image retrieval method based on deep learning, comprises a memory 111, a processor 112 and a communication unit 113. For data transmission or interaction, the memory 111, the processor 112 and the communication unit 113 are electrically connected to each other directly or indirectly. For example, these elements may be electrically connected to each other via one or more communication buses or signal lines.

The foregoing description, for purpose of explanation, has been presented with reference to particular embodiments. The illustrative discussions above are not intended to be exhaustive or to limit the disclosure to the precise forms disclosed. Many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the disclosure and its practical application, to thereby enable others skilled in the art to best utilize the disclosure and various embodiments with various modifications as are suited to the particular use contemplated.

Claims

1. A medical image retrieval method based on deep learning, characterized by comprising the following steps:

2. The method of claim 1, wherein extracting the basic medical image data associated with the current patient medical record from the medical image information blockchain comprises:

3. The method of claim 1, wherein extracting the basic medical image data associated with the current patient medical record from the medical image information blockchain comprises:

4. The method according to claim 1, wherein said obtaining a current pathology image visual feature vector corresponding to the current pathology image and a current pathology image content feature vector corresponding to the current pathology image comprises:

extracting a pathological content recognition model associated with the pathological image recognition result and a pathological visual feature recognition model associated with the pathological image recognition result; the pathological content recognition model comprises a key pathological content recognition model, a detail pathological content recognition model and a global pathological content recognition model;

taking the key content feature vector, the global content feature vector and the detail content feature vector as the current pathological image content feature vector;

5. The method of claim 1, wherein extracting, from the medical image information blockchain, advanced medical image data associated with the current pathology image content feature vector, comprises:

acquiring, from the medical image information blockchain, a query identification content pair whose query characteristic is a content feature; the query identification content pair comprises a query identification determined by the medical image code and query content determined by the demand pathological image content feature vector; the current pathological image content feature vector comprises a key content feature vector, a global content feature vector and a detail content feature vector; the demand pathological image content feature vector comprises a demand key content feature vector, a demand global content feature vector and a demand detail content feature vector; the query identification content pair comprises a query key identification content pair, a query global identification content pair and a query detail identification content pair; the query key identification content pair comprises a query key identification determined by the medical image code and query key content determined by the demand key content feature vector; the query detail identification content pair comprises a query detail identification determined by the medical image code and query detail content determined by the demand detail content feature vector; the query global identification content pair comprises a query global identification determined by the medical image code and query global content determined by the demand global content feature vector;

6. The method according to claim 1, wherein the acquiring of the target demand pathological image visual feature vector corresponding to the pending medical image data, the target demand pathological image content feature vector corresponding to the pending medical image data, and the target pathology associated content corresponding to the pending medical image data comprises:

7. The method of claim 1, wherein the total number of pending medical image data is at least two, the at least two pending medical image data comprising first pending medical image data; the target pathology associated content comprises first target pathology associated content corresponding to the first pending medical image data; the target demand pathological image content feature vector comprises a first target demand pathological image content feature vector corresponding to the first pending medical image data; the target demand pathological image visual feature vector comprises a first target demand pathological image visual feature vector corresponding to the first pending medical image data;

the determining of a demand matching coefficient between the pending medical image data and the current patient matching request based on the current patient medical record, the current pathological image content feature vector, the current pathological image visual feature vector, the target pathology associated content, the target demand pathological image content feature vector and the target demand pathological image visual feature vector, and the sorting of the pending medical image data by matching degree according to the demand matching coefficient, comprises:

obtaining a matching score between the current patient medical record and the first target pathology associated content, and taking the matching score as the text matching score corresponding to the first pending medical image data; the current pathological image content feature vector comprises a key content feature vector, a global content feature vector and a detail content feature vector; the first target demand pathological image content feature vector comprises a first target demand key content feature vector, a first target demand global content feature vector and a first target demand detail content feature vector;

taking the key content matching score, the global content matching score and the detail content matching score as the content matching score;

8. The method according to claim 1, wherein the method further comprises:

acquiring a medical image, and extracting at least two medical image screenshots from the medical image, wherein the at least two medical image screenshots comprise a first medical image screenshot;

performing invalid data judgment processing on the intermediate medical image screenshot to obtain an invalid local image of the intermediate medical image screenshot, and removing the invalid local image from the intermediate medical image screenshot to obtain the target medical image screenshot;

taking the first medical image screenshot description, the second medical image screenshot description and the third medical image screenshot description as the medical image screenshot description, taking the medical image screenshot description as the pathology associated content, constructing, according to the medical image code and the pathology associated content, a query identification content pair whose query characteristic is a text description feature, and storing that query identification content pair in the medical image information blockchain;

9. The method of claim 8, wherein the text description feature includes a keyword feature and a medical record content feature;

the constructing, according to the medical image code and the pathology associated content, of the query identification content pair whose query characteristic is the text description feature comprises:

obtaining a pathology associated content feature vector corresponding to the pathology associated content, taking the medical image code as the query identification, taking the pathology associated content feature vector as the query content, and constructing a query identification content pair whose query characteristic is the medical record content feature;

the demand pathology image content feature vector comprises a demand key content feature vector, a demand global content feature vector and a demand detail content feature vector;

the obtaining of the demand pathological image content feature vector corresponding to the target medical image screenshot and the demand pathological image visual feature vector corresponding to the target medical image screenshot comprises:

10. A server system comprising a server for performing the method of any of claims 1-9.
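As an illustration of the scoring and sorting recited in claim 7, the sketch below computes a text matching score, a content matching score (from key, global and detail content feature vectors) and a visual matching score for each pending medical image data item, combines them into a demand matching coefficient, and sorts the items by that coefficient. The claims as reproduced here do not state how the scores are computed or combined, so the sketch assumes Jaccard token overlap for text, cosine similarity for vectors, and an equally weighted sum; the names `Candidate`, `matching_coefficient` and `rank_candidates` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def jaccard(a: str, b: str) -> float:
    """Token-overlap score between two texts (assumed text matching measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


@dataclass
class Candidate:
    """One pending medical image data item (hypothetical structure)."""
    image_code: str
    associated_content: str                      # target pathology associated content
    content_vectors: Dict[str, Sequence[float]]  # keys: "key", "global", "detail"
    visual_vector: Sequence[float]


def matching_coefficient(
    record: str,
    content_vectors: Dict[str, Sequence[float]],
    visual_vector: Sequence[float],
    cand: Candidate,
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),  # assumed weights
) -> float:
    text_score = jaccard(record, cand.associated_content)
    parts = ("key", "global", "detail")
    content_score = sum(
        cosine(content_vectors[p], cand.content_vectors[p]) for p in parts
    ) / len(parts)
    visual_score = cosine(visual_vector, cand.visual_vector)
    w_text, w_content, w_visual = weights
    return w_text * text_score + w_content * content_score + w_visual * visual_score


def rank_candidates(
    record: str,
    content_vectors: Dict[str, Sequence[float]],
    visual_vector: Sequence[float],
    candidates: List[Candidate],
) -> List[Candidate]:
    """Sort candidates from highest to lowest demand matching coefficient."""
    return sorted(
        candidates,
        key=lambda c: matching_coefficient(record, content_vectors, visual_vector, c),
        reverse=True,
    )
```

Under these assumptions the target medical image data would be taken from the head of the list returned by `rank_candidates`; a different similarity measure or weighting would change only `matching_coefficient`.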