Disclosure of Invention
In view of this, the present application provides an intelligent matching method and apparatus for patient recruitment projects, which can be used to solve the technical problems that, at present, manually matching patients to recruitment projects makes processing patient leads time-consuming and costly, and that some complicated patient materials cannot be identified effectively and accurately by manual work.
According to one aspect of the application, an intelligent matching method for patient recruitment projects is provided, the method comprising the following steps:
acquiring disease material information of a subject patient, and extracting text features and image features corresponding to the disease material information;
screening, based on medical knowledge rules, candidate recruitment projects matched with the subject patient according to the text features;
and inputting multi-modal features, obtained by feature fusion of the text features and the image features, together with the candidate recruitment projects into a trained patient recruitment project prediction model to obtain a target recruitment project matched with the subject patient.
Preferably, the disease material information includes first text information and image information, and the extracting of the text features and the image features corresponding to the disease material information includes:
converting the image information into second text information by using optical character recognition, and extracting the text features corresponding to the disease material information from the first text information and the second text information according to preset keywords;
and extracting, by using a scale-invariant feature transform technique, image features in the image information that are invariant to scaling, rotation, and brightness changes.
Preferably, the screening, based on medical knowledge rules, of candidate recruitment projects matched with the subject patient according to the text features comprises:
acquiring a medical knowledge graph created in advance according to the medical knowledge rules;
and screening, in the medical knowledge graph, candidate recruitment projects matched with the subject patient according to the text features.
Preferably, the medical knowledge graph includes preset recruitment projects and project features corresponding to the preset recruitment projects, and the screening of candidate recruitment projects matched with the subject patient in the medical knowledge graph according to the text features includes:
calculating feature similarities between the text features and the project features corresponding to each preset recruitment project;
and determining the preset recruitment projects whose feature similarity is greater than a preset similarity threshold as the candidate recruitment projects matched with the subject patient.
Preferably, the method further comprises:
extracting, according to the medical knowledge rules, project features matched with each preset recruitment project;
generating a first feature tag for each preset recruitment project, a second feature tag corresponding to each project feature matched with each preset recruitment project, and a matching mapping relation between the first feature tags and the second feature tags;
and creating a medical knowledge graph comprising the first feature tags, the second feature tags, and the matching mapping relations.
Preferably, before the multi-modal features obtained by feature fusion of the text features and the image features, together with the candidate recruitment projects, are input into the trained patient recruitment project prediction model to obtain the target recruitment project matched with the subject patient, the method further includes:
acquiring sample data of successfully matched recruitment projects, and dividing the sample data into a training set and a test set, wherein the sample data is configured with target recruitment project tags indicating successful matches;
and iteratively training the patient recruitment project prediction model by using the training set until the prediction accuracy of the model, as verified on the test set, is greater than a preset threshold, at which point the training of the patient recruitment project prediction model is judged to be complete.
Preferably, the inputting of the multi-modal features obtained by feature fusion of the text features and the image features, together with the candidate recruitment projects, into the trained patient recruitment project prediction model to obtain the target recruitment project matched with the subject patient includes:
performing feature fusion on the text features and the image features to obtain the multi-modal features, inputting the multi-modal features and the candidate recruitment projects into the trained patient recruitment project prediction model, and acquiring the prediction scores output by the model for the candidate recruitment projects;
and determining the candidate recruitment project with the highest prediction score as the target recruitment project matched with the subject patient.
According to another aspect of the present application, an intelligent matching apparatus for patient recruitment projects is provided, comprising:
a first extraction module, configured to acquire disease material information of a subject patient and extract text features and image features corresponding to the disease material information;
a screening module, configured to screen, based on medical knowledge rules, candidate recruitment projects matched with the subject patient according to the text features;
and an input module, configured to input multi-modal features, obtained by feature fusion of the text features and the image features, together with the candidate recruitment projects into a trained patient recruitment project prediction model to obtain a target recruitment project matched with the subject patient.
Preferably, the disease material information includes first text information and image information, and the first extraction module includes: a first extraction unit and a second extraction unit;
the first extraction unit is configured to convert the image information into second text information by using optical character recognition, and to extract the text features corresponding to the disease material information from the first text information and the second text information according to preset keywords;
the second extraction unit is configured to extract, by using a scale-invariant feature transform technique, image features in the image information that are invariant to scaling, rotation, and brightness changes.
Preferably, the screening module comprises: an acquisition unit and a screening unit;
the acquisition unit is configured to acquire a medical knowledge graph created in advance according to the medical knowledge rules;
the screening unit is configured to screen, in the medical knowledge graph, candidate recruitment projects matched with the subject patient according to the text features.
Preferably, the medical knowledge graph includes preset recruitment projects and project features corresponding to the preset recruitment projects, and the screening unit is specifically configured to calculate feature similarities between the text features and the project features corresponding to each preset recruitment project, and to determine the preset recruitment projects whose feature similarity is greater than a preset similarity threshold as the candidate recruitment projects matched with the subject patient.
Preferably, the apparatus further comprises: a second extraction module, a generation module, and a creation module;
the second extraction module is configured to extract, according to the medical knowledge rules, project features matched with each preset recruitment project;
the generation module is configured to generate a first feature tag for each preset recruitment project, a second feature tag corresponding to each project feature matched with each preset recruitment project, and a matching mapping relation between the first feature tags and the second feature tags;
the creation module is configured to create a medical knowledge graph containing the first feature tags, the second feature tags, and the matching mapping relations.
Preferably, the apparatus further comprises: a training module;
the training module is configured to acquire sample data of successfully matched recruitment projects and divide the sample data into a training set and a test set, wherein the sample data is configured with target recruitment project tags indicating successful matches; and to iteratively train the patient recruitment project prediction model by using the training set until the prediction accuracy of the model, as verified on the test set, is greater than a preset threshold, at which point the training of the patient recruitment project prediction model is judged to be complete.
Preferably, the input module includes: an input unit and a determination unit;
the input unit is configured to input the multi-modal features, obtained by feature fusion of the text features and the image features, together with the candidate recruitment projects into the trained patient recruitment project prediction model, and to acquire the prediction scores output by the model for each candidate recruitment project;
the determination unit is configured to determine the candidate recruitment project with the highest prediction score as the target recruitment project matched with the subject patient.
According to yet another aspect of the present application, a storage medium is provided, on which a computer program is stored, the program, when executed by a processor, implementing the above-described intelligent matching method for patient recruitment projects.
According to a further aspect of the present application, a computer device is provided, which includes a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, the processor implementing the above-described intelligent matching method for patient recruitment projects when executing the program.
The present application provides an intelligent matching method and apparatus for patient recruitment projects. After the disease material information of a subject patient is acquired, text features corresponding to the disease material information are extracted by using optical character recognition, and image features corresponding to the disease material information are extracted by using a scale-invariant feature transform technique; candidate recruitment projects matched with the subject patient are then screened according to the text features based on medical knowledge rules; finally, the text features and the image features are fused into multi-modal features, which are input together with the candidate recruitment projects into the trained patient recruitment project prediction model to obtain the target recruitment project matched with the subject patient. By applying optical character recognition, the scale-invariant feature transform, and a model-based recommendation algorithm to patient recruitment, the technical scheme matches patients with recruitment projects at multiple levels, so that matching is intelligent and automatic, labor and material costs are reduced, the matching efficiency of patient recruitment projects is improved, and recognition accuracy is effectively guaranteed.
The technical solution of the present invention is described in further detail below with reference to the accompanying drawings and embodiments.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.
Meanwhile, it should be understood that, for convenience of description, the respective portions shown in the drawings are not drawn to scale.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail, but are intended to be part of the specification where appropriate.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
Embodiments of the invention are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the computer system/server include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above, and the like.
The computer system/server may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
An embodiment of the invention provides an intelligent matching method for patient recruitment projects, comprising the following steps:
101. Acquiring the disease material information of the subject patient, and extracting the text features and image features corresponding to the disease material information.
The subject patient is a clinical trial patient awaiting patient recruitment project matching verification. The disease material information comprises basic information and medical record information of the subject patient. The basic information includes: disease name, subtype, sex, age, patient identification number, level of daily activity, and the like. The medical record information includes: treatments received, treatment dates, medications used, treatment assay information, disease condition information, complication information, pathology reports, CT reports, imaging reports, disease diagnoses, and the like. In a specific application scenario, the disease material information of the subject patient may be acquired from a medical platform or collected by means of a questionnaire. Because the data contained in the disease material information correspond to different data dimensions, feature extraction yields text features and image features that are representative and have uniform dimensions, so that these features can be used for rapid matching verification of patient recruitment projects.
The execution subject of the present application may be a system supporting patient recruitment project matching, which may be deployed on a client or a server. After the disease material information of a subject patient is acquired, text features and image features corresponding to the disease material information are extracted; candidate recruitment projects matched with the subject patient are then screened according to the text features based on medical knowledge rules; finally, the text features and the image features are fused into multi-modal features, which are input together with the candidate recruitment projects into the trained patient recruitment project prediction model to obtain the target recruitment project matched with the subject patient.
102. Screening, based on the medical knowledge rules, candidate recruitment projects matched with the subject patient according to the text features.
In a specific application scenario, different recruitment projects place different requirements on a patient's condition. For example, the recruitment project for a certain experimental drug may require that the patient has not used gemcitabine-containing drugs in previous treatment, or may allow hepatitis B only if the hepatitis B quantification does not exceed 2000 IU/mL. For this reason, medical knowledge rules may be created in advance according to the patient requirements of the different preset recruitment projects. After the text features of the disease materials of the subject patient are extracted, recruitment projects irrelevant to the text features (such as the disease type, medication history, number of lines of treatment, and disease gene type of the subject patient) can be filtered out of the preset recruitment projects according to the medical knowledge rules, and candidate recruitment projects highly relevant to the text features and image features are retained, so that the target recruitment project matched with the subject patient can be rapidly screened from the candidate recruitment projects.
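The rule-based screening described above can be sketched as follows. This is an illustrative Python sketch rather than the claimed implementation: the project names are hypothetical, and the two example rules from this paragraph (no prior gemcitabine use; hepatitis B quantification not above 2000 IU/mL) are encoded as predicates.

```python
def no_gemcitabine(patient):
    # Example rule: the patient must not have used gemcitabine-containing drugs.
    return "gemcitabine" not in patient["medication_history"]

def hepatitis_b_ok(patient):
    # Example rule: hepatitis B is tolerated only if quantification <= 2000 IU/mL.
    return patient.get("hbv_dna_iu_ml", 0) <= 2000

# Hypothetical preset recruitment projects, each with its medical knowledge rules.
PRESET_PROJECTS = {
    "trial_drug_A": [no_gemcitabine],
    "trial_drug_B": [no_gemcitabine, hepatitis_b_ok],
}

def screen_candidates(patient):
    """Keep only the preset recruitment projects whose rules the patient satisfies."""
    return [name for name, rules in PRESET_PROJECTS.items()
            if all(rule(patient) for rule in rules)]

patient = {"medication_history": ["cisplatin"], "hbv_dna_iu_ml": 1500}
print(screen_candidates(patient))  # ['trial_drug_A', 'trial_drug_B']
```

Encoding each project's requirements as a list of predicates keeps the medical knowledge rules declarative, so new projects or rules can be added without touching the screening logic.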
103. Inputting the candidate recruitment projects into the trained patient recruitment project prediction model to obtain the target recruitment project matched with the subject patient.
The patient recruitment project prediction model may specifically adopt an extreme deep factorization machine (xDeepFM), which combines a linear part, a compressed interaction network (CIN), and a deep neural network (DNN), giving the model both memorization and generalization capability. Its memorization capability comes from the weighted summation of first-order features in the linear part, and its generalization capability is realized by the fully connected deep neural network (DNN).
In a specific application scenario, before the steps of this embodiment are executed, the patient recruitment project prediction model needs to be pre-trained until it is determined that the model meets a preset training standard; the trained model is then used for matching evaluation of the target recruitment project. Accordingly, creating and training the patient recruitment project prediction model may specifically include: acquiring sample data of successfully matched recruitment projects, and dividing the sample data into a training set and a test set, wherein the sample data is configured with target recruitment project tags indicating successful matches; and iteratively training the model by using the training set until the prediction accuracy of the model, as verified on the test set, is greater than a preset threshold, at which point the training is judged to be complete. Specifically, after the score prediction results of the model on the test set are obtained, the prediction accuracy is calculated from the score prediction results and the target recruitment project tags, and when the prediction accuracy is judged to be greater than the preset threshold, the model is judged to be fully trained. The sample data is a data set of patients successfully matched with projects and comprises the text features and image features of each successfully matched patient; specifically, the text features and image features of a successfully matched patient, together with the successfully matched target recruitment project, may form one record used as a positive sample to train the model.
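The train-until-threshold loop described here can be sketched as follows. The one-dimensional sample data and the trivial one-parameter "model" are stand-ins invented purely for illustration; in the actual scheme the model is the patient recruitment project prediction model (xDeepFM).

```python
import random

random.seed(7)
# Hypothetical sample data: (feature, tag) pairs, where tag 1 marks a
# successfully matched target recruitment project. True boundary: feature > 0.5.
samples = [(x, int(x > 0.5)) for x in (random.random() for _ in range(100))]
train_set, test_set = samples[:80], samples[80:]

def accuracy(param, data):
    # Prediction accuracy of the one-parameter model "predict 1 if x > param".
    return sum(int(x > param) == tag for x, tag in data) / len(data)

def train(train_set, test_set, preset_threshold=0.95):
    # Iterate over candidate parameters until the test-set accuracy exceeds
    # the preset threshold, mirroring the iterative training loop above.
    for param in (i / 100 for i in range(100)):
        if accuracy(param, train_set) == 1.0 and accuracy(param, test_set) > preset_threshold:
            return param
    raise RuntimeError("no parameter reached the preset accuracy threshold")

model = train(train_set, test_set)
print(accuracy(model, test_set) > 0.95)  # True: training judged complete
```

The structure (fit on the training set, stop only when test-set accuracy clears a preset threshold) is the point of the sketch; a real pipeline would replace the parameter sweep with gradient-based training epochs.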
The preset threshold is a value greater than 0 and less than 1; the closer the preset threshold is to 1, the higher the required prediction accuracy of the patient recruitment project prediction model. The specific value of the preset threshold can be set according to the actual application scenario and is not specifically limited here.
According to the intelligent matching method for patient recruitment projects provided by this embodiment, after the disease material information of a subject patient is acquired, text features corresponding to the disease material information are extracted by using optical character recognition, and image features corresponding to the disease material information are extracted by using a scale-invariant feature transform technique; candidate recruitment projects matched with the subject patient are then screened according to the text features based on medical knowledge rules; finally, the text features and the image features are fused into multi-modal features, which are input together with the candidate recruitment projects into the trained patient recruitment project prediction model to obtain the target recruitment project matched with the subject patient. By applying optical character recognition, the scale-invariant feature transform, and a model-based recommendation algorithm to patient recruitment, the technical scheme matches patients with recruitment projects at multiple levels, so that matching is intelligent and automatic, labor and material costs are reduced, the matching efficiency of patient recruitment projects is improved, and recognition accuracy is effectively guaranteed.
Further, in order to better explain the matching process for clinical drug trial patients, as a refinement and extension of the above embodiment, another intelligent matching method for patient recruitment projects is provided in an embodiment of the present invention. As shown in fig. 2, the method includes:
201. Acquiring disease material information of a subject patient, wherein the disease material information includes first text information and image information.
In a specific application scenario, the first text information may include basic information and medical record information of the subject patient. The basic information includes: disease name, subtype, sex, age, patient identification number, level of daily activity, and the like. The medical record information includes text information such as treatments received, treatment dates, medications used, treatment test information, disease condition information, complication information, and disease diagnosis results. The image information may include medical pictures such as pathology image-text reports, CT reports, and imaging reports.
For this embodiment, the disease material information of the subject patient may be acquired automatically from the medical management system and user registration information according to a preset time period, or collected from the subject patient by means of a questionnaire, so that multi-modal feature analysis of the text-feature dimension and the image-feature dimension can be performed on the disease material information through steps 202 to 206 and the recruitment project matching result can be determined.
202. Converting the image information into second text information by using optical character recognition, and extracting the text features corresponding to the disease material information from the first text information and the second text information according to the preset keywords.
For this embodiment, converting the image information into the second text information by using optical character recognition may specifically include the following steps: 1. Preprocessing the medical pictures in the disease material information, including binarization, denoising, skew correction, and the like. 2. Layout analysis: segmenting the document to be recognized into rows for processing. 3. Character segmentation: locating the boundaries of each character string, cutting it into individual characters, and passing the cut characters on for recognition. 4. Character feature extraction: extracting the character features required as the basis for later recognition. 5. Character recognition: performing coarse template classification and fine template matching between the feature vector extracted from the current character and a feature template library to identify the character. 6. Page restoration: typesetting the recognition results according to the original layout and outputting a document in Word or PDF format, thereby obtaining the second text information of the patient's disease materials.
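The first two stages (preprocessing by binarization, then layout analysis into rows) can be illustrated on a toy grayscale image. A production system would delegate all six stages to an OCR engine, so the sketch below is illustrative only, with an invented 6x6 image standing in for a scanned medical report.

```python
def binarize(image, threshold=128):
    # Stage 1 (preprocessing): map each grayscale pixel to 1 (ink) or 0 (background).
    return [[1 if px < threshold else 0 for px in row] for row in image]

def segment_lines(binary):
    # Stage 2 (layout analysis): group consecutive rows containing ink into
    # text lines, using a horizontal projection profile.
    lines, start = [], None
    for i, row in enumerate(binary):
        has_ink = any(row)
        if has_ink and start is None:
            start = i
        elif not has_ink and start is not None:
            lines.append((start, i - 1))
            start = None
    if start is not None:
        lines.append((start, len(binary) - 1))
    return lines

# Toy 6x6 image: two dark "text" bands separated by a blank gap.
image = [
    [20, 20, 20, 200, 20, 20],
    [20, 200, 20, 20, 20, 20],
    [200, 200, 200, 200, 200, 200],
    [200, 200, 200, 200, 200, 200],
    [20, 20, 200, 20, 20, 20],
    [20, 20, 20, 20, 200, 20],
]
print(segment_lines(binarize(image)))  # [(0, 1), (4, 5)]
```

Each returned pair is the (first, last) row index of one text line; the later stages would then cut each line into characters and match them against a template library.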
Correspondingly, when the text features corresponding to the disease material information are extracted from the first text information and the second text information, corresponding preset keywords may be configured in advance for each data type. For example, for the extraction of text features of medical record information, the preset keywords may be configured to include: "diagnosis result", "diagnosis item", "drug name", "treatment item", and the like; preset keywords may likewise be configured for the basic information. Through keyword matching, the text strings under each data type can be extracted from the first text information and the second text information; the string features corresponding to the text strings of each data type are then determined by using a text feature extraction technique, and the text features corresponding to the disease material information are obtained by concatenating these string features. The text feature extraction technique may include one-hot encoding, term frequency-inverse document frequency (TF-IDF), and the like, which are not limited here.
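Keyword-driven extraction can be sketched as below. The "keyword: value" record format and the field values are hypothetical assumptions for the sketch; the keywords follow the examples given in this paragraph.

```python
import re

# Preset keywords for medical record information, as in the examples above.
KEYWORDS = ["diagnosis result", "drug name", "treatment item"]

def extract_fields(*texts):
    """Pull 'keyword: value' strings out of the first and second text information."""
    fields = {}
    for text in texts:
        for kw in KEYWORDS:
            m = re.search(rf"{kw}\s*:\s*([^;\n]+)", text, re.IGNORECASE)
            if m:
                fields[kw] = m.group(1).strip()
    return fields

first_text = "diagnosis result: stage II gastric cancer; drug name: cisplatin"
second_text = "treatment item: adjuvant chemotherapy"  # e.g. OCR output of a report
print(extract_fields(first_text, second_text))
```

The extracted strings would then be vectorized (e.g. with one-hot encoding or TF-IDF) and concatenated into the text features of the disease material information.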
203. Extracting, by using a scale-invariant feature transform technique, image features in the image information that are invariant to scaling, rotation, and brightness changes.
The scale-invariant feature transform (SIFT) mainly comprises two stages. The first stage is the generation of SIFT features, i.e., extracting feature vectors invariant to scaling, rotation, and brightness changes from a plurality of images; the second stage is the matching of SIFT feature vectors. The low-level feature extraction in the SIFT method selects salient features that are invariant to image scale (feature size) and rotation and have a certain invariance to illumination changes. For this embodiment, the scale-invariant feature transform can be used to extract effective features from medical pictures (such as pathology image-text reports, CT reports, and imaging reports) to obtain the image features of the patient's disease materials.
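SIFT itself is usually obtained from a computer vision library rather than re-implemented by hand. As a self-contained illustration of the invariance idea only, the toy descriptor below (a normalized intensity histogram) is unchanged when the image is rotated, which is one of the invariances SIFT provides; SIFT additionally handles scale and illumination changes, which this toy does not.

```python
def rotate90(image):
    # Rotate a 2-D list image by 90 degrees.
    return [list(row) for row in zip(*image[::-1])]

def histogram_descriptor(image, bins=4, max_val=256):
    # Toy descriptor: normalized histogram of pixel intensities. Because it
    # ignores pixel positions, it is invariant to rotation of the image.
    counts = [0] * bins
    for row in image:
        for px in row:
            counts[px * bins // max_val] += 1
    total = sum(counts)
    return [c / total for c in counts]

image = [[10, 200, 90], [130, 250, 60], [90, 10, 180]]
d0 = histogram_descriptor(image)
d1 = histogram_descriptor(rotate90(image))
print(d0 == d1)  # True: the descriptor is unchanged by rotation
```

In practice the SIFT features for the medical pictures would come from a library implementation (e.g. OpenCV's SIFT), which produces 128-dimensional descriptors per keypoint rather than a global histogram.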
204. Acquiring a medical knowledge graph created in advance according to medical knowledge rules, wherein the medical knowledge graph comprises each preset recruitment project and the project features corresponding to each preset recruitment project.
In a specific application scenario, before this step is executed, a medical knowledge graph may be created in advance according to the medical knowledge rules. The creation may specifically include: extracting, according to the medical knowledge rules, the project features matched with each preset recruitment project; generating a first feature tag for each preset recruitment project, a second feature tag corresponding to each project feature matched with each preset recruitment project, and a matching mapping relation between the first feature tags and the second feature tags; and creating a medical knowledge graph comprising the first feature tags, the second feature tags, and the matching mapping relations. The nodes in the medical knowledge graph represent the first feature tags of the preset recruitment projects and the second feature tags corresponding to the project features matched with the preset recruitment projects, and the edges represent the matching mapping relations between the first feature tags and the corresponding second feature tags. Accordingly, when this step is performed, the pre-created medical knowledge graph can be called directly and used for the subsequent preliminary screening of candidate recruitment projects.
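The graph structure just described (first and second feature tags as nodes, matching mapping relations as edges) can be sketched as follows; the project and feature names are hypothetical placeholders.

```python
def build_medical_knowledge_graph(project_features):
    """Build a graph whose nodes are feature tags and whose edges are
    the matching mapping relations between project tags and feature tags."""
    graph = {"nodes": set(), "edges": set()}
    for project_tag, feature_tags in project_features.items():
        graph["nodes"].add(project_tag)                     # first feature tag
        for feature_tag in feature_tags:
            graph["nodes"].add(feature_tag)                 # second feature tag
            graph["edges"].add((project_tag, feature_tag))  # matching mapping
    return graph

graph = build_medical_knowledge_graph({
    "project:trial_A": {"feature:gastric_cancer", "feature:no_gemcitabine"},
    "project:trial_B": {"feature:gastric_cancer", "feature:hbv_le_2000"},
})
print(len(graph["nodes"]), len(graph["edges"]))  # 5 4
```

Shared feature tags (here "feature:gastric_cancer") become single nodes with edges to several projects, which is what makes graph traversal useful for the preliminary screening step.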
205. Screening, in the medical knowledge graph, candidate recruitment projects matched with the subject patient according to the text features.
For this embodiment, step 205 may specifically include: calculating the feature similarity between the text features and the project features corresponding to each preset recruitment project; and determining the preset recruitment projects whose feature similarity is greater than a preset similarity threshold as the candidate recruitment projects matched with the subject patient. The feature similarity may be calculated by using a preset feature distance formula or by using a machine learning model, without specific limitation here. The preset feature distance formula may be any distance function suitable as a metric, such as the Euclidean distance, Manhattan distance, Jaccard distance, or Mahalanobis distance, and the machine learning model may include a decision tree model, a random forest model, a neural network model, and the like. It should be noted that the preset feature distance formula and the machine learning model can be selected according to the actual application scenario and are not specifically limited here.
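One concrete instantiation of this step uses the Jaccard similarity over feature sets with a preset similarity threshold; the feature and project names below are illustrative, and any of the other distance measures listed above could be substituted.

```python
def jaccard(a, b):
    # Jaccard similarity between two feature sets: |intersection| / |union|.
    return len(a & b) / len(a | b) if a | b else 0.0

def screen(text_features, project_features, threshold=0.5):
    """Keep projects whose feature similarity exceeds the preset threshold."""
    return [name for name, feats in project_features.items()
            if jaccard(text_features, feats) > threshold]

patient_features = {"gastric_cancer", "no_gemcitabine", "age_60"}
projects = {
    "trial_A": {"gastric_cancer", "no_gemcitabine"},  # similarity 2/3
    "trial_B": {"lung_cancer", "age_60"},             # similarity 1/4
}
print(screen(patient_features, projects))  # ['trial_A']
```

Raising the threshold narrows the candidate set passed to the prediction model in step 206; lowering it trades precision for recall in the preliminary screening.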
206. Inputting the candidate recruitment projects into the trained patient recruitment project prediction model to obtain the target recruitment project matched with the subject patient.
In a specific application scenario, before this step is executed, feature fusion needs to be performed on the text features and the image features. Specifically, the feature vectors may be concatenated: the feature vector corresponding to the text features may be appended to the feature vector corresponding to the image features, or vice versa; the specific concatenation order is not limited. In addition, the patient recruitment project prediction model needs to be pre-trained until it is determined that the model meets the preset training standard; for the specific training process, refer to the related description in step 103, which is not repeated here.
Accordingly, after it is determined that training of the patient recruitment item prediction model is complete, this step may be performed. Step 206 may specifically include: performing feature fusion on the text features and the image features to obtain multi-modal features; inputting the multi-modal features and the candidate recruitment items into the trained patient recruitment item prediction model, and acquiring the prediction score output by the model for each candidate recruitment item; and determining the candidate recruitment item with the highest prediction score as the target recruitment item matching the tested patient. In a specific application scenario, in addition to determining the candidate recruitment item with the highest prediction score as the target recruitment item that best matches the tested patient, as another optional manner, a preset number of well-matched target recruitment items may be extracted from the candidate recruitment items in descending order of prediction score, and the reasons for the higher matching degree may be provided for the medical reviewer's reference.
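The score-ranking step — returning either the single best match or a preset number of top matches — can be sketched like this (the scores and item names are hypothetical, standing in for the prediction model's output):

```python
def select_targets(scores, top_k=1):
    """Rank candidate recruitment items by prediction score, high to
    low, and return the top_k best-matching items; top_k=1 yields the
    single best match."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:top_k]]

# Hypothetical prediction scores for three candidate items.
scores = {"trial_A": 0.91, "trial_B": 0.42, "trial_C": 0.77}
print(select_targets(scores))            # ['trial_A']
print(select_targets(scores, top_k=2))   # ['trial_A', 'trial_C']
```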
For a complete description of the technical solution in the present application, the implementation process is fully described here with reference to the schematic diagram of intelligent matching of patient recruitment projects in fig. 3 of the specification: after the disease material information of the tested patient is obtained, text features in the disease material information are extracted through OCR recognition, image features are extracted through the Scale-Invariant Feature Transform (SIFT), and candidate recruitment projects matched with the tested patient are then screened according to the text features based on medical knowledge rules. After the text features and image features are fused into multi-modal features, the multi-modal features and the candidate recruitment items are input into a trained xDeepFM model (the patient recruitment item prediction model), the prediction scores output by the xDeepFM model for the candidate recruitment items are acquired, and the candidate recruitment item with the highest prediction score is determined as the target recruitment item matched with the tested patient.
In the prior art, when matching analysis of patient recruitment projects is performed, after patient leads are submitted, a recruitment team usually needs to first screen out the effective leads and exclude ineffective ones, such as submissions of invalid materials, so that the patient leads do not go stale. After effective leads have been screened out through manual review, background staff must manually perform a primary screening; leads that pass are manually re-screened, and reasons for rejection are given when the primary screening fails. A medical reviewer then analyzes the disease condition of the leads entering the re-screening, examines the patient's medical materials, and determines whether the leads meet the final conditions of the recruitment project, reporting the qualifying leads to the pharmaceutical manufacturer. This creates a problem: when there are many leads, manual review is slow and its accuracy is difficult to guarantee.
According to the present method, from the perspective of matching patient features with project features, the similarity between the text features of the patient's materials and the project features is calculated to judge whether the patient is suitable for the recruitment project. When the matching degree between the patient and the project is low, the lead is directly eliminated, reducing manual review time, and the reason for rejection is fed back; otherwise, the preset recruitment items with a high matching degree to the patient lead are determined as candidate recruitment items for the tested patient. Further, based on the multi-modal features fused from the text features of the patient's materials and the image features of the patient's pictures, the target recruitment item best matching the tested patient is extracted from the candidate recruitment items using the patient recruitment item prediction model, and the reasons for the high matching degree are provided for the medical reviewer's reference. Through this intelligent algorithm, the technical scheme can improve patient recruitment efficiency and matching accuracy, which is increasingly important in the era of big data.
By means of the intelligent matching method for patient recruitment projects, after the disease material information of the tested patient is obtained, the text features corresponding to the disease material information are extracted using an optical character recognition technology, and the image features are extracted using a scale-invariant feature transform technology; further, based on medical knowledge rules, candidate recruitment projects matched with the tested patient are screened according to the text features; finally, the text features and the image features are fused into multi-modal features, and the candidate recruitment items are input into the trained patient recruitment item prediction model to obtain the target recruitment items matched with the tested patient. By applying optical character recognition, the scale-invariant feature transform, and a model-based recommendation algorithm to patient recruitment, matching between patients and recruitment projects can be completed at multiple levels, so that projects are matched intelligently and automatically; the automated process can reduce labor and material costs, improve the matching efficiency of patient recruitment projects, and effectively guarantee recognition accuracy.
Further, as an implementation of the method shown in fig. 1, an embodiment of the present invention provides an intelligent matching apparatus for patient recruitment projects, as shown in fig. 4, the apparatus includes: a first extraction module 31, a screening module 32, and an input module 33.
The first extraction module 31 is configured to obtain disease material information of a patient to be tested, and extract text features and image features corresponding to the disease material information;
the screening module 32 is configured to screen candidate recruitment projects matched with the tested patient according to the text features based on the medical knowledge rules;
the input module 33 may be configured to input the multi-modal features obtained by feature fusion of the text features and the image features, together with the candidate recruitment items, into the trained patient recruitment item prediction model to obtain a target recruitment item matched with the tested patient.
In a specific application scenario, as shown in fig. 5, the first extraction module 31 includes: a first extraction unit 311, a second extraction unit 312;
the first extraction unit 311 is configured to convert the image information into second text information by using an optical character recognition technology, and extract character features corresponding to the disease material information from the first text information and the second text information according to a preset keyword;
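A minimal sketch of the keyword-based extraction performed by the first extraction unit might look like this (the OCR conversion itself is assumed to have already produced the second text information; the keywords and inputs shown are hypothetical):

```python
def extract_text_features(first_text, second_text, keywords):
    """Check which preset keywords occur in the first text information
    or in the OCR-converted second text information."""
    combined = (first_text + " " + second_text).lower()
    return {kw: kw.lower() in combined for kw in keywords}

features = extract_text_features(
    "Patient reports type 2 diabetes.",      # first text information
    "HbA1c 8.2% per attached lab report.",   # hypothetical OCR output
    ["diabetes", "hba1c", "hypertension"],
)
print(features)  # {'diabetes': True, 'hba1c': True, 'hypertension': False}
```

A production system would likely use richer matching (synonyms, medical coding) than plain substring checks; this only illustrates the data flow.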
the second extraction unit 312 may be configured to extract, by using a scale-invariant feature transform technique, image features in the image information that are invariant to scaling, rotation, and brightness changes.
In a specific application scenario, as shown in fig. 5, the screening module 32 includes: an acquisition unit 321 and a screening unit 322;
an obtaining unit 321 operable to obtain a medical knowledge graph created in advance according to medical knowledge rules;
the screening unit 322 may be configured to screen, in the medical knowledge graph, candidate recruitment items matching the tested patient according to the text features.
In a specific application scenario, the medical knowledge graph includes each preset recruitment item and item features corresponding to each preset recruitment item, and the screening unit 322 is specifically configured to calculate feature similarity between the text features and the item features corresponding to each preset recruitment item; and determining the preset recruitment items with the corresponding feature similarity larger than a preset similarity threshold as candidate recruitment items matched with the tested patient.
In a specific application scenario, as shown in fig. 5, the apparatus further includes: a second extraction module 34, a generation module 35, and a creation module 36;
a second extraction module 34, configured to extract, according to medical knowledge rules, project features matched with each preset recruitment project;
the generating module 35 may be configured to generate a first feature tag of each preset recruitment item, a second feature tag corresponding to a feature of an item matched to each preset recruitment item, and a matching mapping relationship between the first feature tag and the second feature tag;
a creation module 36 operable to create a medical knowledge-graph comprising the first feature labels, the second feature labels and the matching mappings.
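One plausible in-memory representation of the first feature tags, second feature tags, and their matching mapping is sketched below; the tag naming scheme (`P0`, `F0_0`, etc.) is an assumption for illustration, not the claimed structure:

```python
def build_knowledge_graph(projects):
    """Assign a first feature tag to each preset recruitment project,
    a second feature tag to each of its matched project features, and
    record the matching mapping between the two tag sets."""
    first_tags, second_tags, mapping = {}, {}, {}
    for i, (project, features) in enumerate(projects.items()):
        first = f"P{i}"
        first_tags[first] = project
        mapping[first] = []
        for j, feature in enumerate(features):
            second = f"F{i}_{j}"
            second_tags[second] = feature
            mapping[first].append(second)
    return {"first_tags": first_tags,
            "second_tags": second_tags,
            "mapping": mapping}

kg = build_knowledge_graph({"trial_A": ["type 2 diabetes", "age >= 18"]})
print(kg["mapping"])  # {'P0': ['F0_0', 'F0_1']}
```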
In a specific application scenario, as shown in fig. 5, the apparatus further includes: a training module 37;
the training module 37 may be configured to acquire sample data of completed recruitment project matches and divide the sample data into a training set and a test set, where each sample is labeled with the successfully matched target recruitment project; and to iteratively train the patient recruitment item prediction model using the training set until the test set verifies that the prediction accuracy of the model is greater than a preset threshold, at which point training of the patient recruitment item prediction model is determined to be complete.
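The train/test split and the accuracy-threshold training loop described for the training module can be sketched as follows; `ToyModel` is a stand-in whose accuracy simply rises each epoch (a real model such as xDeepFM would be trained here), and the split ratio and threshold are hypothetical:

```python
import random

def split_samples(samples, train_ratio=0.8, seed=0):
    """Shuffle labeled samples and split them into training and test sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

def train_until_accurate(model, train_set, test_set, accuracy_fn,
                         threshold=0.9, max_epochs=100):
    """Iteratively train until test-set accuracy exceeds the preset
    threshold; return whether it succeeded and the epochs used."""
    for epoch in range(max_epochs):
        model.fit(train_set)
        if accuracy_fn(model, test_set) > threshold:
            return True, epoch + 1
    return False, max_epochs

class ToyModel:
    """Stand-in model: accuracy rises by 0.25 per training epoch."""
    def __init__(self):
        self.accuracy = 0.0
    def fit(self, train_set):
        self.accuracy = min(1.0, self.accuracy + 0.25)

train, test = split_samples(list(range(10)))
done, epochs = train_until_accurate(ToyModel(), train, test,
                                    lambda m, t: m.accuracy)
print(done, epochs)  # True 4
```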
In a specific application scenario, as shown in fig. 5, the input module 33 includes: an input unit 331, a determination unit 332;
the input unit 331 is configured to input the multi-modal features obtained by feature fusion of the text features and the image features, together with the candidate recruitment items, into the trained patient recruitment item prediction model, and to acquire the prediction score output by the model for each candidate recruitment item;
a determining unit 332 may be configured to determine the candidate recruitment item with the highest prediction score as the target recruitment item matching the tested patient.
Based on the methods shown in fig. 1 and fig. 2, correspondingly, the embodiment of the invention further provides a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the methods shown in fig. 1 to fig. 2.
Based on the above embodiments of the method shown in fig. 1 and the apparatus shown in fig. 4, an embodiment of the present invention further provides an entity structure diagram of a computer device, as shown in fig. 6. The computer device includes: a processor 41, a memory 42, and a computer program stored on the memory 42 and executable on the processor, where the memory 42 and the processor 41 are connected via a bus 43, such that the methods shown in figs. 1 and 2 are implemented when the processor 41 executes the program.
According to the technical scheme, after the disease material information of the tested patient is obtained, the text features corresponding to the disease material information are extracted using an optical character recognition technology, and the image features are extracted using a scale-invariant feature transform technology; further, based on medical knowledge rules, candidate recruitment projects matched with the tested patient are screened according to the text features; finally, the text features and the image features are fused into multi-modal features, and the candidate recruitment items are input into the trained patient recruitment item prediction model to obtain the target recruitment items matched with the tested patient. By applying optical character recognition, the scale-invariant feature transform, and a model-based recommendation algorithm to patient recruitment, matching between patients and recruitment projects can be completed at multiple levels, so that projects are matched intelligently and automatically; the automated process can reduce labor and material costs, improve the matching efficiency of patient recruitment projects, and effectively guarantee recognition accuracy.
In the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts in the embodiments are referred to each other. For the system embodiment, since it basically corresponds to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The method and system of the present invention may be implemented in a number of ways. For example, the methods and systems of the present invention may be implemented in software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustrative purposes only, and the steps of the method of the present invention are not limited to the order specifically described above unless specifically indicated otherwise. Furthermore, in some embodiments, the present invention may also be embodied as a program recorded in a recording medium, the program including machine-readable instructions for implementing a method according to the present invention. Thus, the present invention also covers a recording medium storing a program for executing the method according to the present invention.
The description of the present invention has been presented for purposes of illustration and description, and is not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will be apparent to practitioners skilled in this art. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, and to enable others of ordinary skill in the art to understand the invention in its various embodiments, with such modifications as are suited to the particular use contemplated.