CN112434756A - Training method, processing method, device and storage medium of medical data
- Publication number
- CN112434756A (application number CN202011477617.4A)
- Authority
- CN
- China
- Prior art keywords
- medical data
- standard
- code
- training
- negative
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Medical Informatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Apparatus For Radiation Diagnosis (AREA)
Abstract
The present disclosure relates to a training method, a processing method, an apparatus, and a storage medium for medical data. The training method includes: acquiring medical data defined as a positive case; generating a negative case in response to the acquisition of the positive case; and performing training on the medical data based at least on the negative case. The processing method includes: acquiring original medical data; inputting the original medical data into a model obtained by the training method; and outputting target medical data. The apparatus includes a positive case generation module, a negative case generation module, and a training module. Embodiments of the present disclosure can avoid algorithm outputs that do not conform to medical logic and improve the performance of medical data processing.
Description
Technical Field
The present disclosure relates to the field of intelligent medical data processing, and in particular to a medical data training method, a medical data processing method, a processing apparatus for medical data training, and a computer-readable storage medium.
Background
Some concepts in medical data look nearly identical on the surface but must be distinguished medically, such as "pericardiotomy" and "exploratory pericardiotomy", or "(aorta) coronary artery bypass graft of one coronary artery" and "(aorta) coronary artery bypass graft of two coronary arteries". Keyword similarity matching struggles to code such data well, and even a deep learning algorithm may code the original medical data to a similar but wrong standard name, or output several similar names. Codes in the ICD standard table generally consist of English letters and Arabic numerals. The table records standard names (in Chinese or English) and also reflects the relationships between them: related content is clustered within the table and assigned similar codes.
Disclosure of Invention
The present disclosure is intended to provide a medical data training method, a medical data processing method, a processing apparatus for medical data training, and a computer-readable storage medium that avoid algorithm outputs that do not conform to medical logic and improve the performance of medical data processing.
According to one aspect of the present disclosure, there is provided a method for training medical data, including:
acquiring medical data defined as a positive case;
generating a negative case in response to the acquisition of the positive case;
performing training on the medical data based at least on the negative case.
In some embodiments, acquiring the medical data defined as a positive case includes:
annotating the medical data with reference to a standard information table, wherein the annotation result exists in the standard information table.
In some embodiments, generating the negative case in response to the acquisition of the positive case includes:
selecting, according to the annotation result, standard information in the standard information table that is related to the annotation result;
and generating the negative case based on the selected standard information and the original information in the positive case.
In some embodiments, the annotation result includes an annotation code representing the original information in the positive case;
selecting, according to the annotation result, the standard information in the standard information table that is related to the annotation result includes:
determining, according to the annotation code, a standard code in the standard information table that is related to the annotation code;
and selecting the standard code and the standard information represented by the standard code.
In some embodiments, determining, according to the annotation code, the standard code in the standard information table that is related to the annotation code includes:
comparing the annotation code with the standard codes;
and taking a standard code that matches the annotation code in a preset number of leading digits as the standard code related to the annotation code.
In some embodiments, determining, according to the annotation code, the standard code in the standard information table that is related to the annotation code further includes:
extracting a preset number of standard codes.
In some embodiments, the standard information table includes an ICD-9-CM-3 standard table.
According to one aspect of the present disclosure, there is provided a method for processing medical data, including:
acquiring original medical data;
inputting the original medical data into a model obtained by the training method described above;
outputting target medical data.
According to one aspect of the present disclosure, there is provided a processing apparatus for medical data training, including:
a positive case generation module configured to acquire medical data defined as a positive case;
a negative case generation module configured to generate a negative case in response to the acquisition of the positive case;
a training module configured to perform training on the medical data based at least on the negative case.
According to one aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer-executable instructions that, when executed by a processor, implement:
the method for training medical data described above, or
the method for processing medical data described above.
The medical data training method, the processing apparatus for medical data training, and the computer-readable storage medium of the various embodiments of the present disclosure acquire medical data defined as a positive case, generate a negative case in response to the acquisition of the positive case, and perform training on the medical data based at least on the negative case. Negative cases can thus be generated automatically whenever positive case samples are generated, and the negative training data is fully used. The algorithm model learns not only positive case data, in which the original medical data contains a given ICD standard term, but also the knowledge that the original medical data does not contain standard terms that are close to the annotation result yet are not actually mentioned, which matches the design logic of the ICD standard table. In this way, deep learning mines the internal coding logic of the table, outputs that do not conform to medical logic are avoided, and the coding performance of the algorithm model improves. The performance can be improved by 1% compared with training without negative training data, which in turn improves the accuracy and efficiency of medical research and of medical diagnosis and treatment.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure, as claimed.
Drawings
In the drawings, which are not necessarily drawn to scale, like reference numerals may designate like components in different views. Like reference numerals with letter suffixes or like reference numerals with different letter suffixes may represent different instances of like components. The drawings illustrate various embodiments generally, by way of example and not by way of limitation, and together with the description and claims, serve to explain the disclosed embodiments.
Fig. 1 shows a flow chart of a method for training medical data according to an embodiment of the present disclosure;
Fig. 2 shows a flow chart of a method for processing medical data according to an embodiment of the present disclosure;
Fig. 3 shows an architecture diagram of a processing apparatus for medical data training according to an embodiment of the present disclosure;
Fig. 4 shows an example of a portion of an ICD-9-CM-3 standard table according to an embodiment of the present disclosure;
Fig. 5 shows an architecture diagram of a medical data processing apparatus according to an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described below clearly and completely with reference to the accompanying drawings of the embodiments of the present disclosure. It is to be understood that the described embodiments are only a few embodiments of the present disclosure, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the described embodiments of the disclosure without any inventive step, are within the scope of protection of the disclosure.
Unless otherwise defined, technical or scientific terms used herein shall have the ordinary meaning as understood by one of ordinary skill in the art to which this disclosure belongs. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items.
To keep the following description of the embodiments of the present disclosure clear and concise, detailed descriptions of known functions and known components are omitted.
The present disclosure relates to the training and processing of medical data with deep learning models, for accurate characterization of medical information and medical terminology. Some concepts in medical data look nearly identical on the surface but must be distinguished medically, such as "pericardiotomy" and "exploratory pericardiotomy", or "(aorta) coronary artery bypass graft of one coronary artery" and "(aorta) coronary artery bypass graft of two coronary arteries". Keyword similarity matching struggles to code such data well, and even a deep learning algorithm may code the original medical data to a similar but wrong standard name, or output several similar names. The codes of the ICD standard table generally consist of English letters and Arabic numerals; for surgical information, for example, the table includes "36.1100 (aorta) coronary artery bypass graft of one coronary artery" and "37.1200x010 pericardial blood clot removal". The table records standard names (in Chinese or English) and also reflects the relationships between them: related content is clustered within the table and assigned similar codes.
As one solution, as shown in Fig. 1 in combination with Fig. 4, an embodiment of the present disclosure provides a method for training medical data, including:
S101: acquiring medical data defined as a positive case;
S102: generating a negative case in response to the acquisition of the positive case;
S103: performing training on the medical data based at least on the negative case.
One of the inventive concepts of the present disclosure is to generate negative cases automatically whenever positive case samples are generated and to make full use of the negative training data, so that the algorithm model learns not only positive case data, in which the original medical data contains a given ICD standard term, but also the knowledge that the original medical data does not contain standard terms that are close to the annotation result yet are not actually mentioned, which matches the design logic of the ICD standard table itself.
The source of the original medical data in the embodiments of the present disclosure is not particularly limited; it may be historical data or current real-time data. In terms of format, it may be medical record text, video, audio, and so on, as long as the medical information it contains can be identified by some recognition means. For example, medical information such as diagnosis content and operation content may be identified through text recognition (e.g., NLP or OCR), speech recognition, or video image recognition, or through character or word segmentation. The original medical data includes content that characterizes medical information. With reference to the ICD-9-CM-3 standard table, the original medical data may be annotated to generate a positive case, which may include the original medical information in the medical data and the annotation of that information. In response to the generation of the positive case, the negative case is generated automatically. The present disclosure focuses primarily on training based on negative cases, but does not exclude training on the positive cases. In a specific application scenario, the original medical data may also be contained in medical records and diagnosis certificates, which include one or more items of diagnostic information and surgical information that can be interpreted, manually or by machine, through annotation or parsing.
In some embodiments, acquiring the medical data defined as a positive case includes:
annotating the medical data with reference to a standard information table, wherein the annotation result exists in the standard information table.
Specifically, the medical data defined as a positive case may be obtained by annotating the medical data manually, or by machine recognition with a corresponding interpretation capability. For example, annotation may be performed by professionals such as medical experts, and the positive cases of the training data are generated from the expert annotation results. Each positive case can be represented as a two-tuple in the format "original text, ICD code | ICD standard name". For example, for original data whose record content is "thoracoscope-assisted small-incision radical surgery for right upper lung cancer", the expert annotation yields the two-tuple "thoracoscope-assisted small-incision radical surgery for right upper lung cancer, 32.4100 | thoracoscopic lobectomy".
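The two-tuple format described above can be held in a small record type. The following is a minimal sketch only: the class name `AnnotatedCase`, the function `parse_case`, and the choice to split on the last comma before the "|" are illustrative assumptions, not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedCase:
    """One two-tuple: original text plus "ICD code | ICD standard name"."""
    original_text: str
    code: str
    standard_name: str


def parse_case(line: str) -> AnnotatedCase:
    """Parse 'original text, ICD code | ICD standard name' into its fields."""
    left, bar, name = line.partition("|")
    if not bar:
        raise ValueError("expected '|' between ICD code and standard name")
    # Assumption: the code follows the LAST comma, so commas inside the
    # original text are preserved.
    text, comma, code = left.rpartition(",")
    if not comma:
        raise ValueError("expected ',' between original text and ICD code")
    return AnnotatedCase(text.strip(), code.strip(), name.strip())


positive = parse_case(
    "thoracoscope-assisted small-incision radical surgery for right upper "
    "lung cancer, 32.4100 | thoracoscopic lobectomy"
)
print(positive.code, "->", positive.standard_name)
```

Keeping the code and the standard name as separate fields makes the later code-based lookup in the standard table straightforward.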
In some embodiments, generating the negative case in response to the acquisition of the positive case includes:
selecting, according to the annotation result, standard information in the standard information table that is related to the annotation result;
and generating the negative case based on the selected standard information and the original information in the positive case.
Specifically, the present disclosure aims at least to respond automatically to the generation of a positive case by generating negative cases, so that the training method of the embodiments learns not only positive case data, in which the original medical data contains a given standard term, but also the knowledge that the original medical data does not contain standard terms that are close to the positive case result yet are not actually mentioned. The basic condition for a negative case is therefore that it contains information that differs from the positive case. The degree of difference can be adjusted during training according to the precision required of the model, and the item that carries the difference can be chosen according to the metric required by the actual data processing. Embodiments of the present disclosure may automatically generate, from a positive case and by comparison against the ICD standard information table, a negative case or a set of hard-to-distinguish samples containing multiple negative cases.
Continuing with the above example, the positive case in this embodiment may be normalized as
"thoracoscope-assisted small-incision radical surgery for right upper lung cancer, 32.4100 | thoracoscopic lobectomy"
where "thoracoscope-assisted small-incision radical surgery for right upper lung cancer" serves as the original information of this embodiment, and "32.4100 | thoracoscopic lobectomy" is the annotation result obtained by comparison with the ICD-9-CM-3 standard table.
In response to the generation of this positive case, the present embodiment may automatically generate negative cases from the positive case in conjunction with the ICD-9-CM-3 standard table.
Further, the annotation result includes an annotation code representing the original information in the positive case;
selecting, according to the annotation result, the standard information in the standard information table that is related to the annotation result includes:
determining, according to the annotation code, a standard code in the standard information table that is related to the annotation code;
and selecting the standard code and the standard information represented by the standard code.
Specifically, the positive case
"thoracoscope-assisted small-incision radical surgery for right upper lung cancer, 32.4100 | thoracoscopic lobectomy"
includes the annotation code "32.4100". In this embodiment, the annotation code may be used as an index to select standard information from the ICD-9-CM-3 standard table.
For example, indexed by "32.4100", the following can be selected from the ICD-9-CM-3 standard table:
"32.4100x002 | thoracoscopic compound lobectomy"
"32.4101 | thoracoscopic lobectomy with segmental resection of adjacent lobe"
"32.4901 | lobectomy with segmental resection of adjacent lobe"
"32.4902 | lobectomy"
"32.4903 | sleeve resection of pulmonary lobe"
Such standard information satisfies the condition of being close in code to the positive case.
In the embodiments of the present disclosure, negative cases may be generated for the positive case in a separate execution step. Each negative case includes the above original information together with a standard code and its standard information, as follows:
"thoracoscope-assisted small-incision radical surgery for right upper lung cancer, 32.4100x002 | thoracoscopic compound lobectomy"
"thoracoscope-assisted small-incision radical surgery for right upper lung cancer, 32.4101 | thoracoscopic lobectomy with segmental resection of adjacent lobe"
"thoracoscope-assisted small-incision radical surgery for right upper lung cancer, 32.4901 | lobectomy with segmental resection of adjacent lobe"
"thoracoscope-assisted small-incision radical surgery for right upper lung cancer, 32.4902 | lobectomy"
"thoracoscope-assisted small-incision radical surgery for right upper lung cancer, 32.4903 | sleeve resection of pulmonary lobe"
With these automatically generated negative cases, the embodiments of the present disclosure can construct hard-to-distinguish samples that serve as training samples for the training of the medical data.
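The pairing step described above, in which the original information of the positive case is combined with each related standard entry, can be sketched as follows. The function names `make_negative_cases` and `to_line` and the tuple layout are hypothetical choices for illustration; how the related entries are chosen is left to the caller here.

```python
from typing import List, Tuple

Entry = Tuple[str, str]      # (standard code, standard name)
Case = Tuple[str, str, str]  # (original text, code, standard name)


def make_negative_cases(positive: Case, related: List[Entry]) -> List[Case]:
    """Pair the positive case's original text with each related table entry."""
    original_text, annotated_code, _ = positive
    return [
        (original_text, code, name)
        for code, name in related
        if code != annotated_code  # never reuse the positive case's own code
    ]


def to_line(case: Case) -> str:
    """Render a case in the 'original text, code | name' format."""
    text, code, name = case
    return f"{text}, {code} | {name}"


TEXT = ("thoracoscope-assisted small-incision radical surgery "
        "for right upper lung cancer")
positive = (TEXT, "32.4100", "thoracoscopic lobectomy")
related = [
    ("32.4100", "thoracoscopic lobectomy"),
    ("32.4100x002", "thoracoscopic compound lobectomy"),
    ("32.4101", "thoracoscopic lobectomy with segmental resection of adjacent lobe"),
    ("32.4901", "lobectomy with segmental resection of adjacent lobe"),
    ("32.4902", "lobectomy"),
    ("32.4903", "sleeve resection of pulmonary lobe"),
]
negatives = make_negative_cases(positive, related)
for case in negatives:
    print(to_line(case))
```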
In some embodiments, determining, according to the annotation code, the standard code in the standard information table that is related to the annotation code includes:
comparing the annotation code with the standard codes;
and taking a standard code that matches the annotation code in a preset number of leading digits as the standard code related to the annotation code.
Specifically, when constructing the hard-to-distinguish samples, the ICD-9-CM-3 standard information table can be searched, in sequence, for standard information whose code shares its first six, five, four, or three digits with the annotation code of the positive case. For example, when the annotation code of the positive case is 32.4900, the table is first searched for other codes whose first six digits are 32.4900 (the "." is not counted as a digit); if such codes exist, for example 32.4900x002, negative cases are generated from them. Standard information sharing the first five, four, or three digits can be selected in the same way. Depending on the requirements of the actual medical data processing and training scenario, the number of matched digits may be preset within a value interval, for example with an upper limit of six digits and a lower limit of three.
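The leading-digit comparison can be illustrated with a short sketch. It assumes codes are plain strings such as "32.4100x002" and that the "." is simply dropped before comparing; the helper names `leading_digits` and `related_codes` and the six-entry table are made up for the example.

```python
from typing import List


def leading_digits(code: str, n: int) -> str:
    """First n characters of a code, with the '.' not counted."""
    return code.replace(".", "")[:n]


def related_codes(annotation_code: str, table: List[str], n: int) -> List[str]:
    """Standard codes that share their first n digits with the annotation code."""
    prefix = leading_digits(annotation_code, n)
    return [
        code for code in table
        if code != annotation_code and leading_digits(code, n) == prefix
    ]


# Illustrative excerpt only, not the full ICD-9-CM-3 table.
TABLE = ["32.4100", "32.4100x002", "32.4101", "32.4901", "32.4902", "32.4903"]

six = related_codes("32.4100", TABLE, 6)
five = related_codes("32.4100", TABLE, 5)
three = related_codes("32.4100", TABLE, 3)
print(six, five, three, sep="\n")
```

With this excerpt, matching on six digits finds only the extension code, while matching on three digits reaches the whole 32.4 group, which is why fewer matched digits give more, but less similar, negative cases.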
In some embodiments, determining, according to the annotation code, the standard code in the standard information table that is related to the annotation code further includes:
extracting a preset number of standard codes.
Specifically, continuing with the previous example and taking a preset number of 10: if fewer than 10 codes share the first six digits, the number of matched digits is reduced by one and the ICD-9-CM-3 standard information table is searched for codes whose first five digits are 32.490, such as 32.4901 and 32.4902. This continues until a digit count is reached at which enough codes are found to generate the 10 negative cases. Conversely, if more than 10 codes are found at some digit count, for example when the codes sharing the first five digits already number more than 10, any 10 of them may be selected.
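The back-off from six matched digits down to three, combined with the preset number, might look like the sketch below. Everything named here is hypothetical; in particular, returning all matches when even the lower limit yields fewer than the preset number is an assumption, since the text does not state what happens in that case.

```python
import random
from typing import List, Optional


def pick_related_codes(
    annotation_code: str,
    table: List[str],
    preset_number: int,
    max_digits: int = 6,
    min_digits: int = 3,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Shorten the matched prefix until preset_number codes are available."""
    rng = rng or random.Random()
    digits = annotation_code.replace(".", "")  # the '.' is not counted
    matches: List[str] = []
    for n in range(max_digits, min_digits - 1, -1):
        prefix = digits[:n]
        matches = [
            code for code in table
            if code != annotation_code and code.replace(".", "")[:n] == prefix
        ]
        if len(matches) >= preset_number:
            # Enough candidates at this digit count: choose any preset_number.
            return rng.sample(matches, preset_number)
    # Assumption: still too few at the lower limit, so keep what was found.
    return matches


TABLE = ["32.4100", "32.4100x002", "32.4101", "32.4901", "32.4902", "32.4903"]
print(pick_related_codes("32.4100", TABLE, 1))
print(pick_related_codes("32.4100", TABLE, 3, rng=random.Random(0)))
print(pick_related_codes("32.4100", TABLE, 10))
```

Passing a seeded `random.Random` keeps the arbitrary selection reproducible between runs.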
Based on the specific way the standard codes are used, the standard information table in the embodiments of the present disclosure includes the ICD-9-CM-3 standard table. The standard table can of course be extended to the uniformly issued Surgical Operation Classification Codes, National Clinical Edition 2.0 and Disease Classification and Codes, National Clinical Edition 2.0, coding systems built by extending the international ICD-9-CM-3 and ICD-10.
In combination with the above example, the embodiments of the present disclosure perform medical data training based on the above hard-to-distinguish samples, so that the expert annotation results and the information in the ICD standard table are fully used. The algorithm model learns the positive case data, in which the original text contains a given ICD standard term, from the positive two-tuple "thoracoscope-assisted small-incision radical surgery for right upper lung cancer, 32.4100 | thoracoscopic lobectomy". From the negative cases it also learns that the original medical data "thoracoscope-assisted small-incision radical surgery for right upper lung cancer" does not contain the standard terms that are close to the annotation result "32.4100 | thoracoscopic lobectomy"; that is, the original surgical data does not actually mention procedures such as "32.4100x002 | thoracoscopic compound lobectomy", "32.4101 | thoracoscopic lobectomy with segmental resection of adjacent lobe", "32.4901 | lobectomy with segmental resection of adjacent lobe", "32.4902 | lobectomy", and "32.4903 | sleeve resection of pulmonary lobe".
In some embodiments, the training method of the present disclosure may train on the negative cases of the above embodiments alone, or may combine the negative cases with the positive cases and train along both learning dimensions.
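One way to combine the two kinds of cases into a single training set is to attach a binary label to each pair. This is a sketch under assumptions: the label convention (1 for positive, 0 for negative), the `use_positives` switch, and the function name `build_training_set` are illustrative and are not specified by the disclosure.

```python
import random
from typing import List, Tuple

Case = Tuple[str, str, str]     # (original text, code, standard name)
Example = Tuple[str, str, int]  # (original text, "code | name", label)


def build_training_set(
    positives: List[Case],
    negatives: List[Case],
    use_positives: bool = True,
    seed: int = 0,
) -> List[Example]:
    """Label negative cases 0 and, optionally, positive cases 1, then shuffle."""
    examples: List[Example] = [(t, f"{c} | {n}", 0) for t, c, n in negatives]
    if use_positives:
        examples += [(t, f"{c} | {n}", 1) for t, c, n in positives]
    random.Random(seed).shuffle(examples)  # fixed seed for reproducibility
    return examples


TEXT = ("thoracoscope-assisted small-incision radical surgery "
        "for right upper lung cancer")
positives = [(TEXT, "32.4100", "thoracoscopic lobectomy")]
negatives = [
    (TEXT, "32.4902", "lobectomy"),
    (TEXT, "32.4903", "sleeve resection of pulmonary lobe"),
]
both = build_training_set(positives, negatives)
negatives_only = build_training_set(positives, negatives, use_positives=False)
print(len(both), len(negatives_only))
```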
As one aspect of the present disclosure, as shown in Fig. 2, the present disclosure also provides a method for processing medical data, including:
S201: acquiring original medical data;
S202: inputting the original medical data into a model obtained by the training method;
S203: outputting target medical data.
Specifically, the model of this embodiment, and its construction and training, can be implemented by the medical data training method of the above embodiments. The training method may specifically include:
acquiring medical data defined as a positive case;
generating a negative case in response to the acquisition of the positive case;
performing training on the medical data based at least on the negative case.
The target medical data of this embodiment can be used for purposes such as building a standard table, producing the final output and record of medical information, and assisting medical research.
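Steps S201 to S203 can be sketched as a function that takes original text, consults a model, and returns a standard entry. The disclosure does not specify the model's interface, so this sketch assumes a scoring callable over (original text, standard name) pairs; `process` and `toy_score` are hypothetical, and `toy_score` is only a word-overlap stand-in for a trained model.

```python
from typing import Callable, List, Tuple

Entry = Tuple[str, str]  # (standard code, standard name)


def process(
    original_text: str,
    table: List[Entry],
    score: Callable[[str, str], float],
) -> Entry:
    """Score the original text against each entry and output the best one."""
    if not table:
        raise ValueError("standard table is empty")
    return max(table, key=lambda entry: score(original_text, entry[1]))


def toy_score(text: str, standard_name: str) -> float:
    """Stand-in for the trained model: number of shared lowercase words."""
    shared = set(text.lower().split()) & set(standard_name.lower().split())
    return float(len(shared))


table = [
    ("32.4100", "thoracoscopic lobectomy"),
    ("32.4902", "lobectomy"),
    ("32.4903", "sleeve resection of pulmonary lobe"),
]
result = process("thoracoscopic lobectomy right upper lung", table, toy_score)
print(result)
```

In practice the scoring callable would be replaced by the model obtained from the training method; the surrounding selection logic is independent of that choice.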
As one aspect of the present disclosure, as shown in Fig. 3, the present disclosure also provides a processing apparatus for medical data training, including:
a positive case generation module configured to acquire medical data defined as a positive case;
a negative case generation module configured to generate a negative case in response to the acquisition of the positive case;
a training module configured to perform training on the medical data based at least on the negative case.
In combination with the foregoing example, the negative case generation module of the present disclosure is further configured to:
select, according to the annotation result, standard information in the standard information table that is related to the annotation result;
and generate the negative case based on the selected standard information and the original information in the positive case.
The negative case generation module of the present disclosure is further configured to:
determine, according to the annotation code, a standard code in the standard information table that is related to the annotation code;
and select the standard code and the standard information represented by the standard code.
The negative case generation module of the present disclosure is further configured to:
compare the annotation code with the standard codes;
and take a standard code that matches the annotation code in a preset number of leading digits as the standard code related to the annotation code.
Further, the negative case generation module may be configured to extract a preset number of standard codes.
As one of the aspects of the present disclosure, as shown in fig. 5, the present disclosure also provides a processing apparatus of medical data, including:
an acquisition unit configured for acquiring raw medical data;
a processing model, derived based on a training method, for processing the raw medical data.
In some embodiments, the acquisition unit of the present disclosure may be an input device, a screen capture device, a text recognition device, or the like, and is intended to acquire medical data containing medical information, which may include surgical information, diagnostic information, and the codes that characterize that information.
In some embodiments, the training method of medical data from which the processing model is derived comprises:
acquiring medical data defined as a positive case;
generating a negative case in response to the acquisition of the positive case;
and performing training of the medical data based at least on the negative case.
In particular, one of the inventive concepts of the present disclosure is to acquire medical data defined as a positive case, generate a negative case in response to the acquisition of the positive case, and perform training of the medical data based at least on the negative case. Negative cases can thus be generated automatically whenever positive samples are produced, and the negative training data is put to full use. The algorithm model then learns not only from positive data, in which the original medical data contains a given ICD standard term, but also from the scenario in which the original medical data does not correspond to a standard term that is close to the labeling result yet does not actually apply, which is consistent with the design logic of the ICD standard table. Mining the internal coding logic through deep learning in this way avoids algorithm outputs that do not conform to medical logic and improves the coding performance of the algorithm model. Performance can be improved by 1% compared with training without negative data, which in turn improves the accuracy and efficiency of medical research and of medical diagnosis and treatment.
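One way to see how negative cases enter training is through a binary matching objective. The disclosure does not specify a loss function; binary cross-entropy is assumed here purely for illustration, and the probabilities below are hypothetical model outputs rather than measured results.

```python
import math
from typing import Sequence


def binary_cross_entropy(probs: Sequence[float],
                         labels: Sequence[int],
                         eps: float = 1e-12) -> float:
    """Mean binary cross-entropy over (probability, label) pairs."""
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and equal in length")
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


# Hypothetical match probabilities for one record paired with three standard
# entries: the labeled entry (positive) and two near-miss entries (negatives).
probs = [0.9, 0.8, 0.7]
labels = [1, 0, 0]

loss_with_negatives = binary_cross_entropy(probs, labels)
loss_positive_only = binary_cross_entropy(probs[:1], labels[:1])
```

With only the positive pair, a model that scores every nearby entry highly looks nearly perfect. Once the negative pairs are included, the same scores are penalized, which is the signal that pushes the model to separate the correct code from its close neighbours.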
As one of the aspects of the present disclosure, the present disclosure also provides a computer-readable storage medium having stored thereon computer-executable instructions which, when executed by a processor, implement the training method of medical data described above, including at least:
acquiring medical data defined as a positive case;
generating a negative case in response to the acquisition of the positive case;
and performing training of the medical data based at least on the negative case.
As one of the aspects of the present disclosure, the present disclosure also provides a computer-readable storage medium having stored thereon computer-executable instructions which, when executed by a processor, implement the processing method of medical data described above, including at least:
acquiring original medical data;
inputting raw medical data into a model obtained based on the training method as described above;
outputting the target medical data.
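The three processing steps above might look like the following in outline. This is a sketch under assumptions: the trained model is represented by any callable that scores a (raw text, standard term) pair, and the target medical data is taken to be the best-scoring standard code and term. The `token_overlap` scorer is only a stand-in so the example runs; it is not the model of the disclosure, and the table entries are illustrative.

```python
from typing import Callable, Dict, Tuple

Scorer = Callable[[str, str], float]


def process_medical_data(raw_text: str,
                         standard_table: Dict[str, str],
                         scorer: Scorer) -> Tuple[str, str]:
    """Score every standard entry against the raw text and return the best one."""
    if not standard_table:
        raise ValueError("standard_table must not be empty")
    best_code = max(standard_table,
                    key=lambda code: scorer(raw_text, standard_table[code]))
    return best_code, standard_table[best_code]


def token_overlap(text: str, term: str) -> float:
    """Stand-in scorer: Jaccard overlap of lowercase whitespace tokens."""
    a, b = set(text.lower().split()), set(term.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Illustrative entries only; not quoted from the standard table.
table = {
    "32.3": "segmental resection of lung",
    "32.4": "lobectomy of lung",
}
code, term = process_medical_data("segmental resection of right lung",
                                  table, token_overlap)
```

Passing the scorer in as a parameter keeps the acquisition and output steps independent of how the model was trained.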
In some embodiments, a processor executing computer-executable instructions may be a processing device including one or more general-purpose processing devices, such as a microprocessor, a central processing unit (CPU), a graphics processing unit (GPU), or the like. More specifically, the processor may be a complex instruction set computing (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a processor implementing other instruction sets, or a processor implementing a combination of instruction sets. The processor may also be one or more special-purpose processing devices such as an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), a system on a chip (SoC), or the like.
In some embodiments, the computer-readable storage medium may be a memory, such as a read-only memory (ROM), a random-access memory (RAM), a phase-change random-access memory (PRAM), a static random-access memory (SRAM), a dynamic random-access memory (DRAM), an electrically erasable programmable read-only memory (EEPROM), other types of random-access memory, a flash disk or other form of flash memory, a cache, a register, a static memory, a compact disc read-only memory (CD-ROM), a digital versatile disc (DVD) or other optical storage, a tape cartridge or other magnetic storage device, or any other non-transitory medium that can be used to store information or instructions accessible by a computer device.
In some embodiments, the computer-executable instructions may be implemented as a plurality of program modules that collectively implement the training method or the processing method of medical data according to any embodiment of the present disclosure.
The present disclosure describes various operations or functions that may be implemented as or defined as software code or instructions. Each unit or module may be implemented as software code or as modules of instructions stored in a memory, which when executed by a processor implement the corresponding steps and methods.
Such content may be directly executable ("object" or "executable" form), source code, or difference code ("delta" or "patch" code). A software implementation of the embodiments described herein may be provided through an article of manufacture having code or instructions stored thereon, or through a method of operating a communication interface to transmit data through the communication interface. A machine-readable or computer-readable storage medium may cause a machine to perform the functions or operations described, and includes any mechanism for storing information in a form accessible by a machine (e.g., a computing device, an electronic system, etc.), such as recordable/non-recordable media (e.g., read-only memory (ROM), random-access memory (RAM), magnetic disk storage media, optical storage media, flash memory devices, etc.). The communication interface includes any mechanism that interfaces with a hardwired, wireless, optical, or other medium to communicate with other devices, such as a memory bus interface, a processor bus interface, an internet connection, a disk controller, etc. The communication interface may be configured by providing configuration parameters and/or transmitting signals to prepare the communication interface to provide data signals describing the software content. The communication interface may be accessed by sending one or more commands or signals to the communication interface.
The computer-executable instructions of embodiments of the present disclosure may be organized into one or more computer-executable components or modules. Aspects of the disclosure may be implemented with any number and combination of such components or modules. For example, aspects of the disclosure are not limited to the specific computer-executable instructions or the specific components or modules illustrated in the figures and described herein. Other embodiments may include different computer-executable instructions or components having more or less functionality than illustrated and described herein.
The above description is intended to be illustrative and not restrictive. For example, the above-described examples (or one or more versions thereof) may be used in combination with each other. For example, other embodiments may be used by those of ordinary skill in the art upon reading the above description. In addition, in the foregoing detailed description, various features may be grouped together to streamline the disclosure. This should not be interpreted as an intention that a disclosed feature not claimed is essential to any claim. Rather, the subject matter of the present disclosure may lie in less than all features of a particular disclosed embodiment. Thus, the following claims are hereby incorporated into the detailed description as examples or embodiments, with each claim standing on its own as a separate embodiment, and it is contemplated that these embodiments may be combined with each other in various combinations or permutations. The scope of the disclosure should be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
The above embodiments are merely exemplary embodiments of the present disclosure and are not intended to limit the present disclosure; the scope of the present disclosure is defined by the claims. Various modifications and equivalents of the disclosure may occur to those skilled in the art within the spirit and scope of the disclosure, and such modifications and equivalents are considered to be within the scope of the disclosure.
Claims (10)
1. A method of training medical data, comprising:
acquiring medical data defined as a positive case;
generating a negative case in response to the acquisition of the positive case;
and performing training of the medical data based at least on the negative case.
2. The method of claim 1, wherein the acquiring medical data defined as positive examples comprises:
labeling the medical data with reference to a standard information table, wherein the labeling result exists in the standard information table.
3. The method of claim 2, wherein generating negative examples in response to the obtaining of positive examples comprises:
selecting standard information related to the labeling result in the standard information table according to the labeling result;
and generating a negative example based on the selected standard information and the original information in the positive example.
4. The method of claim 3, wherein the labeling result comprises a label code characterizing the original information in the positive case;
and wherein selecting the standard information related to the labeling result in the standard information table according to the labeling result comprises:
according to the label code, determining a standard code related to the label code in the standard information table;
and selecting the standard code and standard information represented by the standard code.
5. The method of claim 4, wherein the determining the standard code associated with the label code in the standard information table according to the label code comprises:
comparing the label code with the standard code;
and taking, as the standard code related to the label code, each standard code that is the same as the label code at the preset digit positions.
6. The method of claim 1, wherein the determining the standard code associated with the label code in the standard information table according to the label code further comprises:
extracting a preset number of standard codes.
7. The method according to any one of claims 2 to 6, wherein the standard information table comprises an ICD-9-CM-3 standard information table.
8. A method of processing medical data, comprising:
acquiring original medical data;
inputting raw medical data into a model obtained based on the training method of any one of claims 1 to 7;
outputting the target medical data.
9. A processing apparatus for medical data training, comprising:
a positive case generation module configured to acquire medical data defined as a positive case;
a negative case generation module configured to generate a negative case in response to the acquisition of the positive case;
a training module configured to perform training of the medical data based at least on the negative case.
10. A computer-readable storage medium having stored thereon computer-executable instructions that, when executed by a processor, implement:
a method of training medical data according to any one of claims 1 to 7; or
the method of processing medical data according to claim 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011477617.4A CN112434756A (en) | 2020-12-15 | 2020-12-15 | Training method, processing method, device and storage medium of medical data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112434756A (en) | 2021-03-02 |
Family
ID=74691745
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011477617.4A Pending CN112434756A (en) | 2020-12-15 | 2020-12-15 | Training method, processing method, device and storage medium of medical data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112434756A (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080288292A1 (en) * | 2007-05-15 | 2008-11-20 | Siemens Medical Solutions Usa, Inc. | System and Method for Large Scale Code Classification for Medical Patient Records |
JP2016110440A (en) * | 2014-12-08 | 2016-06-20 | 日本電信電話株式会社 | Term meaning learning device, term meaning determining device, method, and program |
CN105894088A (en) * | 2016-03-25 | 2016-08-24 | 苏州赫博特医疗信息科技有限公司 | Medical information extraction system and method based on depth learning and distributed semantic features |
CN106383853A (en) * | 2016-08-30 | 2017-02-08 | 刘勇 | Realization method and system for electronic medical record post-structuring and auxiliary diagnosis |
CN108596900A (en) * | 2018-05-02 | 2018-09-28 | 武汉联合创想科技有限公司 | Thyroid-related Ophthalmopathy medical image data processing unit, method, computer readable storage medium and terminal device |
CN109545297A (en) * | 2018-10-30 | 2019-03-29 | 平安医疗健康管理股份有限公司 | A kind of disease coding method and calculating equipment based on big data |
CN111026841A (en) * | 2019-11-27 | 2020-04-17 | 云知声智能科技股份有限公司 | Automatic coding method and device based on retrieval and deep learning |
CN111145852A (en) * | 2019-12-30 | 2020-05-12 | 杭州依图医疗技术有限公司 | Medical information processing method and device and computer-readable storage medium |
CN111259664A (en) * | 2020-01-14 | 2020-06-09 | 腾讯科技(深圳)有限公司 | Method, device and equipment for determining medical text information and storage medium |
CN111914069A (en) * | 2019-05-10 | 2020-11-10 | 京东方科技集团股份有限公司 | Training method and device, dialogue processing method and system and medium |
Non-Patent Citations (3)
Title |
---|
KEYANG XU, ET AL.: "Multimodal Machine Learning for Automated ICD Coding", 《ARXIV》, pages 1 - 18 * |
王婷;王祺;黄越圻;殷亦超;高炬;: "基于症状构成成分的上下位关系自动抽取方法", 计算机应用, no. 10, pages 271 - 277 * |
阮彤;高炬;冯东雷;钱夕元;王婷;孙程琳;: "基于电子病历的临床医疗大数据挖掘流程与方法", 大数据, no. 05, pages 86 - 101 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||