CN115344665A - Medical record text processing method and device, electronic equipment and computer-readable storage medium - Google Patents


Info

Publication number
CN115344665A
Authority
CN
China
Prior art keywords
medical record
retrieval
text
record text
correlation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110519427.2A
Other languages
Chinese (zh)
Inventor
谭传奇
陈漠沙
黄松芳
黄非
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Innovation Co
Original Assignee
Alibaba Singapore Holdings Pte Ltd
Application filed by Alibaba Singapore Holdings Pte Ltd filed Critical Alibaba Singapore Holdings Pte Ltd
Priority to CN202110519427.2A priority Critical patent/CN115344665A/en
Publication of CN115344665A publication Critical patent/CN115344665A/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Public Health (AREA)
  • Computational Linguistics (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Pathology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The application discloses a method and a device for processing a medical record text, electronic equipment and a computer-readable storage medium. The method for processing the medical record text comprises the following steps: acquiring a medical record text to be processed; performing text analysis processing on the medical record text to be processed to obtain a plurality of fields; taking each field in the plurality of fields as a retrieval condition, and retrieving in a preset medical record database to obtain a plurality of retrieval results; calculating the correlation between the plurality of search results and the search condition; and determining a retrieval result matched with the medical record text to be processed according to the correlation. The method and the device solve the problem of low manual matching efficiency in the prior art, and can realize high-precision automatic searching and matching processing and greatly improve the searching and matching efficiency between the natural language text and the standard term because the fuzzy searching can be carried out based on the field and the final matched searching result can be determined based on the correlation.

Description

Medical record text processing method and device, electronic equipment and computer-readable storage medium
Technical Field
The present application relates to the field of text processing technologies, and in particular, to a method and an apparatus for processing a medical record text, an electronic device, and a computer-readable storage medium.
Background
With the development of big data technology, more and more industries use electronic databases to manage the data generated in their operations, and a large amount of daily business data is now entered electronically. In daily business activities, however, operators often write records in natural language, whereas a database managed with big data technology generally accepts only records expressed in specific terms, so that they can be classified and retrieved conveniently. For example, in a hospital, a doctor writing a patient's medical record usually describes the patient's condition and the judgment of the corresponding symptoms in natural language familiar to the doctor, while the corresponding medical database generally assigns only one specific term to the same condition or symptom. In such a case, the record text written by the doctor in natural language needs to be matched with the terms in the corresponding database so that all of the patient's medical data can be managed using big data technology. In the prior art, however, the richness and diversity of natural language make it difficult to ensure high matching accuracy. A technical solution capable of matching text written in natural language with higher accuracy is therefore required.
Disclosure of Invention
The embodiment of the application provides a method and a device for processing a medical record text, electronic equipment and a computer-readable storage medium, so as to overcome the defect that the precision of matching a natural language text with a standard term is low in the prior art.
In order to achieve the above object, an embodiment of the present application provides a method for processing a medical record text, including:
acquiring a medical record text to be processed, wherein the medical record text to be processed comprises a natural language text related to a patient;
performing text analysis processing on the medical record text to be processed to obtain a plurality of fields;
taking each field in the plurality of fields as a retrieval condition, and retrieving in a preset medical record database to obtain a plurality of retrieval results;
calculating the correlation between the plurality of search results and the search condition;
and determining a retrieval result matched with the medical record text to be processed according to the correlation.
The embodiment of the present application further provides a device for processing a medical record text, including:
a first acquisition module, configured to acquire a medical record text to be processed, wherein the medical record text to be processed comprises a natural language text related to a patient;
the analysis module is used for performing text analysis processing on the medical record text to be processed to obtain a plurality of fields;
the retrieval module is used for retrieving each field in the fields as a retrieval condition in a preset medical record database to obtain a plurality of retrieval results;
the calculation module is used for calculating the correlation between the plurality of retrieval results and the retrieval conditions;
and the determining module is used for determining a retrieval result matched with the medical record text to be processed according to the correlation.
An embodiment of the present application further provides an electronic device, including:
a memory for storing a program;
and the processor is used for operating the program stored in the memory, and the program executes the method for processing the medical record text provided by the embodiment of the application when running.
The embodiment of the present application further provides a computer-readable storage medium, on which a computer program executable by a processor is stored, wherein the program, when executed by the processor, implements the method for processing the medical record text as provided in the embodiment of the present application.
According to the medical record text processing method and device, the electronic device, and the computer-readable storage medium provided by the embodiments of the present application, the natural language text is parsed into fields, retrieval results are obtained from the medical record database for each field, the correlation between the retrieval results and the fields is calculated, and the retrieval result matching the medical record text to be processed is determined according to the correlation. This addresses the low efficiency of manual matching in the prior art. Because fuzzy retrieval can be performed on the basis of the fields and the final matching retrieval result can be determined on the basis of the correlation, high-precision automatic retrieval and matching can be achieved, and the efficiency of retrieval and matching between natural language text and standard terms is greatly improved.
The foregoing is only an overview of the technical solutions of the present application. In order that the technical means of the present application may be understood more clearly and implemented in accordance with the content of the description, and in order that the above and other objects, features, and advantages of the present application may be more readily understood, specific embodiments of the present application are set forth below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a schematic view of an application scenario of a method for processing a medical record text provided in an embodiment of the present application;
FIG. 2 is a flow chart of an embodiment of a method for processing medical records provided by the present application;
FIG. 3 is a flow chart of another embodiment of a method for processing medical records provided by the present application;
FIG. 4 is a schematic structural diagram of an embodiment of a device for processing medical records provided in the present application;
FIG. 5 is a schematic structural diagram of an embodiment of an electronic device provided in the present application.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Example one
With the development of big data technology, more and more industries use electronic databases to manage the data generated in their operations, and a large amount of daily business data is now entered electronically. In daily business activities, however, operators often write records in natural language, whereas a database managed with big data technology generally accepts only records expressed in specific terms, so that they can be classified and retrieved conveniently. For example, in the medical field, a doctor writing a patient's medical record usually describes the patient's condition and the judgment of the corresponding symptoms in natural language familiar to the doctor, while the corresponding medical database generally assigns only one specific standard term to the same condition or symptom. Therefore, when managing the operation of a hospital, the record text written by the doctor in natural language needs to be matched to the terms in the corresponding database so that all of the patient's medical data can be managed using big data technology.
For example, hospitals and medical management institutions need to use such big data technology to compile various medical statistics, and an important basis of such statistics is the standardization of medical terms, that is, the standardization of the diagnoses and disease descriptions that doctors write in natural language during treatment. In actual clinical practice, however, different doctors describe the same disease condition, the same operation, the same medicine, the same examination, and even the same symptom in different ways. Medical authorities therefore collect a large amount of medical record text written in natural language and need to compile statistics on the basis of that text. In the prior art, these medical record texts must be matched with specific standard terms in a database before the records can be classified and counted. For example, patients are commonly classified using diagnosis-related groups (DRGs), which divide patients into 500-600 groups based on days of hospitalization, clinical diagnosis, medical condition, surgery, disease severity, comorbidities and complications, and other attribute factors. These attribute factors are recorded in medical record text in the natural language that doctors use every day when writing about patients, so DRG-based patient classification requires that these natural language texts, or medical record texts containing natural language text related to patients, be matched to predetermined standard classification terms (i.e., standardized codes).
However, the richness and diversity of natural language make it difficult to ensure high accuracy in such matching. As described above, doctors usually write medical record text in the natural language they are accustomed to, so the medical text for patients in the same category can vary widely, while the category usually has only one preset standard term. Texts with different wording or content forms must therefore all be matched correctly to that one standard term.
In the prior art, matching has therefore been performed manually: staff are assigned to read the medical records written by doctors in natural language and match them by hand to standard terms in order to assign patients to classification groups. Such manual solutions consume a great deal of manpower and are inefficient. In addition, because the classification groups based on standard terms already form a database of standard terms, it has also been proposed in the prior art to input the natural language text written by a doctor directly into the database as a keyword query. Because of the diversity of natural language, however, the probability of obtaining accurate results from such a direct query is very low.
Thus, as shown in FIG. 1, text written in various natural language forms can be entered into the medical record text processing system of the embodiments of the present application. In the embodiment of the present application, the medical record text to be processed that is entered in this way includes natural language text related to the patient, i.e., text written in natural language. For example, in the medical management field described above, the medical record text processing system may structurally parse the medical record text to be processed so as to parse the natural language text into field text. The field text may then be further preprocessed. For example, numbers and serial numbers, such as 1 or (1), may be recognized and removed, and punctuation marks included in a field, such as a question mark (?), may likewise be recognized and removed. In addition, the field text obtained after structural parsing may include a concatenated field such as "hypertension + diabetes", and such a field can be further split, for example into the two fields "hypertension" and "diabetes". The preprocessed fields are then suitable for the retrieval and matching processing of the embodiment of the present application.
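The preprocessing described above can be illustrated with a minimal Python sketch. The regular expressions, the set of punctuation marks, and the function name `preprocess_field` are assumptions made for illustration only; the application does not specify how serial numbers, punctuation, or concatenated fields are recognized.

```python
import re

# Assumed patterns for leading serial numbers such as "(1)", "（1）", "1." or "1、".
# A bare leading digit is deliberately left alone so that a term that starts
# with a number is not damaged; this is a design assumption of the sketch.
_SERIAL = re.compile(r"^\s*(?:\(\d+\)|（\d+）|\d+[.、)）])\s*")
# Assumed set of punctuation to drop; the text names the question mark as one example.
_PUNCT = re.compile(r"[?？!！。;；]")
# Assumed connector for concatenated fields such as "hypertension + diabetes".
_CONNECTOR = re.compile(r"\s*[+＋]\s*")


def preprocess_field(raw: str) -> list[str]:
    """Strip a leading serial number and punctuation, then split concatenated terms."""
    text = _SERIAL.sub("", raw)
    text = _PUNCT.sub("", text)
    parts = _CONNECTOR.split(text)
    return [part.strip() for part in parts if part.strip()]
```

Under these assumptions, calling `preprocess_field("(1) hypertension + diabetes?")` yields the two fields "hypertension" and "diabetes", each of which can then be used as a separate retrieval condition.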
For example, in the embodiment of the present application, the parsed and split fields or the preprocessed fields may be directly input into the standard term database for retrieval, which may also be referred to as an accurate query or an accurate matching process in the embodiment of the present application. In other words, if the fields included in the medical record written in natural language by the operator, such as a doctor, are already standard terms, the standard terms that completely match the medical record can be directly found from the database, and the completely matching result can be directly obtained as the corresponding retrieval result.
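The exact matching step can be sketched as a plain dictionary lookup. The term library contents, the codes "T001" and "T002", and the case-insensitive normalization are hypothetical and chosen only to make the example self-contained.

```python
from typing import Optional

# Hypothetical term library: standard term -> standardized code (codes are invented).
TERM_LIBRARY = {
    "hypertension": "T001",
    "diabetes": "T002",
}


def exact_match(field: str, term_library: dict[str, str]) -> Optional[str]:
    """Return the standard term if the field already is one, otherwise None."""
    key = field.strip().lower()  # assumed normalization; not specified in the text
    return key if key in term_library else None
```

A field for which `exact_match` returns `None` is a non-standard term and would be passed on to the fuzzy retrieval described next.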
Furthermore, in actual industry operations, the record texts that operators write in natural language usually do not contain fields that correspond directly to the standard terms in the database, but instead contain other fields in various forms or even descriptive language. The fields obtained after the structural parsing described above, or after further preprocessing, therefore often include a large number of non-standard terms, for which the corresponding search results cannot be obtained directly through the exact matching process described above. In the embodiment of the present application, the fields thus obtained may therefore first be input into the database for retrieval. The database may include a term library storing standard terms and may further include a history database storing a history of search results, i.e., a list of previously searched fields and the search results determined for them. For example, a Lucene search engine may be used to find all search results in the database that are related to such a field. These search results may be sorted, for example, according to the degree of matching between each search result and the field, and the top-ranked search results may be selected as search result candidates. That is, in the embodiment of the present application, a plurality of search result candidates may be obtained by performing a fuzzy query on each field parsed from a text written in natural language. One of these candidates must then be determined as the final search result. Therefore, in the embodiment of the present application, the correlation between the queried field and each candidate may be calculated, and a candidate whose correlation is greater than a threshold value may be selected as the final search result corresponding to the field.
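The candidate retrieval and threshold selection described above can be sketched as follows. This sketch substitutes Python's standard-library `difflib` for both the Lucene search and the correlation calculation, which is purely a stand-in so the example runs without external services; the threshold of 0.6 and the `top_k` of 3 are likewise assumed values.

```python
from difflib import SequenceMatcher, get_close_matches
from typing import Optional


def retrieve_candidates(field: str, terms: list[str], top_k: int = 3) -> list[str]:
    """Stand-in for the fuzzy query: return the top_k most similar terms."""
    return get_close_matches(field, terms, n=top_k, cutoff=0.0)


def correlation(field: str, candidate: str) -> float:
    """Stand-in correlation score in [0, 1] based on character similarity."""
    return SequenceMatcher(None, field, candidate).ratio()


def select_result(field: str, terms: list[str],
                  threshold: float = 0.6, top_k: int = 3) -> Optional[str]:
    """Pick the candidate with the highest correlation above the threshold."""
    scored = [(correlation(field, c), c) for c in retrieve_candidates(field, terms, top_k)]
    above = [pair for pair in scored if pair[0] > threshold]
    return max(above)[1] if above else None
```

With the terms "hypertension", "hypotension", and "diabetes mellitus", the misspelled field "hypertention" would be resolved to "hypertension", while a field with no similar term returns `None` and could be left for manual review.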
In addition, in the embodiment of the present application, the selection from among the search result candidates may also be assisted by acquiring various attribute information related to the text to be processed. For example, in the medical field, a medical record is typically used to describe a patient's condition, so the record text describing the condition in the medical record typically corresponds to information such as the patient's identity information. For example, after the fuzzy query or matching described above is performed to obtain a plurality of search result candidates, attribute information associated with the text to be processed, such as the identity information of the patient to whom the medical record belongs, may be obtained and used as a screening condition for preliminary screening of the search result candidates. For example, if the search result candidates include both gynecological and andrological diseases, and the acquired information shows that the patient corresponding to the medical record text to be processed is male, the gynecological diseases and similar candidates can be removed directly.
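The attribute-based screening can be sketched as a simple filter over the candidates. The `Candidate` class and its `applicable_sex` attribute are hypothetical; the text only states that patient identity information may be used as a screening condition, not how candidates are annotated.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    term: str
    # Hypothetical annotation: "male", "female", or None when the term applies to anyone.
    applicable_sex: Optional[str] = None


def filter_by_sex(candidates: list[Candidate], patient_sex: str) -> list[Candidate]:
    """Drop candidates that cannot apply to a patient of the given sex."""
    return [c for c in candidates if c.applicable_sex in (None, patient_sex)]
```

Screening before the correlation calculation keeps the candidate list short, so fewer field and candidate pairs need to be scored.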
According to the medical record text processing scheme provided by the embodiment of the present application, the natural language text is parsed into fields, retrieval results are obtained from the medical record database for each field, the correlation between the retrieval results and the fields is calculated, and the retrieval result matching the medical record text to be processed is determined according to the correlation. This addresses the low efficiency of manual matching in the prior art. In addition, because fuzzy retrieval can be performed on the basis of the fields and the final matching retrieval result is determined on the basis of the correlation, high-precision automatic retrieval and matching can be achieved, and the efficiency of retrieval and matching between natural language text and standard terms is greatly improved.
In particular, the processing scheme provided by the present application can automatically match medical record text written by a doctor in natural language with standard terms, and is therefore particularly suited to medical institutions such as public and private hospitals. In such an institution, a doctor may write medical record text in the natural language form that the doctor is familiar with or accustomed to during routine medical work. After the text is entered into the institution's management system, the institution, having obtained authorization or purchased a medical record processing service, can use the processing scheme described in the embodiments of the present application to process the natural language medical record text collected from its doctors and obtain processed medical record text in standard term form. That is, the scheme can be used by medical institutions, public medical management departments, and other bodies that need to build a database from medical records written by a large number of doctors. The original natural language text is converted into text in a unified form using standard terms, which facilitates management of the medical record data and allows further research or secondary development based on the text so organized, so that patients can be better served and treatment plans further improved. Once the scheme is widely adopted in medical institutions, medical record text written by doctors can be converted efficiently and accurately into medical record text using standard terms, which greatly improves the efficiency with which the institutions collect and organize medical data. In addition, because research in the medical field also relies on a large amount of patient data, researchers in an institution that has obtained technical authorization and can use the processing scheme can obtain first-hand medical data more quickly, and can therefore study standard-term-based medical data in a timely manner. Furthermore, a management organization can obtain a large number of medical records using standard terms by processing the medical record texts of the doctors it manages, thereby improving the efficiency with which each doctor's diagnosis and treatment is managed and improving patients' medical experience.
The above embodiments are illustrations of technical principles and exemplary application frameworks of the embodiments of the present application, and specific technical solutions of the embodiments of the present application are further described in detail below through a plurality of embodiments.
Example two
Fig. 2 is a flowchart of an embodiment of a method for processing a medical record text provided by the present application, and an execution subject of the method may be various terminal or server devices with text processing capability, or may be a device or chip integrated on these devices. As shown in fig. 2, the method for processing the medical record text includes the following steps:
S201, obtaining a medical record text to be processed.
In the embodiment of the present application, various texts may be input by a user to a server running a processing scheme of a medical record text according to the present application. For example, a doctor can input various medical record texts through a keyboard, handwriting touch or a voice recognition mode, and the medical record text processing method of the embodiment of the application can process the medical record texts based on the texts input by the user. In particular, according to embodiments of the present application, the purpose of the user input of these texts is related to a specific target object, for example, diagnostic information for a patient or therapeutic information for a patient, etc., and thus, at least one target entity such as a patient may be included in such texts. In addition, the method for processing the medical record text can also be applied to the construction of various databases, for example, various texts can be automatically acquired from the internet, and target subjects, such as medical records uploaded by other doctors, or even disease descriptions uploaded by patients, and the like, are screened out to supplement information such as disease descriptions related to target subjects in the medical database.
S202, performing text analysis processing on the medical record text to be processed to obtain a plurality of fields.
Because the information on the pages of a medical record text other than the first page is usually in the form of structured fields, the medical record text to be processed that is obtained in step S201 can be parsed to obtain the field text.
And S203, taking each field of the plurality of fields as a retrieval condition, and retrieving in a preset medical record database to obtain a plurality of retrieval results.
After the fields are obtained, they may be used as input for retrieval in a preset medical record database in step S203. For example, in the embodiment of the present application, the database may include a term library storing standard terms and may also include a history database storing a history of search results, i.e., a list of previously searched fields and the search results determined for them. For example, a Lucene search engine may be employed to find all search results in the database that are related to such a field. These search results may be sorted, for example, according to the degree of matching between each search result and the field, and the top-ranked search results may be selected as search result candidates.
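One way to picture a database that combines a term library with a history of determined results is the sketch below. The class `MedicalRecordDatabase`, the choice to consult the history before the term library, and the `difflib`-based ranking (used here in place of a Lucene index) are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


class MedicalRecordDatabase:
    """Hypothetical in-memory stand-in for the preset medical record database."""

    def __init__(self, terms: list[str]) -> None:
        self.terms = list(terms)           # term library of standard terms
        self.history: dict[str, str] = {}  # searched field -> determined result

    def search(self, field: str, top_k: int = 3) -> list[str]:
        """Return candidates, preferring a previously determined result if any."""
        if field in self.history:
            return [self.history[field]]
        ranked = sorted(
            self.terms,
            key=lambda term: SequenceMatcher(None, field, term).ratio(),
            reverse=True,
        )
        return ranked[:top_k]

    def record(self, field: str, result: str) -> None:
        """Store the determined result so the same field can be reused later."""
        self.history[field] = result
```

In this sketch, a descriptive field such as "high blood pressure" that was once resolved to "hypertension" is answered from the history on later searches, even though it shares little surface text with the standard term.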
S204, calculating the correlation between a plurality of search results and search conditions.
The search results obtained for a field in step S203 are in fact result candidates for that field, and therefore the correlation between each such candidate and the field may be calculated in step S204. For example, a BERT (Bidirectional Encoder Representations from Transformers) model may be used to calculate the correlation or probability between a field and each search result obtained by searching with that field in step S203.
And S205, determining a retrieval result matched with the medical record text to be processed according to the correlation.
Therefore, in step S205, according to the correlation or probability calculated in step S204, a search result whose correlation is, for example, greater than a threshold value may be taken as the search result matching the field.
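The thresholding in step S205 can be sketched as follows, assuming that the model used in step S204 emits one raw relevance score (a logit) per field and candidate pair. The model itself is not implemented here; the sigmoid mapping, the threshold of 0.5, and the function names are assumptions of this sketch rather than details given in the application.

```python
import math
from typing import Optional


def to_probability(logit: float) -> float:
    """Map a raw relevance score to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-logit))


def pick_match(scored_candidates: list[tuple[str, float]],
               threshold: float = 0.5) -> Optional[str]:
    """Return the candidate with the highest probability above the threshold."""
    best_term, best_prob = None, threshold
    for term, logit in scored_candidates:
        prob = to_probability(logit)
        if prob > best_prob:
            best_term, best_prob = term, prob
    return best_term
```

Returning `None` when no candidate clears the threshold makes the "no confident match" case explicit, so such fields can be handled separately instead of being forced onto the nearest term.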
According to the medical record text processing method provided by the embodiment of the present application, the natural language text is parsed into fields, retrieval results are obtained from the medical record database for each field, the correlation between the retrieval results and the fields is calculated, and the retrieval result matching the medical record text to be processed is determined according to the correlation. This addresses the low efficiency of manual matching in the prior art. In addition, because fuzzy retrieval can be performed on the basis of the fields and the final matching retrieval result is determined on the basis of the correlation, high-precision automatic retrieval and matching can be achieved, and the efficiency of retrieval and matching between natural language text and standard terms is greatly improved.
Example three
FIG. 3 is a flowchart of another embodiment of a method for processing a medical record provided by the present application. As shown in fig. 3, the method for processing a medical record text provided in this embodiment may include the following steps:
S301, obtaining a medical record text to be processed.
In the embodiment of the present application, a user may input various texts to a server running the medical record text processing scheme of the present application. For example, a doctor can input medical record texts by keyboard, by handwriting on a touch screen, or by voice recognition, and the medical record text processing method of the embodiment of the present application can then process the texts input by the user. In particular, according to embodiments of the present application, the user inputs these texts for a purpose related to a specific target object, for example, diagnostic information or treatment information for a patient; such texts may therefore include at least one target object such as a patient. In addition, the medical record text processing method can also be applied to the construction of various databases. For example, various texts can be automatically acquired from the Internet, and those relating to target objects, such as medical records uploaded by other doctors or even disease descriptions uploaded by patients, can be selected to supplement information in the medical database, such as disease descriptions related to the target objects.
S302, performing text analysis processing on the medical record text to be processed to obtain a plurality of fields.
Since the information in the medical record text, other than that contained in the first page of the medical record, is usually in the form of structured fields, the text can be parsed to obtain the field text after the medical record text to be processed is obtained in step S301.
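The present description does not specify how the text analysis is carried out. Purely as an assumed illustration, a very simple parser that splits the text into fields at clause delimiters might be sketched as follows; the delimiter set and the name `parse_fields` are hypothetical, and a practical system would likely need a more careful parser (for instance, splitting on "." would break decimal values such as dosages).

```python
import re

# Clause delimiters, ASCII and full-width. The choice of delimiters is an assumption.
_DELIMITERS = re.compile(r"[,;.\n，；。、]+")


def parse_fields(record_text: str) -> list[str]:
    """Split free text into trimmed, non-empty, de-duplicated fields, preserving order."""
    seen: set[str] = set()
    fields: list[str] = []
    for piece in _DELIMITERS.split(record_text):
        field = piece.strip()
        if field and field not in seen:
            seen.add(field)
            fields.append(field)
    return fields
```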
S303, acquiring the attribute information of the patient from the medical record text to be processed.
Since the medical record text to be processed in the present application is usually related to a target object such as a patient, in the embodiment of the present application the attribute information of the patient can additionally be obtained from the medical record text to be processed, for use as reference information in the subsequent retrieval and matching. In particular, in the embodiment of the present application, the attribute information may describe basic attributes of the target object corresponding to the medical record text to be processed; for example, the attribute information of the patient may be identity information of the patient to whom the medical record belongs.
S304, taking each field of the plurality of fields as a retrieval condition, and retrieving in a preset medical record database to obtain a plurality of candidate results.
After the fields are obtained, each field may be used in step S304 as input for retrieval in a preset medical record database. For example, in the embodiment of the present application, the database may include a term library storing standard terms and may also include a history database storing historical search results, i.e., a list of previously searched fields and the search results determined for them. For example, a Lucene search engine may be used to find all search results in the database related to a field; these search results may then be sorted, for example, by the degree to which each matches the field, and the top-ranked results may be selected as the search result candidates for that field.
In the embodiment of the present application, a part or all of the search results determined in the historical search operation before the current search operation may be recorded in a preset medical record database.
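As a hedged illustration of recording historical results, the history database might be modeled in memory as a mapping from a searched field to the results previously determined for it. The class `HistoryStore` and its methods are hypothetical; the source does not describe the storage layout, and a real deployment would presumably persist these records in the medical record database.

```python
class HistoryStore:
    """In-memory map from a searched field to the results previously determined for it.

    Hypothetical stand-in for the history database; storage details are assumed.
    """

    def __init__(self) -> None:
        self._records: dict[str, list[str]] = {}

    def record(self, field: str, determined_results: list[str]) -> None:
        """Store results for a field, appending only those not already recorded."""
        known = self._records.setdefault(field, [])
        for result in determined_results:
            if result not in known:
                known.append(result)

    def lookup(self, field: str) -> list[str]:
        """Return a copy of the recorded results, or an empty list if none exist."""
        return list(self._records.get(field, []))
```

A later retrieval for the same field could consult `lookup` before or alongside the term library, so that previously determined matches appear among the candidates.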
S305, screening the plurality of candidate results according to the attribute information of the patient to obtain the candidate result matched with the attribute information of the patient as the retrieval result.
In the embodiment of the present application, after the fuzzy query or matching in step S304 yields a plurality of search result candidates, the attribute information of the patient obtained in step S303, for example, the identity information of the patient corresponding to the medical record, may be used as a screening condition for a preliminary screening of those candidates. For example, if the search result candidates retrieved in step S304 include both gynecological diseases and andrological diseases, and the information acquired in step S303 indicates that the patient corresponding to the medical record text to be processed is male, then in step S305 the gynecological diseases and similar candidates can be removed directly from the candidates retrieved in step S304.
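The screening by patient attribute might, for illustration only, be sketched as below. It assumes that each candidate carries an optional sex restriction; the `Candidate` type, its `applicable_sex` field and the function `screen_by_sex` are hypothetical, since the source does not say how candidates are annotated.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    term: str
    # "female", "male", or None when the term is not sex-specific (assumed annotation).
    applicable_sex: Optional[str] = None


def screen_by_sex(candidates: list[Candidate], patient_sex: Optional[str]) -> list[Candidate]:
    """Keep candidates that are unrestricted or whose restriction matches the patient."""
    if patient_sex is None:
        return list(candidates)  # nothing known about the patient, so nothing is removed
    return [c for c in candidates if c.applicable_sex in (None, patient_sex)]
```

Other attributes, such as age, could be screened in the same way by adding further optional restrictions to the candidate record.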
S306, vectorizing the field as the search condition to obtain a field vector, and vectorizing each of the search results to obtain a plurality of search result vectors.
S307, a correlation between the field vector and each search result vector is calculated.
The search results obtained for a field in step S305 are in fact only candidates for that field; therefore, the correlation between each candidate and the field can be calculated. For example, a BERT (Bidirectional Encoder Representations from Transformers) model may be used to calculate the correlation, or probability, between a field and each search result obtained by searching with that field in step S304. For example, in step S306, the field parsed in step S302 and used as the search condition may be vectorized to obtain a field vector, and each search result obtained in step S305 may be vectorized to obtain a search result vector. Thereafter, in step S307, the correlation between the field vector and each search result vector may be calculated.
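As a minimal sketch of steps S306 and S307, the vectorization and correlation calculation might be illustrated as follows. This is not a BERT model: sparse character-bigram count vectors are swapped in as a dependency-free stand-in for learned embeddings, and cosine similarity is assumed as the correlation measure, which the source does not specify. The names `vectorize`, `cosine` and `correlations` are hypothetical.

```python
import math
from collections import Counter


def vectorize(text: str) -> Counter:
    """Sparse character-bigram count vector (assumed stand-in for a BERT embedding)."""
    t = text.lower()
    return Counter(t[i:i + 2] for i in range(len(t) - 1))


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity of two sparse count vectors; 0.0 if either vector is empty."""
    dot = sum(count * v[key] for key, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def correlations(field: str, results: list[str]) -> dict[str, float]:
    """Vectorize the field once (S306), then score every search result vector (S307)."""
    field_vector = vectorize(field)
    return {result: cosine(field_vector, vectorize(result)) for result in results}
```

With a learned encoder, only `vectorize` would change; the field vector would still be computed once and compared against each search result vector.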
S308, the search result whose correlation is greater than the predetermined threshold is taken as the search result matched with the field as the search condition.
Therefore, in step S308, a search result whose calculated correlation is, for example, greater than a threshold value may be regarded as the search result matching this field.
According to the medical record text processing method provided by the embodiment of the present application, the natural language text is parsed into fields, corresponding retrieval results are obtained from the medical record database for each field, the correlation between the retrieval results and the fields is then calculated, and the retrieval result matched with the medical record text to be processed is determined according to the correlation. This solves the problem of low manual matching efficiency in the prior art. In addition, because fuzzy retrieval is performed on the fields and the final matched retrieval result is determined from the correlation, high-precision automatic retrieval and matching can be achieved, and the efficiency of matching natural language text to standard terms is greatly improved.
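For illustration only, the overall flow of steps S301 to S308 might be wired together as in the sketch below. Each stage is passed in as a callable so that the sketch makes no claim about how any stage is implemented; the function `process_record`, its parameter names and the default threshold of 0.5 are all assumptions.

```python
from typing import Callable


def process_record(
    record_text: str,
    parse: Callable[[str], list[str]],
    get_attributes: Callable[[str], dict],
    retrieve: Callable[[str], list[str]],
    screen: Callable[[list[str], dict], list[str]],
    correlate: Callable[[str, str], float],
    threshold: float = 0.5,  # illustrative default, not a value from the source
) -> dict[str, list[str]]:
    """Map each parsed field to the screened search results whose correlation exceeds the threshold."""
    fields = parse(record_text)                # S302: parse the text into fields
    attributes = get_attributes(record_text)   # S303: obtain patient attribute information
    matches: dict[str, list[str]] = {}
    for field in fields:
        candidates = retrieve(field)               # S304: retrieve candidate results
        results = screen(candidates, attributes)   # S305: screen by patient attributes
        # S306 to S308: score each result against the field and keep those above the threshold
        matches[field] = [r for r in results if correlate(field, r) > threshold]
    return matches
```

Passing the stages as parameters keeps the flow testable with trivial stand-ins and lets a search engine or a learned model be substituted without changing the orchestration.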
Example four
Fig. 4 is a schematic structural diagram of an embodiment of a medical record text processing device provided by the present application, which can be used to execute the method steps shown in fig. 2 and fig. 3. As shown in fig. 4, the medical record text processing device may include: a first obtaining module 41, a parsing module 42, a retrieval module 43, a calculation module 44 and a determination module 45.
The first obtaining module 41 can be used to obtain the medical record text to be processed.
In the embodiment of the present application, the first obtaining module 41 may acquire various texts input by a user to a server running the medical record text processing scheme of the present application. For example, a doctor can input medical record texts by keyboard, by handwriting on a touch screen, or by voice recognition, and the medical record text processing device of the embodiment of the present application can then process the texts input by the user. In particular, according to embodiments of the present application, the user inputs these texts for a purpose related to a specific target object, for example, diagnostic information or treatment information for a patient; such texts may therefore include at least one target object such as a patient. In addition, the medical record text processing method can also be applied to the construction of various databases. For example, various texts can be automatically acquired from the Internet, and those relating to target objects, such as medical records uploaded by other doctors or even disease descriptions uploaded by patients, can be selected to supplement information in the medical database, such as disease descriptions related to the target objects.
The parsing module 42 may be configured to perform text parsing on the medical record text to be processed to obtain a plurality of fields.
Since the information contained in the to-be-processed medical record text, except for the information contained in the first page of the medical record, is usually in the form of a structured field, the parsing module 42 may parse the to-be-processed medical record text, especially the natural language text related to the patient, to obtain the field text after obtaining the to-be-processed medical record text.
The retrieval module 43 may be configured to retrieve in a preset medical record database using each of the plurality of fields as a retrieval condition, so as to obtain a plurality of retrieval results.
After the fields are obtained, the retrieval module 43 may use each field as input for retrieval in the preset medical record database. For example, in the embodiment of the present application, the database may include a term library storing standard terms and may also include a history database storing historical search results, i.e., a list of previously searched fields and the search results determined for them. For example, a Lucene search engine may be used to find all search results in the database related to a field; these search results may then be sorted, for example, by the degree to which each matches the field, and the top-ranked results may be selected as the search result candidates for that field.
The calculation module 44 may be configured to calculate correlations between the plurality of search results and the search conditions.
The search results obtained for a field by the retrieval module 43 are in fact only candidates for that field; therefore, the calculation module 44 may calculate the correlation between each candidate and the field. For example, a BERT (Bidirectional Encoder Representations from Transformers) model may be used to calculate the correlation, or probability, between a field and each search result obtained by the retrieval module 43 by searching with that field.
In the embodiment of the present application, the medical record processing device may further include a second obtaining module 46. The second obtaining module 46 can be used to obtain the attribute information of the patient in the medical record text to be processed.
Since the medical records to be processed in the present application are usually related to the target object such as the patient, in the embodiment of the present application, the attribute information of the patient in the medical records to be processed can be additionally obtained by the second obtaining module 46, so as to be used as the reference information in the subsequent retrieving and matching process. Particularly, in the embodiment of the present application, the attribute information of the patient may describe basic attributes of the target object corresponding to the medical record text to be processed, for example, in a case that the medical record text to be processed is a medical record text, the attribute information of the patient may be identity information of a patient to which the medical record belongs.
Therefore, in the case of acquiring the attribute information of the patient, the retrieval module 43 may be further configured to acquire a plurality of candidate results corresponding to the plurality of fields in a preset medical record database according to each of the plurality of fields, and perform screening processing on the plurality of candidate results according to the attribute information of the patient to acquire a candidate result matching with the attribute information of the patient as the retrieval result.
In the embodiment of the present application, after the retrieval module 43 performs a fuzzy query or matching to obtain a plurality of search result candidates, the attribute information associated with the text to be processed and acquired by the second obtaining module 46, for example, the identity information of the patient corresponding to the medical record, may be used as a screening condition for a preliminary screening of those candidates in the retrieval module 43. For example, if the search result candidates retrieved by the retrieval module 43 include both gynecological diseases and andrological diseases, and the information acquired by the second obtaining module 46 indicates that the patient corresponding to the medical record text to be processed is male, the retrieval module 43 may directly remove the gynecological diseases and similar candidates.
In the embodiment of the present application, the calculation module 44 may further include: a vectorization unit 441 and a correlation calculation unit 442.
The vectorization unit 441 is configured to perform vectorization processing on the field serving as the search condition to obtain a field vector, and perform vectorization processing on each of the plurality of search results to obtain a plurality of search result vectors.
The correlation calculation unit 442 may be configured to calculate a correlation between the field vector and each of the retrieval result vectors.
The determination module 45 may be configured to determine a search result matching the medical record text to be processed according to the correlation.
The determination module 45 may be further configured to take the search result whose correlation is greater than the predetermined threshold as the search result matched with the field serving as the search condition. That is, according to the correlation or probability calculated by the calculation module 44, the determination module 45 may regard a search result whose correlation is, for example, greater than a threshold value as the search result matching the field.
According to the medical record text processing device provided by the embodiment of the present application, the natural language text is parsed into fields, corresponding retrieval results are obtained from the medical record database for each field, the correlation between the retrieval results and the fields is then calculated, and the retrieval result matched with the medical record text to be processed is determined according to the correlation. This solves the problem of low manual matching efficiency in the prior art. In addition, because fuzzy retrieval is performed on the fields and the final matched retrieval result is determined from the correlation, high-precision automatic retrieval and matching can be achieved, and the efficiency of matching natural language text to standard terms is greatly improved.
Example five
The internal functions and structure of the medical record text processing apparatus, which may be implemented as an electronic device, are described above. Fig. 5 is a schematic structural diagram of an embodiment of an electronic device provided in the present application. As shown in fig. 5, the electronic device includes a memory 51 and a processor 52.
The memory 51 stores programs. In addition to the above-described programs, the memory 51 may also be configured to store other various data to support operations on the electronic device. Examples of such data include instructions for any application or method operating on the electronic device, contact data, phonebook data, messages, pictures, videos, and the like.
The memory 51 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The processor 52 is not limited to a Central Processing Unit (CPU), but may also be a processing chip such as a Graphics Processing Unit (GPU), a Field Programmable Gate Array (FPGA), an embedded Neural-network Processing Unit (NPU), or an Artificial Intelligence (AI) chip. The processor 52 is coupled to the memory 51 and executes the program stored in the memory 51 to perform the medical record text processing method of the second or third embodiment.
Further, as shown in fig. 5, the electronic device may further include: communication components 53, power components 54, audio components 55, display 56, and other components. Only some of the components are schematically shown in fig. 5, and it is not meant that the electronic device comprises only the components shown in fig. 5.
The communication component 53 is configured to facilitate wired or wireless communication between the electronic device and other devices. The electronic device may access a wireless network based on a communication standard, such as WiFi, 3G, 4G, or 5G, or a combination thereof. In an exemplary embodiment, the communication component 53 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 53 further comprises a Near Field Communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra-Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
The power component 54 provides power to the various components of the electronic device. The power component 54 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the electronic device.
The audio component 55 is configured to output and/or input audio signals. For example, the audio component 55 includes a microphone (MIC) configured to receive external audio signals when the electronic device is in an operational mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signal may further be stored in the memory 51 or transmitted via the communication component 53. In some embodiments, the audio component 55 also includes a speaker for outputting audio signals.
The display 56 includes a screen, which may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A medical record text processing method comprises the following steps:
acquiring a medical record text to be processed, wherein the medical record text to be processed comprises a natural language text related to a patient;
performing text analysis processing on the medical record text to be processed to obtain a plurality of fields;
taking each field in the plurality of fields as a retrieval condition, and retrieving in a preset medical record database to obtain a plurality of retrieval results;
calculating the correlation between the plurality of search results and the search condition;
and determining a retrieval result matched with the medical record text to be processed according to the correlation.
2. The medical record text processing method according to claim 1, wherein the calculating of the correlation between the plurality of search results and the search condition comprises:
vectorizing the field serving as the retrieval condition to obtain a field vector, and vectorizing the retrieval results to obtain a plurality of retrieval result vectors;
and calculating the correlation between the field vector and each retrieval result vector.
3. The medical record text processing method according to claim 2, wherein the determining the search result matching the medical record text to be processed according to the correlation comprises:
taking the retrieval result whose correlation is greater than a preset threshold value as the retrieval result matched with the field serving as the retrieval condition.
4. The medical record text processing method according to claim 1, wherein the method further comprises: recording, in the preset medical record database, part or all of the retrieval results determined in historical retrieval operations before the current retrieval operation.
5. The method of processing medical record text according to claim 1, wherein the method further comprises:
acquiring attribute information of the patient from the medical record text to be processed;
then, the retrieving in the preset medical record database by using each of the fields as a retrieving condition to obtain a plurality of retrieving results includes:
taking each field in the plurality of fields as a retrieval condition, and retrieving in a preset medical record database to obtain a plurality of candidate results;
and screening the candidate results according to the attribute information of the patient to obtain the candidate result matched with the attribute information of the patient as the retrieval result.
6. A medical record text processing apparatus, comprising:
a first acquisition module, configured to acquire a medical record text to be processed, wherein the medical record text to be processed comprises a natural language text related to a patient;
an analysis module, configured to perform text analysis processing on the medical record text to be processed to obtain a plurality of fields;
a retrieval module, configured to retrieve in a preset medical record database using each field in the plurality of fields as a retrieval condition, to obtain a plurality of retrieval results;
a calculation module, configured to calculate the correlation between the plurality of retrieval results and the retrieval condition; and
a determining module, configured to determine, according to the correlation, a retrieval result matched with the medical record text to be processed.
7. The medical record text processing device of claim 6, wherein the calculation module comprises:
a vectorization unit, configured to vectorize the field serving as the retrieval condition to obtain a field vector, and to vectorize the retrieval results to obtain a plurality of retrieval result vectors; and
a correlation calculation unit, configured to calculate a correlation between the field vector and each of the retrieval result vectors.
8. The medical record text processing device of claim 7, wherein the determining module is further configured to:
take the retrieval result whose correlation is greater than a preset threshold value as the retrieval result matched with the field serving as the retrieval condition.
9. An electronic device, comprising:
a memory for storing a program;
a processor for executing the program stored in the memory, the program executing the method for processing a medical record text according to any one of claims 1 to 5.
10. A computer-readable storage medium, on which a computer program executable by a processor is stored, wherein the program, when executed by the processor, implements the method of processing a medical record text according to any one of claims 1 to 5.
CN202110519427.2A 2021-05-12 2021-05-12 Medical record text processing method and device, electronic equipment and computer-readable storage medium Pending CN115344665A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110519427.2A CN115344665A (en) 2021-05-12 2021-05-12 Medical record text processing method and device, electronic equipment and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110519427.2A CN115344665A (en) 2021-05-12 2021-05-12 Medical record text processing method and device, electronic equipment and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN115344665A true CN115344665A (en) 2022-11-15

Family

ID=83977598

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110519427.2A Pending CN115344665A (en) 2021-05-12 2021-05-12 Medical record text processing method and device, electronic equipment and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN115344665A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118012890A (en) * 2024-02-02 2024-05-10 北京偶数科技有限公司 Matching method for data fields and data standards and readable storage medium

Similar Documents

Publication Publication Date Title
CN112786194B (en) Medical image diagnosis guiding and guiding system, method and equipment based on artificial intelligence
CN108831559B (en) Chinese electronic medical record text analysis method and system
CN111316281B (en) Semantic classification method and system for numerical data in natural language context based on machine learning
CN110364234B (en) Intelligent storage, analysis and retrieval system and method for electronic medical records
US20170300635A1 (en) Identification of codable sections in medical documents
CN112614565A (en) Traditional Chinese medicine classic famous prescription intelligent recommendation method based on knowledge-graph technology
CN113345577B (en) Diagnosis and treatment auxiliary information generation method, model training method, device, equipment and storage medium
CN112541066B (en) Text-structured-based medical and technical report detection method and related equipment
CN113886716A (en) Emergency disposal recommendation method and system for food safety emergencies
CN111785383A (en) Data processing method and related equipment
CN114400062A (en) Interpretation method and device of inspection report, computer equipment and storage medium
CN112655047A (en) Method for classifying medical records
CN110867228B (en) Intelligent information grabbing and evaluating method and system for wound severity of wound inpatient
CN115617840A (en) Medical data retrieval platform construction method, system, computer and storage medium
CN115344665A (en) Medical record text processing method and device, electronic equipment and computer-readable storage medium
CN114220542A (en) Physical examination information management method and device, storage medium and computing equipment
Chahid et al. Data Preprocessing For Machine Learning Applications in Healthcare: A Review
CN113343680A (en) Structured information extraction method based on multi-type case history texts
Rammal et al. Heart failure prediction models using big data techniques
CN116469505A (en) Data processing method, device, computer equipment and readable storage medium
Yang et al. SYRIAC: The systematic review information automated collection system a data warehouse for facilitating automated biomedical text classification
CN115344664A (en) Medical case text processing method and device, electronic equipment and computer-readable storage medium
KR100781210B1 (en) Method and apparatus of detecting hospital information
CN111966794B (en) Diagnosis and treatment data identification method, system and device
CN116562271B (en) Quality control method and device for electronic medical record, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240311

Address after: # 03-06, Lai Zan Da Building 1, 51 Belarusian Road, Singapore

Applicant after: Alibaba Innovation Co.

Country or region after: Singapore

Address before: Room 01, 45th Floor, AXA Building, 8 Shanton Road, Singapore

Applicant before: Alibaba Singapore Holdings Ltd.

Country or region before: Singapore