CN117271766A

CN117271766A - Medical text analysis method and device, storage medium and electronic equipment

Info

Publication number: CN117271766A
Application number: CN202311192758.5A
Authority: CN
Inventors: 顾冬冬; 王晟; 薛忠
Original assignee: Shanghai United Imaging Intelligent Healthcare Co Ltd
Current assignee: Shanghai United Imaging Intelligent Healthcare Co Ltd
Priority date: 2023-09-14
Filing date: 2023-09-14
Publication date: 2023-12-22

Abstract

The specification discloses a medical text analysis method and device, a storage medium, and electronic equipment. The method comprises: obtaining an original medical text of a user; inputting the original medical text into a pre-trained text normalization model to obtain a standard medical text output by the text normalization model; and inputting the standard medical text into a pre-trained text analysis model to obtain an analysis result for the standard medical text output by the text analysis model. Because normalization is performed by a model, the efficiency of medical text normalization is improved. Because the standard medical text is then analyzed by a pre-trained text analysis model, the efficiency and accuracy of medical text analysis are also improved.

Description

Medical text analysis method and device, storage medium and electronic equipment

Technical Field

The present disclosure relates to the field of medical treatment, and in particular, to a method and apparatus for analyzing a medical text, a storage medium, and an electronic device.

Background

With the development of natural language processing technology, natural language processing models are applied in various fields. In the medical field, for example, different users may describe the same disorder very differently, so the medical texts obtained from different users describing the same condition differ considerably. When such medical texts are input into a text analysis model, the analysis results differ considerably as well. Therefore, how to normalize medical texts is a problem to be solved. In addition, how to improve the accuracy and efficiency of medical text analysis is also a problem to be solved.

Based on this, the present specification provides a method of analyzing medical text.

Disclosure of Invention

The present disclosure provides a method, an apparatus, a storage medium, and an electronic device for analyzing a medical text, so as to at least partially solve the foregoing problems in the prior art.

The technical scheme adopted in the specification is as follows:

The specification provides a method for analyzing medical text, comprising the following steps:

acquiring an original medical text of a user;

inputting the original medical text into a pre-trained text normalization model to obtain a standard medical text output by the text normalization model;

inputting the standard medical text into a pre-trained text analysis model to obtain an analysis result for the standard medical text output by the text analysis model.

Optionally, the original medical text includes imaging findings and case text.

Optionally, the text normalization model comprises a large-scale language model; the text analysis model comprises a large-scale language model.

Optionally, training the text normalization model specifically includes:

acquiring a non-standardized medical text;

determining standard medical texts of the non-standardized medical texts according to the standardized medical corpus, and taking the standard medical texts as labels of the non-standardized medical texts;

inputting the non-standardized medical text into a text normalization model, and determining a sample standard medical text output by the text normalization model;

training the text normalization model according to the sample standard medical text and the label of the non-standardized medical text.

Optionally, training the text analysis model specifically includes:

acquiring a sample medical text;

expanding a pre-stored knowledge graph according to the relation among all medical entities in the sample medical text;

training the text analysis model according to the expanded knowledge graph and the sample medical text.

Optionally, acquiring the sample medical text specifically includes:

acquiring a sample original medical text;

and inputting the sample original medical text into a pre-trained text normalization model to obtain a sample medical text output by the text normalization model.

Optionally, training the text analysis model according to the expanded knowledge graph and the sample medical text specifically includes:

according to the expanded knowledge graph, determining a sample analysis result of the sample medical text as a label of the sample medical text;

inputting the sample medical text into a text analysis model to obtain an analysis result for the sample medical text output by the text analysis model;

training the text analysis model according to the analysis result for the sample medical text output by the text analysis model and the label of the sample medical text.

Optionally, training the text analysis model according to the analysis result for the sample medical text output by the text analysis model and the label of the sample medical text specifically includes:

determining the difference between the analysis result, output by the text analysis model, for the sample medical text and the label of the sample medical text;

and training the text analysis model according to the difference.

The present specification provides an analysis device of medical text, the device comprising:

the original medical text acquisition module is used for acquiring the original medical text of the user;

the standard medical text determining module is used for inputting the original medical text into a pre-trained text normalization model to obtain a standard medical text output by the text normalization model;

and the analysis module is used for inputting the standard medical text into a pre-trained text analysis model to obtain an analysis result for the standard medical text output by the text analysis model.

Optionally, the original medical text includes imaging findings and case text.

Optionally, the apparatus further comprises:

the text normalization model training module is used for acquiring non-standardized medical texts; determining standard medical texts of the non-standardized medical texts according to the standardized medical corpus, and taking the standard medical texts as labels of the non-standardized medical texts; inputting the non-standardized medical text into a text normalization model, and determining a sample standard medical text output by the text normalization model; training the text normalization model according to the sample standard medical text and the label of the non-standardized medical text.

Optionally, the apparatus further comprises:

the text analysis model training module is used for acquiring a sample medical text; expanding a pre-stored knowledge graph according to the relation among all medical entities in the sample medical text; training the text analysis model according to the expanded knowledge graph and the sample medical text.

Optionally, the text analysis model training module is specifically configured to obtain a sample original medical text; and inputting the sample original medical text into a pre-trained text normalization model to obtain a sample medical text output by the text normalization model.

Optionally, the text analysis model training module is specifically configured to: determine, according to the expanded knowledge graph, a sample analysis result of the sample medical text as a label of the sample medical text; input the sample medical text into a text analysis model to obtain an analysis result for the sample medical text output by the text analysis model; and train the text analysis model according to that analysis result and the label of the sample medical text.

Optionally, the text analysis model training module is specifically configured to determine a difference between an analysis result for the sample medical text output by the text analysis model and a label of the sample medical text; and training the text analysis model according to the difference.

The present specification provides a computer readable storage medium storing a computer program which when executed by a processor implements the above described method of analyzing medical text.

The present specification provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the above method of analyzing medical text when executing the program.

At least one of the above technical schemes adopted in the present specification can achieve the following beneficial effects:

In the method for analyzing medical text provided in the present specification, the original medical text of a user is converted into a standard medical text by inputting the original medical text into a pre-trained text normalization model. Because normalization is performed by a model, the efficiency of medical text normalization is improved. In addition, the standard medical text is input into a pre-trained text analysis model to obtain an analysis result for the standard medical text. Because the analysis is performed by a model, the efficiency and accuracy of medical text analysis are improved.

Drawings

The accompanying drawings are included to provide a further understanding of the specification. The exemplary embodiments of the specification and their descriptions are used to explain the specification and are not intended to limit it unduly.

In the drawings:

FIG. 1 is a flow chart of a method for analyzing medical text provided in the present specification;

FIG. 2 is a flow chart of analysis of medical text provided in the present specification;

FIG. 3 is a schematic diagram of an analysis device for medical texts provided in the present specification;

FIG. 4 is a schematic diagram of the electronic device corresponding to FIG. 1 provided in the present specification.

Detailed Description

For the purposes of making the objects, technical solutions and advantages of the present specification more apparent, the technical solutions of the present specification will be clearly and completely described below with reference to specific embodiments of the present specification and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present specification. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present application based on the embodiments herein.

The following describes in detail the technical solutions provided by the embodiments of the present specification with reference to the accompanying drawings.

Fig. 1 is a flow chart of a method for analyzing medical text provided in the present specification, specifically including the following steps:

S100: acquiring the original medical text of the user.

A natural language processing model can generally process text, extract the information it contains, and perform inference and prediction based on that information. For text from a highly specialized field, however, the model needs to be trained on that field before it is used to extract text information, which improves the accuracy of the extracted information. For example, the medical field requires specialized language models to understand medical documents and terms, while the legal field requires specialized models to understand legal terms and legal logic. When extracting information from medical text, the medical texts in which different users describe their symptoms differ considerably, so the medical text needs to be normalized if the information in it is to be acquired accurately. Accordingly, the present specification provides a method of analyzing medical text.

The execution subject of the specification can be a server for normalizing the medical text, a server for training the model, or another electronic device for normalizing the medical text. For convenience of explanation, the following description takes the server as the execution subject.

In one or more embodiments of the present specification, since the original medical text of the user needs to be normalized, the server first needs to acquire the original medical text of the user. The original medical text comprises medicine-related texts such as imaging findings, pathology text, case text, and inspection reports.

S102: inputting the original medical text into a pre-trained text normalization model to obtain a standard medical text output by the text normalization model.

Fig. 2 is a schematic flow chart of the analysis of a medical text provided in the present specification.

In order to acquire more accurate medical information from the original medical text, the original medical text can be input into a pre-trained text normalization model to obtain a standard medical text output by the text normalization model. The standard medical text describes the content of the original medical text in standard sentences without distorting its semantics, where standard means that the semantics are unambiguous and the sentences contain no grammatical errors. The standard medical text may also label which content in the original medical text is a statement, which is a non-judgment, which describes a property, and so on. For example, if the original medical text is "I have a headache today and may have a fever," the standard medical text is "User; symptom: headache; non-judgment: fever."

In addition, the text normalization model only normalizes the original medical text and does not perform inference or prediction on it. The text normalization model comprises a large-scale language model whose parameter count is at least on the order of one hundred million.

S104: inputting the standard medical text into a pre-trained text analysis model to obtain an analysis result for the standard medical text output by the text analysis model.

After normalizing the original medical text, the server may further input the standard medical text into a pre-trained text analysis model and obtain an analysis result for the standard medical text output by the text analysis model. The text analysis model comprises a large-scale language model. Because the standard medical text is obtained by normalizing the original medical text, it better reflects the structure and semantic relations of the medical information. The server can therefore take full account of the context and semantic relations of the medical text through the text analysis model when obtaining the analysis result, which improves the accuracy of the text analysis result.

In addition, the text analysis model also uses a medical knowledge graph when analyzing the standard medical text, which further improves the accuracy of the text analysis result. The medical knowledge graph is constructed from a medical language text corpus, which comprises medical books, medical reports, and the like. The graph takes the entities in the corpus as nodes and the relationships between entities as edges, and it contains information such as symptoms, imaging manifestations, and diseases. For example, a medical knowledge graph can be constructed by taking fever and dizziness as nodes and the causal relationship between fever and dizziness as an edge.
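As a non-limiting illustration, the node-and-edge structure described above can be sketched as follows. The class name `KnowledgeGraph`, its methods, and the relation name `"causes"` are hypothetical and chosen only for this example; the specification does not prescribe any particular data structure.

```python
class KnowledgeGraph:
    """Minimal directed graph: entities are nodes, relations are labeled edges."""

    def __init__(self) -> None:
        # Maps each node to {neighbor: relation}. Hypothetical layout.
        self._edges: dict = {}

    def add_relation(self, head: str, relation: str, tail: str) -> None:
        # Register both entities as nodes and store the labeled edge head -> tail.
        self._edges.setdefault(head, {})[tail] = relation
        self._edges.setdefault(tail, {})

    def nodes(self) -> set:
        return set(self._edges)

    def neighbors(self, node: str) -> dict:
        # Return a copy so callers cannot mutate the graph by accident.
        return dict(self._edges.get(node, {}))


# The example from the description: fever and dizziness as nodes,
# with the causal relationship between them as an edge.
kg = KnowledgeGraph()
kg.add_relation("fever", "causes", "dizziness")
```

In this sketch an edge is stored in one direction only; an implementation that needs to traverse relations in both directions could store the reverse edge as well.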

Based on the method for analyzing medical text shown in fig. 1, the original medical text of the user is converted into a standard medical text by inputting the original medical text into a pre-trained text normalization model. Because normalization is performed by a model, the efficiency of medical text normalization is improved. In addition, the standard medical text is input into a pre-trained text analysis model to obtain an analysis result for the standard medical text. Because the analysis is performed by a model, the efficiency and accuracy of medical text analysis are improved.
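As a non-limiting illustration, the two-stage flow of steps S100 to S104 can be sketched as follows. The names `MedicalTextAnalyzer`, `toy_normalizer`, and `toy_analyzer` are hypothetical, and the two stub functions merely stand in for the pre-trained large-scale language models, which are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical alias: any callable mapping text to text can stand in for a model.
TextModel = Callable[[str], str]


@dataclass
class MedicalTextAnalyzer:
    normalizer: TextModel  # stands in for the pre-trained text normalization model
    analyzer: TextModel    # stands in for the pre-trained text analysis model

    def analyze(self, original_text: str) -> Tuple[str, str]:
        # S102: convert the original medical text into a standard medical text.
        standard_text = self.normalizer(original_text)
        # S104: analyze the standard medical text, not the original text.
        result = self.analyzer(standard_text)
        return standard_text, result


def toy_normalizer(text: str) -> str:
    # Stub only: picks out two known symptom words instead of running a model.
    symptoms = [w for w in ("headache", "fever") if w in text.lower()]
    return "User; symptoms: " + ", ".join(symptoms)


def toy_analyzer(standard_text: str) -> str:
    # Stub only: echoes its input so the data flow is visible.
    return "analysis of [" + standard_text + "]"


pipeline = MedicalTextAnalyzer(toy_normalizer, toy_analyzer)
standard, result = pipeline.analyze("I have a Headache today")  # S100: acquired text
```

Keeping the two models behind the same callable interface reflects the separation described above: the normalization stage only rewrites the text, and inference is left entirely to the analysis stage.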

The specification also provides a training method of the text normalization model.

Specifically, the server obtains non-standardized medical texts, which include non-standardized medical diagnostic reports and daily conversations of users. The server determines the standard medical text of each non-standardized medical text according to a standardized medical corpus and takes the standard medical text as the label of the non-standardized medical text. The standardized medical corpus comprises medical corpora in standard forms of expression, such as medical books. The server inputs the non-standardized medical text into the text normalization model and determines the sample standard medical text output by the text normalization model. The server then trains the text normalization model according to the sample standard medical text and the label of the non-standardized medical text.

When training the text normalization model according to the sample standard medical text and the label of the non-standardized medical text, the server can determine the difference between the sample standard medical text and the label, and train the text normalization model with minimizing the difference as the training target. The server can also set a training threshold and stop training the text normalization model when the difference is smaller than the training threshold. The specification does not limit this.
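As a non-limiting illustration, the stopping rule based on a training threshold can be sketched as follows. The function `train_until_threshold` and the class `ToyModel` are hypothetical; `ToyModel` is a one-parameter numeric stand-in trained by gradient descent on a squared error, used only because a real text normalization model cannot be reproduced in a short example.

```python
from typing import Callable, Tuple


def train_until_threshold(step: Callable[[], float],
                          threshold: float,
                          max_steps: int = 1000) -> Tuple[int, float]:
    """Call step() until the returned difference drops below threshold.

    Returns (number of steps run, last difference). max_steps is an added
    safety limit and is an assumption of this sketch.
    """
    difference = float("inf")
    steps_run = 0
    while steps_run < max_steps:
        difference = step()      # one training update; returns the current difference
        steps_run += 1
        if difference < threshold:
            break                # stop training once the difference is small enough
    return steps_run, difference


class ToyModel:
    """Predicts weight * x; stands in for the model being trained."""

    def __init__(self) -> None:
        self.weight = 0.0

    def step(self, x: float, label: float, lr: float = 0.1) -> float:
        error = self.weight * x - label
        self.weight -= lr * 2 * error * x       # gradient of the squared error
        return (self.weight * x - label) ** 2   # difference after the update


model = ToyModel()
steps, final_difference = train_until_threshold(
    lambda: model.step(x=1.0, label=2.0), threshold=1e-4)
```

For a text model, `step` would instead compute a difference between the sample standard medical text and its label; the choice of that difference measure is left open by the specification.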

The specification also provides a training method for the text analysis model, described here with the server as the execution subject. The text analysis model is a large language model that has already been pre-trained. When any text is input into the text analysis model, the model outputs text, but it may not yet have an analysis function because it has not undergone adjustment training (fine-tuning). For example, for the input "I have a fever", the output may be "rest well and drink more water", whereas the expected analysis result is "COVID-19 positive" or another disease the user may have. The pre-trained text analysis model therefore needs adjustment training.

Specifically, the server first acquires a sample medical text. To further improve the accuracy of the analysis result output by the text analysis model, the server can acquire a sample original medical text and input it into the pre-trained text normalization model to obtain the sample medical text output by the text normalization model.

The server then expands a pre-stored knowledge graph according to the relations among the medical entities in the sample medical text. The pre-stored knowledge graph is a medical knowledge graph constructed by taking the entities in the medical language text corpus as nodes and the relations between entities as edges, and it contains information such as symptoms, imaging manifestations, and diseases. The knowledge graph is expanded to improve the accuracy of the analysis results obtained by the text analysis model: the more medical knowledge the knowledge graph contains, the higher the accuracy of the analysis results of the trained text analysis model is likely to be.
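As a non-limiting illustration, expanding a pre-stored knowledge graph can be sketched as merging new relations into an existing set. The sketch assumes that the relations among medical entities have already been extracted from the sample medical text as (head, relation, tail) triples; the extraction step, the function name `expand_knowledge_graph`, and the relation name `"symptom_of"` are all hypothetical.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def expand_knowledge_graph(graph: Set[Triple], extracted: Iterable[Triple]) -> int:
    """Add triples that are not yet in the graph; return how many were added."""
    added = 0
    for triple in extracted:
        if triple not in graph:
            graph.add(triple)
            added += 1
    return added


# Pre-stored graph with one relation, and two relations found in a sample text.
pre_stored = {("fever", "symptom_of", "influenza")}
from_sample = [
    ("fever", "symptom_of", "influenza"),        # already known, skipped
    ("sore throat", "symptom_of", "influenza"),  # new, added
]
added = expand_knowledge_graph(pre_stored, from_sample)
```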

Finally, the server trains the text analysis model according to the expanded knowledge graph and the sample medical text. The server determines a sample analysis result of the sample medical text according to the expanded knowledge graph and uses it as the label of the sample medical text. Labels are needed because, although pre-training can learn generic language features, downstream tasks typically require more specialized and fine-grained semantic understanding, which requires supervision with labeled data. In addition, the medical knowledge graph comprises not only symptoms and the corresponding diseases but also other information such as imaging manifestations. Because the medical knowledge graph is derived from professional medical texts such as medical books, the sample analysis result obtained from it is accurate and can be used as the label of the sample medical text for adjustment training of the text analysis model.

When labels are obtained from the knowledge graph, they can be determined by keyword matching. For example, if the sample medical text is "user: fever, sore throat", keyword matching can be performed in the medical knowledge graph on "fever" and "sore throat" to obtain the label "COVID-19 positive". That is, the server determines a first entity in the sample medical text and determines, in the expanded knowledge graph, a second entity that matches the first entity. According to the second entity, the server then determines a third entity of a preset category in the expanded knowledge graph as the sample analysis result of the sample medical text. The preset category comprises disease names, and the third entity is directly or indirectly connected to the second entity in the expanded knowledge graph, that is, the two entities are correlated. Matching means that the similarity between the word vector of the text of the first entity and the word vector of the text of the second entity reaches a preset value, and/or that the first entity is identical to the second entity.
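As a non-limiting illustration, the first-entity, second-entity, and third-entity procedure can be sketched as follows. Two assumptions are made loudly here: string similarity from the standard-library `difflib.SequenceMatcher` is swapped in for the word-vector similarity named in the description, and the graph is given as a plain adjacency dictionary with a separate category table. The function names, the preset value `0.8`, and the toy graph are hypothetical.

```python
from collections import deque
from difflib import SequenceMatcher


def matches(first: str, second: str, preset: float = 0.8) -> bool:
    # Identical entities match; otherwise compare by string similarity,
    # which is only a stand-in for word-vector similarity.
    return first == second or SequenceMatcher(None, first, second).ratio() >= preset


def derive_labels(first_entities, adjacency, categories, target_category="disease"):
    """Return entities of target_category reachable from the matched entities."""
    # Second entities: graph nodes that match any first entity from the text.
    start = [n for n in adjacency if any(matches(f, n) for f in first_entities)]
    seen, queue, labels = set(start), deque(start), []
    while queue:  # breadth-first search covers direct and indirect connections
        node = queue.popleft()
        if categories.get(node) == target_category:
            labels.append(node)  # third entity of the preset category
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return labels


adjacency = {
    "fever": ["COVID-19 infection"],
    "sore throat": ["COVID-19 infection"],
    "COVID-19 infection": ["fever", "sore throat"],
    "bone pain": ["fracture"],
    "fracture": ["bone pain"],
}
categories = {
    "fever": "symptom", "sore throat": "symptom", "bone pain": "symptom",
    "COVID-19 infection": "disease", "fracture": "disease",
}
labels = derive_labels(["fever", "sore throat"], adjacency, categories)
```

Because the search follows every connection, a large graph would typically need a depth limit or a ranking step to keep the label set small; the description leaves such details open.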

After the label is acquired, the server inputs the sample medical text into the text analysis model, obtains the analysis result for the sample medical text output by the text analysis model, determines the difference between that analysis result and the label of the sample medical text, and trains the text analysis model according to the difference. The server may train the text analysis model with minimizing the difference as the training target, or may set a difference threshold and stop training when the difference is smaller than the difference threshold; the specification does not limit this. The text analysis result includes inferential predictions about the medical text, which may assist the physician in treating the user. Training the text analysis model with the sample medical text and the labels obtained from the expanded knowledge graph enables the model to learn the knowledge in the expanded knowledge graph, so that the trained model can analyze an input text and obtain the conditions corresponding to it. The output of the text analysis model may be one or more conditions corresponding to the input text together with their probabilities. For example, the output may be "COVID-19 positive 85%, influenza 5%, …", which is not limited in this specification.
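As a non-limiting illustration, one possible way to compute the difference between an analysis result and its label, and to present conditions with probabilities, can be sketched as follows. The specification does not define the difference measure; cross-entropy against a single label is an assumption of this sketch, and the function names `softmax`, `difference`, and `format_result` are hypothetical.

```python
import math


def softmax(scores: dict) -> dict:
    # Convert raw per-condition scores into probabilities that sum to 1.
    peak = max(scores.values())
    exps = {name: math.exp(value - peak) for name, value in scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


def difference(probabilities: dict, label: str) -> float:
    # Cross-entropy with a single correct label reduces to -log p(label).
    # The floor avoids log(0) when the label received no probability.
    return -math.log(max(probabilities.get(label, 0.0), 1e-12))


def format_result(probabilities: dict, top_k: int = 2) -> str:
    # List the top_k conditions with their probabilities as percentages.
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    return ", ".join(f"{name} {p:.0%}" for name, p in ranked[:top_k])


uniform = softmax({"COVID-19 positive": 0.0, "influenza": 0.0})
confident = softmax({"COVID-19 positive": 3.0, "influenza": 0.0})
```

Under this measure the difference shrinks as the model assigns more probability to the labeled condition, which is the behavior a training target of minimizing the difference relies on.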

Corresponding to the method for analyzing medical text provided in one or more embodiments of the present specification, and based on the same concept, the present specification further provides an apparatus for analyzing medical text, as shown in fig. 3.

Fig. 3 is a schematic diagram of an analysis device for medical texts provided in the present specification, the device comprising:

the original medical text acquisition module 300 is used for acquiring an original medical text of a user;

the standard medical text determining module 302 is configured to input the original medical text into a pre-trained text normalization model, and obtain a standard medical text output by the text normalization model;

and the analysis module 304 is used for inputting the standard medical text into a pre-trained text analysis model to obtain an analysis result for the standard medical text output by the text analysis model.

Optionally, the original medical text includes imaging findings and case text.

Optionally, the apparatus further comprises:

a text normalization model training module 306 for obtaining non-standardized medical text; determining standard medical texts of the non-standardized medical texts according to the standardized medical corpus, and taking the standard medical texts as labels of the non-standardized medical texts; inputting the non-standardized medical text into a text normalization model, and determining a sample standard medical text output by the text normalization model; training the text normalization model according to the sample standard medical text and the label of the non-standardized medical text.

Optionally, the apparatus further comprises:

a text analysis model training module 308 for obtaining a sample medical text; expanding a pre-stored knowledge graph according to the relation among all medical entities in the sample medical text; training the text analysis model according to the expanded knowledge graph and the sample medical text.

Optionally, the text analysis model training module 308 is specifically configured to obtain a sample original medical text; and inputting the sample original medical text into a pre-trained text normalization model to obtain a sample medical text output by the text normalization model.

Optionally, the text analysis model training module 308 is specifically configured to: determine, according to the expanded knowledge graph, a sample analysis result of the sample medical text as a label of the sample medical text; input the sample medical text into a text analysis model to obtain an analysis result for the sample medical text output by the text analysis model; and train the text analysis model according to that analysis result and the label of the sample medical text.

Optionally, the text analysis model training module 308 is specifically configured to determine a difference between an analysis result for the sample medical text output by the text analysis model and a label of the sample medical text; and training the text analysis model according to the difference.

The present specification also provides a computer readable storage medium storing a computer program operable to perform the above-described method of analyzing medical text provided in fig. 1.

The present specification also provides a schematic structural diagram of the electronic device shown in fig. 4. At the hardware level, as shown in fig. 4, the electronic device includes a processor, an internal bus, a network interface, a memory, and a nonvolatile storage, and may of course include hardware required by other services. The processor reads the corresponding computer program from the nonvolatile storage into the memory and then runs it to implement the method of analyzing medical text described above with respect to fig. 1. Of course, other implementations, such as logic devices or combinations of hardware and software, are not excluded from the present specification. That is, the execution subject of the processing flows is not limited to logic units and may also be hardware or logic devices.

In the 1990s, an improvement to a technology could be clearly distinguished as an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, transistor, or switch) or an improvement in software (an improvement to a method flow). With the development of technology, however, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD), such as a field programmable gate array (Field Programmable Gate Array, FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs to "integrate" a digital system onto a PLD without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development. The source code to be compiled must also be written in a specific programming language, called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. It will also be apparent to those skilled in the art that a hardware circuit implementing the logical method flow can be readily obtained merely by lightly logic-programming the method flow in one of the hardware description languages described above and programming it into an integrated circuit.

The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer readable medium storing computer readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer readable program code, it is entirely possible to logically program the method steps so that the controller achieves the same functionality in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for performing various functions may also be regarded as structures within the hardware component. Or the means for performing various functions may even be regarded as both software modules implementing the method and structures within the hardware component.

The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

For convenience of description, the above apparatus is described as being divided into various units by function. Of course, when implementing the present specification, the functions of the units may be implemented in one or more pieces of software and/or hardware.

It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include forms of computer-readable media such as volatile memory, random access memory (RAM), and/or nonvolatile memory, for example read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.

Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, does not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.

This specification may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including memory storage devices.
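As one illustration of how the method could be organized into program modules, the two-stage flow summarized in the abstract (normalize the original medical text, then analyze the standard medical text) might be sketched as follows. This is only a sketch under stated assumptions: `MedicalTextPipeline`, `toy_normalize`, and `toy_analyze` are hypothetical names, and the toy functions merely stand in for the pre-trained text normalization and text analysis models, whose actual interfaces the specification does not define.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: each model is represented as a plain callable from text to text.
TextModel = Callable[[str], str]


@dataclass
class MedicalTextPipeline:
    normalize: TextModel  # stand-in for the pre-trained text normalization model
    analyze: TextModel    # stand-in for the pre-trained text analysis model

    def run(self, original_text: str) -> str:
        # Step 1: original medical text -> standard medical text.
        standard_text = self.normalize(original_text)
        # Step 2: standard medical text -> analysis result.
        return self.analyze(standard_text)


def toy_normalize(text: str) -> str:
    """Placeholder normalizer: lowercases and collapses whitespace."""
    return " ".join(text.lower().split())


def toy_analyze(text: str) -> str:
    """Placeholder analyzer: echoes the text it was given."""
    return f"analysis of: {text}"


pipeline = MedicalTextPipeline(normalize=toy_normalize, analyze=toy_analyze)
result = pipeline.run("  Small NODULE in   left lung ")
```

Keeping the two models behind the same callable type means either stage could run locally or be replaced by a call to a remote processing device without changing `run`.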

In this specification, the embodiments are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding description of the method embodiments.

The foregoing is merely exemplary of the present disclosure and is not intended to limit the disclosure. Various modifications and alterations to this specification will become apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, or the like, which are within the spirit and principles of the present description, are intended to be included within the scope of the claims of the present application.

Claims

1. A method of analyzing medical text, the method comprising:

acquiring an original medical text of a user;

2. The method of claim 1, wherein the original medical text comprises imaging findings and case text.

3. The method of claim 1, wherein the text normalization model comprises a large-scale language model, and the text analysis model comprises a large-scale language model.

4. The method of claim 1, wherein training the text normalization model specifically comprises:

acquiring a non-standardized medical text;

5. The method of claim 1, wherein training the text analysis model comprises:

acquiring a sample medical text;

6. The method of claim 5, wherein obtaining the sample medical text comprises:

acquiring a sample original medical text;

7. The method of claim 5, wherein training the text analysis model based on the expanded knowledge-graph and the sample medical text, comprises:

8. The method of claim 5, wherein training the text analysis model based on the analysis result for the sample medical text and the label of the sample medical text output by the text analysis model specifically comprises:

and training the text analysis model according to the difference.

9. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any of the preceding claims 1-8.

10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of any of the preceding claims 1-8 when executing the program.