CN115270792A - Medical entity identification method and device - Google Patents


Info

Publication number
CN115270792A
CN115270792A (application CN202210795182.0A)
Authority
CN
China
Prior art keywords
representation
matrix
entity
text
calculating
Prior art date
Legal status
Pending
Application number
CN202210795182.0A
Other languages
Chinese (zh)
Inventor
王亦宁
刘升平
梁家恩
Current Assignee
Unisound Intelligent Technology Co Ltd
Original Assignee
Unisound Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Unisound Intelligent Technology Co Ltd
Priority to CN202210795182.0A
Publication of CN115270792A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a medical entity recognition method, which comprises: acquiring an entity to be recognized, marking the entity and its entity label with a special symbol, and constructing an output template of a text generation model from the entity and the entity label; constructing the input and output of the text generation model, where the input is a text sequence to be recognized together with a first matrix obtained by preprocessing the text to be recognized, and the output is a recognition result together with a second matrix obtained by preprocessing the recognition result, the recognition result being displayed according to the output template; encoding the first matrix with an encoder to obtain an encoded representation of the text sequence to be recognized; calculating the encoded representation with a decoder to obtain a decoded representation; and training the text generation model on the encoded and decoded representations to obtain a final decoded representation.

Description

Medical entity identification method and device
Technical Field
The invention relates to the technical field of data processing, in particular to a medical entity identification method and device.
Background
Medical entity recognition generally uses a sequence labeling method: a BME label is defined for each character, marking respectively the beginning, middle, and end characters of an entity, while an O label marks characters outside any entity. A neural network model is then trained to fit the label of each element, the prediction results are post-processed, and the BME labels are merged to obtain the final extraction result.
The prior art has the following problems: when the sequence labeling method is used, the text granularity must be characters, and the method cannot handle the recognition of discontinuous medical entities or nested medical entities.
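The BME decoding step described above can be sketched as follows; the tag scheme, the entity-type abbreviations, and the example characters are illustrative assumptions, not taken from the patent:

```python
# Merge BME/O character tags into entity spans (illustrative tag scheme: "B-xxx",
# "M-xxx", "E-xxx" for entity begin/middle/end, "O" for non-entity characters).
def merge_bme(chars, tags):
    spans, buf, label = [], [], None
    for ch, tag in zip(chars, tags):
        if tag == "O":
            buf, label = [], None
            continue
        pos, lab = tag.split("-")
        if pos == "B":                      # entity begins: start a new buffer
            buf, label = [ch], lab
        elif pos in ("M", "E") and lab == label:
            buf.append(ch)
            if pos == "E":                  # entity ends: emit the merged span
                spans.append(("".join(buf), lab))
                buf, label = [], None
    return spans

# "发热三天伴腹泻" = "fever for three days with diarrhea"
result = merge_bme(list("发热三天伴腹泻"),
                   ["B-sym", "E-sym", "B-time", "E-time", "O", "B-sym", "E-sym"])
```

The sketch also illustrates the stated limitation: a contiguous B…M…E run cannot represent a discontinuous or nested entity.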
Disclosure of Invention
The invention aims to provide a medical entity recognition method and device to solve the problems in the prior art that, when a sequence labeling method is used, the text granularity must be characters and the method cannot handle the recognition of discontinuous medical entities or nested medical entities.
the invention provides a medical entity identification method in a first aspect, which comprises the following steps:
acquiring an entity to be identified, labeling the entity and an entity label through a special symbol, and constructing an output template of a text generation model according to the entity and the entity label;
constructing input and output of the text generation model; the input is a text sequence to be recognized and a first matrix, and the first matrix is obtained after preprocessing the text to be recognized; the output is an identification result and a second matrix, the second matrix is obtained after preprocessing the identification result, and the identification result is displayed according to the output template;
encoding the first matrix through an encoder to obtain an encoded representation of the text sequence to be recognized; calculating the encoded representation through a decoder to obtain a decoded representation;
and training the text generation model according to the coding representation and the decoding representation to obtain a final decoding representation.
In one possible implementation, the first matrix is determined according to the following method:
and preprocessing the text sequence to be recognized through a pre-training language model BART to obtain a first matrix.
In a possible implementation, encoding the text sequence to be recognized by the encoder to obtain the encoded representation of the text sequence to be recognized specifically comprises:
calculating the encoded representation of each word in the text sequence to be recognized by the formula

h_t^n = Self_enc(h^{n-1}, v_t)

where h_t^n denotes the encoded representation of the t-th word in the n-th layer.
in a possible implementation manner, the training the text generation model according to the encoded representation and the decoded representation to obtain a final decoded representation specifically includes:
calculating the decoding representation of each word through a first function to obtain a generation probability;
performing matrix transformation on the decoded representation to obtain a first matrix transformation result;
performing matrix transformation on the coded representation to obtain a second matrix transformation result;
calculating the score of a copying mechanism according to the first matrix conversion result and the second matrix conversion result;
calculating a balance factor according to the first matrix conversion result and the second matrix conversion result;
calculating a fusion score according to the balance factor, the score and the generation probability;
determining a word corresponding to the maximum probability as a generation result according to the fusion score;
sequentially combining the generated results of each word to obtain a final decoding representation;
and extracting the recognition result according to the special symbol.
In a possible implementation, calculating the decoded representation of each word through the first function to obtain the generation probability specifically comprises:
applying a linear transformation to the decoded representation through the first function to obtain a linear-transformation result;
and calculating the probability distribution from the linear-transformation result.
In a possible implementation, encoding the first matrix by the encoder to obtain the encoded representation of the text sequence to be recognized, and calculating the encoded representation by the decoder to obtain the decoded representation, specifically comprises:
the encoded representation is calculated by the formula

h_t^n = Self_enc(h^{n-1}, v_t)

where h_t^n denotes the encoded representation of the t-th word sequence in the n-th layer; the topmost encoded representation is h^N, which denotes the encoded representations of all words in the N-th layer, and v_t denotes the input to the encoder at time t;
the decoded representation is calculated by the formula

s_t^n = Self_dec(h^N, s^{n-1}, u_t)

where h^N denotes the hidden state obtained by the encoder, s_t^n is the decoded representation of the t-th word sequence in the n-th layer, and u_t denotes the input to the decoder at time t.
In a second aspect, the present invention provides a medical entity identification apparatus, the apparatus comprising:
the system comprises an acquisition module, a recognition module and a recognition module, wherein the acquisition module is used for acquiring an entity to be recognized and marking the entity and an entity label through a special symbol;
the output template building module is used for building an output template of a text generation model according to the entity and the entity label;
an input output construction module for constructing input and output of the text generation model; the input is a text sequence to be recognized and a first matrix, and the first matrix is obtained after preprocessing the text to be recognized; the output is an identification result and a second matrix, the second matrix is obtained after the identification result is preprocessed, and the identification result is displayed according to the output template;
the coding and decoding module is used for coding the first matrix through a coder to obtain the coded representation of the text sequence to be identified; calculating the encoded representation by a decoder to obtain a decoded representation;
and the model training module is used for training the text generation model according to the coding representation and the decoding representation to obtain a final decoding representation.
In a third aspect, the present invention provides a chip system comprising a processor coupled to a memory, the memory storing program instructions, which when executed by the processor implement the medical entity identification method of any one of the first aspect.
In a fourth aspect, the present invention provides a computer readable storage medium having a computer program stored thereon, the computer program being executed by a processor to perform the medical entity identification method of any one of the first aspect.
In a fifth aspect, the invention provides a computer program product for causing a computer to perform the method of identifying a medical entity according to any one of the first aspect when the computer program product is run on the computer.
By applying the entity recognition method provided by the invention, entity recognition is modeled as a text generation task through template construction, breaking the barrier of performing recognition as a sequence labeling task. The method also integrates the copy mechanism of a pointer network, so that entities in the original sentence can be copied directly into the template, which solves the recognition of discontinuous and nested medical entities.
Drawings
Fig. 1 is a schematic flow chart of a medical entity identification method according to an embodiment of the present invention;
FIG. 2 is a diagram of a source sentence and a result;
FIG. 3 is a flowchart of step 140 of FIG. 1;
fig. 4 is a schematic structural diagram of a medical entity identification apparatus according to a second embodiment of the present invention;
fig. 5 is a schematic structural diagram of a chip system according to a third embodiment of the present invention;
FIG. 6 is a schematic diagram of a computer-readable storage medium provided in a fourth embodiment of the present invention;
fig. 7 is a schematic diagram of a computer program product according to a fifth embodiment of the present invention.
Detailed Description
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments.
The terms "first," "second," and the like in the description and claims of the present application and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements but may alternatively include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example one
Fig. 1 is a schematic flow chart of a medical entity identification method according to an embodiment of the present invention; the method is applied to scenarios in which a medical entity is to be identified and, as shown in Fig. 1, comprises the following steps:
step 110, acquiring an entity to be identified, labeling the entity and an entity label through a special symbol, and constructing an output template of a text generation model according to the entity and the entity label;
specifically, the entity to be identified is marked by using brackets, and tag information of the entity type is added inside the brackets. Referring to fig. 2, entities include symptoms and time, entity labels are fever, three days, and diarrhea in sequence, fever and diarrhea in the original sentence are marked as symptom labels in the output template, three days in the original sentence are marked as time labels, a special symbol is a bracket, and the labels can be marked using the bracket: the entities are put together so that the recognition result can be obtained by extracting the content in the brackets later.
Step 120, constructing the input and output of the text generation model; the input is a text sequence to be recognized and a first matrix, where the first matrix is obtained by preprocessing the text to be recognized; the output is a recognition result and a second matrix, where the second matrix is obtained by preprocessing the recognition result, and the recognition result is displayed according to the output template;
specifically, in step 110, the structure of the output template of the text generation model is described, and in step 120, the input and output of the text generation model are described.
For example, X = [x_1, x_2, ..., x_n] denotes the input word sequence of the text to be recognized, and V = [v_1, v_2, ..., v_n] denotes the first matrix obtained by preprocessing the word sequence of the text to be recognized with the generative pre-trained language model BART.
The output of the text generation model is the result part in Fig. 2 and can be denoted Y = [y_1, y_2, ..., y_m]; preprocessing Y with the pre-trained model BART yields the second matrix at the output end, U = [u_1, u_2, ..., u_m]. Since Y and X partially overlap, an output result can be expressed as entity_i = (name: x_i ... x_{i+k}), where x_i ... x_{i+k} are the words of X that appear in Y, and name denotes the entity name, i.e., the entity described above.
Step 130, encoding the first matrix through an encoder to obtain an encoded representation of the text sequence to be recognized; calculating the encoded representation through a decoder to obtain a decoded representation;
Specifically, an encoder is used to encode X to obtain an encoded representation of the input sequence information. Define Self_enc(·) as the encoder computation unit based on the self-attention mechanism; the encoded representation of each word after passing through the encoder can be calculated by the following formula:

h_t^n = Self_enc(h^{n-1}, v_t)

where h_t^n denotes the encoded representation of the t-th word sequence in the n-th layer produced by the encoder. The topmost encoded representation h^N can thus be obtained; h^N denotes the encoded representations of all words in the N-th layer, and v_t denotes the input to the encoder at time t, for example a vector in the first matrix.
The decoder network relies on h^N and an attention-mechanism module to obtain the decoded representation. Define Self_dec(·) as the decoder computation unit based on self-attention; the hidden state s_t^n output by the decoder at time t is calculated by the following formula:

s_t^n = Self_dec(h^N, s^{n-1}, u_t)

where h^N denotes the hidden state obtained by the encoder (a hidden state is either an encoded representation or a decoded representation: the hidden state obtained by the encoder is the encoded representation, and the hidden state obtained by the decoder is the decoded representation), s_t^n is the decoded representation of the t-th word sequence in the n-th layer, and u_t denotes the input to the decoder at time t, for example a vector in the second matrix.
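The layered encoder/decoder recursion described above can be sketched with plain NumPy. This is a toy single-head attention without learned projections, masking, or feed-forward blocks, so all shapes, layer counts, and names are illustrative assumptions, not the patent's actual network:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attn(q_states, kv_states):
    # Single-head scaled dot-product attention; weight matrices omitted for brevity.
    d = q_states.shape[-1]
    attn = softmax(q_states @ kv_states.T / np.sqrt(d))
    return attn @ kv_states

def encode(V, num_layers=2):
    # h^n = Self_enc(h^{n-1}, v_t): each layer attends over the previous layer.
    h = V
    for _ in range(num_layers):
        h = self_attn(h, h)
    return h  # topmost encoded representation h^N

def decode(hN, U, num_layers=2):
    # s^n = Self_dec(h^N, s^{n-1}, u_t): decoder states attend over encoder output.
    s = U
    for _ in range(num_layers):
        s = self_attn(s, hN)
    return s  # topmost decoded representation

rng = np.random.default_rng(0)
V = rng.normal(size=(5, 8))   # first matrix: 5 input words, hidden size 8
U = rng.normal(size=(3, 8))   # second matrix: 3 decoder inputs
hN = encode(V)
sN = decode(hN, U)
```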
Step 140, training the text generation model according to the coding representation and the decoding representation to obtain a final decoding representation.
Therein, referring to fig. 3, step 140 comprises the steps of:
Step 1401, calculating the decoded representation of each word through a first function to obtain the generation probability;
the decoded representation is linearly transformed through the first function to obtain a linear-transformation result, and the probability distribution is calculated from that result.
Specifically, the topmost hidden state output by the decoder, s_t^N, is passed through a layer of linear transformation:

O_t = Linear(s_t^N)

where O_t is the input representation of the softmax layer; the first function is the softmax function.
For the linear-transformation result O_t, the probability distribution over the target vocabulary set Z at each time t is output through the first function softmax. Here the target vocabulary set Z is the candidate set of all words the model may generate, and softmax computes the probability distribution over this candidate set:

Prob_gen = softmax(W · O_t + b)

where Prob_gen is the probability distribution, W and b are trainable parameters of the model, and the output dimension of W matches the size of the vocabulary set Z.
Step 1402, performing a matrix transformation on the decoded representation to obtain a first matrix-transformation result;
specifically, the current hidden-layer state of the decoder, s_t^N, is matrix-transformed to obtain the first matrix-transformation result, as shown in the following formula:

q_t = W_q · s_t^N

where q_t is the first matrix-transformation result and W_q is a trainable first parameter matrix used to perform the first matrix transformation.
Step 1403, performing a matrix transformation on the encoded representation to obtain a second matrix-transformation result;
specifically, the topmost hidden-layer state of the encoder, h^N, is matrix-transformed to obtain the second matrix-transformation result, as shown in the following formulas:

K = W_k · h^N
V = W_v · h^N

where K and V are the second matrix-transformation results, W_k is a trainable second parameter matrix used for the second matrix transformation, and W_v is a trainable third parameter matrix used for the second matrix transformation.
Step 1404, calculating the score of the copy mechanism according to the first and second matrix-transformation results;
specifically, the copy-mechanism score is calculated from the obtained q_t, K and V:

Prob_copy = softmax(q_t · K^T)

where Prob_copy is the score.
Step 1405, calculating the balance factor according to the first and second matrix-transformation results;
the balance factor is calculated by the following formula:

λ_t = sigmoid(W^T · (q_t + K + V))

where the q_t, K and V obtained from each calculation are summed, then multiplied by W^T, and the balance factor λ_t is obtained through the second function, sigmoid; W^T is a trainable transformation matrix.
Step 1406, calculating the fusion score according to the balance factor, the score and the generation probability;
specifically, the fusion score is calculated according to the following formula:

Prob_final = λ_t · Prob_gen + (1 − λ_t) · Prob_copy

where Prob_final is the fusion score, i.e., the fusion score finally obtained for each word sequence.
Step 1407, determining the word corresponding to the maximum probability as the generation result according to the fusion score;
specifically, the word corresponding to the maximum probability is selected as the generation result at time t, as shown in the following formula:

y_t = argmax(Prob_final)

where y_t is the generation result at time t.
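Steps 1401 to 1407 can be sketched end to end with NumPy. The patent renders its exact formulas only as image placeholders, so the linear layers, the dot-product copy score, the sum-based balance factor, and the scatter of copy mass onto vocabulary ids below are reconstructions from the prose, with illustrative dimensions and random parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
d, src_len, vocab = 8, 5, 20

s_tN = rng.normal(size=d)             # topmost decoder hidden state s_t^N
hN = rng.normal(size=(src_len, d))    # topmost encoder states h^N

W_o = rng.normal(size=(d, d))         # linear layer producing O_t
W = rng.normal(size=(vocab, d))       # vocabulary projection
b = np.zeros(vocab)
W_q = rng.normal(size=(d, d))         # first parameter matrix
W_k = rng.normal(size=(d, d))         # second parameter matrix
W_v = rng.normal(size=(d, d))         # third parameter matrix
w_bal = rng.normal(size=d)            # balance-factor transformation

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

O_t = W_o @ s_tN
prob_gen = softmax(W @ O_t + b)            # step 1401: generation probability

q_t = W_q @ s_tN                           # step 1402: first transformation result
K = hN @ W_k.T                             # step 1403: second transformation results
V = hN @ W_v.T
prob_copy = softmax(K @ q_t)               # step 1404: copy score over source words

# Step 1405: balance factor from the summed q_t, K, V through a sigmoid.
lam = 1.0 / (1.0 + np.exp(-w_bal @ (q_t + K.sum(axis=0) + V.sum(axis=0))))

# Step 1406: fuse generation and copy distributions; copy mass is scattered onto
# the (illustrative) vocabulary ids of the source words.
src_ids = rng.integers(0, vocab, size=src_len)
prob_final = lam * prob_gen
np.add.at(prob_final, src_ids, (1.0 - lam) * prob_copy)

y_t = int(np.argmax(prob_final))           # step 1407: generation result at time t
```

Because both component distributions sum to one, the fused distribution also sums to one regardless of the balance factor's value.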
Step 1408, combining the generated results of each word in turn to obtain a final decoded representation;
and step 1409, extracting the identification result according to the special symbol.
Specifically, following steps 1401 to 1407, the generation results of multiple words are obtained in sequence; combining these generation results yields the final decoded representation, which can be denoted Y = [y_1, y_2, ..., y_m], where y_1, y_2, ..., y_m are the generation results obtained in order according to steps 1401 to 1407. From this final decoded representation, the corresponding recognition result entity_i can be extracted through the brackets.
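The bracket extraction of step 1409 can be sketched with a regular expression; the "(label: entity)" surface form and the English example are illustrative assumptions, since the patent shows its template format only in figures:

```python
import re

# Pull (label, entity) pairs back out of the generated template string,
# using the brackets as the special symbol delimiting each recognition result.
def extract_entities(decoded):
    return re.findall(r"\(([^:()]+):\s*([^()]+)\)", decoded)

result = extract_entities("(symptom: fever) (time: three days) (symptom: diarrhea)")
```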
By applying the entity recognition method provided by the invention, entity recognition is modeled as a text generation task through template construction, breaking the barrier of performing recognition as a sequence labeling task. The method also integrates the copy mechanism of a pointer network, so that entities in the original sentence can be copied directly into the template, which solves the recognition of discontinuous and nested medical entities.
Example two
An embodiment of the present invention provides a medical entity identification apparatus, as shown in fig. 4, the apparatus includes: the system comprises an acquisition module 410, an output template construction module 420, an input and output construction module 430, a coding and decoding module 440 and a model training module 450.
The obtaining module 410 is configured to obtain an entity to be identified, and label the entity and an entity tag through a special symbol;
the output template building module 420 is used for building an output template of the text generation model according to the entity and the entity label;
the input and output construction module 430 is used for constructing the input and output of the text generation model; the input is a text sequence to be identified and a first matrix, and the first matrix is obtained after the text to be identified is preprocessed; the output is an identification result and a second matrix, the second matrix is obtained after the identification result is preprocessed, and the identification result is displayed according to an output template;
the encoding and decoding module 440 is configured to encode the first matrix through an encoder to obtain an encoded representation of the text sequence to be identified; calculating the encoded representation by a decoder to obtain a decoded representation;
the model training module 450 is configured to train the text generation model according to the encoded representation and the decoded representation to obtain a final decoded representation.
Further, the input output construction module 430 determines the first matrix according to the following method: and preprocessing the text sequence to be recognized through a pre-training language model BART to obtain a first matrix.
Further, the encoding and decoding module 440 encoding the text sequence to be recognized through the encoder to obtain the encoded representation of the text sequence to be recognized specifically comprises:
calculating the encoded representation of each word in the text sequence to be recognized by the formula

h_t^n = Self_enc(h^{n-1}, v_t)

where h_t^n denotes the encoded representation of the t-th word in the n-th layer.
further, the training of the text generation model by the model training module 450 according to the encoding representation and the decoding representation to obtain the final decoding representation specifically includes: calculating the decoding representation of each word through a first function to obtain a generation probability; performing matrix transformation on the decoding representation to obtain a first matrix transformation result; performing matrix transformation on the coded representation to obtain a second matrix transformation result; calculating the score of a copying mechanism according to the first matrix conversion result and the second matrix conversion result; calculating a balance factor according to the first matrix conversion result and the second matrix conversion result; calculating a fusion score according to the balance factor, the score and the generation probability; determining a word corresponding to the maximum probability as a generation result according to the fusion score; sequentially combining the generated results of each word to obtain a final decoding representation; and extracting the recognition result according to the special symbol.
Further, the model training module 450 calculating the decoded representation of each word through the first function to obtain the generation probability specifically comprises: applying a linear transformation to the decoded representation through the first function to obtain a linear-transformation result, and calculating the probability distribution from the linear-transformation result.
Further, the encoding and decoding module 440 encoding the first matrix through the encoder to obtain the encoded representation of the text sequence to be recognized, and calculating the encoded representation through the decoder to obtain the decoded representation, specifically comprises:
the encoded representation is calculated by the formula

h_t^n = Self_enc(h^{n-1}, v_t)

where h_t^n denotes the encoded representation of the t-th word sequence in the n-th layer; the topmost encoded representation is h^N, which denotes the encoded representations of all words in the N-th layer, and v_t denotes the input to the encoder at time t;
the decoded representation is calculated by the formula

s_t^n = Self_dec(h^N, s^{n-1}, u_t)

where h^N denotes the hidden state obtained by the encoder, s_t^n is the decoded representation of the t-th word sequence in the n-th layer, and u_t denotes the input to the decoder at time t.
The apparatus provided in the second embodiment of the present invention can execute the method steps in the first embodiment of the method, and the implementation principle and the technical effect are similar, which are not described herein again.
It should be noted that the division of the modules of the above apparatus is only a logical division, and the actual implementation may be wholly or partially integrated into one physical entity, or may be physically separated. And these modules can all be implemented in the form of software invoked by a processing element; or can be realized in a hardware mode completely; and part of the modules can be realized in the form of calling software by the processing element, and part of the modules can be realized in the form of hardware. For example, the determining module may be a processing element separately set up, or may be implemented by being integrated in a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code, and may be called by a processing element of the apparatus to execute the functions of the determining module. Other modules are implemented similarly. In addition, all or part of the modules can be integrated together or can be independently realized. The processing element described herein may be an integrated circuit having signal processing capabilities. In implementation, the steps of the above method or the above modules may be implemented by hardware integrated logic circuits in a processor element or instructions in software.
For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as one or more Application-Specific Integrated Circuits (ASICs), one or more Digital Signal Processors (DSPs), or one or more Field-Programmable Gate Arrays (FPGAs). For another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor that can call program code. As another example, these modules may be integrated together and implemented in the form of a System-on-a-Chip (SoC).
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. The procedures or functions described in accordance with the embodiments of the application occur wholly or partially upon loading and execution of the computer program instructions on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, bluetooth, microwave, etc.) means.
EXAMPLE III
A third embodiment of the present invention provides a chip system, as shown in fig. 5, which includes a processor, where the processor is coupled to a memory, and the memory stores program instructions, and when the program instructions stored in the memory are executed by the processor, the chip system implements any one of the medical entity identification methods provided in the first embodiment.
Embodiment Four
A fourth embodiment of the present invention provides a computer-readable storage medium, as shown in Fig. 6, comprising a program or instructions; when the program or instructions are run on a computer, any of the medical entity identification methods provided in the first embodiment is implemented.
Embodiment Five
A fifth embodiment of the present invention provides a computer program product comprising instructions, as shown in Fig. 7, which, when run on a computer, cause the computer to perform any of the medical entity identification methods provided in the first embodiment.
Those skilled in the art will further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, the various illustrative components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), flash memory, Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above embodiments further describe in detail the objects, technical solutions, and advantages of the present invention. It should be understood that the above embodiments are merely examples of the present invention and are not intended to limit its scope; any modifications, equivalent substitutions, improvements, and the like made within the spirit and principles of the present invention shall be included in the scope of the present invention.

Claims (10)

1. A medical entity identification method, the method comprising:
acquiring an entity to be identified, labeling the entity and an entity label through a special symbol, and constructing an output template of a text generation model according to the entity and the entity label;
constructing an input and an output of the text generation model, wherein the input is a text sequence to be recognized and a first matrix, the first matrix being obtained by preprocessing the text to be recognized, and the output is an identification result and a second matrix, the second matrix being obtained by preprocessing the identification result, the identification result being displayed according to the output template;
coding the first matrix through a coder to obtain a coded representation of a text sequence to be recognized; calculating the encoded representation by a decoder to obtain a decoded representation;
and training the text generation model according to the coding representation and the decoding representation to obtain a final decoding representation.
2. The method of claim 1, wherein the first matrix is determined according to the following method:
preprocessing the text sequence to be recognized through the pre-trained language model BART to obtain the first matrix.
3. The method according to claim 1, wherein encoding the text sequence to be recognized by the encoder to obtain the encoded representation of the text sequence to be recognized specifically comprises:
by the formula
Figure FDA0003735481790000011
Calculating the code representation of each word in the text sequence to be recognized;
wherein
Figure FDA0003735481790000012
denotes the encoded representation of the t-th word in the n-th layer.
4. The method of claim 1, wherein training the text generation model according to the encoded representation and the decoded representation to obtain the final decoded representation specifically comprises:
calculating the decoded representation of each word through a first function to obtain a generation probability;
performing a matrix transformation on the decoded representation to obtain a first matrix transformation result;
performing a matrix transformation on the encoded representation to obtain a second matrix transformation result;
calculating a copy-mechanism score according to the first matrix transformation result and the second matrix transformation result;
calculating a balance factor according to the first matrix transformation result and the second matrix transformation result;
calculating a fusion score according to the balance factor, the copy-mechanism score, and the generation probability;
determining the word corresponding to the maximum probability as the generation result according to the fusion score;
sequentially combining the generation results of all words to obtain the final decoded representation;
and extracting the identification result according to the special symbol.
5. The method of claim 4, wherein calculating the decoded representation of each word through the first function to obtain the generation probability specifically comprises:
applying a linear transformation to the decoded representation through the first function to obtain a linear transformation result;
and calculating a probability distribution according to the linear transformation result.
6. The method according to claim 1, wherein encoding the first matrix by the encoder to obtain the encoded representation of the text sequence to be recognized and calculating the encoded representation by the decoder to obtain the decoded representation specifically comprises:
the encoded representation is calculated by the formula
Figure FDA0003735481790000021
wherein
Figure FDA0003735481790000022
denotes the encoded representation of the t-th word in the n-th layer; hN denotes the top-layer encoded representation, i.e., the encoded representations of all words in the N-th layer, and vt denotes the input of the encoder at time t;
the decoded representation is calculated by the formula
Figure FDA0003735481790000023
wherein hN denotes the hidden state obtained by the encoder,
Figure FDA0003735481790000024
denotes the decoded representation of the t-th word in the n-th layer, and ut denotes the input of the decoder at time t.
7. A medical entity identification apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring an entity to be identified and labeling the entity and an entity label through a special symbol;
the output template construction module is used for constructing an output template of a text generation model according to the entity and the entity label;
an input-output construction module for constructing an input and an output of the text generation model, wherein the input is a text sequence to be recognized and a first matrix, the first matrix being obtained by preprocessing the text to be recognized, and the output is an identification result and a second matrix, the second matrix being obtained by preprocessing the identification result, the identification result being displayed according to the output template;
the coding and decoding module is used for coding the first matrix through a coder to obtain the coded representation of the text sequence to be identified; calculating the encoded representation by a decoder to obtain a decoded representation;
and the model training module is used for training the text generation model according to the coding representation and the decoding representation to obtain a final decoding representation.
8. A chip system comprising a processor coupled to a memory, the memory storing program instructions that, when executed by the processor, implement the medical entity identification method of any of claims 1-6.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, performs the medical entity identification method according to any one of claims 1-6.
10. A computer program product, characterized in that, when the computer program product is run on a computer, it causes the computer to carry out the medical entity identification method according to any one of claims 1-6.
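The generate-versus-copy fusion described in claims 4 and 5 can be illustrated numerically. The sketch below is a minimal numpy toy, not the patented implementation: the weight matrices are random stand-ins for learned parameters, the source-token vocabulary ids are arbitrary, and the sigmoid gate used for the balance factor is one plausible reading of the claim, since the patent does not publish the exact formula.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions: hidden size d, vocabulary size V, source length L.
d, V, L = 8, 20, 5

dec_t = rng.normal(size=d)        # decoded representation of the current word
enc = rng.normal(size=(L, d))     # encoded representations of the source sequence

# Hypothetical learned parameters (random stand-ins for illustration).
W_gen = rng.normal(size=(V, d))   # "first function": linear map to vocabulary logits
W_dec = rng.normal(size=(d, d))   # matrix transformation of the decoded representation
W_enc = rng.normal(size=(d, d))   # matrix transformation of the encoded representation
w_gate = rng.normal(size=2 * d)   # parameters of the assumed balance-factor gate

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Claim 5: linear transformation of the decoded representation,
# then a probability distribution -> generation probability.
p_gen = softmax(W_gen @ dec_t)

# Claim 4: copy-mechanism score from the two matrix transformation results.
q = W_dec @ dec_t                  # first matrix transformation result
k = enc @ W_enc.T                  # second matrix transformation result
copy_scores = softmax(k @ q)       # attention-style score over source positions

# Scatter the copy scores onto the vocabulary ids of the source words.
src_ids = rng.integers(0, V, size=L)
p_copy = np.zeros(V)
np.add.at(p_copy, src_ids, copy_scores)

# Balance factor: a sigmoid gate over the two transformation results
# (an assumption; the patent only says it is computed from both results).
lam = 1.0 / (1.0 + np.exp(-w_gate @ np.concatenate([q, k.mean(axis=0)])))

# Fusion score, then the word with maximum probability is the generation result.
p_fused = lam * p_gen + (1.0 - lam) * p_copy
best_word = int(np.argmax(p_fused))
```

Because both `p_gen` and the scattered `p_copy` each sum to one, the gated mixture `p_fused` remains a valid probability distribution, which is the usual motivation for this pointer-generator-style fusion.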
CN202210795182.0A 2022-07-07 2022-07-07 Medical entity identification method and device Pending CN115270792A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210795182.0A CN115270792A (en) 2022-07-07 2022-07-07 Medical entity identification method and device

Publications (1)

Publication Number Publication Date
CN115270792A true CN115270792A (en) 2022-11-01

Family

ID=83763546

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210795182.0A Pending CN115270792A (en) 2022-07-07 2022-07-07 Medical entity identification method and device

Country Status (1)

Country Link
CN (1) CN115270792A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116738985A (en) * 2023-08-11 2023-09-12 北京亚信数据有限公司 Standardized processing method and device for medical text
CN116738985B (en) * 2023-08-11 2024-01-26 北京亚信数据有限公司 Standardized processing method and device for medical text

Similar Documents

Publication Publication Date Title
CN111985239B (en) Entity identification method, entity identification device, electronic equipment and storage medium
CN110866401A (en) Chinese electronic medical record named entity identification method and system based on attention mechanism
CN112329465A (en) Named entity identification method and device and computer readable storage medium
CN111767718B (en) Chinese grammar error correction method based on weakened grammar error feature representation
CN110309511B (en) Shared representation-based multitask language analysis system and method
US20180365594A1 (en) Systems and methods for generative learning
CN114580424B (en) Labeling method and device for named entity identification of legal document
CN111428470B (en) Text continuity judgment method, text continuity judgment model training method, electronic device and readable medium
CN113887229A (en) Address information identification method and device, computer equipment and storage medium
CN114492661B (en) Text data classification method and device, computer equipment and storage medium
CN110852066A (en) Multi-language entity relation extraction method and system based on confrontation training mechanism
CN111814479A (en) Enterprise short form generation and model training method and device
CN115270792A (en) Medical entity identification method and device
CN117251545A (en) Multi-intention natural language understanding method, system, equipment and storage medium
CN114707518B (en) Semantic fragment-oriented target emotion analysis method, device, equipment and medium
CN116127978A (en) Nested named entity extraction method based on medical text
CN116362242A (en) Small sample slot value extraction method, device, equipment and storage medium
CN113033155B (en) Automatic coding method for medical concepts by combining sequence generation and hierarchical word lists
CN114925175A (en) Abstract generation method and device based on artificial intelligence, computer equipment and medium
CN114911940A (en) Text emotion recognition method and device, electronic equipment and storage medium
CN115240712A (en) Multi-mode-based emotion classification method, device, equipment and storage medium
CN114372467A (en) Named entity extraction method and device, electronic equipment and storage medium
CN117371447A (en) Named entity recognition model training method, device and storage medium
CN108921911B (en) Method for automatically converting structured picture into source code
CN117932487B (en) Risk classification model training and risk classification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination