CN109670179A - Named entity recognition method for medical record text based on iterated dilated convolutional neural networks

Info

Publication number: CN109670179A
Application number: CN201811563980.0A
Authority: CN
Inventors: 田珂珂; 印鉴; 高静
Original assignee: Guangdong Heng Electrical Information Polytron Technologies Inc; National Sun Yat Sen University
Current assignee: Guangdong Heng Electrical Information Polytron Technologies Inc; National Sun Yat Sen University
Priority date: 2018-12-20
Filing date: 2018-12-20
Publication date: 2019-04-23
Anticipated expiration: 2038-12-20
Also published as: CN109670179B

Abstract

The present invention provides a named entity recognition method for medical record text based on iterated dilated convolutional neural networks. The method performs named entity recognition on the CCKS2017 electronic medical record dataset: given a passage of Chinese electronic medical record text, it uses an iterated dilated convolutional neural network and a conditional random field as the model framework, and Chinese character radicals as features, to extract the named entities in the text, such as disease names and examination methods.

Description

Named entity recognition method for medical record text based on iterated dilated convolutional neural networks

Technical field

The present invention relates to natural language processing and related clinical fields, and more particularly to a named entity recognition method for medical record text based on iterated dilated convolutional neural networks.

Background technique

In recent years, with the development of big data and computer technology, more and more medical institutions have begun using electronic medical record systems. An electronic medical record system is specialized medical software. Hospitals use it to record patient visit information electronically, including medical history, progress notes, examination and test results, physician orders, operation records, nursing records, and so on. This information includes structured data, unstructured free text, and also figures and images.

With the development of artificial intelligence technology, many teams have begun applying it to the medical field as an auxiliary medical tool. Electronic medical records are an important kind of medical data and contain a large amount of unstructured text. Analyzing unstructured medical record text is the basis for enabling computers to understand and make use of medical records. Once medical records are structured, the relationships and probabilities among knowledge points such as symptoms, diseases, drugs, and examinations can be computed, and a knowledge graph of the medical domain can be built to further optimize the work of doctors.

One important means of structuring medical record text is named entity recognition: given a passage of medical text, extract the medical entities of specified types and assign them to predefined categories, which include symptom, body part, treatment, disease, examination item, and so on. For example, in the sentence "After comprehensive treatment, the patient's neck and shoulder pain was significantly relieved", the medical entities include "neck and shoulder" (body part) and "pain" (symptom).

Named entity recognition in the medical domain differs from that in the general domain, mainly in two respects: (1) The medical domain contains many technical terms and rare characters, such as "loratadine tablets". Current Chinese word segmentation tools cannot segment them well, which degrades subsequent recognition. (2) Some entity names are long, such as "Cerebrolysin Vial nourishing brain cell" (treatment), and some models have difficulty establishing such long-range context dependencies.

To address the first problem, since existing word segmentation tools perform poorly on medical text, we do not segment words here and instead operate directly on Chinese characters. On the one hand, this prevents segmentation errors from affecting the rest of the model; on the other hand, it reduces the model's vocabulary size and parameter count, which helps avoid overfitting. In addition, medical record text contains many characters that share particular radicals. For example, the characters for human organs such as chest, liver, spleen, and lung all contain the "月" (flesh) radical, and the characters for cancer, acute illness, hemorrhoids, and phlegm all have "疒" (sickness) as their radical and relate to diseases or symptoms. We therefore feed radicals into the model as features to alleviate problems such as rare characters. To address the second problem, we use dilated convolutional neural networks, which allow the model to read long-range context without making the convolution kernel too large. In summary, we propose a named entity recognition method for medical records based on dilated convolutional neural networks and Chinese radical features.
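As a minimal illustration of the character-level input with radical features described above, the following Python sketch pairs each character with its radical. The `RADICALS` table, the `<unk>` placeholder, and the function name `char_features` are hypothetical, and the specific characters are an assumed reading of the examples in the text; a practical system would load a complete character-to-radical dictionary.

```python
# Hypothetical, hand-written radical table covering only a few example
# characters; this is an assumption for illustration, not the patent's data.
RADICALS = {
    "胸": "月", "肝": "月", "脾": "月", "肺": "月",  # organs: flesh radical
    "癌": "疒", "痧": "疒", "痔": "疒", "痰": "疒",  # diseases or symptoms: sickness radical
}
UNKNOWN_RADICAL = "<unk>"  # placeholder for characters missing from the table


def char_features(text):
    """Split text into single characters and pair each with its radical."""
    return [(ch, RADICALS.get(ch, UNKNOWN_RADICAL)) for ch in text]


features = char_features("肝癌")
```

No word segmentation is involved: every character becomes one input position, and the radical gives the model a hint even when the character itself is rare.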

Summary of the invention

The present invention provides a named entity recognition method for medical record text, based on iterated dilated convolutional neural networks, for extracting the named entities in text.

To achieve the above technical effect, the technical solution of the present invention is as follows:

A named entity recognition method for medical record text based on iterated dilated convolutional neural networks, comprising the following steps:

S1: establish a named entity recognition model consisting of an iterated dilated convolutional neural network and a conditional random field;

S2: establish the loss function of the model;

S3: train the model and test it on the test set.

Further, the detailed process of step S1 is:

S11: build the embedding layer. Since the model processes text, and text cannot be handled by the model directly, the text must first be converted into vector representations. This is done by the embedding layer, which provides a vector representation of each character and a vector representation of its radical;

S12: build the iterated dilated convolutional neural network, which is used to extract features. The dilated convolutional neural network contains four dilated convolutional layers with dilation rates of 1, 2, 3, and 3 respectively. Each layer contains 100 convolution kernels, each of width 3. The output of the last layer is fed back into the first layer, which is what is meant by iteration; four iterations are performed in total;

S13: build the conditional random field model. The features extracted in the previous step serve as the input of the conditional random field, which outputs a label for each character to mark whether the character is part of an entity and, if so, whether it is the beginning, middle, or end of the entity and which type of entity it belongs to.
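The layer stack in S12 can be sketched in plain Python as follows. This is a simplified single-channel illustration under stated assumptions, not the patented implementation: it omits the 100 kernels per layer, nonlinearities, and learned weights, and all function names are hypothetical. It shows how dilation rates 1, 2, 3, 3 with kernel width 3, applied for 4 iterations, widen the context seen by each character.

```python
DILATIONS = [1, 2, 3, 3]  # dilation rate of each of the four layers (from S12)
KERNEL_WIDTH = 3          # width of each convolution kernel (from S12)
ITERATIONS = 4            # number of times the four-layer block is re-applied


def receptive_field(dilations, kernel_width, iterations):
    """Number of input characters that can influence one output position."""
    growth_per_block = sum((kernel_width - 1) * d for d in dilations)
    return 1 + iterations * growth_per_block


def dilated_conv1d(seq, kernel, dilation):
    """Same-length 1-D dilated convolution with zero padding (one channel)."""
    half = len(kernel) // 2
    out = []
    for t in range(len(seq)):
        total = 0.0
        for k, weight in enumerate(kernel):
            idx = t + (k - half) * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(seq):
                total += weight * seq[idx]
        out.append(total)
    return out


def iterated_dilated_block(seq, kernel, dilations, iterations):
    """Apply the same stack of dilated layers repeatedly, feeding output back in."""
    for _ in range(iterations):
        for d in dilations:
            seq = dilated_conv1d(seq, kernel, d)
    return seq


one_block = receptive_field(DILATIONS, KERNEL_WIDTH, 1)
full = receptive_field(DILATIONS, KERNEL_WIDTH, ITERATIONS)
```

Under these assumptions, one pass through the four layers covers 19 characters and four iterations cover 73, which illustrates how long-range context is reached without enlarging the convolution kernel.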

Further, the detailed process of step S2 is:

S21: the loss function is given by the negative log-likelihood function, where the likelihood equals the score of the predicted label sequence divided by the scores of all possible label sequences:

$$p(y \mid x) = \frac{e^{s(x, y)}}{\sum_{y'} e^{s(x, y')}}$$

where s(x, y) is the score function;

S22: the score is computed in two parts: (1) the transition score A_{i,j} between labels, i.e., the score of transitioning from label i to label j; (2) the label score P_{m,n} of a character, i.e., the score of label n given character m. That is:

$$s(x, y) = \sum_{m=1}^{M-1} A_{y_m, y_{m+1}} + \sum_{m=1}^{M} P_{m, y_m}$$

where M is the number of characters in x and y_m is the label of the m-th character.
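A small sketch of S21 and S22, taking the score s(x, y) to be the sum of the per-character label scores P and the label-to-label transition scores A as described. The toy matrices, the function names, and the exhaustive enumeration of label sequences are illustrative assumptions; enumeration is only feasible for very short sequences, and practical implementations compute the normalizer with dynamic programming.

```python
import math
from itertools import product


def sequence_score(emissions, transitions, tags):
    """s(x, y): per-character label scores plus label-to-label transition scores."""
    score = sum(emissions[m][tag] for m, tag in enumerate(tags))
    score += sum(transitions[a][b] for a, b in zip(tags, tags[1:]))
    return score


def negative_log_likelihood(emissions, transitions, gold_tags):
    """-log p(y | x), normalizing over every possible label sequence."""
    num_tags = len(transitions)
    all_scores = [
        sequence_score(emissions, transitions, candidate)
        for candidate in product(range(num_tags), repeat=len(emissions))
    ]
    peak = max(all_scores)  # subtract the maximum for numerical stability
    log_z = peak + math.log(sum(math.exp(s - peak) for s in all_scores))
    return log_z - sequence_score(emissions, transitions, gold_tags)


# Toy example (made-up numbers): 3 characters, 2 labels.
P = [[2.0, 0.5], [0.3, 1.5], [1.0, 1.0]]  # P[m][n]: score of label n for character m
A = [[0.5, -0.2], [0.1, 0.8]]             # A[i][j]: score of moving from label i to j
loss = negative_log_likelihood(P, A, [0, 1, 1])
```

Minimizing this loss raises the score of the annotated label sequence relative to all competing sequences.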

Further, the detailed process of step S3 is as follows:

S31: split the input text into single characters and obtain the radical of each character. Obtain the vector representations of the characters and radicals through the embedding layer, input them into the iterated dilated convolutional neural network to extract features, and input the extracted features into the conditional random field to obtain the final labels;

S32: compare the predicted labels with the gold-standard labels, compute the loss function as described in step S2, and use the Adam optimizer to minimize the loss function and update the model parameters;

S33: divide the dataset into a training set and a test set at a ratio of 9:1. Train for 100 iterations on the training set; the loss value stabilizes after 50 iterations. Save the trained model and evaluate it on the test set. During testing, a corresponding label is output for each character of each test sample;

S34: repeat S31 to S33 to perform 5 rounds of cross-validation on the test set. Use metrics such as precision, recall, and F1 score to measure the model's performance, and take the average of the 5 rounds as the final result.
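The evaluation protocol in S33 and S34 can be sketched as below. The five rounds are interpreted here as five independent random 9:1 splits whose scores are averaged, which is an assumption; a strict 5-fold cross-validation would partition the data differently. The callback `train_and_evaluate` and both function names are hypothetical stand-ins for model training and scoring.

```python
import random
from statistics import mean


def split_dataset(samples, train_parts=9, test_parts=1, seed=0):
    """Shuffle the samples and split them train:test at the given ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]


def average_over_runs(samples, train_and_evaluate, runs=5):
    """Repeat split, train, and test `runs` times and average the metric."""
    scores = []
    for run in range(runs):
        train, test = split_dataset(samples, seed=run)
        scores.append(train_and_evaluate(train, test))
    return mean(scores)


# Usage with a dummy evaluator that just reports the test-set fraction.
data = list(range(100))
train, test = split_dataset(data)
averaged = average_over_runs(data, lambda tr, te: len(te) / len(data))
```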

Compared with the prior art, the beneficial effects of the technical solution of the present invention are:

The present invention provides a named entity recognition method for medical record text based on iterated dilated convolutional neural networks. The invention performs named entity recognition on the CCKS2017 electronic medical record dataset: given a passage of Chinese electronic medical record text, it uses an iterated dilated convolutional neural network and a conditional random field as the model framework, and Chinese character radicals as features, to extract the named entities in the text, such as disease names and examination methods.

Brief description of the drawings

Fig. 1 is a flow diagram of the present invention;

Fig. 2 is a schematic diagram of the algorithm structure in Embodiment 1.

Detailed description of the embodiments

The drawings are for illustrative purposes only and shall not be construed as limiting this patent;

To better illustrate this embodiment, certain components in the drawings may be omitted, enlarged, or reduced, and do not represent the dimensions of the actual product;

Those skilled in the art will understand that certain well-known structures and their descriptions may be omitted from the drawings.

The technical solution of the present invention is further described below with reference to the drawings and embodiments.

Embodiment 1

As shown in Fig. 1, a named entity recognition method for medical record text based on iterated dilated convolutional neural networks comprises the following steps:

S1: establish a named entity recognition model consisting of an iterated dilated convolutional neural network and a conditional random field;

S2: establish the loss function of the model;

S3: train the model and test it on the test set.

The detailed processes of steps S1, S2, and S3 are as set out in the Summary of the invention.

The present invention addresses named entity recognition in medical record text. The dataset we use is the CCKS2017 Chinese electronic medical record entity recognition dataset, released by the China Conference on Knowledge Graph and Semantic Computing. The dataset mainly contains text related to the medical domain. The entity categories involved are: organ (body), symptom (symptom), examination (check), disease (disease), and treatment method (treatment). The distribution of each entity category in the dataset is shown in Table 1. The dataset is annotated in the BIOES scheme, i.e., Begin (start of an entity), Intermediate (middle of an entity), End (end of an entity), Single (a single character that is itself an entity), and O (not an entity). For example, "abdominal pain sensation disappeared" is labeled "B-body, E-body, B-symptom, I-symptom, E-symptom, O, O". From this annotation we can tell that "abdomen" is an entity of type "organ", "pain sensation" is an entity of type "symptom", and "disappeared" is not an entity.
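To make the BIOES scheme concrete, the sketch below turns a labeled character sequence back into entities. The function name `decode_bioes` and the handling of malformed label sequences are assumptions, and the Chinese sentence is an assumed original of the translated example; the label sequence itself is the one given in the text.

```python
def decode_bioes(chars, tags):
    """Collect (entity_text, entity_type) pairs from BIOES-labeled characters."""
    entities, buffer, current = [], [], None
    for ch, tag in zip(chars, tags):
        if tag == "O":
            buffer, current = [], None
            continue
        prefix, etype = tag.split("-", 1)
        if prefix == "S":                       # single-character entity
            entities.append((ch, etype))
            buffer, current = [], None
        elif prefix == "B":                     # start a new entity
            buffer, current = [ch], etype
        elif prefix in ("I", "E") and current == etype:
            buffer.append(ch)
            if prefix == "E":                   # entity is complete
                entities.append(("".join(buffer), etype))
                buffer, current = [], None
        else:                                   # malformed continuation: drop it
            buffer, current = [], None
    return entities


chars = list("腹部疼痛感消失")  # assumed original of "abdominal pain sensation disappeared"
tags = ["B-body", "E-body", "B-symptom", "I-symptom", "E-symptom", "O", "O"]
entities = decode_bioes(chars, tags)
```

Here `entities` contains one "body" entity spanning the first two characters and one "symptom" entity spanning the next three, matching the reading of the annotation given in the text.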

Table 1. Distribution of entities in the training set

Among existing methods, the best-performing approach combines word vectors with a bidirectional long short-term memory network (BiLSTM) and a conditional random field (CRF). The BiLSTM is used to understand the input sentence and extract features, and the CRF is used to produce labels. In medical record text, however, the language differs from general usage: it is typically concise and rigorous, with little dependence on context, so a long short-term memory network is not well suited here. Word vectors are also unsuitable, because medical record text contains many technical terms that are very difficult to segment, and segmentation errors propagate through the model. We therefore propose a model that combines character vectors with an iterated dilated convolutional neural network and a conditional random field.

The specific steps of the method are as follows. First, obtain the vector representation of each character. Next, input the vector representations into the dilated convolutional neural network; after four dilated convolutional layers, a feature vector representation of each character is obtained. Then input these vector representations into the conditional random field, which outputs the labels. The details are as follows:

1. First read in the CCKS2017 dataset. In the dataset, each line of text contains two parts: a character and its corresponding label. Blank lines separate different training samples. After the samples are randomly shuffled, the dataset is divided into a training set and a test set at a ratio of 8:1.

2. Build the model, which consists of three parts: character vectors, the dilated convolutional neural network, and the conditional random field. The character vectors provide a distributed representation of each character, the dilated convolutional neural network extracts the features of each character, and the conditional random field is used to score label sequences during training and to output the highest-scoring label sequence during testing. The character vector module uses vectors pre-trained on an external corpus.
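At test time the conditional random field outputs the highest-scoring label sequence. A common way to find it is Viterbi decoding, sketched below; the text does not name the decoding algorithm, so its use here is an assumption, as are the function name and the toy score matrices.

```python
def viterbi_decode(emissions, transitions):
    """Return (best_score, best_tags) maximizing label plus transition scores."""
    num_tags = len(transitions)
    best = list(emissions[0])       # best score of any path ending in each label
    backpointers = []
    for scores in emissions[1:]:
        step_best, step_back = [], []
        for j in range(num_tags):
            # Best previous label to arrive at label j from.
            prev = max(range(num_tags), key=lambda i: best[i] + transitions[i][j])
            step_back.append(prev)
            step_best.append(best[prev] + transitions[prev][j] + scores[j])
        best = step_best
        backpointers.append(step_back)
    last = max(range(num_tags), key=lambda j: best[j])
    path = [last]
    for step_back in reversed(backpointers):    # follow pointers back to the start
        path.append(step_back[path[-1]])
    path.reverse()
    return best[last], path


# Toy example (made-up numbers): 3 characters, 2 labels.
emissions = [[2.0, 0.5], [0.3, 1.5], [1.0, 1.0]]  # score of each label per character
transitions = [[0.5, -0.2], [0.1, 0.8]]           # score of moving from label i to j
best_score, best_tags = viterbi_decode(emissions, transitions)
```

The search costs time proportional to the sequence length times the square of the number of labels, instead of growing exponentially with sequence length.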

3. Taking every 32 samples of the training set as one batch, obtain their vector representations through the character vector module and input them into the model for training. To avoid overfitting, we add a dropout layer after the module, which deactivates the vectors with a certain probability (set to 0.5 here). The training objective is to minimize the negative log-likelihood function, i.e.:

$$L = -\log p(y \mid x)$$

where:

$$p(y \mid x) = \frac{e^{s(x, y)}}{\sum_{y'} e^{s(x, y')}}$$

and s(x, y) is the score function.
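The batching and dropout in step 3 can be sketched as follows. The function names are hypothetical, and the inverted-dropout rescaling (dividing kept values by the keep probability) is an assumption; the text only states the drop probability of 0.5.

```python
import random


def batches(samples, batch_size=32):
    """Yield consecutive batches of at most `batch_size` samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


def dropout(vector, rate=0.5, rng=random, training=True):
    """Zero each element with probability `rate` and rescale the survivors."""
    if not training:
        return list(vector)     # dropout is disabled at test time
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in vector]


batch_sizes = [len(b) for b in batches(list(range(70)))]
dropped = dropout([1.0] * 1000, rate=0.5, rng=random.Random(0))
```

With a rate of 0.5, roughly half of the elements are zeroed on each training pass, which discourages the model from relying on any single input dimension.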

4. Repeat step 3 for a total of 100 epochs. After training is complete, save the model parameters to a local file. Read the test set data, predict the named entities on the test set, compare them against the annotated data, and measure model performance. The evaluation metric is the F1 score, defined as follows:

F1 = (2 × precision × recall) / (precision + recall)

Precision = (number of correctly predicted entities among the predicted entities) / (number of predicted entities)

Recall = (number of annotated entities that were correctly predicted) / (number of annotated entities)
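The three metrics above can be computed at the entity level as in this sketch. Representing each entity as a (start, end, type) tuple and counting only exact matches as correct are assumptions, as are the function name and the toy data.

```python
def precision_recall_f1(predicted, gold):
    """Entity-level scores; each entity is a hashable item such as (start, end, type)."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)     # entities predicted exactly right
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy data: three annotated entities, two predictions, one exact match.
gold = {(0, 2, "body"), (2, 5, "symptom"), (9, 11, "disease")}
pred = {(0, 2, "body"), (2, 4, "symptom")}
precision, recall, f1 = precision_recall_f1(pred, gold)
```

In this toy case the second prediction has the right type but the wrong boundary, so it counts as an error for both precision and recall.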

To demonstrate the effectiveness of our model, we selected two other models for comparison. The first is the bidirectional long short-term memory network with a conditional random field (BiLSTM+CRF), a classic model in the field of named entity recognition that has achieved good results on many datasets. The second is HITSZ_CNER, the champion model of the CCKS2017 competition, i.e., the best-performing model in that competition.

The test results are shown in Table 2. We compared our model (IDCNN+CRF) with the earlier models. Overall, our model achieves a considerable improvement on named entity recognition in electronic medical record text, with notable improvements on individual entity categories as well. We also compared the model with radical features (with feat) against the model without them (no feat); the comparison shows that the radical features we introduce clearly improve model performance. Starting from the characteristics of medical record text, the present invention makes appropriate use of character vectors, radical features, and dilated convolutional neural networks to better identify medical entities.

The specific structure of the present invention is shown in Fig. 2.

The positional relationships described in the drawings are for illustration only and shall not be construed as limiting this patent;

Table 2. Comparison of different models (F1 score, %).

Entity type	BiLSTM+CRF	HITSZ_CNER	IDCNN+CRF w/o feat.	IDCNN+CRF
Organ	88.10	87.42	87.10	87.56
Symptom	95.73	96.34	95.16	96.94
Disease	77.45	78.60	79.57	80.14
Examination	95.69	94.36	96.02	96.11
Treatment method	72.71	78.92	75.10	75.74
Overall	90.82	91.08	91.62	92.53

The same or similar reference numerals correspond to the same or similar components;

Obviously, the above embodiment of the present invention is merely an example given to illustrate the invention clearly and does not limit its embodiments. Those of ordinary skill in the art can make other changes or variations in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.

Claims

1. A named entity recognition method for medical record text based on iterated dilated convolutional neural networks, characterized in that it comprises the following steps:

S1: establish a named entity recognition model consisting of an iterated dilated convolutional neural network and a conditional random field;

S2: establish the loss function of the model;

S3: train the model and test it on the test set.

2. The named entity recognition method for medical record text based on iterated dilated convolutional neural networks according to claim 1, characterized in that the detailed process of step S1 is:

S11: build the embedding layer. Since the model processes text, and text cannot be handled by the model directly, the text must first be converted into vector representations. This is done by the embedding layer, which provides a vector representation of each character and a vector representation of its radical;

S12: build the iterated dilated convolutional neural network, which is used to extract features. The dilated convolutional neural network contains four dilated convolutional layers with dilation rates of 1, 2, 3, and 3 respectively. Each layer contains 100 convolution kernels, each of width 3. The output of the last layer is fed back into the first layer, which is what is meant by iteration; four iterations are performed in total;

S13: build the conditional random field model. The features extracted in the previous step serve as the input of the conditional random field, which outputs a label for each character to mark whether the character is part of an entity and, if so, whether it is the beginning, middle, or end of the entity and which type of entity it belongs to.

3. The named entity recognition method for medical record text based on iterated dilated convolutional neural networks according to claim 2, characterized in that the detailed process of step S2 is:

S21: the loss function is given by the negative log-likelihood function, where the likelihood equals the score of the predicted label sequence divided by the scores of all possible label sequences:

$$p(y \mid x) = \frac{e^{s(x, y)}}{\sum_{y'} e^{s(x, y')}}$$

where s(x, y) is the score function;

S22: the score is computed in two parts: (1) the transition score A_{i,j} between labels, i.e., the score of transitioning from label i to label j; (2) the label score P_{m,n} of a character, i.e., the score of label n given character m. That is:

$$s(x, y) = \sum_{m=1}^{M-1} A_{y_m, y_{m+1}} + \sum_{m=1}^{M} P_{m, y_m}$$

where M is the number of characters in x and y_m is the label of the m-th character.

4. The named entity recognition method for medical record text based on iterated dilated convolutional neural networks according to claim 3, characterized in that the detailed process of step S3 is as follows:

S31: split the input text into single characters and obtain the radical of each character. Obtain the vector representations of the characters and radicals through the embedding layer, input them into the iterated dilated convolutional neural network to extract features, and input the extracted features into the conditional random field to obtain the final labels;

S32: compare the predicted labels with the gold-standard labels, compute the loss function as described in step S2, and use the Adam optimizer to minimize the loss function and update the model parameters;

S33: divide the dataset into a training set and a test set at a ratio of 9:1. Train for 100 iterations on the training set; the loss value stabilizes after 50 iterations. Save the trained model and evaluate it on the test set. During testing, a corresponding label is output for each character of each test sample;

S34: repeat S31 to S33 to perform 5 rounds of cross-validation on the test set. Use metrics such as precision, recall, and F1 score to measure the model's performance, and take the average of the 5 rounds as the final result.