WO2020135337A1 - 实体语义关系分类 - Google Patents

实体语义关系分类 Download PDF

Info

Publication number
WO2020135337A1
Authority
WO
WIPO (PCT)
Prior art keywords
entity
semantic relationship
training sample
vector corresponding
text
Prior art date
Application number
PCT/CN2019/127449
Other languages
English (en)
French (fr)
Inventor
樊芳利
Original Assignee
新华三大数据技术有限公司
Priority date
Filing date
Publication date
Application filed by 新华三大数据技术有限公司
Priority to JP2021534922A (granted as JP7202465B2)
Priority to US17/287,476 (published as US20210391080A1)
Priority to EP19901450.7A (published as EP3985559A4)
Publication of WO2020135337A1

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 10/00 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H 10/60 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/22 - Matching criteria, e.g. proximity measures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/279 - Recognition of textual entities
    • G06F 40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295 - Named entity recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/30 - Semantic analysis
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 50/00 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H 50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 50/00 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H 50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Definitions

  • Deep learning is a representation learning method within machine learning. In practical applications of deep learning, a deep learning model needs to be trained in advance.
  • The sample data used to train the deep learning model includes feature data of multiple dimensions, and the deep learning model is trained continuously on the sample data to obtain an accurate prediction model.
  • The prediction model is then used to perform data prediction operations online.
  • FIG. 1 is a schematic structural block diagram of an electronic device provided by an embodiment of the present application;
  • FIG. 2 is a schematic structural diagram of an entity semantic relationship classification model provided by an embodiment of the present application;
  • FIG. 3 is a schematic flowchart of a training method for an entity semantic relationship classification model provided by an embodiment of the present application;
  • FIG. 4 is a schematic flowchart of the sub-steps of S204 in FIG. 3;
  • FIG. 5 is another schematic flowchart of the sub-steps of S204 in FIG. 3;
  • FIG. 6 is a schematic flowchart of an entity semantic relationship classification method provided by an embodiment of the present application;
  • FIG. 7 is a schematic flowchart of the sub-steps of S303 in FIG. 6;
  • FIG. 8 is a schematic flowchart of the sub-steps of S3031 in FIG. 7;
  • FIG. 9 is a schematic structural diagram of an entity semantic relationship classification model training device provided by an embodiment of the present application;
  • FIG. 10 is a schematic structural diagram of an entity semantic relationship classification device provided by an embodiment of the present application.
  • In natural language processing, a deep learning model can be used to further mine text information on the basis of entity recognition, thereby turning unstructured sentences into structured information.
  • Here, an entity is a naming reference item, such as the name of a person, place, device, or disease.
  • To classify entity semantic relationships, a classification method based on a neural network model is generally used.
  • Specifically, a large amount of corpus whose entity semantic relationships have already been classified is used as the input of the neural network model to train it, and the trained neural network model is then used to classify the entity semantic relationships of new corpus.
  • RNTN: Recursive Neural Tensor Network
  • PCNN: Pulse Coupled Neural Network
  • EMR: Electronic Medical Record
  • FIG. 1 is a schematic structural block diagram of an electronic device 100 according to an embodiment of the present application.
  • The electronic device 100 may be, but is not limited to, a server, a personal computer (PC), a tablet computer, a smartphone, a personal digital assistant (PDA), and so on.
  • the electronic device 100 includes a memory 101, a processor 102, and a communication interface 103.
  • the memory 101, the processor 102, and the communication interface 103 are directly or indirectly electrically connected to each other to implement data transmission or interaction.
  • the memory 101, the processor 102, and the communication interface 103 may be electrically connected to each other through one or more communication buses or signal lines.
  • The memory 101 may be used to store program instructions and modules, such as the program instructions/modules corresponding to the entity semantic relationship classification model training device 400 and to the entity semantic relationship classification device 500 provided in the embodiments of the present application. The processor 102 executes the program instructions and modules stored in the memory 101 to perform various functional applications and data processing.
  • the communication interface 103 can be used for signaling or data communication with other node devices.
  • The memory 101 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), etc.
  • RAM: Random Access Memory
  • ROM: Read-Only Memory
  • PROM: Programmable Read-Only Memory
  • EPROM: Erasable Programmable Read-Only Memory
  • EEPROM: Electrically Erasable Programmable Read-Only Memory
  • The processor 102 may be an integrated circuit chip with signal processing capabilities.
  • The processor 102 may be a general-purpose processor, including but not limited to a central processing unit (CPU), a digital signal processor (DSP), or a neural-network processing unit (NPU).
  • The processor 102 may also be an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, etc.
  • ASIC: Application-Specific Integrated Circuit
  • FPGA: Field-Programmable Gate Array
  • FIG. 1 is only an illustration; the electronic device 100 may include more or fewer components than shown in FIG. 1, or have a configuration different from that shown in FIG. 1.
  • Each component shown in FIG. 1 may be implemented by hardware, software, or a combination thereof.
  • The following describes a training method for an entity semantic relationship classification model provided by the embodiments of the present application, taking mining entities in a corpus and classifying entity semantic relationships as an example, so that the trained entity semantic relationship classification model can be used to classify the semantic relationships of entities in the corpus.
  • FIG. 2 is a schematic structural diagram of an entity semantic relationship classification model provided by an embodiment of the present application.
  • The entity semantic relationship classification model may be a model combining a bidirectional gated recurrent unit (BiGRU) network with an attention mechanism.
  • BiGRU: Bidirectional Gated Recurrent Unit
  • That is, the entity semantic relationship classification model adds an attention layer before the output layer of the BiGRU model.
  • In an example, the number of GRU layers is set to 1 and the number of neurons in the GRU layer is set to 230.
  • The user can also set the number of GRU layers to 2 or another value according to actual needs, and correspondingly set the number of neurons in the GRU layer to another number.
  • The embodiments of the present application only provide one possible implementation and do not limit the specific numbers.
  • In an example, the dropout parameter of the BiGRU+Attention model is set to 1; that is, no neuron of the GRU layer in the BiGRU+Attention model is discarded during training.
  • The user can also set the dropout parameter to another value as needed, so that some neurons in the GRU layer are deactivated.
  • The entity semantic relationship classification model may also adopt other models, such as a GRU model.
  • The embodiments of the present application are described based on the BiGRU+Attention model, of which a minimal code sketch follows.
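  • The following is a minimal sketch, in Keras, of the BiGRU+Attention configuration just described: one bidirectional GRU layer with 230 neurons per direction, dropout disabled, and an attention layer before the output layer. The 70*108 input shape is taken from the example later in this description, while NUM_CLASSES and the exact additive form of the attention layer are illustrative assumptions rather than details given in the application.

```python
# Sketch of the BiGRU+Attention classifier described above.
# Layer sizes follow the text; NUM_CLASSES and the attention form are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

SEQ_LEN, FEAT_DIM = 70, 108   # 70 characters, 108-element feature vectors
NUM_CLASSES = 10              # hypothetical number of relationship types

inputs = layers.Input(shape=(SEQ_LEN, FEAT_DIM))
# One bidirectional GRU layer, 230 neurons per direction, no dropout
# (the text sets the dropout parameter so that no neuron is discarded).
h = layers.Bidirectional(layers.GRU(230, return_sequences=True, dropout=0.0))(inputs)
# Attention layer before the output layer: score each time step,
# softmax the scores, and pool the GRU outputs into one weighted vector.
scores = layers.Dense(1, activation="tanh")(h)                 # (batch, 70, 1)
weights = layers.Softmax(axis=1)(scores)                       # attention weights
context = layers.Lambda(lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([h, weights])
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(context)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy")
```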
  • FIG. 3 is a schematic flowchart of an entity semantic relationship classification model training method provided by an embodiment of the present application.
  • The entity semantic relationship classification model training method is applied to the electronic device 100 shown in FIG. 1.
  • The electronic device 100 uses the entity semantic relationship classification model to classify the semantic relationships of entities in the corpus.
  • The training method of the entity semantic relationship classification model includes the following steps:
  • S201: Receive at least one training sample, and identify the first entity and the second entity of each training sample in the at least one training sample.
  • The training samples may be received several at a time or one at a time; the user may adjust this according to actual needs in practice.
  • S202: Obtain the first position distance between each character in each training sample and the corresponding first entity, and the second position distance between each character and the corresponding second entity.
  • That is, for each training sample, obtain the first position distance between each character in the training sample and the first entity of the training sample, and obtain the second position distance between each character in the training sample and the second entity of the training sample.
  • S203: Combine the feature vectors corresponding to all characters in each training sample to obtain the model input vector corresponding to each training sample.
  • That is, for each training sample, combine the feature vectors corresponding to all characters in the training sample to obtain the model input vector corresponding to the training sample.
  • The feature vector corresponding to each character is obtained by combining the word vector corresponding to the character with its position embedding vector.
  • The position embedding vector corresponding to each character includes the vector corresponding to the character's first position distance and the vector corresponding to the character's second position distance.
  • S204: Use the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model to train the entity semantic relationship classification model.
  • In an example, the training parameters can be set as follows: batch_size is set to 50 and epochs is set to 100, and the model is saved after training.
  • These training parameters mean that 50 training samples are used in each pass to train the entity semantic relationship classification model, each training sample is used 100 times, and the entity semantic relationship classification model is saved after the 100 uses are complete.
  • When using each batch of training samples to train the entity semantic relationship classification model, the electronic device needs to receive 50 training samples and identify the first entity and the second entity in each training sample; the first entity and the second entity in each training sample constitute an entity pair.
  • As a possible implementation, each entity in the training sample has an entity identifier, and the electronic device recognizes the entity identifiers in each training sample to obtain the first entity and the second entity of each training sample.
  • For example, a training sample is: there is no <e1>bulge</e1> or depression in the precordial area, apical beats are normal, there is no pericardial friction rub, the relative cardiac dullness border is normal, <e2>heart rate</e2> is 70 beats/min, the rhythm is regular, and no pathological murmur is heard in any heart valve auscultation area.
  • When receiving this training sample, the electronic device recognizes <e1></e1> and obtains the first entity "bulge", whose type is symptom, and recognizes <e2></e2> and obtains the second entity "heart rate", whose type is test.
  • Other methods may also be used to identify the first entity and the second entity in the training sample. For example, an entity library may be preset, with multiple entities stored in it, so that the training sample is matched against the index of the preset entity library to obtain the first entity and the second entity. A sketch of the identifier-based approach follows.
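  • As an illustration of the identifier-based approach, the sketch below extracts the tagged entities from a training sample with a regular expression. The <e1>/<e2> tag names follow the example above; the sample string is an abbreviated stand-in.

```python
import re

def extract_entities(sample: str):
    """Return (first_entity, second_entity) from <e1>/<e2> entity identifiers."""
    e1 = re.search(r"<e1>(.*?)</e1>", sample)
    e2 = re.search(r"<e2>(.*?)</e2>", sample)
    return (e1.group(1) if e1 else None,
            e2.group(1) if e2 else None)

sample = "there is no <e1>bulge</e1> or depression ... <e2>heart rate</e2> is 70 beats/min ..."
print(extract_entities(sample))  # ('bulge', 'heart rate')
```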
  • Each character in a single training sample has a different position relative to the first entity and the second entity, and therefore each character also contributes differently to recognizing the entity semantic relationship type of the first entity and the second entity.
  • For this reason, for each character in each training sample, the first position distance from the corresponding first entity and the second position distance from the corresponding second entity are obtained.
  • An entity is generally composed of multiple characters, such as "bulge" and "heart rate" in the above example, so the first character of the entity is used as the reference point when calculating the first position distance and the second position distance.
  • Two different characters may then have the same position distance relative to the same entity; the difference is that one character precedes the entity in reading order while the other follows it. For example, in the above training sample, the character "zone" in "precordial zone" and the character "concave" in "depression" are both at a position distance of 2 from the first entity "bulge".
  • Positive and negative values can therefore be used to distinguish the direction of a position distance: the position distance of a character that precedes the entity in reading order is represented by a negative value, and the position distance of a character that follows the entity is represented by a positive value.
  • Since the character "zone" in "precordial zone" precedes the first entity "bulge" in reading order and the character "concave" in "depression" follows it, the first position distance of "zone" is -2 and the first position distance of "concave" is 2, as illustrated in the sketch below.
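  • A minimal sketch of the signed position distance just described: distances are measured to the entity's first character, negative before the entity and positive after it. The short character sequence used here is an illustrative stand-in for a full training sample.

```python
def position_distances(chars, entity_start):
    """Signed distance from each character to the entity's first character:
    negative if the character precedes the entity, positive if it follows."""
    return [i - entity_start for i in range(len(chars))]

chars = list("心前区无隆起凹陷")     # toy sequence: "precordial zone, no bulge, depression"
e1_start = chars.index("隆")         # first character of the entity "bulge" (隆起)
print(position_distances(chars, e1_start))
# [-4, -3, -2, -1, 0, 1, 2, 3]: "zone" (区) is at -2, "concave" (凹) is at +2
```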
  • For example, suppose the word vector obtained after vectorizing the character "front" is [0.1, 0.5, 0.4, 0.3], the vector corresponding to the character's first position distance is [0.4, 0.6], and the vector corresponding to its second position distance is [0.6, 0.4].
  • The feature vector corresponding to the character "front" is then [0.1, 0.5, 0.4, 0.3, 0.4, 0.6, 0.6, 0.4].
  • The 4-dimensional word vector [0.1, 0.5, 0.4, 0.3] above is only an illustration. In some other implementations of the embodiments of the present application, a word vector table with a different number of dimensions may be pre-stored in the electronic device, so that vectorizing the character "front" yields a word vector with that number of dimensions; for example, with a pre-stored word vector table of 100 dimensions, vectorizing the character "front" yields a 100-dimensional word vector.
  • Likewise, the 2-dimensional vectors used above for the first position distance and the second position distance are only schematic illustrations. In some other implementations of the embodiments of the present application, a pre-stored position embedding vector table with a different number of dimensions may be used, for example a position embedding vector table with 4 dimensions. The combination step is sketched below.
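  • The combination just described can be sketched as follows, using the illustrative 4-dimensional word vector and 2-dimensional position embedding vectors from the example; in practice these would come from pre-stored lookup tables (e.g., 100-dimensional word vectors and 4-dimensional position embeddings).

```python
import numpy as np

# Illustrative lookup results for the character "front" (values from the example).
word_vec = np.array([0.1, 0.5, 0.4, 0.3])  # word vector of the character
pos1_vec = np.array([0.4, 0.6])            # embedding of the first position distance
pos2_vec = np.array([0.6, 0.4])            # embedding of the second position distance

# Feature vector = word vector concatenated with the two position embedding vectors.
feature_vec = np.concatenate([word_vec, pos1_vec, pos2_vec])
print(feature_vec)  # [0.1 0.5 0.4 0.3 0.4 0.6 0.6 0.4]
```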
  • As a possible implementation, the model input vector corresponding to each training sample may be recorded as a two-dimensional matrix.
  • For example, the feature vector corresponding to the character "heart" in "precordial zone" is used as the first row of the model input vector, the feature vector corresponding to the character "front" in "precordial zone" is used as the second row, and so on, until the feature vectors corresponding to all characters in the training sample have been combined into the model input vector corresponding to the training sample.
  • the model input vector corresponding to each training sample is used as the input of the entity semantic relationship classification model to train the entity semantic relationship classification model.
  • The model input vectors corresponding to multiple training samples can be used together as the input of the entity semantic relationship classification model; for example, with batch_size set to 50 as above, the input of the entity semantic relationship classification model is the model input vectors corresponding to 50 training samples.
  • It should be noted that every model input vector has the same number of dimensions. For example, with a word vector of 100 dimensions and position embedding vectors of 4 dimensions, if the model input vector of training sample 1 has dimensions 70*108, then the model input vector of training sample 2 also has dimensions 70*108.
  • Here, 70 means that the training sample contains 70 characters, and 108 means that the feature vector corresponding to each character contains 108 elements.
  • These 108 elements comprise the 100 elements of the word vector, the 4 elements of the position embedding vector corresponding to the first position distance, and the 4 elements of the position embedding vector corresponding to the second position distance.
  • Suppose training sample 1 contains 60 characters, training sample 2 contains 70 characters, and training sample 3 contains 73 characters.
  • The model input vectors corresponding to the multiple training samples then need to be unified, that is, brought to the same number of dimensions.
  • For training sample 1, the part short of 70 characters can be filled with a preset vector, such as the zero vector, to obtain the 70*108-dimensional model input vector corresponding to training sample 1. For training sample 3, combining the feature vectors of each character yields a 73*108-dimensional vector; since 73 > 70, the part of training sample 3 exceeding 70 characters can be removed, retaining only the feature vectors corresponding to the first 70 characters in reading order, to obtain the 70*108-dimensional model input vector corresponding to training sample 3. A sketch of this unification step follows.
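  • A minimal sketch of the unification step, assuming a target length of 70 characters: shorter samples are padded with a preset vector (the zero vector here) and longer samples are truncated to the first 70 characters in reading order.

```python
import numpy as np

def unify(feature_vectors: np.ndarray, target_len: int = 70) -> np.ndarray:
    """Pad with zero vectors or truncate so every sample is target_len rows."""
    n, dim = feature_vectors.shape
    if n < target_len:                       # e.g. training sample 1: 60 -> 70 rows
        pad = np.zeros((target_len - n, dim))
        return np.vstack([feature_vectors, pad])
    return feature_vectors[:target_len]      # e.g. training sample 3: 73 -> first 70 rows

sample1 = np.random.rand(60, 108)            # 60 characters, 108-element features
sample3 = np.random.rand(73, 108)            # 73 characters
print(unify(sample1).shape, unify(sample3).shape)  # (70, 108) (70, 108)
```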
  • The above training samples may be electronic medical records, and the model input vector corresponding to a training sample is a combination of n feature vectors.
  • Setting the dimensions of the model input vector to 70*108 means that the model input vector contains the feature vectors corresponding to 70 characters, each feature vector having 108 dimensions.
  • Here, n is the average number of characters contained in the at least one electronic medical record (that is, the above training samples). For example, if 50 electronic medical records are used as training samples in total, and the 50 electronic medical records contain 70 characters on average, then n equals 70.
  • Alternatively, n may be set to a fixed value, for example 100.
  • Corpora other than electronic medical records may also be used as training samples, for example intelligent customer service dialogues or consultation information.
  • When a single training sample is used, the model input vector corresponding to the single training sample is likewise used as the input of the entity semantic relationship classification model.
  • That is, batch_size is set to 1, and the input of the entity semantic relationship classification model is the model input vector corresponding to 1 training sample.
  • As a possible implementation, after the input layer of the model obtains a 70*108-dimensional model input vector and the feature embedding layer preprocesses it, the result is input to the GRU layer for calculation. The GRU layer outputs 108 predicted entity semantic relationship types to the attention layer; the attention layer calculates a probability value for each predicted entity semantic relationship type from the 108 predicted types, and uses the entity semantic relationship type with the largest probability value as the entity semantic relationship type corresponding to the training sample.
  • In this way, the entity semantic relationship classification model gives a predicted entity semantic relationship type for each training sample. For example, if the model input vectors corresponding to 50 training samples are input as above, the entity semantic relationship classification model obtains the predicted entity semantic relationship type corresponding to each of the 50 training samples.
  • FIG. 4 is a schematic flowchart of the sub-steps of S204 in FIG. 3.
  • S204 includes the following sub-steps:
  • Step S2041: Use the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model, and obtain the predicted entity semantic relationship type corresponding to each training sample given by the entity semantic relationship classification model. The predicted entity semantic relationship type is the predicted type of the entity semantic relationship between the first entity and the second entity in each training sample.
  • S2042: Obtain the deviation value between the predicted entity semantic relationship type corresponding to each training sample and the preset entity semantic relationship type for that training sample. The preset entity semantic relationship type is the pre-stored entity semantic relationship type between the first entity and the second entity of each training sample.
  • As a possible implementation, the cross entropy function can be used to calculate the deviation value of each training sample.
  • The predicted entity semantic relationship type and the preset entity semantic relationship type corresponding to each training sample are used as the input of the cross entropy function, and the resulting cross entropy value is used as the deviation value corresponding to that training sample. The deviation values of all training samples in one training pass are then added; the sum of the deviation values characterizes the overall degree of deviation of that training pass. For example, in the above example where batch_size is set to 50, the sum of the deviation values is the result of adding the deviation values of the 50 training samples.
  • If the sum of the deviation values exceeds the first deviation threshold, the overall deviation of the training pass is large, meaning that the entity semantic relationship types predicted by the entity semantic relationship classification model differ greatly from the actual entity semantic relationship types; the parameters in the entity semantic relationship classification model are then adjusted to continue training it. Conversely, if the sum of the deviation values does not exceed the first deviation threshold, the entity semantic relationship types predicted by the model are close to the actual types, the training result meets the training requirements, and model training is judged to be finished. This batch-level check is sketched below.
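  • The batch-level check can be sketched as follows: the cross entropy of each of the 50 samples in a batch is summed and compared with the first deviation threshold. The threshold value and the randomly generated stand-in outputs are illustrative only; a real implementation would obtain the probabilities from the model and update its parameters by gradient descent.

```python
import numpy as np

def cross_entropy(pred_probs: np.ndarray, true_idx: int) -> float:
    """Deviation value of one training sample: cross entropy between the
    predicted probability distribution and the preset relationship type."""
    return float(-np.log(pred_probs[true_idx] + 1e-12))

def batch_deviation(batch_probs: np.ndarray, batch_labels: np.ndarray) -> float:
    """Sum of the deviation values of one training pass (batch_size = 50)."""
    return sum(cross_entropy(p, y) for p, y in zip(batch_probs, batch_labels))

FIRST_DEVIATION_THRESHOLD = 25.0  # illustrative value, not given in the application

probs = np.random.dirichlet(np.ones(10), size=50)  # stand-in model outputs
labels = np.random.randint(0, 10, size=50)         # preset relationship types
if batch_deviation(probs, labels) > FIRST_DEVIATION_THRESHOLD:
    print("adjust parameters (GRU weights/biases, attention matrix) and keep training")
else:
    print("training requirement met; end training and save the model")
```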
  • The above training process adjusts the parameters of the entity semantic relationship classification model using the overall deviation degree of a single training pass.
  • In some other implementations, the output result of a single training sample can also be used to adjust the parameters of the entity semantic relationship classification model.
  • FIG. 5 is another schematic flowchart of the sub-steps of S204 in FIG. 3.
  • S204 may further include the following sub-steps:
  • S2042: Obtain the deviation value between the predicted entity semantic relationship type of each training sample and the pre-stored entity semantic relationship type of the first entity and the second entity of that training sample.
  • S2045: Determine whether the deviation value of a target training sample exceeds the second deviation threshold; if so, adjust the parameters in the entity semantic relationship classification model to continue training it; if not, end the training.
  • As a possible implementation, the cross entropy function is used to calculate the deviation value of each training sample, and a target training sample is determined among all the training samples input to the entity semantic relationship classification model. If the deviation value of the target training sample exceeds the second deviation threshold, the training pass does not meet the training requirements, and the parameters in the entity semantic relationship classification model are adjusted to continue training it. Conversely, if the deviation value of the target training sample does not exceed the second deviation threshold, the training result meets the training requirements, and model training is judged to be finished.
  • The target training sample may be any one of the training samples input to the entity semantic relationship classification model, may be any training sample whose deviation value exceeds the second deviation threshold, or may be each training sample in turn as all input training samples are traversed in sequence. In some other implementations of the embodiments of the present application, the training sample with the largest deviation value among all the input training samples may also be used as the target training sample.
  • In other words, the method corresponding to FIG. 4 adjusts the parameters of the entity semantic relationship classification model according to the overall deviation degree of a single training pass, while the method corresponding to FIG. 5 adjusts the parameters using the output result of a single training sample.
  • In either case, when the parameters in the entity semantic relationship classification model are adjusted to train it, the weight coefficients and bias coefficients of the GRU layer in the BiGRU+Attention model and the attention matrix of the attention layer are adjusted, thereby training the entity semantic relationship classification model.
  • The following describes an entity semantic relationship classification method provided by the embodiments of the present application, again taking mining entities in a corpus and classifying entity semantic relationships as an example, based on the entity semantic relationship classification model obtained with the above training method.
  • FIG. 6 is a schematic flowchart of a method for classifying entity semantic relationship according to an embodiment of the present application.
  • The entity semantic relationship classification method may be applied to the electronic device shown in FIG. 1.
  • The entity semantic relationship classification method includes the following steps:
  • S301: Determine the first entity and the second entity in the corpus.
  • S302: Obtain the first position distance between each character in the corpus and the first entity, and the second position distance between each character and the second entity.
  • S303: Combine the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus.
  • The feature vector corresponding to each character is obtained by combining the word vector corresponding to the character in the corpus with its position embedding vector.
  • The position embedding vector corresponding to each character includes the vector corresponding to the character's first position distance and the vector corresponding to the character's second position distance.
  • S304: Use the model input vector corresponding to the corpus as the input of the preset entity semantic relationship classification model to determine the entity semantic relationship type of the first entity and the second entity.
  • The corpus may be an electronic medical record, for example: there is no <e1>bulge</e1> or depression in the precordial area, apical beats are normal, there is no pericardial friction rub, the relative cardiac dullness border is normal, <e2>heart rate</e2> is 70 beats/min, the rhythm is regular, and no pathological murmur is heard in any heart valve auscultation area.
  • When the electronic device obtains the entity semantic relationship type of the entity pair in this corpus, as a possible implementation, it can determine based on the entity identifiers "<e1></e1>" and "<e2></e2>" that the first entity in the corpus is "bulge" and the second entity is "heart rate".
  • In some other implementations, an entity library preset in the electronic device may also be used, with multiple entities pre-stored in it, so that the corpus is matched against the index of the preset entity library to obtain the above first entity "bulge" and second entity "heart rate".
  • Then, for each character, the first position distance from the first entity and the second position distance from the second entity are obtained; the word vector corresponding to each character, the vector corresponding to the character's first position distance, and the vector corresponding to the character's second position distance are combined to obtain the feature vector corresponding to that character; and the feature vectors corresponding to all characters in the corpus are combined to obtain the model input vector corresponding to the corpus. This model input vector is used as the input of the entity semantic relationship classification model in the electronic device to determine the entity semantic relationship type of the first entity "bulge" and the second entity "heart rate" in the corpus.
  • In this way, the entity semantic relationship classification method determines the first entity and the second entity in the corpus, obtains the feature vector corresponding to each character according to the first position distance between the character and the first entity and the second position distance between the character and the second entity, and combines the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus.
  • The model input vector corresponding to the corpus is then used as the input of the entity semantic relationship classification model to obtain the entity semantic relationship type corresponding to the corpus.
  • Compared with the prior art, this can improve the classification accuracy of entity semantic relationships.
  • FIG. 7 is a schematic flowchart of the sub-steps of S303 in FIG. 6.
  • S303 includes the following sub-steps:
  • S3031: Obtain the word vector corresponding to each character in the corpus, the first position embedding vector corresponding to the character's first position distance, and the second position embedding vector corresponding to the character's second position distance.
  • S3032: Combine the word vector, the first position embedding vector, and the second position embedding vector corresponding to each character to obtain the feature vector corresponding to that character.
  • Take as an example the electronic device obtaining the model input vector corresponding to the above corpus "there is no <e1>bulge</e1> or depression in the precordial area, apical beats are normal, there is no pericardial friction rub, the relative cardiac dullness border is normal, <e2>heart rate</e2> is 70 beats/min, the rhythm is regular, and no pathological murmur is heard in any heart valve auscultation area".
  • The feature vector corresponding to each character is obtained in the same way as for the character "front" above, and the feature vectors corresponding to all characters in the corpus are then combined, in the same way the model input vector corresponding to a training sample is obtained in the model training steps above, to obtain the model input vector corresponding to the corpus.
  • FIG. 8 is a schematic flowchart of the sub-steps of S3031 in FIG. 7.
  • S3031 includes the following sub-steps:
  • S30312: In the position embedding vector table, determine the first position embedding vector corresponding to the first position distance of each character and the second position embedding vector corresponding to the second position distance of each character.
  • As a possible implementation, a position embedding vector table is stored in the electronic device, and the position embedding vector table records the correspondence between position distances and position embedding vectors. According to the position embedding vector table, the first position distance can be converted into the first position embedding vector, and the second position distance can be converted into the second position embedding vector.
  • The position embedding vector table may be a matrix with dimensions m*n, in which each column of elements constitutes one specific position embedding vector. Using the specific values of the first position distance and the second position distance, the corresponding columns of the position embedding vector table are looked up: all elements of the column corresponding to the first position distance are used as the first position embedding vector, and all elements of the column corresponding to the second position distance are used as the second position embedding vector.
  • For example, when the first position distance is "3", the 3rd column of the position embedding vector table is looked up, and all elements contained in the 3rd column are used as the first position embedding vector; when the second position distance is "33", the 33rd column of the position embedding vector table is looked up, and all elements contained in the 33rd column are used as the second position embedding vector.
  • In some other implementations, the position embedding vector can also be represented directly by the value of the position distance. In the above example, with a first position distance of "3" and a second position distance of "33", the first position embedding vector is "3" and the second position embedding vector is "33".
  • Representing the position embedding vector directly by the value of the position distance can be regarded as representing the position embedding vector with a one-dimensional vector. The column lookup is sketched below.
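  • The column lookup can be sketched as follows, with an m*n table whose columns are position embedding vectors. The table contents here are random stand-ins, the example uses 0-based column indexing, and how signed (negative) distances map to columns is an assumption left open by the description (an offset could be added).

```python
import numpy as np

# m*n position embedding vector table: each column is one position embedding vector.
m, n = 4, 140                  # 4-dimensional embeddings, 140 columns (illustrative)
table = np.random.rand(m, n)   # stand-in for the pre-generated table

def position_embedding(distance: int) -> np.ndarray:
    """All elements of the column corresponding to a position distance,
    e.g. distance "3" selects the 3rd column (0-based here)."""
    return table[:, distance]

first_vec = position_embedding(3)     # first position distance "3"
second_vec = position_embedding(33)   # second position distance "33"
print(first_vec.shape, second_vec.shape)  # (4,) (4,)
```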
  • As a possible implementation, the position embedding vector table may be generated with the back propagation (BP) algorithm before the entity semantic relationship classification model described above is used to identify the entity semantic relationship type of the first entity and the second entity in the corpus.
  • BP: Back Propagation
  • That is, a randomly generated initial vector table is continuously optimized by the BP algorithm to obtain the position embedding vector table.
  • For example, a 4-layer neural network structure is set up, comprising one input layer L1, two hidden layers L2 and L3, and one output layer L4.
  • The number of neurons in the input layer L1 is set to 10, the number of neurons in each of the hidden layers L2 and L3 is set to 256, and the number of neurons in the output layer L4 is set to 3.
  • Learning stops when the global error is less than a preset threshold, and the output result of the output layer in the last learning iteration is used as the position embedding vector table; alternatively, when the global error is not less than the preset threshold but the number of learning iterations reaches 20,000, learning stops and the output result of the output layer in the last iteration is likewise used as the position embedding vector table.
  • FIG. 9 is a schematic structural diagram of an entity semantic relationship classification model training device 400 provided by an embodiment of the present application, which is applied to an electronic device.
  • The electronic device is preset with an entity semantic relationship classification model.
  • the entity semantic relationship classification model training device 400 includes a transceiver module 401, a second processing module 402, and a training module 403.
  • the transceiver module 401 is used to receive at least one training sample and identify the first entity and the second entity of each training sample in the at least one training sample.
  • The second processing module 402 is used to obtain, for each training sample, the first position distance between each character in the training sample and the first entity of the training sample, and the second position distance between each character in the training sample and the second entity of the training sample.
  • The second processing module 402 is also used to combine the feature vectors corresponding to all characters in each training sample to obtain the model input vector corresponding to each training sample.
  • The feature vector corresponding to each character is obtained by combining the word vector corresponding to the character in the training sample with its position embedding vector.
  • The position embedding vector corresponding to each character includes the vector corresponding to the character's first position distance and the vector corresponding to the character's second position distance.
  • The training module 403 is used to use the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model to train the entity semantic relationship classification model.
  • The training module 403 may specifically be used to: use the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model to obtain the predicted entity semantic relationship type corresponding to each training sample, where the predicted entity semantic relationship type is the predicted type of the entity semantic relationship between the first entity and the second entity in the training sample;
  • obtain the deviation value between the predicted entity semantic relationship type and the preset entity semantic relationship type corresponding to each training sample, where the preset entity semantic relationship type is the pre-stored entity semantic relationship type between the first entity and the second entity of each training sample;
  • and, when the sum of the deviation values exceeds the first deviation threshold, adjust the parameters in the entity semantic relationship classification model to train the entity semantic relationship classification model.
  • Alternatively, the training module 403 may specifically be used to: use the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model to obtain the predicted entity semantic relationship type corresponding to each training sample, where the predicted entity semantic relationship type is the predicted type of the entity semantic relationship between the first entity and the second entity in the training sample;
  • obtain the deviation value between the predicted entity semantic relationship type and the preset entity semantic relationship type corresponding to each training sample, where the preset entity semantic relationship type is the pre-stored entity semantic relationship type between the first entity and the second entity of each training sample;
  • and, when the deviation value of a target training sample exceeds the second deviation threshold, adjust the parameters in the entity semantic relationship classification model according to the deviation value of the target training sample to train the entity semantic relationship classification model.
  • In an embodiment in which the entity semantic relationship classification model is a model combining BiGRU with the attention mechanism, the training module 403 may specifically be used to train the entity semantic relationship classification model without discarding any neuron of the GRU layer during training.
  • In an embodiment in which the entity semantic relationship classification model is a model combining BiGRU with the attention mechanism, the at least one training sample is at least one electronic medical record, and the model input vector corresponding to a training sample is a combination of n feature vectors, where n is the average number of characters contained in the at least one electronic medical record.
  • FIG. 10 is a schematic structural diagram of an entity semantic relationship classification apparatus 500 according to an embodiment of the present application, which is applied to an electronic device.
  • The electronic device is preset with an entity semantic relationship classification model.
  • The entity semantic relationship classification device 500 includes a first processing module 501 and an identification module 502.
  • the first processing module 501 is used to determine the first entity and the second entity in the corpus.
  • The first processing module 501 is further used to obtain the first position distance between each character in the corpus and the first entity, and the second position distance between each character and the second entity.
  • The identification module 502 is used to combine the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus, where the feature vector corresponding to each character is obtained by combining the word vector corresponding to the character in the corpus with its position embedding vector, and the position embedding vector corresponding to each character includes the vector corresponding to the character's first position distance and the vector corresponding to the character's second position distance.
  • The first processing module 501 may specifically be used to: obtain the word vector corresponding to each character in the corpus as well as the first position embedding vector and the second position embedding vector corresponding to the character's first and second position distances, combine them into the feature vector corresponding to each character, and combine the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus.
  • The first processing module 501 may also specifically be used to: determine, in the position embedding vector table, the first position embedding vector corresponding to the first position distance of each character and the second position embedding vector corresponding to the second position distance of each character.
  • the entity semantic relationship classification model is a model combining BiGRU and Attention mechanism
  • the corpus is an electronic medical record.
  • An embodiment of the present application provides an electronic device.
  • the electronic device includes a memory and a processor.
  • The memory is used to store one or more programs and a preset entity semantic relationship classification model. When the one or more programs are executed by the processor, the electronic device: receives at least one training sample and identifies the first entity and the second entity of each training sample;
  • for each training sample, obtains the first position distance between each character in the training sample and the first entity of the training sample, and the second position distance between each character in the training sample and the second entity of the training sample;
  • combines the feature vectors corresponding to all characters in each training sample to obtain the model input vector corresponding to each training sample, where the feature vector corresponding to each character is obtained by combining the word vector corresponding to the character with its position embedding vector, and the position embedding vector corresponding to each character includes the vector corresponding to the character's first position distance and the vector corresponding to the character's second position distance;
  • and uses the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model to train the entity semantic relationship classification model.
  • The specific implementation may be: obtain the predicted entity semantic relationship type corresponding to each training sample given by the entity semantic relationship classification model, where the predicted entity semantic relationship type is the predicted type of the entity semantic relationship between the first entity and the second entity in the training sample;
  • obtain the deviation value between the predicted entity semantic relationship type and the preset entity semantic relationship type corresponding to each training sample, where the preset entity semantic relationship type is the pre-stored entity semantic relationship type between the first entity and the second entity of each training sample;
  • and, when the sum of the deviation values exceeds the first deviation threshold, adjust the parameters in the entity semantic relationship classification model to train the entity semantic relationship classification model.
  • The specific implementation may also be: obtain the predicted entity semantic relationship type corresponding to each training sample given by the entity semantic relationship classification model, where the predicted entity semantic relationship type is the predicted type of the entity semantic relationship between the first entity and the second entity in the training sample;
  • obtain the deviation value between the predicted entity semantic relationship type and the preset entity semantic relationship type corresponding to each training sample, where the preset entity semantic relationship type is the pre-stored entity semantic relationship type between the first entity and the second entity of each training sample;
  • and, when the deviation value of a target training sample exceeds the second deviation threshold, adjust the parameters in the entity semantic relationship classification model to train the entity semantic relationship classification model.
  • In an embodiment, the entity semantic relationship classification model is a model combining BiGRU with the attention mechanism, and no neuron of the GRU layer in the entity semantic relationship classification model is discarded during training.
  • In an embodiment, the entity semantic relationship classification model is a model combining BiGRU with the attention mechanism, the at least one training sample is at least one electronic medical record, and the model input vector corresponding to a training sample is a combination of n feature vectors, where n is the average number of characters contained in the at least one electronic medical record.
  • An embodiment of the present application provides an electronic device.
  • the electronic device includes a memory and a processor.
  • The memory is used to store one or more programs and a preset entity semantic relationship classification model. When the one or more programs are executed by the processor, the electronic device: determines the first entity and the second entity in the corpus, and obtains the first position distance between each character in the corpus and the first entity and the second position distance between each character and the second entity;
  • combines the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus, where the feature vector corresponding to each character is obtained by combining the word vector corresponding to the character in the corpus with its position embedding vector, and the position embedding vector corresponding to each character includes the vector corresponding to the character's first position distance and the vector corresponding to the character's second position distance;
  • and uses the model input vector corresponding to the corpus as the input of the entity semantic relationship classification model to determine the entity semantic relationship type of the first entity and the second entity.
  • The specific implementation may be: obtain the word vector corresponding to each character in the corpus as well as the first position embedding vector and the second position embedding vector corresponding to the character's first and second position distances, combine them into the feature vector corresponding to each character, and combine the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus.
  • The specific implementation may also be: determine, in the position embedding vector table, the first position embedding vector corresponding to the first position distance of each character and the second position embedding vector corresponding to the second position distance of each character.
  • the entity semantic relationship classification model is a model combining BiGRU and Attention mechanism
  • the corpus is an electronic medical record.
  • An embodiment of the present application also provides a computer-readable storage medium on which a computer program and a preset entity semantic relationship classification model are stored.
  • When executed by a processor, the computer program implements the following: determining the first entity and the second entity in the corpus, and obtaining the first position distance between each character in the corpus and the first entity and the second position distance between each character and the second entity;
  • combining the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus, where the feature vector corresponding to each character is obtained by combining the word vector corresponding to the character in the corpus with its position embedding vector, and the position embedding vector corresponding to each character includes the vector corresponding to the character's first position distance and the vector corresponding to the character's second position distance;
  • and using the model input vector corresponding to the corpus as the input of the entity semantic relationship classification model to determine the entity semantic relationship type of the first entity and the second entity.
  • An embodiment of the present application also provides a computer-readable storage medium on which a computer program and a preset entity semantic relationship classification model are stored.
  • When executed by a processor, the computer program implements the following: receiving at least one training sample and identifying the first entity and the second entity of each training sample;
  • for each training sample, obtaining the first position distance between each character in the training sample and the first entity of the training sample, and the second position distance between each character in the training sample and the second entity of the training sample;
  • combining the feature vectors corresponding to all characters in each training sample to obtain the model input vector corresponding to each training sample, where the feature vector corresponding to each character is obtained by combining the word vector corresponding to the character with its position embedding vector, and the position embedding vector corresponding to each character includes the vector corresponding to the character's first position distance and the vector corresponding to the character's second position distance;
  • and using the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model to train the entity semantic relationship classification model.
  • An embodiment of the present application also provides a computer program. When executed by a processor, the computer program implements the following: determining the first entity and the second entity in the corpus, and obtaining the first position distance between each character in the corpus and the first entity and the second position distance between each character and the second entity;
  • combining the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus, where the feature vector corresponding to each character is obtained by combining the word vector corresponding to the character in the corpus with its position embedding vector, and the position embedding vector corresponding to each character includes the vector corresponding to the character's first position distance and the vector corresponding to the character's second position distance;
  • and using the model input vector corresponding to the corpus as the input of the preset entity semantic relationship classification model to determine the entity semantic relationship type of the first entity and the second entity.
  • An embodiment of the present application also provides a computer program. When executed by a processor, the computer program implements the following: receiving at least one training sample and identifying the first entity and the second entity of each training sample;
  • for each training sample, obtaining the first position distance between each character in the training sample and the first entity of the training sample, and the second position distance between each character in the training sample and the second entity of the training sample;
  • combining the feature vectors corresponding to all characters in each training sample to obtain the model input vector corresponding to each training sample, where the feature vector corresponding to each character is obtained by combining the word vector corresponding to the character with its position embedding vector, and the position embedding vector corresponding to each character includes the vector corresponding to the character's first position distance and the vector corresponding to the character's second position distance;
  • and using the model input vector corresponding to each training sample as the input of the preset entity semantic relationship classification model to train the entity semantic relationship classification model.
  • Each block in the flowcharts or block diagrams may represent a module, a program segment, or a part of code, and the module, program segment, or part of code contains one or more executable instructions for implementing the specified logical functions.
  • It should also be noted that the functions noted in the blocks may occur out of the order noted in the figures.
  • Each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented with a dedicated hardware-based system that performs the specified functions or actions, or with a combination of dedicated hardware and computer instructions.
  • the functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist alone, or two or more modules may be integrated to form an independent part.
  • The entity semantic relationship classification method, model training method, apparatus, electronic device, storage medium, and computer program provided by the embodiments of the present application determine the first entity and the second entity in a corpus; obtain the feature vector corresponding to each character according to the first position distance between each character in the corpus and the first entity and the second position distance between each character and the second entity; combine the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus; and take the model input vector corresponding to the corpus as the input of the entity semantic relationship classification model to obtain the entity semantic relationship type corresponding to the corpus.
  • This can improve the classification accuracy of entity semantic relationships.

Abstract

An entity semantic relationship classification method, a model training method, and an electronic device. A first entity and a second entity in a corpus are determined, and the feature vector corresponding to each character is obtained according to the first position distance between each character in the corpus and the first entity and the second position distance between each character and the second entity. The feature vectors of all characters in the corpus are then combined to obtain the model input vector corresponding to the corpus, and the model input vector corresponding to the corpus is used as the input of an entity semantic relationship classification model to obtain the entity semantic relationship corresponding to the corpus. Compared with the prior art, this can improve the classification accuracy of entity semantic relationships.

Description

Entity Semantic Relationship Classification
This application claims priority to the Chinese patent application No. 201811641958.3, entitled "Entity semantic relationship classification method, model training method, apparatus, and electronic device", filed with the China National Intellectual Property Administration on December 29, 2018, the entire contents of which are incorporated herein by reference.
Background
Deep learning is a method of representation learning on data in machine learning. In practical applications of deep learning, a deep learning model needs to be trained in advance.
The sample data used to train a deep learning model includes feature data of multiple dimensions. The deep learning model is trained continuously on the sample data to obtain an accurate prediction model, which is then used to perform data prediction online.
Brief Description of the Drawings
Fig. 1 is a schematic structural block diagram of an electronic device provided by an embodiment of the present application;
Fig. 2 is a schematic structural diagram of an entity semantic relationship classification model provided by an embodiment of the present application;
Fig. 3 is a schematic flowchart of a method for training an entity semantic relationship classification model provided by an embodiment of the present application;
Fig. 4 is a schematic flowchart of the sub-steps of S204 in Fig. 3;
Fig. 5 is another schematic flowchart of the sub-steps of S204 in Fig. 3;
Fig. 6 is a schematic flowchart of an entity semantic relationship classification method provided by an embodiment of the present application;
Fig. 7 is a schematic flowchart of the sub-steps of S303 in Fig. 6;
Fig. 8 is a schematic flowchart of the sub-steps of S3031 in Fig. 7;
Fig. 9 is a schematic structural diagram of an apparatus for training an entity semantic relationship classification model provided by an embodiment of the present application;
Fig. 10 is a schematic structural diagram of an entity semantic relationship classification apparatus provided by an embodiment of the present application.
In the figures: 100 - electronic device; 101 - memory; 102 - processor; 103 - communication interface; 400 - apparatus for training an entity semantic relationship classification model; 401 - transceiver module; 402 - second processing module; 403 - training module; 500 - entity semantic relationship classification apparatus; 501 - first processing module; 502 - recognition module.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. The components of the embodiments of the present application, as generally described and illustrated in the figures herein, may be arranged and designed in a variety of different configurations.
Therefore, the following detailed description of the embodiments of the present application provided in the accompanying drawings is not intended to limit the scope of the claimed application, but merely represents selected embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
It should be noted that similar reference numerals and letters denote similar items in the following figures; therefore, once an item is defined in one figure, it does not need to be further defined and explained in subsequent figures. Meanwhile, in the description of the embodiments of the present application, the terms "first", "second", and the like are only used to distinguish the description and cannot be understood as indicating or implying relative importance.
Herein, relational terms such as first and second are used only to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include", or any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device including a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
Some embodiments of the present application are described in detail below with reference to the accompanying drawings. The following embodiments and the features in the embodiments may be combined with each other without conflict.
In text information extraction tasks, a deep learning model can be used to deeply mine text information on the basis of entity recognition, thereby promoting the structuring of unstructured sentences. An entity is a named mention item, such as a person name, a place name, a device name, a disease name, and so on. Of course, it can be understood that in different domains, the various entity types within each domain are defined accordingly.
In application scenarios where the semantic relationships between entities in a corpus are classified, a classification method based on a neural network model is generally adopted. Specifically, a large number of corpora whose entity semantic relationships have already been classified are used as the input of a neural network model to train the model, and the trained neural network model is then used to classify the entity semantic relationships of new corpora. For example, convolutional-neural-network-based models such as RNTN (Recursive Neural Tensor Network) and PCNN (Pulse Coupled Neural Network) are used to classify entity semantic relationships. However, the accuracy with which these models classify the entity semantic relationships of corpora in certain domains may not meet requirements.
Take an electronic medical record (EMR) as an example of the object of text information extraction. An EMR records a patient's diseases and symptoms, treatment process, and treatment effects, so entities in the EMR are mined and their semantic relationships are classified based on an established deep learning model. If the classification accuracy of entity semantic relationships is not high, past clinical information cannot be collected efficiently and accurately as historical data to assist medical decision-making.
Referring to Fig. 1, Fig. 1 is a schematic structural block diagram of an electronic device 100 provided by an embodiment of the present application. The electronic device 100 may be, but is not limited to, a server, a personal computer (PC), a tablet computer, a smartphone, a personal digital assistant (PDA), and so on. The electronic device 100 includes a memory 101, a processor 102, and a communication interface 103, which are electrically connected to each other directly or indirectly to realize the transmission or interaction of data. For example, the memory 101, the processor 102, and the communication interface 103 may be electrically connected to each other through one or more communication buses or signal lines. The memory 101 may be used to store program instructions and modules, such as the program instructions/modules corresponding to the apparatus 400 for training an entity semantic relationship classification model and the program instructions/modules corresponding to the entity semantic relationship classification apparatus 500 provided by the embodiments of the present application. The processor 102 executes the program instructions and modules stored in the memory 101 to perform various functional applications and data processing. The communication interface 103 may be used for signaling or data communication with other node devices.
The memory 101 may be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), and so on.
The processor 102 may be an integrated circuit chip with signal processing capability. The processor 102 may be a general-purpose processor, including but not limited to a central processing unit (CPU), a digital signal processor (DSP), a neural-network processing unit (NPU), and so on; it may also be an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on.
It can be understood that the structure shown in Fig. 1 is only schematic; the electronic device 100 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1. The components shown in Fig. 1 may be implemented in hardware, software, or a combination thereof.
Taking mining entities in a corpus and classifying entity semantic relationships as an example, a method for training an entity semantic relationship classification model provided by an embodiment of the present application is described below, so that the trained entity semantic relationship classification model can be used to classify the semantic relationships of entities in a corpus.
Optionally, referring to Fig. 2, Fig. 2 is a schematic structural diagram of an entity semantic relationship classification model provided by an embodiment of the present application. As a possible implementation, the entity semantic relationship classification model may be a model combining a bidirectional gated recurrent unit network (BiGRU) with an attention mechanism. Specifically, the entity semantic relationship classification model adds an attention layer before the output layer of a BiGRU model.
As a possible implementation, in this BiGRU+Attention model (i.e., the entity semantic relationship classification model), the number of GRU layers is set to 1, and the number of neurons in the GRU layer is 230. In practice, the user may also set the number of GRU layers to 2 or another value according to actual needs, and correspondingly set the number of neurons in the GRU layer to another number. The embodiments of the present application only provide one possible implementation and do not limit the specific numbers.
As a possible implementation, the dropout parameter of the BiGRU+Attention model is set to 1, i.e., no neuron of the GRU layer in the BiGRU+Attention model is discarded during training. Of course, in practice, the user may also set the dropout parameter to another value as needed, so as to deactivate some of the neurons of the GRU layer. The embodiments of the present application give one possible implementation obtained through repeated practice, but do not limit the specific value.
In some other implementations of the embodiments of the present application, the entity semantic relationship classification model may also be another model, such as a GRU model. The embodiments of the present application are illustrated on the basis of the BiGRU+Attention model.
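By way of illustration only (the disclosure itself provides no code), a minimal Python sketch of such a BiGRU+Attention classifier might look as follows. PyTorch is assumed, the class name BiGRUAttention and the num_classes parameter are hypothetical, and the attention layer here is a standard attention-pooling variant rather than the exact layer arrangement of Fig. 2; only the single bidirectional GRU layer with 230 neurons follows the settings described above.

```python
import torch
import torch.nn as nn

class BiGRUAttention(nn.Module):
    """Hypothetical sketch of a BiGRU + attention relation classifier."""
    def __init__(self, feat_dim=108, hidden=230, num_classes=10):
        super().__init__()
        # Single bidirectional GRU layer with 230 neurons, as described above.
        self.gru = nn.GRU(feat_dim, hidden, num_layers=1,
                          batch_first=True, bidirectional=True)
        # Attention scorer over the GRU outputs at each time step.
        self.att = nn.Linear(2 * hidden, 1)
        self.out = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):                 # x: (batch, seq_len, feat_dim)
        h, _ = self.gru(x)                # h: (batch, seq_len, 2*hidden)
        scores = torch.softmax(self.att(h), dim=1)   # attention weights
        context = (scores * h).sum(dim=1)             # weighted sum over time
        return self.out(context)          # logits over relation types
```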
Referring to Fig. 3, Fig. 3 is a schematic flowchart of a method for training an entity semantic relationship classification model provided by an embodiment of the present application. The method is applied to the electronic device 100 shown in Fig. 1, which uses the entity semantic relationship classification model to classify the semantic relationships of entities in a corpus. The method includes the following steps:
S201: receiving at least one training sample, and identifying the first entity and the second entity of each training sample in the at least one training sample.
The training samples may be received several at a time or one at a time; the user may adjust this in practice according to actual needs.
S202: obtaining, for each character in each training sample, the first position distance from the corresponding first entity and the second position distance from the corresponding second entity.
Specifically, step S202 is: for each training sample, obtaining the first position distance between each character in the training sample and the first entity of the training sample, and obtaining the second position distance between each character in the training sample and the second entity of the training sample.
S203: combining the feature vectors corresponding to all characters in each training sample to obtain the model input vector corresponding to each training sample.
Specifically, step S203 is: for each training sample, combining the feature vectors corresponding to all characters in the training sample to obtain the model input vector corresponding to the training sample.
The feature vector corresponding to each character is obtained by combining the character vector corresponding to that character with its position embedding vector, and the position embedding vector corresponding to each character includes a vector corresponding to the character's first position distance and a vector corresponding to the character's second position distance.
S204: taking the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model, so as to train the entity semantic relationship classification model.
When training the entity semantic relationship classification model, all training samples need to be fed into the model in batches. The number of training samples contained in each batch is called the batch size (batch_size), and the number of times each training sample is used is called epochs.
Exemplarily, when training the entity semantic relationship classification model, the training parameters may be set as follows: batch_size is set to 50; epochs is set to 100, i.e., each training sample is used 100 times; and the model is saved after every 100 training iterations. These parameters mean that 50 training samples are used for each training iteration, each training sample is used 100 times, and the entity semantic relationship classification model is saved after every 100 iterations.
According to the above example, when training the entity semantic relationship classification model with each batch of training samples, the electronic device needs to receive 50 training samples and identify the first entity and the second entity in each training sample, where the first entity and the second entity of each training sample form an entity pair.
As a possible implementation, before the entity semantic relationship classification model is trained with the training samples, each entity in the training samples carries an entity tag; the electronic device identifies the entity tags in each training sample to obtain the first entity and the second entity of each training sample.
Taking the scenario where the training samples are electronic medical records as an example, suppose a training sample is: 心前区无<e1>隆起symptom</e1>、凹陷，心尖搏动正常，无心包摩擦感，心脏相对浊音界正常，<e2>心率test</e2>70次/分，心律齐，心脏各瓣膜听诊区未闻及病理性杂音 (no <e1>bulge symptom</e1> or depression in the precordial region, normal apical pulsation, no pericardial friction, normal relative cardiac dullness, <e2>heart rate test</e2> 70 beats/min, regular rhythm, no pathological murmur heard in any valve auscultation area). Here, <e1></e1> marks the first entity and <e2></e2> marks the second entity. On receiving this training sample, the electronic device identifies <e1></e1> and obtains the first entity "隆起" (bulge) of type symptom, and identifies <e2></e2> and obtains the second entity "心率" (heart rate) of type test.
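As a rough sketch of this tag-based identification (the helper name extract_entities is hypothetical; the disclosure does not prescribe an implementation), the <e1></e1> and <e2></e2> markers could be pulled out with a regular expression:

```python
import re

def extract_entities(sample: str):
    """Return (first_entity, second_entity) from a tagged training sample."""
    e1 = re.search(r"<e1>(.*?)</e1>", sample)
    e2 = re.search(r"<e2>(.*?)</e2>", sample)
    return e1.group(1), e2.group(1)

sample = "心前区无<e1>隆起symptom</e1>、凹陷，…<e2>心率test</e2>70次/分，…"
print(extract_entities(sample))  # ('隆起symptom', '心率test')
```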
In some other implementations of the embodiments of the present application, the first entity and the second entity in a training sample may also be identified in other ways. For example, an entity library storing multiple entities may be preset, and the training sample is identified by indexing against this preset entity library to obtain the first entity and the second entity.
In an application scenario such as the above electronic medical records, each character in a single training sample lies at a different position relative to the first entity and the second entity, so each character contributes differently to identifying the entity semantic relationship type of the first entity and the second entity. Generally speaking, the closer a character is to the two entities, the more likely it is to contribute significantly to identifying their entity semantic relationship type.
Therefore, the embodiments of the present application introduce the concept of position embedding: when training the entity semantic relationship classification model, the first position distance between each character in each training sample and the corresponding first entity and the second position distance between each character and the corresponding second entity are obtained.
Moreover, since an entity generally consists of multiple characters, such as "隆起" (bulge) and "心率" (heart rate) in the above example, a calculation standard for the first position distance and the second position distance in a training sample may be determined in advance. For example, it may be agreed that, when calculating the first position distance, the first character of the first entity in reading order is taken as the reference point, and when calculating the second position distance, the first character of the second entity in reading order is taken as the reference point.
For example, in the above exemplary training sample, under this convention the character "前" in "心前区" has a first position distance of 3 and a second position distance of 33, while the character "区" in "心前区" has a first position distance of 2 and a second position distance of 32.
Two different characters may have the same position distance from the same entity, differing only in that one precedes the entity in reading order and the other follows it. For instance, in the above exemplary training sample, the character "区" in "心前区" has a first position distance of 2 from the first entity "隆起", and the character "凹" in "凹陷" also has a first position distance of 2 from the first entity "隆起".
Therefore, as a possible implementation, positive and negative values may be used to distinguish the direction of a position distance: the position distance of a character that precedes the entity in reading order is expressed as a negative value, and the position distance of a character that follows the entity as a positive value. In the above example, the character "区" in "心前区" precedes the first entity "隆起" in reading order, and the character "凹" in "凹陷" follows it, so the first position distance of "区" is -2 and the first position distance of "凹" is 2.
As another possible implementation, on the basis of distinguishing direction with positive and negative values, a preset value may further be added to every position distance so that all position distances become positive. For example, in the above exemplary training sample, if the preset value is set to 68, the first position distance of "区" becomes -2 + 68 = 66 and the first position distance of "凹" becomes 2 + 68 = 70.
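A minimal sketch of this signed, offset position distance convention, assuming the preset offset of 68 from the example above (the function name position_distances is hypothetical):

```python
def position_distances(text: str, entity_start: int, offset: int = 68):
    """Signed distance from each character to the entity's first character,
    shifted by a preset offset so that every distance becomes positive."""
    return [(i - entity_start) + offset for i in range(len(text))]

text = "心前区无隆起"
dists = position_distances(text, text.index("隆"))
# "区" (index 2) -> -2 + 68 = 66; a character two places after "隆" -> 2 + 68 = 70
```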
Based on the first position distance and the second position distance obtained above for each character of each training sample, each character is vectorized into a character vector according to a character vector table used to convert characters into vectors, and each first position distance and each second position distance are vectorized into position embedding vectors according to a position embedding vector table used to convert position distances into vectors. The character vector of each character, the position embedding vector corresponding to its first position distance, and the position embedding vector corresponding to its second position distance are then combined to obtain the feature vector corresponding to the character.
For example, in the above exemplary training sample, take the character "前" in "心前区": suppose the character vector obtained by vectorizing "前" is [0.1, 0.5, 0.4, 0.3], the vector corresponding to the first position distance of "前" is [0.4, 0.6], and the vector corresponding to its second position distance is [0.6, 0.4]; then the feature vector corresponding to "前" is [0.1, 0.5, 0.4, 0.3, 0.4, 0.6, 0.6, 0.4].
The 4-dimensional character vector [0.1, 0.5, 0.4, 0.3] obtained by vectorizing "前" above is only illustrative. In some other implementations of the embodiments of the present application, a character vector table of another dimensionality prestored in the electronic device may be used, so that vectorizing "前" yields a character vector of a different dimensionality. For example, with a character vector table of dimensionality 100 prestored in the electronic device, vectorizing "前" yields a 100-dimensional character vector.
Similarly, the vectors corresponding to the first position distance and the second position distance above are also illustrative, using 2 dimensions. In some other implementations of the embodiments of the present application, a position embedding vector table of another dimensionality prestored in the electronic device may be used, such as a 4-dimensional position embedding vector table.
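The per-character combination can be pictured as plain concatenation. The following NumPy sketch is illustrative only; the lookup tables char_table and pos_table are assumptions filled with placeholders, using 100-dimensional character vectors and 4-dimensional position embeddings as in the example above:

```python
import numpy as np

# Hypothetical lookup tables: 100-dim character vectors, 4-dim position embeddings.
char_table = {"前": np.random.rand(100)}
pos_table = np.random.rand(4, 140)   # one 4-dim column per possible distance

def feature_vector(ch, d1, d2):
    """Concatenate character vector with both position embeddings -> 108 dims."""
    return np.concatenate([char_table[ch], pos_table[:, d1], pos_table[:, d2]])

v = feature_vector("前", 66, 96)     # distances after the +68 offset
print(v.shape)                        # (108,)
```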
In the embodiments of the present application, after the feature vector corresponding to each character is obtained, the feature vectors corresponding to all characters in each training sample are combined to obtain the model input vector corresponding to the training sample. As a possible implementation, the model input vector corresponding to each training sample may be recorded as a two-dimensional matrix.
For example, in the above exemplary training sample, the feature vector corresponding to the character "心" in "心前区" is taken as the first row of the model input vector, the feature vector corresponding to the character "前" in "心前区" as the second row, and so on, until the feature vectors corresponding to all characters in the training sample have been combined to obtain the model input vector corresponding to the training sample.
Thus, according to the model input vector obtained by combination for each training sample, the model input vector corresponding to each training sample is used as the input of the entity semantic relationship classification model to train the entity semantic relationship classification model.
As a possible implementation, when model input vectors are used as the input of the entity semantic relationship classification model, the model input vectors corresponding to multiple training samples may be input together. For example, with batch_size set to 50 as in the above training parameters, one input to the entity semantic relationship classification model consists of the model input vectors corresponding to 50 training samples.
In the example where one input to the entity semantic relationship classification model consists of the model input vectors of 50 training samples, every model input vector has the same dimensions. For example, with 100-dimensional character vectors and 4-dimensional position embedding vectors, if the model input vector of training sample 1 has dimensions 70*108, then the model input vector of training sample 2 also has dimensions 70*108, where 70 indicates that the training sample contains 70 characters and 108 indicates that the feature vector of each character contains 108 elements: the 100 elements of the character vector, the 4 elements of the position embedding vector of the first position distance, and the 4 elements of the position embedding vector of the second position distance.
Moreover, if multiple training samples are input together into the entity semantic relationship classification model, the numbers of characters contained in different training samples may differ; for example, training sample 1 contains 60 characters, training sample 2 contains 70 characters, and training sample 3 contains 73 characters. The model input vectors of the multiple training samples therefore need to be unified, i.e., their dimensions made consistent. Exemplarily, if the dimensions of the model input vector are uniformly set to 70*108, combining the feature vectors of the characters in training sample 1 yields a 60*108 vector; since 60 < 70, the part of training sample 1 short of 70 characters can be padded with a preset vector, such as a zero vector, to obtain a 70*108 model input vector for training sample 1. Likewise, combining the feature vectors of the characters in training sample 3 yields a 73*108 vector; since 73 > 70, the part of training sample 3 beyond 70 characters can be discarded, keeping only the feature vectors of the first 70 characters in reading order, to obtain a 70*108 model input vector for training sample 3.
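A minimal sketch of this unification step, assuming the 70-character target length and zero-vector padding described above (the function name unify is hypothetical):

```python
import numpy as np

def unify(features: np.ndarray, max_len: int = 70) -> np.ndarray:
    """Pad with zero vectors or truncate so every sample is (max_len, 108)."""
    n, dim = features.shape
    if n >= max_len:
        return features[:max_len]          # keep the first 70 characters
    pad = np.zeros((max_len - n, dim))     # preset (zero) padding vectors
    return np.vstack([features, pad])
```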
As a possible implementation, the above training samples may be electronic medical records, and the model input vector corresponding to a training sample is a combination of n feature vectors. For example, in the above example, the dimensions of the model input vector are set to 70*108, indicating that the model input vector contains the feature vectors corresponding to 70 characters, each feature vector having dimension 108.
As a possible implementation, n is the average number of characters contained in the at least one electronic medical record (i.e., the above training samples). For example, if a total of 50 electronic medical records are used as training samples and the average character count of the 50 electronic medical records is 70, then n equals 70.
It can be understood that in some other implementations of the embodiments of the present application, n may also be set to a fixed value, such as 100.
In some other implementations of the embodiments of the present application, corpora other than electronic medical records may also be used as training samples, such as intelligent customer service dialogues or consultation information.
As another possible implementation, when model input vectors are used as the input of the entity semantic relationship classification model, the model input vector corresponding to a single training sample may also be used as the input. For example, with batch_size set to 1 in the above training parameters, one input to the entity semantic relationship classification model is the model input vector corresponding to one training sample.
The training process for a model input vector of dimensions 70*108, for example, is illustrated below with reference to the entity semantic relationship classification model shown in Fig. 2. In the model shown in Fig. 2, T = 108. The input layer of the model obtains the 70*108 model input vector, which, after preprocessing by the feature embedding layer, is input into the GRU layer for computation. The GRU layer outputs 108 predicted entity semantic relationship types to the attention layer; the attention layer computes a probability value for each of the 108 predicted entity semantic relationship types and takes the type with the largest probability value among them as the entity semantic relationship type corresponding to the training sample.
Optionally, in one of the above possible implementations, if the model input vectors corresponding to multiple training samples are input together into the entity semantic relationship classification model for training, the model gives a predicted entity semantic relationship type for each training sample during training. For example, when the model input vectors of 50 training samples are input as in the above example, the entity semantic relationship classification model produces a predicted entity semantic relationship type for each of the 50 training samples.
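Continuing the hypothetical BiGRUAttention sketch above, a forward pass over such a batch and the selection of the most probable relation type per sample might look like:

```python
model = BiGRUAttention(feat_dim=108, hidden=230, num_classes=10)
batch = torch.rand(50, 70, 108)      # 50 samples, each of shape 70*108
logits = model(batch)                 # (50, num_classes)
predicted = logits.argmax(dim=1)      # predicted relation type per sample
```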
Therefore, referring to Fig. 4, Fig. 4 is a schematic flowchart of the sub-steps of S204 in Fig. 3. As a possible implementation, S204 includes the following sub-steps:
S2041: obtaining, for each training sample, the entity semantic relationship type of the first entity and the second entity produced by training with the entity semantic relationship classification model.
That is, step S2041 is: taking the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model, and obtaining the predicted entity semantic relationship type corresponding to each training sample produced by the model, the predicted entity semantic relationship type being the predicted entity semantic relationship type of the first entity and the second entity in each training sample.
S2042: obtaining the deviation value between the entity semantic relationship type of each training sample and the prestored entity semantic relationship type of the first entity and the second entity corresponding to that training sample.
That is, step S2042 is: obtaining the deviation value between the predicted entity semantic relationship type corresponding to each training sample and the preset entity semantic relationship type, the preset entity semantic relationship type being the prestored entity semantic relationship type of the first entity and the second entity corresponding to each training sample.
S2043: obtaining the sum of the deviation values of all the training samples.
S2044: judging whether the sum of the deviation values exceeds a first deviation threshold; if yes, adjusting the parameters in the entity semantic relationship classification model to train the entity semantic relationship classification model; if no, ending the training.
Exemplarily, the deviation value of each training sample may be computed with a cross-entropy function: the predicted entity semantic relationship type and the preset entity semantic relationship type corresponding to each training sample are taken as the input of the cross-entropy function, and the resulting cross-entropy value is taken as the deviation value of the training sample. The deviation values of all training samples in one training iteration are then added to obtain the sum of deviation values for that iteration, which characterizes the overall degree of deviation of the iteration. For instance, in the above example with batch_size set to 50, the sum of deviation values is the result of adding the deviation values of the 50 training samples. If the sum of deviation values exceeds the first deviation threshold, the overall deviation of the iteration is large, meaning that the entity semantic relationship types predicted by the model differ considerably from the actual ones, and the parameters of the entity semantic relationship classification model are adjusted to train the model. Conversely, if the sum of deviation values does not exceed the first deviation threshold, the predicted entity semantic relationship types are close to the actual ones, the result of the iteration meets the training requirement, and the model training is judged to be finished.
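As a sketch of this stopping rule (illustrative only; the threshold value 30.0 is an invented placeholder, not from the disclosure), the per-sample cross-entropy values can be summed and compared against the first deviation threshold:

```python
import torch
import torch.nn.functional as F

logits = model(batch)                      # (50, num_classes), from the sketch above
targets = torch.randint(0, 10, (50,))      # placeholder gold relation types
loss = F.cross_entropy(logits, targets, reduction="sum")  # sum of deviation values
if loss.item() > 30.0:                     # hypothetical first deviation threshold
    loss.backward()                        # adjust parameters and keep training
```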
The above training process adjusts the parameters of the entity semantic relationship classification model according to the overall degree of deviation of a single training iteration. As another possible implementation, the parameters of the entity semantic relationship classification model may also be adjusted according to the output result for a single training sample.
Referring to Fig. 5, Fig. 5 is another schematic flowchart of the sub-steps of S204 in Fig. 3. S204 may also include the following sub-steps:
S2041: obtaining, for each training sample, the entity semantic relationship type of the first entity and the second entity produced by training with the entity semantic relationship classification model.
S2042: obtaining the deviation value between the entity semantic relationship type of each training sample and the prestored entity semantic relationship type of the first entity and the second entity corresponding to that training sample.
S2045: judging whether the deviation value of a target training sample exceeds a second deviation threshold; if yes, adjusting the parameters in the entity semantic relationship classification model to train the entity semantic relationship classification model; if no, ending the training.
As in the above example, the deviation value of each training sample is computed with a cross-entropy function, and a target training sample is determined among all training samples input into the entity semantic relationship classification model. If the deviation value of the target training sample exceeds the second deviation threshold, the training iteration does not meet the training requirement, and the parameters of the entity semantic relationship classification model are adjusted to train the model. Conversely, if the deviation value of the target training sample does not exceed the second deviation threshold, the result of the iteration meets the training requirement and the model training is judged to be finished.
The target training sample may be any one of all training samples input into the entity semantic relationship classification model, any training sample whose deviation value exceeds the second deviation threshold, or each training sample in turn as all input training samples are traversed and judged one by one. In some other implementations of the embodiments of the present application, the training sample with the largest deviation value among all input training samples may also be taken as the target training sample.
The approach corresponding to Fig. 4 adjusts the parameters of the entity semantic relationship classification model according to the overall degree of deviation of a single training iteration, while the approach corresponding to Fig. 5 adjusts them according to the output result for a single training sample. When specifically adjusting the parameters of the entity semantic relationship classification model, the user may adopt different methods according to actual needs.
As a possible implementation in the embodiments of the present application, if the entity semantic relationship classification model is the BiGRU+Attention model shown in Fig. 2, then when adjusting the parameters of the model to train it, the weight coefficients and bias coefficients of the GRU layer and the attention matrix of the attention layer in the BiGRU+Attention model are adjusted, thereby achieving the purpose of training the entity semantic relationship classification model.
Taking mining entities in a corpus and classifying entity semantic relationships as an example, an entity semantic relationship classification method provided by an embodiment of the present application is described below, based on the entity semantic relationship classification model obtained after training with the above training method.
Referring to Fig. 6, Fig. 6 is a schematic flowchart of an entity semantic relationship classification method provided by an embodiment of the present application. The method may be applied to the electronic device shown in Fig. 1 and includes the following steps:
S301: determining the first entity and the second entity in a corpus.
S302: obtaining the first position distance between each character in the corpus and the first entity and the second position distance between each character and the second entity.
S303: combining the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus, where the feature vector corresponding to each character is obtained by combining the character vector corresponding to that character in the corpus with its position embedding vector, and the position embedding vector corresponding to each character includes a vector corresponding to the character's first position distance and a vector corresponding to the character's second position distance.
S304: taking the model input vector corresponding to the corpus as the input of the preset entity semantic relationship classification model, and determining the entity semantic relationship type of the first entity and the second entity.
That is, step S304 is: taking the model input vector corresponding to the corpus as the input of the entity semantic relationship classification model, and determining the entity semantic relationship type of the first entity and the second entity.
As a possible implementation, the corpus may be an electronic medical record. Exemplarily, the above entity semantic relationship classification method is described taking the electronic medical record 心前区无<e1>隆起symptom</e1>、凹陷，心尖搏动正常，无心包摩擦感，心脏相对浊音界正常，<e2>心率test</e2>70次/分，心律齐，心脏各瓣膜听诊区未闻及病理性杂音 as the corpus.
When obtaining the entity semantic relationship type between the entity pair in this corpus, as a possible implementation, the electronic device may determine from the entity tags <e1></e1> and <e2></e2> contained in the corpus that the first entity in the corpus is "隆起" (bulge) and the second entity is "心率" (heart rate).
Alternatively, as another possible implementation, an entity library preset in the electronic device and prestoring multiple entities may be used, and the corpus is identified by indexing against this preset entity library to obtain the above first entity "隆起" and second entity "心率".
In the embodiments of the present application, the first position distance between each character in the corpus and the first entity "隆起" and the second position distance between each character and the second entity "心率" are obtained. The character vector corresponding to each character, the vector corresponding to the character's first position distance, and the vector corresponding to the character's second position distance are combined to obtain the feature vector corresponding to the character; the feature vectors corresponding to all characters in the corpus are then combined to obtain the model input vector corresponding to the corpus; and the model input vector is taken as the input of the entity semantic relationship classification model in the electronic device to determine the entity semantic relationship type between the first entity "隆起" and the second entity "心率" in the corpus.
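Chaining the hypothetical helpers sketched above (and assuming their lookup tables cover every character in the corpus), the whole classification pipeline for one corpus can be pictured as:

```python
import re

raw = "心前区无<e1>隆起symptom</e1>、凹陷，…<e2>心率test</e2>70次/分，…"
e1, e2 = extract_entities(raw)                     # S301
text = re.sub(r"</?e[12]>", "", raw)               # plain characters, tags removed
d1 = position_distances(text, text.index("隆起"))   # S302: distances to first entity
d2 = position_distances(text, text.index("心率"))   #        and to second entity
feats = np.stack([feature_vector(c, a, b)          # S303: one 108-dim row per char
                  for c, a, b in zip(text, d1, d2)])
x = torch.tensor(unify(feats), dtype=torch.float32).unsqueeze(0)
relation = model(x).argmax(dim=1)                  # S304: predicted relation type
```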
Based on the above design, the entity semantic relationship classification model training method provided by the embodiments of the present application determines the first entity and the second entity in a corpus; obtains the feature vector corresponding to each character according to the first position distance between each character in the corpus and the first entity and the second position distance between each character and the second entity; combines the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus; and takes the model input vector corresponding to the corpus as the input of the entity semantic relationship classification model to obtain the entity semantic relationship type corresponding to the corpus. Compared with the related art, this can improve the classification accuracy of entity semantic relationships.
Optionally, referring to Fig. 7, Fig. 7 is a schematic flowchart of the sub-steps of S303 in Fig. 6. As a possible implementation, S303 includes the following sub-steps:
S3031: obtaining the character vector corresponding to each character in the corpus, and the first position embedding vector and the second position embedding vector corresponding respectively to the first position distance and the second position distance of each character.
That is, step S3031 is: obtaining the character vector corresponding to each character in the corpus, the first position embedding vector corresponding to the first position distance of each character, and the second position embedding vector corresponding to the second position distance of each character.
S3032: combining the character vector, the first position embedding vector, and the second position embedding vector corresponding to each character to obtain the feature vector corresponding to each character.
S3033: combining the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus.
When the electronic device obtains the model input vector corresponding to a corpus, take the above corpus 心前区无<e1>隆起symptom</e1>、凹陷，心尖搏动正常，无心包摩擦感，心脏相对浊音界正常，<e2>心率test</e2>70次/分，心律齐，心脏各瓣膜听诊区未闻及病理性杂音 as an example. First, the feature vector corresponding to each character in the corpus is obtained. Taking the character "心" in "心前区" as an example: the character vector obtained by vectorizing "心", the first position embedding vector obtained by vectorizing the first position distance between "心" and the first entity "隆起", and the second position embedding vector obtained by vectorizing the second position distance between "心" and the second entity "心率" are combined to obtain the feature vector corresponding to "心".
Similarly, the feature vector corresponding to each character in the corpus is obtained in the same way as for "心" above, and the feature vectors corresponding to all characters of the corpus are then combined, in the same way as the model input vector of a training sample is obtained in the above model training steps, to obtain the model input vector corresponding to the corpus.
Optionally, referring to Fig. 8, Fig. 8 is a schematic flowchart of the sub-steps of S3031 in Fig. 7. As a possible implementation, S3031 includes the following sub-steps:
S30311: obtaining a position embedding vector table.
S30312: determining, in the position embedding vector table, the first position embedding vector and the second position embedding vector corresponding respectively to the first position distance and the second position distance.
That is, step S30312 is: determining, in the position embedding vector table, the first position embedding vector corresponding to the first position distance of each character and the second position embedding vector corresponding to the second position distance of each character.
In the embodiments of the present application, a position embedding vector table recording the correspondence between position distances and position embedding vectors is stored in the electronic device. According to this position embedding vector table, a first position distance can be converted into a first position embedding vector, and a second position distance into a second position embedding vector.
For example, the position embedding vector table may be a matrix of dimensions m*n in which each column of elements constitutes a specific position embedding vector. Using the specific values of the first position distance and the second position distance, the corresponding columns of the table are looked up: all elements of the column corresponding to the first position distance are taken as the first position embedding vector, and all elements of the column corresponding to the second position distance are taken as the second position embedding vector. For instance, when the first position distance is 3, the 3rd column of the position embedding vector table is looked up, and all elements contained in that column are taken as the first position embedding vector; when the second position distance is 33, the 33rd column is looked up, and all elements contained in that column are taken as the second position embedding vector.
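A minimal sketch of this column lookup (illustrative only; the table here is a random placeholder and indexing is 0-based for simplicity):

```python
import numpy as np

pos_table = np.random.rand(4, 140)   # m*n table: each column is one embedding

def lookup(distance: int) -> np.ndarray:
    """Return the table column indexed by the position distance."""
    return pos_table[:, distance]

first = lookup(3)    # first position embedding vector for distance 3
second = lookup(33)  # second position embedding vector for distance 33
```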
In some other implementations of the embodiments of the present application, the magnitude of the position distance value may also be used directly to represent the position embedding vector. For example, with the first position distance of 3 and the second position distance of 33 in the above example, the first position embedding vector is 3 and the second position embedding vector is 33.
In the embodiments of the present application, directly representing a position embedding vector by the magnitude of the position distance value can be regarded as representing the position embedding vector with a one-dimensional vector.
As a possible implementation, the position embedding vector table may be generated with a back propagation (BP) algorithm before the above entity semantic relationship classification model is used to identify the entity semantic relationship type of the first entity and the second entity in a corpus.
Exemplarily, when the BP algorithm is used to generate the position embedding vector table, a randomly generated initial vector table is continuously optimized by the BP algorithm to obtain the position embedding vector table.
For example, in one optimization example, a 4-layer neural network structure is set up, including one input layer L1, two hidden layers L2 and L3, and one output layer L4. The number of neurons of the input layer L1 is set to 10, the number of neurons of the hidden layers L2 and L3 is set to 256, the activation function is set to the rectified linear unit ReLU(x) = max(0, x), the number of neurons of the output layer L4 is set to 3, and the output layer function is set to Softmax(x) = exp(x_i)/∑_j exp(x_j), where i, j = 0, 1, 2, ..., N and x_i, x_j form the position embedding vector table obtained after optimization. The network weights are initialized to w, and the error function is set to the quadratic error

e = (1/2) ∑_k ∑_o (d_o(k) − y_o(k))²

where the sample index k = 1, 2, ..., m, d_o(k) denotes the expected output of the k-th sample, y_o(k) denotes the actual output of the k-th sample, o = 1, 2, 3, ..., q, q denotes the number of neurons of the output layer, and m denotes the total number of training samples in one training run. The learning rate ln is set to 0.01 and the maximum number of learning iterations to 20000.
In the optimization process, multiple initial vector tables, as initial samples, and their corresponding sample labels are first taken as input, and the input and output of each neuron of the hidden layers are computed. Then, using the expected output and the actual output of the network, the partial derivative δ_o(m) of the error function with respect to each neuron of the output layer is computed; using the connection weights from the hidden layers to the output layer, the output-layer partial derivatives δ_o(m), and the outputs of the hidden layers, the partial derivative δ_h(m)a of the error function with respect to each neuron of the hidden layers is computed. The connection weights w are then corrected using the output-layer partial derivatives δ_o(m) and the outputs of the hidden-layer neurons, and using the hidden-layer partial derivatives δ_h(m)a and the inputs of the input-layer neurons. Then, in each loop, the global error is computed for the result of the output layer:

E = (1/(2m)) ∑_k ∑_o (d_o(k) − y_o(k))²

where the sample index k = 1, 2, ..., m, d_o(k) denotes the expected output of the k-th sample, y_o(k) denotes the actual output of the k-th sample, o = 1, 2, 3, ..., q, q denotes the number of neurons of the output layer, and m denotes the total number of training samples in one training run. When the global error is smaller than a preset threshold, learning stops, and the output of the output layer at the last learning iteration is taken as the position embedding vector table; alternatively, when the global error is not smaller than the preset threshold but the number of learning iterations reaches 20000, learning also stops, and the output of the output layer at the last learning iteration is taken as the position embedding vector table.
Referring to Fig. 9, Fig. 9 is a schematic structural diagram of an apparatus 400 for training an entity semantic relationship classification model provided by an embodiment of the present application, applied to an electronic device in which an entity semantic relationship classification model is preset. The apparatus 400 includes a transceiver module 401, a second processing module 402, and a training module 403.
The transceiver module 401 is configured to receive at least one training sample and identify the first entity and the second entity of each training sample in the at least one training sample.
The second processing module 402 is configured to, for each training sample, obtain the first position distance between each character in the training sample and the first entity of the training sample, and obtain the second position distance between each character in the training sample and the second entity of the training sample.
The second processing module 402 is further configured to combine the feature vectors corresponding to all characters in each training sample to obtain the model input vector corresponding to each training sample, where the feature vector corresponding to each character is obtained by combining the character vector corresponding to that character in each training sample with its position embedding vector, and the position embedding vector corresponding to each character includes a vector corresponding to the character's first position distance and a vector corresponding to the character's second position distance.
The training module 403 is configured to take the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model, so as to train the entity semantic relationship classification model.
Optionally, as a possible implementation, the training module 403 may be specifically configured to:
take the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model, and obtain the predicted entity semantic relationship type corresponding to each training sample produced by the model, the predicted entity semantic relationship type being the predicted entity semantic relationship type of the first entity and the second entity in each training sample;
obtain the deviation value between the predicted entity semantic relationship type corresponding to each training sample and the preset entity semantic relationship type, the preset entity semantic relationship type being the prestored entity semantic relationship type of the first entity and the second entity corresponding to each training sample;
obtain the sum of the deviation values of all the training samples; and
when the sum of the deviation values exceeds a first deviation threshold, adjust the parameters in the entity semantic relationship classification model to train the entity semantic relationship classification model.
Optionally, as another possible implementation, the training module 403 may be specifically configured to:
take the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model, and obtain the predicted entity semantic relationship type corresponding to each training sample produced by the model, the predicted entity semantic relationship type being the predicted entity semantic relationship type of the first entity and the second entity in each training sample;
obtain the deviation value between the predicted entity semantic relationship type corresponding to each training sample and the preset entity semantic relationship type, the preset entity semantic relationship type being the prestored entity semantic relationship type of the first entity and the second entity corresponding to each training sample; and
whenever the deviation value of a target training sample in the at least one training sample exceeds a second deviation threshold, adjust the parameters in the entity semantic relationship classification model according to the deviation value of the target training sample, so as to train the entity semantic relationship classification model.
Optionally, as a possible implementation, the entity semantic relationship classification model is a model combining BiGRU with an attention mechanism, and the training module 403 may be specifically configured to:
adjust the weight coefficients and bias coefficients of the GRU layer and the attention matrix of the attention layer in the entity semantic relationship classification model.
Optionally, as a possible implementation, no neuron of the GRU layer in the entity semantic relationship classification model is discarded during training.
Optionally, as a possible implementation, the entity semantic relationship classification model is a model combining BiGRU with an attention mechanism;
the at least one training sample is at least one electronic medical record, and the model input vector corresponding to a training sample is a combination of n feature vectors, where n is the average number of characters contained in the at least one electronic medical record.
Referring to Fig. 10, Fig. 10 is a schematic structural diagram of an entity semantic relationship classification apparatus 500 provided by an embodiment of the present application, applied to an electronic device in which an entity semantic relationship classification model is preset. The apparatus 500 includes a first processing module 501 and a recognition module 502.
The first processing module 501 is configured to determine the first entity and the second entity in a corpus.
The first processing module 501 is further configured to obtain the first position distance between each character in the corpus and the first entity and the second position distance between each character and the second entity.
The recognition module 502 is configured to combine the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus, where the feature vector corresponding to each character is obtained by combining the character vector corresponding to that character in the corpus with its position embedding vector, and the position embedding vector corresponding to each character includes a vector corresponding to the character's first position distance and a vector corresponding to the character's second position distance.
Optionally, as a possible implementation, the first processing module 501 may be specifically configured to:
obtain the character vector corresponding to each character in the corpus, the first position embedding vector corresponding to the first position distance of each character, and the second position embedding vector corresponding to the second position distance of each character;
combine the character vector, the first position embedding vector, and the second position embedding vector corresponding to each character in the corpus to obtain the feature vector corresponding to each character; and
combine the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus.
Optionally, as a possible implementation, the first processing module 501 may be specifically configured to:
obtain a position embedding vector table, where the position embedding vector table records the correspondence between position distances and position embedding vectors; and
determine, in the position embedding vector table, the first position embedding vector corresponding to the first position distance of each character and the second position embedding vector corresponding to the second position distance of each character.
Optionally, as a possible implementation, the entity semantic relationship classification model is a model combining BiGRU with an attention mechanism, and the corpus is an electronic medical record.
An embodiment of the present application provides an electronic device. The electronic device includes a memory and a processor.
The memory is configured to store one or more programs and a preset entity semantic relationship classification model.
When the one or more programs are executed by the processor, the following is implemented:
receiving at least one training sample, and identifying the first entity and the second entity of each training sample in the at least one training sample;
for each training sample, obtaining the first position distance between each character in the training sample and the first entity of the training sample, and obtaining the second position distance between each character in the training sample and the second entity of the training sample;
combining the feature vectors corresponding to all characters in each training sample to obtain the model input vector corresponding to each training sample, where the feature vector corresponding to each character is obtained by combining the character vector corresponding to that character with its position embedding vector, and the position embedding vector corresponding to each character includes a vector corresponding to the character's first position distance and a vector corresponding to the character's second position distance; and
taking the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model, so as to train the entity semantic relationship classification model.
Optionally, when the one or more programs are executed by the processor, the following may be specifically implemented:
taking the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model, and obtaining the predicted entity semantic relationship type corresponding to each training sample produced by the model, the predicted entity semantic relationship type being the predicted entity semantic relationship type of the first entity and the second entity in each training sample;
obtaining the deviation value between the predicted entity semantic relationship type corresponding to each training sample and the preset entity semantic relationship type, the preset entity semantic relationship type being the prestored entity semantic relationship type of the first entity and the second entity corresponding to each training sample;
obtaining the sum of the deviation values of all the training samples; and
when the sum of the deviation values exceeds a first deviation threshold, adjusting the parameters in the entity semantic relationship classification model to train the entity semantic relationship classification model.
Optionally, when the one or more programs are executed by the processor, the following may be specifically implemented:
taking the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model, and obtaining the predicted entity semantic relationship type corresponding to each training sample produced by the model, the predicted entity semantic relationship type being the predicted entity semantic relationship type of the first entity and the second entity in each training sample;
obtaining the deviation value between the predicted entity semantic relationship type corresponding to each training sample and the preset entity semantic relationship type, the preset entity semantic relationship type being the prestored entity semantic relationship type of the first entity and the second entity corresponding to each training sample; and
whenever the deviation value of a target training sample in the at least one training sample exceeds a second deviation threshold, adjusting the parameters in the entity semantic relationship classification model to train the entity semantic relationship classification model.
Optionally, the entity semantic relationship classification model is a model combining BiGRU with an attention mechanism;
when the one or more programs are executed by the processor, the following may be specifically implemented:
adjusting the weight coefficients and bias coefficients of the GRU layer and the attention matrix of the attention layer in the entity semantic relationship classification model.
Optionally, no neuron of the GRU layer in the entity semantic relationship classification model is discarded during training.
Optionally, the entity semantic relationship classification model is a model combining BiGRU with an attention mechanism; the at least one training sample is at least one electronic medical record, and the model input vector corresponding to a training sample is a combination of n feature vectors, where n is the average number of characters contained in the at least one electronic medical record.
An embodiment of the present application provides an electronic device. The electronic device includes a memory and a processor.
The memory is configured to store one or more programs and a preset entity semantic relationship classification model.
When the one or more programs are executed by the processor, the following is implemented:
determining the first entity and the second entity in a corpus;
obtaining the first position distance between each character in the corpus and the first entity and the second position distance between each character and the second entity;
combining the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus, where the feature vector corresponding to each character is obtained by combining the character vector corresponding to that character in the corpus with its position embedding vector, and the position embedding vector corresponding to each character includes a vector corresponding to the character's first position distance and a vector corresponding to the character's second position distance; and
taking the model input vector corresponding to the corpus as the input of the entity semantic relationship classification model, and determining the entity semantic relationship type of the first entity and the second entity.
Optionally, when the one or more programs are executed by the processor, the following may be specifically implemented:
obtaining the character vector corresponding to each character in the corpus, the first position embedding vector corresponding to the first position distance of each character, and the second position embedding vector corresponding to the second position distance of each character;
combining the character vector, the first position embedding vector, and the second position embedding vector corresponding to each character in the corpus to obtain the feature vector corresponding to each character; and
combining the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus.
Optionally, when the one or more programs are executed by the processor, the following may be specifically implemented:
obtaining a position embedding vector table, where the position embedding vector table records the correspondence between position distances and position embedding vectors; and
determining, in the position embedding vector table, the first position embedding vector corresponding to the first position distance of each character and the second position embedding vector corresponding to the second position distance of each character.
Optionally, the entity semantic relationship classification model is a model combining BiGRU with an attention mechanism, and the corpus is an electronic medical record.
An embodiment of the present application also provides a computer-readable storage medium on which a computer program and a preset entity semantic relationship classification model are stored. When executed by a processor, the computer program implements:
determining the first entity and the second entity in a corpus;
obtaining the first position distance between each character in the corpus and the first entity and the second position distance between each character and the second entity;
combining the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus, where the feature vector corresponding to each character is obtained by combining the character vector corresponding to that character in the corpus with its position embedding vector, and the position embedding vector corresponding to each character includes a vector corresponding to the character's first position distance and a vector corresponding to the character's second position distance; and
taking the model input vector corresponding to the corpus as the input of the entity semantic relationship classification model, and determining the entity semantic relationship type of the first entity and the second entity.
An embodiment of the present application also provides a computer-readable storage medium on which a computer program and a preset entity semantic relationship classification model are stored. When executed by a processor, the computer program implements:
receiving at least one training sample, and identifying the first entity and the second entity of each training sample in the at least one training sample;
for each training sample, obtaining the first position distance between each character in the training sample and the first entity of the training sample, and obtaining the second position distance between each character in the training sample and the second entity of the training sample;
combining the feature vectors corresponding to all characters in each training sample to obtain the model input vector corresponding to each training sample, where the feature vector corresponding to each character is obtained by combining the character vector corresponding to that character with its position embedding vector, and the position embedding vector corresponding to each character includes a vector corresponding to the character's first position distance and a vector corresponding to the character's second position distance; and
taking the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model, so as to train the entity semantic relationship classification model.
An embodiment of the present application also provides a computer program which, when executed by a processor, implements:
determining the first entity and the second entity in a corpus;
obtaining the first position distance between each character in the corpus and the first entity and the second position distance between each character and the second entity;
combining the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus, where the feature vector corresponding to each character is obtained by combining the character vector corresponding to that character in the corpus with its position embedding vector, and the position embedding vector corresponding to each character includes a vector corresponding to the character's first position distance and a vector corresponding to the character's second position distance; and
taking the model input vector corresponding to the corpus as the input of the preset entity semantic relationship classification model, and determining the entity semantic relationship type of the first entity and the second entity.
An embodiment of the present application also provides a computer program which, when executed by a processor, implements:
receiving at least one training sample, and identifying the first entity and the second entity of each training sample in the at least one training sample;
for each training sample, obtaining the first position distance between each character in the training sample and the first entity of the training sample, and obtaining the second position distance between each character in the training sample and the second entity of the training sample;
combining the feature vectors corresponding to all characters in each training sample to obtain the model input vector corresponding to each training sample, where the feature vector corresponding to each character is obtained by combining the character vector corresponding to that character with its position embedding vector, and the position embedding vector corresponding to each character includes a vector corresponding to the character's first position distance and a vector corresponding to the character's second position distance; and
taking the model input vector corresponding to each training sample as the input of the preset entity semantic relationship classification model, so as to train the entity semantic relationship classification model.
In the embodiments provided in the present application, the disclosed apparatus and method may also be implemented in other ways. The apparatus embodiments described above are only schematic. For example, the flowcharts and block diagrams in the accompanying drawings show the possible architectures, functions, and operations of the apparatuses, methods, and computer program products according to the embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing specified logical functions. It should also be noted that in some alternative implementations, the functions noted in a block may occur out of the order noted in the figures. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks therein, can be implemented with a dedicated hardware-based system that performs the specified functions or actions, or with a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present application may be integrated together to form an independent part, each module may exist alone, or two or more modules may be integrated to form an independent part.
If the functions are implemented in the form of software functional modules and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, the part contributing to the prior art, or a part of the technical solution may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
In summary, the entity semantic relationship classification method, model training method, apparatus, electronic device, storage medium, and computer program provided by the embodiments of the present application determine the first entity and the second entity in a corpus; obtain the feature vector corresponding to each character according to the first position distance between each character in the corpus and the first entity and the second position distance between each character and the second entity; combine the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus; and take the model input vector corresponding to the corpus as the input of the entity semantic relationship classification model to obtain the entity semantic relationship type corresponding to the corpus. Compared with the related art, this can improve the classification accuracy of entity semantic relationships.
The above are only preferred embodiments of the present application and are not intended to limit the present application. For those skilled in the art, the present application may have various modifications and changes. Any modification, equivalent replacement, improvement, and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
It is obvious to those skilled in the art that the present application is not limited to the details of the above exemplary embodiments, and that the present application can be implemented in other specific forms without departing from the spirit or basic features of the present application. Therefore, the embodiments should be regarded as exemplary and non-limiting in every respect, and the scope of the present application is defined by the appended claims rather than by the above description; it is therefore intended that all changes falling within the meaning and scope of equivalents of the claims be included in the present application. Any reference signs in the claims should not be regarded as limiting the claims involved.

Claims (15)

  1. An entity semantic relationship classification method, applied to an electronic device in which an entity semantic relationship classification model is preset, the method comprising:
    determining a first entity and a second entity in a corpus;
    obtaining a first position distance between each character in the corpus and the first entity and a second position distance between each character and the second entity;
    combining feature vectors corresponding to all characters in the corpus to obtain a model input vector corresponding to the corpus, wherein the feature vector corresponding to each character is obtained by combining a character vector corresponding to that character in the corpus with a position embedding vector, and the position embedding vector corresponding to each character comprises a vector corresponding to the character's first position distance and a vector corresponding to the character's second position distance; and
    taking the model input vector corresponding to the corpus as the input of the entity semantic relationship classification model, and determining the entity semantic relationship type of the first entity and the second entity.
  2. The method according to claim 1, wherein the step of combining the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus comprises:
    obtaining the character vector corresponding to each character in the corpus, a first position embedding vector corresponding to the first position distance of each character, and a second position embedding vector corresponding to the second position distance of each character;
    combining the character vector, the first position embedding vector, and the second position embedding vector corresponding to each character in the corpus to obtain the feature vector corresponding to each character; and
    combining the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus.
  3. The method according to claim 2, wherein the step of obtaining the first position embedding vector corresponding to the first position distance of each character and the second position embedding vector corresponding to the second position distance of each character comprises:
    obtaining a position embedding vector table, wherein the position embedding vector table records the correspondence between position distances and position embedding vectors; and
    determining, in the position embedding vector table, the first position embedding vector corresponding to the first position distance of each character and the second position embedding vector corresponding to the second position distance of each character.
  4. A method for training an entity semantic relationship classification model, applied to an electronic device in which an entity semantic relationship classification model is preset, the method comprising:
    receiving at least one training sample, and identifying a first entity and a second entity of each training sample in the at least one training sample;
    for each training sample, obtaining a first position distance between each character in the training sample and the first entity of the training sample, and obtaining a second position distance between each character in the training sample and the second entity of the training sample;
    combining feature vectors corresponding to all characters in each training sample to obtain a model input vector corresponding to each training sample, wherein the feature vector corresponding to each character is obtained by combining a character vector corresponding to that character with a position embedding vector, and the position embedding vector corresponding to each character comprises a vector corresponding to the character's first position distance and a vector corresponding to the character's second position distance; and
    taking the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model, so as to train the entity semantic relationship classification model.
  5. The method according to claim 4, wherein the step of taking the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model so as to train the entity semantic relationship classification model comprises:
    taking the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model, and obtaining a predicted entity semantic relationship type corresponding to each training sample produced by the entity semantic relationship classification model, the predicted entity semantic relationship type being the predicted entity semantic relationship type of the first entity and the second entity in each training sample;
    obtaining a deviation value between the predicted entity semantic relationship type corresponding to each training sample and a preset entity semantic relationship type, the preset entity semantic relationship type being the prestored entity semantic relationship type of the first entity and the second entity corresponding to each training sample;
    obtaining the sum of the deviation values of all the training samples; and
    when the sum of the deviation values exceeds a first deviation threshold, adjusting the parameters in the entity semantic relationship classification model to train the entity semantic relationship classification model.
  6. The method according to claim 4, wherein the step of taking the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model so as to train the entity semantic relationship classification model comprises:
    taking the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model, and obtaining a predicted entity semantic relationship type corresponding to each training sample produced by the entity semantic relationship classification model, the predicted entity semantic relationship type being the predicted entity semantic relationship type of the first entity and the second entity in each training sample;
    obtaining a deviation value between the predicted entity semantic relationship type corresponding to each training sample and a preset entity semantic relationship type, the preset entity semantic relationship type being the prestored entity semantic relationship type of the first entity and the second entity corresponding to each training sample; and
    whenever the deviation value of a target training sample in the at least one training sample exceeds a second deviation threshold, adjusting the parameters in the entity semantic relationship classification model to train the entity semantic relationship classification model.
  7. The method according to claim 5, wherein the entity semantic relationship classification model is a model combining a bidirectional gated recurrent unit network (BiGRU) with an attention mechanism, and the step of adjusting the parameters in the entity semantic relationship classification model comprises:
    adjusting the weight coefficients and bias coefficients of the gated recurrent unit (GRU) layer and the attention matrix of the attention layer in the entity semantic relationship classification model.
  8. The method according to claim 4, wherein the entity semantic relationship classification model is a model combining a bidirectional gated recurrent unit network (BiGRU) with an attention mechanism;
    the at least one training sample is at least one electronic medical record, and the model input vector corresponding to a training sample is a combination of n feature vectors, wherein n is the average number of characters contained in the at least one electronic medical record.
  9. An electronic device, comprising:
    a memory configured to store one or more programs and a preset entity semantic relationship classification model; and
    a processor;
    wherein, when the one or more programs are executed by the processor, the following is implemented:
    determining a first entity and a second entity in a corpus;
    obtaining a first position distance between each character in the corpus and the first entity and a second position distance between each character and the second entity;
    combining feature vectors corresponding to all characters in the corpus to obtain a model input vector corresponding to the corpus, wherein the feature vector corresponding to each character is obtained by combining a character vector corresponding to that character in the corpus with a position embedding vector, and the position embedding vector corresponding to each character comprises a vector corresponding to the character's first position distance and a vector corresponding to the character's second position distance; and
    taking the model input vector corresponding to the corpus as the input of the entity semantic relationship classification model, and determining the entity semantic relationship type of the first entity and the second entity.
  10. The electronic device according to claim 9, wherein, when the one or more programs are executed by the processor, the following is specifically implemented:
    obtaining the character vector corresponding to each character in the corpus, a first position embedding vector corresponding to the first position distance of each character, and a second position embedding vector corresponding to the second position distance of each character;
    combining the character vector, the first position embedding vector, and the second position embedding vector corresponding to each character in the corpus to obtain the feature vector corresponding to each character; and
    combining the feature vectors corresponding to all characters in the corpus to obtain the model input vector corresponding to the corpus.
  11. The electronic device according to claim 10, wherein, when the one or more programs are executed by the processor, the following is specifically implemented:
    obtaining a position embedding vector table, wherein the position embedding vector table records the correspondence between position distances and position embedding vectors; and
    determining, in the position embedding vector table, the first position embedding vector corresponding to the first position distance of each character and the second position embedding vector corresponding to the second position distance of each character.
  12. An electronic device, comprising:
    a memory configured to store one or more programs and a preset entity semantic relationship classification model; and
    a processor;
    wherein, when the one or more programs are executed by the processor, the following is implemented:
    receiving at least one training sample, and identifying a first entity and a second entity of each training sample in the at least one training sample;
    for each training sample, obtaining a first position distance between each character in the training sample and the first entity of the training sample, and obtaining a second position distance between each character in the training sample and the second entity of the training sample;
    combining feature vectors corresponding to all characters in each training sample to obtain a model input vector corresponding to each training sample, wherein the feature vector corresponding to each character is obtained by combining a character vector corresponding to that character with a position embedding vector, and the position embedding vector corresponding to each character comprises a vector corresponding to the character's first position distance and a vector corresponding to the character's second position distance; and
    taking the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model, so as to train the entity semantic relationship classification model.
  13. The electronic device according to claim 12, wherein, when the one or more programs are executed by the processor, the following is specifically implemented:
    taking the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model, and obtaining a predicted entity semantic relationship type corresponding to each training sample produced by the entity semantic relationship classification model, the predicted entity semantic relationship type being the predicted entity semantic relationship type of the first entity and the second entity in each training sample;
    obtaining a deviation value between the predicted entity semantic relationship type corresponding to each training sample and a preset entity semantic relationship type, the preset entity semantic relationship type being the prestored entity semantic relationship type of the first entity and the second entity corresponding to each training sample;
    obtaining the sum of the deviation values of all the training samples; and
    when the sum of the deviation values exceeds a first deviation threshold, adjusting the parameters in the entity semantic relationship classification model to train the entity semantic relationship classification model.
  14. The electronic device according to claim 12, wherein, when the one or more programs are executed by the processor, the following is specifically implemented:
    taking the model input vector corresponding to each training sample as the input of the entity semantic relationship classification model, and obtaining a predicted entity semantic relationship type corresponding to each training sample produced by the entity semantic relationship classification model, the predicted entity semantic relationship type being the predicted entity semantic relationship type of the first entity and the second entity in each training sample;
    obtaining a deviation value between the predicted entity semantic relationship type corresponding to each training sample and a preset entity semantic relationship type, the preset entity semantic relationship type being the prestored entity semantic relationship type of the first entity and the second entity corresponding to each training sample; and
    whenever the deviation value of a target training sample in the at least one training sample exceeds a second deviation threshold, adjusting the parameters in the entity semantic relationship classification model to train the entity semantic relationship classification model.
  15. The electronic device according to claim 12, wherein the entity semantic relationship classification model is a model combining a bidirectional gated recurrent unit network (BiGRU) with an attention mechanism;
    the at least one training sample is at least one electronic medical record, and the model input vector corresponding to a training sample is a combination of n feature vectors, wherein n is the average number of characters contained in the at least one electronic medical record.
PCT/CN2019/127449 2018-12-29 2019-12-23 Entity semantic relationship classification WO2020135337A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2021534922A JP7202465B2 (ja) 2018-12-29 2019-12-23 エンティティ意味関係分類
US17/287,476 US20210391080A1 (en) 2018-12-29 2019-12-23 Entity Semantic Relation Classification
EP19901450.7A EP3985559A4 (en) 2018-12-29 2019-12-23 CLASSIFICATION OF SEMANTIC ENTITY RELATIONSHIPS

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811641958.3A 2018-12-29 2018-12-29 Entity semantic relationship classification method, model training method, apparatus, and electronic device
CN201811641958.3 2018-12-29

Publications (1)

Publication Number Publication Date
WO2020135337A1 true WO2020135337A1 (zh) 2020-07-02

Family

ID=66404511

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/127449 WO2020135337A1 (zh) 2018-12-29 2019-12-23 实体语义关系分类

Country Status (5)

Country Link
US (1) US20210391080A1 (zh)
EP (1) EP3985559A4 (zh)
JP (1) JP7202465B2 (zh)
CN (1) CN109754012A (zh)
WO (1) WO2020135337A1 (zh)


Also Published As

Publication number Publication date
US20210391080A1 (en) 2021-12-16
CN109754012A (zh) 2019-05-14
JP2022514842A (ja) 2022-02-16
EP3985559A4 (en) 2022-10-05
JP7202465B2 (ja) 2023-01-11
EP3985559A1 (en) 2022-04-20

Legal Events

Code | Title | Description
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19901450; Country of ref document: EP; Kind code of ref document: A1
ENP | Entry into the national phase | Ref document number: 2021534922; Country of ref document: JP; Kind code of ref document: A
NENP | Non-entry into the national phase | Ref country code: DE
WWE | Wipo information: entry into national phase | Ref document number: 2019901450; Country of ref document: EP
ENP | Entry into the national phase | Ref document number: 2019901450; Country of ref document: EP; Effective date: 20210729