WO2020215683A1 - Semantic recognition method and apparatus based on convolutional neural network, and non-volatile readable storage medium and computer device


Info

Publication number
WO2020215683A1
WO2020215683A1 (PCT/CN2019/117723)
Authority
WO
WIPO (PCT)
Prior art keywords
convolutional neural
neural network
loss function
named entity
text
Prior art date
Application number
PCT/CN2019/117723
Other languages
French (fr)
Chinese (zh)
Inventor
金戈 (Jin Ge)
徐亮 (Xu Liang)
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2020215683A1 publication Critical patent/WO2020215683A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • This application relates to the technical field of text processing, and in particular to a semantic recognition method and apparatus based on convolutional neural networks, a non-volatile readable storage medium, and computer equipment.
  • the disadvantage of the prior art is that the two independent recognition models used for named entity recognition and entity relationship recognition are prone to information redundancy when used jointly.
  • the current solution is limited to partially combining these two independent recurrent-neural-network-based recognition models to increase the computation speed of the network model, thereby improving the efficiency of named entity recognition and entity relationship recognition, but the improvement is weak.
  • this application provides a semantic recognition method and device based on a convolutional neural network, a non-volatile readable storage medium, and computer equipment.
  • the main purpose is to solve the problems that the two existing independent models used for named entity recognition and entity relationship recognition are prone to information redundancy between each other, and that the adopted network models compute slowly.
  • a semantic recognition method based on a convolutional neural network including:
  • the third convolutional neural network preset in the semantic recognition model is used to determine the entity relationship in the text to be recognized according to the obtained text vector and the determined named entity.
  • a semantic recognition device based on a convolutional neural network including:
  • the first convolutional neural network module is used to obtain the text vector of the text to be recognized by using the first convolutional neural network preset in the semantic recognition model;
  • the second convolutional neural network module is configured to use the second convolutional neural network preset in the semantic recognition model to determine the named entity in the text to be recognized according to the obtained text vector;
  • the third convolutional neural network module is used to use the preset third convolutional neural network in the semantic recognition model to determine the entity relationship in the text to be recognized according to the obtained text vector and the determined named entity.
  • a non-volatile readable storage medium having computer readable instructions stored thereon; when the instructions are executed by a processor, the above semantic recognition method based on a convolutional neural network is realized.
  • a computer device including a non-volatile readable storage medium, a processor, and computer readable instructions stored on the non-volatile readable storage medium and executable on the processor; when the processor executes the instructions, the above semantic recognition method based on a convolutional neural network is realized.
  • compared with the existing scheme that uses recurrent neural networks for named entity recognition and entity relationship recognition, the convolutional neural network-based semantic recognition method and device, non-volatile readable storage medium, and computer equipment provided in this application proceed differently:
  • this application uses the first convolutional neural network preset in the semantic recognition model to obtain the text vector of the text to be recognized, uses the second convolutional neural network preset in the semantic recognition model to determine the named entity in the text to be recognized according to the acquired text vector, and uses the third convolutional neural network preset in the semantic recognition model to determine the entity relationship in the text to be recognized according to the acquired text vector and the determined named entity.
  • FIG. 1 shows a schematic flowchart of a semantic recognition method based on a convolutional neural network provided by an embodiment of the present application
  • FIG. 2 shows a schematic flowchart of another semantic recognition method based on a convolutional neural network provided by an embodiment of the present application
  • Fig. 3 shows a schematic structural diagram of a semantic recognition device based on a convolutional neural network provided by an embodiment of the present application.
  • the preprocessing can be specifically set according to the actual application scenario. For example, the preprocessing is set as word segmentation processing, that is, the text to be recognized is marked word by word; or the preprocessing is set as word filtering processing, that is, after word segmentation is performed on the text to be recognized, unimportant words are eliminated, such as auxiliary verbs like "can, should" and interjections like "oh, ah", to improve the efficiency of semantic recognition of the text to be recognized. The preprocessing is not specifically limited here.
  • specifically, word segmentation marks the words in the text to be recognized using SBME notation: a single-character word is marked as S, the beginning of a multi-character word as B, the middle characters as M, and the end as E.
  • the initial text vector is generated according to the marked text to be recognized.
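The SBME marking described above can be sketched in plain Python; this is an illustrative implementation (the application does not give one), and the function name is a placeholder:

```python
def sbme_tags(words):
    """Assign S/B/M/E tags to each character of pre-segmented words.

    S = single-character word, B = beginning of a multi-character word,
    M = middle character, E = end character.
    """
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append((word, "S"))
        else:
            tags.append((word[0], "B"))          # word start
            for ch in word[1:-1]:
                tags.append((ch, "M"))           # word middle
            tags.append((word[-1], "E"))         # word end
    return tags
```

For example, the segmented words `["a", "bc", "def"]` are tagged `S`, `B E`, and `B M E` respectively; the tagged sequence is then what the initial text vector is generated from.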
  • the training sample set includes multiple phrase corpora.
  • the phrase corpus is in short sentence format, that is, short sentences are separated by commas.
  • each phrase corpus includes two interrelated words, for example, "China, Shanghai", and the relationship between the two words in each phrase corpus is marked; for example, the relationship between "China" and "Shanghai" is marked as an upper-lower (hypernym-hyponym) relationship, so as to construct the training sample set.
  • the relationship between the two words in a phrase corpus can be marked in various ways. For example, the relationship between "Copyright Office" and "Trademark Office" is marked as a parallel relationship, and the word attribute of both words in "Copyright Office, Trademark Office" is marked as national institution; the relationship between "Canidae" and "dog" is marked as an inclusion relationship, and the word attribute of both words in "Canidae, dog" is marked as animal. The mutual relationship is not specifically limited here.
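One plausible in-memory layout of such a training sample set follows; the field names and relation-label strings are assumptions for illustration, not part of the application:

```python
# Each sample: two interrelated words, a relation label, and an
# optional shared word attribute (as in the examples above).
training_samples = [
    {"words": ("China", "Shanghai"), "relation": "hypernym-hyponym"},
    {"words": ("Copyright Office", "Trademark Office"),
     "relation": "parallel", "attribute": "national institution"},
    {"words": ("Canidae", "dog"),
     "relation": "inclusion", "attribute": "animal"},
]

def relations_in(samples):
    """Collect the distinct relation labels present in the sample set."""
    return sorted({s["relation"] for s in samples})
```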
  • the preset second convolutional neural network is used to identify the named entities contained in the text to be recognized: the output of the preset first convolutional neural network serves as its input, and its output is the named entities contained in the text to be recognized.
  • the text to be recognized includes multiple words, and a named entity or named entity category is output for each word.
  • named entity categories include person names, place names, organization names, product names, proper nouns, and the like.
  • the preset third convolutional neural network is used to identify the entity relationships contained in the text to be recognized: the outputs of the preset first and second convolutional neural networks serve as its input, and its output is the entity relationships between the named entities contained in the text to be recognized.
  • since the number of named entities output by the preset second convolutional neural network is two or three, the preset third convolutional neural network only outputs the entity relationships between those two or three named entities.
  • because the text to be recognized is in short sentence format and the preset third convolutional neural network is used only to recognize relationships between a small number of named entities, the recognition efficiency of the text to be recognized is significantly improved.
  • the acquired text to be recognized can thus be hierarchically recognized by the constructed semantic recognition model, with different convolutional neural networks in the model recognizing the named entities and entity relationships in the text to be recognized.
  • this embodiment not only improves the recognition efficiency of the text to be recognized but also avoids the information redundancy caused by the joint use of the two existing independent recognition models.
  • the application scenarios of this embodiment are also broader: it can be applied to recognizing named entities alone, recognizing entity relationships alone, or recognizing named entities and entity relationships simultaneously, without building different semantic recognition models for different needs.
  • this reduces the cost of later model maintenance and optimization, and while reducing cost it sacrifices neither semantic recognition efficiency nor semantic recognition accuracy.
  • the method includes:
  • the loss functions of the second and third convolutional neural networks are constructed based on cross entropy: the loss function of the second convolutional neural network is the cross entropy used to identify named entities, and the loss function of the third convolutional neural network is the cross entropy used to identify entity relationships.
  • the first loss function, the second loss function, and the third loss function can be set differently according to the initialized first convolutional neural network, second convolutional neural network, and third convolutional neural network.
  • the same loss function can also be used.
  • the first loss function, the second loss function, and the third loss function are not specifically limited here.
  • in this embodiment, the first loss function, the second loss function, and the third loss function are set to be the same, and the calculation formula is the cross entropy H(p, q) = -Σx p(x) log q(x)
  • x is a data sample in the sample set used to train the initialized first, second, and third convolutional neural networks
  • p is the true probability distribution of the sample set, and q is the predicted (non-true) probability distribution.
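The cross-entropy loss named above can be computed directly; this is a minimal sketch over discrete distributions (the `eps` smoothing term is an assumption added to avoid log of zero):

```python
import math

def cross_entropy(p, q, eps=1e-12):
    """Cross entropy H(p, q) = -sum_x p(x) * log(q(x)).

    p: true probability distribution over the classes,
    q: predicted probability distribution from the network.
    """
    return -sum(pi * math.log(qi + eps) for pi, qi in zip(p, q))
```

A prediction closer to the true distribution yields a smaller loss, which is what training minimizes.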
  • the determined first, second, and third loss functions are used to train the initialized first, second, and third convolutional neural networks to obtain the preset first, second, and third convolutional neural networks.
  • step 202 may specifically include: determining the loss function of the semantic recognition model according to the determined first, second, and third loss functions; and using the loss function of the semantic recognition model to train the initialized first, second, and third convolutional neural networks to obtain the preset first, second, and third convolutional neural networks.
  • the determined first, second, and third loss functions are added and averaged to obtain the loss function of the semantic recognition model. Further, if the text to be recognized in the actual application scenario contains many named entities, the weight of the second loss function should be increased accordingly; if it contains many entity relationships, the weight of the third loss function should be increased accordingly. The calculation method of the loss function of the semantic recognition model is not specifically limited here.
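The weighted combination just described can be sketched as follows; the weight parameters are illustrative (with equal weights this reduces to the plain average the text describes):

```python
def model_loss(l1, l2, l3, w1=1.0, w2=1.0, w3=1.0):
    """Combine the three per-network losses into one model loss.

    Raise w2 when the texts contain many named entities, or w3 when
    they contain many entity relationships, as suggested above.
    """
    return (w1 * l1 + w2 * l2 + w3 * l3) / (w1 + w2 + w3)
```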
  • the convex optimization algorithm is used to automatically update the network parameters in the hidden layer of the neural network.
  • the preset first convolutional neural network, second convolutional neural network, and third convolutional neural network are obtained.
  • convex optimization, also known as convex minimization, is a subfield of mathematical optimization; it exploits the property that a local optimum is also a global optimum to update the network parameters in the hidden layers of the neural network.
  • the adaptive moment estimation (Adam: Adaptive Moment Estimation) optimization algorithm is a first-order optimization algorithm that can replace the traditional stochastic gradient descent process; in this embodiment, the Adam optimization algorithm is used to update the network parameters in the hidden layers of the neural network.
  • using Python's TensorFlow library, the loss function of the semantic recognition model is optimized as a convex function. Specifically, with minimizing the loss function as the goal, the Adam optimization algorithm iteratively updates the network parameters in the semantic recognition model to obtain the preset first, second, and third convolutional neural networks. The number of convolutional layers in the semantic recognition model is not specifically limited here.
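The Adam iteration can be sketched in a few lines of plain Python. This is a minimal one-parameter illustration, not the application's TensorFlow code; the hyperparameter values are the common defaults and are assumptions here:

```python
def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=1000):
    """Minimal Adam loop minimizing a scalar function via its gradient.

    For a convex loss, the local minimum found this way is also the
    global minimum, which is the property the text relies on.
    """
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (v_hat ** 0.5 + eps)
    return x
```

For instance, minimizing the convex loss f(x) = (x - 3)^2 via its gradient 2(x - 3) drives x toward 3.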
  • the specific training process compares the named entity recognition result output by the second convolutional neural network with the named entities or labeled word attributes in the training sample set; if they are inconsistent, the recognition is wrong. Likewise, based on the named entity recognition result output by the second convolutional neural network, the entity relationship recognition result output by the third convolutional neural network is compared with the entity relationship labeled for the corresponding named entities in the training sample set; if they are inconsistent, the recognition is wrong.
  • the loss function of the semantic recognition model is used to correct the erroneous recognition results, and then complete the training of the semantic recognition model, and obtain a semantic recognition model capable of simultaneously performing named entity recognition and entity relationship recognition.
  • the initialized text vector is obtained by performing word segmentation processing on the acquired text to be recognized, and the initialized text vector is used as the input of the first convolutional neural network preset in the semantic recognition model.
  • the embedding layer of the preset first convolutional neural network uses a preset word vector dictionary to convert the initialized text vector into character vectors and word vectors representing the text to be recognized.
  • the preset word vector dictionary contains the character vector corresponding to each character and the word vector corresponding to each word in the initialized text vector.
  • the preset first convolutional neural network includes a double-layer one-dimensional full convolution structure; the character vectors and word vectors from the embedding layer pass through this structure to output the text vector of the text to be recognized.
  • the convolution kernel performs a convolution operation (i.e., dot product) with the character vectors and word vectors of the text to be recognized, and all resulting convolution outputs together form the text vector of the text to be recognized.
  • the length of the convolution kernel is set to 3; that is, a kernel of dimension 3 is convolved with the character vectors and word vectors of the text to be recognized, and the resulting text vector serves as the input of both the preset second and third convolutional neural networks.
  • the preset first convolutional neural network is a network structure shared by the preset second and third convolutional neural networks, thereby realizing the sharing of the underlying parameters between them; this effectively avoids the information redundancy caused by the joint use of the two existing independent recognition models and further improves the efficiency of semantic recognition.
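The shared double-layer one-dimensional convolution trunk can be sketched as below. This is an illustrative scalar-sequence version (real inputs would be vectors per character/word); zero padding keeps the output the same length as the input, matching the full-convolution behavior described here:

```python
def conv1d_same(seq, kernel):
    """1-D convolution with zero padding that preserves sequence length."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(seq))]

def shared_trunk(seq, kernel1, kernel2):
    """Two stacked 1-D convolutions (kernel length 3 in the text);
    the single output feeds both the entity network and the
    relation network, so the underlying parameters are shared."""
    return conv1d_same(conv1d_same(seq, kernel1), kernel2)
```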
  • the second convolutional neural network preset in the semantic recognition model is used to perform named entity recognition (NER: Named Entity Recognition) on the obtained text vector to obtain the named entity to be determined.
  • named entity recognition is also called "proper name recognition"; it refers to recognizing entities with specific meaning in the text to be recognized.
  • the preset second convolutional neural network is a dense connection structure DenseNet.
  • the dense connection structure has a large number of dense connections, which can maximize the information flow between all layers in the neural network.
  • the input of each layer of the neural network is the union of the output of all the previous layers, and the feature map output by this layer will also be directly passed to all subsequent layers as input, so as to realize the repeated use of features and reduce redundancy.
  • the preset second convolutional neural network includes a two-layer convolution structure; based on this structure, further convolution operations are performed on the convolution result output by the preset first convolutional neural network in the semantic recognition model to obtain the named entities to be determined.
  • the convolution structure in the preset second convolutional neural network is a one-dimensional full convolution structure, which keeps its output the same length as its input; that is, based on the one-dimensional full convolution structure, the convolution result output by the preset first convolutional neural network and the result output through the one-dimensional full convolution structure are sequences of equal length.
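The dense-connection pattern described above (each layer receiving the union of all previous outputs) can be sketched as follows; this toy version uses flat lists of floats in place of feature maps:

```python
def dense_block(x, layers):
    """DenseNet-style forward pass.

    Each layer receives the concatenation of the input and all
    previous layers' outputs, and its own output is passed on to
    every later layer, enabling feature reuse."""
    features = [list(x)]
    for layer in layers:
        concat = [v for f in features for v in f]  # union of prior outputs
        features.append(layer(concat))
    return [v for f in features for v in f]
```

With two "double every value" layers, the input `[1.0]` grows to `[1.0, 2.0, 2.0, 4.0]`: the second layer sees both the raw input and the first layer's output.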
  • step 206 may specifically include: if the boundary character recognition result of the named entity to be determined is consistent with the preset boundary character recognition result, determining that the named entity to be determined is the final named entity; if the boundary character recognition result of the named entity to be determined is inconsistent with the preset boundary character recognition result, using the named entity to be determined as a new training sample for the semantic recognition model.
  • the second convolutional neural network preset in the semantic recognition model performs boundary character recognition according to the SBME marks in the obtained named entity to be determined. Specifically, if the mark in the obtained named entity to be determined is S, that is, the named entity to be determined is a single character, that single character is recognized; if the recognition result is consistent with the preset boundary character recognition result, the single character is determined to be the final named entity. For example, if the recognized named entity to be determined is "cat" and the recognition result is consistent with the preset boundary character recognition result, the final recognized named entity is "cat". If the recognition result is inconsistent with the preset boundary character recognition result, the character is not a named entity.
  • otherwise, the named entity is recognized according to the marks B and E; if the recognition result is consistent with the preset boundary character recognition result, the named entity to be determined is the final named entity.
  • for example, if the marks in the named entity to be determined include B, M, and E, and the characters marked B and E correspond to "pre" and "home" (the boundary characters of "prophet"), and the recognition result is consistent with the preset boundary character recognition result, the final named entity is recognized as "prophet". If the marks include B and E, and the characters marked B and E correspond to "work" and "home" (the boundary characters of "writer"), and the recognition result is consistent with the preset boundary character recognition result, the final named entity is recognized as "writer". If the recognition result is inconsistent with the preset boundary character recognition result, the double-character or multi-character string is not a named entity.
  • if the recognition result is not a named entity, for example, the named entity to be determined is "writer" but its recognition result is inconsistent with the preset boundary character recognition result, then "writer" is used as a new training sample for the semantic recognition model, and the model is further optimized to improve its recognition accuracy.
  • the preset boundary character recognition result can be the single character of a named entity and the head and tail characters of double-character and multi-character entities, or the word attributes marked for the words in the training sample set, that is, the word attributes of single characters as well as of the head and tail characters of double-character and multi-character words.
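The boundary-character check sketched from the description above: a candidate entity is well formed if its tag sequence is a single S, or starts with B, ends with E, and has only M in between. This is an illustrative reading of the rule, not code from the application:

```python
def validate_entity(tag_seq):
    """Return True if an SBME tag sequence forms a well-formed entity.

    Candidates failing the check are, per the text, fed back into
    the training sample set rather than emitted as named entities.
    """
    if tag_seq == ["S"]:
        return True
    return (len(tag_seq) >= 2
            and tag_seq[0] == "B"
            and tag_seq[-1] == "E"
            and all(t == "M" for t in tag_seq[1:-1]))
```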
  • the text to be recognized can include one or more named entities. Therefore, according to the text vector of the text to be recognized, the activation function softmax preset in the second convolutional neural network outputs the recognition result, which corresponds to the one or more named entities included in the text to be recognized.
  • the second convolutional neural network also includes a softmax activation function; based on it, the result obtained through the two-layer convolution structure in the second convolutional neural network (i.e., the named entities to be determined) is further classified to obtain the final named entities.
  • the preset third convolutional neural network is a densely connected structure DenseNet.
  • a convolutional layer and a pooling layer are constructed, and the recognition result is output through a fully connected layer with the softmax activation function; the output is a multi-class variable, that is, the one or more entity relationships included in the text to be recognized are determined according to the probability values of the different classes.
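The softmax output stage can be sketched as below: the class scores are normalized into probabilities and the highest-probability relation label is taken as the result. The relation-label strings in the usage are illustrative assumptions:

```python
import math

def softmax(logits):
    """Normalize class scores into a probability distribution."""
    m = max(logits)                      # subtract max for stability
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def predict_relation(logits, classes):
    """Pick the relation label with the largest softmax probability,
    as the multi-class output described above."""
    probs = softmax(logits)
    return classes[probs.index(max(probs))]
```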
  • the correspondence between the named entities and the labeled entity relationships in the training sample set is used to determine the entity relationship, and the recognized entity relationship is compared with the determined one. If they are consistent, the recognized entity relationship is the entity relationship in the text to be recognized; if they are inconsistent, the recognition is wrong, and the erroneous recognition result is adjusted to the correspondence between the named entities and the labeled entity relationship in the training sample set.
  • the named entity to be determined is added to the training sample set as a new phrase corpus for training the semantic recognition model, and its word attribute is marked in the phrase corpus as a recognition error, so that after optimization training the semantic recognition model can effectively improve its recognition accuracy on the text to be recognized.
  • the first convolutional neural network preset in the semantic recognition model obtains the text vector of the text to be recognized; the second convolutional neural network preset in the model determines the named entities in the text to be recognized according to the obtained text vector; and the third convolutional neural network preset in the model determines the entity relationships in the text to be recognized according to the obtained text vector and the determined named entities.
  • the system can thus use the semantic recognition model to recognize a sentence input by the user accurately and quickly, thereby providing users with more precise services and improving user experience.
  • an embodiment of the present application provides a semantic recognition apparatus based on a convolutional neural network. The apparatus includes: a first convolutional neural network module 31, a second convolutional neural network module 32, and a third convolutional neural network module 33.
  • the first convolutional neural network module 31 can be used to obtain the text vector of the text to be recognized using the first convolutional neural network preset in the semantic recognition model; it is the main functional module with which the apparatus obtains the text vector of the text to be recognized.
  • the second convolutional neural network module 32 may be used to use the second convolutional neural network preset in the semantic recognition model to determine the named entity in the text to be recognized according to the text vector obtained by the first convolutional neural network module 31;
  • the second convolutional neural network module 32 is the main functional module of the device for identifying named entities in the text to be recognized, and is also the core functional module of the device.
  • the third convolutional neural network module 33 can be used, via the third convolutional neural network preset in the semantic recognition model, to determine the entity relationships in the text to be recognized according to the text vector obtained by the first convolutional neural network module 31 and the named entities determined by the second convolutional neural network module 32; it is the main functional module with which the apparatus recognizes entity relationships in the text to be recognized, and is also a core functional module of the apparatus.
  • the first convolutional neural network module 31 can specifically be used to obtain the character vectors and word vectors of the text to be recognized using the word vector dictionary, and to perform convolution operations on them to obtain the text vector of the text to be recognized.
  • a training module 34 can be used to determine the first, second, and third loss functions according to the initialized first, second, and third convolutional neural networks, and to train the initialized first, second, and third convolutional neural networks according to the determined loss functions to obtain the preset first, second, and third convolutional neural networks.
  • the training module 34 can specifically be used to determine the loss function of the semantic recognition model according to the determined first, second, and third loss functions, and to use that loss function to train the initialized first, second, and third convolutional neural networks to obtain the preset first, second, and third convolutional neural networks.
  • the second convolutional neural network module 32 can specifically be used to perform convolution operations on the acquired text vector to obtain the named entities to be determined, and to perform boundary character recognition on them, determining the final named entities according to the recognition result.
  • the second convolutional neural network module 32 can also specifically be used to determine that the named entity to be determined is the final named entity if its boundary character recognition result is consistent with the preset boundary character recognition result, and to use the named entity to be determined as a new training sample for the semantic recognition model if the results are inconsistent.
  • the training module 34 may be specifically used to train the semantic recognition model using the newly added training samples to obtain an optimized semantic recognition model. It should be noted that, for other corresponding descriptions of the functional units involved in the convolutional neural network-based semantic recognition device provided by the embodiment of the present application, reference may be made to the corresponding descriptions in FIG. 1 and FIG. 2, and details are not repeated here.
  • An embodiment of the present application also provides a non-volatile readable storage medium on which computer readable instructions are stored; when executed by a processor, the instructions implement the semantic recognition method based on a convolutional neural network shown in FIG. 1 and FIG. 2.
  • The technical solution of the present application can be embodied in the form of a software product, which can be stored in a non-volatile readable storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) and includes several instructions that enable a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in each implementation scenario of this application.
  • The embodiments of the present application also provide a computer device, which may be a personal computer, a server, or a network device, etc.
  • The device includes a non-volatile readable storage medium and a processor; the non-volatile readable storage medium is used to store computer readable instructions, and the processor executes those instructions to implement the semantic recognition method based on a convolutional neural network shown in FIG. 1 and FIG. 2.
  • the computer device may also include a user interface, a network interface, a camera, a radio frequency (RF) circuit, a sensor, an audio circuit, a Wi-Fi module, and so on.
  • The user interface may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface may also include a USB interface, a card-reader interface, and the like.
  • The network interface may optionally include a standard wired interface or a wireless interface (such as a Bluetooth interface or a Wi-Fi interface), etc.
  • the non-volatile readable storage medium may also include an operating system and a network communication module.
  • the operating system is a program that manages the hardware and software resources of computer equipment, and supports the operation of information processing programs and other software and/or programs.
  • the network communication module is used to implement communication between various components in the non-volatile readable storage medium and communication with other hardware and software in the physical device.
  • This embodiment can effectively avoid the information redundancy caused when the two existing independent recognition models are used jointly, thereby effectively improving the efficiency of semantic recognition.

Abstract

Disclosed are a semantic recognition method and apparatus based on a convolutional neural network, and a non-volatile readable storage medium and a computer device, which relate to the technical field of text processing and can improve the efficiency of semantic recognition. The method comprises: using a first convolutional neural network preset in a semantic recognition model to acquire a text vector of a text to be recognized; using a second convolutional neural network preset in the semantic recognition model to determine, according to the acquired text vector, a named entity in said text; and using a third convolutional neural network preset in the semantic recognition model to determine, according to the acquired text vector and the determined named entity, an entity relationship in said text. The present application is applicable to intelligent question-answering of customer services in an insurance product business.

Description

Semantic recognition method and apparatus based on a convolutional neural network, non-volatile readable storage medium, and computer device
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on April 26, 2019, with application number 201910345595.7 and entitled "Semantic Recognition Method and Apparatus Based on Convolutional Neural Network, Storage Medium and Computer Equipment", the entire content of which is incorporated herein by reference.
Technical Field
This application relates to the technical field of text processing, and in particular to a semantic recognition method and apparatus based on a convolutional neural network, a non-volatile readable storage medium, and a computer device.
Background
With the development of science and technology, there are more and more methods for recognizing words and the relationships between words, and the applicable scenarios are increasingly broad — for example, hierarchical relationships between place names, hierarchical relationships between state institutions, and inclusion relationships between item categories. These tasks require separate, independent recognition models to recognize words (i.e., named entities) and the relationships between words (i.e., entity relationships).
A shortcoming of the prior art is that the two independent recognition models used for named entity recognition and entity relationship recognition tend to produce redundant information when used jointly. The current remedy is limited to partially combining the two independent recognition models on the basis of a recurrent neural network to increase the computation speed of the network model, and thereby the efficiency of named entity recognition and entity relationship recognition, but the improvement is weak.
Summary of the Invention
In view of this, the present application provides a semantic recognition method and apparatus based on a convolutional neural network, a non-volatile readable storage medium, and a computer device, the main purpose of which is to solve the problems that the two existing independent recognition models for named entity recognition and entity relationship recognition tend to produce redundant information when used jointly, and that the computation speed of the network models employed is low.
According to one aspect of the present application, a semantic recognition method based on a convolutional neural network is provided, the method including:
obtaining a text vector of a text to be recognized by using a first convolutional neural network preset in a semantic recognition model;
determining a named entity in the text to be recognized according to the obtained text vector by using a second convolutional neural network preset in the semantic recognition model; and
determining an entity relationship in the text to be recognized according to the obtained text vector and the determined named entity by using a third convolutional neural network preset in the semantic recognition model.
According to another aspect of the present application, a semantic recognition apparatus based on a convolutional neural network is provided, the apparatus including:
a first convolutional neural network module, configured to obtain a text vector of a text to be recognized by using a first convolutional neural network preset in a semantic recognition model;
a second convolutional neural network module, configured to determine a named entity in the text to be recognized according to the obtained text vector by using a second convolutional neural network preset in the semantic recognition model; and
a third convolutional neural network module, configured to determine an entity relationship in the text to be recognized according to the obtained text vector and the determined named entity by using a third convolutional neural network preset in the semantic recognition model.
According to yet another aspect of the present application, a non-volatile readable storage medium is provided, on which computer readable instructions are stored; when executed by a processor, the instructions implement the above semantic recognition method based on a convolutional neural network.
According to still another aspect of the present application, a computer device is provided, including a non-volatile readable storage medium, a processor, and computer readable instructions stored on the non-volatile readable storage medium and executable on the processor; when the processor executes the instructions, the above semantic recognition method based on a convolutional neural network is implemented.
By means of the above technical solutions, and compared with the existing technical solution of partially combining the two independent recognition models for named entity recognition and entity relationship recognition on the basis of a recurrent neural network, the semantic recognition method and apparatus, non-volatile readable storage medium, and computer device provided in this application use the first convolutional neural network preset in the semantic recognition model to obtain a text vector of the text to be recognized, use the second convolutional neural network preset in the semantic recognition model to determine a named entity in the text to be recognized according to the obtained text vector, and use the third convolutional neural network preset in the semantic recognition model to determine an entity relationship in the text to be recognized according to the obtained text vector and the determined named entity. By implementing named entity and entity relationship recognition with the multi-layer convolutional neural networks of a single semantic recognition model, the information redundancy caused by the joint use of the two existing independent recognition models can be effectively avoided, thereby effectively improving the efficiency of semantic recognition.
The above description is only an overview of the technical solutions of the present application. In order to understand the technical means of the present application more clearly so that they can be implemented in accordance with the content of the specification, and to make the above and other objects, features, and advantages of the present application more apparent and understandable, specific embodiments of the present application are set forth below.
Description of the Drawings
The drawings described here are provided for a further understanding of the present application and constitute a part of it; the exemplary embodiments of the present application and their descriptions are used to explain the application and do not constitute an improper limitation of it. In the drawings:
FIG. 1 is a schematic flowchart of a semantic recognition method based on a convolutional neural network provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of another semantic recognition method based on a convolutional neural network provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a semantic recognition apparatus based on a convolutional neural network provided by an embodiment of the present application.
Detailed Description
The present application is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, where there is no conflict, the embodiments of the present application and the features in the embodiments can be combined with each other. To address the problems that the two existing independent recognition models for named entity recognition and entity relationship recognition tend to produce redundant information when used jointly, and that the computation speed of the network models employed is low, this embodiment provides a semantic recognition method based on a convolutional neural network that can effectively avoid the information redundancy caused by the joint use of the two existing independent models, thereby effectively improving the efficiency of semantic recognition. As shown in FIG. 1, the method includes:
101. Obtain the text vector of the text to be recognized by using the first convolutional neural network preset in the semantic recognition model.
The text to be recognized is acquired and preprocessed to obtain an initialized text vector, which is input into the first convolutional neural network preset in the semantic recognition model to generate the text vector representing the text to be recognized. The preprocessing can be configured for the actual application scenario. For example, it may be set as word segmentation, i.e., tokenizing the text to be recognized word by word; or as word filtering, i.e., after word segmentation, removing unimportant words — auxiliary verbs such as "能够" (can) and "应该" (should), and interjections such as "喔" (oh) and "啊" (ah) — to improve the efficiency of semantic recognition of the text. The preprocessing is not specifically limited here.
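The word-filtering variant of the preprocessing described above can be pictured as follows. This is a minimal sketch: the stop-word set simply collects the example words the text mentions, and the function name is an invented placeholder, not part of the application.

```python
# Hypothetical stop-word set built from the examples in the text:
# auxiliary verbs and interjections treated as unimportant words.
STOP_WORDS = {"能够", "应该", "喔", "啊"}

def filter_tokens(tokens):
    """Drop unimportant tokens after word segmentation, before the
    initialized text vector is built."""
    return [t for t in tokens if t not in STOP_WORDS]
```

In practice the filtered token list, rather than the raw segmentation, would feed the initialization of the text vector.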
Taking word segmentation as an example, the text to be recognized is segmented using the SBME tagging scheme: a single-character word is tagged S, the first character of a multi-character word is tagged B, its middle characters are tagged M, and its last character is tagged E. The initialized text vector is then generated from the tagged text.
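The SBME scheme above can be sketched as a simple per-word tagger (an illustrative reading of the rule, with an invented function name):

```python
def sbme_tags(word):
    """Tag one segmented word with the SBME scheme: S for a
    single-character word; otherwise B for the first character,
    M for each middle character, and E for the last character."""
    if len(word) == 1:
        return ["S"]
    return ["B"] + ["M"] * (len(word) - 2) + ["E"]
```

Concatenating the tags of every segmented word gives the character-level labeling from which the initialized text vector is built.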
Before semantic recognition is performed on the text to be recognized, the semantic recognition model of this application is constructed and a training sample set for training it is obtained; that is, the training sample set can be used to train the initialized first, second, and third convolutional neural networks, thereby obtaining the semantic recognition model. The training sample set includes multiple phrase corpora in short-sentence format, a short sentence being delimited by a comma. Specifically, each phrase corpus includes two related words — e.g., "中国、上海" (China, Shanghai) — and the relationship between the two words is labeled in each phrase corpus; e.g., for "China, Shanghai" the relationship is labeled as hierarchical, thereby constructing the training sample set. In addition, each word in the phrase corpora may be labeled with a corresponding word attribute: e.g., China and Shanghai in "China, Shanghai" may both be labeled as place names, or Canidae and dog in "犬科、狗" (Canidae, dog) as animals. In practice, the relationship between the two words of a phrase corpus can be set in various ways.
For example, the relationship between "Copyright Office, Trademark Office" may be labeled as a parallel relationship, with the word attribute of both the Copyright Office and the Trademark Office being "state institution"; the relationship in "Canidae, dog" may be labeled as an inclusion relationship, with the word attribute of both Canidae and dog being "animal"; and so on. The relationships are not specifically limited here.
102. Determine the named entity in the text to be recognized according to the obtained text vector by using the second convolutional neural network preset in the semantic recognition model.
The preset second convolutional neural network is used to recognize the named entities contained in the text to be recognized: the output of the preset first convolutional neural network is fed as input into the preset second convolutional neural network, whose output is the named entities contained in the text to be recognized.
The text to be recognized may contain multiple named entities; that is, when the text includes multiple words, a named entity or named entity category is output for each word. Named entity categories include person names, place names, organization names, product names, proper nouns, and so on.
103. Determine the entity relationship in the text to be recognized according to the obtained text vector and the determined named entity by using the third convolutional neural network preset in the semantic recognition model.
The preset third convolutional neural network is used to recognize the entity relationships contained in the text to be recognized: the outputs of the preset first and second convolutional neural networks are fed as input into the preset third convolutional neural network, whose output is the entity relationships between the named entities contained in the text to be recognized.
In practice, since the text to be recognized is in short-sentence format, the number of named entities obtained after processing by the preset second convolutional neural network is small; the entity relationships obtained by the preset third convolutional neural network for these few entities are therefore relatively determinate and accurate. For example, if the second network outputs two or three named entities, the third network outputs the entity relationships among those two or three entities. Because the text to be recognized is short and the preset third convolutional neural network only has to recognize relationships among a small number of named entities, the recognition efficiency for the text to be recognized is significantly improved.
Following the above scheme, this embodiment performs hierarchical recognition of the acquired text to be recognized with the constructed semantic recognition model, using the model's different convolutional neural networks to recognize the named entities and the entity relationships in the text respectively. Compared with the existing technical solution of partially combining the two independent recognition models for named entity recognition and entity relationship recognition on the basis of a recurrent neural network, this embodiment not only improves the recognition efficiency for the text to be recognized and avoids the information redundancy caused by the joint use of the two existing independent models, but also applies to a wider range of scenarios: it can serve named entity recognition alone, entity relationship recognition alone, or simultaneous recognition of named entities and entity relationships, without building different semantic recognition models for different needs. This lowers the cost of later model maintenance and optimization while not affecting the model's semantic recognition efficiency or accuracy at all.
Further, as a refinement and extension of the specific implementation of the above embodiment, and to fully describe the implementation process of this embodiment, another semantic recognition method based on a convolutional neural network is provided. As shown in FIG. 2, the method includes:
201. Determine the first loss function, the second loss function, and the third loss function according to the initialized first, second, and third convolutional neural networks.
Since the second convolutional neural network recognizes named entities and the third recognizes entity relationships, their loss functions are constructed from cross entropy: the loss function of the second convolutional neural network is the cross entropy for named entity recognition, and the loss function of the third is the cross entropy for relationship recognition.
Depending on the needs of the application scenario, the first, second, and third loss functions may be set differently for the initialized first, second, and third convolutional neural networks, or the same loss function may be used for all three; they are not specifically constrained here. In this embodiment, the first, second, and third loss functions are set to be the same, computed as:
H(p, q) = −∑_x p(x) · log q(x)
where x is a data sample in the sample set used to train the initialized first, second, and third convolutional neural networks, and p and q are, respectively, the true probability distribution and the non-true (predicted) probability distribution of the sample set.
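A direct reading of the cross-entropy formula above over discrete distributions might look like this (the epsilon guard against log 0 is an implementation detail added here, not part of the formula):

```python
import math

def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum over x of p(x) * log q(x): the loss form the text
    uses for both the named-entity and the relation branches."""
    return -sum(px * math.log(max(qx, eps)) for px, qx in zip(p, q))
```

When the predicted distribution q matches the true distribution p exactly, the loss reaches its minimum; a uniform guess over two classes against a one-hot truth costs log 2 ≈ 0.693.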
202. Train the initialized first, second, and third convolutional neural networks according to the determined first, second, and third loss functions to obtain the preset first, second, and third convolutional neural networks.
To illustrate the specific implementation of step 202, as a preferred embodiment, step 202 may specifically include: determining the loss function of the semantic recognition model according to the determined first, second, and third loss functions; and training the initialized first, second, and third convolutional neural networks with the loss function of the semantic recognition model to obtain the preset first, second, and third convolutional neural networks.
For example, according to the needs of the application scenario, the determined first, second, and third loss functions may be averaged to obtain the loss function of the semantic recognition model. Further, if the text to be recognized contains many named entities, the weight of the second loss function is increased accordingly; if it contains many entity relationships, the weight of the third loss function is increased accordingly. The way the loss function of the semantic recognition model is computed is not specifically limited here.
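The weighted combination just described might be expressed as follows. The weighting scheme is an illustrative assumption — the text only says that equal weighting gives the average and that individual weights may be raised:

```python
def model_loss(l1, l2, l3, w1=1.0, w2=1.0, w3=1.0):
    """Combine the three branch losses into the semantic recognition
    model's loss. Equal weights give the simple average mentioned in the
    text; raise w2 when named entities dominate, or w3 when entity
    relationships dominate."""
    return (w1 * l1 + w2 * l2 + w3 * l3) / (w1 + w2 + w3)
```

For instance, doubling w2 shifts the training objective toward the named-entity branch without changing the other two losses.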
During the training of the initialized first, second, and third convolutional neural networks, the network parameters in the hidden layers of the networks are updated automatically by a convex optimization algorithm according to the determined loss function, yielding the preset first, second, and third convolutional neural networks. A convex optimization (convex minimization) algorithm is a subfield of mathematical optimization that updates the network parameters in the hidden layers using the principle that a local optimum is also the global optimum.
The adaptive moment estimation (Adam) optimization algorithm is a first-order optimization algorithm that can replace the traditional stochastic gradient descent procedure; according to the training sample set of this application, the Adam optimization algorithm is used to update the network parameters in the hidden layers of the networks. In Python's tensorflow library, the loss function of the semantic recognition model is optimized as a convex function: with minimization of the loss function as the objective, the Adam optimization algorithm iteratively updates the network parameters of the semantic recognition model to obtain the preset first, second, and third convolutional neural networks. The number of convolutional layers in the semantic recognition model is not specifically limited in this application.
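As a sketch of the Adam update the text relies on, written out in plain Python for a single scalar parameter. In practice TensorFlow's built-in Adam optimizer would perform this over all network parameters; the hyperparameter defaults below are the commonly used values, not values given by the application:

```python
def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m)
    and squared gradient (v), bias correction by step count t, then a
    step against the gradient of the loss."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad * grad
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return theta - lr * m_hat / (v_hat ** 0.5 + eps), m, v
```

Repeatedly applying this step to a toy convex loss such as (θ − 2)² drives θ toward its minimizer, mirroring the loss-minimization objective described above.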
In practice, the training process is as follows: the named entity recognition result output by the second convolutional neural network is compared with the named entities or labeled word attributes in the training sample set; if they are inconsistent, the recognition is erroneous. Likewise, given the named entity recognition result output by the second network, the entity relationship recognition result output by the third network is compared with the entity relationship labeled for the corresponding named entities in the training sample set; if they are inconsistent, the recognition is erroneous. The loss function of the semantic recognition model is used to correct erroneous recognition results, completing the training of the semantic recognition model and yielding a model capable of simultaneous named entity recognition and entity relationship recognition.
203. Obtain the character vectors and word vectors of the text to be recognized by using the character/word vector dictionary.
The initialized text vector, obtained by segmenting the acquired text to be recognized, serves as the input of the first convolutional neural network preset in the semantic recognition model. The embedding layer of the preset first convolutional neural network uses a preset character/word vector dictionary to convert the initialized text vector into the character vectors and word vectors representing the text to be recognized; the preset dictionary contains the character vector of each character and the word vector of each word in the initialized text vector.
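The dictionary lookup performed by the embedding layer can be pictured as follows. The two-dimensional vectors, the dictionary entries, and the zero-vector fallback are invented for illustration; real embeddings would be learned and far higher-dimensional:

```python
# Toy character-vector dictionary (contents are illustrative only).
CHAR_VECTORS = {"中": [0.1, 0.2], "国": [0.3, 0.4]}

def embed(chars, dictionary, dim=2):
    """Map each character of the initialized text vector to its vector
    from the preset dictionary; characters missing from the dictionary
    fall back to a zero vector."""
    return [dictionary.get(c, [0.0] * dim) for c in chars]
```

The same lookup applies to whole words against the word-vector side of the dictionary.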
204. Perform a convolution operation on the obtained character vectors and word vectors to obtain the text vector of the text to be recognized.

The preset first convolutional neural network comprises a two-layer one-dimensional full-convolution structure. The character vectors and word vectors from the embedding layer pass through this two-layer structure, which outputs the text vector of the text to be recognized. Specifically, a convolution kernel performs a convolution operation (i.e., a dot product) with the character vectors and with the word vectors of the text to be recognized, and all resulting convolution outputs together form the text vector of the text to be recognized.

For example, with the kernel length set to 3, a kernel of dimension 3 is convolved with the character vectors and word vectors of the text to be recognized, and the resulting text vector of the text to be recognized serves as the input of the preset second convolutional neural network and the preset third convolutional neural network.
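The dot-product form of the length-3 convolution in the example above can be written out directly. This is a framework-free sketch of the arithmetic only; the real network applies many such kernels with learned weights.

```python
# A one-dimensional convolution written out as the sliding dot products
# described in the text: each output element is the dot product of the
# kernel with one window of the input sequence. Illustrative values only.
def conv1d(seq, kernel):
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k))
            for i in range(len(seq) - k + 1)]

# With a length-3 kernel over a length-4 sequence, two dot products result.
out = conv1d([1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5])   # -> [3.0, 4.5]
```

Each kernel position contributes one value, so an unpadded pass shortens the sequence by `len(kernel) - 1`; the length-preserving variant is discussed at step 205.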
Here, the preset first convolutional neural network is a network structure shared by the preset second and third convolutional neural networks, so that the two downstream networks share its lower-level parameters. This effectively avoids the information redundancy caused when the two existing independent recognition models are used jointly, further improving semantic recognition efficiency.
205. Perform a convolution operation on the obtained text vector to obtain the named entities to be determined.

The second convolutional neural network preset in the semantic recognition model performs named entity recognition (NER) on the obtained text vector to obtain the named entities to be determined. Named entity recognition, also known as "proper-name recognition", refers to identifying entities with specific meaning in the text to be recognized.

The preset second convolutional neural network is a densely connected structure (DenseNet). A densely connected structure contains a large number of dense connections, which maximize the information flow between all layers of the network: every pair of layers is connected, so the input of each layer is the union of the outputs of all preceding layers, and the feature map output by each layer is passed directly to all subsequent layers as input. This enables feature reuse and reduces redundancy.
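The dense-connection pattern just described can be sketched in a few lines: each layer receives the concatenation of every earlier output, including the original input. The "layers" below are trivial placeholder functions standing in for DenseNet's convolution layers; everything else about the sketch is an assumption for illustration.

```python
# Sketch of dense connectivity: every layer's input is the concatenation
# (the "union") of all previous outputs, and its own output joins that
# pool for all later layers. Placeholder layers stand in for convolutions.
def dense_block(x, layers):
    features = [x]                                   # running pool of outputs
    for layer in layers:
        concatenated = [v for f in features for v in f]
        features.append(layer(concatenated))          # output rejoins the pool
    return [v for f in features for v in f]           # final concatenation

double = lambda xs: [2 * v for v in xs]               # toy stand-in layer
out = dense_block([1, 2], [double, double])
```

Note how the second layer sees both the raw input and the first layer's output, which is exactly the feature-reuse property the text attributes to DenseNet.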
In addition, the preset second convolutional neural network includes a two-layer convolution structure, which performs further convolution operations on the convolution results output by the preset first convolutional neural network in the semantic recognition model, yielding the named entities to be determined.

The convolution structure in the preset second convolutional neural network is a one-dimensional full-convolution structure, whose output remains the same length as its input; that is, owing to the one-dimensional full-convolution structure, the convolution results output by the preset first convolutional neural network and the convolution results output by the one-dimensional full-convolution structure are sequences of equal length.
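The equal-length property can be demonstrated with a padded ("same"-style) one-dimensional convolution. Zero padding at the borders is an assumption here; the patent states only that input and output sequences are of equal length.

```python
# Length-preserving 1D convolution: the input is zero-padded at both ends
# so the output has exactly as many positions as the input. The padding
# scheme is an illustrative assumption.
def conv1d_same(seq, kernel):
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    return [sum(padded[i + j] * kernel[j] for j in range(k))
            for i in range(len(seq))]

out = conv1d_same([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
assert len(out) == 3   # output stays as long as the input
```

Keeping the sequence length fixed means each output position still corresponds to one character of the text, which is what makes per-character boundary tagging (step 206) possible.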
206. Perform boundary character recognition on the named entities to be determined, and determine the final named entities according to the recognition results.

To illustrate a specific implementation of step 206, as a preferred embodiment, step 206 may specifically include: if the boundary character recognition result of a named entity to be determined is consistent with the preset boundary character recognition result, determining that the named entity to be determined is a final named entity; if the boundary character recognition result of a named entity to be determined is inconsistent with the preset boundary character recognition result, using the named entity to be determined as a new training sample for the semantic recognition model.

The second convolutional neural network preset in the semantic recognition model performs boundary character recognition according to the SBME tags of the named entities to be determined. Specifically, if the tag of a named entity to be determined is S, i.e., the entity is a single character, that character is recognized; if the recognition result is consistent with the preset boundary character recognition result, the character is determined to be a final named entity. For example, if the named entity to be determined is recognized as "猫" (cat) and this result is consistent with the preset boundary character recognition result, the final recognized named entity is "猫". If the recognition result is inconsistent with the preset boundary character recognition result, the character is not a named entity. For example, if the named entity to be determined is recognized as "怎" and this result is inconsistent with the preset boundary character recognition result, "怎" is used as a new training sample to further optimize the semantic recognition model and improve its recognition accuracy.
If the tags of a named entity to be determined include BME or BE, i.e., the entity consists of multiple characters or two characters, the entity is recognized according to its B and E tags; if the recognition result is consistent with the preset boundary character recognition result, the named entity to be determined is a final named entity. For example, if the tags include BME and the characters tagged B and E are "预" and "家", and this result is consistent with the preset boundary character recognition result, the final named entity is recognized as "预言家" (prophet); if the tags include BE and the characters tagged B and E are "作" and "家", and this result is consistent with the preset boundary character recognition result, the final named entity is recognized as "作家" (writer). If the recognition result is inconsistent with the preset boundary character recognition result, the multi-character or two-character string is not a named entity; one character too many or too few may have been recognized, causing the result not to be a named entity. For example, if the named entity to be determined is recognized as "作家他" ("writer" plus a trailing character) and this result is inconsistent with the preset boundary character recognition result, "作家他" is used as a new training sample to further optimize the semantic recognition model and improve its recognition accuracy.
In practical applications, the preset boundary character recognition result may be the single character of a named entity together with the head and tail characters of two-character and multi-character entities, or may be the word attributes labeled for words in the training sample set, i.e., the single character of a word attribute together with the head and tail characters of two-character and multi-character word attributes.
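The boundary tagging above can be made concrete with a small decoder for the S/B/M/E scheme: S marks a single-character entity, while B, M, and E mark the beginning, middle, and end of a longer one. The decoder function itself is an illustrative assumption; the patent describes the tag semantics but not this interface.

```python
# Hypothetical decoder for the SBME boundary scheme: S = single-character
# entity; B/M/E = begin, middle, end of a multi-character entity. Tags
# other than S, B, E (e.g. M) need no action in this sketch.
def decode_sbme(chars, tags):
    entities, start = [], None
    for i, tag in enumerate(tags):
        if tag == "S":
            entities.append(chars[i])
        elif tag == "B":
            start = i
        elif tag == "E" and start is not None:
            entities.append("".join(chars[start:i + 1]))
            start = None
    return entities

# "预言家" tagged B-M-E and "猫" tagged S decode to two entities.
decode_sbme(list("预言家猫"), ["B", "M", "E", "S"])
```

Comparing such decoded spans against the preset boundary result is what separates final named entities from strings like "作家他" that gained or lost a character.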
In practical applications, the text to be recognized may include one or more named entities. Therefore, according to the text vector of the text to be recognized, the softmax activation function in the preset second convolutional neural network outputs the recognition results for one or more named entities; that is, the output corresponds to the one or more named entities included in the text to be recognized. Specifically, the second convolutional neural network also includes a softmax activation function, which performs a further classification operation on the results obtained through the two-layer convolution structure (i.e., the named entities to be determined) to obtain the final named entities.
207. Using the third convolutional neural network preset in the semantic recognition model, determine the entity relationships in the text to be recognized according to the obtained text vector and the determined named entities.

The preset third convolutional neural network is a densely connected structure (DenseNet). A convolution layer and a pooling layer are built on top of the preset first convolutional neural network, and the recognition result is output through a fully connected layer with a softmax activation function. The output is a multi-class variable; that is, the one or more entity relationships included in the text to be recognized are determined according to the probability values of the different classes.
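The pooling-plus-softmax head of the third network can be sketched as follows. The number of relation classes, the feature values, and the use of global max pooling are illustrative assumptions; the patent specifies only "a pooling layer" and a softmax output.

```python
# Sketch of the third network's output head: pool the per-position
# feature scores into one vector, then softmax over relation classes.
# Global max pooling and the toy scores are illustrative assumptions.
import math

def max_pool(features):
    """Global max pooling: take the maximum of each column over positions."""
    return [max(col) for col in zip(*features)]

def softmax(scores):
    m = max(scores)                       # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Two sequence positions, three hypothetical relation classes.
pooled = max_pool([[0.2, 1.0, 0.1], [0.9, 0.3, 0.4]])   # -> [0.9, 1.0, 0.4]
probs = softmax(pooled)                  # one probability per relation class
```

The class with the highest probability (here the second one) would be reported as the recognized entity relationship.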
In practical applications, according to the determined named entities, an entity relationship is determined from the correspondence between named entities and labeled entity relationships in the training sample set, and the recognized entity relationship is compared with the entity relationship so determined. If they are consistent, the recognized entity relationship is the entity relationship in the text to be recognized; if they are inconsistent, the recognition is erroneous, and the erroneous result is replaced by the entity relationship determined from the correspondence between named entities and labeled entity relationships in the training sample set, which is taken as the entity relationship in the text to be recognized. The erroneous recognition result is also used as a new training sample, so that the semantic recognition model is trained further and an optimized semantic recognition model is obtained.

208. Train the semantic recognition model with the new training samples to obtain an optimized semantic recognition model.

When a named entity to be determined is not a final named entity, it is added to the training sample set used to train the semantic recognition model as a new phrase-corpus entry, and its word attribute is labeled in the phrase corpus as a recognition error, so that after optimization training the semantic recognition model can effectively improve its recognition accuracy on the text to be recognized.

By applying the technical solution of this embodiment, the first convolutional neural network preset in the semantic recognition model obtains the text vector of the text to be recognized; the second convolutional neural network preset in the semantic recognition model determines the named entities in the text to be recognized according to the obtained text vector; and the third convolutional neural network preset in the semantic recognition model determines the entity relationships in the text to be recognized according to the obtained text vector and the determined named entities. Compared with the existing technical solution based on recurrent neural networks, in which two independent recognition models for named entity recognition and entity relationship recognition are partially combined, when this embodiment is applied to intelligent customer-service question answering for insurance products, the system can use the semantic recognition model to recognize a user's input sentence accurately and quickly, thereby providing more accurate service and improving the user experience.
Further, as a specific implementation of the method in FIG. 1, an embodiment of the present application provides a semantic recognition apparatus based on a convolutional neural network. As shown in FIG. 3, the apparatus includes a first convolutional neural network module 31, a second convolutional neural network module 32, and a third convolutional neural network module 33.

The first convolutional neural network module 31 may be used to obtain the text vector of the text to be recognized by using the first convolutional neural network preset in the semantic recognition model; it is the basic module with which the apparatus recognizes named entities and entity relationships in the text to be recognized.

The second convolutional neural network module 32 may be used to determine the named entities in the text to be recognized according to the text vector obtained by the first convolutional neural network module 31, using the second convolutional neural network preset in the semantic recognition model; it is the main functional module of the apparatus for recognizing named entities in the text to be recognized, and a core functional module of the apparatus.

The third convolutional neural network module 33 may be used to determine the entity relationships in the text to be recognized according to the text vector obtained by the first convolutional neural network module 31 and the named entities determined by the second convolutional neural network module 32, using the third convolutional neural network preset in the semantic recognition model; it is the main functional module of the apparatus for recognizing entity relationships in the text to be recognized, and a core functional module of the apparatus.
In specific application scenarios, the first convolutional neural network module 31 may specifically be used to obtain the character vectors and word vectors of the text to be recognized by using the character/word vector dictionary, and to perform a convolution operation on the obtained character vectors and word vectors to obtain the text vector of the text to be recognized.

In specific application scenarios, the apparatus further includes a training module 34, which may be used to determine a first loss function, a second loss function, and a third loss function according to the initialized first, second, and third convolutional neural networks, respectively, and to train the initialized first, second, and third convolutional neural networks according to the determined first, second, and third loss functions, obtaining the preset first, second, and third convolutional neural networks.

In specific application scenarios, the training module 34 may specifically be used to determine the loss function of the semantic recognition model according to the determined first, second, and third loss functions, and to train the initialized first, second, and third convolutional neural networks with the loss function of the semantic recognition model, obtaining the preset first, second, and third convolutional neural networks.
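One natural reading of combining the three loss functions into the model's loss function is a weighted sum. The equal weights below are an assumption for illustration; the patent does not specify how the three losses are combined.

```python
# Hypothetical combined objective: the model loss as a weighted sum of
# the three sub-network losses. Equal weights are an assumption, since
# the combination rule is not specified in the source.
def combined_loss(loss1, loss2, loss3, w=(1.0, 1.0, 1.0)):
    return w[0] * loss1 + w[1] * loss2 + w[2] * loss3

total = combined_loss(0.5, 0.25, 0.25)   # -> 1.0
```

Under this reading, minimizing the single combined loss trains the shared first network and both task-specific networks jointly.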
In specific application scenarios, the second convolutional neural network module 32 may specifically be used to perform a convolution operation on the obtained text vector to obtain the named entities to be determined, to perform boundary character recognition on the named entities to be determined, and to determine the final named entities according to the recognition results.

In specific application scenarios, the second convolutional neural network module 32 may specifically be used to determine that a named entity to be determined is a final named entity if its boundary character recognition result is consistent with the preset boundary character recognition result, and to use the named entity to be determined as a new training sample for the semantic recognition model if its boundary character recognition result is inconsistent with the preset boundary character recognition result.

In specific application scenarios, the training module 34 may specifically be used to train the semantic recognition model with the new training samples to obtain an optimized semantic recognition model. It should be noted that, for other corresponding descriptions of the functional units involved in the convolutional neural network-based semantic recognition apparatus provided by this embodiment of the present application, reference may be made to the corresponding descriptions in FIG. 1 and FIG. 2, which are not repeated here.
Based on the methods shown in FIG. 1 and FIG. 2, an embodiment of the present application correspondingly further provides a non-volatile readable storage medium on which computer-readable instructions are stored; when executed by a processor, the instructions implement the convolutional neural network-based semantic recognition method shown in FIG. 1 and FIG. 2. Based on this understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a non-volatile readable storage medium (such as a CD-ROM, USB flash drive, or removable hard disk) and includes several instructions that cause a computer device (such as a personal computer, server, or network device) to execute the methods described in the implementation scenarios of the present application.

Based on the methods shown in FIG. 1 and FIG. 2 and the virtual apparatus embodiment shown in FIG. 3, to achieve the above objectives, an embodiment of the present application further provides a computer device, which may specifically be a personal computer, a server, a network device, or the like. This physical device includes a non-volatile readable storage medium and a processor; the non-volatile readable storage medium stores computer-readable instructions, and the processor executes the computer-readable instructions to implement the convolutional neural network-based semantic recognition method shown in FIG. 1 and FIG. 2.

Optionally, the computer device may further include a user interface, a network interface, a camera, a radio frequency (RF) circuit, sensors, an audio circuit, a Wi-Fi module, and so on. The user interface may include a display screen and an input unit such as a keyboard, and optionally may also include a USB interface, a card reader interface, and the like. The network interface may optionally include a standard wired interface, a wireless interface (such as a Bluetooth or Wi-Fi interface), and the like. Those skilled in the art will understand that the computer device structure provided in this embodiment does not limit the physical device, which may include more or fewer components, combine certain components, or arrange the components differently.

The non-volatile readable storage medium may further include an operating system and a network communication module. The operating system is a program that manages the hardware and software resources of the computer device and supports the operation of information processing programs and other software and/or programs. The network communication module implements communication among the components within the non-volatile readable storage medium, as well as communication with other hardware and software in the physical device. From the description of the above implementations, those skilled in the art can clearly understand that the present application may be implemented by software plus a necessary general-purpose hardware platform, or by hardware. By applying the technical solution of the present application, compared with the existing technical solution based on recurrent neural networks in which two independent recognition models for named entity recognition and entity relationship recognition are partially combined, this embodiment can effectively avoid the information redundancy caused when the two existing independent recognition models are used jointly, thereby effectively improving semantic recognition efficiency.
Those skilled in the art will understand that the accompanying drawings are only schematic diagrams of preferred implementation scenarios, and that the modules or processes in the drawings are not necessarily required for implementing the present application. Those skilled in the art will also understand that the modules in the apparatus of an implementation scenario may be distributed in the apparatus as described, or may be located, with corresponding changes, in one or more apparatuses different from that of the implementation scenario. The modules of the above implementation scenarios may be combined into one module or further split into multiple sub-modules.

The above serial numbers of the present application are for description only and do not represent the merits of the implementation scenarios. The above disclosure covers only a few specific implementation scenarios of the present application; however, the present application is not limited thereto, and any variation conceivable by those skilled in the art shall fall within the protection scope of the present application.

Claims (20)

  1. A semantic recognition method based on a convolutional neural network, comprising:
    determining a first loss function, a second loss function, and a third loss function according to an initialized first convolutional neural network, an initialized second convolutional neural network, and an initialized third convolutional neural network, respectively;
    training the initialized first, second, and third convolutional neural networks according to the determined first, second, and third loss functions to obtain a preset first convolutional neural network, a preset second convolutional neural network, and a preset third convolutional neural network;
    obtaining a text vector of a text to be recognized by using the preset first convolutional neural network in a semantic recognition model;
    determining named entities in the text to be recognized according to the obtained text vector by using the preset second convolutional neural network in the semantic recognition model; and
    determining entity relationships in the text to be recognized according to the obtained text vector and the determined named entities by using the preset third convolutional neural network in the semantic recognition model;
    wherein obtaining the text vector of the text to be recognized by using the preset first convolutional neural network in the semantic recognition model specifically comprises:
    obtaining character vectors and word vectors of the text to be recognized by using a character/word vector dictionary; and
    performing a convolution operation on the obtained character vectors and word vectors to obtain the text vector of the text to be recognized.
  2. The method according to claim 1, wherein training the initialized first, second, and third convolutional neural networks according to the determined first, second, and third loss functions to obtain the preset first, second, and third convolutional neural networks specifically comprises:
    determining a loss function of the semantic recognition model according to the determined first, second, and third loss functions; and
    training the initialized first, second, and third convolutional neural networks with the loss function of the semantic recognition model to obtain the preset first, second, and third convolutional neural networks.
  3. The method according to claim 1, wherein determining the named entities in the text to be recognized according to the obtained text vector by using the preset second convolutional neural network in the semantic recognition model specifically comprises:
    performing a convolution operation on the obtained text vector to obtain named entities to be determined; and
    performing boundary character recognition on the named entities to be determined, and determining final named entities according to the recognition results.
  4. The method according to claim 3, wherein performing boundary character recognition on the named entities to be determined and determining the final named entities according to the recognition results specifically comprises:
    if the boundary character recognition result of a named entity to be determined is consistent with a preset boundary character recognition result, determining that the named entity to be determined is a final named entity; and
    if the boundary character recognition result of a named entity to be determined is inconsistent with the preset boundary character recognition result, using the named entity to be determined as a new training sample of the semantic recognition model.
  5. The method according to claim 4, further comprising:
    training the semantic recognition model with the new training sample to obtain an optimized semantic recognition model.
  6. A semantic recognition apparatus based on a convolutional neural network, comprising:
    a training module, configured to determine a first loss function, a second loss function and a third loss function according to an initialized first convolutional neural network, second convolutional neural network and third convolutional neural network respectively, and to train the initialized first, second and third convolutional neural networks according to the determined first, second and third loss functions to obtain a preset first convolutional neural network, second convolutional neural network and third convolutional neural network;
    a first convolutional neural network module, configured to obtain a text vector of the text to be recognized using the first convolutional neural network preset in a semantic recognition model;
    a second convolutional neural network module, configured to determine the named entities in the text to be recognized according to the obtained text vector, using the second convolutional neural network preset in the semantic recognition model;
    a third convolutional neural network module, configured to determine the entity relationships in the text to be recognized according to the obtained text vector and the determined named entities, using the third convolutional neural network preset in the semantic recognition model;
    wherein the first convolutional neural network module is specifically configured to:
    obtain character vectors and word vectors of the text to be recognized using a character-and-word vector dictionary;
    perform a convolution operation on the obtained character vectors and word vectors to obtain the text vector of the text to be recognized.
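The first-CNN step recurring in the claims (dictionary lookup of character and word vectors, then a convolution over them to get one text vector) can be sketched as below. The embedding size, window width, toy dictionary, and random kernels are all illustrative assumptions; the patent does not fix these values:

```python
import random

random.seed(0)
DIM, WIN = 4, 2   # toy embedding size and convolution window

# Hypothetical character-and-word vector dictionary.
dictionary = {t: [random.uniform(-1, 1) for _ in range(DIM)]
              for t in ["平", "安", "科", "技", "平安", "科技"]}

def embed(tokens):
    """Look up each token; unknown tokens map to a zero vector."""
    return [dictionary.get(t, [0.0] * DIM) for t in tokens]

def text_vector(chars, words, n_filters=3):
    """Concatenate character and word embeddings, then apply n_filters
    1-D convolutions with max-over-time pooling for a fixed-size vector."""
    seq = embed(chars) + embed(words)
    feats = []
    for _ in range(n_filters):
        kernel = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(WIN)]
        scores = [sum(seq[i + j][d] * kernel[j][d]
                      for j in range(WIN) for d in range(DIM))
                  for i in range(len(seq) - WIN + 1)]
        feats.append(max(scores))   # max pooling over positions
    return feats

vec = text_vector(list("平安科技"), ["平安", "科技"])
```

Combining character-level and word-level sequences this way lets the convolution see both granularities of the Chinese input at once.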
  7. The apparatus according to claim 6, wherein training the initialized first, second and third convolutional neural networks according to the determined first, second and third loss functions to obtain the preset first, second and third convolutional neural networks specifically comprises:
    determining the loss function of the semantic recognition model according to the determined first, second and third loss functions;
    training the initialized first, second and third convolutional neural networks with the loss function of the semantic recognition model to obtain the preset first, second and third convolutional neural networks.
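Claims 2, 7, 12 and 17 fold the three per-network losses into a single model loss. The patent does not state the combination rule; a weighted sum, shown below with assumed equal weights, is the standard choice for this kind of joint training:

```python
def model_loss(loss_vec, loss_ner, loss_rel, weights=(1.0, 1.0, 1.0)):
    """Joint loss of the semantic recognition model: a weighted sum of the
    first (text-vector), second (NER) and third (relation) CNN losses.
    The equal default weights are an assumption, not the patent's choice."""
    w1, w2, w3 = weights
    return w1 * loss_vec + w2 * loss_ner + w3 * loss_rel

total = model_loss(0.5, 1.0, 2.0)   # minimizing this trains all three CNNs jointly
```

Because all three networks share one objective, gradients from the entity and relation tasks also shape the text-vector network.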
  8. The apparatus according to claim 6, wherein the second convolutional neural network module is specifically configured to:
    perform a convolution operation on the obtained text vector to obtain named entities to be determined;
    perform boundary character recognition on the named entities to be determined, and determine the final named entities according to the recognition result.
  9. The apparatus according to claim 8, wherein performing boundary character recognition on the named entity to be determined and determining the final named entity according to the recognition result specifically comprises:
    if the boundary character recognition result of the named entity to be determined is consistent with the preset boundary character recognition result, determining the named entity to be determined as the final named entity;
    if the boundary character recognition result of the named entity to be determined is inconsistent with the preset boundary character recognition result, using the named entity to be determined as a new training sample for the semantic recognition model.
  10. The apparatus according to claim 9, further comprising:
    training the semantic recognition model with the new training sample to obtain an optimized semantic recognition model.
  11. A non-volatile readable storage medium storing computer-readable instructions, wherein the instructions, when executed by a processor, implement a semantic recognition method based on a convolutional neural network, the method comprising:
    determining a first loss function, a second loss function and a third loss function according to an initialized first convolutional neural network, second convolutional neural network and third convolutional neural network respectively;
    training the initialized first, second and third convolutional neural networks according to the determined first, second and third loss functions to obtain a preset first convolutional neural network, second convolutional neural network and third convolutional neural network;
    obtaining a text vector of the text to be recognized using the first convolutional neural network preset in a semantic recognition model;
    determining the named entities in the text to be recognized according to the obtained text vector, using the second convolutional neural network preset in the semantic recognition model;
    determining the entity relationships in the text to be recognized according to the obtained text vector and the determined named entities, using the third convolutional neural network preset in the semantic recognition model;
    wherein obtaining the text vector of the text to be recognized using the first convolutional neural network preset in the semantic recognition model specifically comprises:
    obtaining character vectors and word vectors of the text to be recognized using a character-and-word vector dictionary;
    performing a convolution operation on the obtained character vectors and word vectors to obtain the text vector of the text to be recognized.
  12. The non-volatile readable storage medium according to claim 11, wherein training the initialized first, second and third convolutional neural networks according to the determined first, second and third loss functions to obtain the preset first, second and third convolutional neural networks specifically comprises:
    determining the loss function of the semantic recognition model according to the determined first, second and third loss functions;
    training the initialized first, second and third convolutional neural networks with the loss function of the semantic recognition model to obtain the preset first, second and third convolutional neural networks.
  13. The non-volatile readable storage medium according to claim 11, wherein determining the named entities in the text to be recognized according to the obtained text vector, using the second convolutional neural network preset in the semantic recognition model, specifically comprises:
    performing a convolution operation on the obtained text vector to obtain named entities to be determined;
    performing boundary character recognition on the named entities to be determined, and determining the final named entities according to the recognition result.
  14. The non-volatile readable storage medium according to claim 13, wherein performing boundary character recognition on the named entity to be determined and determining the final named entity according to the recognition result specifically comprises:
    if the boundary character recognition result of the named entity to be determined is consistent with the preset boundary character recognition result, determining the named entity to be determined as the final named entity;
    if the boundary character recognition result of the named entity to be determined is inconsistent with the preset boundary character recognition result, using the named entity to be determined as a new training sample for the semantic recognition model.
  15. The non-volatile readable storage medium according to claim 14, further comprising:
    training the semantic recognition model with the new training sample to obtain an optimized semantic recognition model.
  16. A computer device, comprising a non-volatile readable storage medium, a processor, and computer-readable instructions stored on the non-volatile readable storage medium and executable on the processor, wherein the processor, when executing the instructions, implements a semantic recognition method based on a convolutional neural network, the method comprising:
    determining a first loss function, a second loss function and a third loss function according to an initialized first convolutional neural network, second convolutional neural network and third convolutional neural network respectively;
    training the initialized first, second and third convolutional neural networks according to the determined first, second and third loss functions to obtain a preset first convolutional neural network, second convolutional neural network and third convolutional neural network;
    obtaining a text vector of the text to be recognized using the first convolutional neural network preset in a semantic recognition model;
    determining the named entities in the text to be recognized according to the obtained text vector, using the second convolutional neural network preset in the semantic recognition model;
    determining the entity relationships in the text to be recognized according to the obtained text vector and the determined named entities, using the third convolutional neural network preset in the semantic recognition model;
    wherein obtaining the text vector of the text to be recognized using the first convolutional neural network preset in the semantic recognition model specifically comprises:
    obtaining character vectors and word vectors of the text to be recognized using a character-and-word vector dictionary;
    performing a convolution operation on the obtained character vectors and word vectors to obtain the text vector of the text to be recognized.
  17. The computer device according to claim 16, wherein training the initialized first, second and third convolutional neural networks according to the determined first, second and third loss functions to obtain the preset first, second and third convolutional neural networks specifically comprises:
    determining the loss function of the semantic recognition model according to the determined first, second and third loss functions;
    training the initialized first, second and third convolutional neural networks with the loss function of the semantic recognition model to obtain the preset first, second and third convolutional neural networks.
  18. The computer device according to claim 16, wherein determining the named entities in the text to be recognized according to the obtained text vector, using the second convolutional neural network preset in the semantic recognition model, specifically comprises:
    performing a convolution operation on the obtained text vector to obtain named entities to be determined;
    performing boundary character recognition on the named entities to be determined, and determining the final named entities according to the recognition result.
  19. The computer device according to claim 18, wherein performing boundary character recognition on the named entity to be determined and determining the final named entity according to the recognition result specifically comprises:
    if the boundary character recognition result of the named entity to be determined is consistent with the preset boundary character recognition result, determining the named entity to be determined as the final named entity;
    if the boundary character recognition result of the named entity to be determined is inconsistent with the preset boundary character recognition result, using the named entity to be determined as a new training sample for the semantic recognition model.
  20. The computer device according to claim 19, further comprising:
    training the semantic recognition model with the new training sample to obtain an optimized semantic recognition model.
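Across the method, apparatus, storage-medium and device claims, the same three-stage pipeline recurs: the first CNN builds the text vector, the second extracts named entities from it, and the third derives entity relations from the vector plus the entities. A structural sketch with toy stand-ins for the three networks (all class, function and label names are assumptions, not the patent's):

```python
class SemanticRecognizer:
    """Data flow of the claimed pipeline; the three networks are injected
    as callables so the structure is visible without real CNNs."""

    def __init__(self, cnn1, cnn2, cnn3):
        self.cnn1, self.cnn2, self.cnn3 = cnn1, cnn2, cnn3

    def recognize(self, text):
        vec = self.cnn1(text)                  # text vector from char/word vectors
        entities = self.cnn2(vec)              # named entities from the text vector
        relations = self.cnn3(vec, entities)   # relations from vector + entities
        return entities, relations

# Toy stand-ins that only demonstrate the shapes of the three interfaces.
model = SemanticRecognizer(
    cnn1=lambda text: [float(len(text))],
    cnn2=lambda vec: ["E1", "E2"] if vec[0] > 2 else ["E1"],
    cnn3=lambda vec, ents: [(a, b, "related") for a in ents for b in ents if a != b],
)
entities, relations = model.recognize("平安科技")
```

Note that the third stage consumes both the text vector and the second stage's output, which is why the claims train all three networks jointly rather than in isolation.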
PCT/CN2019/117723 2019-04-26 2019-11-12 Semantic recognition method and apparatus based on convolutional neural network, and non-volatile readable storage medium and computer device WO2020215683A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910345595.7 2019-04-26
CN201910345595.7A CN110222330B (en) 2019-04-26 2019-04-26 Semantic recognition method and device, storage medium and computer equipment

Publications (1)

Publication Number Publication Date
WO2020215683A1

Family

ID=67819991

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/117723 WO2020215683A1 (en) 2019-04-26 2019-11-12 Semantic recognition method and apparatus based on convolutional neural network, and non-volatile readable storage medium and computer device

Country Status (2)

Country Link
CN (1) CN110222330B (en)
WO (1) WO2020215683A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110222330B (en) * 2019-04-26 2024-01-30 平安科技(深圳)有限公司 Semantic recognition method and device, storage medium and computer equipment
CN111079418B (en) * 2019-11-06 2023-12-05 科大讯飞股份有限公司 Named entity recognition method, device, electronic equipment and storage medium
CN112232088A (en) * 2020-11-19 2021-01-15 京北方信息技术股份有限公司 Contract clause risk intelligent identification method and device, electronic equipment and storage medium
CN112765984A (en) * 2020-12-31 2021-05-07 平安资产管理有限责任公司 Named entity recognition method and device, computer equipment and storage medium
CN112906380A (en) * 2021-02-02 2021-06-04 北京有竹居网络技术有限公司 Method and device for identifying role in text, readable medium and electronic equipment
CN112949477B (en) * 2021-03-01 2024-03-15 苏州美能华智能科技有限公司 Information identification method, device and storage medium based on graph convolution neural network

Citations (4)

Publication number Priority date Publication date Assignee Title
WO2018057945A1 (en) * 2016-09-22 2018-03-29 nference, inc. Systems, methods, and computer readable media for visualization of semantic information and inference of temporal signals indicating salient associations between life science entities
CN108763445A (en) * 2018-05-25 2018-11-06 厦门智融合科技有限公司 Construction method, device, computer equipment and the storage medium in patent knowledge library
CN109165385A (en) * 2018-08-29 2019-01-08 中国人民解放军国防科技大学 Multi-triple extraction method based on entity relationship joint extraction model
CN110222330A (en) * 2019-04-26 2019-09-10 平安科技(深圳)有限公司 Method for recognizing semantics and device, storage medium, computer equipment

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US5761328A (en) * 1995-05-22 1998-06-02 Solberg Creations, Inc. Computer automated system and method for converting source-documents bearing alphanumeric text relating to survey measurements
CN107239446B (en) * 2017-05-27 2019-12-03 中国矿业大学 A kind of intelligence relationship extracting method based on neural network Yu attention mechanism
CN108304911B (en) * 2018-01-09 2020-03-13 中国科学院自动化研究所 Knowledge extraction method, system and equipment based on memory neural network
CN108536679B (en) * 2018-04-13 2022-05-20 腾讯科技(成都)有限公司 Named entity recognition method, device, equipment and computer readable storage medium
CN108804417B (en) * 2018-05-21 2022-03-15 山东科技大学 Document-level emotion analysis method based on specific field emotion words
CN109101492A (en) * 2018-07-25 2018-12-28 南京瓦尔基里网络科技有限公司 Usage history conversation activity carries out the method and system of entity extraction in a kind of natural language processing

Non-Patent Citations (1)

Title
E, HAIHONG ET AL.: "Survey of entity relationship extraction based on deep learning", HTTP://KNS.CNKI.NET/KXREADER/DETAIL?TIMESTAMP=637169548090085000&DBCODE =CJFQ&TABLENAME=CJFDLAST2019&FILENAME=RJXB201906016&RESULT=1&SIGN=IOS%2BYM%2BP1%2FYQO6D83OTVC0JF%2FGG%3D, 28 March 2019 (2019-03-28), XP055746606 *

Also Published As

Publication number Publication date
CN110222330A (en) 2019-09-10
CN110222330B (en) 2024-01-30

Similar Documents

Publication Publication Date Title
WO2020215683A1 (en) Semantic recognition method and apparatus based on convolutional neural network, and non-volatile readable storage medium and computer device
CN109241524B (en) Semantic analysis method and device, computer-readable storage medium and electronic equipment
WO2020114429A1 (en) Keyword extraction model training method, keyword extraction method, and computer device
WO2020232861A1 (en) Named entity recognition method, electronic device and storage medium
WO2019242297A1 (en) Method for intelligent dialogue based on machine reading comprehension, device, and terminal
US10114809B2 (en) Method and apparatus for phonetically annotating text
WO2020062770A1 (en) Method and apparatus for constructing domain dictionary, and device and storage medium
WO2022048173A1 (en) Artificial intelligence-based customer intent identification method and apparatus, device, and medium
US10929610B2 (en) Sentence-meaning recognition method, sentence-meaning recognition device, sentence-meaning recognition apparatus and storage medium
CN110619050B (en) Intention recognition method and device
US9811517B2 (en) Method and system of adding punctuation and establishing language model using a punctuation weighting applied to chinese speech recognized text
WO2023138188A1 (en) Feature fusion model training method and apparatus, sample retrieval method and apparatus, and computer device
WO2014117553A1 (en) Method and system of adding punctuation and establishing language model
JP7430820B2 (en) Sorting model training method and device, electronic equipment, computer readable storage medium, computer program
WO2021129123A1 (en) Corpus data processing method and apparatus, server, and storage medium
CN116127020A (en) Method for training generated large language model and searching method based on model
WO2022105121A1 (en) Distillation method and apparatus applied to bert model, device, and storage medium
US20230094730A1 (en) Model training method and method for human-machine interaction
CN113053367A (en) Speech recognition method, model training method and device for speech recognition
WO2022228127A1 (en) Element text processing method and apparatus, electronic device, and storage medium
CN112214595A (en) Category determination method, device, equipment and medium
CN115359383A (en) Cross-modal feature extraction, retrieval and model training method, device and medium
CN113836316B (en) Processing method, training method, device, equipment and medium for ternary group data
US20220027766A1 (en) Method for industry text increment and electronic device
CN114242113A (en) Voice detection method, training method and device and electronic equipment

Legal Events

Code  Description
121   Ep: the EPO has been informed by WIPO that EP was designated in this application (ref document number: 19926379; country of ref document: EP; kind code of ref document: A1)
NENP  Non-entry into the national phase (ref country code: DE)
122   Ep: PCT application non-entry in European phase (ref document number: 19926379; country of ref document: EP; kind code of ref document: A1)