CN110309514A - Semantic recognition method and apparatus - Google Patents

Semantic recognition method and apparatus

Info

Publication number
CN110309514A
CN110309514A
Authority
CN
China
Prior art keywords
sequence
slot position
target
vector
analysis sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910615990.2A
Other languages
Chinese (zh)
Other versions
CN110309514B (en)
Inventor
樊骏锋
李长亮
汪美玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Kingsoft Digital Entertainment Co Ltd
Beijing Jinshan Digital Entertainment Technology Co Ltd
Original Assignee
Chengdu Kingsoft Digital Entertainment Co Ltd
Beijing Jinshan Digital Entertainment Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Kingsoft Digital Entertainment Co Ltd and Beijing Jinshan Digital Entertainment Technology Co Ltd
Priority to CN201910615990.2A
Publication of CN110309514A
Application granted
Publication of CN110309514B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

This specification provides a semantic recognition method and apparatus. The method comprises: obtaining a feature vector sequence corresponding to a user's target analysis sentence; inputting the feature vector sequence corresponding to the target analysis sentence into the shared layer of a preset semantic recognition model to obtain a first sequential output sequence corresponding to the target analysis sentence; inputting the first sequential output sequence into the information identification layer of the semantic recognition model, and classifying the first sequential output sequence through the information identification layer to generate a slot probability sequence corresponding to the target analysis sentence; and predicting the intent and the slot information sequence corresponding to the target analysis sentence based on the first sequential output sequence and the slot probability sequence.

Description

Semantic recognition method and apparatus
Technical field
This specification relates to the field of computer technology, and in particular to a semantic recognition method and apparatus, a computing device, and a computer-readable storage medium.
Background art
Intent recognition and slot filling are important subtasks of spoken language understanding systems. By performing text classification and sequence labeling on an analysis sentence to obtain its intent and slot information, a computer can intelligently understand human questions, which has important application value in fields such as intelligent question answering. However, existing semantic recognition neural networks often consume substantial manual resources when performing intent recognition and slot filling, their rules easily interfere with one another, and they cannot be transferred between different domains. Moreover, before performing semantic analysis they do not consider the probability that each token in the analysis sentence is a key information word, and an error in an earlier task propagates to later ones and causes the prediction to fail, so the accuracy of existing semantic recognition neural networks on intent recognition and slot filling is low.
Summary of the invention
In view of this, the embodiments of this specification provide a semantic recognition method, an apparatus, a computing device, and a computer-readable storage medium, to remedy the technical deficiencies of the prior art.
According to a first aspect of the embodiments of this specification, a semantic recognition method is provided, comprising:
obtaining a feature vector sequence corresponding to a user's target analysis sentence;
inputting the feature vector sequence corresponding to the target analysis sentence into the shared layer of a preset semantic recognition model to obtain a first sequential output sequence corresponding to the target analysis sentence;
inputting the first sequential output sequence into the information identification layer of the semantic recognition model, and classifying the first sequential output sequence through the information identification layer to generate a slot probability sequence corresponding to the target analysis sentence;
predicting the intent and the slot information sequence corresponding to the target analysis sentence based on the first sequential output sequence and the slot probability sequence.
According to a second aspect of the embodiments of this specification, a semantic recognition model training method is provided, comprising:
obtaining a training sample set, wherein the training sample set includes multiple sample pairs, each sample pair includes a sample analysis sentence and corresponding training labels, and the training labels include the intent label, the slot information sequence label, and the slot probability sequence label corresponding to the sample analysis sentence;
training a semantic recognition model with the training sample set to obtain the semantic recognition model, the semantic recognition model associating the sample analysis sentence with the training labels.
According to a third aspect of the embodiments of this specification, a semantic recognition apparatus is provided, comprising:
a sentence embedding module, configured to obtain a feature vector sequence corresponding to a user's target analysis sentence;
a shared module, configured to input the feature vector sequence corresponding to the target analysis sentence into the shared layer of a preset semantic recognition model to obtain a first sequential output sequence corresponding to the target analysis sentence;
an information identification module, configured to input the first sequential output sequence into the information identification layer of the semantic recognition model and to classify the first sequential output sequence through the information identification layer, generating a slot probability sequence corresponding to the target analysis sentence;
a semantic analysis module, configured to predict the intent and the slot information sequence corresponding to the target analysis sentence based on the first sequential output sequence and the slot probability sequence.
According to a fourth aspect of the embodiments of this specification, a semantic recognition model training apparatus is provided, comprising:
a sample acquisition module, configured to obtain a training sample set, wherein the training sample set includes multiple sample pairs, each sample pair includes a sample analysis sentence and corresponding training labels, and the training labels include the intent label, the slot information sequence label, and the slot probability sequence label corresponding to the sample analysis sentence;
a model training module, configured to train a semantic recognition model with the training sample set to obtain the semantic recognition model, the semantic recognition model associating the sample analysis sentence with the training labels.
According to a fifth aspect of the embodiments of this specification, a computing device is provided, including a memory, a processor, and computer instructions stored in the memory and runnable on the processor, wherein the processor implements the steps of the semantic recognition method when executing the instructions.
According to a sixth aspect of the embodiments of this specification, a computer-readable storage medium is provided, storing computer instructions which, when executed by a processor, implement the steps of the semantic recognition method.
The present application splits the complex slot filling problem into a relatively simple slot-word identification problem and a slot-word classification problem. Before performing semantic analysis on the user's target analysis sentence, it first judges the probability that each word in the sentence is a keyword, i.e. slot information, thereby handling the slot-word identification part of slot filling first. Meanwhile, by fusing the slot probability sequence of the target analysis sentence with the first sequential output sequence as the input for subsequent intent recognition and slot filling, more accurate input information is obtained for the slot-word classification problem, and the model is given a useful reference and positive guidance when computing intent recognition and slot-word classification. The semantic recognition model of the present application requires no manual rule extraction, reducing the consumption of manual resources, enables transfer between different domains, and improves the prediction accuracy and reliability of semantic analysis.
Brief description of the drawings
Fig. 1 is a structural block diagram of a computing device provided by an embodiment of the present application;
Fig. 2 is a flowchart of a semantic recognition method provided by an embodiment of the present application;
Fig. 3 is a schematic diagram of the topology of a semantic recognition model provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of the topology of a semantic recognition model provided by another embodiment of the present application;
Fig. 5 is a schematic diagram of the topology of an information identification layer provided by an embodiment of the present application;
Fig. 6 is another flowchart of a semantic recognition method provided by an embodiment of the present application;
Fig. 7 is another flowchart of a semantic recognition method provided by an embodiment of the present application;
Fig. 8 is a flowchart of a semantic recognition model training method provided by an embodiment of the present application;
Fig. 9 is a structural schematic diagram of a semantic recognition apparatus provided by an embodiment of the present application;
Fig. 10 is a structural schematic diagram of a semantic recognition model training apparatus provided by an embodiment of the present application.
Detailed description
Many specific details are set forth in the following description to facilitate a thorough understanding of the application. The application, however, can be implemented in many other ways than those described herein, and those skilled in the art can make similar extensions without departing from its substance; the application is therefore not limited by the specific implementations disclosed below.
The terminology used in one or more embodiments of this specification is for the purpose of describing particular embodiments only and is not intended to limit the one or more embodiments of this specification. The singular forms "a", "the", and "said" used in one or more embodiments of this specification and in the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of this specification refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various pieces of information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of one or more embodiments of this specification, "first" may also be referred to as "second", and similarly "second" may be referred to as "first". Depending on the context, the word "if" as used herein may be interpreted as "when", "while", or "in response to determining".
First, the terms involved in one or more embodiments of the invention are explained.
Natural language processing: NLP (Natural Language Processing) is an interdisciplinary field of computer science, artificial intelligence, and linguistics. Its goal is to enable computers to process or "understand" natural language in order to perform tasks such as language translation and question answering.
Semantic recognition: semantic recognition is one of the important components of natural language processing technology. Its core is to understand not only the meaning of the words in a text but also the meaning each word carries within the sentence and the discourse. Technically, semantic recognition must perform semantic analysis and disambiguation at the text, lexical, syntactic, morphological, and discourse levels, along with the corresponding recombination of meaning, to achieve recognition itself.
Intent recognition: identifying the intent of a behavior, i.e. judging what the user's input is meant to do; it is the most important component of a question-answering robot. Intent recognition generally follows two important directions: retrieval-based intent recognition, in which, much like a search engine, the robot retrieves its own knowledge base and returns the answer that best addresses the user's question; and text-classification-based intent recognition, in which a text classification model is trained on the knowledge points of a knowledge base, the user's question is classified with this model to obtain the matching knowledge point, and the answer corresponding to that knowledge point is returned.
Slot filling: extracting predefined structured fields from the information entered by the user, thereby giving more accurate feedback to subsequent processing.
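As a concrete illustration of these two subtasks, their combined output for one utterance might look as follows; the values mirror the weather example used later in this specification, while the field names are assumptions rather than part of the patent:

```python
# Illustrative only: the structured result that intent recognition and
# slot filling aim to produce for a single user utterance.
utterance = "What is the weather like in Beijing today"
parse = {
    "intent": "query weather",                        # intent recognition
    "slots": {"place": "Beijing", "time": "today"},   # slot filling
}
```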
Convolutional neural network: CNN (Convolutional Neural Network) is a feedforward neural network that contains convolution operations and has a deep structure, used to extract abstract sequential feature representations from its input.
Recurrent neural network: RNN (Recurrent Neural Network) is an artificial neural network formed by connecting nodes into a directed graph, used to model the dynamic behavior of sequence data.
Bidirectional long short-term memory network: BiLSTM (Bi-directional Long Short-Term Memory) is a temporal recurrent neural network composed of a forward LSTM and a backward LSTM. It is a kind of RNN and, by design, is well suited to modeling time-series data such as text.
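A minimal sketch of the BiLSTM idea, assuming PyTorch and arbitrary sizes; nothing here is prescribed by the patent:

```python
import torch
import torch.nn as nn

# The output at each time step concatenates a forward LSTM state and a
# backward LSTM state, so the per-step output size is twice the hidden size.
bilstm = nn.LSTM(input_size=8, hidden_size=5, bidirectional=True, batch_first=True)
x = torch.randn(1, 4, 8)   # a toy sequence of 4 time steps
out, _ = bilstm(x)
print(out.shape)           # torch.Size([1, 4, 10]): 5 forward + 5 backward
```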
This application provides a semantic recognition method, an apparatus, a computing device, and a computer-readable storage medium, which are described in detail one by one in the following embodiments.
Fig. 1 shows a structural block diagram of a computing device 100 according to an embodiment of this specification. The components of the computing device 100 include, but are not limited to, a memory 110 and a processor 120. The processor 120 is connected to the memory 110 through a bus 130, and a database 150 is used to save data.
The computing device 100 further includes an access device 140 that enables the computing device 100 to communicate via one or more networks 160. Examples of these networks include the public switched telephone network (PSTN), a local area network (LAN), a wide area network (WAN), a personal area network (PAN), or a combination of communication networks such as the Internet. The access device 140 may include one or more of any kind of wired or wireless network interface (for example, a network interface card (NIC)), such as an IEEE 802.11 wireless local area network (WLAN) interface, a worldwide interoperability for microwave access (Wi-MAX) interface, an Ethernet interface, a universal serial bus (USB) interface, a cellular network interface, a Bluetooth interface, a near-field communication (NFC) interface, and so on.
In an embodiment of this specification, the above components of the computing device 100 and other components not shown in Fig. 1 may also be connected to each other, for example through a bus. It should be understood that the structural block diagram of the computing device shown in Fig. 1 is for exemplary purposes only and is not a limitation on the scope of this specification. Those skilled in the art may add or replace other components as needed.
The computing device 100 may be any type of static or mobile computing device, including a mobile computer or mobile computing device (for example, a tablet computer, personal digital assistant, laptop computer, notebook computer, or netbook), a mobile phone (for example, a smartphone), a wearable computing device (for example, a smartwatch or smart glasses), or another type of mobile device, or a static computing device such as a desktop computer or PC. The computing device 100 may also be a mobile or stationary server.
The processor 120 can execute the steps of the method shown in Fig. 2. Fig. 2 is a schematic flowchart of a semantic recognition method according to an embodiment of the application, including steps 201 to 204:
Step 201: obtain a feature vector sequence corresponding to a user's target analysis sentence.
In an embodiment of the application, the system obtains the target analysis sentence entered by the user. The target analysis sentence may be a piece of original Chinese or English text, such as "What is the weather like in Beijing today" or "The study published on Monday in the Proceedings". After obtaining the target analysis sentence, the system performs word embedding: it encodes the target analysis sentence according to a preset rule and converts it into feature vectors of the same dimension, which form the feature vector sequence, i.e. the feature matrix, corresponding to the target analysis sentence.
Step 202: input the feature vector sequence corresponding to the target analysis sentence into the shared layer of a preset semantic recognition model to obtain a first sequential output sequence corresponding to the target analysis sentence.
In an embodiment of the application, the system is preset with a semantic recognition model based on an artificial neural network (Artificial Neural Network). The semantic recognition model includes a shared layer (Shared Layer) built on a recurrent neural network (RNN). The system inputs the feature vector sequence corresponding to the target analysis sentence into the shared layer, which mines the sequence information contained in the target analysis sentence and characterizes the feature vector sequence with this sequence information, obtaining the temporal vector feature representation of the target analysis sentence, i.e. the first sequential output sequence.
Step 203: input the first sequential output sequence into the information identification layer of the semantic recognition model, and classify the first sequential output sequence through the information identification layer to generate a slot probability sequence corresponding to the target analysis sentence.
In an embodiment of the application, the semantic recognition model includes an information identification layer. The system inputs the first sequential output sequence into the information identification layer, which maps it from a high-dimensional space to a low-dimensional space and normalizes it, obtaining for each token in the target analysis sentence the probability that it is key information, i.e. slot information, and generating the slot probability sequence corresponding to the target analysis sentence. For example, in the original texts "What is the weather like in Beijing today" and "The study published on Monday in the Proceedings", the information identification layer can obtain, for tokens such as "Beijing", "weather", "study", and "Monday", the probability that each token is key information in its original text, and from these probabilities it generates the slot probability sequence corresponding to the target analysis sentence.
Step 204: predict the intent and the slot information sequence corresponding to the target analysis sentence based on the first sequential output sequence and the slot probability sequence.
In an embodiment of the application, the system softly connects the first sequential output sequence and the slot probability sequence, i.e. it fuses the slot probability sequence predicted by the information identification layer with the first sequential output sequence, adding a positive reference factor to the first sequential output sequence, and uses the updated first sequential output sequence as the input for slot filling (Slot Filling) and intent recognition (Intent Attention), thereby predicting the intent and the slot information sequence corresponding to the target analysis sentence. For example, for the original text "What is the weather like in Beijing today", the system can judge through slot filling and intent recognition that the user's intent is "query weather" and that the query is for the weather at the place (slot) "Beijing" at the time (slot) "today"; for the original text "The study published on Monday in the Proceedings", the system can judge through slot filling and intent recognition that the user's intent is "search for a paper or study" and that the search is for a paper or study whose time (slot) is "Monday" and whose keyword (slot) is the journal "Proceedings".
The present application splits the complex slot filling problem into a relatively simple slot-word identification problem and a slot-word classification problem. Before performing semantic analysis on the user's target analysis sentence, it first judges the probability that each word in the sentence is a keyword, i.e. slot information, thereby handling the slot-word identification part of slot filling first. Meanwhile, by fusing the slot probability sequence of the target analysis sentence with the first sequential output sequence as the input for subsequent intent recognition and slot filling, more accurate input information is obtained for the slot-word classification problem, and the model is given a useful reference and positive guidance when computing intent recognition and slot-word classification. The semantic recognition model of the present application requires no manual rule extraction, reducing the consumption of manual resources, enables transfer between different domains, and improves the prediction accuracy and reliability of semantic analysis.
A comparison between the test data of the semantic recognition model of the present application and the test data of existing semantic recognition models is shown in Table 1:
Table 1
In a specific embodiment of the application, obtaining the feature vector sequence corresponding to the user's target analysis sentence includes:
S2011: obtain the target analysis sentence of the user.
In an embodiment of the application, the system obtains the user's target analysis sentence, which may be a piece of original Chinese or English text, such as "What is the weather like in Beijing today" or "The study published on Monday in the Proceedings".
S2012: tokenize the target analysis sentence to obtain a target token sequence corresponding to the target analysis sentence.
In an embodiment of the application, the system tokenizes (Tokenize) the target analysis sentence to obtain the target token sequence X = (x1, ..., xT). For example, for the original text "What is the weather like in Beijing today", the system tokenizes it into five target tokens, x1 "today", x2 "Beijing", x3 "of", x4 "weather", and x5 "how", and generates the target token sequence X = (x1, ..., x5); for the original text "The study published on Monday in the Proceedings", the system tokenizes it into eight target tokens, x1 "The", x2 "study", x3 "published", x4 "on", x5 "Monday", x6 "in", x7 "the", and x8 "Proceedings", and generates the target token sequence X = (x1, ..., x8).
S2013: perform vectorized embedding on each target token in the target token sequence to obtain the feature vector corresponding to each target token, and generate the feature vector sequence corresponding to the target analysis sentence.
In an embodiment of the application, the system performs vectorized embedding (Embedding) with random initialization on each target token (x1, ..., xT) in the target token sequence X, obtains the feature vector ei corresponding to each target token, and generates the feature vector sequence E = (e1, ..., eT) corresponding to the target analysis sentence.
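A minimal sketch of S2011 to S2013, assuming PyTorch, a toy vocabulary, and whitespace tokenization in place of the unspecified tokenizer; the embedding dimension of 128 is likewise an assumption:

```python
import torch
import torch.nn as nn

# Toy vocabulary; a real system would build this from a training corpus.
vocab = {"<unk>": 0, "today": 1, "beijing": 2, "of": 3, "weather": 4, "how": 5}
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=128)  # random init

def sentence_to_features(sentence: str) -> torch.Tensor:
    tokens = sentence.lower().split()                      # S2012: tokenization
    ids = torch.tensor([vocab.get(t, 0) for t in tokens])  # token -> index, <unk> fallback
    return embedding(ids)                                  # S2013: E = (e1, ..., eT), shape (T, 128)

E = sentence_to_features("today beijing of weather how")   # feature vector sequence
```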
In another specific embodiment of the application, as shown in Fig. 3 or Fig. 4, the semantic recognition model includes a shared layer, and inputting the feature vector sequence corresponding to the target analysis sentence into the shared layer of the preset semantic recognition model to obtain the first sequential output sequence corresponding to the target analysis sentence includes:
S2021: sequentially input the feature vector corresponding to each target token in the feature vector sequence of the target analysis sentence into a first bidirectional long short-term memory network layer, obtaining the first forward feature matrix and the first backward feature matrix corresponding to the feature vector of each target token.
In an embodiment of the application, the bidirectional long short-term memory network refers to BiLSTM (Bi-directional Long Short-Term Memory). The first BiLSTM layer includes a forward long short-term memory network model and a backward long short-term memory network model. The system inputs the feature vector ei corresponding to each target token (x1, ..., xT) in the feature vector sequence E = (e1, ..., eT) separately, in timestamp (time_step) order, into the forward model and the backward model. The forward model processes the feature vectors of the target tokens in order and obtains a first forward feature matrix reflecting the forward sequence information; the backward model processes them in reverse order and obtains a first backward feature matrix reflecting the backward sequence information. The sequence information characterizes the relations between the target tokens in the target analysis sentence.
S2022: splice the first forward feature matrix and the first backward feature matrix corresponding to the feature vector of each target token to obtain the first hidden state vector corresponding to each target token.
In an embodiment of the application, the first BiLSTM layer splices, per timestamp (time_step), the first forward feature matrix and the first backward feature matrix corresponding to the feature vector of each target token, obtaining the first hidden state vector h1i, 1 ≤ i ≤ T, which characterizes each target token based on its contextual semantic dependencies.
S2023: generate the first sequential output sequence corresponding to the target analysis sentence from the first hidden state vector corresponding to each target token.
In an embodiment of the application, the system generates the first sequential output sequence H1 = (h11, ..., h1T) corresponding to the target analysis sentence from the first hidden state vectors h1i, 1 ≤ i ≤ T, of the target tokens.
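The shared layer of S2021 to S2023 can be sketched as follows, assuming PyTorch; its bidirectional LSTM computes the forward and backward feature matrices and splices them per token internally, and hidden_size = 60 is an assumption chosen so that each h1i has the 120 dimensions used in the running example:

```python
import torch
import torch.nn as nn

# Shared layer: one bidirectional LSTM over the feature vector sequence.
shared_layer = nn.LSTM(input_size=128, hidden_size=60,
                       bidirectional=True, batch_first=True)

def shared_forward(E: torch.Tensor) -> torch.Tensor:
    H1, _ = shared_layer(E.unsqueeze(0))  # add batch dim; output shape (1, T, 120)
    return H1.squeeze(0)                  # H1 = (h11, ..., h1T), each h1i of size 120
```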
In another specific embodiment of the application, as shown in Fig. 5 and Fig. 6, the semantic recognition model includes an information identification layer, and the information identification layer includes a second bidirectional long short-term memory network layer, a fully connected layer, and a normalization layer. Classifying the first sequential output sequence through the information identification layer to generate the slot probability sequence corresponding to the target analysis sentence includes steps 601 to 604:
Step 601: sequentially input the first hidden state vector corresponding to each target token in the first sequential output sequence of the target analysis sentence into the second bidirectional long short-term memory network layer, obtaining the second forward feature matrix and the second backward feature matrix corresponding to the first hidden state vector of each target token.
In an embodiment of the application, the first hidden state vectors h1i of the target tokens in the first sequential output sequence H1 = (h11, ..., h1T) of the target analysis sentence are sequentially input into the second BiLSTM layer, obtaining the second forward feature matrix and the second backward feature matrix corresponding to the first hidden state vector h1i of each target token. The second BiLSTM layer is identical in structure and principle to the first BiLSTM layer described above, so the details are not repeated here.
Step 602: splice the second forward feature matrix and the second backward feature matrix corresponding to the first hidden state vector of each target token to obtain the second hidden state vector corresponding to each target token.
In an embodiment of the application, the second BiLSTM layer splices, per timestamp (time_step), the second forward feature matrix and the second backward feature matrix corresponding to each first hidden state vector h1i, obtaining the second hidden state vector h2i, 1 ≤ i ≤ T, which characterizes each first hidden state vector h1i based on its contextual semantic dependencies.
Step 603: classify the second hidden state vector corresponding to each target token through the fully connected layer, and normalize the result through the normalization layer, obtaining the confidence vector indicating whether each target token is slot information.
In an embodiment of the application, the system passes the second hidden state vector h2i of each target token through the fully connected layer (Fully Connected Layers), which maps it from a high-dimensional vector to a low-dimensional vector for classification; the fully connected layer contains all the features of slot information and maps each second hidden state vector h2i from an n × 1 matrix vector to a 2 × 1 output matrix vector. The normalization layer then normalizes this output matrix vector, obtaining the confidence vector fi = (f1, f2), 1 ≤ i ≤ T, where f1 is the probability that the target token is slot information and f2 is the probability that it is not; if f1 is greater than f2, the target token is judged to be slot information, i.e. important information. For example, for the original text "What is the weather like in Beijing today", the confidence vector of the target token x4 "weather" is f4 = (0.8, 0.2), so x4 "weather" is slot information, i.e. important information, whereas the confidence vector of the target token x3 "of" is f3 = (0.1, 0.9), so x3 "of" is not slot information.
Step 604: generate the slot probability sequence corresponding to the target analysis sentence from the confidence vector indicating whether each target token is slot information.
In an embodiment of the application, the system generates the slot probability sequence F = (f1, ..., fT) corresponding to the target analysis sentence from the confidence vectors fi of the target tokens.
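A sketch of the information identification layer of steps 601 to 604, assuming PyTorch and the 120-dimensional h1i of the running example; all layer sizes are assumptions:

```python
import torch
import torch.nn as nn

# Second BiLSTM over H1, then a fully connected layer mapping each second
# hidden state vector h2i to 2 logits, normalized by softmax into fi = (f1, f2).
second_bilstm = nn.LSTM(input_size=120, hidden_size=60,
                        bidirectional=True, batch_first=True)
fc = nn.Linear(120, 2)  # fully connected layer: n x 1 -> 2 x 1

def slot_probability_sequence(H1: torch.Tensor) -> torch.Tensor:
    H, _ = second_bilstm(H1.unsqueeze(0))  # second hidden state vectors h2i
    logits = fc(H.squeeze(0))              # shape (T, 2)
    return torch.softmax(logits, dim=-1)   # F = (f1, ..., fT); fi[0] > fi[1] means slot word
```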
Through the second bidirectional long short-term memory network layer, the fully connected layer, and the normalization layer, the information identification layer of the application performs key information identification on the first sequential output sequence and judges which tokens in the target analysis sentence are keywords, i.e. slot information. This gives priority to the slot-word identification part of the slot filling problem, provides a positive influence on the subsequent slot-word classification in slot filling, and improves prediction accuracy.
In another specific embodiment of the application, as shown in Fig. 7, predicting the intent and the slot information sequence corresponding to the target analysis sentence based on the first sequential output sequence and the slot probability sequence includes steps 701 to 703:
Step 701: splice the first sequential output sequence and the slot probability sequence to obtain a second sequential output sequence corresponding to the target analysis sentence.
In an embodiment of the application, splicing the first sequential output sequence and the slot probability sequence to obtain the second sequential output sequence corresponding to the target analysis sentence includes:
S7011: vector-splice the confidence vector and the first hidden state vector corresponding to each target token to obtain the sequential output vector corresponding to each target token.
In an embodiment of the application, the system vector-splices the confidence vector fi and the first hidden state vector h1i of each target token to obtain the sequential output vector wi of each target token; that is, assuming the first hidden state vector h1i is an n × 1 matrix vector and the confidence vector fi is a 2 × 1 matrix vector, the sequential output vector wi is an (n + 2) × 1 matrix vector. For example, for the original text "What is the weather like in Beijing today", the confidence vector of the target token x4 "weather" is f4 = (0.8, 0.2) and its first hidden state vector h14 is a 120 × 1 matrix vector, so the sequential output vector w4 corresponding to x4 "weather" is a 122 × 1 matrix vector.
S7012: generate the second sequential output sequence from the sequential output vector corresponding to each target token.
In an embodiment of the application, the system generates the second sequential output sequence H2 = (w1, ..., wT) corresponding to the target analysis sentence from the sequential output vectors wi of the target tokens.
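The soft connection of S7011 and S7012 is a plain vector concatenation; a sketch under the same assumptions as above:

```python
import torch

# Each 120-dim h1i gains its 2-dim confidence vector fi and becomes the
# 122-dim sequential output vector wi, matching the example in the text.
def second_sequence_output(H1: torch.Tensor, F: torch.Tensor) -> torch.Tensor:
    return torch.cat([H1, F], dim=-1)  # H2 = (w1, ..., wT), shape (T, 122)
```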
Step 702: input the second sequential output sequence into the intent recognition layer of the semantic recognition model to obtain the intent corresponding to the target analysis sentence.
In an embodiment of the application, as shown in Fig. 3 or Fig. 4, the system inputs the second sequential output sequence H2 = (w1, ..., wT) into the intent recognition layer of the semantic recognition model to obtain the intent yI corresponding to the target analysis sentence. Specifically, inputting the second sequential output sequence into the intent recognition layer of the semantic recognition model to obtain the intent corresponding to the target analysis sentence includes:
S7021: determine the intent context vector corresponding to the target analysis sentence from the second sequential output sequence.
In an embodiment of the application, the intent context vector cI is a weighted combination of the second sequential output sequence H2, calculated by the following formula:
cI = Σ_j αI_j · wj
where αI is the intent attention weighting function.
S7022: based on the end output state of the second sequential output sequence and the intent context vector, determine, through a weight matrix and a softmax function, the probability that the target analysis sentence corresponds to each intent category in a preset intent classification sequence.
In an embodiment of the application, the probability that the target analysis sentence corresponds to each intent category in the preset intent classification sequence is calculated by the following formula:
yI = softmax(WI · (wT + cI))
where WI is the weight matrix and wT is the end output state of the second sequential output sequence H2.
S7023: determine the intent category with the highest score in the intent classification sequence as the intent corresponding to the target analysis sentence.
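A sketch of S7021 to S7023, assuming PyTorch; the attention parameterization and the number of intent categories are assumptions consistent with the formulas above:

```python
import torch
import torch.nn as nn

num_intents = 10                   # assumed size of the intent classification sequence
attn = nn.Linear(122, 1)           # produces the intent attention weights
W_I = nn.Linear(122, num_intents)  # weight matrix WI

def predict_intent(H2: torch.Tensor) -> int:
    alpha = torch.softmax(attn(H2), dim=0)          # (T, 1) intent attention weights
    c_I = (alpha * H2).sum(dim=0)                   # intent context vector cI
    y_I = torch.softmax(W_I(H2[-1] + c_I), dim=-1)  # H2[-1] is the end output state wT
    return int(y_I.argmax())                        # highest-scoring intent category
```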
Step 703: input the second sequential output sequence into the slot filling layer of the semantic recognition model to obtain the slot information sequence corresponding to the target analysis sentence.
In an embodiment of the application, the system inputs the second sequential output sequence H2 = (w1, ..., wT) into the slot filling layer of the semantic recognition model to obtain the slot information sequence yS = (yS_1, ..., yS_T) corresponding to the target analysis sentence.
Optionally, as shown in Fig. 3, in the case where the semantic recognition model includes a slot attention mechanism, inputting the second sequential output sequence into the slot filling layer of the semantic recognition model to obtain the slot information sequence corresponding to the target analysis sentence includes:
S70311: determine the intent context vector and the slot context vectors corresponding to the target analysis sentence from the second sequential output sequence.
In an embodiment of the application, the system determines, from the second sequential output sequence H2, the intent context vector cI and the slot context vectors cS_i corresponding to the target analysis sentence. The slot context vector cS_i is calculated by the following formula:
cS_i = Σ_j αS_{i,j} · wj
where αS is the slot attention weighting function.
S70312: determine the weighted feature value corresponding to the target analysis sentence from the intent context vector and the slot context vectors.
In an embodiment of the application, the system determines the weighted feature value g corresponding to the target analysis sentence from the intent context vector cI and the slot context vector cS_i; the weighted feature value g is calculated by the following formula:
g = Σ υ · tanh(cS_i + W · cI)
where υ and W are a trainable vector and a trainable matrix, respectively, and the sum runs over the elements of the vector.
S70313: based on the slot context vectors, the weighted feature value, and the second sequential output sequence, determine, through a weight matrix and a softmax function, the slot label of each sequential output vector in the second sequential output sequence.
In an embodiment of the application, based on the slot context vector cS_i, the weighted feature value g, and the second sequential output sequence H2, the system determines the slot label yS_i of each sequential output vector wi in the second sequential output sequence H2 through a weight matrix and a softmax function (softmax). The slot label yS_i of each sequential output vector wi is calculated by the following formula:
yS_i = softmax(WS · (wi + cS_i · g))
where WS is the weight matrix.
S70314: generate the slot information sequence corresponding to the target analysis sentence from the slot labels of the sequential output vectors in the second sequential output sequence.
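A sketch of S70311 to S70314, assuming PyTorch; the shapes of the attention and gate parameters are assumptions consistent with the formulas above, with the gate computed per token, and cI is the intent context vector from the intent sketch:

```python
import torch
import torch.nn as nn

num_slot_labels = 20                 # assumed size of the slot label set
slot_attn = nn.Linear(122, 122)      # scores for the slot attention
W = nn.Linear(122, 122, bias=False)  # trainable matrix W in the gate
v = nn.Parameter(torch.randn(122))   # trainable vector upsilon in the gate
W_S = nn.Linear(122, num_slot_labels)  # slot classifier weight matrix WS

def predict_slots(H2: torch.Tensor, c_I: torch.Tensor) -> torch.Tensor:
    scores = H2 @ slot_attn(H2).T    # (T, T) slot attention scores
    alpha_S = torch.softmax(scores, dim=-1)
    c_S = alpha_S @ H2               # (T, 122) slot context vectors cS_i
    g = (v * torch.tanh(c_S + W(c_I))).sum(dim=-1, keepdim=True)  # (T, 1) gate
    y_S = torch.softmax(W_S(H2 + c_S * g), dim=-1)
    return y_S.argmax(dim=-1)        # slot label per token
```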
Optionally, as shown in Fig. 4, in the case where the semantic recognition model does not include the slot attention mechanism, inputting the second sequential output sequence into the slot filling layer of the semantic recognition model to obtain the slot information sequence corresponding to the target analysis sentence includes:
S70321: determine the intent context vector and the slot context vectors corresponding to the target analysis sentence from the second sequential output sequence.
In an embodiment of the application, the system determines, from the second sequential output sequence H2, the intent context vector cI and the slot context vectors cS_i corresponding to the target analysis sentence; the calculation is the same as described above and is not repeated here.
S70322: determine the weighted feature value corresponding to the target analysis sentence from the intent context vector and the slot context vectors.
In an embodiment of the application, the system determines the weighted feature value g corresponding to the target analysis sentence from the intent context vector cI and the slot context vector cS_i; the weighted feature value g is calculated by the following formula:
g = Σ υ · tanh(cS_i + W · cI)
where υ and W are a trainable vector and a trainable matrix, respectively.
S70323: based on the weighted feature value and the second sequential output sequence, determine, through a weight matrix and a softmax function, the slot label of each sequential output vector in the second sequential output sequence.
In an embodiment of the application, based on the weighted feature value g and the second sequential output sequence H2, the system determines the slot label yS_i of each sequential output vector wi in the second sequential output sequence H2 through a weight matrix and a softmax function (softmax). The slot label yS_i of each sequential output vector wi is calculated by the following formula:
yS_i = softmax(WS · (wi + wi · g))
where WS is the weight matrix.
S70324: generate the slot information sequence corresponding to the target analysis sentence from the slot labels of the sequential output vectors in the second sequential output sequence.
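The variant without slot attention differs only in the label computation; a sketch reusing W_S and the gate g from the previous sketch (and therefore not self-contained on its own):

```python
import torch

# The slot context vector drops out of the label computation:
# yS_i = softmax(WS * (wi + wi * g)).
def predict_slots_no_attention(H2: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
    return torch.softmax(W_S(H2 + H2 * g), dim=-1).argmax(dim=-1)
```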
By vector-splicing the slot probability sequence of the target analysis sentence with the first sequential output sequence to obtain the second sequential output sequence, the application creates something akin to a manual feature: the information contained in the slot probability sequence and the first sequential output sequence is embodied in the second sequential output sequence, so that the second sequential output sequence has higher reliability and accuracy as the input for subsequent intent recognition and slot filling.
By introducing the weighted feature value g into the slot filling and intent recognition calculations, the application improves the performance and reliability of the model's slot filling, and through the weighted feature value g the result of intent recognition provides positive guidance to slot filling.
The processor 120 can also execute the steps of the method shown in Fig. 8. Fig. 8 is a schematic flowchart of a semantic recognition model training method according to an embodiment of the application, including steps 801 to 802:
Step 801: obtain a training sample set, wherein the training sample set includes multiple sample pairs, each sample pair includes a sample analysis sentence and corresponding training labels, and the training labels include the intent label, the slot information sequence label, and the slot probability sequence label corresponding to the sample analysis sentence.
Step 802: train a semantic recognition model with the training sample set to obtain the semantic recognition model, the semantic recognition model associating the sample analysis sentence with the training labels.
The semantic recognition model training method of the application trains the semantic recognition model with the training sample set to obtain the semantic recognition model. When performing semantic analysis on a user's target analysis sentence, the semantic recognition model splits the complex slot filling problem into a relatively simple slot-word identification problem and a slot-word classification problem: it first identifies and judges the slot words, i.e. key information, and then performs intent recognition and slot-word classification, so that the result of the information identification layer can positively affect the result of the complex slot filling model, improving the accuracy and reliability of prediction.
In an embodiment of the application, training the semantic recognition model with the training sample set includes:
S8021: obtain the prediction result and the training labels corresponding to the sample analysis sentence of the sample pair.
In an embodiment of the application, obtaining the prediction result corresponding to the sample analysis sentence of the sample pair includes:
S80211: obtain the predicted intent corresponding to the sample analysis sentence of the sample pair.
S80212: obtain the predicted slot information sequence corresponding to the sample analysis sentence of the sample pair.
S80213: obtain the confidence vector of each sample token in the sample analysis sentence.
S8022: determine the cross entropy between the prediction result corresponding to the sample analysis sentence and the training labels, and obtain the loss function of the semantic recognition model from the cross entropy.
In an embodiment of the application, determining the cross entropy between the prediction result corresponding to the sample analysis sentence and the training labels includes:
S80221: determine the cross entropy between the predicted intent of the sample analysis sentence of the sample pair and the corresponding intent label.
S80222: determine the cross entropy between the predicted slot information sequence of the sample analysis sentence of the sample pair and the corresponding slot information sequence label.
S80223: determine the cross entropy between the confidence vectors of the sample tokens in the sample analysis sentence and the slot probability sequence label.
S8023: adjust the parameters of the semantic recognition model according to the loss function until the semantic recognition model meets a training stop condition.
By obtaining the loss function of the semantic recognition model with cross entropy and optimizing the model parameters through the loss function, the application ensures that even if erroneous information is spliced in when vector-splicing the slot probability sequence and the first sequential output sequence, the semantic recognition model can automatically screen out the erroneous information during parameter optimization; that is, the soft connection method prevents the errors of the information identification layer from back-propagating. Through parameter optimization the model thus learns to select the confidences of the target tokens, improving the model's evaluation metrics, such as intent recognition accuracy, slot information recognition score, and joint slot-and-intent accuracy.
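A sketch of the loss of S8021 to S8023, assuming PyTorch logits and equal weighting of the three cross-entropy terms (the weighting is an assumption):

```python
import torch
import torch.nn as nn

ce = nn.CrossEntropyLoss()

def joint_loss(intent_logits: torch.Tensor, intent_label: torch.Tensor,  # (C,), scalar
               slot_logits: torch.Tensor, slot_labels: torch.Tensor,     # (T, S), (T,)
               conf_logits: torch.Tensor, conf_labels: torch.Tensor      # (T, 2), (T,)
               ) -> torch.Tensor:
    # Three cross entropies per S80221-S80223: intent vs. intent label,
    # slot labels vs. slot information sequence label, and per-token slot-word
    # confidences vs. the slot probability sequence label.
    loss = (ce(intent_logits.unsqueeze(0), intent_label.view(1))
            + ce(slot_logits, slot_labels)
            + ce(conf_logits, conf_labels))
    return loss  # minimized until the training stop condition of S8023 is met
```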
Corresponding to the above method embodiments, this specification further provides embodiments of a semantic recognition apparatus. Fig. 9 shows a structural schematic diagram of the semantic recognition apparatus of an embodiment of this specification. As shown in Fig. 9, the apparatus includes:
a sentence embedding module 901, configured to obtain a feature vector sequence corresponding to a user's target analysis sentence;
a shared module 902, configured to input the feature vector sequence corresponding to the target analysis sentence into the shared layer of a preset semantic recognition model to obtain a first sequential output sequence corresponding to the target analysis sentence;
an information identification module 903, configured to input the first sequential output sequence into the information identification layer of the semantic recognition model and to classify the first sequential output sequence through the information identification layer, generating a slot probability sequence corresponding to the target analysis sentence;
a semantic analysis module 904, configured to predict the intent and the slot information sequence corresponding to the target analysis sentence based on the first sequential output sequence and the slot probability sequence.
Optionally, the sentence embedding module 901 includes:
an acquisition unit, configured to obtain the target analysis sentence of the user;
a tokenization unit, configured to tokenize the target analysis sentence to obtain the target token sequence corresponding to the target analysis sentence;
an embedding unit, configured to perform vectorized embedding on each target token in the target token sequence, obtain the feature vector corresponding to each target token, and generate the feature vector sequence corresponding to the target analysis sentence.
Optionally, the shared module 902 includes:
a first recurrent unit, configured to sequentially input the feature vector corresponding to each target token in the feature vector sequence of the target analysis sentence into the first bidirectional long short-term memory network layer, obtaining the first forward feature matrix and the first backward feature matrix corresponding to the feature vector of each target token;
a first memory unit, configured to splice the first forward feature matrix and the first backward feature matrix corresponding to the feature vector of each target token, obtaining the first hidden state vector corresponding to each target token;
a first sequence unit, configured to generate the first sequential output sequence corresponding to the target analysis sentence from the first hidden state vector corresponding to each target token.
Optionally, the information identification module 903 includes:
a second recurrent unit, configured to sequentially input the first hidden state vector corresponding to each target token in the first sequential output sequence of the target analysis sentence into the second bidirectional long short-term memory network layer, obtaining the second forward feature matrix and the second backward feature matrix corresponding to the first hidden state vector of each target token;
a second memory unit, configured to splice the second forward feature matrix and the second backward feature matrix corresponding to the first hidden state vector of each target token, obtaining the second hidden state vector corresponding to each target token;
a classification unit, configured to classify the second hidden state vector corresponding to each target token through the fully connected layer and to normalize the result through the normalization layer, obtaining the confidence vector indicating whether each target token is slot information;
a first sequence generation unit, configured to generate the slot probability sequence corresponding to the target analysis sentence from the confidence vector indicating whether each target token is slot information.
Optionally, the semantic analysis module 904 includes:
a second sequence unit, configured to splice the first sequential output sequence and the slot probability sequence to obtain the second sequential output sequence corresponding to the target analysis sentence;
an intent recognition unit, configured to input the second sequential output sequence into the intent recognition layer of the semantic recognition model to obtain the intent corresponding to the target analysis sentence;
a slot filling unit, configured to input the second sequential output sequence into the slot filling layer of the semantic recognition model to obtain the slot information sequence corresponding to the target analysis sentence.
Optionally, the second sequence unit includes:
A vector concatenation unit, configured to concatenate the confidence vector and the first hidden-layer state vector corresponding to each target word segment, to obtain a sequential output vector corresponding to each target word segment;
A second sequence generation unit, configured to generate the second sequential output sequence from the sequential output vector corresponding to each target word segment.
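The fusion step itself is a plain concatenation, sketched below under the same PyTorch assumption:

import torch

first_hidden = torch.randn(1, 5, 128)  # first sequential output sequence
confidence = torch.rand(1, 5, 2)       # slot position probability sequence

# Each sequential output vector concatenates the confidence vector with the
# first hidden-layer state vector; stacked, they form the second sequential
# output sequence.
second_sequential_output = torch.cat([confidence, first_hidden], dim=-1)
print(second_sequential_output.shape)  # torch.Size([1, 5, 130])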
Optionally, the intent recognition unit includes:
A first intent determination unit, configured to determine an intent context vector corresponding to the target analysis sentence from the second sequential output sequence;
A second intent determination unit, configured to determine, through a weight matrix and a softmax function, the probability that the target analysis sentence corresponds to each intent category in a preset intent classification sequence, based on a final output state of the second sequential output sequence and the intent context vector;
A third intent determination unit, configured to determine the highest-scoring intent category in the intent classification sequence as the intent corresponding to the target analysis sentence.
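A minimal sketch of such an intent recognition layer, assuming PyTorch; the attention pooling used to form the intent context vector and the intent categories are illustrative assumptions, not the specific computation fixed by this application.

import torch
import torch.nn as nn

second_seq = torch.randn(1, 5, 130)  # second sequential output sequence

# Intent context vector via simple attention pooling over the sequence.
attn = nn.Linear(130, 1)
weights = torch.softmax(attn(second_seq), dim=1)    # (1, 5, 1)
intent_context = (weights * second_seq).sum(dim=1)  # (1, 130)

final_state = second_seq[:, -1, :]  # final output state of the sequence

# Weight matrix plus softmax over a preset intent classification sequence.
intent_classes = ["play_music", "book_flight", "check_weather"]
classifier = nn.Linear(2 * 130, len(intent_classes))
logits = classifier(torch.cat([final_state, intent_context], dim=-1))
probs = torch.softmax(logits, dim=-1)
intent = intent_classes[int(probs.argmax(dim=-1))]  # highest-scoring category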
Optionally, the slot position filling unit includes:
A first slot position determination unit, configured to determine an intent context vector and a slot position context vector corresponding to the target analysis sentence based on the second sequential output sequence;
A second slot position determination unit, configured to determine a weighted feature value corresponding to the target analysis sentence based on the intent context vector and the slot position context vector;
A third slot position determination unit, configured to determine, through a weight matrix and a softmax function, the slot position label of each sequential output vector in the second sequential output sequence, based on the slot position context vector, the weighted feature value and the second sequential output sequence;
A fourth slot position determination unit, configured to generate the slot position information sequence corresponding to the target analysis sentence from the slot position label of each sequential output vector in the second sequential output sequence.
Optionally, the third slot position determination unit is further configured to determine, through a weight matrix and a softmax function, the slot position label of each sequential output vector in the second sequential output sequence, based on the weighted feature value and the second sequential output sequence.
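For illustration, a minimal sketch of such a slot position filling layer follows. It is reminiscent of the slot-gated mechanism in the Goo et al. paper listed under Non-Patent Citations below; the exact gating formula and the slot label set are assumptions made for this example, assuming PyTorch.

import torch
import torch.nn as nn

second_seq = torch.randn(1, 5, 130)    # second sequential output sequence
slot_context = torch.randn(1, 5, 130)  # per-step slot position context vectors
intent_context = torch.randn(1, 130)   # intent context vector

# Weighted feature value combining the two context vectors (assumed form).
v = nn.Parameter(torch.randn(130))
gate = (v * torch.tanh(slot_context + intent_context.unsqueeze(1))).sum(-1)

slot_labels = ["O", "B-city", "I-city", "B-date"]  # hypothetical label set
W = nn.Linear(130, len(slot_labels))  # weight matrix of the slot filling layer
logits = W(second_seq + gate.unsqueeze(-1) * slot_context)
label_ids = torch.softmax(logits, dim=-1).argmax(dim=-1)

# Slot position information sequence of the target analysis sentence.
slot_information_sequence = [slot_labels[int(i)] for i in label_ids[0]]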
The semantic recognition device of the present application splits the complex slot position filling problem into two relatively simple problems, slot word identification and slot word classification. Before performing semantic analysis on the user's target analysis sentence, it first judges the probability that each word in the target analysis sentence is a keyword, that is, slot position information, so that the slot word identification part of slot position filling is handled first. Meanwhile, by fusing the slot position probability sequence corresponding to the target analysis sentence with the first sequential output sequence as the input of the subsequent intent recognition and slot position filling, more accurate input information is obtained when handling the slot word classification problem, providing the model with a useful reference and positive guidance in the computation of intent recognition and slot word classification. The semantic recognition model of the present application requires no manual rule extraction, which reduces the consumption of human resources, enables migration between different fields, and improves the prediction accuracy and reliability of semantic analysis.
Corresponding to the above method embodiments, this specification further provides an embodiment of a semantic recognition model training apparatus. FIG. 10 shows a structural schematic diagram of the semantic recognition model training apparatus according to one embodiment of this specification. As shown in FIG. 10, the apparatus includes:
A sample acquisition module 1001, configured to obtain a training sample set, wherein the training sample set includes a plurality of sample pairs, each sample pair including a sample analysis sentence and a corresponding training label, the training label including an intent label, a slot position information sequence label and a slot position probability sequence label corresponding to the sample analysis sentence;
A model training module 1002, configured to train a semantic recognition model through the training sample set to obtain the semantic recognition model, the semantic recognition model associating the sample analysis sentence with the training label.
Optionally, the model training module 1002 includes:
A result acquiring unit, configured to obtain the prediction result and the training label corresponding to the sample analysis sentence of the sample pair;
A cross-entropy unit, configured to determine the cross entropy between the prediction result corresponding to the sample analysis sentence and the training label, and to obtain the loss function of the semantic recognition model from the cross entropy;
An optimization unit, configured to adjust parameters of the semantic recognition model according to the loss function until the semantic recognition model meets a training stop condition.
Optionally, the result acquiring unit includes:
A first result obtaining subunit, configured to obtain a predicted intent corresponding to the sample analysis sentence of the sample pair;
A second result obtaining subunit, configured to obtain a predicted slot position information sequence corresponding to the sample analysis sentence of the sample pair;
A third result obtaining subunit, configured to obtain a confidence vector of each sample word segment in the sample analysis sentence.
Optionally, the cross-entropy unit includes:
A first cross-entropy computing unit, configured to determine the cross entropy between the predicted intent of the sample analysis sentence of the sample pair and the corresponding intent label;
A second cross-entropy computing unit, configured to determine the cross entropy between the predicted slot position information sequence of the sample analysis sentence of the sample pair and the corresponding slot position information sequence label;
A third cross-entropy computing unit, configured to determine the cross entropy between the confidence vector of each sample word segment in the sample analysis sentence and the slot position probability sequence label.
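A minimal sketch of such a joint loss, assuming PyTorch and summing the three cross entropies with equal weight; the equal weighting and the tensor sizes are assumptions for illustration only.

import torch
import torch.nn as nn

ce = nn.CrossEntropyLoss()

# Illustrative predictions (logits) and training labels for one sample pair.
intent_logits = torch.randn(1, 3, requires_grad=True)  # predicted intent
intent_label = torch.tensor([1])                       # intent label

slot_logits = torch.randn(5, 4, requires_grad=True)    # predicted slot labels
slot_label = torch.randint(0, 4, (5,))                 # slot position information sequence label

conf_logits = torch.randn(5, 2, requires_grad=True)    # per-segment confidence
conf_label = torch.randint(0, 2, (5,))                 # slot position probability sequence label

# Loss function of the semantic recognition model: the three cross
# entropies combined (here simply summed).
loss = (ce(intent_logits, intent_label)
        + ce(slot_logits, slot_label)
        + ce(conf_logits, conf_label))
loss.backward()  # parameters would then be adjusted by an optimizer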
The semantic recognition model training apparatus of the present application trains a semantic recognition model through the training sample set to obtain the semantic recognition model. When performing semantic analysis on the user's target analysis sentence, the semantic recognition model splits the complex slot position filling problem into the relatively simple problems of slot word identification and slot word classification: slot words, that is, key information, are identified and judged first, and intent recognition and slot word classification are performed afterwards, so that the result of the information identification layer can exert a beneficial effect on the result of the complex slot position filling model, improving the accuracy and reliability of prediction.
An embodiment of the present application further provides a computing device, including a memory, a processor, and computer instructions stored in the memory and executable on the processor, wherein the processor, when executing the instructions, performs the following steps:
obtaining a feature vector sequence corresponding to a target analysis sentence of a user;
inputting the feature vector sequence corresponding to the target analysis sentence into a shared layer of a preset semantic recognition model, to obtain a first sequential output sequence corresponding to the target analysis sentence;
inputting the first sequential output sequence into an information identification layer of the semantic recognition model, classifying the first sequential output sequence through the information identification layer, and generating a slot position probability sequence corresponding to the target analysis sentence;
predicting an intent and a slot position information sequence corresponding to the target analysis sentence based on the first sequential output sequence and the slot position probability sequence.
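To make the data flow of these steps concrete, the following self-contained sketch wires the pieces above into one model, assuming PyTorch; all layer sizes, class counts and the use of the last time step for intent are illustrative assumptions, not the definitive implementation.

import torch
import torch.nn as nn

class SemanticRecognitionModel(nn.Module):
    # Illustrative layout: shared layer -> information identification
    # layer -> intent recognition and slot position filling.
    def __init__(self, vocab_size=1000, n_intents=3, n_slots=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, 128)
        self.shared = nn.LSTM(128, 64, batch_first=True, bidirectional=True)
        self.info = nn.LSTM(128, 64, batch_first=True, bidirectional=True)
        self.conf_fc = nn.Linear(128, 2)
        self.intent_fc = nn.Linear(130, n_intents)
        self.slot_fc = nn.Linear(130, n_slots)

    def forward(self, ids):
        first_seq, _ = self.shared(self.embed(ids))       # shared layer
        info_out, _ = self.info(first_seq)                # information identification layer
        conf = torch.softmax(self.conf_fc(info_out), -1)  # slot position probability sequence
        second_seq = torch.cat([conf, first_seq], -1)     # fused second sequence
        intent = torch.softmax(self.intent_fc(second_seq[:, -1, :]), -1)
        slots = torch.softmax(self.slot_fc(second_seq), -1)
        return intent, slots

model = SemanticRecognitionModel()
intent_probs, slot_probs = model(torch.randint(0, 1000, (1, 5)))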
An embodiment of the present application further provides a computing device, including a memory, a processor, and computer instructions stored in the memory and executable on the processor, wherein the processor, when executing the instructions, performs the following steps:
obtaining a training sample set, wherein the training sample set includes a plurality of sample pairs, each sample pair including a sample analysis sentence and a corresponding training label, the training label including an intent label, a slot position information sequence label and a slot position probability sequence label corresponding to the sample analysis sentence;
training a semantic recognition model through the training sample set to obtain the semantic recognition model, the semantic recognition model associating the sample analysis sentence with the training label.
An embodiment of the present application further provides a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the semantic recognition method and the semantic recognition model training method described above.
The above is an exemplary scheme of the computer-readable storage medium of this embodiment. It should be noted that the technical solution of the computer-readable storage medium and the technical solution of the above semantic recognition method belong to the same concept; for details not described in the technical solution of the computer-readable storage medium, reference may be made to the description of the technical solution of the above semantic recognition method.
Specific embodiments of this specification have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or sequential order, to achieve the desired results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
It should be noted that, for the foregoing method embodiments, for simplicity of description, they are expressed as a series of action combinations, but those skilled in the art should understand that the present application is not limited by the described order of actions, because according to the present application, certain steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in a certain embodiment, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the present application disclosed above are intended only to help illustrate the present application. The alternative embodiments do not describe all details exhaustively, nor do they limit the invention to the specific implementations described. Obviously, many modifications and variations can be made in light of the content of this specification. These embodiments are selected and specifically described in this specification in order to better explain the principles and practical applications of the present application, so that those skilled in the art can better understand and use the present application. The present application is limited only by the claims and their full scope and equivalents.

Claims (16)

1. A semantic recognition method, characterized by comprising:
obtaining a feature vector sequence corresponding to a target analysis sentence of a user;
inputting the feature vector sequence corresponding to the target analysis sentence into a shared layer of a preset semantic recognition model, to obtain a first sequential output sequence corresponding to the target analysis sentence;
inputting the first sequential output sequence into an information identification layer of the semantic recognition model, classifying the first sequential output sequence through the information identification layer, and generating a slot position probability sequence corresponding to the target analysis sentence;
predicting an intent and a slot position information sequence corresponding to the target analysis sentence based on the first sequential output sequence and the slot position probability sequence.
2. The method according to claim 1, wherein obtaining the feature vector sequence corresponding to the target analysis sentence of the user comprises:
obtaining the target analysis sentence of the user;
performing word segmentation on the target analysis sentence to obtain a target word segmentation sequence corresponding to the target analysis sentence;
performing vectorized embedding on each target word segment in the target word segmentation sequence to obtain a feature vector corresponding to each target word segment, and generating the feature vector sequence corresponding to the target analysis sentence.
3. The method according to claim 2, wherein inputting the feature vector sequence corresponding to the target analysis sentence into the shared layer of the preset semantic recognition model to obtain the first sequential output sequence corresponding to the target analysis sentence comprises:
sequentially inputting the feature vector corresponding to each target word segment in the feature vector sequence of the target analysis sentence into a first bidirectional long short-term memory network layer, to obtain a first forward feature matrix and a first backward feature matrix corresponding to the feature vector of each target word segment;
concatenating the first forward feature matrix and the first backward feature matrix corresponding to the feature vector of each target word segment, to obtain a first hidden-layer state vector corresponding to each target word segment;
generating the first sequential output sequence corresponding to the target analysis sentence from the first hidden-layer state vector corresponding to each target word segment.
4. The method according to claim 3, wherein the information identification layer comprises a second bidirectional long short-term memory network layer, a fully connected layer and a normalization layer, and classifying the first sequential output sequence through the information identification layer to generate the slot position probability sequence corresponding to the target analysis sentence comprises:
sequentially inputting the first hidden-layer state vector corresponding to each target word segment in the first sequential output sequence of the target analysis sentence into the second bidirectional long short-term memory network layer, to obtain a second forward feature matrix and a second backward feature matrix corresponding to the first hidden-layer state vector of each target word segment;
concatenating the second forward feature matrix and the second backward feature matrix corresponding to the first hidden-layer state vector of each target word segment, to obtain a second hidden-layer state vector corresponding to each target word segment;
classifying the second hidden-layer state vector corresponding to each target word segment through the fully connected layer, and normalizing the result through the normalization layer, to obtain a confidence vector that each target word segment is slot position information;
generating the slot position probability sequence corresponding to the target analysis sentence from the confidence vector that each target word segment is slot position information.
5. The method according to claim 4, wherein predicting the intent and the slot position information sequence corresponding to the target analysis sentence based on the first sequential output sequence and the slot position probability sequence comprises:
concatenating the first sequential output sequence and the slot position probability sequence, to obtain a second sequential output sequence corresponding to the target analysis sentence;
inputting the second sequential output sequence into an intent recognition layer of the semantic recognition model, to obtain the intent corresponding to the target analysis sentence;
inputting the second sequential output sequence into a slot position filling layer of the semantic recognition model, to obtain the slot position information sequence corresponding to the target analysis sentence.
6. The method according to claim 5, wherein concatenating the first sequential output sequence and the slot position probability sequence to obtain the second sequential output sequence corresponding to the target analysis sentence comprises:
concatenating the confidence vector and the first hidden-layer state vector corresponding to each target word segment, to obtain a sequential output vector corresponding to each target word segment;
generating the second sequential output sequence from the sequential output vector corresponding to each target word segment.
7. The method according to claim 5, wherein inputting the second sequential output sequence into the intent recognition layer of the semantic recognition model to obtain the intent corresponding to the target analysis sentence comprises:
determining an intent context vector corresponding to the target analysis sentence from the second sequential output sequence;
determining, through a weight matrix and a softmax function, the probability that the target analysis sentence corresponds to each intent category in a preset intent classification sequence, based on a final output state of the second sequential output sequence and the intent context vector;
determining the highest-scoring intent category in the intent classification sequence as the intent corresponding to the target analysis sentence.
8. The method according to claim 5, wherein inputting the second sequential output sequence into the slot position filling layer of the semantic recognition model to obtain the slot position information sequence corresponding to the target analysis sentence comprises:
determining an intent context vector and a slot position context vector corresponding to the target analysis sentence based on the second sequential output sequence;
determining a weighted feature value corresponding to the target analysis sentence based on the intent context vector and the slot position context vector;
determining, through a weight matrix and a softmax function, a slot position label of each sequential output vector in the second sequential output sequence, based on the slot position context vector, the weighted feature value and the second sequential output sequence;
generating the slot position information sequence corresponding to the target analysis sentence from the slot position label of each sequential output vector in the second sequential output sequence.
9. The method according to claim 5, wherein inputting the second sequential output sequence into the slot position filling layer of the semantic recognition model to obtain the slot position information sequence corresponding to the target analysis sentence comprises:
determining an intent context vector and a slot position context vector corresponding to the target analysis sentence based on the second sequential output sequence;
determining a weighted feature value corresponding to the target analysis sentence based on the intent context vector and the slot position context vector;
determining, through a weight matrix and a softmax function, a slot position label of each sequential output vector in the second sequential output sequence, based on the weighted feature value and the second sequential output sequence;
generating the slot position information sequence corresponding to the target analysis sentence from the slot position label of each sequential output vector in the second sequential output sequence.
10. A semantic recognition model training method, characterized by comprising:
obtaining a training sample set, wherein the training sample set includes a plurality of sample pairs, each sample pair including a sample analysis sentence and a corresponding training label, the training label including an intent label, a slot position information sequence label and a slot position probability sequence label corresponding to the sample analysis sentence;
training a semantic recognition model through the training sample set to obtain the semantic recognition model, the semantic recognition model associating the sample analysis sentence with the training label.
11. The method according to claim 10, wherein training the semantic recognition model through the training sample set comprises:
obtaining a prediction result and a training label corresponding to the sample analysis sentence of the sample pair;
determining a cross entropy between the prediction result corresponding to the sample analysis sentence and the training label, and obtaining a loss function of the semantic recognition model from the cross entropy;
adjusting parameters of the semantic recognition model according to the loss function until the semantic recognition model meets a training stop condition.
12. The method according to claim 11, wherein obtaining the prediction result corresponding to the sample analysis sentence of the sample pair comprises:
obtaining a predicted intent corresponding to the sample analysis sentence of the sample pair;
obtaining a predicted slot position information sequence corresponding to the sample analysis sentence of the sample pair;
obtaining a confidence vector of each sample word segment in the sample analysis sentence;
and wherein determining the cross entropy between the prediction result corresponding to the sample analysis sentence and the training label comprises:
determining a cross entropy between the predicted intent of the sample analysis sentence of the sample pair and the corresponding intent label;
determining a cross entropy between the predicted slot position information sequence of the sample analysis sentence of the sample pair and the corresponding slot position information sequence label;
determining a cross entropy between the confidence vector of each sample word segment in the sample analysis sentence and the slot position probability sequence label.
13. A semantic recognition device, characterized by comprising:
a sentence embedding module, configured to obtain a feature vector sequence corresponding to a target analysis sentence of a user;
a sharing module, configured to input the feature vector sequence corresponding to the target analysis sentence into a shared layer of a preset semantic recognition model, to obtain a first sequential output sequence corresponding to the target analysis sentence;
an information identification module, configured to input the first sequential output sequence into an information identification layer of the semantic recognition model, classify the first sequential output sequence through the information identification layer, and generate a slot position probability sequence corresponding to the target analysis sentence;
a semantic analysis module, configured to predict an intent and a slot position information sequence corresponding to the target analysis sentence based on the first sequential output sequence and the slot position probability sequence.
14. A semantic recognition model training apparatus, characterized by comprising:
a sample acquisition module, configured to obtain a training sample set, wherein the training sample set includes a plurality of sample pairs, each sample pair including a sample analysis sentence and a corresponding training label, the training label including an intent label, a slot position information sequence label and a slot position probability sequence label corresponding to the sample analysis sentence;
a model training module, configured to train a semantic recognition model through the training sample set to obtain the semantic recognition model, the semantic recognition model associating the sample analysis sentence with the training label.
15. A computing device, comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, wherein the processor, when executing the instructions, implements the steps of the method according to any one of claims 1-9 or 10-12.
16. A computer-readable storage medium storing computer instructions, wherein the instructions, when executed by a processor, implement the steps of the method according to any one of claims 1-9 or 10-12.
CN201910615990.2A 2019-07-09 2019-07-09 Semantic recognition method and device Active CN110309514B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910615990.2A CN110309514B (en) 2019-07-09 2019-07-09 Semantic recognition method and device

Publications (2)

Publication Number Publication Date
CN110309514A (en) 2019-10-08
CN110309514B CN110309514B (en) 2023-07-11

Family

ID=68079922

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910615990.2A Active CN110309514B (en) 2019-07-09 2019-07-09 Semantic recognition method and device

Country Status (1)

Country Link
CN (1) CN110309514B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110093459A1 (en) * 2009-10-15 2011-04-21 Yahoo! Inc. Incorporating Recency in Network Search Using Machine Learning
US20180157638A1 (en) * 2016-12-02 2018-06-07 Microsoft Technology Licensing, Llc Joint language understanding and dialogue management
CN107783960A (en) * 2017-10-23 2018-03-09 百度在线网络技术(北京)有限公司 Method, apparatus and device for extracting information
CN107766559A (en) * 2017-11-06 2018-03-06 第四范式(北京)技术有限公司 Training method, training device, dialogue method and dialogue system of a dialog model
CN107832476A (en) * 2017-12-01 2018-03-23 北京百度网讯科技有限公司 Method, apparatus, device and storage medium for understanding a search sequence
CN108932342A (en) * 2018-07-18 2018-12-04 腾讯科技(深圳)有限公司 Semantic matching method, model learning method and server
CN109086814A (en) * 2018-07-23 2018-12-25 腾讯科技(深圳)有限公司 Data processing method and device, and network device
CN109697679A (en) * 2018-12-27 2019-04-30 厦门智融合科技有限公司 Intellectual property service guidance method and system
CN109933799A (en) * 2019-03-22 2019-06-25 北京金山数字娱乐科技有限公司 Sentence splicing method and device

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
CHIH-WEN GOO et al.: "Slot-Gated Modeling for Joint Slot Filling and Intent Prediction", Proceedings of NAACL-HLT 2018 *
JIAWEI SHAN et al.: "A Neural Framework for Joint Prediction on Intent Identification and Slot Filling", ICCC 2019 *
HUA Bingtao; YUAN Zhixiang; XIAO Weimin; ZHENG Xiao: "Slot Filling and Intent Recognition Based on the BLSTM-CNN-CRF Model", Computer Engineering and Applications *
XU Zixiang et al.: "Semantic Slot Recognition Based on the Bi-LSTM-CRF Network", Intelligent Computer and Applications *
WANG Hengsheng et al.: "Enhanced Constrained Word Vectors Based on Domain Knowledge", Journal of Chinese Information Processing *
GAO Yan et al.: "Application of the Maximum Entropy Model to Longest Location Entity Recognition", Journal of Guangdong University of Petrochemical Technology *

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110909541A (en) * 2019-11-08 2020-03-24 杭州依图医疗技术有限公司 Instruction generation method, system, device and medium
US11182648B2 (en) 2019-12-18 2021-11-23 Beijing Baidu Netcom Science And Technology Co., Ltd. End-to-end model training method and apparatus, and non-transitory computer-readable medium
CN111079945A (en) * 2019-12-18 2020-04-28 北京百度网讯科技有限公司 End-to-end model training method and device
CN111177381A (en) * 2019-12-21 2020-05-19 深圳市傲立科技有限公司 Slot filling and intention detection joint modeling method based on context vector feedback
CN111159546A (en) * 2019-12-24 2020-05-15 腾讯科技(深圳)有限公司 Event pushing method and device, computer readable storage medium and computer equipment
CN111159546B (en) * 2019-12-24 2023-10-24 深圳市雅阅科技有限公司 Event pushing method, event pushing device, computer readable storage medium and computer equipment
CN111144127A (en) * 2019-12-25 2020-05-12 科大讯飞股份有限公司 Text semantic recognition method and model acquisition method thereof and related device
CN111177358A (en) * 2019-12-31 2020-05-19 华为技术有限公司 Intention recognition method, server, and storage medium
CN111177358B (en) * 2019-12-31 2023-05-12 华为技术有限公司 Intention recognition method, server and storage medium
CN111339770A (en) * 2020-02-18 2020-06-26 百度在线网络技术(北京)有限公司 Method and apparatus for outputting information
CN111339770B (en) * 2020-02-18 2023-07-21 百度在线网络技术(北京)有限公司 Method and device for outputting information
CN111310707A (en) * 2020-02-28 2020-06-19 山东大学 Skeleton-based method and system for recognizing attention network actions
CN111310707B (en) * 2020-02-28 2023-06-20 山东大学 Bone-based graph annotation meaning network action recognition method and system
WO2021190259A1 (en) * 2020-03-23 2021-09-30 华为技术有限公司 Slot identification method and electronic device
CN112632962B (en) * 2020-05-20 2023-11-17 华为技术有限公司 Method and device for realizing natural language understanding in man-machine interaction system
CN112632962A (en) * 2020-05-20 2021-04-09 华为技术有限公司 Method and device for realizing natural language understanding in human-computer interaction system
CN111651988B (en) * 2020-06-03 2023-05-19 北京百度网讯科技有限公司 Method, apparatus, device and storage medium for training model
CN111651988A (en) * 2020-06-03 2020-09-11 北京百度网讯科技有限公司 Method, apparatus, device and storage medium for training a model
CN113779975A (en) * 2020-06-10 2021-12-10 北京猎户星空科技有限公司 Semantic recognition method, device, equipment and medium
CN113779975B (en) * 2020-06-10 2024-03-01 北京猎户星空科技有限公司 Semantic recognition method, device, equipment and medium
CN112364659A (en) * 2020-07-08 2021-02-12 西湖大学 Unsupervised semantic representation automatic identification method and unsupervised semantic representation automatic identification device
CN112364659B (en) * 2020-07-08 2024-05-03 西湖大学 Automatic identification method and device for unsupervised semantic representation
CN113971399A (en) * 2020-07-23 2022-01-25 北京金山数字娱乐科技有限公司 Training method and device for recognition model and text recognition method and device
CN111988294B (en) * 2020-08-10 2022-04-12 中国平安人寿保险股份有限公司 User identity recognition method, device, terminal and medium based on artificial intelligence
CN111988294A (en) * 2020-08-10 2020-11-24 中国平安人寿保险股份有限公司 User identity recognition method, device, terminal and medium based on artificial intelligence
CN111985249A (en) * 2020-09-03 2020-11-24 贝壳技术有限公司 Semantic analysis method and device, computer-readable storage medium and electronic equipment
CN112149736A (en) * 2020-09-22 2020-12-29 腾讯科技(深圳)有限公司 Data processing method, device, server and medium
CN112149736B (en) * 2020-09-22 2024-02-09 腾讯科技(深圳)有限公司 Data processing method, device, server and medium
CN112380327A (en) * 2020-11-09 2021-02-19 天翼爱音乐文化科技有限公司 Cold-start slot filling method, system, device and storage medium
CN113642654A (en) * 2021-08-16 2021-11-12 北京百度网讯科技有限公司 Image feature fusion method and device, electronic equipment and storage medium
CN113705230B (en) * 2021-08-31 2023-08-25 中国平安财产保险股份有限公司 Method, device, equipment and medium for evaluating policy specifications based on artificial intelligence
CN113705230A (en) * 2021-08-31 2021-11-26 中国平安财产保险股份有限公司 Artificial intelligence-based policy agreement assessment method, device, equipment and medium
CN115687625A (en) * 2022-11-14 2023-02-03 五邑大学 Text classification method, device, equipment and medium
CN115687625B (en) * 2022-11-14 2024-01-09 五邑大学 Text classification method, device, equipment and medium

Also Published As

Publication number Publication date
CN110309514B (en) 2023-07-11

Similar Documents

Publication Publication Date Title
CN110309514A (en) A kind of method for recognizing semantics and device
CN110427463B (en) Search statement response method and device, server and storage medium
CN110442718B (en) Statement processing method and device, server and storage medium
CN111950269A (en) Text statement processing method and device, computer equipment and storage medium
CN111738003B (en) Named entity recognition model training method, named entity recognition method and medium
CN110781663B (en) Training method and device of text analysis model, text analysis method and device
CN110532554A (en) A kind of Chinese abstraction generating method, system and storage medium
CN111738007B (en) Chinese named entity identification data enhancement algorithm based on sequence generation countermeasure network
CN110222178A (en) Text sentiment classification method, device, electronic equipment and readable storage medium storing program for executing
CN108038492A (en) A kind of perceptual term vector and sensibility classification method based on deep learning
CN112883714B (en) ABSC task syntactic constraint method based on dependency graph convolution and transfer learning
CN111143569A (en) Data processing method and device and computer readable storage medium
CN111104509B (en) Entity relationship classification method based on probability distribution self-adaption
CN111831826B (en) Training method, classification method and device of cross-domain text classification model
CN110232123A (en) The sentiment analysis method and device thereof of text calculate equipment and readable medium
CN111666376B (en) Answer generation method and device based on paragraph boundary scan prediction and word shift distance cluster matching
CN110334186A (en) Data query method, apparatus, computer equipment and computer readable storage medium
CN110163376A (en) Sample testing method, the recognition methods of media object, device, terminal and medium
CN115952292B (en) Multi-label classification method, apparatus and computer readable medium
CN114510570A (en) Intention classification method and device based on small sample corpus and computer equipment
CN113947086A (en) Sample data generation method, training method, corpus generation method and apparatus
CN110569355B (en) Viewpoint target extraction and target emotion classification combined method and system based on word blocks
CN113934835B (en) Retrieval type reply dialogue method and system combining keywords and semantic understanding representation
CN114444515A (en) Relation extraction method based on entity semantic fusion
CN110222737A (en) A kind of search engine user satisfaction assessment method based on long memory network in short-term

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant