WO2019079922A1 - Session information processing method and device therefor, and storage medium - Google Patents

Session information processing method and device therefor, and storage medium

Info

Publication number
WO2019079922A1
Authority
WO
WIPO (PCT)
Prior art keywords
statement
analyzed
sentence
category
word
Prior art date
Application number
PCT/CN2017/107269
Other languages
English (en)
French (fr)
Inventor
舒悦
林芬
Original Assignee
Tencent Technology (Shenzhen) Company Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Company Limited
Priority to PCT/CN2017/107269 (WO2019079922A1)
Priority to CN201780054093.8A (CN109964223B)
Publication of WO2019079922A1
Priority to US16/670,822 (US10971141B2)

Links

Images

Classifications

    • G06F40/274 Converting codes to words; Guess-ahead of partial word inputs
    • G06F40/20 Natural language analysis
    • G06F40/30 Semantic analysis
    • G10L15/05 Word boundary detection
    • G10L15/1822 Parsing for meaning understanding
    • G10L15/197 Probabilistic grammars, e.g. word n-grams
    • G10L19/135 Vector sum excited linear prediction [VSELP]

Definitions

  • the present application relates to the field of artificial intelligence, and in particular, to a session information processing method and apparatus, and a storage medium.
  • Artificial intelligence (AI) technology refers to technology that uses modern tools such as computers to simulate human thinking and actions. With its continuing progress, AI technology has been applied to many aspects of production and daily life, for example, dialogue systems.
  • the AI dialogue robot performs semantic analysis on the statements in the session and responds accordingly.
  • the application example provides a session information processing method and device, and a storage medium.
  • the session information processing method provided by the examples of the present application includes: extracting, from a session, a statement to be analyzed and a preset number of preceding statements of the statement to be analyzed; performing word segmentation on the statement to be analyzed and the preceding statements to obtain a first feature set including a plurality of first features; extracting, from a first word set corresponding to the statement to be analyzed and a second word set corresponding to the preceding statements, a second feature set including one or more second features; and determining, according to the first feature set and the second feature set, the statement category to which the statement to be analyzed belongs;
  • a second feature includes a phrase or sentence composed of a first word and a second word, the first word being one or more words in the first word set, and the second word being one or more words in the second word set;
  • the statement categories include a first category, indicating that a statement is complete and semantically unambiguous, and a second category, indicating that a statement is incomplete or semantically ambiguous.
  • the session information processing method provided by the examples of the present application further includes: receiving a statement in a session as the statement to be analyzed, and determining its category using the method above;
  • if the statement to be analyzed belongs to the first category, the semantics of the statement to be analyzed are analyzed according to the statement itself to obtain its semantics;
  • if the statement to be analyzed belongs to the second category, the semantics of the statement to be analyzed are analyzed according to the statement to be analyzed and the preset number of preceding statements to obtain its semantics;
  • analyzing the semantics of the statement to be analyzed according to the statement to be analyzed and the preset number of preceding statements includes: completing the statement to be analyzed according to the preceding statements to obtain a completion statement belonging to the first category, and performing semantic analysis on the completion statement;
  • completing the statement to be analyzed according to the preceding statements includes: using a mapping model to determine the completion statement corresponding to the statement to be analyzed;
  • the mapping model is pre-built using an autoencoder; the input information of the mapping model is the statement to be analyzed and the preceding statements, and the output information of the mapping model is the completion statement.
  • performing the corresponding operation according to the semantics of the statement to be analyzed includes:
  • according to the semantics of the statement to be analyzed, using a slot system to set corresponding slots; returning information-acquisition statements to the user for the statement to be analyzed, to obtain the information required by the slots; and performing the corresponding operation according to the information required by the slots.
  • the session information processing method provided by the example of the present application is executed by an electronic device, and the method includes:
  • a second feature set including one or more second features is extracted from a first word set corresponding to the statement to be analyzed and a second word set corresponding to the preset number of preceding statements;
  • a second feature includes a phrase or sentence composed of a first word and a second word, the first word being one or more words in the first word set, and the second word being one or more words in the second word set;
  • the statement categories include a first category, indicating that a statement is complete and semantically unambiguous, and a second category, indicating that a statement is incomplete or semantically ambiguous.
  • the session information processing method provided by the example of the present application is executed by an electronic device, and the method includes:
  • receiving a statement in a session as the statement to be analyzed, and determining its category using the method above; if the statement to be analyzed belongs to the first category, analyzing its semantics according to the statement itself to obtain its semantics;
  • if the statement to be analyzed belongs to the second category, analyzing the semantics of the statement to be analyzed according to the statement to be analyzed and the preset number of preceding statements to obtain its semantics; and performing a corresponding operation according to the semantics of the statement to be analyzed.
  • the session information processing apparatus includes:
  • one or more memories and one or more processors; wherein
  • the one or more memories store one or more instruction modules configured to be executed by the one or more processors;
  • the one or more instruction units include:
  • a first extracting unit, extracting, from a session, a statement to be analyzed and a preset number of preceding statements of the statement to be analyzed;
  • a word segmentation unit, performing word segmentation on the statement to be analyzed and the preset number of preceding statements to obtain a first feature set including a plurality of first features;
  • a second extracting unit, extracting a second feature set including one or more second features from the first word set corresponding to the statement to be analyzed and the second word set corresponding to the preset number of preceding statements;
  • a second feature includes a phrase or sentence composed of a first word and a second word, the first word being one or more words in the first word set, and the second word being one or more words in the second word set;
  • the statement categories include a first category, indicating that a statement is complete and semantically unambiguous, and a second category, indicating that a statement is incomplete or semantically ambiguous.
  • the session information processing apparatus includes:
  • one or more memories and one or more processors; wherein
  • the one or more memories store one or more instruction modules configured to be executed by the one or more processors;
  • the one or more instruction modules include:
  • a receiving module, receiving a statement in a session as the statement to be analyzed;
  • a determining module, including the one or more instruction units in the above apparatus, to determine the category of the statement to be analyzed;
  • a first analysis module, analyzing, when the statement to be analyzed belongs to the first category, the semantics of the statement to be analyzed according to the statement itself to obtain its semantics;
  • a second analysis module, analyzing, when the statement to be analyzed belongs to the second category, the semantics of the statement to be analyzed according to the statement to be analyzed and the preset number of preceding statements to obtain its semantics; and
  • an execution module, performing a corresponding operation according to the semantics of the statement to be analyzed.
  • the non-transitory computer readable storage medium provided by the examples of the present application stores a computer program which, when executed by a processor, implements the steps described above.
  • FIG. 1 is a system architecture diagram related to an example of the present application
  • FIG. 2 is a schematic flowchart of a method for processing session information in an example of the present application
  • FIG. 3 is a structural block diagram of a session information processing apparatus in an example of the present application.
  • FIG. 4 is a schematic diagram of an interface of a session in an example of the present application.
  • FIG. 5 is a schematic diagram of an interface of a session in an example of the present application.
  • FIG. 6 is a usage scene diagram of a smart speaker in an example of the present application.
  • FIG. 7 is a structural block diagram of a session information processing apparatus in an example of the present application.
  • FIG. 8 is a structural block diagram of a computer device in an example of the present application.
  • the present application proposes a session information processing method, and the system architecture applicable to the method is as shown in FIG. 1 .
  • the system architecture includes a client device (eg, 101a-101d in FIG. 1) and a server 102, and the client device and server 102 are connected by a communication network 103, wherein:
  • the client device may be a user's computer 101a or smartphone 101b, on which clients of various application software are installed; the user can log in to and use these clients through the client device;
  • the clients may include shopping clients with artificial intelligence functions, for example, Tmall (with the AI tool AliMe, 阿里小蜜) and JD.com (with the intelligent customer service robot JIMI), and may also include social clients, for example,
  • WeChat (the official account of the robot monk Xian'er, 贤二机器僧);
  • the clients of the application software may also be other clients capable of providing artificial intelligence services.
  • the client device may also be a hardware device such as a smart speaker 101c (for example, the Wenwen speaker, which can look up tickets and takeout orders; the Tmall Genie, which supports voice shopping; or the Xiaomi AI speaker, which can query the weather) or an intelligent robot 101d (for example, the robot monk Xian'er).
  • the server 102 may be a server or a server cluster, and corresponding to the client installed on the client device, may provide corresponding services for the client device.
  • the above communication network 103 may be a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), a mobile network, a wired or wireless network, a private network, or the like.
  • the client device can exchange data with the server 102 through the communication network 103.
  • the server 102 executes the session information processing method provided by the example of the present application, thereby enabling the user to enjoy the online artificial intelligence service. That is to say, when the above client device is in a networked state, the user can be provided with an online artificial intelligence service.
  • the session information processing method provided by the example of the present application may be performed not only by the server in the above scenario, but also by a hardware device that directly faces the user and can provide the artificial intelligence service offline.
  • a background server that can execute the session information processing method provided by the example of the present application or a hardware device that directly faces the user and can provide the artificial intelligence service offline is collectively referred to as an electronic device.
  • the examples of the present application provide a session information processing method, which may be performed by the foregoing electronic device, as shown in FIG. 2; the method specifically includes:
  • a preceding statement refers to a statement that appears before the statement to be analyzed in the session; to better understand the statement to be analyzed, several preceding statements adjacent to it may be selected.
  • the preset number can be selected as needed, for example, 2.
  • the N-gram algorithm can be used for word segmentation, and the obtained words can be called N-gram features.
  • the specific word segmentation process includes: splicing the statement to be analyzed and the preset number of preceding statements according to the order of the statements in the session, to obtain a spliced statement, where adjacent preceding statements are separated by a first character and the statement to be analyzed is separated from the preceding statement adjacent to it by a second character; and extracting N-gram features from the spliced statement, the N-gram features forming the first feature set as the first features, where N is a preset integer.
  • for example, N is 3, the first character is _EOS_, and the second character is _END_, which distinguishes the preceding statements from the statement to be analyzed.
  • the spliced statement obtained by splicing the dialog provided in step S201 is "白马寺_EOS_很悠久的寺院_END_去过吗" ("White Horse Temple_EOS_a very old temple_END_been there?"). Word segmentation of the spliced statement then yields:
  • 1_gram: {白, 马, 寺, _EOS_, 很, 悠, 久, 的, 寺, 院, _END_, 去, 过, 吗};
  • 2_gram: {白马, 马寺, 寺_EOS_, _EOS_很, 很悠, 悠久, 久的, 的寺, 寺院, 院_END_, _END_去, 去过, 过吗};
  • 3_gram: {白马寺, 马寺_EOS_, 寺_EOS_很, _EOS_很悠, 很悠久, 悠久的, 久的寺, 的寺院, 寺院_END_, 院_END_去, _END_去过, 去过吗};
  • each element of the above 1_gram, 2_gram, and 3_gram sets is referred to as an N-gram feature, and 1_gram, 2_gram, and 3_gram together form the first feature set.
  • since the number of first features in the first feature set is large, the first features may be numbered with natural numbers, for example, "白 1", "马 2", and so on, thereby distinguishing different first features.
  • S203 Extract a second feature set including one or more second features from a first set of words corresponding to the to-be-analyzed statement and a second set of words corresponding to the preset number of the foregoing statements;
  • a second feature includes a phrase or a sentence composed of a first word and a second word, the first word being one or more words in the first word set, and the second word being one or more words in the second word set;
  • each second feature includes one or more words in the first word set, and also includes one or more words in the second word set. That is to say, the second feature includes the words in the sentence to be analyzed, and also includes the words in the above sentence, so the second feature can also be referred to as a cross feature.
  • the significance of this step is that meaningful semantic fragments can be extracted.
  • the formation of the first word set and the second word set may follow the formation of the first feature set. Specifically, the statement to be analyzed is segmented using the N-gram algorithm to obtain N-gram features (which may be called first words), and these form the first word set; similarly, each of the preset number of preceding statements can be segmented using the N-gram algorithm to obtain N-gram features (which may be called second words), and the N-gram features obtained from segmenting the preset number of preceding statements form the second word set.
  • for the dialog provided in step S201, the extracted second features may include: 去过_白马寺 (been-there_White-Horse-Temple), 去过_悠久 (been-there_old), and 去过_寺院 (been-there_temple); these second features form the second feature set.
  • S204: Determine, according to the first feature set and the second feature set, the statement category to which the statement to be analyzed belongs; the statement categories include a first category, indicating that a statement is complete and semantically unambiguous, and a second category, indicating that a statement is incomplete or semantically ambiguous.
  • the first category refers to statements that are complete and semantically unambiguous, and the second category refers to statements that are incomplete or semantically ambiguous.
  • statement categories are not limited to the above two; for example, there may also be a third category, whose statements indicate the end of the session.
  • for example, in the dialog "A3: I am going to sleep. B3: Have a good rest. A3: Good night.", the statement "Good night" belongs to the third category.
  • in a dialogue system, when performing semantic analysis, the statements are first classified, and then semantic analysis is performed in different ways for statements of different categories.
  • for statements of the first category, intent inheritance is not required and analysis based on the statement to be analyzed alone suffices, whereas statements of the second category require intent inheritance, that is, the statement to be analyzed must be analyzed in combination with the preceding statements.
  • for statements of the third category, semantic analysis may be skipped.
  • there are multiple ways to determine, in step S204, the statement category to which the statement to be analyzed belongs; one of them includes the following process:
  • S2041 Encoding each of the first feature set and the second feature set to obtain a first vector corresponding to the feature, where the number of elements in each first vector is the same;
  • step S2041 is actually a process of feature vectorization, and the feature is converted into a vector, thereby facilitating the execution of subsequent calculation steps.
  • encoding one feature yields one first vector; the length of each first vector (i.e., the number of elements) is the same and can be preset as needed, for example, each feature may be encoded into a first vector with a vector length of 100 dimensions, meaning that the first vector includes 100 elements.
  • each of the first features or the second features may be converted into a fixed length first vector by using an embedding matrix.
  • S2042 Determine a second vector according to each of the first vectors; the second vector is a vector indicating the to-be-analyzed statement and the preset number of the above statements;
  • since some first vectors are encoded from the statement to be analyzed and others from the preceding statements, and the second vector is determined from all the first vectors, the second vector can represent the statement to be analyzed and the preset number of preceding statements; it may therefore also be called the representation vector of the statement to be analyzed and the preceding statements.
  • there are several ways of determining the second vector; one of them is: determining the average vector of the first vectors and using the average vector as the second vector, where each element of the average vector is the ratio between the sum of the elements at the corresponding position in the first vectors and the number of first vectors. For example, if there are three first vectors with two elements each, namely (0.3, 0.7), (0.1, 0.9), and (0.4, 0.6), then the second vector is (0.27, 0.73). If the second vector is calculated in this way, the number of elements in the second vector is the same as in each first vector.
  • vector averaging is only one way to determine the second vector; a long short-term memory network (LSTM) or a convolutional neural network (CNN) can also be used to process the first vectors and obtain a second vector representing the preceding statements and the statement to be analyzed. If the second vector is calculated with an LSTM or CNN, its number of elements may be the same as or different from the number of elements in the first vectors.
  • S2043 Input the second vector into a preset classifier to obtain a matching degree between the sentence to be analyzed and each sentence category;
  • the above preset classifier can be selected as needed, for example, a softmax classifier, a support vector machine (SVM), or the like.
  • the degree of matching between the above-mentioned sentence to be analyzed and each sentence category can be understood as the probability that the sentence to be analyzed belongs to each sentence category.
  • the degree of matching between the sentence to be analyzed and each sentence category may be output by the softmax classifier in step S2043.
  • for example, if the second vector computed by the LSTM or CNN has the same number of elements as there are statement categories, the second vector is input into a softmax classifier, which outputs the degree of matching between the statement to be analyzed and each statement category using formula (1) below:
  • P_i = exp(y_i) / (exp(y_0) + exp(y_1) + ... + exp(y_n))    (1)
  • where y_0, ..., y_n are the elements of the second vector.
  • as can be seen from formula (1), the number of elements in the vector output by the softmax classifier is the same as the number of elements in the input vector; therefore, when the softmax classifier is used to determine the degree of matching between the statement to be analyzed and each statement category, the number of elements in the input vector needs to equal the number of statement categories. If another classifier is used to determine the degree of matching, the number of elements in the input vector does not need to equal the number of statement categories.
  • in practice, to increase the generalization ability of the session information processing method and improve its nonlinear expressive power, the second vector may be nonlinearly transformed before being input into the classifier. Specifically, the second vector is input into a transformation model comprising a second preset number of transformation functions, each of which can nonlinearly transform its input data; if the second preset number is greater than or equal to 2, then for two adjacent transformation functions, the output data of the former is the input data of the latter.
  • the output data of the transformation model is then input into the preset classifier to obtain the degree of matching between the statement to be analyzed and each statement category.
  • in this case, the preset number of preceding statements may be referred to as the first preset number.
  • the second predetermined number of transformation functions are sequentially connected.
  • the second predetermined number of transform functions is actually a second predetermined number of hidden layers, and each hidden layer has a transform function.
  • there are various transformation functions that can implement a nonlinear transformation, for example, the sigmoid function.
  • the transformation function can also perform a certain linear transformation on the input data before nonlinear transformation of the input data.
  • the i-th transformation function can take the following form (formula (2)):
  • h_i = f_2(W_i * h_{i-1} + b_i)    (2)
  • where f_2(x_2) = 1/(1 + exp(-x_2)) and x_2 = W_i * h_{i-1} + b_i; W_i is a weight coefficient, b_i is a bias coefficient, h_{i-1} is the output data of the (i-1)-th transformation function, and h_0 is the second vector.
  • in formula (2), the input data h_{i-1} is first linearly transformed by W_i * h_{i-1} + b_i and then nonlinearly transformed by 1/(1 + exp(-x_2)).
  • repeated experiments show that the generalization ability with a second preset number of 2 is better than with 1, while a second preset number greater than 2 increases the computation considerably without much further improvement in generalization over 2; therefore the second preset number may be chosen as 2, although it can of course be set to other values.
  • in practice, the vector to be input into the preset classifier (for example, the second vector or the output vector of the transformation model) may first be passed through a fully connected layer, whose role is to convert a vector of any length into a vector of a preset length; the fully connected layer can thus convert the vector to be input into the classifier into a vector whose length equals the number of statement categories. For example, if there are 3 statement categories, the fully connected layer converts the vector into one with three elements, which is then input into the preset classifier to perform classification. Here, the dimension conversion of the vector is realized by the conversion function in the fully connected layer.
  • the statement category with the highest degree of matching among the degrees of matching between the statement to be analyzed and the respective statement categories may be taken as the statement category to which the statement to be analyzed belongs.
  • for example, if the preset classifier outputs the vector (0.3, 0.7), the first element is the degree of matching between the statement to be analyzed and the first category, and the second element is the degree of matching with the second category; since 0.7 is greater than 0.3, the statement to be analyzed belongs to the second category.
  • of course, the vector output in step S2043 is not limited to two elements; the number of elements matches the number of categories.
  • the session information processing method provided by the examples of the present application is in effect a supervised multi-layer neural network model that determines the category to which the statement to be analyzed belongs.
  • because classification relies not only on the statement to be analyzed but also on its preceding statements, the context information is richer than in methods that classify the statement alone, which can improve classification accuracy.
  • moreover, the method does not require constructing classification rules, thereby reducing or avoiding the low recall caused by incomplete coverage of classification rules.
  • in short, the method provided by the examples of the present application can improve the accuracy and recall of determining the category to which the statement to be analyzed belongs.
  • in some examples, because the number of first features in the first feature set formed in step S202 is very large, and grows as N grows, certain measures can be taken to reduce the number of features before encoding them into first vectors.
  • one way to reduce the number of features is to use a hash function to discretize the massive set of first features into a finite number of hash buckets; each bucket corresponds to one hash value, and first features with the same hash value can be treated as one feature, thereby reducing the number of features.
  • the specific process may include: inputting each first feature into a preset hash function to obtain the hash value corresponding to that first feature, the hash function mapping an input feature to an integer in a preset interval.
  • in this way, the first features with the same hash value can be encoded as one feature to obtain one corresponding first vector. For example, the first features "白" ("white") and "白马" ("white horse") are both assigned by the hash function to the bucket numbered 1, so "白" and "白马" can be treated as one feature.
  • the hash function can use the following formula (3):
  • f_1(x_1) = x_1 mod n    (3)
  • where x_1 is the input feature of the hash function, and f_1(x_1) is the hash value, an integer in [0, n-1].
  • the n in formula (3) can be chosen as needed; the larger n is, the more hash buckets there are. For example, when n = 10, each first feature can be mapped to an integer between 0 and 9; that is, the first features can be divided into 10 hash buckets numbered 0 to 9.
  • the present application further provides a session information processing device, which may be any electronic device that executes the above-described session information processing method.
  • the device 300 includes:
  • one or more memories and one or more processors; wherein
  • the one or more memories store one or more instruction modules configured to be executed by the one or more processors;
  • the one or more instruction units include:
  • the first extracting unit 301 extracts, from a session, a statement to be analyzed and a preset number of preceding statements of the statement to be analyzed;
  • the word segmentation unit 302 performs word segmentation on the statement to be analyzed and the preset number of preceding statements to obtain a first feature set including a plurality of first features;
  • the second extracting unit 303 extracts a second feature set including one or more second features from the first word set corresponding to the statement to be analyzed and the second word set corresponding to the preset number of preceding statements; a second feature comprises a phrase or sentence composed of a first word and a second word, the first word being one or more words in the first word set and the second word being one or more words in the second word set; and
  • the determining unit 304 determines, according to the first feature set and the second feature set, the statement category to which the statement to be analyzed belongs; the statement categories include a first category, indicating that a statement is complete and semantically unambiguous, and a second category, indicating that a statement is incomplete or semantically ambiguous.
  • the word segmentation unit can specifically include:
  • a splicing subunit, splicing the statement to be analyzed and the preset number of preceding statements according to the order of the statements in the session to obtain a spliced statement, where adjacent preceding statements are separated by a first character and the statement to be analyzed is separated from the preceding statement adjacent to it by a second character; and
  • an extracting subunit, extracting N-gram features from the spliced statement, the N-gram features forming the first feature set as the first features, where N is a preset integer.
  • the determining unit can include:
  • an encoding subunit, encoding each feature in the first feature set and the second feature set to obtain the first vector corresponding to that feature, the number of elements in each first vector being the same;
  • a first determining subunit, determining a second vector according to the first vectors, the second vector being a vector representing the statement to be analyzed and the preset number of preceding statements;
  • an input subunit, inputting the second vector into a preset classifier to obtain the degree of matching between the statement to be analyzed and each statement category; and
  • a second determining subunit, determining, according to the degrees of matching, the statement category to which the statement to be analyzed belongs.
  • the above apparatus may further include:
  • a hash unit, configured to input each first feature into a preset hash function before the encoding subunit encodes each feature in the first feature set and the second feature set, to obtain the hash value corresponding to that first feature;
  • the hash function is capable of mapping an input feature to an integer in a preset interval; the encoding subunit specifically encodes first features with the same hash value as one feature to obtain one corresponding first vector.
  • the hash function can be:
  • f_1(x_1) = x_1 mod n
  • where x_1 is the input feature of the hash function, and f_1(x_1) is the hash value, an integer in [0, n-1].
  • in some examples, the preset number is a first preset number, and the determining unit may further include a transformation subunit, which, before the input subunit inputs the second vector into the preset classifier, inputs the second vector into a transformation model comprising a second preset number of transformation functions capable of nonlinearly transforming input data; if the second preset number is greater than or equal to 2, the output data of each transformation function is the input data of the next one, and the input subunit specifically inputs the output data of the transformation model into the preset classifier.
  • the i-th transformation function can be:
  • h_i = f_2(W_i * h_{i-1} + b_i)
  • where f_2(x_2) = 1/(1 + exp(-x_2)) and x_2 = W_i * h_{i-1} + b_i; W_i is a weight coefficient, b_i is a bias coefficient, h_{i-1} is the output data of the (i-1)-th transformation function, and h_0 is the second vector.
  • the first determining subunit specifically determines the average vector of the first vectors and uses the average vector as the second vector, where each element of the average vector is the ratio between the sum of the elements at the corresponding position in the first vectors and the number of first vectors.
  • the preset classifier may calculate the degree of matching between the statement to be analyzed and each statement category using the softmax formula of formula (1), where y_0, ..., y_n are the elements of the vector input into the preset classifier.
  • the second determining subunit specifically takes the statement category with the highest degree of matching among the degrees of matching between the statement to be analyzed and the respective statement categories as the statement category to which the statement to be analyzed belongs.
  • the foregoing session information processing method or apparatus can classify the statement to be analyzed to learn the category to which it belongs, and then further process the statement based on the classification result, for example, perform semantic analysis, or perform some operation on the basis of the semantic analysis. The examples of the present application therefore provide another session information processing method, which may include the following steps:
  • S301 Receive a statement in a session and use it as a statement to be analyzed.
  • the statement may be text input by the user, or may be obtained by converting voice data input by the user, or of course obtained in other forms.
  • for example, when the user queries the weather through a smart tool in a client on a mobile phone, the user can type "Beijing tomorrow weather" or input it by voice; the smart tool then queries the weather and replies with the retrieved weather information, such as "The weather is clear, with a north wind"...
  • as another example, in a conversation between the user and a robot, the user says "The Bodhi originally has no tree"; after receiving the voice, the robot converts it into a text statement, analyzes it, and replies "Nor is the bright mirror a stand".
  • if the statement to be analyzed is in the first category, the statement is complete and its semantics are unambiguous; therefore, its semantics can be obtained by analyzing the statement to be analyzed itself.
  • if the statement to be analyzed is in the second category, a completion technique may be used to complete the statement to be analyzed.
  • so-called completion extracts words or phrases from the preceding statements and adds them to the statement to be analyzed to form a fluent, unambiguous statement, thereby compressing a multi-round dialogue into a single-round statement. That is, the statement to be analyzed is completed according to the preceding statements to obtain a completion statement belonging to the first category; the completion statement can then be semantically analyzed to obtain the semantics of the statement to be analyzed.
  • the completion can be written as q' = F(C, q), where C is the preceding statements, that is, the previous rounds of dialogue; q is the statement to be analyzed; q' is the completion statement; and F is the constructed mapping model, which outputs the completion statement given the preceding statements and the statement to be analyzed.
  • the construction of the mapping model can be abstracted as a translation problem: multiple rounds of dialogue can be regarded as a complex language A, and the goal is to translate it into a language B that is simple and compact.
  • an autoencoder can be used, that is, an encoder-decoder architecture for end-to-end modeling, which can automatically generate a semantically complete statement from the preceding statements and the information-lacking statement to be analyzed.
  • the encoding part can use an LSTM to encode the input, and the decoding part can use an RNN to generate the output statement; to strengthen the influence of relevant words in the preceding statements on the generated output, an attention mechanism can be used to reinforce the influence of the key parts and weaken the influence of irrelevant information.
  • the specific modeling process may include: splicing the input preceding statements and the statement to be analyzed to obtain a spliced statement; using a bidirectional LSTM centered on each word to extract an embedding of the entire spliced sentence; computing attention weights and using them to form a weighted sum of the extracted embeddings, which yields a global vector representation focused on the currently important words; and feeding this vector, the decoder hidden state, and the last generated word together into the decoder to obtain the probability distribution of the next word to be generated over the vocabulary, taking the word with the highest probability as the output. This process is repeated until the end symbol is generated, at which point the completion statement is obtained and output.
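  • as a minimal sketch of this encoder-decoder wiring (all sizes, the vocabulary ids, and the greedy decoding loop below are illustrative assumptions rather than the patent's configuration, and a real model would be trained end to end), in PyTorch:

```python
import torch
import torch.nn as nn

vocab_size, emb_dim, hid = 8000, 128, 256          # placeholder sizes

embed   = nn.Embedding(vocab_size, emb_dim)
encoder = nn.LSTM(emb_dim, hid, bidirectional=True, batch_first=True)
decoder = nn.LSTMCell(emb_dim + 2 * hid, hid)
attn_w  = nn.Linear(hid, 2 * hid)                  # scores decoder state vs encoder states
out     = nn.Linear(hid, vocab_size)

def complete(input_ids, max_len=20, bos=1, eos=2):
    """input_ids: (1, T) token ids of the spliced statement (context + sentence)."""
    enc_out, _ = encoder(embed(input_ids))          # (1, T, 2*hid)
    h = torch.zeros(1, hid)
    c = torch.zeros(1, hid)
    token, result = torch.tensor([bos]), []
    for _ in range(max_len):
        # Attention: weight every encoder state by its relevance to the current
        # decoder state; the weighted sum is the global context vector.
        scores = (enc_out @ attn_w(h).unsqueeze(-1)).squeeze(-1)       # (1, T)
        ctx = (torch.softmax(scores, -1).unsqueeze(-1) * enc_out).sum(1)
        # Feed the context and the last generated word into the decoder cell.
        h, c = decoder(torch.cat([embed(token), ctx], dim=-1), (h, c))
        token = out(h).argmax(dim=-1)               # greedy: highest-probability word
        if token.item() == eos:                     # stop at the end symbol
            break
        result.append(token.item())
    return result
```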
  • as another example, a user inputs "book a flight ticket" to the intelligent customer service of a certain client; the dialogue through which the user places an order via the intelligent customer service is in fact a task-oriented dialogue that requires precise control.
  • for this type of dialogue, a slot system can be used for session management; that is, the slot system is used to configure the dialogue and its flow, and each piece of information to be filled in serves as a slot. For any slot whose information is missing, the user must be asked in order to obtain the corresponding information.
  • specifically, the intelligent customer service uses the slot system to set the corresponding slots according to the semantics of the statement to be analyzed; returns information-acquisition statements to the user for the statement to be analyzed, to obtain the information required by the slots; and performs the corresponding operation according to the information required by the slots.
  • for example, when the user enters "book a flight ticket" in an intelligent customer service, booking a ticket requires information such as the departure place, destination, departure time, and ticket details, so the relevant slots are configured through the slot system, and the user is asked about every slot whose information is missing.
  • for instance, the intelligent customer service replies "Where will you depart from?"; the user enters "Beijing"; the intelligent customer service then replies "From Beijing, which city do you want to go to?"; and so on. Through continued questioning, the required information is obtained from the user's answers and filled into the corresponding slots, and the booking can be made once the information is complete.
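  • a minimal sketch of this slot-driven flow, assuming illustrative slot names, prompts, and asking order (the patent only describes the mechanism):

```python
slots = {"departure": None, "destination": None, "date": None}
prompts = {
    "departure": "Where will you depart from?",
    "destination": "Which city do you want to go to?",
    "date": "When do you want to depart?",
}

def next_prompt():
    # Ask about the first slot whose information is still missing;
    # once every slot is filled, the booking itself can be performed.
    for name, value in slots.items():
        if value is None:
            return prompts[name]
    return None

def fill(name, value):
    # Filling (or re-filling) a slot; a "change departure" request is
    # handled by overwriting the information in the corresponding slot.
    slots[name] = value

fill("departure", "Beijing")
print(next_prompt())   # -> "Which city do you want to go to?"
```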
  • in practice, the information in a slot can also be changed. For example, after the intelligent customer service replies "From Beijing, which city do you want to go to?", the user enters "change the departure place"; upon receiving this statement, the server of the intelligent customer service determines that its category is the second category and that it must be understood in combination with the preceding statements.
  • the resulting semantics are: change the departure place of the flight ticket. The following operation is then performed: reply to the user "Which city do you want to change it to?", offering several possible replacement cities below the input box: Shanghai, Guangzhou, Shenzhen, Qingdao.
  • after the user makes a selection, the intelligent customer service replaces the information in the corresponding slot to implement the modification. As can be seen, the corresponding operation performed here was replying with a statement.
  • the corresponding operation performed is not limited to replying to a statement, but also may be playing a song, inquiring news information, placing an order, and the like.
  • for example, the smart speaker receives the voice and plays the song.
  • the specific implementation process is roughly as follows: the smart speaker is connected to a background server through a communication network; when the smart speaker receives the voice, it sends the voice to the background server, which converts it into a text statement; after analyzing the semantics, the server performs the following operations: querying "New Love" in the song library and sending the audio stream to the smart speaker, so that the smart speaker can play the song.
  • the application example further provides a session information processing apparatus.
  • the apparatus 700 includes:
  • one or more memories and one or more processors; wherein
  • the one or more memories store one or more instruction modules configured to be executed by the one or more processors;
  • the one or more instruction modules include:
  • the receiving module 701 receives a statement in a session as the statement to be analyzed;
  • the determining module 702, including the one or more instruction units in the apparatus 300 above, determines the category of the statement to be analyzed;
  • the first analysis module 703, when the statement to be analyzed belongs to the first category, analyzes the semantics of the statement to be analyzed according to the statement itself to obtain its semantics;
  • the second analysis module 704, when the statement to be analyzed belongs to the second category, analyzes the semantics of the statement to be analyzed according to the statement to be analyzed and the preset number of preceding statements to obtain its semantics; and
  • the execution module 705 performs a corresponding operation according to the semantics of the statement to be analyzed.
  • the present application also provides a non-transitory computer readable storage medium having stored thereon a computer program that, when executed by a processor, implements the steps of any of the above methods (e.g., S201-S204 or S301-S305).
  • the application example further provides a computer device, which may be the above electronic device.
  • the computer device includes one or more processors (CPUs) 802, a communication module 804, a memory 806, a user interface 810, and a communication bus 808 for interconnecting these components, wherein:
  • the processor 802 can receive and transmit data through the communication module 804 to effect network communication and/or local communication.
  • User interface 810 includes one or more output devices 812 that include one or more speakers and/or one or more visual displays.
  • User interface 810 also includes one or more input devices 814 including, for example, a keyboard, a mouse, a voice command input unit or microphone, a touch screen display, a touch sensitive tablet, a gesture capture camera, or other input buttons or controls, and the like.
  • the memory 806 can be a high-speed random access memory such as DRAM, SRAM, DDR RAM, or another random-access solid-state storage device, or a non-volatile memory such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices.
  • the memory 806 stores a set of instructions executable by the processor 802, including:
  • Operating system 816 including programs for processing various basic system services and for performing hardware related tasks
  • the application 818 includes various applications for session information processing, such an application being capable of implementing the processing flow in each of the above examples, such as may include some or all of the instruction modules or units in the session information processing apparatus.
  • the processor 802 can implement the functions of at least one of the above-described units or modules by executing machine-executable instructions in at least one of the units in the memory 806.
  • the hardware modules in each example may be implemented in a hardware manner or a hardware platform plus software.
  • the above software includes machine readable instructions stored in a non-volatile storage medium. Therefore, each instance can also be embodied as a software product.
  • the hardware may be implemented by specialized hardware or hardware that executes machine readable instructions.
  • the hardware can be a specially designed permanent circuit or logic device (such as a dedicated processor such as an FPGA or ASIC) for performing a particular operation.
  • the hardware may also include programmable logic devices or circuits (such as including general purpose processors or other programmable processors) that are temporarily configured by software for performing particular operations.
  • each instance of the present application can be implemented by a data processing program executed by a data processing device such as a computer.
  • the data processing program constitutes the present application.
  • a data processing program is usually stored in a storage medium and is executed by reading the program directly out of the storage medium or by installing or copying the program to a storage device (such as a hard disk and/or memory) of the data processing device. Therefore, such a storage medium also constitutes the present application, and the present application also provides a non-volatile storage medium in which a data processing program is stored; the data processing program can be used to execute any of the above method examples of the present application.
  • the machine readable instructions corresponding to the modules of FIG. 8 may cause an operating system or the like operating on a computer to perform some or all of the operations described herein.
  • the non-transitory computer readable storage medium may be inserted into a memory provided in an expansion board within the computer or written to a memory provided in an expansion unit connected to the computer.
  • a CPU or the like installed on the expansion board or expansion unit can perform part or all of the actual operations according to the instructions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)

Abstract

The present application discloses a session information processing method and device, and a storage medium. The method includes: extracting, from a session, a statement to be analyzed and a preset number of preceding statements of the statement to be analyzed; performing word segmentation on the statement to be analyzed and the preset number of preceding statements to obtain a first feature set including a plurality of first features; extracting, from a first word set corresponding to the statement to be analyzed and a second word set corresponding to the preset number of preceding statements, a second feature set including one or more second features, where a second feature includes a phrase or sentence composed of a first word and a second word, the first word being one or more words in the first word set and the second word being one or more words in the second word set; and determining, according to the first feature set and the second feature set, the statement category to which the statement to be analyzed belongs.

Description

Session information processing method and device therefor, and storage medium
Technical Field
The present application relates to the field of artificial intelligence technology, and in particular to a session information processing method and device, and a storage medium.
Background
Artificial intelligence (AI) technology refers to technology that uses modern tools such as computers to simulate human thinking and actions. With the continuous progress of AI technology, it has been applied to many aspects of production and daily life, for example, dialogue systems. In a dialogue system, the AI dialogue robot performs semantic analysis on the statements in the session and then makes a corresponding reply.
Technical Content
The examples of the present application provide a session information processing method and device, and a storage medium.
The session information processing method provided by the examples of the present application includes:
extracting, from a session, a statement to be analyzed and a preset number of preceding statements of the statement to be analyzed;
performing word segmentation on the statement to be analyzed and the preset number of preceding statements to obtain a first feature set including a plurality of first features;
extracting, from a first word set corresponding to the statement to be analyzed and a second word set corresponding to the preset number of preceding statements, a second feature set including one or more second features, where a second feature includes a phrase or sentence composed of a first word and a second word, the first word being one or more words in the first word set and the second word being one or more words in the second word set; and
determining, according to the first feature set and the second feature set, the statement category to which the statement to be analyzed belongs, the statement categories including a first category, indicating that a statement is complete and semantically unambiguous, and a second category, indicating that a statement is incomplete or semantically ambiguous.
The session information processing method provided by the examples of the present application includes:
receiving a statement in a session as the statement to be analyzed;
determining the category of the statement to be analyzed using the method above;
if the statement to be analyzed belongs to the first category, analyzing the semantics of the statement to be analyzed according to the statement itself to obtain its semantics;
if the statement to be analyzed belongs to the second category, analyzing the semantics of the statement to be analyzed according to the statement to be analyzed and the preset number of preceding statements to obtain its semantics; and
performing a corresponding operation according to the semantics of the statement to be analyzed.
In some embodiments, if the statement to be analyzed belongs to the second category, analyzing its semantics according to the statement to be analyzed and the preset number of preceding statements includes:
completing the statement to be analyzed according to the preceding statements to obtain a completion statement, the completion statement belonging to the first category; and performing semantic analysis on the completion statement to obtain the semantics of the statement to be analyzed.
In some embodiments, completing the statement to be analyzed according to the preceding statements includes:
using a mapping model to determine the completion statement corresponding to the statement to be analyzed;
where the mapping model is pre-built using an autoencoder, the input information of the mapping model is the statement to be analyzed and the preceding statements, and the output information of the mapping model is the completion statement.
In some embodiments, performing a corresponding operation according to the semantics of the statement to be analyzed includes:
according to the semantics of the statement to be analyzed, using a slot system to set corresponding slots; returning information-acquisition statements to the user for the statement to be analyzed, to obtain the information required by the slots; and performing the corresponding operation according to the information required by the slots.
The session information processing method provided by the examples of the present application is executed by an electronic device, and the method includes:
extracting, from a session, a statement to be analyzed and a preset number of preceding statements of the statement to be analyzed;
performing word segmentation on the statement to be analyzed and the preset number of preceding statements to obtain a first feature set including a plurality of first features;
extracting, from a first word set corresponding to the statement to be analyzed and a second word set corresponding to the preset number of preceding statements, a second feature set including one or more second features, where a second feature includes a phrase or sentence composed of a first word and a second word, the first word being one or more words in the first word set and the second word being one or more words in the second word set; and
determining, according to the first feature set and the second feature set, the statement category to which the statement to be analyzed belongs, the statement categories including a first category, indicating that a statement is complete and semantically unambiguous, and a second category, indicating that a statement is incomplete or semantically ambiguous.
The session information processing method provided by the examples of the present application is executed by an electronic device, and the method includes:
receiving a statement in a session as the statement to be analyzed;
determining the category of the statement to be analyzed using the method above;
if the statement to be analyzed belongs to the first category, analyzing the semantics of the statement to be analyzed according to the statement itself to obtain its semantics;
if the statement to be analyzed belongs to the second category, analyzing the semantics of the statement to be analyzed according to the statement to be analyzed and the preset number of preceding statements to obtain its semantics; and
performing a corresponding operation according to the semantics of the statement to be analyzed.
The session information processing device provided by the examples of the present application includes:
one or more memories; and
one or more processors; wherein
the one or more memories store one or more instruction modules configured to be executed by the one or more processors; and
the one or more instruction units include:
a first extracting unit, extracting, from a session, a statement to be analyzed and a preset number of preceding statements of the statement to be analyzed;
a word segmentation unit, performing word segmentation on the statement to be analyzed and the preset number of preceding statements to obtain a first feature set including a plurality of first features;
a second extracting unit, extracting, from a first word set corresponding to the statement to be analyzed and a second word set corresponding to the preset number of preceding statements, a second feature set including one or more second features, where a second feature includes a phrase or sentence composed of a first word and a second word, the first word being one or more words in the first word set and the second word being one or more words in the second word set; and
a determining unit, determining, according to the first feature set and the second feature set, the statement category to which the statement to be analyzed belongs, the statement categories including a first category, indicating that a statement is complete and semantically unambiguous, and a second category, indicating that a statement is incomplete or semantically ambiguous.
The session information processing device provided by the examples of the present application includes:
one or more memories; and
one or more processors; wherein
the one or more memories store one or more instruction modules configured to be executed by the one or more processors; and
the one or more instruction modules include:
a receiving module, receiving a statement in a session as the statement to be analyzed;
a determining module, including the one or more instruction units in the above device, to determine the category of the statement to be analyzed;
a first analysis module, analyzing, when the statement to be analyzed belongs to the first category, the semantics of the statement to be analyzed according to the statement itself to obtain its semantics;
a second analysis module, analyzing, when the statement to be analyzed belongs to the second category, the semantics of the statement to be analyzed according to the statement to be analyzed and the preset number of preceding statements to obtain its semantics; and
an execution module, performing a corresponding operation according to the semantics of the statement to be analyzed.
The non-transitory computer readable storage medium provided by the examples of the present application stores a computer program which, when executed by a processor, implements the steps described above.
Brief Description of the Drawings
The drawings described here are provided for a further understanding of the present application and form a part of it; the illustrative examples of the present application and their description are used to explain the present application and do not unduly limit it. In the drawings:
FIG. 1 is a system architecture diagram related to an example of the present application;
FIG. 2 is a schematic flowchart of a session information processing method in an example of the present application;
FIG. 3 is a structural block diagram of a session information processing device in an example of the present application;
FIG. 4 is a schematic diagram of the interface of a session in an example of the present application;
FIG. 5 is a schematic diagram of the interface of a session in an example of the present application;
FIG. 6 is a usage scenario diagram of a smart speaker in an example of the present application;
FIG. 7 is a structural block diagram of a session information processing device in an example of the present application;
FIG. 8 is a structural block diagram of a computer device in an example of the present application.
Implementation
The present application proposes a session information processing method; the system architecture to which the method applies is shown in FIG. 1. The system architecture includes client devices (for example, 101a-101d in FIG. 1) and a server 102, connected through a communication network 103, where:
the client device may be a user's computer 101a or smartphone 101b, on which clients of various application software are installed; the user can log in to and use these clients through the client device. The clients may include shopping clients with artificial intelligence functions, for example, Tmall (with the AI tool AliMe, 阿里小蜜) and JD.com (with the intelligent customer service robot JIMI), and may also include social clients, for example, WeChat (the official account of the robot monk Xian'er, 贤二机器僧); of course, the clients may also be other clients capable of providing artificial intelligence services. Besides smartphones and computers, the client device may also be a hardware device such as a smart speaker 101c (for example, the Wenwen speaker, which can look up tickets and takeout orders; the Tmall Genie, which supports voice shopping; or the Xiaomi AI speaker, which can query the weather) or an intelligent robot 101d (for example, the robot monk Xian'er).
The server 102 may be a single server or a server cluster; corresponding to the clients installed on the client devices, it can provide corresponding services for the client devices.
The communication network 103 may be a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), a mobile network, a wired or wireless network, a private network, or the like.
When the user uses the client device, the client device can exchange data with the server 102 through the communication network 103, and the server 102 executes the session information processing method provided by the examples of the present application, so that the user enjoys an online artificial intelligence service. That is, when the client device is networked, the user can be provided with an online artificial intelligence service. Of course, the session information processing method provided by the examples of the present application may be executed not only by the server in the above scenario but also by a hardware device that directly faces the user and can provide artificial intelligence services offline. Here, a background server that can execute the session information processing method provided by the examples of the present application, or a hardware device that directly faces the user and can provide artificial intelligence services offline, is collectively referred to as an electronic device.
The examples of the present application provide a session information processing method, which may be executed by the above electronic device; as shown in FIG. 2, the method specifically includes:
S201. Extract, from a session, a statement to be analyzed and a preset number of preceding statements of the statement to be analyzed.
Understandably, a preceding statement refers to a statement that appears before the statement to be analyzed in the session. To better understand the statement to be analyzed, several preceding statements adjacent to it may be selected. The preset number can be chosen as needed, for example, 2.
For example, a dialog includes the following statements:
A: 白马寺 (White Horse Temple)
B: 很悠久的寺院 (A very old temple)
A: 去过吗? (Been there?)
Here, "去过吗?" ("Been there?") is the statement to be analyzed, and the two preceding statements "白马寺" and "很悠久的寺院" can be selected as the preceding statements.
S202. Perform word segmentation on the statement to be analyzed and the preset number of preceding statements to obtain a first feature set including a plurality of first features.
Understandably, the purpose of segmenting the statement to be analyzed and the preceding statements here is to obtain their first features.
There are multiple ways to segment the statement to be analyzed and the preceding statements; for example, the N-gram algorithm can be used, and the resulting words can be called N-gram features. The specific segmentation process includes: splicing the statement to be analyzed and the preset number of preceding statements according to the order of the statements in the session, to obtain a spliced statement, where adjacent preceding statements are separated by a first character and the statement to be analyzed is separated from the preceding statement adjacent to it by a second character; and extracting N-gram features from the spliced statement, the N-gram features forming the first feature set as the first features, where N is a preset integer.
For example, N is 3, the first character is _EOS_, and the second character is _END_, which distinguishes the preceding statements from the statement to be analyzed. The spliced statement obtained by splicing the dialog provided in step S201 is "白马寺_EOS_很悠久的寺院_END_去过吗". Segmenting this spliced statement then yields:
1_gram: {白, 马, 寺, _EOS_, 很, 悠, 久, 的, 寺, 院, _END_, 去, 过, 吗};
2_gram: {白马, 马寺, 寺_EOS_, _EOS_很, 很悠, 悠久, 久的, 的寺, 寺院, 院_END_, _END_去, 去过, 过吗};
3_gram: {白马寺, 马寺_EOS_, 寺_EOS_很, _EOS_很悠, 很悠久, 悠久的, 久的寺, 的寺院, 寺院_END_, 院_END_去, _END_去过, 去过吗};
Each element of the above 1_gram, 2_gram, and 3_gram sets is called an N-gram feature, and 1_gram, 2_gram, and 3_gram together form the first feature set.
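To make the splicing and extraction concrete, here is a minimal Python sketch of this step. The separator tokens and N = 3 follow the example above; the function names and the character-level tokenization are illustrative assumptions, not part of the patent.

```python
import re

SEPARATORS = ("_EOS_", "_END_")

def splice(preceding, sentence):
    # Adjacent preceding statements are joined by the first character,
    # and the statement to be analyzed follows the second character.
    return "_EOS_".join(preceding) + "_END_" + sentence

def tokenize(spliced):
    # Separator tokens count as single units; other text is split into
    # characters, matching the character-level example above.
    units = []
    for part in re.split(r"(_EOS_|_END_)", spliced):
        units.extend([part] if part in SEPARATORS else list(part))
    return units

def ngram_features(units, n_max=3):
    # Collect all 1..n_max grams; together they form the first feature set.
    feats = []
    for n in range(1, n_max + 1):
        for i in range(len(units) - n + 1):
            feats.append("".join(units[i:i + n]))
    return feats

units = tokenize(splice(["白马寺", "很悠久的寺院"], "去过吗"))
first_feature_set = ngram_features(units)   # the 1_gram, 2_gram and 3_gram features
```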
Since there are many first features in the first feature set, natural numbers can be used to number them, for example, "白 1", "马 2", and so on, thereby distinguishing different first features.
S203. Extract, from a first word set corresponding to the statement to be analyzed and a second word set corresponding to the preset number of preceding statements, a second feature set including one or more second features, where a second feature includes a phrase or sentence composed of a first word and a second word, the first word being one or more words in the first word set and the second word being one or more words in the second word set.
Understandably, step S203 is in fact a process of extracting second features. Each second feature includes one or more words from the first word set and one or more words from the second word set; that is, a second feature includes words from the statement to be analyzed as well as words from the preceding statements, so second features can also be called cross features. The significance of this step is that meaningful semantic fragments can be extracted.
The formation of the first word set and the second word set may follow the formation of the first feature set. Specifically, the statement to be analyzed is segmented using the N-gram algorithm to obtain N-gram features (which may be called first words), and these form the first word set; similarly, each of the preset number of preceding statements can be segmented using the N-gram algorithm to obtain N-gram features (which may be called second words), and the N-gram features obtained from segmenting the preset number of preceding statements form the second word set.
For example, for the dialog provided in step S201, the extracted second features may include: 去过_白马寺 (been-there_White-Horse-Temple), 去过_悠久 (been-there_old), and 去过_寺院 (been-there_temple); these second features form the second feature set.
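As a minimal sketch of forming such cross features (the full cross product used below is an assumption; the text only states that each second feature combines words from both sets):

```python
# Cross (second) features: pair a word from the statement to be analyzed
# with a word from the preceding statements, e.g. "去过_白马寺".
def cross_features(first_word_set, second_word_set, joiner="_"):
    return {w1 + joiner + w2 for w1 in first_word_set for w2 in second_word_set}

second_feature_set = cross_features(
    {"去过"},                      # words from the statement to be analyzed
    {"白马寺", "悠久", "寺院"},     # words from the preceding statements
)
# -> {"去过_白马寺", "去过_悠久", "去过_寺院"}
```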
S204. Determine, according to the first feature set and the second feature set, the statement category to which the statement to be analyzed belongs; the statement categories include a first category, indicating that a statement is complete and semantically unambiguous, and a second category, indicating that a statement is incomplete or semantically ambiguous.
Understandably, the first category refers to statements that are complete and semantically unambiguous.
For example:
A1: 白马寺 (White Horse Temple)
B1: 很悠久的寺院 (A very old temple)
A1: 去过白马寺吗? (Have you been to White Horse Temple?)
The statement to be analyzed, "去过白马寺吗?", is complete and semantically unambiguous, so it belongs to the first category.
The second category refers to statements that are incomplete or semantically ambiguous.
For example:
A2: 白马寺 (White Horse Temple)
B2: 很悠久的寺院 (A very old temple)
A2: 去过吗? (Been there?)
The statement to be analyzed, "去过吗?", is incomplete, and understanding its meaning requires analysis in combination with the preceding statements, so it belongs to the second category.
Of course, the statement categories are not limited to the above two; for example, there may also be a third category, whose statements indicate the end of the session.
For example:
A3: 我要睡觉了 (I am going to sleep)
B3: 好好休息 (Have a good rest)
A3: 晚安 (Good night)
The statement to be analyzed, "晚安" ("Good night"), belongs to the third category.
In a dialogue system, when performing semantic analysis, the statements are first classified, and then statements of different categories are semantically analyzed in different ways. For example, statements of the first category do not require intent inheritance and can be analyzed based on the statement to be analyzed alone, whereas statements of the second category require intent inheritance, that is, the statement to be analyzed must be analyzed in combination with the preceding statements. For statements of the third category, semantic analysis may be skipped.
There are multiple ways to determine, in step S204, the statement category to which the statement to be analyzed belongs; one of them includes the following process:
S2041. Encode each feature in the first feature set and the second feature set to obtain a first vector corresponding to that feature, where the number of elements in each first vector is the same.
Understandably, step S2041 is in fact a process of feature vectorization, converting features into vectors to facilitate the subsequent calculation steps. Encoding one feature yields one first vector; the length of each first vector (i.e., its number of elements) is the same and can be preset as needed. For example, each feature may be encoded into a first vector with a vector length of 100 dimensions, meaning the first vector includes 100 elements. In practice, an embedding matrix can be used to convert each first feature or second feature into a first vector of fixed length.
S2042. Determine a second vector according to the first vectors; the second vector is a vector representing the statement to be analyzed and the preset number of preceding statements.
Since some first vectors are encoded from the statement to be analyzed and others from the preceding statements, and the second vector is determined from all the first vectors, the second vector can represent the statement to be analyzed and the preset number of preceding statements; it may therefore also be called the representation vector of the statement to be analyzed and the preceding statements.
In practice, there are several ways of determining the second vector. One of them is: determining the average vector of the first vectors and using the average vector as the second vector, where each element of the average vector is the ratio between the sum of the elements at the corresponding position in the first vectors and the number of first vectors. For example, if there are three first vectors with two elements each, namely (0.3, 0.7), (0.1, 0.9), and (0.4, 0.6), then the second vector is (0.27, 0.73). If the second vector is calculated in this way, the number of elements in the second vector is the same as in each first vector.
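As a minimal numeric sketch of steps S2041-S2042 (the embedding matrix is randomly initialized here purely for illustration; a real model would learn it):

```python
import numpy as np

rng = np.random.default_rng(0)
embedding = rng.normal(size=(50_000, 100))   # one 100-dimensional row per feature id

def encode(feature_id):
    # Encoding a feature = looking up its row in the embedding matrix.
    return embedding[feature_id]

# Second vector: the element-wise average of all first vectors.
# The worked example above, with three 2-element first vectors:
first_vectors = np.array([[0.3, 0.7], [0.1, 0.9], [0.4, 0.6]])
second_vector = first_vectors.mean(axis=0)   # -> [0.2667, 0.7333], i.e. (0.27, 0.73)
```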
Of course, vector averaging is only one way to determine the second vector; a long short-term memory network (LSTM) or a convolutional neural network (CNN) can also be used to process the first vectors and obtain a second vector representing the preceding statements and the statement to be analyzed. If the second vector is calculated with an LSTM or CNN, the number of elements in the resulting second vector may be the same as or different from the number of elements in the first vectors.
S2043. Input the second vector into a preset classifier to obtain the degree of matching between the statement to be analyzed and each statement category.
The preset classifier can be chosen as needed, for example, a softmax classifier or a support vector machine (SVM). The degree of matching between the statement to be analyzed and each statement category can be understood as the probability that the statement to be analyzed belongs to each statement category.
For example, if the second vector computed by the LSTM or CNN has the same number of elements as there are statement categories, the degree of matching between the statement to be analyzed and each statement category can be output in step S2043 by a softmax classifier. Specifically, the second vector is input into a softmax classifier, which outputs the degree of matching between the statement to be analyzed and each statement category using the following formula (1):
P_i = exp(y_i) / (exp(y_0) + exp(y_1) + ... + exp(y_n))    (1)
In formula (1), y_0, ..., y_n are the elements of the second vector.
As can be seen from formula (1), the number of elements in the vector output by the softmax classifier is the same as the number of elements in the input vector; therefore, when the softmax classifier is used to determine the degree of matching between the statement to be analyzed and each statement category, the number of elements in the input vector needs to equal the number of statement categories. If another classifier is used to determine the degree of matching, the number of elements in the input vector does not need to equal the number of statement categories.
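A tiny numeric sketch of this classification step (the input values are made up for illustration):

```python
import numpy as np

def softmax(y):
    e = np.exp(y - y.max())          # subtract the max for numerical stability
    return e / e.sum()

# Matching degrees of the statement to be analyzed with each category:
match_degrees = softmax(np.array([0.2, 1.1]))   # -> approx. [0.29, 0.71]
```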
In practice, to increase the generalization capability of the session information processing method in this example and improve its nonlinear expressiveness, the second vector may also be nonlinearly transformed by a transformation function before being input into the preset classifier. Specifically, the second vector is input into a transformation model, the transformation model including a second preset quantity of transformation functions capable of nonlinearly transforming input data; if the second preset quantity is greater than or equal to 2, then for two adjacent transformation functions, the output data of the former transformation function is the input data of the latter. The output data of the transformation model is then input into the above preset classifier to obtain the matching degrees between the statement to be analyzed and the statement categories. In this case, the preset quantity of preceding statements may be called the first preset quantity.

It may be understood that, because the output data of one transformation function is the input data of the next, the second preset quantity of transformation functions are connected in sequence. The second preset quantity of transformation functions are in effect a second preset quantity of hidden layers, with one transformation function per hidden layer. Many functions can implement the nonlinear transformation, for example, the sigmoid function.

To further improve generalization, a transformation function may also apply a certain linear transformation to its input data before the nonlinear transformation. For example, the i-th transformation function may use the following formula (2):

h_i = f_2(W_i * h_{i-1} + b_i)   (2)

where f_2(x_2) = 1/(1 + exp(-x_2)), x_2 = W_i * h_{i-1} + b_i, W_i is a weight coefficient, b_i is a bias coefficient, h_{i-1} is the output data of the (i-1)-th transformation function, and when i is 1, h_0 is the second vector.

In formula (2), W_i * h_{i-1} + b_i applies a linear transformation to the input data h_{i-1}, and 1/(1 + exp(-x_2)) then applies the nonlinear transformation.

It may be understood that if the second vector is input into the transformation model and the model's output vector is input into the softmax function represented by formula (1), then y_0, …, y_n are the output data of the transformation model.

In practice, repeated experiments show that generalization is better with a second preset quantity of 2 than with 1, while a quantity greater than 2 incurs a larger computational cost without a large further improvement over 2; therefore, the second preset quantity may be set to 2, though it may of course be set to other values.
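To make the layering concrete, here is a small NumPy sketch of the transformation model with two stacked transformation functions per formula (2); the dimensions and the random initialization are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def transform_model(h0, weights, biases):
    """Apply formula (2) layer by layer: each transformation function
    first linearly transforms its input (W_i * h_{i-1} + b_i) and then
    applies the sigmoid; the output of one function feeds the next."""
    h = h0
    for W, b in zip(weights, biases):
        h = sigmoid(W @ h + b)
    return h

rng = np.random.default_rng(seed=0)
dim = 100
# Two transformation functions, matching a second preset quantity of 2.
weights = [rng.normal(scale=0.1, size=(dim, dim)) for _ in range(2)]
biases = [np.zeros(dim) for _ in range(2)]
output = transform_model(rng.normal(size=dim), weights, biases)
```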
In practice, the vector to be input into the preset classifier (for example, the second vector or the output vector of the transformation model) may also first be passed through a fully connected layer before being input into the preset classifier. The role of the fully connected layer is to convert a vector of arbitrary length into a vector of preset length, so it can be used to convert the classifier's input vector into a vector whose length equals the number of statement categories. For example, if there are three statement categories, the fully connected layer can convert the vector to be input into the preset classifier into a vector with three elements, and this three-element vector is then input into the preset classifier for classification. Here, the added fully connected layer uses its conversion function to perform the dimension conversion, converting the vector into one whose length equals the number of statement categories.

S2044. Determine, according to the matching degrees between the statement to be analyzed and the statement categories, the statement category to which the statement to be analyzed belongs.

In practice, the statement category corresponding to the highest matching degree among the matching degrees between the statement to be analyzed and the statement categories may be used as the statement category to which the statement to be analyzed belongs.

For example, if the preset classifier in step S2043 outputs the vector (0.3, 0.7), where the first element is the matching degree between the statement to be analyzed and the first category and the second element is the matching degree with the second category, then because 0.7 is greater than 0.3, the statement to be analyzed belongs to the second category. Of course, the vector output in step S2043 is not limited to two elements; the number of elements equals the number of categories.
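As a small illustration, selecting the category then reduces to taking the index of the highest matching degree:

```python
import numpy as np

match_degrees = np.array([0.3, 0.7])            # classifier output from the example
category_index = int(np.argmax(match_degrees))  # index of the highest matching degree
print(category_index)                           # 1, i.e. the second category
```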
The session information processing method provided in this example of the present application is in effect a neural network model, specifically a supervised multi-layer neural network model, and can determine the category to which the statement to be analyzed belongs. When classifying the statement to be analyzed, the classification relies not only on the statement itself but also on its preceding statements; compared with classifying on the basis of the statement alone, the contextual information is richer, which can improve classification accuracy. Moreover, the method requires no hand-constructed classification rules, which reduces or avoids the low recall caused by incomplete rule coverage. In short, the method provided in this example of the present application can improve both the accuracy and the recall of determining the category of the statement to be analyzed.

In some examples, because the first feature set formed in step S202 contains very many first features, and the larger N is, the more first features there are, certain measures may be taken to reduce the number of features before the first features are encoded into first vectors. One way to reduce the number of features is to use a hash function to discretize the massive set of first features into a limited number of hash buckets, where each hash bucket corresponds to one hash value and first features with the same hash value can be treated as one feature, thereby reducing the feature count. The specific process may include: inputting each first feature into a preset hash function to obtain the hash value corresponding to that first feature, where the hash function can map an input feature to an integer in a preset interval. Then, when encoding the first features into first vectors of a preset dimension, first features with the same hash value are encoded as one feature, yielding one corresponding first vector. For example, if the first features "白" and "白马" are both assigned by the hash function to the hash bucket numbered 1, "白" and "白马" can be treated as one feature.

The hash function may use the following formula (3):

f_1(x_1) = x_1 mod n   (3)

where x_1 is the input feature of the hash function, and f_1(x_1) is the hash value, an integer in [0, n-1].

The value of n in formula (3) may be chosen as needed; the larger n is, the more hash buckets there are. For example, when n = 10, the first features can be mapped to integers from 0 to 9, that is, distributed into 10 hash buckets numbered 0 to 9.

It can be seen that mapping the first features to integers in a preset interval with a hash function is in effect a clustering process: relatively related or similar features are merged into one feature for subsequent processing.
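A short sketch of this feature hashing follows; since formula (3) takes an integer input, the string feature is first converted to an integer, and the MD5-based conversion is only an assumed stand-in because the patent does not specify the concrete hash:

```python
import hashlib

def hash_bucket(feature, n=10):
    """Map a feature to one of n hash buckets per formula (3).
    MD5 is used here only as one deterministic way to turn the string
    feature into an integer x_1 before taking x_1 mod n."""
    x1 = int(hashlib.md5(feature.encode("utf-8")).hexdigest(), 16)
    return x1 % n

buckets = {}
for feature in ["白", "白马", "寺院", "悠久"]:
    buckets.setdefault(hash_bucket(feature), []).append(feature)
print(buckets)   # features sharing a bucket are treated as one feature
```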
Corresponding to the above session information processing method, this example of the present application further provides a session information processing apparatus. The apparatus may be any electronic device that performs the above session information processing method. As shown in FIG. 3, the apparatus 300 includes:

one or more memories; and

one or more processors, where

the one or more memories store one or more instruction modules configured to be executed by the one or more processors, where

the one or more instruction units include:

a first extraction unit 301, configured to extract, from a session, a statement to be analyzed and a preset quantity of preceding statements of the statement to be analyzed;

a word segmentation unit 302, configured to perform word segmentation on the statement to be analyzed and the preset quantity of preceding statements to obtain a first feature set including a plurality of first features;

a second extraction unit 303, configured to extract, from a first word set corresponding to the statement to be analyzed and a second word set corresponding to the preset quantity of preceding statements, a second feature set including one or more second features, where one second feature includes a phrase or statement composed of a first word and a second word, the first word being one or more words in the first word set, and the second word being one or more words in the second word set; and

a determining unit 304, configured to determine, according to the first feature set and the second feature set, the statement category to which the statement to be analyzed belongs, where the statement category includes a first category indicating that a statement is complete and semantically unambiguous, or a second category indicating that a statement is incomplete or semantically ambiguous.

It may be understood that the above units are the functional modules of the above session information processing method; for explanations, examples and beneficial effects of their content, refer to the corresponding parts of the method, which are not repeated here.
In some examples, the word segmentation unit may specifically include:

a concatenation subunit, configured to concatenate the statement to be analyzed and the preset quantity of preceding statements in their order of appearance in the session to obtain a concatenated statement, where adjacent preceding statements are separated by a first character, and the statement to be analyzed and its adjacent preceding statement are separated by a second character; and

an extraction subunit, configured to extract N-gram features from the concatenated statement and use the N-gram features as the first features to form the first feature set, where N is a preset integer.

In some examples, the determining unit may include:

an encoding subunit, configured to encode each feature in the first feature set and the second feature set to obtain a first vector corresponding to the feature, where all first vectors have the same number of elements;

a first determining subunit, configured to determine a second vector according to the first vectors, where the second vector represents the statement to be analyzed and the preset quantity of preceding statements;

an input subunit, configured to input the second vector into a preset classifier to obtain the matching degrees between the statement to be analyzed and the statement categories; and

a second determining subunit, configured to determine, according to the matching degrees between the statement to be analyzed and the statement categories, the statement category to which the statement to be analyzed belongs.

In some examples, the above apparatus may further include:

a hash unit, configured to: before the encoding subunit encodes each feature in the first feature set and the second feature set, input each first feature into a preset hash function to obtain the hash value corresponding to the first feature, where the hash function can map an input feature to an integer in a preset interval; and the encoding subunit specifically encodes first features with the same hash value as one feature to obtain one corresponding first vector.

In some examples, the hash function may include:

f_1(x_1) = x_1 mod n

where x_1 is the input feature of the hash function, and f_1(x_1) is the hash value, an integer in [0, n-1].

In some examples, the preset quantity is a first preset quantity, and the determining unit may further include:

a transformation subunit, configured to: before the input subunit inputs the second vector into the preset classifier, input the second vector into a transformation model, the transformation model including a second preset quantity of transformation functions capable of nonlinearly transforming input data, where, if the second preset quantity is greater than or equal to 2, then for two adjacent transformation functions, the output data of the former transformation function is the input data of the latter; and the input subunit specifically inputs the output data of the transformation model into the preset classifier.

In some examples, the i-th transformation function may be:

h_i = f_2(W_i * h_{i-1} + b_i)

where f_2(x_2) = 1/(1 + exp(-x_2)), x_2 = W_i * h_{i-1} + b_i, W_i is a weight coefficient, b_i is a bias coefficient, h_{i-1} is the output data of the (i-1)-th transformation function, and when i is 1, h_0 is the second vector.

In some examples, the first determining subunit specifically determines the average vector of the first vectors and uses the average vector as the second vector, where each element of the average vector is the ratio of the sum of the elements at the corresponding position across the first vectors to the number of the first vectors.

In some examples, the preset classifier may compute the matching degrees between the statement to be analyzed and the statement categories with the following formula:

softmax(y_i) = e^{y_i} / Σ_{k=0}^{n} e^{y_k}, i = 0, …, n

where y_0, …, y_n are the elements of the vector input into the preset classifier.

In some examples, the second determining subunit specifically uses the statement category corresponding to the highest matching degree among the matching degrees between the statement to be analyzed and the statement categories as the statement category to which the statement to be analyzed belongs.
The above session information processing method or apparatus can classify the statement to be analyzed and determine the category to which it belongs. Based on the classification result, the statement to be analyzed can be processed further, for example, through semantic analysis, or by performing some operation on the basis of the semantic analysis. This example of the present application therefore provides another session information processing method, which may include the following steps:

S301. Receive a statement in a session and use it as the statement to be analyzed.

It may be understood that the statement may be text entered by a user, may be converted from voice data input by a user, or may of course be obtained in some other form. For example, as shown in FIG. 4, when a user queries the weather through a smart tool in a client on a mobile phone, the user may type "北京明天的天气" (the weather in Beijing tomorrow) or speak it aloud; the smart tool then analyzes the statement, queries the current weather, and returns the weather information it finds, such as "天气晴朗，有北风" (sunny, with a north wind). As another example, in a dialog between a user and a robot, when the user says "菩提本无树", the robot receives the voice, converts it into a text statement, analyzes it, and replies "明镜亦非台".
S302. Determine the category of the statement to be analyzed by using the above S201 to S204.

S303. If the statement to be analyzed belongs to the first category, analyze the semantics of the statement to be analyzed according to the statement to be analyzed to obtain the semantics of the statement to be analyzed.

If the statement to be analyzed is of the first category, the statement is complete and semantically unambiguous, so its semantics can be known by analyzing the statement itself.
For example, for the following dialog:

A1: 白马寺

B1: 很悠久的寺院

A1: 去过白马寺吗？

The statement to be analyzed, "去过白马寺吗？", is of the first category, so its semantics can be analyzed directly from the statement itself.
S304. If the statement to be analyzed belongs to the second category, analyze the semantics of the statement to be analyzed according to the statement to be analyzed and the preset quantity of preceding statements to obtain the semantics of the statement to be analyzed.

If the statement to be analyzed is of the second category, the statement is incomplete or semantically ambiguous, so its semantics must be analyzed in combination with the preceding statements.

For example, for the following dialog:

A2: 白马寺

B2: 很悠久的寺院

A2: 去过吗？

The statement to be analyzed, "去过吗？", is of the second category, so its meaning must be analyzed in combination with the preceding statements "白马寺" and "很悠久的寺院", from which its meaning is understood to be "去过白马寺吗？" (Have you been to Baima Temple?).

When the statement to be analyzed is found to belong to the second category, a completion technique may be used to complete it. Completion means extracting some words or phrases from the preceding statements and inserting them into the statement to be analyzed to form a fluent, unambiguous statement, thereby compressing a multi-turn dialog into a single-turn statement. That is, the statement to be analyzed is completed according to the preceding statements to obtain a completed statement, the completed statement being of the first category; semantic analysis can then be performed on the completed statement to obtain the semantics of the statement to be analyzed.
For example:

User A: 讲个故事给我听 (Tell me a story);

Robot: 等我学会了给你讲哦 (I'll tell you one once I've learned how).

User A: 我等着 (I'll wait).

The purpose of completion here is to rewrite "我等着" into "我等着听故事" (I'm waiting to hear the story) in light of the context, which makes the semantic analysis easier.
Statement completion may be implemented in the following way:

q' = F(q | C)

In the input, C is the preceding statements, that is, the previous turns of the dialog, q is the statement to be analyzed, and q' is the completed statement with complete semantics; F is a constructed mapping model that outputs the completed statement given the input preceding statements and the statement to be analyzed.

Building the mapping model can be abstracted as a translation problem. For the example above, the multi-turn dialog can be regarded as a verbosely expressed language A, and the goal is to translate it into a concise and compact language B. Concretely, an encoder-decoder architecture may be used for end-to-end modeling, automatically generating a semantically complete statement from the preceding statements and the information-deficient statement to be analyzed. In this framework, the encoding part may use an LSTM to encode the input, and the decoding part may use an RNN to generate the output statement; to strengthen the influence of relevant words in the preceding statements on the generated output, an attention mechanism may be used to amplify the influence of key parts and attenuate the influence of irrelevant information.

The specific modeling process may include: concatenating the input preceding statements and the input statement to be analyzed to obtain a concatenated statement. During encoding, a bidirectional LSTM extracts an embedding of the whole concatenated statement centered on each character. Attention weight values are then computed from the decoder's hidden layer and the encoder's embeddings, and a weighted sum of the extracted embeddings is computed with these weights to obtain a global vector representation under the currently important word. This vector is input into the decoder together with the decoder's hidden layer and the previously generated character to obtain a probability distribution over the dictionary for the next character to be generated, and the character with the highest probability is taken as the output. The above process is repeated until an end symbol is generated, at which point the complete statement is obtained and output.
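For concreteness, below is a compact PyTorch-flavored sketch of one decoding step of such an encoder-decoder with attention; the layer sizes, the GRU decoder cell and the additive attention form are illustrative assumptions, not the patent's exact architecture:

```python
import torch
import torch.nn as nn

class CompletionModel(nn.Module):
    """Sketch of the q' = F(q | C) mapping: a bidirectional LSTM encodes
    the concatenated context + query, and an RNN decoder with attention
    generates the completed statement character by character."""
    def __init__(self, vocab_size, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.LSTM(dim, dim, bidirectional=True, batch_first=True)
        self.decoder = nn.GRUCell(dim + 2 * dim, dim)   # prev char + context
        self.attn = nn.Linear(3 * dim, 1)               # additive attention score
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, src_ids, prev_id, h):
        # Encode every character position of the concatenated statement.
        enc, _ = self.encoder(self.embed(src_ids))      # (B, T, 2*dim)
        # Attention weights from the decoder hidden state and encoder
        # embeddings; weighted sum gives the global context vector.
        h_rep = h.unsqueeze(1).expand(-1, enc.size(1), -1)
        scores = self.attn(torch.cat([enc, h_rep], dim=-1))
        weights = torch.softmax(scores, dim=1)          # (B, T, 1)
        context = (weights * enc).sum(dim=1)            # (B, 2*dim)
        # Feed context plus the previously generated character into the
        # decoder; the output distribution picks the next character.
        h = self.decoder(torch.cat([self.embed(prev_id), context], dim=-1), h)
        return torch.log_softmax(self.out(h), dim=-1), h
```

At inference time, one would loop over this step greedily, taking the argmax character each time, until the end symbol is produced.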
S305. Perform a corresponding operation according to the semantics of the statement to be analyzed.

For example, as shown in FIG. 5, a user enters "订飞机票" (book a flight ticket) in the intelligent customer service of a client. A dialog in which a user places an order through intelligent customer service is in fact a task-oriented dialog requiring precise control. For this type of dialog, a slot system may be used for dialog management; that is, the dialog and its flow are configured through the slot system, the information that needs to be filled in is preset as slots, and for slots with missing information the user is asked questions to obtain the corresponding information. In other words, the intelligent customer service sets the corresponding slots with the slot system according to the semantics of the statement to be analyzed, replies to the user with information-eliciting statements for the statement to be analyzed to obtain the information required by the slots, and performs the corresponding operation according to the information required by the slots. In this example, the user enters "订飞机票" in the intelligent customer service; because booking a flight requires a departure place, a destination, a departure time, passenger information and so on, the related slots are configured through the slot system, and a question is asked for each slot with missing information. The intelligent customer service therefore replies "您想要从哪里出发呀？" (Where would you like to depart from?), the user enters "北京" (Beijing), the service replies "您想从北京到哪个城市呢？" (Which city would you like to fly to from Beijing?), and so on; through continued questioning, the required information is obtained from the user's answers and filled into the corresponding slots, and once the information is complete, the booking operation can be performed. Of course, the information in an already filled slot can also be changed. For example, after the intelligent customer service replies "您想从北京到哪个城市呢？", the user enters "换出发地" (change the departure place). On receiving this statement, the server of the intelligent customer service determines that it is of the second category and must be understood in combination with the context; analysis shows that its semantics is to change the departure place of the flight ticket, so the following operations are performed: replying "您想更换为哪个城市呢？" (Which city would you like to change to?) and offering several possible replacement cities below the input box: 上海, 广州, 深圳, 青岛. After the user enters the replacement city, the intelligent customer service replaces the information in the corresponding slot, completing the modification.
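As an illustration of the slot-management logic described above, here is a toy sketch; the slot names, prompts and helper are hypothetical, and a real slot system would be configured per task:

```python
# Preset slots for the flight-booking dialog and the question asked
# when a slot's information is missing.
SLOTS = ["departure", "destination", "date", "passenger"]
PROMPTS = {
    "departure": "您想要从哪里出发呀？",
    "destination": "您想从出发地到哪个城市呢？",
    "date": "您想哪天出发呢？",
    "passenger": "请提供订票人信息。",
}

def next_action(filled):
    """Ask about the first slot still missing information; once every
    slot is filled, the booking operation can be performed."""
    for slot in SLOTS:
        if slot not in filled:
            return PROMPTS[slot]
    return "book_ticket"

filled = {}
print(next_action(filled))        # asks for the departure place
filled["departure"] = "北京"
print(next_action(filled))        # asks for the destination
filled["departure"] = "上海"      # an already filled slot may be changed
```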
In the previous example, the corresponding operation performed is replying with a statement. Of course, the operation is not limited to replying with a statement; it may also be playing a song, querying news information, placing an order, and so on. For example, as shown in FIG. 6, if a user says "播放歌曲《新不了情》" (play the song 《新不了情》), the smart speaker plays the song after receiving the voice. The implementation is roughly as follows: the smart speaker is connected to a back-end server through a communication network; on receiving the voice, the speaker sends it to the back-end server, which converts it into a text statement and analyzes it to obtain its semantics, and the server then performs the following operations: looking up 《新不了情》 in the song library and sending the audio stream to the smart speaker so that the smart speaker plays the song.
Corresponding to this session information processing method, this example of the present application further provides a session information processing apparatus. As shown in FIG. 7, the apparatus 700 includes:

one or more memories; and

one or more processors, where

the one or more memories store one or more instruction modules configured to be executed by the one or more processors, where

the one or more instruction modules include:

a receiving module 701, configured to receive a statement in a session and use it as the statement to be analyzed;

a determining module 702, including the one or more instruction units of the above apparatus 300, configured to determine the category of the statement to be analyzed;

a first analysis module 703, configured to: when the statement to be analyzed belongs to the first category, analyze the semantics of the statement to be analyzed according to the statement to be analyzed to obtain the semantics of the statement to be analyzed;

a second analysis module 704, configured to: when the statement to be analyzed belongs to the second category, analyze the semantics of the statement to be analyzed according to the statement to be analyzed and the preset quantity of preceding statements to obtain the semantics of the statement to be analyzed; and

an execution module 705, configured to perform a corresponding operation according to the semantics of the statement to be analyzed.

It may be understood that the above modules are the functional modules of steps S301 to S305 of the above session information processing method; for explanations, examples and beneficial effects of their content, refer to the corresponding parts of the method, which are not repeated here.
This example of the present application further provides a non-volatile computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the steps of any of the above methods (for example, S201 to S204 or S301 to S305) are implemented.

This example of the present application further provides a computer device, which may be the above electronic device. As shown in FIG. 8, the computer device includes one or more processors (CPUs) 802, a communication module 804, a memory 806, a user interface 810, and a communication bus 808 for interconnecting these components, where:

the processor 802 can receive and send data through the communication module 804 to implement network communication and/or local communication.

The user interface 810 includes one or more output devices 812, including one or more speakers and/or one or more visual displays. The user interface 810 also includes one or more input devices 814, including, for example, a keyboard, a mouse, a voice command input unit or microphone, a touchscreen display, a touch-sensitive input panel, a gesture-capturing camera, or other input buttons or controls.

The memory 806 may be a high-speed random access memory such as DRAM, SRAM, DDR RAM or another random-access solid-state storage device, or a non-volatile memory such as one or more magnetic disk storage devices, optical disc storage devices, flash memory devices, or other non-volatile solid-state storage devices.

The memory 806 stores an instruction set executable by the processor 802, including:

an operating system 816, including programs for handling various basic system services and performing hardware-related tasks; and

applications 818, including various application programs for session information processing; such application programs can implement the processing flows in the above examples and may include, for example, some or all of the instruction modules or units of the session information processing apparatus. By executing the machine-executable instructions in at least one of the units in the memory 806, the processor 802 can implement the functions of at least one of the above units or modules.
It should be noted that not all steps and modules in the above flows and structural diagrams are necessary; some steps or modules may be omitted according to actual needs. The execution order of the steps is not fixed and may be adjusted as needed. The division into modules is merely a functional division adopted for ease of description; in actual implementation, one module may be implemented by multiple modules, the functions of multiple modules may be implemented by one module, and these modules may be located in the same device or in different devices.

The hardware modules in the examples may be implemented in hardware or as a hardware platform plus software. The software includes machine-readable instructions stored in a non-volatile storage medium. The examples may therefore also be embodied as software products.

In each example, the hardware may be implemented by dedicated hardware or by hardware executing machine-readable instructions. For example, the hardware may be a specially designed permanent circuit or logic device (such as a dedicated processor, for example, an FPGA or an ASIC) for completing specific operations. The hardware may also include a programmable logic device or circuit temporarily configured by software (for example, including a general-purpose processor or another programmable processor) for performing specific operations.

In addition, each example of the present application may be implemented by a data processing program executed by a data processing device such as a computer. Obviously, the data processing program constitutes the present application. In addition, a data processing program usually stored in a storage medium is executed by reading the program directly from the storage medium or by installing or copying the program into a storage device (such as a hard disk and/or memory) of the data processing device. Such a storage medium therefore also constitutes the present application, and the present application further provides a non-volatile storage medium storing a data processing program that can be used to perform any one of the above method examples of the present application.

The machine-readable instructions corresponding to the modules in FIG. 8 can cause an operating system or the like running on a computer to complete some or all of the operations described here. The non-volatile computer-readable storage medium may be a memory provided in an expansion board inserted into the computer, or a memory provided in an expansion unit connected to the computer. A CPU or the like installed on the expansion board or expansion unit can perform some or all of the actual operations according to the instructions.

The above are merely preferred examples of the present application and are not intended to limit the present application; any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present application shall fall within the protection scope of the present application.

Claims (16)

  1. A session information processing method, comprising:
    extracting, from a session, a statement to be analyzed and a preset quantity of preceding statements of the statement to be analyzed;
    performing word segmentation on the statement to be analyzed and the preset quantity of preceding statements to obtain a first feature set comprising a plurality of first features;
    extracting, from a first word set corresponding to the statement to be analyzed and a second word set corresponding to the preset quantity of preceding statements, a second feature set comprising one or more second features, wherein one second feature comprises a phrase or statement composed of a first word and a second word, the first word being one or more words in the first word set, and the second word being one or more words in the second word set; and
    determining, according to the first feature set and the second feature set, a statement category to which the statement to be analyzed belongs, wherein the statement category comprises a first category indicating that a statement is complete and semantically unambiguous, or a second category indicating that a statement is incomplete or semantically ambiguous.
  2. The method according to claim 1, wherein the performing word segmentation on the statement to be analyzed and the preset quantity of preceding statements comprises:
    concatenating the statement to be analyzed and the preset quantity of preceding statements in their order of appearance in the session to obtain a concatenated statement, wherein adjacent preceding statements are separated by a first character, and the statement to be analyzed and its adjacent preceding statement are separated by a second character; and
    extracting N-gram features from the concatenated statement, and using the N-gram features as the first features to form the first feature set, wherein N is a preset integer.
  3. The method according to claim 1, wherein the determining, according to the first feature set and the second feature set, a statement category to which the statement to be analyzed belongs comprises:
    encoding each feature in the first feature set and the second feature set to obtain a first vector corresponding to the feature, wherein all first vectors have the same number of elements;
    determining a second vector according to the first vectors, wherein the second vector represents the statement to be analyzed and the preset quantity of preceding statements;
    inputting the second vector into a preset classifier to obtain matching degrees between the statement to be analyzed and statement categories; and
    determining, according to the matching degrees between the statement to be analyzed and the statement categories, the statement category to which the statement to be analyzed belongs.
  4. The method according to claim 3, wherein before the encoding each feature in the first feature set and the second feature set, the method further comprises:
    inputting each first feature into a preset hash function to obtain a hash value corresponding to the first feature, wherein the hash function is capable of mapping an input feature to an integer in a preset interval;
    wherein the encoding each feature in the first feature set and the second feature set to obtain a first vector of a preset dimension corresponding to the feature comprises:
    encoding first features with the same hash value as one feature to obtain one corresponding first vector.
  5. The method according to claim 4, wherein the hash function comprises:
    f_1(x_1) = x_1 mod n
    wherein x_1 is the input feature of the hash function, and f_1(x_1) is the hash value, an integer in [0, n-1].
  6. The method according to claim 3, wherein the preset quantity is a first preset quantity, and before the inputting the second vector into a preset classifier, the method further comprises:
    inputting the second vector into a transformation model, the transformation model comprising a second preset quantity of transformation functions, the transformation functions being capable of nonlinearly transforming input data, wherein, if the second preset quantity is greater than or equal to 2, then for two adjacent transformation functions, the output data of the former transformation function is the input data of the latter transformation function;
    wherein the inputting the second vector into a preset classifier comprises:
    inputting the output data of the transformation model into the preset classifier.
  7. The method according to claim 6, wherein the i-th transformation function is:
    h_i = f_2(W_i * h_{i-1} + b_i)
    wherein f_2(x_2) = 1/(1 + exp(-x_2)), x_2 = W_i * h_{i-1} + b_i, W_i is a weight coefficient, b_i is a bias coefficient, h_{i-1} is the output data of the (i-1)-th transformation function, and when i is 1, h_0 is the second vector.
  8. The method according to any one of claims 3 to 7, wherein the determining a second vector according to the first vectors comprises:
    determining an average vector of the first vectors, and using the average vector as the second vector, wherein each element of the average vector is a ratio of the sum of the elements at the corresponding position across the first vectors to the number of the first vectors.
  9. The method according to any one of claims 3 to 7, wherein the preset classifier computes the matching degrees between the statement to be analyzed and the statement categories using the following formula:
    softmax(y_i) = e^{y_i} / Σ_{k=0}^{n} e^{y_k}, i = 0, …, n
    wherein y_0, …, y_n are the elements of the vector input into the preset classifier.
  10. The method according to any one of claims 3 to 7, wherein the determining, according to the matching degrees between the statement to be analyzed and the statement categories, the statement category to which the statement to be analyzed belongs comprises:
    using the statement category corresponding to the highest matching degree among the matching degrees between the statement to be analyzed and the statement categories as the statement category to which the statement to be analyzed belongs.
  11. A session information processing method, comprising:
    receiving a statement in a session and using it as a statement to be analyzed;
    determining the category of the statement to be analyzed by using the method according to any one of claims 1 to 10;
    if the statement to be analyzed belongs to the first category, analyzing the semantics of the statement to be analyzed according to the statement to be analyzed to obtain the semantics of the statement to be analyzed;
    if the statement to be analyzed belongs to the second category, analyzing the semantics of the statement to be analyzed according to the statement to be analyzed and the preset quantity of preceding statements to obtain the semantics of the statement to be analyzed; and
    performing a corresponding operation according to the semantics of the statement to be analyzed.
  12. A session information processing method, performed by an electronic device, the method comprising:
    extracting, from a session, a statement to be analyzed and a preset quantity of preceding statements of the statement to be analyzed;
    performing word segmentation on the statement to be analyzed and the preset quantity of preceding statements to obtain a first feature set comprising a plurality of first features;
    extracting, from a first word set corresponding to the statement to be analyzed and a second word set corresponding to the preset quantity of preceding statements, a second feature set comprising one or more second features, wherein one second feature comprises a phrase or statement composed of a first word and a second word, the first word being one or more words in the first word set, and the second word being one or more words in the second word set; and
    determining, according to the first feature set and the second feature set, a statement category to which the statement to be analyzed belongs, wherein the statement category comprises a first category indicating that a statement is complete and semantically unambiguous, or a second category indicating that a statement is incomplete or semantically ambiguous.
  13. A session information processing method, performed by an electronic device, the method comprising:
    receiving a statement in a session and using it as a statement to be analyzed;
    determining the category of the statement to be analyzed by using the method according to claim 12;
    if the statement to be analyzed belongs to the first category, analyzing the semantics of the statement to be analyzed according to the statement to be analyzed to obtain the semantics of the statement to be analyzed;
    if the statement to be analyzed belongs to the second category, analyzing the semantics of the statement to be analyzed according to the statement to be analyzed and the preset quantity of preceding statements to obtain the semantics of the statement to be analyzed; and
    performing a corresponding operation according to the semantics of the statement to be analyzed.
  14. A session information processing apparatus, comprising:
    one or more memories; and
    one or more processors, wherein
    the one or more memories store one or more instruction modules configured to be executed by the one or more processors, wherein
    the one or more instruction units comprise:
    a first extraction unit, configured to extract, from a session, a statement to be analyzed and a preset quantity of preceding statements of the statement to be analyzed;
    a word segmentation unit, configured to perform word segmentation on the statement to be analyzed and the preset quantity of preceding statements to obtain a first feature set comprising a plurality of first features;
    a second extraction unit, configured to extract, from a first word set corresponding to the statement to be analyzed and a second word set corresponding to the preset quantity of preceding statements, a second feature set comprising one or more second features, wherein one second feature comprises a phrase or statement composed of a first word and a second word, the first word being one or more words in the first word set, and the second word being one or more words in the second word set; and
    a determining unit, configured to determine, according to the first feature set and the second feature set, a statement category to which the statement to be analyzed belongs, wherein the statement category comprises a first category indicating that a statement is complete and semantically unambiguous, or a second category indicating that a statement is incomplete or semantically ambiguous.
  15. A session information processing apparatus, comprising:
    one or more memories; and
    one or more processors, wherein
    the one or more memories store one or more instruction modules configured to be executed by the one or more processors, wherein
    the one or more instruction modules comprise:
    a receiving module, configured to receive a statement in a session and use it as a statement to be analyzed;
    a determining module, comprising the one or more instruction units in the apparatus according to claim 14, configured to determine the category of the statement to be analyzed;
    a first analysis module, configured to: when the statement to be analyzed belongs to the first category, analyze the semantics of the statement to be analyzed according to the statement to be analyzed to obtain the semantics of the statement to be analyzed;
    a second analysis module, configured to: when the statement to be analyzed belongs to the second category, analyze the semantics of the statement to be analyzed according to the statement to be analyzed and the preset quantity of preceding statements to obtain the semantics of the statement to be analyzed; and
    an execution module, configured to perform a corresponding operation according to the semantics of the statement to be analyzed.
  16. A non-volatile computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the method according to any one of claims 1 to 10.