WO2017177901A1 - Semantic matching method and smart device - Google Patents

Semantic matching method and smart device

Info

Publication number
WO2017177901A1
WO2017177901A1 (PCT/CN2017/080107; CN2017080107W)
Authority
WO
WIPO (PCT)
Prior art keywords
sentence
semantic
vector
rule
matching
Prior art date
Application number
PCT/CN2017/080107
Other languages
English (en)
French (fr)
Inventor
陈见耸
高鹏
Original Assignee
芋头科技(杭州)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 芋头科技(杭州)有限公司
Publication of WO2017177901A1

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/9032Query formulation
    • G06F16/90332Natural language query formulation or dialogue systems

Definitions

  • The present invention relates to the field of semantic analysis, and in particular to a semantic matching method and a smart device.
  • Information interaction between humans and smart devices may generally include: direct input through an input device (such as a keyboard or mouse), interaction by recognizing the user's gestures, and interaction by recognizing the user's voice. In practice, because natural language (i.e., spoken language) is naturally convenient and friendly to users, information interaction based on natural-language semantic matching and recognition deserves focused development, in the expectation of a better user experience.
  • In the prior art, before a smart device can semantically analyze natural language to support human-computer interaction, a large number of sentence rules must be entered manually to support the semantic matching process. This is a great burden on the user or developer and reduces the efficiency of semantic analysis. Moreover, the usual semantic analysis method matches the sentence rules one-to-one against the sentence to be judged: on a match, the semantics of the matched rule are returned; otherwise, a matching failure is returned.
  • The accuracy of semantic analysis therefore depends on the number of semantic rules entered manually by the user or developer, i.e., the size of the semantic rule database. Because manually entered semantic rules are very limited, the results of semantic analysis are often inaccurate, which degrades the voice-interaction experience.
  • In view of the above, a semantic matching method and a smart device are provided as a technical solution that avoids the traditional need to manually write a large number of semantic sentence rules in advance, thereby reducing the complexity of the semantic matching operation and greatly improving the accuracy of semantic matching.
  • A semantic matching method is applicable to a smart device in which a plurality of rule semantic sentence patterns are preset, and includes:
  • Step S1: obtaining a sentence to be parsed according to the user's input;
  • Step S2: matching the sentence to be parsed to obtain at least one of the rule semantic sentence patterns;
  • Step S3: computing a first sentence vector of the sentence to be parsed, and computing a second sentence vector of each matched rule semantic sentence pattern;
  • Step S4: computing, from the first sentence vector and each second sentence vector, the vector similarity between the sentence to be parsed and each matched rule semantic sentence pattern;
  • Step S5: comparing each vector similarity with a preset similarity threshold, and returning the semantic information of the rule semantic sentence pattern whose vector similarity exceeds the threshold as the semantics of the sentence to be parsed.
  • In the semantic matching method, presetting the rule semantic sentence patterns and establishing an index associated with them includes:
  • Step A1: replacing the key information of each type in a rule semantic sentence pattern with tags of the corresponding types preset by the user;
  • Step A2: treating each tag as one character, and establishing an index of the rule semantic sentence patterns with each character as an index unit.
  • In the semantic matching method, step A2 specifically includes:
  • Step A21: listing, by means of a hashed inverted index, all index units that appear in any of the rule semantic sentence patterns;
  • Step A22: linking, after each index unit, the sequence numbers of all rule semantic sentence patterns associated with that index unit.
  • In the semantic matching method, before step S2 is performed, the key information of each type in the sentence to be parsed is first replaced with the tags of the corresponding types;
  • step S2 then specifically includes:
  • Step S21: treating each tag as one character, and, with each character of the sentence to be parsed as a retrieval unit, searching the index of the rule semantic sentence patterns to retrieve at least one rule semantic sentence pattern matching the sentence to be parsed;
  • Step S22: computing the matching degree between each retrieved rule semantic sentence pattern and the sentence to be parsed;
  • Step S23: comparing the matching degree of each retrieved rule semantic sentence pattern with a preset matching-degree threshold, and retaining the rule semantic sentence patterns whose matching degree exceeds the threshold;
  • Step S24: outputting the retained rule semantic sentence patterns as the matched rule semantic sentence patterns.
  • In the semantic matching method, the matching degree in step S22 is calculated according to the formula S = (S1 + S2)/2, where S denotes the matching degree;
  • S1 denotes the proportion of the sentence to be parsed that is covered by the matching part between the sentence to be parsed and the rule semantic sentence pattern;
  • S2 denotes the proportion of the rule semantic sentence pattern that is covered by that matching part.
  • In the semantic matching method, a vector processing model is formed by training in advance;
  • the method for calculating the first sentence vector includes:
  • Step S31a: performing word segmentation on the sentence to be parsed;
  • Step S32a: inputting each word of the segmented sentence to be parsed into the vector processing model, to obtain the word vector associated with each word;
  • Step S33a: obtaining the first sentence vector of the sentence to be parsed from all the word vectors.
  • In the semantic matching method, a vector processing model is formed by training in advance;
  • the method for calculating the second sentence vector includes:
  • Step S31b: performing word segmentation on a rule semantic sentence pattern;
  • Step S32b: inputting each word of the segmented rule semantic sentence pattern into the vector processing model, to obtain the word vector associated with each word;
  • Step S33b: obtaining the second sentence vector of the rule semantic sentence pattern from all the word vectors.
  • In the semantic matching method, training the vector processing model includes:
  • Step B1: obtaining a plurality of preset items of corpus information;
  • Step B2: performing word segmentation on each item of corpus information;
  • Step B3: feeding each segmented item of corpus information to the vector processing model as input, and outputting the word vectors corresponding to the different words;
  • Step B4: after training on a plurality of items of corpus information, finally obtaining the trained vector processing model.
  • Each item of corpus information comprises one sentence, one paragraph, or several paragraphs of text.
  • In the semantic matching method, step S4 includes:
  • obtaining the vector similarity by direct measurement with a cosine-similarity calculation;
  • or first calculating the vector distance between the first sentence vector and the corresponding second sentence vector, and then converting that vector distance into the corresponding vector similarity.
  • A smart device in which the above semantic matching method is employed.
  • The beneficial effects of the above technical solution are: a semantic matching method is provided that removes the traditional need to manually write a large number of semantic sentence rules, reduces the complexity of the semantic matching operation, and greatly improves the accuracy of semantic matching.
  • FIG. 1 is a schematic overall flow chart of a semantic matching method in a preferred embodiment of the present invention;
  • FIG. 2 is a schematic flow chart of establishing an index associated with the rule semantic sentence patterns in a preferred embodiment of the present invention;
  • FIG. 3 is a schematic flow chart of establishing the index with each character as an index unit in a preferred embodiment of the present invention;
  • FIG. 4 is a schematic flow chart of matching at least one rule semantic sentence pattern from the sentence to be parsed in a preferred embodiment of the present invention;
  • FIG. 5 is a schematic flow chart of calculating the first sentence vector in a preferred embodiment of the present invention;
  • FIG. 6 is a schematic flow chart of calculating the second sentence vector in a preferred embodiment of the present invention;
  • FIG. 7 is a schematic flow chart of training to form the vector processing model in a preferred embodiment of the present invention.
  • A semantic matching method is provided that is applicable to a smart device, for example a mobile terminal, or another smart device such as an intelligent robot.
  • In this method, a plurality of rule semantic sentence patterns are preset in the smart device.
  • Every rule semantic sentence pattern has the same preset format, which is described in detail below.
  • The semantic matching method specifically includes:
  • Step S1: obtaining a sentence to be parsed according to the user's input;
  • Step S2: matching the sentence to be parsed to obtain at least one rule semantic sentence pattern;
  • Step S3: computing a first sentence vector of the sentence to be parsed, and computing a second sentence vector of each matched rule semantic sentence pattern;
  • Step S4: computing, from the first sentence vector and each second sentence vector, the vector similarity between the sentence to be parsed and each matched rule semantic sentence pattern;
  • Step S5: comparing each vector similarity with a preset similarity threshold, and returning the semantic information of the rule semantic sentence pattern whose vector similarity exceeds the threshold as the semantics of the sentence to be parsed.
  • In a specific embodiment, the sentence to be parsed is first obtained from the user's input.
  • The user may, for example, speak a natural-language sentence into a sound pickup (such as a microphone) on the smart device, and the smart device takes the natural language input by the user as the sentence to be parsed.
  • At least one rule semantic sentence pattern is then matched from the sentence to be parsed.
  • The matching rule is roughly: converting the sentence to be parsed into the preset format described above, and then matching the converted sentence against the rule semantic sentence patterns to obtain at least one matching pattern. This process is described in detail below.
  • After the at least one rule semantic sentence pattern is matched, the first sentence vector of the sentence to be parsed and the second sentence vector of each matched pattern are computed, the vector similarity between the first sentence vector and each second sentence vector is calculated, and that similarity is taken as the similarity of the corresponding rule semantic sentence pattern with respect to the sentence to be parsed.
  • According to these similarities, the rule semantic sentence pattern that finally matches the sentence to be parsed is determined, and its semantic information is used as the semantic information of the sentence to be parsed, so that subsequent voice-interaction operations can be performed.
  • If no matching rule semantic sentence pattern is found by vector similarity (i.e., no pattern's vector similarity exceeds the similarity threshold), a prompt of interaction failure is returned directly.
  • The method of presetting the rule semantic sentence patterns and establishing the index associated with them, as shown in FIG. 2, specifically includes:
  • Step A1: replacing the key information of each type in a rule semantic sentence pattern with tags of the corresponding types preset by the user;
  • Step A2: treating each tag as one character, and establishing an index of the rule semantic sentence patterns with each character as an index unit.
  • In other words, the preset format is a sentence format in which each character of the rule semantic sentence pattern serves as an index unit. Specifically, the user first presets a number of tags of different types, then replaces the corresponding content in the rule semantic sentence pattern with those tags, and finally treats each tag as one character and builds the index with one character as one index unit.
  • For example, a rule semantic sentence pattern such as "at a certain time (departure time), from the origin to the destination" can be converted into the preset format: departure time + from + origin + to + destination.
  • Step A2 specifically includes:
  • Step A21: listing, by means of a hashed inverted index, all index units that appear in any of the rule semantic sentence patterns;
  • Step A22: linking, after each index unit, the sequence numbers of all rule semantic sentence patterns associated with that index unit.
  • That is, every index unit appearing in any rule semantic sentence pattern is listed using a hashed inverted index, and each index unit is followed by links to the sequence numbers of every rule semantic sentence pattern that contains it, thus forming a complete index directory of the rule semantic sentence patterns.
  • During actual retrieval, all matching rule semantic sentence patterns can then be found directly from the index directory according to the index units contained in the sentence to be parsed.
  • Before step S2 is performed, the key information of each type in the sentence to be parsed is first replaced with the tags of the corresponding types;
  • step S2 is then specifically as shown in FIG. 4, and includes:
  • Step S21: treating each tag as one character, and, with each character of the sentence to be parsed as a retrieval unit, searching the index of the rule semantic sentence patterns to retrieve at least one rule semantic sentence pattern matching the sentence to be parsed;
  • Step S22: computing the matching degree between each retrieved rule semantic sentence pattern and the sentence to be parsed;
  • Step S23: comparing the matching degree of each retrieved rule semantic sentence pattern with a preset matching-degree threshold, and retaining the rule semantic sentence patterns whose matching degree exceeds the threshold;
  • Step S24: outputting the retained rule semantic sentence patterns as the matched rule semantic sentence patterns.
  • To facilitate matching between the sentence to be parsed and the rule semantic sentence patterns, the sentence to be parsed likewise needs to be converted into the preset format before matching. That is:
  • each character of the sentence to be parsed that has been converted into the preset format is used as a retrieval unit for searching the index directory formed above, thereby obtaining all matching rule semantic sentence patterns.
  • Specifically, the index units of the sentence to be parsed may be searched one by one, and all rule semantic sentence patterns associated with each index unit contained in the sentence are retrieved and output.
  • The above process is only a preliminary retrieval-and-matching step, and the number of rule semantic sentence patterns retrieved in it may be very large.
  • To narrow the matching range further, the following processing is performed on the retrieved rule semantic sentence patterns:
  • The matching degree of each retrieved pattern is calculated according to the formula S = (S1 + S2)/2, where S denotes the matching degree;
  • S1 denotes the proportion of the sentence to be parsed that is covered by the matching part between the sentence to be parsed and the rule semantic sentence pattern;
  • S2 denotes the proportion of the rule semantic sentence pattern that is covered by that matching part.
  • For example, if the sentence to be parsed contains the index units 1+2+3+4+5 and the matching rule semantic sentence pattern contains the index units 1+3+4+6+7+8+9, the matching part (1, 3, 4) covers 3/5 of the sentence to be parsed,
  • and, in the same example, the matching part (1, 3, 4) covers 3/7 of the rule semantic sentence pattern.
  • After the matching degree is calculated, it is compared with the preset matching-degree threshold: if the matching degree exceeds the threshold, the corresponding rule semantic sentence pattern is retained; otherwise it is discarded.
  • Through this processing, the matching range is finally narrowed and at least one rule semantic sentence pattern is retained.
  • A vector processing model is formed by training before the above semantic matching method is performed. This vector processing model is used to obtain the word vectors of different words.
  • The above step S3 can be divided into a part that calculates the first sentence vector and a part that calculates the second sentence vector.
  • The method for calculating the first sentence vector includes:
  • Step S31a: performing word segmentation on the sentence to be parsed;
  • Step S32a: inputting each word of the segmented sentence to be parsed into the vector processing model, to obtain the word vector associated with each word;
  • Step S33a: obtaining the first sentence vector of the sentence to be parsed from all the word vectors.
  • Word segmentation means dividing the sentence to be parsed into separate words, i.e., converting it into a combination of distinct words. For example, 从北京到上海的飞机 (the airplane from Beijing to Shanghai) can be segmented as 从 + 北京 + 到 + 上海 + 的 + 飞机 (from + Beijing + to + Shanghai + 的 + airplane).
  • Many implementations of such segmentation rules exist in the prior art and are not described again here.
  • The segmented sentence to be parsed may be represented by adding a special mark between words, such as 从'北京'到'上海'的'飞机.
  • The text between two special marks is one word.
  • After segmentation, each word of the sentence to be parsed is fed into the trained vector processing model as input, to obtain the word vector of each word.
  • The word vectors of the individual words are then combined to form the first sentence vector of the sentence to be parsed.
  • Similarly, the method in step S3 for computing the second sentence vector is as shown in FIG. 6, and includes:
  • Step S31b: performing word segmentation on a rule semantic sentence pattern;
  • Step S32b: inputting each word of the segmented rule semantic sentence pattern into the vector processing model, to obtain the word vector associated with each word;
  • Step S33b: obtaining the second sentence vector of the rule semantic sentence pattern from all the word vectors.
  • The method of training the vector processing model includes:
  • Step B1: obtaining a plurality of preset items of corpus information;
  • Step B2: performing word segmentation on each item of corpus information;
  • Step B3: feeding each segmented item of corpus information to the vector processing model as input, and outputting the word vectors corresponding to the different words;
  • Step B4: after training on a plurality of items of corpus information, finally obtaining the trained vector processing model.
  • Each item of corpus information comprises one sentence, one paragraph, or several paragraphs of text,
  • for example a sentence, a paragraph, or an entire article retrieved at random from the Internet. Since the number of training samples determines the accuracy of the vector processing model (the more training samples, the more accurate the model), a large amount of corpus information can be gathered at random from the network and used as training input for the vector processing model.
  • Likewise, each item of corpus information is segmented, the different words contained in it are input into a neural network, and the corresponding outputs are obtained through the processing of the neural network.
  • After training on the words contained in a large amount of corpus information, the trained vector processing model is obtained.
  • In step S4, the method for obtaining the vector similarity includes:
  • measuring the vector similarity directly from the first sentence vector and the corresponding second sentence vector using a cosine-similarity calculation;
  • or first calculating the vector distance between the first sentence vector and the corresponding second sentence vector, and then converting that vector distance into the corresponding vector similarity.
  • After the similarity between the first sentence vector and the second sentence vector of a rule semantic sentence pattern is obtained, it is judged whether the vector similarity exceeds the preset similarity threshold, and the semantic information of the rule semantic sentence pattern whose vector similarity exceeds the threshold is used as the semantic information of the sentence to be parsed, as the basis for subsequent information-interaction processing.
  • In one preferred embodiment, the best semantic recognition result in the course of information interaction should be a single definite result. Since several rule semantic sentence patterns may have a vector similarity above the threshold and be retained, they are ranked by vector similarity, the pattern with the highest similarity is taken, and its semantic information is used as the semantic information of the sentence to be parsed.
  • In another preferred embodiment, the best option recognized automatically by the smart device may not be the result the user wants, so multiple semantic recognition results may be offered for the user to select. For example, a number of options, say four, is set in advance. The number of rule semantic sentence patterns retained by the vector-similarity judgment is then checked: if it is greater than 4, the four patterns with the highest vector similarity are kept; otherwise, all are kept. The retained patterns are then displayed as options for the user to select. Finally, the semantic information of the pattern selected by the user is used as the semantic information of the sentence to be parsed for subsequent interaction processing.

Abstract

A semantic matching method and a smart device. The method includes: obtaining a sentence to be parsed according to the user's input (S1); matching the sentence to be parsed to obtain at least one rule semantic sentence pattern (S2); computing a first sentence vector of the sentence to be parsed, and computing a second sentence vector of each matched rule semantic sentence pattern (S3); computing, from the first sentence vector and each second sentence vector, the vector similarity between the sentence to be parsed and each matched rule semantic sentence pattern (S4); and comparing each vector similarity with a preset similarity threshold and returning, as the semantics of the sentence to be parsed, the semantic information of the rule semantic sentence pattern whose vector similarity exceeds the threshold (S5). The beneficial effects of this technical solution are: it removes the need to manually write a large number of semantic sentence rules for semantic matching, reduces the complexity of the semantic matching operation, and greatly improves the accuracy of semantic matching.

Description

Semantic matching method and smart device
Technical Field
The present invention relates to the field of semantic analysis, and in particular to a semantic matching method and a smart device.
Background
With the popularization of smart devices, how humans and smart devices can interact more directly and amicably has become a fairly important problem. At present, information interaction between humans and smart devices generally includes: direct input through an input device (such as a keyboard or mouse), interaction by recognizing the user's gestures, and interaction by recognizing the user's voice. In practice, because natural language (i.e., spoken language) is naturally convenient and friendly to users, information interaction based on natural-language semantic matching and recognition deserves focused development, in the expectation of a better user experience.
In the prior art, however, before a smart device can semantically analyze natural language to support human-computer interaction, a large number of sentence rules must be entered manually to support the semantic matching process, which is a great burden on the user or developer and thus reduces the efficiency of semantic analysis. Moreover, the usual semantic analysis method matches the sentence rules one-to-one against the sentence to be judged: on a match, the semantics of the matched rule are returned; otherwise, a matching failure is returned. The accuracy of semantic analysis therefore depends on the number of semantic rules entered manually by the user or developer, i.e., the size of the semantic rule database. Because manually entered semantic rules are very limited, the results of semantic analysis are usually inaccurate, which degrades the voice-interaction experience.
Summary of the Invention
In view of the above problems in the prior art, a semantic matching method and a smart device are now provided as a technical solution, aiming to avoid the traditional need to manually write a large number of semantic sentence rules in advance, thereby reducing the complexity of the semantic matching operation and greatly improving the accuracy of semantic matching.
The technical solution specifically includes:
A semantic matching method, applicable to a smart device, wherein a plurality of rule semantic sentence patterns are preset in the smart device, the method further comprising:
Step S1: obtaining a sentence to be parsed according to the user's input;
Step S2: matching the sentence to be parsed to obtain at least one of the rule semantic sentence patterns;
Step S3: computing a first sentence vector of the sentence to be parsed, and computing a second sentence vector of each matched rule semantic sentence pattern;
Step S4: computing, from the first sentence vector and each second sentence vector, the vector similarity between the sentence to be parsed and each matched rule semantic sentence pattern;
Step S5: comparing each vector similarity with a preset similarity threshold, and returning, as the semantics of the sentence to be parsed, the semantic information of the rule semantic sentence pattern whose vector similarity exceeds the threshold.
Preferably, in the semantic matching method, presetting the rule semantic sentence patterns and establishing an index associated with them includes:
Step A1: replacing the key information of each type in a rule semantic sentence pattern with tags of the corresponding types preset by the user;
Step A2: treating each tag as one character, and establishing an index of the rule semantic sentence patterns with each character as an index unit.
Preferably, in the semantic matching method, step A2 specifically includes:
Step A21: listing, by means of a hashed inverted index, all index units that appear in any of the rule semantic sentence patterns;
Step A22: linking, after each index unit, the sequence numbers of all rule semantic sentence patterns associated with that index unit.
Preferably, in the semantic matching method, before step S2 is performed, the key information of each type in the sentence to be parsed is first replaced with the tags of the corresponding types;
step S2 then specifically includes:
Step S21: treating each tag as one character, and, with each character of the sentence to be parsed as a retrieval unit, searching the index of the rule semantic sentence patterns to retrieve at least one rule semantic sentence pattern matching the sentence to be parsed;
Step S22: computing the matching degree between each retrieved rule semantic sentence pattern and the sentence to be parsed;
Step S23: comparing the matching degree of each retrieved rule semantic sentence pattern with a preset matching-degree threshold, and retaining the rule semantic sentence patterns whose matching degree exceeds the threshold;
Step S24: outputting the retained rule semantic sentence patterns as the matched rule semantic sentence patterns.
Preferably, in the semantic matching method, the matching degree in step S22 is calculated according to the following formula:
S = (S1 + S2)/2;
where S denotes the matching degree;
S1 denotes the proportion of the sentence to be parsed that is covered by the matching part between the sentence to be parsed and the rule semantic sentence pattern;
S2 denotes the proportion of the rule semantic sentence pattern that is covered by that matching part.
Preferably, in the semantic matching method, a vector processing model is formed by training in advance;
in step S3, the method for calculating the first sentence vector includes:
Step S31a: performing word segmentation on the sentence to be parsed;
Step S32a: inputting each word of the segmented sentence to be parsed into the vector processing model, to obtain the word vector associated with each word;
Step S33a: obtaining the first sentence vector of the sentence to be parsed from all the word vectors.
Preferably, in the semantic matching method, a vector processing model is formed by training in advance;
in step S3, the method for calculating the second sentence vector includes:
Step S31b: performing word segmentation on a rule semantic sentence pattern;
Step S32b: inputting each word of the segmented rule semantic sentence pattern into the vector processing model, to obtain the word vector associated with each word;
Step S33b: obtaining the second sentence vector of the rule semantic sentence pattern from all the word vectors.
Preferably, in the semantic matching method, training the vector processing model includes:
Step B1: obtaining a plurality of preset items of corpus information;
Step B2: performing word segmentation on each item of corpus information;
Step B3: feeding each segmented item of corpus information to the vector processing model as input, and outputting the word vectors corresponding to the different words;
Step B4: after training on a plurality of items of corpus information, finally obtaining the trained vector processing model;
each item of corpus information comprising:
one sentence of content; or
one paragraph of content; or
several paragraphs of content.
Preferably, in the semantic matching method, the method in step S4 for obtaining the vector similarity includes:
measuring the vector similarity directly using a cosine-similarity calculation;
or
first calculating the vector distance between the first sentence vector and the corresponding second sentence vector, and then converting that vector distance into the corresponding vector similarity.
A smart device, wherein the above semantic matching method is employed.
The beneficial effects of the above technical solution are: a semantic matching method is provided that removes the traditional need to manually write a large number of semantic sentence rules in advance, reduces the complexity of the semantic matching operation, and greatly improves the accuracy of semantic matching.
Brief Description of the Drawings
FIG. 1 is a schematic overall flow chart of a semantic matching method in a preferred embodiment of the present invention;
FIG. 2 is a schematic flow chart of establishing an index associated with the rule semantic sentence patterns in a preferred embodiment of the present invention;
FIG. 3 is a schematic flow chart of establishing the index with each character as an index unit in a preferred embodiment of the present invention;
FIG. 4 is a schematic flow chart of matching at least one rule semantic sentence pattern from the sentence to be parsed in a preferred embodiment of the present invention;
FIG. 5 is a schematic flow chart of calculating the first sentence vector in a preferred embodiment of the present invention;
FIG. 6 is a schematic flow chart of calculating the second sentence vector in a preferred embodiment of the present invention;
FIG. 7 is a schematic flow chart of training to form the vector processing model in a preferred embodiment of the present invention.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
It should be noted that, in the absence of conflict, the embodiments of the present invention and the features in the embodiments may be combined with one another.
The present invention is further described below with reference to the drawings and specific embodiments, which are not to be taken as limiting the invention.
In a preferred embodiment of the present invention, in view of the above problems in the prior art, a semantic matching method is provided that is applicable to a smart device, for example a mobile terminal, or another smart device such as an intelligent robot. In this method, a plurality of rule semantic sentence patterns are preset in the smart device. Every rule semantic sentence pattern has the same preset format, which is described in detail below.
In a preferred embodiment of the present invention, as shown in FIG. 1, the semantic matching method specifically includes:
Step S1: obtaining a sentence to be parsed according to the user's input;
Step S2: matching the sentence to be parsed to obtain at least one rule semantic sentence pattern;
Step S3: computing a first sentence vector of the sentence to be parsed, and computing a second sentence vector of each matched rule semantic sentence pattern;
Step S4: computing, from the first sentence vector and each second sentence vector, the vector similarity between the sentence to be parsed and each matched rule semantic sentence pattern;
Step S5: comparing each vector similarity with a preset similarity threshold, and returning, as the semantics of the sentence to be parsed, the semantic information of the rule semantic sentence pattern whose vector similarity exceeds the threshold.
In a specific embodiment, the sentence to be parsed is first obtained from the user's input. The user may, for example, speak a natural-language sentence into a sound pickup (such as a microphone) on the smart device, and the smart device takes the natural language input by the user as the sentence to be parsed.
In this embodiment, after the sentence to be parsed is obtained, at least one rule semantic sentence pattern is matched from it. The matching rule is roughly: converting the sentence to be parsed into the preset format described above, and then matching the converted sentence against the rule semantic sentence patterns to obtain at least one matching pattern. This process is described in detail below.
In this embodiment, after the at least one rule semantic sentence pattern is matched, the first sentence vector of the sentence to be parsed and the second sentence vector of each matched pattern are computed, the vector similarity between the first sentence vector and each second sentence vector is calculated, and that similarity is taken as the similarity of the corresponding rule semantic sentence pattern with respect to the sentence to be parsed.
Finally, according to these similarities, the rule semantic sentence pattern that finally matches the sentence to be parsed is determined, and its semantic information is used as the semantic information of the sentence to be parsed, so that subsequent voice-interaction operations can be performed.
In a preferred embodiment of the present invention, if no matching rule semantic sentence pattern is found by vector similarity (i.e., no pattern's vector similarity exceeds the similarity threshold), a prompt of interaction failure is returned directly.
In a preferred embodiment of the present invention, the rule semantic sentence patterns are preset before the above steps, and the method of establishing the index associated with them, as shown in FIG. 2, specifically includes:
Step A1: replacing the key information of each type in a rule semantic sentence pattern with tags of the corresponding types preset by the user;
Step A2: treating each tag as one character, and establishing an index of the rule semantic sentence patterns with each character as an index unit.
In other words, in a preferred embodiment, the preset format is a sentence format in which each character of the rule semantic sentence pattern serves as an index unit. Specifically, the user first presets a number of tags of different types, then replaces the corresponding content in the rule semantic sentence pattern with those tags, and finally treats each tag as one character and builds the index of the rule semantic sentence patterns with one character as one index unit.
For example, for similar domains such as airplane tickets, train tickets, and bus tickets, one tag may be named "origin", another "destination", and another "departure time".
A rule semantic sentence pattern such as "at a certain time (departure time), from the origin to the destination" can then be converted into the preset format: departure time + from + origin + to + destination.
Further, in a preferred embodiment of the present invention, as shown in FIG. 3, step A2 specifically includes:
Step A21: listing, by means of a hashed inverted index, all index units that appear in any of the rule semantic sentence patterns;
Step A22: linking, after each index unit, the sequence numbers of all rule semantic sentence patterns associated with that index unit.
Specifically, in a preferred embodiment, every index unit appearing in any rule semantic sentence pattern is listed using a hashed inverted index, and each index unit is followed by links to the sequence numbers of every rule semantic sentence pattern that contains it, thus forming a complete index directory of the rule semantic sentence patterns.
During actual retrieval and matching, all matching rule semantic sentence patterns can then be found directly from the index directory according to the index units contained in the sentence to be parsed.
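The inverted-index construction and lookup described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the tag names and sentence patterns are hypothetical, and each pattern is represented as a pre-tokenized list of index units (tags or characters) as the text describes.

```python
from collections import defaultdict

def build_inverted_index(patterns):
    """Map each index unit (a tag or character) to the sequence numbers
    of the rule semantic sentence patterns that contain it."""
    index = defaultdict(set)
    for seq_no, units in enumerate(patterns):
        for unit in units:
            index[unit].add(seq_no)
    return index

def retrieve(index, query_units):
    """Return every pattern number linked to any index unit appearing
    in the (already tag-substituted) sentence to be parsed."""
    matched = set()
    for unit in query_units:
        matched |= index.get(unit, set())
    return matched

# Hypothetical patterns, already converted to the preset format.
patterns = [
    ["<departure_time>", "from", "<origin>", "to", "<destination>"],
    ["weather", "in", "<destination>"],
]
index = build_inverted_index(patterns)
query = ["<departure_time>", "from", "<origin>", "to", "<destination>"]
print(sorted(retrieve(index, query)))  # [0, 1]
```

As in the text, this preliminary retrieval returns every pattern sharing at least one index unit with the query (here both patterns, since both contain `<destination>`), so a matching-degree filter is still needed afterwards.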
In a preferred embodiment of the present invention, before step S2 is performed, the key information of each type in the sentence to be parsed is first replaced with the tags of the corresponding types;
step S2 is then specifically as shown in FIG. 4, and includes:
Step S21: treating each tag as one character, and, with each character of the sentence to be parsed as a retrieval unit, searching the index of the rule semantic sentence patterns to retrieve at least one rule semantic sentence pattern matching the sentence to be parsed;
Step S22: computing the matching degree between each retrieved rule semantic sentence pattern and the sentence to be parsed;
Step S23: comparing the matching degree of each retrieved rule semantic sentence pattern with a preset matching-degree threshold, and retaining the rule semantic sentence patterns whose matching degree exceeds the threshold;
Step S24: outputting the retained rule semantic sentence patterns as the matched rule semantic sentence patterns.
Specifically, in a preferred embodiment, to facilitate matching between the sentence to be parsed and the rule semantic sentence patterns, the sentence to be parsed likewise needs to be converted into the preset format before matching. That is:
First, the corresponding key information in the sentence to be parsed is replaced with the tags of the different types. For example, the sentence to be parsed 15时30分从北京到上海的飞机 (the 15:30 airplane from Beijing to Shanghai) can be converted into: departure time (15:30) + from + origin (北京/Beijing) + to + destination (上海/Shanghai) + 的 + vehicle (airplane), where departure time, origin, destination, and vehicle are all preset tags.
Then, each character of the sentence to be parsed that has been converted into the preset format serves as a retrieval unit for searching the index directory formed above, thereby obtaining all matching rule semantic sentence patterns. Specifically, the index units of the sentence to be parsed may be searched one by one, and all rule semantic sentence patterns associated with each index unit contained in the sentence are retrieved and output.
In a preferred embodiment of the present invention, the above process is only a preliminary retrieval-and-matching step, and the number of rule semantic sentence patterns retrieved in it may be very large. To narrow the matching range further, the following processing is performed on the retrieved rule semantic sentence patterns:
the matching degree between each rule semantic sentence pattern and the sentence to be parsed is calculated, and the matching range is narrowed according to the matching degree; for example, a matching-degree threshold is set, and only the rule semantic sentence patterns whose matching degree exceeds that threshold are retained.
In a preferred embodiment of the present invention, the matching degree can be calculated according to the following formula:
S = (S1 + S2)/2;              (1)
where S denotes the matching degree;
S1 denotes the proportion of the sentence to be parsed that is covered by the matching part between the sentence to be parsed and the rule semantic sentence pattern;
S2 denotes the proportion of the rule semantic sentence pattern that is covered by that matching part.
Specifically, regarding the proportion of the sentence to be parsed covered by the matching part: for example, if the sentence to be parsed contains the index units 1+2+3+4+5 and the matching rule semantic sentence pattern contains the index units 1+3+4+6+7+8+9, the matching part (1, 3, 4) covers 3/5 of the sentence to be parsed.
Similarly, regarding the proportion of the rule semantic sentence pattern covered by the matching part: in the same example, the matching part (1, 3, 4) covers 3/7 of the rule semantic sentence pattern.
According to formula (1), the final matching degree S is then (3/5 + 3/7)/2 = 18/35.
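Formula (1) and the worked example above can be reproduced directly. This is a sketch rather than the patent's code; the index units are represented as the integers used in the example, and exact fractions are used so the result matches the text's 18/35.

```python
from fractions import Fraction

def matching_degree(parse_units, pattern_units):
    """S = (S1 + S2) / 2, where S1 is the share of the sentence to be
    parsed covered by the matching part, and S2 is the share of the
    rule semantic sentence pattern covered by the matching part."""
    common = set(parse_units) & set(pattern_units)
    s1 = Fraction(len(common), len(parse_units))
    s2 = Fraction(len(common), len(pattern_units))
    return (s1 + s2) / 2

# The example from the text: units 1+2+3+4+5 vs 1+3+4+6+7+8+9.
s = matching_degree([1, 2, 3, 4, 5], [1, 3, 4, 6, 7, 8, 9])
print(s)  # 18/35
```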
In a preferred embodiment of the present invention, after the matching degree is calculated, it is compared with a preset matching-degree threshold: if the matching degree exceeds the threshold, the corresponding rule semantic sentence pattern is retained; otherwise it is discarded.
Through this processing, the matching range is finally narrowed and at least one rule semantic sentence pattern is retained.
In a preferred embodiment of the present invention, a vector processing model is formed by training before the above semantic matching method is performed. This vector processing model is used to obtain the word vectors of different words.
In a preferred embodiment of the present invention, the above step S3 can therefore be divided into a part that calculates the first sentence vector and a part that calculates the second sentence vector.
In a preferred embodiment of the present invention, as shown in FIG. 5, the method of calculating the first sentence vector specifically includes:
Step S31a: performing word segmentation on the sentence to be parsed;
Step S32a: inputting each word of the segmented sentence to be parsed into the vector processing model, to obtain the word vector associated with each word;
Step S33a: obtaining the first sentence vector of the sentence to be parsed from all the word vectors.
Specifically, in a preferred embodiment, word segmentation means dividing the sentence to be parsed into separate words, i.e., converting it into a combination of distinct words. For example, 从北京到上海的飞机 (the airplane from Beijing to Shanghai) can be segmented as 从 + 北京 + 到 + 上海 + 的 + 飞机. Many implementations of such segmentation rules already exist in the prior art and are not described again here.
In a preferred embodiment of the present invention, the segmented sentence to be parsed may be represented by adding a special mark between words, such as 从'北京'到'上海'的'飞机; the text between two special marks is one word.
In a preferred embodiment of the present invention, after segmentation, each word of the sentence to be parsed is fed into the trained vector processing model as input, to obtain the word vector of each word.
Finally, in a preferred embodiment of the present invention, the word vectors of the individual words are combined to form the first sentence vector of the sentence to be parsed.
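The text says the word vectors are "combined" into a sentence vector without fixing the operation; element-wise averaging is one common choice. The sketch below assumes averaging and uses hypothetical two-dimensional toy vectors in place of a trained vector processing model.

```python
def sentence_vector(words, word_vectors):
    """Average the word vectors of a segmented sentence into one
    sentence vector (one plausible reading of 'combine')."""
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    for w in words:
        # Each word's vector would come from the trained model;
        # here a toy lookup table stands in for it.
        total = [t + v for t, v in zip(total, word_vectors[w])]
    return [t / len(words) for t in total]

# Hypothetical 2-D word vectors standing in for the model's output.
toy_model = {"from": [1.0, 0.0], "Beijing": [0.0, 1.0],
             "to": [1.0, 0.0], "Shanghai": [0.0, 1.0]}
print(sentence_vector(["from", "Beijing", "to", "Shanghai"], toy_model))
# [0.5, 0.5]
```

The same routine would produce the second sentence vector when applied to a segmented rule semantic sentence pattern.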
In a preferred embodiment of the present invention, similarly to the method of obtaining the first sentence vector above, the method in step S3 for obtaining the second sentence vector is as shown in FIG. 6 and specifically includes:
Step S31b: performing word segmentation on a rule semantic sentence pattern;
Step S32b: inputting each word of the segmented rule semantic sentence pattern into the vector processing model, to obtain the word vector associated with each word;
Step S33b: obtaining the second sentence vector of the rule semantic sentence pattern from all the word vectors.
This process is similar to steps S31a-S33a above and is not described again here.
In a preferred embodiment of the present invention, the method of training the vector processing model includes:
Step B1: obtaining a plurality of preset items of corpus information;
Step B2: performing word segmentation on each item of corpus information;
Step B3: feeding each segmented item of corpus information to the vector processing model as input, and outputting the word vectors corresponding to the different words;
Step B4: after training on a plurality of items of corpus information, finally obtaining the trained vector processing model.
Specifically, in a preferred embodiment, each item of corpus information comprises one sentence, one paragraph, or several paragraphs of content, for example a sentence, a paragraph, or an entire article retrieved at random from the network. Since the number of training samples determines the accuracy of the vector processing model (the more training samples, the more accurate the model), a large amount of corpus information can be gathered at random from the network and used as training input for the vector processing model.
In a preferred embodiment of the present invention, each item of corpus information is likewise segmented, the different words contained in it are input into a neural network, and the corresponding outputs are obtained through the processing of the neural network. After training on the words contained in a large amount of corpus information, the trained vector processing model is finally obtained. Many implementations of this training process exist in the prior art and are not elaborated here.
In a preferred embodiment of the present invention, the method in step S4 for obtaining the vector similarity includes:
measuring the vector similarity directly from the first sentence vector and the corresponding second sentence vector using a cosine-similarity calculation;
or
first calculating the vector distance between the first sentence vector and the corresponding second sentence vector, and then converting that vector distance into the corresponding vector similarity.
In a preferred embodiment of the present invention, as described above, after the similarity between the first sentence vector of the sentence to be parsed and the second sentence vector of a corresponding rule semantic sentence pattern is calculated, it is judged whether that vector similarity exceeds a preset similarity threshold, and the semantic information of the rule semantic sentence pattern whose vector similarity exceeds the threshold is used as the semantic information of the sentence to be parsed, as the basis for subsequent information-interaction processing.
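Both similarity options in step S4 can be sketched as below. The cosine formula is standard; the patent does not fix the distance-to-similarity conversion, so 1/(1 + d) over the Euclidean distance is used here as one common assumed choice.

```python
import math

def cosine_similarity(u, v):
    """Direct measurement: cos(u, v) = (u . v) / (|u| |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def distance_similarity(u, v):
    """Euclidean distance converted to a similarity in (0, 1];
    the conversion 1/(1+d) is an assumption, not from the patent."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return 1.0 / (1.0 + d)

u, v = [1.0, 0.0], [1.0, 0.0]
print(cosine_similarity(u, v))    # 1.0
print(distance_similarity(u, v))  # 1.0
```

Either value can then be compared against the preset similarity threshold of step S5.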
本发明的一个较佳的实施例中,在信息交互的过程中,最佳的语义识别结果应该为一个确定的结果。而在上述过程中,可能存在多个规则语义句式的向量相似度大于相似度阈值而被保留。此时需要根据向量相似度进行排列,并获取向量相似度最高的一个规则语义句式,并将其语义信息作为待解析语句的语义信息。
In another preferred embodiment of the present invention, the best option automatically recognized by the smart device during information interaction may not be the result the user needs, so several semantic recognition results may be kept for the user to choose from. For example, a number of options is preset, say 4. The number of rule semantic sentence patterns retained by the vector similarity check is then examined: if it is greater than 4, the four patterns with the highest vector similarities are kept; otherwise, all are kept. The retained rule semantic sentence patterns are then displayed as options for the user to choose from. Finally, the semantic information of the rule semantic sentence pattern selected by the user is taken as the semantic information of the sentence to be parsed for subsequent interaction processing.
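The option-selection logic just described can be sketched as follows (the candidate names and the threshold value are made up for illustration):

```python
def select_options(candidates, threshold, max_options=4):
    """candidates: list of (rule_pattern, similarity) pairs. Keep those
    above the similarity threshold, sorted best-first, at most
    max_options of them."""
    retained = [(p, s) for p, s in candidates if s > threshold]
    retained.sort(key=lambda ps: ps[1], reverse=True)
    return retained[:max_options]

candidates = [("p1", 0.9), ("p2", 0.4), ("p3", 0.8),
              ("p4", 0.7), ("p5", 0.6), ("p6", 0.5)]
print(select_options(candidates, 0.45))
# [('p1', 0.9), ('p3', 0.8), ('p4', 0.7), ('p5', 0.6)]
```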
In a preferred embodiment of the present invention, a smart device is also provided, in which the semantic matching method described above is employed.
The above is merely a description of preferred embodiments of the present invention and does not limit its embodiments or scope of protection. Those skilled in the art should appreciate that all solutions obtained through equivalent substitutions and obvious variations based on the description and drawings of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

  1. A semantic matching method, applicable to a smart device, characterized in that a plurality of rule semantic sentence patterns are preset in the smart device, the method further comprising:
    Step S1, obtaining a sentence to be parsed according to a user's input;
    Step S2, matching the sentence to be parsed to obtain at least one of the rule semantic sentence patterns;
    Step S3, processing to obtain a first sentence vector of the sentence to be parsed, and respectively processing to obtain a second sentence vector of each matched rule semantic sentence pattern;
    Step S4, processing to obtain, from the first sentence vector and each of the second sentence vectors respectively, a vector similarity between the sentence to be parsed and each matched rule semantic sentence pattern;
    Step S5, comparing each vector similarity with a preset similarity threshold respectively, and returning the semantic information of the rule semantic sentence pattern whose vector similarity is greater than the similarity threshold as the semantics of the sentence to be parsed.
  2. The semantic matching method according to claim 1, characterized in that the method of presetting the rule semantic sentence patterns and establishing an index associated with the rule semantic sentence patterns comprises:
    Step A1, replacing the key information of each type in the rule semantic sentence pattern with a label of the corresponding type preset by the user;
    Step A2, treating each label as one character and, taking each character as one index unit, establishing the index for the rule semantic sentence pattern.
  3. The semantic matching method according to claim 2, characterized in that step A2 specifically comprises:
    Step A21, listing, in a hash inverted index, the index units that appear in all the rule semantic sentence patterns;
    Step A22, linking, after each index unit, the serial number of every rule semantic sentence pattern associated with that index unit.
  4. The semantic matching method according to claim 2, characterized in that, before step S2 is performed, the key information of each type in the sentence to be parsed is first replaced with a label of the corresponding type;
    step S2 specifically comprises:
    Step S21, treating each label as one character and, taking each character of the sentence to be parsed as one retrieval unit, retrieving at least one rule semantic sentence pattern matching the sentence to be parsed according to the index of the rule semantic sentence patterns;
    Step S22, respectively processing to obtain the matching degree between each retrieved rule semantic sentence pattern and the sentence to be parsed;
    Step S23, respectively comparing the matching degree associated with each retrieved rule semantic sentence pattern with a preset matching degree threshold, and retaining the at least one rule semantic sentence pattern whose matching degree is greater than the matching degree threshold;
    Step S24, outputting the retained at least one rule semantic sentence pattern as the matched rule semantic sentence pattern.
  5. The semantic matching method according to claim 4, characterized in that, in step S22, the matching degree is calculated according to the following formula:
    S = (S1 + S2) / 2;
    where S denotes the matching degree;
    S1 denotes the proportion of the matching portion between the sentence to be parsed and the rule semantic sentence pattern relative to the sentence to be parsed;
    S2 denotes the proportion of the matching portion between the sentence to be parsed and the rule semantic sentence pattern relative to the rule semantic sentence pattern.
  6. The semantic matching method according to claim 1, characterized in that a vector processing model is trained in advance;
    in step S3, the method for calculating the first sentence vector comprises:
    Step S31a, performing word segmentation on a sentence to be parsed;
    Step S32a, inputting each word of the segmented sentence to be parsed into the vector processing model, so as to obtain the word vector associated with each word;
    Step S33a, obtaining the first sentence vector of the sentence to be parsed from all the word vectors.
  7. The semantic matching method according to claim 1, characterized in that a vector processing model is trained in advance;
    in step S3, the method for calculating the second sentence vector comprises:
    Step S31b, performing word segmentation on a rule semantic sentence pattern;
    Step S32b, inputting each word of the segmented rule semantic sentence pattern into the vector processing model, so as to obtain the word vector associated with each word;
    Step S33b, obtaining the second sentence vector of the rule semantic sentence pattern from all the word vectors.
  8. The semantic matching method according to claim 6 or 7, characterized in that the method for training the vector processing model comprises:
    Step B1, acquiring a plurality of preset pieces of corpus information;
    Step B2, performing word segmentation on each piece of corpus information;
    Step B3, taking each segmented piece of corpus information as input to the vector processing model, and outputting the word vectors corresponding to different words according to the vector processing model;
    Step B4, after training on the plurality of pieces of corpus information, finally obtaining the trained vector processing model;
    wherein each piece of corpus information comprises:
    one sentence; or
    one paragraph; or
    multiple paragraphs.
  9. The semantic matching method according to claim 1, characterized in that, in step S4, the method for obtaining the vector similarity comprises:
    directly measuring the vector similarity using a cosine similarity calculation;
    or
    first calculating the vector distance between the first sentence vector and the corresponding second sentence vector, and then converting the vector distance into the corresponding vector similarity.
  10. A smart device, characterized by employing the semantic matching method according to any one of claims 1 to 9.
PCT/CN2017/080107 2016-04-12 2017-04-11 一种语义匹配方法及智能设备 WO2017177901A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610227718.3A CN107291783B (zh) 2016-04-12 2016-04-12 一种语义匹配方法及智能设备
CN201610227718.3 2016-04-12

Publications (1)

Publication Number Publication Date
WO2017177901A1 true WO2017177901A1 (zh) 2017-10-19

Family

ID=60041419

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/080107 WO2017177901A1 (zh) 2016-04-12 2017-04-11 一种语义匹配方法及智能设备

Country Status (3)

Country Link
CN (1) CN107291783B (zh)
TW (1) TWI638274B (zh)
WO (1) WO2017177901A1 (zh)

Also Published As

Publication number Publication date
CN107291783A (zh) 2017-10-24
TW201737120A (zh) 2017-10-16
CN107291783B (zh) 2021-04-30
TWI638274B (zh) 2018-10-11

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17781877

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17781877

Country of ref document: EP

Kind code of ref document: A1