WO2019148797A1 - Natural language processing method, device, computer apparatus, and storage medium - Google Patents

Natural language processing method, device, computer apparatus, and storage medium

Publication number
WO2019148797A1
WO2019148797A1 PCT/CN2018/100169 CN2018100169W WO2019148797A1 WO 2019148797 A1 WO2019148797 A1 WO 2019148797A1 CN 2018100169 W CN2018100169 W CN 2018100169W WO 2019148797 A1 WO2019148797 A1 WO 2019148797A1
Authority
WO
WIPO (PCT)
Prior art keywords
backbone structure
natural language
word
matching
converted
Prior art date
Application number
PCT/CN2018/100169
Other languages
French (fr)
Chinese (zh)
Inventor
吴贞海
Original Assignee
深圳壹账通智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2019148797A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools

Abstract

A natural language processing method, comprising: receiving an input natural language segment, and parsing the input natural language segment by means of a pre-determined natural language parsing database, so as to obtain a natural language dependency tree; extracting a backbone structure from the natural language dependency tree; determining whether a particular interrogative word is present in the extracted backbone structure, and if so, acquiring a type of the particular interrogative word; matching the extracted backbone structure against a first standard sentence, the first standard sentence being stored in a knowledge database and corresponding to the type of the particular interrogative word; and if matching succeeds, extracting a portion corresponding to the particular interrogative word from the first standard sentence, replacing the particular interrogative word in the natural language segment with the extracted portion, and outputting the replaced natural language segment.

Description

Natural language processing method, device, computer apparatus, and storage medium
Cross-reference to related applications
This application claims priority to Chinese Patent Application No. 2018100908467, entitled "Natural language processing method, device, computer apparatus, and storage medium" and filed with the Chinese Patent Office on January 30, 2018, the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to a natural language processing method, device, computer apparatus, and storage medium.
Background
With the development of computer technology, computer natural language generation has emerged as a task within artificial-intelligence pattern recognition. Much current work relies on keyword matching: existing sentences are matched from a very large corpus of real-world language.
However, the inventor has recognized that because current approaches are based on keyword matching, they depend on how accurately the keywords are extracted; when extraction accuracy is low, the matching error rate is high.
Summary
According to various embodiments disclosed in the present application, a natural language processing method, device, computer apparatus, and storage medium are provided.
A natural language processing method, comprising:
receiving input natural language, and parsing the input natural language with a preset natural language parsing library to obtain a natural language dependency tree;
extracting a backbone structure from the natural language dependency tree;
determining whether a special question word is present in the extracted backbone structure and, if so, identifying the type of the special question word;
matching the extracted backbone structure against a first standard sentence, the first standard sentence being stored in a knowledge base and corresponding to the type of the special question word; and
when the matching succeeds, extracting the portion of the first standard sentence that corresponds to the special question word, replacing the special question word in the natural language with the extracted portion, and outputting the replaced natural language.
A natural language processing device, comprising:
a receiving module, configured to receive input natural language and parse the input natural language with a preset natural language parsing library to obtain a natural language dependency tree;
an extraction module, configured to extract a backbone structure from the natural language dependency tree;
a first determining module, configured to determine whether a special question word is present in the extracted backbone structure and, if so, identify the type of the special question word;
a first matching module, configured to match the extracted backbone structure against a first standard sentence, the first standard sentence being stored in a knowledge base and corresponding to the type of the special question word; and
an output module, configured to, when the matching succeeds, extract the portion of the first standard sentence that corresponds to the special question word, replace the special question word in the natural language with the extracted portion, and output the replaced natural language.
A computer apparatus, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the steps of the natural language processing method provided in any embodiment of the present application.
One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the natural language processing method provided in any embodiment of the present application.
Details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below. Other features and advantages of the present application will become apparent from the description, the drawings, and the claims.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is an application scenario diagram of a natural language processing method according to one or more embodiments.
FIG. 2 is a flowchart of a natural language processing method according to one or more embodiments.
FIG. 3 is a schematic diagram of loading a natural language parsing library according to one or more embodiments.
FIG. 4 is a schematic structural diagram of a natural language dependency tree according to one or more embodiments.
FIG. 5 is a flowchart of a natural language processing method according to another one or more embodiments.
FIG. 6 is a block diagram of a natural language processing device according to one or more embodiments.
FIG. 7 is a block diagram of a computer apparatus according to one or more embodiments.
Detailed description
To make the technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present application, not to limit it.
The natural language processing method provided by the present application can be applied in the application environment shown in FIG. 1. A user can interact with a terminal by voice, touch input, keyboard input, remote-control input, or any similar means. Specifically, the user speaks some natural language and the terminal receives it. The terminal parses the input natural language with a preset natural language parsing library to obtain a natural language dependency tree, and extracts the backbone structure from the tree, which reduces the amount of processing while retaining the key content. The terminal then determines whether a special question word is present in the extracted backbone structure and, if so, obtains the type of the special question word. The extracted backbone structure is matched against the first standard sentence in the knowledge base that corresponds to that type. When the matching succeeds, the portion of the first standard sentence that corresponds to the special question word is extracted, the special question word in the natural language is replaced with the extracted portion, and the replaced natural language is output. In this way the terminal can respond to the natural language input by the user. Because the special question word is replaced with the corresponding portion of the first standard sentence in the knowledge base, the structure of the sentence is unchanged, so the answer in intelligent question answering is logically related to the question and the accuracy of the answer is improved. The terminal may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet, or portable wearable device.
In one embodiment, as shown in FIG. 2, a natural language processing method is provided. The method is described as applied to the terminal in FIG. 1 and includes the following steps:
S202: Receive input natural language, and parse the input natural language with a preset natural language parsing library to obtain a natural language dependency tree.
Specifically, the input natural language may be received by voice, touch input, keyboard input, remote-control input, or any similar means. For example, when the terminal is equipped with a speech recognition device, it can receive the natural language spoken by the user and recognize it as voice input; alternatively, the user can enter it through a touch screen, keyboard, remote control, or the like provided by the terminal.
Specifically, the preset natural language parsing library may be the natural language parsing library from Stanford University. Referring to FIG. 3, the Stanford library may be loaded in advance on the terminal or on a master control device: a natural language parser, parser, is created first, and the Chinese natural language training model xinhuaFactoredSegmenting.ser.gz is then loaded. Optionally, a training model for another language, such as English or French, may be loaded instead. After the terminal or master control device receives the natural language, it feeds the natural language into the parsing library to obtain the natural language dependency tree.
The natural language dependency tree is obtained by segmenting the natural language and labeling the grammatical role of each segment. For example, when the natural language spoken or entered by the user is "番茄来自哪里" ("Where do tomatoes come from?"), the Stanford parsing library parses the input and produces the dependency tree structure shown in FIG. 4.
S204: Extract the backbone structure from the natural language dependency tree.
Specifically, the backbone structure is a structure that contains an action word and an object word. Optionally, the backbone structure may include at least one of a subject-predicate-object structure, a predicate-object structure, and a preposition-object structure. Once the dependency tree is available, it is traversed according to preset rules to obtain the backbone structure of the input natural language. For example, the rules may try the subject-predicate-object structure first, fall back to the predicate-object structure if none is found, and then fall back to the preposition-object structure, stopping as soon as one extraction succeeds. In the example above, the extracted subject-predicate-object structure is {"ips":[{"s":"[番茄]","v":"[来自]","o":"[哪里]"}]}, where "番茄" (tomato) is the subject, "来自" (come from) is the predicate, and "哪里" (where) is the object, so the predicate-object and preposition-object structures need not be extracted. If extraction fails, that is, none of the three structures is found, an error can be reported, for example by outputting a preset prompt such as "Didn't catch that, please repeat."
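The fallback order described for S204 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the dependency tree is assumed to be flattened into (relation, head, dependent) triples, and the relation labels nsubj, dobj, and pobj as well as the function name extract_backbone are assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical flattened dependency edges: (relation, head word, dependent word).
# The relation labels are assumptions, not taken from the patent.
Edge = Tuple[str, str, str]


def extract_backbone(edges: List[Edge]) -> Optional[Dict[str, str]]:
    """Try subject-predicate-object, then predicate-object, then preposition-object."""
    subjects = {head: dep for rel, head, dep in edges if rel == "nsubj"}
    objects = {head: dep for rel, head, dep in edges if rel == "dobj"}
    prep_objects = {head: dep for rel, head, dep in edges if rel == "pobj"}

    # 1. Subject-predicate-object: a verb with both a subject and an object.
    for verb, obj in objects.items():
        if verb in subjects:
            return {"s": subjects[verb], "v": verb, "o": obj}
    # 2. Predicate-object: a verb with an object but no subject.
    for verb, obj in objects.items():
        return {"v": verb, "o": obj}
    # 3. Preposition-object.
    for prep, obj in prep_objects.items():
        return {"p": prep, "o": obj}
    # Extraction failed; the caller may output a preset prompt.
    return None
```

For the running example, the edges [("nsubj", "来自", "番茄"), ("dobj", "来自", "哪里")] yield the same s/v/o fields as the structure quoted above.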
S206: Determine whether a special question word is present in the extracted backbone structure and, if so, identify the type of the special question word.
Specifically, special question words come in several types. What: a subject or object that names a thing can be replaced with "什么" (what). Who: a subject or object that names a person can be replaced with "谁" (who). Where: a subject or object that names a place can be replaced with "哪里", "什么地方", or "什么位置" (where, what place, what position). When: a subject or object that names a time can be replaced with "何时", "什么时间", or "什么时候" (when, what time). The extracted backbone structure is therefore traversed to check whether it contains any of these special question words; if it does, the type of the word is identified. In the "番茄来自哪里" example above, the special question word "哪里" belongs to the "where" type.
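The type lookup in S206 amounts to a table lookup over the elements of the backbone structure. A minimal sketch, using the word lists given in the text; the names WH_WORDS and find_special_question_word are assumptions made for the example.

```python
from typing import Dict, Optional, Tuple

# Word lists taken from the description; the dictionary layout is an assumption.
WH_WORDS = {
    "what": ["什么"],
    "who": ["谁"],
    "where": ["哪里", "什么地方", "什么位置"],
    "when": ["何时", "什么时间", "什么时候"],
}


def find_special_question_word(backbone: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (word, type) for the first backbone element that is a special question word."""
    for word in backbone.values():
        for wh_type, words in WH_WORDS.items():
            if word in words:  # exact match, so "什么地方" is not mistaken for "什么"
                return word, wh_type
    return None
```

Exact matching on whole backbone elements is a deliberate simplification here; it avoids classifying "什么地方" as a "what" question merely because it begins with "什么".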
S208: Match the extracted backbone structure against a first standard sentence, the first standard sentence being stored in the knowledge base and corresponding to the type of the special question word.
Specifically, when the backbone structure contains a special question word, the extracted backbone structure can be matched against the first standard sentences in the knowledge base that correspond to the type of that word. The knowledge base can store sentences classified by special question word type, so the first standard sentences for the type found in the backbone structure can be fetched directly and matched. For example, the words in the backbone structure other than the special question word are matched against a first standard sentence; if all of the remaining words match, the matching is considered successful, and otherwise it is considered to have failed.
Taking "番茄来自哪里" as an example, the special question word is "哪里", so "番茄" and "来自" are matched against the first standard sentences of the "where" type in the knowledge base. If those sentences include "番茄来自北美洲" ("Tomatoes come from North America"), then "番茄" matches "番茄" and "来自" matches "来自", and the matching is considered successful.
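The matching rule in S208 (every backbone word other than the special question word must agree with the standard sentence) can be sketched as below. The layout of the knowledge base, a dictionary from question type to a list of s/v/o records, is an assumption for illustration; the patent does not specify a storage format.

```python
from typing import Dict, List, Optional

# Hypothetical knowledge base: first standard sentences grouped by question type.
KNOWLEDGE_BASE: Dict[str, List[Dict[str, str]]] = {
    "where": [{"s": "番茄", "v": "来自", "o": "北美洲"}],
}


def match_first_standard_sentence(
    backbone: Dict[str, str],
    wh_word: str,
    wh_type: str,
    kb: Dict[str, List[Dict[str, str]]],
) -> Optional[Dict[str, str]]:
    """Return the first stored sentence whose roles agree with every non-question word."""
    for sentence in kb.get(wh_type, []):
        if all(
            sentence.get(role) == word
            for role, word in backbone.items()
            if word != wh_word  # the question word itself is the slot to be filled
        ):
            return sentence
    return None  # matching failed
```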
S210: When the matching succeeds, extract the portion of the first standard sentence that corresponds to the special question word, replace the special question word in the natural language with the extracted portion, and output the replaced natural language.
Specifically, when the matching succeeds, the portion of the first standard sentence that corresponds to the special question word is extracted, "北美洲" (North America) in the example above, and is used to replace the special question word in the original natural language. That is, "哪里" in the original input is replaced with "北美洲" and "番茄来自北美洲" is output, which completes the question-answering process.
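The substitution in S210 can be sketched as a lookup of the role occupied by the special question word followed by a string replacement. The s/v/o record format and the function name fill_answer are assumptions made for the example.

```python
from typing import Dict


def fill_answer(
    natural_language: str,
    backbone: Dict[str, str],
    wh_word: str,
    standard_sentence: Dict[str, str],
) -> str:
    """Replace the special question word with the matching part of the standard sentence."""
    # Find which role (subject, predicate, object) the question word occupies.
    role = next(r for r, w in backbone.items() if w == wh_word)
    # Take the word in the same role from the matched standard sentence.
    answer_part = standard_sentence[role]
    # Replace only the first occurrence so the rest of the sentence is untouched.
    return natural_language.replace(wh_word, answer_part, 1)
```

Because only the question word is swapped out, the output keeps the word order of the user's input, which is the property the description relies on.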
In the natural language processing method above, the received natural language is parsed with a preset natural language parsing library to obtain a natural language dependency tree, and the backbone structure is obtained from that tree; extracting the sentence backbone removes irrelevant information. The sentence pattern is then determined from the special question word, the corresponding part of the knowledge base is queried according to the sentence pattern, and the special question word is replaced with the corresponding portion of the first standard sentence in the knowledge base. The structure of the sentence is unchanged, so the answer in intelligent question answering is logically related to the question and the accuracy of the answer is improved.
In one embodiment, the natural language processing method may further include: when no special question word is present in the extracted backbone structure, determining whether a general question word is present in the backbone structure; when a general question word is present, converting the general question word into an affirmative word and matching the converted backbone structure against a second standard sentence in the knowledge base; when the matching succeeds, converting the general question word in the natural language into an affirmative word and outputting the converted natural language; and when the matching fails, converting the general question word in the natural language into a negative word and outputting the converted natural language. In one embodiment, the natural language processing method may further include: when the backbone structure contains neither a special question word nor a general question word, storing the backbone structure in the knowledge base.
Specifically, the natural language backbone structures in this embodiment include the subject-predicate-object structure, the preposition-object structure, and the predicate-object structure, and sentences fall into three types. Declarative or imperative sentences: sentences with neither a special question word nor a general question word are treated as knowledge statements, and their backbone can be extracted and stored in the knowledge base. Special questions with a question word: these have a clear subject-predicate-object structure plus a question word and are treated as queries; the question word is extracted, the knowledge base is queried for the relevant knowledge point, and the corresponding natural language generation methods (replacement and negation) produce the query result as a natural language sentence that is returned to the user. General questions: these also have a clear subject-predicate-object structure plus a general question structure, chiefly the "verb-negation-verb" structure identified in syntactic analysis. A sentence of this type poses a yes/no decision problem: the negation operation of the natural language generation system is applied to obtain a query statement, the knowledge base is searched, and if the knowledge base contains such a statement, the result is returned to the user after the negation operation of the natural language generation method has been applied.
The natural language generation methods include a replacement operation, a negation operation, and a compound operation. The replacement operation substitutes different question words for the subject or object according to their attributes, producing questions with different meanings; a special question word can likewise be replaced with the corresponding result word. The negation operation mainly targets the object part: adding a suitable negation prefix, or removing a particular negation prefix, yields a backbone structure with negative meaning or with affirmative meaning respectively. Negation prefixes may include "不", "不是", "没", and "没有" (not, is not, have not). The compound operation stacks the replacement and negation operations: one operation is applied to obtain a structure, a second operation is applied to that result to obtain another structure, and so on.
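The three generation operations can be sketched as small functions over an s/v/o record. This is an illustrative reading of the description, assuming that negation is applied to the object role by default (as the text suggests) and that a prefix is toggled by simple string tests; the function names are hypothetical.

```python
from typing import Callable, Dict

Backbone = Dict[str, str]

# Prefixes listed in the description, longest first so "不是" is tried before "不".
NEGATION_PREFIXES = ("不是", "没有", "不", "没")


def replace_op(backbone: Backbone, role: str, new_word: str) -> Backbone:
    """Replacement: swap the word in one role, e.g. an object for a question word."""
    return {**backbone, role: new_word}


def negate_op(backbone: Backbone, role: str = "o", prefix: str = "不是") -> Backbone:
    """Negation: remove a negation prefix if one is present, otherwise add one."""
    word = backbone[role]
    for p in NEGATION_PREFIXES:
        if word.startswith(p):
            return {**backbone, role: word[len(p):]}
    return {**backbone, role: prefix + word}


def compound_op(backbone: Backbone, *ops: Callable[[Backbone], Backbone]) -> Backbone:
    """Compound: apply operations one after another, each on the previous result."""
    for op in ops:
        backbone = op(backbone)
    return backbone
```

A prefix test of this kind would also strip the first character of an ordinary word that happens to begin with "不" or "没", so a real system would need part-of-speech information from the parser rather than string matching.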
Specifically, refer to FIG. 5, which is a flowchart of the natural language processing method in another embodiment. The terminal first receives the natural language input by the user and parses it with the preset natural language parser to obtain a syntactic dependency tree. It then extracts the sentence backbone structure from the tree and examines it to determine which of three sentence types applies, handling each type accordingly. When the sentence backbone contains a special question word, the corresponding part of the knowledge base is obtained according to the type of that word, the other words in the backbone are matched against the first standard sentences there, the portion of the successfully matched first standard sentence that corresponds to the special question word is extracted, and that portion replaces the special question word in the natural language sentence, which is then output as the answer. When the backbone contains no special question word, the terminal checks whether the sentence contains a "verb-negation-verb" structure. If it does, the sentence is treated as a general question: the negation operation is first applied to form a declarative sentence, and the knowledge base is then queried using the other backbone components. If the content of the knowledge base matches, the declarative sentence is output; if not, the negation operation is applied to the sentence again and the negative declarative sentence is output. When the backbone contains neither a special question word nor a "verb-negation-verb" structure, the sentence is treated as a declarative sentence and the extracted backbone structure is stored in the knowledge base, laying the groundwork for later question answering.
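The three-way branch in FIG. 5 can be sketched as a classifier over the words of the backbone structure. The regular expression used to spot a "verb-negation-verb" word such as "是不是" or "有没有" is an assumption made for this example; the patent relies on the parser's syntactic analysis rather than on string patterns.

```python
import re
from typing import List

# Special question words listed in the description.
SPECIAL_QUESTION_WORDS = {
    "什么", "谁", "哪里", "什么地方", "什么位置", "何时", "什么时间", "什么时候",
}

# Assumed surface pattern for "verb-negation-verb": X不X or X没X.
VERB_NOT_VERB = re.compile(r"(.+)(?:不|没)\1")


def classify_sentence(backbone_words: List[str]) -> str:
    """Return 'special_question', 'general_question', or 'declarative'."""
    if any(word in SPECIAL_QUESTION_WORDS for word in backbone_words):
        return "special_question"   # query the knowledge base and fill in the answer
    if any(VERB_NOT_VERB.fullmatch(word) for word in backbone_words):
        return "general_question"   # answer yes or no
    return "declarative"            # store the backbone in the knowledge base
```

The order of the checks mirrors the flow in the text: the special question word test comes first, and the "verb-negation-verb" test is reached only when it fails.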
Specifically, suppose the natural language input by the user is "番茄来自哪里" ("Where do tomatoes come from?"). The backbone structure "番茄", "来自", "哪里" is extracted, in which "哪里" is a special question word of the "where" type. The knowledge base is queried for place-related first standard sentences, with the backbone word "番茄" as the constraint. If the knowledge base responds with "北美洲" (North America), the replace operation of the natural language generation method described above substitutes "北美洲" for the question word "哪里" in the original input. Finally, the backbone structure is serialized back into a natural language sentence, giving the knowledge statement "番茄来自北美洲" ("Tomatoes come from North America"), which is returned to the user.
Suppose the natural language input by the user is "番茄是不是来自北美洲" ("Do tomatoes come from North America or not?"). The backbone structure "番茄", "是不是", "来自", "北美洲" is extracted, and the sentence backbone is checked for a "verb-negation-verb" structure. If one is present, the negation operation is first applied to the extracted backbone structure to obtain an affirmative sentence, and the parts of the affirmative sentence are then matched in turn against the content of the knowledge base, for example the subject first, then the predicate, and finally the object. If the matching succeeds, the affirmative sentence is output directly; if it fails, the negation operation is applied to the affirmative sentence again before output. In this example, the negation operation on the backbone structure yields the affirmative sentence "番茄", "是", "来自", "北美洲", and the knowledge base is queried for such a statement. If the query is true, the structure is serialized by the natural language generation method as "番茄是来自北美洲" ("Tomatoes do come from North America") and returned to the user; if it is false, the negation operation of the natural language generation method transforms the query structure into "番茄不是来自北美洲" ("Tomatoes do not come from North America"), which is returned to the user. In other words, when the query is true, the "verb-negation-verb" in the original natural language is changed to "verb"; otherwise it is changed to "negation-verb"; and the sentence with that replacement is output.
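The yes/no handling described above can be sketched as follows: the "verb-negation-verb" word is located, the remaining backbone words are looked up in the knowledge base, and the word is rewritten to "verb" or "negation-verb" depending on the result. Storing facts as tuples of backbone words and detecting the structure with a regular expression are both assumptions made for this example.

```python
import re
from typing import List, Optional, Set, Tuple

# Assumed surface pattern for "verb-negation-verb", e.g. 是不是 or 有没有.
VERB_NOT_VERB = re.compile(r"(.+)(不|没)\1")


def answer_general_question(
    text: str, backbone_words: List[str], facts: Set[Tuple[str, ...]]
) -> Optional[str]:
    """Rewrite 'V-not-V' to 'V' if the statement is known, otherwise to 'not-V'."""
    for word in backbone_words:
        m = VERB_NOT_VERB.fullmatch(word)
        if m is None:
            continue
        verb, negation = m.group(1), m.group(2)
        # The statement to look up is the backbone without the question structure.
        statement = tuple(w for w in backbone_words if w != word)
        replacement = verb if statement in facts else negation + verb
        return text.replace(word, replacement, 1)
    return None  # no "verb-negation-verb" structure: not a general question
```

With facts = {("番茄", "来自", "北美洲")}, the input "番茄是不是来自北美洲" is rewritten to "番茄是来自北美洲", and the same question about "亚洲" is rewritten to "番茄不是来自亚洲".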
Suppose the user inputs "番茄来自北美洲" ("The tomato comes from North America"). The extracted backbone structure is "番茄" "来自" "北美洲", which contains neither a special interrogative word nor a general interrogative word, that is, no "verb-negative word-verb" structure, so the sentence is regarded as a declarative or imperative sentence. When the sentence is declarative or imperative, it is saved directly to the knowledge base in order to expand the knowledge base. To make the interaction more engaging, some part of the sentence backbone of the declarative sentence may be randomly replaced with a special interrogative word and output, or a compound operation may be applied before output; for example, "番茄来自哪里" ("Where does the tomato come from?") may be output. To avoid duplicates, the knowledge base may first be queried for the backbone structure: if it already exists, no operation is performed, and the backbone structure is stored only when the knowledge base does not yet contain it.
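The store-if-absent step and the optional "ask back" step described above might look like the sketch below. It assumes a set-based knowledge base and a hypothetical table `wh_by_slot` that maps a backbone position to the special interrogative word that may replace it; neither structure is specified in the source.

```python
import random


def store_if_new(backbone, knowledge_base):
    """Store the backbone only if the knowledge base lacks it; report whether it was added."""
    key = tuple(backbone)
    if key in knowledge_base:
        return False  # already known: do nothing, avoiding duplicates
    knowledge_base.add(key)
    return True


def ask_back(backbone, wh_by_slot, rng=random):
    """Replace one randomly chosen replaceable slot with its interrogative word."""
    slots = [i for i in wh_by_slot if 0 <= i < len(backbone)]
    if not slots:
        return None
    i = rng.choice(slots)
    return "".join(backbone[:i] + [wh_by_slot[i]] + backbone[i + 1:])
```

For the backbone `["番茄", "来自", "北美洲"]` and `wh_by_slot = {2: "哪里"}`, only the object slot is replaceable, so the result is "番茄来自哪里".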
In the above embodiment, sentences are classified into special questions, general questions, and declarative sentences. During intelligent question answering, the backbone components of the sentence are extracted first and useless information is discarded; the sentence type is then determined from the special interrogative words, and the corresponding knowledge base is queried according to the sentence type. A special interrogative word is replaced by a replacement operation, a "verb-negative word-verb" structure in the backbone structure is replaced by a negation operation or a replacement operation, and a declarative sentence is stored directly in the knowledge base. Because the structure of the sentence is not changed, the answer in intelligent question answering is logically related to the question, which improves the accuracy of the answer.
In one embodiment, fuzzy matching is introduced to improve matching efficiency. Before fuzzy matching, the components of the backbone structure may be standardized. Alternatively, after a match fails, a manual intervention step may be introduced, by which a mapping between the unmatched backbone structure and a standard sentence is established, so that when the same backbone structure is received later the corresponding standard sentence can be obtained directly from the knowledge base. The knowledge base is thus expanded not only through declarative sentences but also through manual intervention. In the above embodiments, the steps in which fuzzy matching may be used include the step of matching the converted backbone structure against the standard sentence in the knowledge base and/or the step of matching the extracted backbone structure against the standard sentence in the knowledge base that corresponds to the type of the special interrogative word.
In one embodiment, the step of matching the converted backbone structure against the second standard sentence in the knowledge base may include: performing fuzzy matching between the converted backbone structure and the second standard sentence in the knowledge base; when the fuzzy matching between the converted backbone structure and the second standard sentence in the knowledge base fails, receiving a first mapping instruction for the converted backbone structure; and establishing, according to the first mapping instruction, a matching relationship between the converted backbone structure and a first target sentence, and storing the first target sentence in the knowledge base.
Specifically, when a general interrogative word is present in the backbone structure, the negation operation is first applied to the backbone structure, and the converted backbone structure is then fuzzily matched against the second standard sentence in the knowledge base. Every part of the backbone structure is fuzzily matched; for example, when the backbone structure is a subject-predicate-object structure, the subject, the predicate, and the object are each fuzzily matched. For example, when the extracted backbone structure is "番茄是不是来自北美洲" ("Is the tomato from North America or not?"), "番茄" (tomato) is matched first, and the knowledge base returns "小番茄" (cherry tomato) and "番茄". The matching rate of "番茄" is 100%, which is greater than the preset value and greater than the 66.6% matching rate of "小番茄", so "番茄" is selected as the final matching result. "北美洲" (North America) is matched in the same way. Optionally, before fuzzy matching begins, the extracted backbone structure may be preprocessed, that is, standardized; for example, when the extracted backbone component is "番茄", it is first converted to its synonym "西红柿" (also "tomato"), and matching then proceeds as in the above embodiment.
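The source does not define how the matching rate is computed. The sketch below assumes one formula that reproduces the quoted figures: the length of the longest common substring divided by the length of the longer string, giving 2/2 = 100% for "番茄" and 2/3 ≈ 66.6% for "小番茄". The function names, the 0.8 threshold, and the synonym table are illustrative assumptions.

```python
from difflib import SequenceMatcher


def match_rate(query, candidate):
    """Longest common substring length over the longer string's length (assumed formula)."""
    if not query or not candidate:
        return 0.0
    matcher = SequenceMatcher(None, query, candidate, autojunk=False)
    longest = matcher.find_longest_match(0, len(query), 0, len(candidate))
    return longest.size / max(len(query), len(candidate))


def best_match(query, candidates, threshold=0.8):
    """Return the highest-rated candidate if its rate exceeds the preset value, else None."""
    scored = [(match_rate(query, c), c) for c in candidates]
    if not scored:
        return None
    rate, best = max(scored)
    return best if rate > threshold else None


def standardize(term, synonyms):
    """Optional preprocessing: map a term to its canonical synonym if one is listed."""
    return synonyms.get(term, term)
```

Here `best_match("番茄", ["小番茄", "番茄"])` selects "番茄", and `standardize("番茄", {"番茄": "西红柿"})` performs the optional conversion before matching.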
When the match fails, a first mapping instruction for the converted backbone structure may be received; a matching relationship between the converted backbone structure and a first target sentence is established according to the first mapping instruction, and the first target sentence is stored in the knowledge base. For example, when the match fails, a prompt such as "不知道" ("I don't know") may be output. The user can then intervene manually and input the first target sentence "番茄来自北美洲" ("The tomato comes from North America"); after receiving this instruction, the terminal can store the first target sentence in the knowledge base, thereby expanding the knowledge base.
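The manual-intervention fallback can be illustrated with a small sketch. It replaces fuzzy matching with an exact lookup to stay short, and the class `KnowledgeBase` with its methods is a hypothetical structure, not one named in the source.

```python
class KnowledgeBase:
    """Toy knowledge base: stored sentences plus backbone-to-sentence mappings."""

    def __init__(self):
        self.sentences = set()
        self.mappings = {}  # backbone tuple -> target sentence

    def lookup(self, backbone):
        """Return a mapped or stored sentence for the backbone, or None on failure."""
        key = tuple(backbone)
        if key in self.mappings:
            return self.mappings[key]
        sentence = "".join(backbone)
        return sentence if sentence in self.sentences else None

    def apply_mapping_instruction(self, backbone, target_sentence):
        """Record the user-supplied target sentence for a backbone that failed to match."""
        self.mappings[tuple(backbone)] = target_sentence
        self.sentences.add(target_sentence)


def respond(kb, backbone, fallback="不知道"):
    """Answer from the knowledge base, or output the 'don't know' prompt."""
    hit = kb.lookup(backbone)
    return hit if hit is not None else fallback
```

The first call to `respond` for an unknown backbone returns the prompt; after `apply_mapping_instruction` records the user's target sentence, the same backbone resolves directly from the knowledge base.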
In one embodiment, the step of matching the extracted backbone structure against the first standard sentence may include: performing fuzzy matching between the extracted backbone structure and the first standard sentence; when the fuzzy matching between the extracted backbone structure and the first standard sentence fails, receiving a second mapping instruction for the extracted backbone structure; and establishing, according to the second mapping instruction, a matching relationship between the extracted backbone structure and a second target sentence, and storing the second target sentence in the knowledge base.
Specifically, when a special interrogative word is present in the backbone structure, the extracted backbone structure is fuzzily matched against the first standard sentence in the knowledge base. Every part of the backbone structure is fuzzily matched; for example, when the backbone structure is a subject-predicate-object structure, the subject, predicate, and object are each fuzzily matched, except for the part occupied by the special interrogative word. For example, when the extracted backbone structure is "番茄来自哪里" ("Where does the tomato come from?"), "番茄" (tomato) is matched first, and the knowledge base returns "小番茄" (cherry tomato) and "番茄". The matching rate of "番茄" is 100%, which is greater than the preset value and greater than the 66.6% matching rate of "小番茄", so "番茄" is selected as the final matching result. "来自" (comes from) is matched in the same way. Optionally, before fuzzy matching begins, the extracted backbone structure may be preprocessed, that is, standardized; for example, when the extracted backbone component is "番茄", it is first converted to its synonym "西红柿" (also "tomato"), and matching then proceeds as in the above embodiment.
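A minimal sketch of the special-question path follows: every slot except the one holding the interrogative word is compared with a stored standard sentence of the matching type, and the corresponding part of that sentence replaces the interrogative word. Exact comparison stands in for fuzzy matching here, and the table `WH_TYPES` and the layout of `kb_by_type` are assumptions made for illustration.

```python
# Hypothetical table of special interrogative words and their types.
WH_TYPES = {"哪里": "place", "谁": "person", "什么": "thing"}


def answer_special_question(backbone, kb_by_type):
    """Fill the interrogative slot from a standard sentence of the matching type."""
    for i, token in enumerate(backbone):
        if token not in WH_TYPES:
            continue
        for sentence in kb_by_type.get(WH_TYPES[token], []):
            if len(sentence) != len(backbone):
                continue
            # Compare every slot except the one occupied by the interrogative word.
            if all(s == b for j, (s, b) in enumerate(zip(sentence, backbone)) if j != i):
                return "".join(backbone[:i] + [sentence[i]] + backbone[i + 1:])
        return None  # interrogative word found, but no standard sentence matched
    return None  # no special interrogative word in the backbone
```

With `kb_by_type = {"place": [["番茄", "来自", "北美洲"]]}`, the backbone `["番茄", "来自", "哪里"]` is answered with "番茄来自北美洲".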
When the match fails, a second mapping instruction for the extracted backbone structure may be received; a matching relationship between the extracted backbone structure and a second target sentence is established according to the second mapping instruction, and the second target sentence is stored in the knowledge base. For example, when the match fails, a prompt such as "不知道" ("I don't know") may be output. The user can then intervene manually and input the second target sentence "番茄来自北美洲" ("The tomato comes from North America"); after receiving this instruction, the terminal can store the second target sentence in the knowledge base, thereby expanding the knowledge base.
In the above embodiment, fuzzy matching improves matching efficiency, and manual intervention is introduced when a match fails, which expands the knowledge base and improves the efficiency of subsequent matching.
It should be understood that although the steps in the flowcharts of FIG. 2 and FIG. 5 are shown in sequence as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict restriction on the order in which these steps are performed, and they may be performed in other orders. Moreover, at least some of the steps in FIG. 2 and FIG. 5 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment but may be performed at different moments, and they are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 6, a natural language processing apparatus is provided, including a receiving module 100, an extraction module 200, a first determining module 300, a first matching module 400, and an output module 500, wherein:
The receiving module 100 is configured to receive input natural language and parse the input natural language through a preset natural language parsing library to obtain a natural language dependency tree.
The extraction module 200 is configured to extract the backbone structure from the natural language dependency tree.
The first determining module 300 is configured to determine whether a special interrogative word is present in the extracted backbone structure and, if so, to identify the type of the special interrogative word.
The first matching module 400 is configured to match the extracted backbone structure against a first standard sentence, the first standard sentence being stored in the knowledge base and corresponding to the type of the special interrogative word.
The output module 500 is configured to, when the match succeeds, extract the part of the first standard sentence that corresponds to the special interrogative word, replace the special interrogative word in the natural language with the extracted part, and output the natural language after replacement.
In one embodiment, the apparatus may further include:
A second determining module, configured to determine, when no special interrogative word is present in the extracted backbone structure, whether a general interrogative word is present in the backbone structure.
A second matching module, configured to, when a general interrogative word is present in the backbone structure, convert the general interrogative word into an affirmative word and match the converted backbone structure against a second standard sentence in the knowledge base.
The output module 500 is further configured to, when the match succeeds, convert the general interrogative word in the natural language into an affirmative word and output the converted natural language, and, when the match fails, convert the general interrogative word in the natural language into a negative word and output the converted natural language.
In one embodiment, the apparatus may further include:
A storage module, configured to store the backbone structure in the knowledge base when neither a special interrogative word nor a general interrogative word is present in the backbone structure.
In one embodiment, the second matching module may include:
A first fuzzy matching unit, configured to perform fuzzy matching between the converted backbone structure and the second standard sentence in the knowledge base.
A first mapping instruction receiving unit, configured to receive a first mapping instruction for the converted backbone structure when the fuzzy matching between the converted backbone structure and the second standard sentence in the knowledge base fails.
A first mapping relationship storage unit, configured to establish, according to the first mapping instruction, a matching relationship between the converted backbone structure and a first target sentence, and to store the first target sentence in the knowledge base.
In one embodiment, the first matching module may include:
A second fuzzy matching unit, configured to perform fuzzy matching between the extracted backbone structure and the first standard sentence.
A second mapping instruction receiving unit, configured to receive a second mapping instruction for the extracted backbone structure when the fuzzy matching between the extracted backbone structure and the first standard sentence fails.
A second mapping relationship storage unit, configured to establish, according to the second mapping instruction, a matching relationship between the extracted backbone structure and a second target sentence, and to store the second target sentence in the knowledge base.
In one embodiment, the backbone structure may include at least one of a subject-predicate-object structure, a predicate-object structure, and a preposition-object structure.
For the specific limitations of the natural language processing apparatus, reference may be made to the limitations of the natural language processing method above, which are not repeated here. Each module of the natural language processing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. Each module may be embedded in hardware form in, or be independent of, a processor of the computer device, or may be stored in software form in a memory of the computer device, so that the processor can invoke and perform the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a terminal, and its internal structure may be as shown in FIG. 7. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and computer-readable instructions. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer-readable instructions, when executed by the processor, implement a natural language processing method. The display screen of the computer device may be a liquid crystal display or an electronic ink display. The input device of the computer device may be a touch layer covering the display screen, a button, trackball, or touchpad provided on the housing of the computer device, or an external keyboard, touchpad, or mouse.
Those skilled in the art will understand that the structure shown in FIG. 7 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, may combine certain components, or may have a different arrangement of components.
A computer device includes a memory and one or more processors. The memory stores computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps: receiving input natural language, and parsing the input natural language through a preset natural language parsing library to obtain a natural language dependency tree; extracting the backbone structure from the natural language dependency tree; determining whether a special interrogative word is present in the extracted backbone structure and, if so, identifying the type of the special interrogative word; matching the extracted backbone structure against a first standard sentence, the first standard sentence being stored in the knowledge base and corresponding to the type of the special interrogative word; and, when the match succeeds, extracting the part of the first standard sentence that corresponds to the special interrogative word, replacing the special interrogative word in the natural language with the extracted part, and outputting the natural language after replacement.
In one embodiment, the processor, when executing the computer-readable instructions, further implements the following steps: when no special interrogative word is present in the extracted backbone structure, determining whether a general interrogative word is present in the backbone structure; when a general interrogative word is present in the backbone structure, converting the general interrogative word into an affirmative word and matching the converted backbone structure against a second standard sentence in the knowledge base; when the match succeeds, converting the general interrogative word in the natural language into an affirmative word and outputting the converted natural language; and, when the match fails, converting the general interrogative word in the natural language into a negative word and outputting the converted natural language.
In one embodiment, the processor, when executing the computer-readable instructions, further implements the following step: when neither a special interrogative word nor a general interrogative word is present in the backbone structure, storing the backbone structure in the knowledge base.
In one embodiment, the step of matching the converted backbone structure against the second standard sentence in the knowledge base, implemented when the processor executes the computer-readable instructions, may include: performing fuzzy matching between the converted backbone structure and the second standard sentence in the knowledge base; when the fuzzy matching between the converted backbone structure and the second standard sentence in the knowledge base fails, receiving a first mapping instruction for the converted backbone structure; and establishing, according to the first mapping instruction, a matching relationship between the converted backbone structure and a first target sentence, and storing the first target sentence in the knowledge base.
In one embodiment, the step of matching the extracted backbone structure against the first standard sentence, implemented when the processor executes the computer-readable instructions, may include: performing fuzzy matching between the extracted backbone structure and the first standard sentence; when the fuzzy matching between the extracted backbone structure and the first standard sentence fails, receiving a second mapping instruction for the extracted backbone structure; and establishing, according to the second mapping instruction, a matching relationship between the extracted backbone structure and a second target sentence, and storing the second target sentence in the knowledge base.
In one embodiment, the backbone structure in the steps implemented when the processor executes the computer-readable instructions includes at least one of a subject-predicate-object structure, a predicate-object structure, and a preposition-object structure.
One or more non-volatile computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps: receiving input natural language, and parsing the input natural language through a preset natural language parsing library to obtain a natural language dependency tree; extracting the backbone structure from the natural language dependency tree; determining whether a special interrogative word is present in the extracted backbone structure and, if so, identifying the type of the special interrogative word; matching the extracted backbone structure against a first standard sentence, the first standard sentence being stored in the knowledge base and corresponding to the type of the special interrogative word; and, when the match succeeds, extracting the part of the first standard sentence that corresponds to the special interrogative word, replacing the special interrogative word in the natural language with the extracted part, and outputting the natural language after replacement.
In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following steps: when no special interrogative word is present in the extracted backbone structure, determining whether a general interrogative word is present in the backbone structure; when a general interrogative word is present in the backbone structure, converting the general interrogative word into an affirmative word and matching the converted backbone structure against a second standard sentence in the knowledge base; when the match succeeds, converting the general interrogative word in the natural language into an affirmative word and outputting the converted natural language; and, when the match fails, converting the general interrogative word in the natural language into a negative word and outputting the converted natural language.
In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following step: when neither a special interrogative word nor a general interrogative word is present in the backbone structure, storing the backbone structure in the knowledge base.
In one embodiment, the step of matching the converted backbone structure against the second standard sentence in the knowledge base, implemented when the computer-readable instructions are executed by the processor, may include: performing fuzzy matching between the converted backbone structure and the second standard sentence in the knowledge base; when the fuzzy matching between the converted backbone structure and the second standard sentence in the knowledge base fails, receiving a first mapping instruction for the converted backbone structure; and establishing, according to the first mapping instruction, a matching relationship between the converted backbone structure and a first target sentence, and storing the first target sentence in the knowledge base.
In one embodiment, the step of matching the extracted backbone structure against the first standard sentence, implemented when the computer-readable instructions are executed by the processor, may include: performing fuzzy matching between the extracted backbone structure and the first standard sentence; when the fuzzy matching between the extracted backbone structure and the first standard sentence fails, receiving a second mapping instruction for the extracted backbone structure; and establishing, according to the second mapping instruction, a matching relationship between the extracted backbone structure and a second target sentence, and storing the second target sentence in the knowledge base.
In one embodiment, the backbone structure in the steps implemented when the computer-readable instructions are executed by the processor includes at least one of a subject-predicate-object structure, a predicate-object structure, and a preposition-object structure.
Those of ordinary skill in the art will understand that all or part of the processes of the methods in the above embodiments may be completed by computer-readable instructions instructing the relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to a memory, storage, database, or other medium used in the embodiments provided in the present application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art may make several variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent application shall be subject to the appended claims.

Claims (20)

  1. A natural language processing method, comprising:
    receiving input natural language, and parsing the input natural language by means of a preset natural language parsing library to obtain a natural language dependency tree;
    extracting a backbone structure from the natural language dependency tree;
    determining whether a special interrogative word is present in the extracted backbone structure, and if so, identifying the type of the special interrogative word;
    matching the extracted backbone structure against a first standard sentence, the first standard sentence being stored in the knowledge base and corresponding to the type of the special interrogative word; and
    when the matching succeeds, extracting the portion of the first standard sentence that corresponds to the special interrogative word, replacing the special interrogative word in the natural language with the extracted portion, and outputting the natural language after replacement.
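The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. It assumes the backbone has already been extracted and is given as a role-to-word dictionary, it uses exact matching rather than any particular matching algorithm, and the names `WH_TYPES`, `KNOWLEDGE_BASE`, `find_wh_role`, and `answer_wh_question`, as well as the sample data, are hypothetical.

```python
# Hypothetical table mapping special interrogative words to their types.
WH_TYPES = {"who": "person", "where": "place", "when": "time"}

# Hypothetical knowledge base: first standard sentences grouped by the
# interrogative type they can answer, each stored as a role -> word backbone.
KNOWLEDGE_BASE = {
    "person": [
        {"subject": "Shakespeare", "predicate": "wrote", "object": "Hamlet"},
    ],
}


def find_wh_role(backbone):
    """Return (role, word, type) of the first special interrogative word, or None."""
    for role, word in backbone.items():
        wh_type = WH_TYPES.get(word.lower())
        if wh_type is not None:
            return role, word, wh_type
    return None


def answer_wh_question(text, backbone):
    """Replace the interrogative word in `text` with the matching part of a
    standard sentence; return None if there is no interrogative word or no match."""
    found = find_wh_role(backbone)
    if found is None:
        return None
    role, wh_word, wh_type = found
    for standard in KNOWLEDGE_BASE.get(wh_type, []):
        if standard.keys() != backbone.keys():
            continue
        # Assumption: exact (case-insensitive) match on every role except
        # the one occupied by the interrogative word.
        if all(standard[r].lower() == w.lower()
               for r, w in backbone.items() if r != role):
            return text.replace(wh_word, standard[role], 1)
    return None


question = "Who wrote Hamlet"
backbone = {"subject": "Who", "predicate": "wrote", "object": "Hamlet"}
print(answer_wh_question(question, backbone))
```

Grouping the standard sentences by interrogative type mirrors the claim's requirement that the first standard sentence correspond to the type of the special interrogative word, so only candidate sentences of the right type are compared.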
  2. The method according to claim 1, wherein the method further comprises:
    when no special interrogative word is present in the extracted backbone structure, determining whether a general interrogative word is present in the backbone structure;
    when a general interrogative word is present in the backbone structure, converting the general interrogative word into an affirmative word, and matching the converted backbone structure against a second standard sentence in the knowledge base;
    when the matching succeeds, converting the general interrogative word in the natural language into an affirmative word and outputting the converted natural language; and
    when the matching fails, converting the general interrogative word in the natural language into a negative word and outputting the converted natural language.
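The yes/no branch of claim 2 can be sketched in the same spirit. This is an illustrative sketch only: the backbone is assumed to be a tuple of already-segmented words, matching is exact set membership, and `GENERAL_WORDS`, `STANDARD_SENTENCES`, and `answer_yes_no` are hypothetical names. The sample strings are Chinese because the conversion is a simple word substitution in that language.

```python
# Hypothetical table: general interrogative word -> (affirmative, negative).
# "是不是" means "is or is not"; "是" means "is"; "不是" means "is not".
GENERAL_WORDS = {"是不是": ("是", "不是")}

# Hypothetical second standard sentences, stored as word tuples.
# ("哈姆雷特", "是", "戏剧") means "Hamlet is a play".
STANDARD_SENTENCES = {("哈姆雷特", "是", "戏剧")}


def answer_yes_no(text, backbone):
    """Convert a yes/no question into an affirmative or negative statement.

    Returns None when the backbone holds no general interrogative word.
    """
    for word in backbone:
        if word in GENERAL_WORDS:
            affirmative, negative = GENERAL_WORDS[word]
            # Build the affirmative form of the backbone and look it up.
            converted = tuple(affirmative if w == word else w for w in backbone)
            # Assumption: exact membership stands in for the matching step.
            chosen = affirmative if converted in STANDARD_SENTENCES else negative
            return text.replace(word, chosen, 1)
    return None


# "Is Hamlet a play?" -> affirmative statement, because the converted
# backbone is found among the standard sentences.
print(answer_yes_no("哈姆雷特是不是戏剧", ("哈姆雷特", "是不是", "戏剧")))
```

Note that a failed match is treated as a negative answer, exactly as the claim states; a production system might want to distinguish "known to be false" from "not in the knowledge base", but that distinction is outside the claim.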
  3. The method according to claim 2, wherein the method further comprises:
    when neither a special interrogative word nor a general interrogative word is present in the backbone structure, storing the backbone structure in the knowledge base.
  4. The method according to claim 2, wherein matching the converted backbone structure against the second standard sentence in the knowledge base comprises:
    performing fuzzy matching between the converted backbone structure and the second standard sentence in the knowledge base;
    when the fuzzy matching between the converted backbone structure and the second standard sentence in the knowledge base fails, receiving a first mapping instruction for the converted backbone structure; and
    establishing, according to the first mapping instruction, a matching relationship between the converted backbone structure and a first target sentence, and storing the first target sentence in the knowledge base.
  5. The method according to any one of claims 1 to 3, wherein matching the extracted backbone structure against the first standard sentence comprises:
    performing fuzzy matching between the extracted backbone structure and the first standard sentence;
    when the fuzzy matching between the extracted backbone structure and the first standard sentence fails, receiving a second mapping instruction for the extracted backbone structure; and
    establishing, according to the second mapping instruction, a matching relationship between the extracted backbone structure and a second target sentence, and storing the second target sentence in the knowledge base.
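The fuzzy matching with a mapping-instruction fallback described in claims 4 and 5 might look like the following Python sketch. The claims do not name a similarity measure, so the use of `difflib.SequenceMatcher`, the 0.8 threshold, the string representation of backbones, and the names `fuzzy_match`, `match_or_learn`, and `get_mapping_instruction` are all assumptions made for illustration. Reusing a stored mapping on later lookups is likewise an assumed reading of "establishing a matching relationship".

```python
from difflib import SequenceMatcher


def fuzzy_match(backbone, standard_sentences, threshold=0.8):
    """Return the most similar standard sentence, or None below the threshold."""
    best, best_score = None, 0.0
    for sentence in standard_sentences:
        score = SequenceMatcher(None, backbone, sentence).ratio()
        if score > best_score:
            best, best_score = sentence, score
    return best if best_score >= threshold else None


def match_or_learn(backbone, knowledge_base, mappings, get_mapping_instruction):
    """Match a backbone; on failure, ask for a mapping instruction and store it.

    `get_mapping_instruction` is a hypothetical callback (for example, a prompt
    to a human operator) that returns the target sentence for the backbone.
    """
    if backbone in mappings:            # relationship established earlier
        return mappings[backbone]
    matched = fuzzy_match(backbone, knowledge_base)
    if matched is not None:
        return matched
    target = get_mapping_instruction(backbone)
    mappings[backbone] = target         # record the matching relationship
    if target not in knowledge_base:
        knowledge_base.append(target)   # store the target sentence
    return target


kb = ["Shakespeare wrote Hamlet"]
mappings = {}
print(match_or_learn("Faust author", kb, mappings, lambda b: "Goethe wrote Faust"))
print(kb)
```

Once a mapping has been recorded, the same backbone resolves without a second instruction, which is how the knowledge base grows over time in this sketch.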
  6. The method according to any one of claims 1 to 3, wherein the backbone structure comprises at least one of a subject-predicate-object structure, a predicate-object structure, and a preposition-object structure.
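As a rough illustration of how such a backbone could be read off a dependency tree, the Python sketch below handles the subject-predicate-object and predicate-object cases. It assumes the parse is supplied as a list of `(word, relation, head_index)` triples using Universal Dependencies style labels (`root`, `nsubj`, `obj`); the claims do not specify a label set or a parser, the function name `extract_backbone` is hypothetical, and preposition-object structures are not covered here.

```python
def extract_backbone(tokens):
    """Extract a role -> word backbone from a dependency parse.

    tokens: list of (word, relation, head_index); the root has head_index -1.
    Only direct dependents of the root labelled nsubj or obj are kept.
    """
    # Locate the root token, which serves as the predicate.
    root_idx = next(i for i, (_, rel, _) in enumerate(tokens) if rel == "root")
    backbone = {"predicate": tokens[root_idx][0]}
    for word, rel, head in tokens:
        if head != root_idx:
            continue  # modifiers of other words are not part of the backbone
        if rel == "nsubj":
            backbone["subject"] = word
        elif rel == "obj":
            backbone["object"] = word
    return backbone


# Hand-written parse of "Shakespeare wrote the famous play".
parse = [
    ("Shakespeare", "nsubj", 1),
    ("wrote", "root", -1),
    ("the", "det", 4),
    ("famous", "amod", 4),
    ("play", "obj", 1),
]
print(extract_backbone(parse))
```

Dropping determiners and adjectives in this way is what lets sentences that differ only in modifiers share the same backbone when they are compared against standard sentences.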
  7. A natural language processing device, comprising:
    a receiving module, configured to receive input natural language and parse the input natural language by means of a preset natural language parsing library to obtain a natural language dependency tree;
    an extraction module, configured to extract a backbone structure from the natural language dependency tree;
    a first determining module, configured to determine whether a special interrogative word is present in the extracted backbone structure, and if so, identify the type of the special interrogative word;
    a first matching module, configured to match the extracted backbone structure against a first standard sentence, the first standard sentence being stored in the knowledge base and corresponding to the type of the special interrogative word; and
    an output module, configured to, when the matching succeeds, extract the portion of the first standard sentence that corresponds to the special interrogative word, replace the special interrogative word in the natural language with the extracted portion, and output the natural language after replacement.
  8. The device according to claim 7, wherein the device further comprises:
    a second determining module, configured to determine, when no special interrogative word is present in the extracted backbone structure, whether a general interrogative word is present in the backbone structure;
    a second matching module, configured to, when a general interrogative word is present in the backbone structure, convert the general interrogative word into an affirmative word and match the converted backbone structure against a second standard sentence in the knowledge base; and
    the output module is further configured to, when the matching succeeds, convert the general interrogative word in the natural language into an affirmative word and output the converted natural language, and, when the matching fails, convert the general interrogative word in the natural language into a negative word and output the converted natural language.
  9. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps: receiving input natural language, and parsing the input natural language by means of a preset natural language parsing library to obtain a natural language dependency tree; extracting a backbone structure from the natural language dependency tree; determining whether a special interrogative word is present in the extracted backbone structure, and if so, identifying the type of the special interrogative word; matching the extracted backbone structure against a first standard sentence, the first standard sentence being stored in the knowledge base and corresponding to the type of the special interrogative word; and when the matching succeeds, extracting the portion of the first standard sentence that corresponds to the special interrogative word, replacing the special interrogative word in the natural language with the extracted portion, and outputting the natural language after replacement.
  10. The computer device according to claim 9, wherein the processor, when executing the computer-readable instructions, further performs the following steps: when no special interrogative word is present in the extracted backbone structure, determining whether a general interrogative word is present in the backbone structure; when a general interrogative word is present in the backbone structure, converting the general interrogative word into an affirmative word, and matching the converted backbone structure against a second standard sentence in the knowledge base; when the matching succeeds, converting the general interrogative word in the natural language into an affirmative word and outputting the converted natural language; and when the matching fails, converting the general interrogative word in the natural language into a negative word and outputting the converted natural language.
  11. The computer device according to claim 10, wherein the processor, when executing the computer-readable instructions, further performs the following step: when neither a special interrogative word nor a general interrogative word is present in the backbone structure, storing the backbone structure in the knowledge base.
  12. The computer device according to claim 10, wherein the matching of the converted backbone structure against the second standard sentence in the knowledge base, as implemented when the processor executes the computer-readable instructions, comprises: performing fuzzy matching between the converted backbone structure and the second standard sentence in the knowledge base; when the fuzzy matching between the converted backbone structure and the second standard sentence in the knowledge base fails, receiving a first mapping instruction for the converted backbone structure; and establishing, according to the first mapping instruction, a matching relationship between the converted backbone structure and a first target sentence, and storing the first target sentence in the knowledge base.
  13. The computer device according to any one of claims 9 to 11, wherein the matching of the extracted backbone structure against the first standard sentence, as implemented when the processor executes the computer-readable instructions, comprises: performing fuzzy matching between the extracted backbone structure and the first standard sentence; when the fuzzy matching between the extracted backbone structure and the first standard sentence fails, receiving a second mapping instruction for the extracted backbone structure; and establishing, according to the second mapping instruction, a matching relationship between the extracted backbone structure and a second target sentence, and storing the second target sentence in the knowledge base.
  14. The computer device according to any one of claims 9 to 11, wherein the backbone structure involved when the processor executes the computer-readable instructions comprises at least one of a subject-predicate-object structure, a predicate-object structure, and a preposition-object structure.
  15. One or more non-volatile computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps: receiving input natural language, and parsing the input natural language by means of a preset natural language parsing library to obtain a natural language dependency tree; extracting a backbone structure from the natural language dependency tree; determining whether a special interrogative word is present in the extracted backbone structure, and if so, identifying the type of the special interrogative word; matching the extracted backbone structure against a first standard sentence, the first standard sentence being stored in the knowledge base and corresponding to the type of the special interrogative word; and when the matching succeeds, extracting the portion of the first standard sentence that corresponds to the special interrogative word, replacing the special interrogative word in the natural language with the extracted portion, and outputting the natural language after replacement.
  16. The storage medium according to claim 15, wherein the computer-readable instructions, when executed by the processor, further cause the following steps to be performed: when no special interrogative word is present in the extracted backbone structure, determining whether a general interrogative word is present in the backbone structure; when a general interrogative word is present in the backbone structure, converting the general interrogative word into an affirmative word, and matching the converted backbone structure against a second standard sentence in the knowledge base; when the matching succeeds, converting the general interrogative word in the natural language into an affirmative word and outputting the converted natural language; and when the matching fails, converting the general interrogative word in the natural language into a negative word and outputting the converted natural language.
  17. The storage medium according to claim 16, wherein the computer-readable instructions, when executed by the processor, further cause the following step to be performed: when neither a special interrogative word nor a general interrogative word is present in the backbone structure, storing the backbone structure in the knowledge base.
  18. The storage medium according to claim 16, wherein the matching of the converted backbone structure against the second standard sentence in the knowledge base, as implemented when the computer-readable instructions are executed by the processor, comprises: performing fuzzy matching between the converted backbone structure and the second standard sentence in the knowledge base; when the fuzzy matching between the converted backbone structure and the second standard sentence in the knowledge base fails, receiving a first mapping instruction for the converted backbone structure; and establishing, according to the first mapping instruction, a matching relationship between the converted backbone structure and a first target sentence, and storing the first target sentence in the knowledge base.
  19. The storage medium according to any one of claims 15 to 17, wherein the matching of the extracted backbone structure against the first standard sentence, as implemented when the computer-readable instructions are executed by the processor, comprises: performing fuzzy matching between the extracted backbone structure and the first standard sentence; when the fuzzy matching between the extracted backbone structure and the first standard sentence fails, receiving a second mapping instruction for the extracted backbone structure; and establishing, according to the second mapping instruction, a matching relationship between the extracted backbone structure and a second target sentence, and storing the second target sentence in the knowledge base.
  20. The storage medium according to any one of claims 15 to 17, wherein the backbone structure involved when the computer-readable instructions are executed by the processor comprises at least one of a subject-predicate-object structure, a predicate-object structure, and a preposition-object structure.
PCT/CN2018/100169 2018-01-30 2018-08-13 Natural language processing method, device, computer apparatus, and storage medium WO2019148797A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810090846.7A CN109344385B (en) 2018-01-30 2018-01-30 Natural language processing method, device, computer equipment and storage medium
CN201810090846.7 2018-01-30

Publications (1)

Publication Number Publication Date
WO2019148797A1 true WO2019148797A1 (en) 2019-08-08

Family

ID=65291468

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/100169 WO2019148797A1 (en) 2018-01-30 2018-08-13 Natural language processing method, device, computer apparatus, and storage medium

Country Status (2)

Country Link
CN (1) CN109344385B (en)
WO (1) WO2019148797A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111737973A (en) * 2020-06-29 2020-10-02 北京明略软件系统有限公司 Natural language retrieval statement parsing method, device, equipment and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115083624A (en) * 2021-03-11 2022-09-20 海信集团控股股份有限公司 Online inquiry method, user server and user terminal equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070073533A1 (en) * 2005-09-23 2007-03-29 Fuji Xerox Co., Ltd. Systems and methods for structural indexing of natural language text
CN105701253A (en) * 2016-03-04 2016-06-22 南京大学 Chinese natural language interrogative sentence semantization knowledge base automatic question-answering method
CN106997376A (en) * 2017-02-28 2017-08-01 浙江大学 The problem of one kind is based on multi-stage characteristics and answer sentence similarity calculating method

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101373532A (en) * 2008-07-10 2009-02-25 昆明理工大学 FAQ Chinese request-answering system implementing method in tourism field
CN101320374A (en) * 2008-07-10 2008-12-10 昆明理工大学 Field question classification method combining syntax structural relationship and field characteristic
US9015031B2 (en) * 2011-08-04 2015-04-21 International Business Machines Corporation Predicting lexical answer types in open domain question and answering (QA) systems
US9262406B1 (en) * 2014-05-07 2016-02-16 Google Inc. Semantic frame identification with distributed word representations
CN104503998B (en) * 2014-12-05 2018-11-20 百度在线网络技术(北京)有限公司 For the kind identification method and device of user query sentence
CN104657463B (en) * 2015-02-10 2018-04-27 乐娟 Question Classification method and device applied to automatically request-answering system
KR20170096659A (en) * 2016-02-16 2017-08-25 한국전자통신연구원 Question and answer apparatus, and question and answer method thereof
CN105930452A (en) * 2016-04-21 2016-09-07 北京紫平方信息技术股份有限公司 Smart answering method capable of identifying natural language
CN106528531B (en) * 2016-10-31 2019-09-03 北京百度网讯科技有限公司 Intention analysis method and device based on artificial intelligence
CN106919655B (en) * 2017-01-24 2020-05-19 网易(杭州)网络有限公司 Answer providing method and device
CN107480133B (en) * 2017-07-25 2020-07-28 广西师范大学 Subjective question self-adaptive scoring method based on answer implication and dependency relationship

Also Published As

Publication number Publication date
CN109344385A (en) 2019-02-15
CN109344385B (en) 2020-12-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18904004

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03.12.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18904004

Country of ref document: EP

Kind code of ref document: A1