CN110309252B

CN110309252B - Natural language processing method and device

Info

Publication number: CN110309252B
Application number: CN201810164982.6A
Authority: CN
Inventors: 李生; 王剑; 曹元斌; 温建华; 郎君; 司罗
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2018-02-28
Filing date: 2018-02-28
Publication date: 2023-11-24
Anticipated expiration: 2038-02-28
Also published as: CN110309252A

Abstract

The embodiments of the application disclose a natural language processing method and device. The method comprises the following steps: acquiring natural language content input by a user; carrying out syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data represents the dependency relationships among descriptors in the natural language content; and acquiring the user intention corresponding to the dependency relationship characteristic data by using a machine learning model component, wherein the machine learning model component is trained on the correspondence between a plurality of historical dependency relationship characteristic data and historical user intentions. The embodiments of the application can not only relax the strong-matching intention recognition mode of the prior art but also improve the accuracy of user intention recognition.

Description

Natural language processing method and device

Technical Field

The present application relates to the field of data processing technologies, and in particular, to a method and an apparatus for processing natural language.

Background

In recent years, chat robots (chatbots) have been widely used in many technical fields, for example as virtual customer service on various application platforms. Acting as a user's personal virtual manager, a chat robot can help the user look up weather and news, schedule meeting reminders, shop online, and so on. Whether a chat robot can promptly understand the intention behind a sentence entered by the user is one of the important indicators of its performance.

In the prior art, after a user inputs a search sentence on a chat robot platform, the chat robot acquires the user's search intention from the information in the search sentence and provides a corresponding service to meet the user's requirements. To acquire the search intention, the chat robot platform usually matches the sentence against static rules. Specifically, the platform may preset a plurality of static rules that express different search intentions, for example "I want to see + [movie wildcard content]". When the search sentence input by the user matches this static rule, the platform determines that the user's search intention is to watch a movie. However, the platform obtains the search requirement only when the search sentence strongly matches the static rule, that is, only when the sentence has exactly the form "I want to see ...". A search sentence that differs in wording from "I want to see ..." cannot be matched, even if its meaning is very similar.

Accordingly, there is a need in the art for a way of determining a user's search intention that relaxes the strong matching required in the prior art.

Disclosure of Invention

The embodiments of the application aim to provide a natural language processing method and device that can not only relax the strong-matching intention recognition mode of the prior art but also improve the accuracy of user intention recognition.

The natural language processing method and the device provided by the embodiment of the application are realized in the following steps:

a method of natural language processing, the method comprising:

acquiring natural language content input by a user;

carrying out syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationship among descriptive words in the natural language content;

acquiring user intention corresponding to the dependency characteristic data by using a machine learning model component; the machine learning model component is trained according to the corresponding relation between the plurality of historical dependency relation characteristic data and the historical user intention.

A method of natural language processing, the method comprising:

acquiring natural language content input by a user;

extracting dynamic intention descriptors in the natural language content;

performing synonym expansion on the dynamic intention descriptors by using descriptors having the same meaning as the dynamic intention descriptors;

and carrying out user intention matching on the natural language content by using static wildcard rules.

A natural language processing apparatus comprising a processor and a memory for storing processor-executable instructions, the processor implementing when executing the instructions:

acquiring natural language content input by a user;

extracting dynamic intention descriptors in the natural language content;

A computer readable storage medium having stored thereon computer instructions that when executed perform the steps of:

acquiring natural language content input by a user;

extracting dynamic intention descriptors in the natural language content;

performing description-mode expansion on the dynamic intention descriptors by using descriptors having the same meaning as the dynamic intention descriptors;

A method of natural language processing, the method comprising:

acquiring natural language content input by a user;

determining the user intention corresponding to the natural language content by using a machine learning model component, wherein the machine learning model component is trained according to a plurality of historical user intentions;

based on the user intention, corresponding processing is performed.

The natural language processing method and the natural language processing device provided by the application can be used for carrying out syntactic structure processing on the natural language input by a user and obtaining the dependency relationship characteristic data in the natural language. Then, a machine learning model can be utilized to obtain the user intent corresponding to the dependency characteristic data. Compared with the mode of matching intention information in natural language content by using static rules in the prior art, the technical scheme of the application can flexibly use the dependency relationship characteristic data in the natural language content, and the dependency relationship characteristic data can more accurately express the intention information of a user, so that the natural language processing mode provided by each embodiment of the application not only can weaken the intention recognition mode of strong matching in the prior art, but also can improve the accuracy of the intention recognition of the user.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

FIG. 1 is a user interface diagram of an application scenario provided by the present application;

FIG. 2 is a method flow diagram of one embodiment of a natural language processing method provided by the present application;

FIG. 3 is a schematic diagram of a syntactic structure analysis provided by the present application;

FIG. 4 is a schematic block diagram of a natural language processing device according to an embodiment of the present application.

Detailed Description

In order to make the technical solution of the present application better understood by those skilled in the art, the technical solution of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, shall fall within the scope of the application.

In order to facilitate understanding of the technical solution provided by the embodiments of the present application by those skilled in the art, a technical environment in which the technical solution is implemented is first described below.

In recent years, user intention recognition technology has been widely applied to chat robots, and the ability to recognize user intention accurately is an important index of chat robot performance. Users usually express themselves in natural language when talking to chat robots (e.g., intelligent customer service). Natural language reflects a user's personal habits of expression, and different users often have different habits. For example, the same meaning of wanting to try something can be expressed as "want", "try", "craving", and so on. Natural language is therefore variable and has no uniform rules of expression, which makes recognizing a user's intention from natural language a great challenge for chat robots. As described above, in the prior art a chat robot usually recognizes the intention expressed by the user's natural language input through static rule matching. The chat robot can recognize the intention only when its background holds data that is completely consistent with the natural language input by the user. Even if the meaning expressed by the natural language is close to a standard rule, the intention cannot be recognized. As a result, chat robots in the related art often fail to recognize the intention a user expresses during a conversation.

Based on the technical requirements similar to those described above, the natural language processing method provided by the application can perform feature extraction on the natural language input by the user, acquire feature data in the natural language of the user, and determine the user intention of the natural language according to the feature data.

The following describes a specific implementation of the method according to the present embodiment through a specific application scenario.

As shown in FIG. 1, when a user chats with an intelligent customer service R on an e-commerce platform, the user makes the request "please help me recommend several articles about scientific skin care" to the intelligent customer service R. After receiving this natural language content, a background server of the e-commerce platform carries out syntactic structure analysis on it to acquire its dependency relationship characteristic data. In one example, the modification relations among the descriptors in the natural language content are represented by a dependency tree. The descriptor corresponding to the root node of the dependency tree is "recommend", that is, the syntactic core word of the natural language content is "recommend"; the dependency word of "recommend" is "article"; and the dynamic entity descriptors in the natural language content are "scientific" and "skin care". On this basis, the dependency relationship characteristic data can be extracted as {syntactic core word = recommend, dependency word = article, dynamic entity descriptors = scientific, skin care}. The dependency relationship characteristic data is then input into a pre-trained machine learning model, which outputs the user's expressed intention "recommend @sys.any articles", where @sys.any is a wildcard; for this natural language content, @sys.any = scientific skin care. Once the expressed intention is known, the user's needs can be met on that basis. In the present scenario, for example, the user may be presented with a plurality of articles on scientific skin care.
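The scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary keys and the helper name `fill_wildcard` are hypothetical, and the predicted intent string is hard-coded as a stand-in for the output of the trained machine learning model.

```python
# Feature record for the skin-care request; the key names are illustrative assumptions.
features = {
    "core_word": "recommend",
    "dependency_words": ["article"],
    "entity_words": ["scientific", "skin care"],
}


def fill_wildcard(intent_template: str, entity_words: list[str],
                  wildcard: str = "@sys.any") -> str:
    """Replace the wildcard slot with the joined dynamic entity descriptors."""
    return intent_template.replace(wildcard, " ".join(entity_words))


# Stand-in for the model output described in the text (not a real prediction).
predicted_intent = "recommend @sys.any articles"
resolved = fill_wildcard(predicted_intent, features["entity_words"])
print(resolved)
```

Keeping the intent as a template with a wildcard slot means one intent label can cover any topic; only the slot value changes from request to request.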

The natural language processing method according to the present application will be described in detail with reference to FIG. 2. FIG. 2 is a method flow diagram of one embodiment of a natural language processing method provided by the present application. Although the application provides the method steps shown in the following embodiments or figures, the method may include more or fewer steps on the basis of routine work or without inventive effort. Where there is no logically necessary causal relationship between steps, their execution order is not limited to the order provided by the embodiments of the present application. When the method is executed in an actual natural language processing procedure or device, it may be performed sequentially or in parallel (e.g., in a parallel-processor or multithreaded environment) according to the embodiments or figures.

As shown in FIG. 2, the method may include:

S201: acquiring natural language content input by a user.

S203: carrying out syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationship among descriptive words in the natural language content.

S205: acquiring user intention corresponding to the dependency characteristic data by using a machine learning model component; the machine learning model component is trained according to the corresponding relation between the plurality of historical dependency relation characteristic data and the historical user intention.

In this embodiment, natural language content input by a user is acquired first. Natural language stands in contrast to logic language: natural language is a tool for interaction between people, whereas a logic language, such as a programming language (C, VB, etc.), is a tool for interaction between a person and a computer. The application scenarios in which a user inputs natural language content are varied. They may include natural language that expresses a search requirement on any platform, for example when the user talks with an intelligent customer service or with a personal virtual manager. The natural language content may include phrases, sentences, or any combination of the two. It may be text typed by the user, or text obtained by converting the user's voice content, for example through speech recognition. The user's intention to watch the movie XX may be expressed in various forms, for example: "I want to watch XX movie", "help me find XX movie", "XX movie high definition", "want to watch XX movie high definition", etc.

In this embodiment, the servers on each intelligent interaction platform may process the natural language content input by the user, where the servers may include a single server, and may also include a server cluster formed by multiple servers, which is not limited herein. After receiving the natural language content input by the user, the server can perform syntactic structure analysis on the natural language content to acquire dependency relationship characteristic data of the natural language content. The dependency characteristic data can be used for expressing the dependency relationship among the descriptors in the natural language content, and further expressing the core descriptors in the natural language content.

In one embodiment of the application, the modification relations among the descriptors in the natural language content can be obtained through syntactic structure analysis, and the dependency relationship characteristic data of the natural language content can be obtained according to those modification relations. Specifically, at least one descriptor may be extracted from the natural language content. In one example, the natural language content "I want to query the weather in Suzhou tomorrow" is segmented into words to obtain a plurality of descriptors. Redundant items, such as auxiliary particles and punctuation marks, may then be removed, so that the descriptors "I", "want", "query", "Suzhou", "tomorrow", and "weather" are extracted. Next, the modification relations among the descriptors may be determined; in this example, "query" is the predicate of "I", "weather" is the object of "query", and so on. In one embodiment, the modification relations may be obtained with a graph-based method, such as Eisner's algorithm. In another embodiment, they may be obtained with a transition-based method, such as the arc-eager, arc-standard, arc-hybrid, or easy-first algorithm. In other embodiments, they may be obtained with machine learning, for example a convolutional neural network model. The application does not limit the way in which the modification relations among descriptors are obtained.
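As an illustration of the transition-based family mentioned above, the following minimal sketch replays an arc-standard action sequence on the six descriptors of the Suzhou example. It is not the parser of the application: the action sequence is written by hand rather than predicted, and the attachments of "want", "Suzhou", and "tomorrow" are assumptions made only so that the example is complete (the text fixes only that "I" and "weather" attach to "query").

```python
def arc_standard(n_tokens: int, actions: list[str]) -> dict[int, int]:
    """Replay arc-standard transitions; returns {dependent: head}, with 0 as ROOT."""
    stack = [0]                               # ROOT starts on the stack
    buffer = list(range(1, n_tokens + 1))     # token positions, 1-based
    heads: dict[int, int] = {}
    for action in actions:
        if action == "SHIFT":                 # move next buffer token onto the stack
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":            # second-from-top depends on top
            dependent = stack.pop(-2)
            heads[dependent] = stack[-1]
        elif action == "RIGHT-ARC":           # top depends on second-from-top
            dependent = stack.pop()
            heads[dependent] = stack[-1]
        else:
            raise ValueError(f"unknown action: {action}")
    return heads


tokens = ["I", "want", "query", "Suzhou", "tomorrow", "weather"]
# Hand-written action sequence (an assumption, not model output).
actions = [
    "SHIFT", "SHIFT", "SHIFT", "LEFT-ARC", "LEFT-ARC",
    "SHIFT", "SHIFT", "SHIFT", "LEFT-ARC", "LEFT-ARC",
    "RIGHT-ARC", "RIGHT-ARC",
]
heads = arc_standard(len(tokens), actions)
for dep, head in sorted(heads.items()):
    print(tokens[dep - 1], "<-", "ROOT" if head == 0 else tokens[head - 1])
```

In a real transition-based parser, a classifier would choose each action from the current stack and buffer; the replay loop itself stays the same.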

In this embodiment, after the modification relations among the descriptors in the natural language content are obtained, a dependency tree of the descriptors can be constructed from them. Based on the dependency tree, the syntactic core word of the natural language content can be determined and used as dependency relationship characteristic data of the natural language content. Specifically, the descriptor corresponding to the root node of the dependency tree may be used as the syntactic core word. For example, from the natural language content "Asian Development Bank President Zuoguangfu hosted this seminar.", the descriptors "Asia", "development", "bank", "president", "Zuoguangfu", and "host", an aspect particle, "this", a measure word, "seminar", and the final punctuation mark can be extracted. Analyzing the modification relations among these descriptors shows that "Zuoguangfu" is the subject of "host", "seminar" is the object of "host", "Asia", "development", "bank", and "president" stand in compound-noun modification relations, and so on.

After the modification relations among the plurality of descriptors are determined, a dependency tree corresponding to the descriptors can be determined based on those relations. As shown in FIG. 3, the modification relations among the descriptors can be expressed by directed arcs. The part of speech of each descriptor is marked beneath it: NR is a proper noun, NN is a common noun, VV is a verb, AS is an aspect marker, DT is a determiner, M is a measure word, and PU is punctuation. In FIG. 3, a connecting line indicates that two descriptors have a modification relation, the directed arc points to the modified descriptor, and the relation type is marked on the line: ROOT marks the root node, NMOD is a compound-noun modification relation, SBJ is a subject relation, VMOD is a verb modification relation, OBJ is an object relation, and M is a measure-word modification relation. In one embodiment, the dependency tree may be constructed according to the following rules: each descriptor is regarded as a node, an auxiliary virtual node (the ROOT node) is inserted at the head of the sentence, and all nodes are connected by directed arcs to form a tree that satisfies the following conditions:

every node except the root node ROOT has exactly one incoming edge;

every node except the leaf nodes has at least one outgoing edge;

the root node has only one outgoing edge, and the corresponding directed arc points to the syntactic core word, which governs the whole sentence;

no two directed arcs may cross: if a directed arc exists between two nodes a and b, the horizontal projection of any directed arc between two nodes lying between a and b must fall within the horizontal projection of the directed arc between a and b.
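The conditions above can be checked mechanically. The sketch below assumes a representation that is not specified in the application: a list `heads` in which `heads[i - 1]` is the head position of token `i` (1-based) and `0` stands for the virtual ROOT node. With that representation the single-incoming-edge condition holds by construction, so the function checks the single ROOT arc, the absence of cycles, and the no-crossing (projectivity) condition.

```python
def is_valid_dependency_tree(heads: list[int]) -> bool:
    """Check the tree conditions for a head list (0 = ROOT, tokens are 1-based)."""
    n = len(heads)
    # Every head must be ROOT or another token, never the token itself.
    if any(h < 0 or h > n or h == i for i, h in enumerate(heads, start=1)):
        return False
    # ROOT has exactly one outgoing arc (to the syntactic core word).
    if heads.count(0) != 1:
        return False
    # Following head links from any token must reach ROOT (no cycles).
    for start in range(1, n + 1):
        seen, node = set(), start
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node - 1]
    # No two arcs may cross when drawn above the sentence.
    arcs = [(min(i, h), max(i, h)) for i, h in enumerate(heads, start=1)]
    for a, b in arcs:
        for c, d in arcs:
            if a < c < b < d:
                return False
    return True


# Suzhou example; attachments other than I->query and weather->query are assumed.
print(is_valid_dependency_tree([3, 3, 0, 6, 6, 3]))
# Arcs 1-3 and 2-4 cross, so this tree is rejected.
print(is_valid_dependency_tree([3, 4, 0, 3]))
```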

From the descriptor corresponding to the root node of the dependency tree, the syntactic core word of the natural language content "Asian Development Bank President Zuoguangfu hosted this seminar." can be determined to be "host". In the same way, the syntactic core word of "I want to query the weather in Suzhou tomorrow" can be determined to be "query".
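Reading the syntactic core word off a dependency tree can be sketched as follows, again assuming a head-list representation (`0` for ROOT, 1-based token positions) that the application does not prescribe. The head values for "want", "Suzhou", and "tomorrow" are illustrative assumptions.

```python
def core_word_and_dependents(tokens: list[str], heads: list[int]) -> tuple[str, list[str]]:
    """Return the token attached to ROOT and the tokens attached directly to it."""
    root_index = heads.index(0)                 # 0-based position of the core word
    core = tokens[root_index]
    dependents = [tok for tok, h in zip(tokens, heads) if h == root_index + 1]
    return core, dependents


tokens = ["I", "want", "query", "Suzhou", "tomorrow", "weather"]
heads = [3, 3, 0, 6, 6, 3]                      # partly assumed, see lead-in
core, dependents = core_word_and_dependents(tokens, heads)
print(core, dependents)
```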

In this embodiment, the syntactic core word of the natural language content is used as the dependency relationship characteristic data, and subsequent machine learning is performed on that data. The model therefore learns the key information in the natural language content, which reduces data redundancy and concentrates learning on the data that is actually useful.

In this embodiment, the dependency characteristic data may be used to characterize the intent characteristics of natural language content. In one embodiment of the application, the dependency characteristic data may further include at least one of:

the part of speech of the syntactic core word, a dependency word of the syntactic core word, the part of speech of the dependency word, a dynamic entity descriptor, the part of speech of the dynamic entity descriptor, the distance between the dynamic entity descriptor and the syntactic core word, and a synonym set of the dynamic entity descriptor.

The syntactic core word and its part of speech, the dependency words of the syntactic core word and their parts of speech, and the dynamic entity descriptors and their parts of speech all play important roles in expressing the intention features of natural language content. For example, a syntactic core word that is a verb expresses the user's intention better than one that is a noun. The dependency words of the syntactic core word may include the descriptors that have a modification relation with it. For example, in the natural language content "Asian Development Bank President Zuoguangfu hosted this seminar." described above, the descriptors that have a modification relation with the syntactic core word "host" (i.e., its dependency words) include "Zuoguangfu", "seminar", and ".", whose parts of speech are noun, noun, and punctuation respectively. In this embodiment, the dynamic entity descriptors may include the entity words among the descriptors of the natural language content, for example nouns from various fields. For example, the dynamic entity descriptors in the natural language content "I want to query the weather in Suzhou tomorrow" may include "Suzhou", "tomorrow", and "weather". The dependency relationship characteristic data may further include the part of speech of each dynamic entity descriptor, the distance between the dynamic entity descriptor and the syntactic core word, and the like. Typically, the closer a dynamic entity descriptor is to the syntactic core word, the better it expresses the user's intention. On this basis, the user intention of the natural language content may be determined from the characteristic data. As another example, the dynamic entity descriptors in the natural language content "I want the latest quote for an Apple phone" may include "apple", "phone", and "quote".

In addition, in the present embodiment, the dependency relationship characteristic data may include a synonym set of the dynamic entity descriptor. In practice, many things have more than one name. For example, "sun umbrella" and "beach umbrella" are two expressions for the same thing, and shirts, waistcoats, and scarves likewise each have alternative names. The synonym set of the dynamic entity descriptor may therefore also be used as dependency relationship characteristic data.
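The entity-related features listed above might be assembled as in the following sketch. Several points are assumptions rather than details from the application: the field names, the part-of-speech tags given to each token, the definition of distance as the difference in token positions, and the contents of the synonym table (only the "sun umbrella" / "beach umbrella" pair comes from the text). The second call uses a hypothetical two-token input solely to show the synonym lookup.

```python
# Synonym table; only this pair is taken from the text, the structure is assumed.
SYNONYM_SETS = {"sun umbrella": ["beach umbrella"]}


def build_entity_features(tokens, pos_tags, core_index, entity_positions, synonyms):
    """Describe each dynamic entity descriptor relative to the syntactic core word."""
    return [
        {
            "word": tokens[i],
            "pos": pos_tags[i],
            # Assumed distance measure: gap in token positions.
            "distance_to_core": abs(i - core_index),
            "synonyms": list(synonyms.get(tokens[i], [])),
        }
        for i in entity_positions
    ]


tokens = ["I", "want", "query", "Suzhou", "tomorrow", "weather"]
pos_tags = ["PN", "VV", "VV", "NR", "NT", "NN"]      # assumed tags
entity_features = build_entity_features(tokens, pos_tags, 2, [3, 4, 5], SYNONYM_SETS)
for row in entity_features:
    print(row)

# Hypothetical toy input that exercises the synonym lookup.
umbrella = build_entity_features(["buy", "sun umbrella"], ["VV", "NN"], 0, [1], SYNONYM_SETS)
print(umbrella)
```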

In this embodiment, after the natural language content input by the user is acquired, static wildcard rule matching may first be performed on it. The static wildcard rules may include a plurality of preset wildcard patterns, such as "I want * articles" and "I want to see * movie", where the symbol "*" is a wildcard. When static rule matching is performed, the dynamic intention descriptors in the natural language content can be extracted. A dynamic intention descriptor may be a descriptor in the natural language content that can express the user's intention and whose part of speech is a verb, for example verbs with obvious intention features such as "want" and "try". In this embodiment, the descriptions that have the same meaning as the dynamic intention descriptor may be acquired and substituted for it. For example, the dynamic intention descriptor "want" has several descriptions with the same meaning, such as "try" and "desire". To generalize the dynamic intention descriptor, it may be replaced by these same-meaning descriptions, so that static wildcard rule matching also covers them. For example, suppose the user says "I want to see XX movie" using a variant wording of "want". If the dynamic intention descriptor is not expanded with its same-meaning descriptions, the static wildcard rules cannot match the user's intention. In this embodiment, the dynamic intention descriptor "want" is expanded to cover "want", "try", "desire", and so on, and the content can therefore be matched to the preset general intention "I want to see * movie".
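One way to read the expansion step is to widen the rule itself, so that the intention verb in the pattern accepts any of its same-meaning descriptions. The following minimal sketch does this with regular expressions. The intent name `watch_movie`, the function names, and the choice to expand the rule rather than the user's sentence are assumptions; the synonym list and the pattern come from the text's example.

```python
import re

SYNONYMS = {"want": ["want", "try", "desire"]}        # from the text's example
RULES = {"watch_movie": "I want to see * movie"}      # intent name is hypothetical


def rule_to_regex(rule: str, synonyms: dict[str, list[str]]) -> re.Pattern:
    """Compile a wildcard rule, letting intention verbs match any listed synonym."""
    parts = []
    for word in rule.split():
        if word == "*":
            parts.append(r"(.+)")                      # wildcard slot
        elif word in synonyms:
            alternatives = "|".join(re.escape(s) for s in synonyms[word])
            parts.append(f"(?:{alternatives})")
        else:
            parts.append(re.escape(word))
    return re.compile(r"\s+".join(parts), re.IGNORECASE)


def match_intent(text: str, rules=RULES, synonyms=SYNONYMS):
    """Return (intent, wildcard value) for the first matching rule, else None."""
    for intent, rule in rules.items():
        m = rule_to_regex(rule, synonyms).fullmatch(text.strip())
        if m:
            return intent, m.group(1)
    return None


print(match_intent("I want to see XX movie"))
print(match_intent("I desire to see XX movie"))
print(match_intent("please play XX"))
```

Without the synonym table, the second sentence would fail to match, which is the strong-matching limitation the expansion is meant to relax.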

In this embodiment, after the dependency relationship characteristic data of the natural language content is obtained, it may be processed by a machine learning model component to obtain the corresponding user intention. The machine learning model component is trained on the correspondence between a plurality of historical dependency relationship characteristic data and historical user intentions.

In this embodiment, to construct the machine learning model component, a plurality of historical natural language contents and the historical user intentions corresponding to each of them may be obtained. The dependency relationship characteristic data in the historical natural language content can then be extracted in the same manner as in the above embodiments, which is not repeated here. After the dependency relationship characteristic data is extracted, a machine learning model component containing training parameters can be constructed. The dependency relationship characteristic data of the historical natural language content is used as the input of the machine learning model component and the historical user intention as its output. The component is trained on the correspondence between the two, and the training parameters are adjusted until the component meets a preset requirement. In this embodiment, the machine learning method may include the K-nearest neighbor algorithm, the perceptron algorithm, decision trees, support vector machines, logistic regression, maximum entropy, and the like, as well as generative models such as naive Bayes and hidden Markov models. In other embodiments, the machine learning model component may include a deep learning model component, such as a convolutional neural network model component or a recurrent neural network model component. The application is not limited in this regard.
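The training procedure above can be illustrated with naive Bayes, one of the model families the text lists. The sketch below is a small self-contained multinomial naive Bayes with Laplace smoothing, not the model of the application. The class name, the encoding of features as strings such as `core=recommend`, and the four training pairs are all hypothetical and serve only to show inputs (dependency features) mapped to outputs (intents with wildcards).

```python
import math
from collections import Counter, defaultdict


class NaiveBayesIntent:
    """Multinomial naive Bayes over string-encoded dependency features."""

    def __init__(self, alpha: float = 1.0):
        self.alpha = alpha                          # Laplace smoothing constant
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, samples, intents):
        for feats, intent in zip(samples, intents):
            self.class_counts[intent] += 1
            self.feature_counts[intent].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, feats):
        total = sum(self.class_counts.values())

        def log_posterior(intent):
            counts = self.feature_counts[intent]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(self.class_counts[intent] / total)
            for f in feats:
                score += math.log((counts[f] + self.alpha) / denom)
            return score

        return max(self.class_counts, key=log_posterior)


# Hypothetical historical data: dependency features -> intent with wildcard.
history = [
    (["core=recommend", "dep=article"], "recommend * articles"),
    (["core=query", "dep=weather"], "query * weather"),
    (["core=see", "dep=movie"], "see * movie"),
    (["core=find", "dep=movie"], "see * movie"),
]
model = NaiveBayesIntent().fit([f for f, _ in history], [y for _, y in history])
print(model.predict(["core=recommend", "dep=article"]))
print(model.predict(["core=find", "dep=movie"]))
```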

When the machine learning model component is trained with the dependency relationship characteristic data, the amount of historical natural language content is large, so the amount of characteristic data extracted from it is also large. As described above, the dependency relationship characteristic data can include dynamic entity descriptors, which are important for identifying the user's intention. In a typical entity extraction approach, the type of each extracted entity has to be labeled; for example, the entity "dress" is extracted and its category label is set to "clothing". In the embodiments of the present application, after the dynamic entity descriptors in the natural language content are extracted, their category label may be set to a unified preset label, such as "KEYWORD" or "TAB", so that no specific type has to be assigned. The reason is that the same entity can have different types in different fields (the entity descriptor "apple" may be a "company name" in one field and a "fruit name" in another), and an entity-specific category label would bring redundant information into the subsequent intention recognition and could cause recognition errors.
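The unified labeling described above can be sketched as a simple tagging step. The function name, the use of "O" for non-entity tokens, and the tokenization of the example sentence are assumptions for illustration; only the unified label "KEYWORD" and the "apple" / "phone" / "quote" entities come from the text.

```python
UNIFIED_LABEL = "KEYWORD"   # one of the unified preset labels named in the text


def label_entities(tokens, entity_positions, label=UNIFIED_LABEL):
    """Tag every dynamic entity descriptor with the same label; others get 'O'."""
    entity_positions = set(entity_positions)
    return [
        (tok, label if i in entity_positions else "O")   # "O" is an assumed filler tag
        for i, tok in enumerate(tokens)
    ]


tokens = ["I", "want", "latest", "quote", "apple", "phone"]   # assumed tokenization
tagged = label_entities(tokens, [3, 4, 5])
print(tagged)
```

Because "apple" receives the same tag whether it names a company or a fruit, the intent model never sees a domain-specific category that could mislead it.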

In one embodiment of the application, the historical user intentions may include at least one preset type. For a personal assistant, for example, the historical user intentions may include the following categories: setting a wake-up alarm, checking mail, checking the weather, etc. When the machine learning model component is trained, the dependency relationship characteristic data is used as its input and the intentions of these specific types are used as its output, and training continues until the component meets the preset requirement. In addition, the historical user intentions are provided with wildcards, as in "I want * articles" and "I want to see * movie" in the example above. Setting a wildcard in a historical user intention gives the intention a unified expression, and the wildcard can be replaced with different entity information to form multiple pieces of information that correspond to the same expressed intention. For example, in the expressed intention "I want * articles", the wildcard "*" can be replaced with entity descriptors such as "sports", "emotion", "health", and "finance" to construct a variety of user inputs that all belong to the article search requirement.
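The wildcard substitution described above can be sketched as a small helper that turns one intent template into several surface forms sharing the same label. The function name and the pairing of each sentence with its template as a training label are assumptions for illustration; the template and the four entity descriptors come from the text.

```python
def expand_template(template: str, entities: list[str], wildcard: str = "*") -> list[str]:
    """Replace the wildcard in an intent template with each entity descriptor."""
    return [template.replace(wildcard, entity) for entity in entities]


template = "I want * articles"
sentences = expand_template(template, ["sports", "emotion", "health", "finance"])

# Each generated sentence keeps the template as its (assumed) intent label.
training_pairs = [(sentence, template) for sentence in sentences]
for pair in training_pairs:
    print(pair)
```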

The natural language processing method provided by the application performs syntactic structure analysis on the natural language content input by the user to obtain the dependency relationship characteristic data of that content, and then uses a machine learning model component to obtain the user intent corresponding to the dependency relationship characteristic data. Compared with the prior-art approach of matching intent information in natural language content with static rules, the technical scheme of the application makes flexible use of the dependency relationship characteristic data in the natural language content, and this data expresses the user's intent information more accurately. The natural language processing approach provided by the embodiments of the application therefore not only reduces reliance on the strong-matching intent recognition of the prior art, but also improves the accuracy of user intent recognition.

In another aspect, the present application further provides a natural language processing device. Fig. 4 is a schematic block diagram of an embodiment of the natural language processing device provided by the present application. As shown in fig. 4, the natural language processing device may include a processor and a memory for storing instructions executable by the processor, and the processor, when executing the instructions, implements:

acquiring natural language content input by a user;

Optionally, in an embodiment of the present application, when implementing the step of performing syntactic structure analysis on the natural language content to obtain the dependency relationship characteristic data of the natural language content, the processor may implement:

extracting at least one descriptor from the natural language content;

determining a modification relation between the at least one descriptor;

and determining a syntactic core word in the natural language content according to the modification relation, and taking the syntactic core word as dependency relation characteristic data of the natural language content.

Optionally, in an embodiment of the present application, when implementing the step of determining a syntactic core word in the natural language content according to the modification relation, the processor may implement:

constructing a dependency relationship tree of the at least one descriptor according to the modification relationship;

and taking the descriptive word corresponding to the root node of the dependency relationship tree as a syntactic core word of the natural language content.
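The two steps above can be sketched as follows. This is only an illustration under stated assumptions: the modification relations are supplied by hand as (dependent index, head index) pairs rather than produced by a real parser, and the names `build_dependency_tree` and `syntactic_core_word` are hypothetical.

```python
from collections import defaultdict


def build_dependency_tree(tokens, arcs):
    """Build a tree from (dependent_index, head_index) modification relations.

    Returns (root_index, children), where children maps a head index to the
    indices of the descriptors that modify it. Cycles are not checked here.
    """
    children = defaultdict(list)
    has_head = set()
    for dependent, head in arcs:
        children[head].append(dependent)
        has_head.add(dependent)
    # The root is the only descriptor that does not modify another one.
    roots = [i for i in range(len(tokens)) if i not in has_head]
    if len(roots) != 1:
        raise ValueError("expected exactly one root descriptor")
    return roots[0], dict(children)


def syntactic_core_word(tokens, arcs):
    """Return the descriptor at the root of the dependency relationship tree."""
    root, _ = build_dependency_tree(tokens, arcs)
    return tokens[root]


# Hand-written relations: "I" and "articles" depend on "want",
# and "sports" modifies "articles".
tokens = ["I", "want", "sports", "articles"]
arcs = [(0, 1), (3, 1), (2, 3)]
root_index, children = build_dependency_tree(tokens, arcs)
core = syntactic_core_word(tokens, arcs)
```

Token indices are used instead of the words themselves so that a word occurring twice in the content is not confused with itself.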

Optionally, in an embodiment of the present application, the dependency characteristic data may further include at least one of:

Optionally, in an embodiment of the present application, after implementing the step of obtaining the natural language content input by the user, the processor may further implement:

extracting dynamic intention descriptors in the natural language content;

acquiring a description mode which has the same meaning as the dynamic intention descriptor;

and matching the natural language content by using a static wildcard rule, and also matching the description modes which have the same meaning as the dynamic intention descriptor.
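One possible reading of the three steps above is sketched below. Everything in it is an assumption made for illustration: the table `SAME_MEANING`, the rule syntax with `*` as the wildcard, and the function names are hypothetical, and the application does not prescribe regular expressions as the matching mechanism.

```python
import re

# Hypothetical table of intention descriptors that share one meaning.
SAME_MEANING = [["want", "would like", "need"]]


def rule_to_regex(rule):
    """Turn a static rule containing '*' wildcards into a compiled regex."""
    parts = [re.escape(part) for part in rule.split("*")]
    return re.compile("^" + "(.+)".join(parts) + "$", re.IGNORECASE)


def find_intent_descriptor(text):
    """Return (descriptor, equivalents) for the first known phrase in text."""
    lowered = text.lower()
    for group in SAME_MEANING:
        # Longer phrases are tried first so multi-word phrases win.
        for phrase in sorted(group, key=len, reverse=True):
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                return phrase, [p for p in group if p != phrase]
    return None, []


def match_rule(text, rule):
    """Match the text itself, then its same-meaning rewrites, against the rule."""
    pattern = rule_to_regex(rule)
    candidates = [text]
    descriptor, equivalents = find_intent_descriptor(text)
    if descriptor is not None:
        target = r"\b" + re.escape(descriptor) + r"\b"
        for alt in equivalents:
            candidates.append(
                re.sub(target, alt, text, count=1, flags=re.IGNORECASE)
            )
    for candidate in candidates:
        found = pattern.match(candidate.strip())
        if found:
            return found.groups()  # the text captured by each wildcard
    return None


result = match_rule("I would like sports articles", "I want * articles")
```

With this arrangement a single static rule written with "want" also covers inputs phrased with "would like" or "need", without adding a rule per phrasing.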

Optionally, in an embodiment of the present application, the machine learning model component may be obtained by training in the following manner:

acquiring a plurality of historical natural language contents and historical user intentions respectively corresponding to the plurality of historical natural language contents;

extracting dependency relationship characteristic data of the plurality of historical natural language contents respectively;

constructing a machine learning model component, wherein training parameters are arranged in the machine learning model component;

and training the machine learning model component on the correspondence between the dependency relationship characteristic data and the historical user intentions, with the dependency relationship characteristic data of the historical natural language contents as input data of the machine learning model component and the historical user intentions as output data, and adjusting the training parameters until the machine learning model component meets a preset requirement.
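The training steps above can be sketched with a small multi-class perceptron. The application does not name a model type, so the perceptron, the feature strings such as `core=want`, the intent names, and the stopping rule based on training accuracy are all assumptions chosen only to make the loop concrete.

```python
from collections import defaultdict


def predict(weights, intents, features):
    """Score each intent by summing its feature weights; return the best."""
    return max(
        intents,
        key=lambda intent: sum(weights[intent][f] for f in features),
    )


def train(samples, required_accuracy=1.0, max_epochs=50):
    """Train on (feature_list, intent) pairs until the preset requirement holds.

    The training parameters are the per-intent feature weights. Training stops
    once an epoch reaches required_accuracy or max_epochs is exhausted.
    """
    intents = sorted({intent for _, intent in samples})
    weights = {intent: defaultdict(float) for intent in intents}
    for _ in range(max_epochs):
        correct = 0
        for features, gold in samples:
            guess = predict(weights, intents, features)
            if guess == gold:
                correct += 1
            else:
                # Adjust the parameters toward the historical user intent.
                for f in features:
                    weights[gold][f] += 1.0
                    weights[guess][f] -= 1.0
        if correct / len(samples) >= required_accuracy:
            break
    return weights, intents


# Hypothetical historical dependency features and their historical intents.
history = [
    (["core=want", "dep=articles", "entity=KEYWORD"], "search_article"),
    (["core=see", "dep=movies", "entity=KEYWORD"], "watch_movie"),
    (["core=check", "dep=weather"], "view_weather"),
    (["core=want", "dep=movies", "entity=KEYWORD"], "watch_movie"),
]
weights, intents = train(history)
```

In practice the preset requirement would more plausibly be measured on held-out data; training accuracy is used here only to keep the sketch short.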

Optionally, in an embodiment of the present application, the historical user intentions corresponding to the plurality of historical natural language contents respectively may include at least one preset type, and wildcards are set in the historical user intentions.

Optionally, in an embodiment of the present application, after implementing the step of obtaining the plurality of historical natural language contents and the historical user intentions corresponding to the plurality of historical natural language contents respectively, the processor may further implement:

extracting dynamic entity descriptors in the plurality of historical natural language contents;

setting the category labels of the dynamic entity descriptors as unified preset labels.

Optionally, in an embodiment of the present application, the natural language content includes text content input by a user, and/or text content converted according to voice content input by the user.

Another aspect of the present application provides another embodiment of a natural language processing apparatus. The apparatus includes a processor and a memory for storing instructions executable by the processor, and the processor, when executing the instructions, implements:

acquiring natural language content input by a user;

extracting dynamic intention descriptors in the natural language content;

Another aspect of the application also provides a computer readable storage medium having stored thereon computer instructions that when executed perform the steps of:

acquiring natural language content input by a user;

extracting dynamic intention descriptors in the natural language content;

The computer readable storage medium may include physical means for storing information, typically by digitizing the information and then storing it in a medium using electrical, magnetic, or optical means. The computer readable storage medium according to the present embodiment may include: devices that store information using electrical energy, such as various memories, e.g., RAM, ROM, and USB flash disks; devices that store information using magnetic energy, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, and bubble memories; and devices that store information optically, such as CDs or DVDs. Of course, there are other kinds of readable storage media, such as quantum memories and graphene memories.

Although the application provides the method operation steps as described in the embodiments or flowcharts, more or fewer operation steps may be included based on conventional or non-inventive means. The order of steps recited in the embodiments is merely one of many possible execution orders and does not represent the only order of execution. When implemented by an actual device or client product, the steps may be executed sequentially or in parallel (e.g., in a parallel-processor or multi-threaded processing environment) according to the methods shown in the embodiments or figures.

Those skilled in the art will also appreciate that, in addition to implementing the controller in a pure computer readable program code, it is well possible to implement the same functionality by logically programming the method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, etc. Such a controller can be regarded as a hardware component, and means for implementing various functions included therein can also be regarded as a structure within the hardware component. Or even means for achieving the various functions may be regarded as either software modules implementing the methods or structures within hardware components.

The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, classes, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

From the above description of embodiments, it will be apparent to those skilled in the art that the present application may be implemented in software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and which includes several instructions for causing a computer device (which may be a personal computer, a mobile terminal, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments of the present application.

The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be understood by reference to one another, and each embodiment focuses on its differences from the other embodiments. The application is operational with numerous general purpose or special purpose computer system environments or configurations. Examples include personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable electronic devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.

Although the present application has been described by way of examples, one of ordinary skill in the art appreciates that there are many variations and modifications that do not depart from the spirit of the application, and it is intended that the appended claims encompass such variations and modifications as fall within the spirit of the application.

Claims

1. A method of natural language processing, the method comprising:

acquiring natural language content input by a user;

performing syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationships among descriptors in the natural language content; the descriptors comprise a syntactic core word, a dependency word corresponding to the syntactic core word, and a dynamic entity descriptor; the dynamic entity descriptor is an entity word among the descriptors; the dependency relationship characteristic data comprises the syntactic core word and the distance between the dynamic entity descriptor and the syntactic core word; the dependency word of the syntactic core word comprises a descriptor having a modification relation with the syntactic core word;

2. The method of claim 1, wherein the performing syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content comprises:

extracting at least one descriptor from the natural language content;

determining a modification relation between the at least one descriptor;

3. The method of claim 2, wherein said determining a syntactic core word in the natural language content according to the modification relation comprises:

4. The method of claim 2, wherein the dependency characteristic data further comprises at least one of:

the part of speech of the syntactic core word, the dependency word of the syntactic core word, the part of speech of the dependency word, the dynamic entity descriptor, the part of speech of the dynamic entity descriptor, and the synonym set of the dynamic entity descriptor.

5. The method of claim 1, wherein after the obtaining the user-entered natural language content, the method further comprises:

extracting dynamic intention descriptors in the natural language content;

6. The method of claim 1, wherein the machine learning model component is obtained by training in the following manner:

7. The method of claim 6, wherein the historical user intents respectively corresponding to the plurality of historical natural language contents include at least one preset type, and wherein wildcards are set in the historical user intents.

8. The method of claim 6, wherein after the obtaining a plurality of historical natural language content and the historical user intent to which the plurality of historical natural language content respectively correspond, the method further comprises:

9. The method of claim 1, wherein the natural language content comprises user-entered text content and/or text content converted from user-entered speech content.

10. A method of natural language processing, the method comprising:

acquiring natural language content input by a user;

extracting dynamic intention descriptors in the natural language content;

performing user intention matching on the natural language content by using a static wildcard rule; wherein this includes: performing syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationships among descriptors in the natural language content; the descriptors comprise a syntactic core word, a dependency word corresponding to the syntactic core word, and a dynamic entity descriptor; the dynamic entity descriptor is an entity word among the descriptors; the dependency relationship characteristic data comprises the syntactic core word and the distance between the dynamic entity descriptor and the syntactic core word; the dependency word of the syntactic core word comprises a descriptor having a modification relation with the syntactic core word; acquiring the user intention corresponding to the dependency relationship characteristic data by using a machine learning model component; the machine learning model component is trained according to the correspondence between a plurality of historical dependency relationship characteristic data and historical user intentions.

11. A natural language processing apparatus comprising a processor and a memory for storing processor-executable instructions, the processor, when executing the instructions, implementing:

acquiring natural language content input by a user;

performing syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationships among descriptors in the natural language content; the descriptors comprise a syntactic core word, a dependency word corresponding to the syntactic core word, and a dynamic entity descriptor; the dynamic entity descriptor is an entity word among the descriptors; the dependency relationship characteristic data comprises the syntactic core word and the distance between the dynamic entity descriptor and the syntactic core word; the dependency word of the syntactic core word comprises a descriptor having a modification relation with the syntactic core word;

12. The apparatus of claim 11, wherein the processor, when implementing the step of performing syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, implements:

extracting at least one descriptor from the natural language content;

determining a modification relation between the at least one descriptor;

13. The apparatus of claim 12, wherein the processor, when implementing the step of determining a syntactic core word in the natural language content according to the modification relation, implements:

14. The apparatus of claim 12, wherein the dependency characteristic data further comprises at least one of:

15. The apparatus of claim 11, wherein the processor, after implementing the step of obtaining the natural language content input by the user, further implements:

extracting dynamic intention descriptors in the natural language content;

16. The apparatus of claim 11, wherein the machine learning model component is obtained by training in the following manner:

17. The apparatus of claim 16, wherein the historical user intents respectively corresponding to the plurality of historical natural language contents include at least one preset type, and wildcards are set in the historical user intents.

18. The apparatus of claim 16, wherein the processor, after implementing the step of obtaining a plurality of historical natural language contents and historical user intents respectively corresponding to the plurality of historical natural language contents, further implements:

19. The apparatus of claim 11, wherein the natural language content comprises user-entered text content and/or text content converted from user-entered speech content.

20. A natural language processing apparatus comprising a processor and a memory for storing processor-executable instructions, the processor, when executing the instructions, implementing:

acquiring natural language content input by a user;

extracting dynamic intention descriptors in the natural language content;

performing user intention matching on the natural language content by using a static wildcard rule; wherein this includes: performing syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationships among descriptors in the natural language content; the descriptors comprise a syntactic core word, a dependency word corresponding to the syntactic core word, and a dynamic entity descriptor; the dynamic entity descriptor is an entity word among the descriptors; the dependency relationship characteristic data comprises the syntactic core word and the distance between the dynamic entity descriptor and the syntactic core word; the dependency word of the syntactic core word comprises a descriptor having a modification relation with the syntactic core word; acquiring the user intention corresponding to the dependency relationship characteristic data by using a machine learning model component; the machine learning model component is trained according to the correspondence between a plurality of historical dependency relationship characteristic data and historical user intentions.

21. A computer readable storage medium having stored thereon computer instructions, the instructions when executed performing the steps of:

acquiring natural language content input by a user;

22. A computer readable storage medium having stored thereon computer instructions, the instructions when executed performing the steps of:

acquiring natural language content input by a user;

extracting dynamic intention descriptors in the natural language content;