CN110309252B - Natural language processing method and device - Google Patents

Publication number
CN110309252B
Authority
CN
China
Prior art keywords
natural language
dependency
descriptor
language content
characteristic data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810164982.6A
Other languages
Chinese (zh)
Other versions
CN110309252A (en)
Inventor
李生
王剑
曹元斌
温建华
郎君
司罗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201810164982.6A
Publication of CN110309252A
Application granted
Publication of CN110309252B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The embodiments of the application disclose a natural language processing method and device. The method comprises the following steps: acquiring natural language content input by a user; performing syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationships among descriptors in the natural language content; and acquiring the user intention corresponding to the dependency relationship characteristic data by using a machine learning model component, wherein the machine learning model component is trained according to the correspondence between a plurality of historical dependency relationship characteristic data and historical user intentions. The embodiments of the application not only relax the strong-matching mode of intention recognition used in the prior art but also improve the accuracy of user intention recognition.

Description

Natural language processing method and device
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method and an apparatus for processing natural language.
Background
In recent years, chat robots (chatbots) have been widely used in many technical fields, for example as virtual customer service agents on various application platforms. Acting as a user's personal virtual manager, a chat robot can help the user look up weather and news, arrange meeting reminders, shop for goods online, and so on. Whether a chat robot can immediately understand the intention behind a sentence entered by the user is one of the important indicators of its performance.
In the prior art, after a user inputs a search sentence on a chat robot platform, the chat robot obtains the user's search intention from the information in the search sentence and provides the corresponding service according to that intention. To obtain the search intention, the chat robot platform usually matches it by means of static rules. Specifically, the platform may preset a plurality of static rules that express different search intentions, for example the static rule "I want to see + [movie wildcard content]". When the search sentence input by the user matches this static rule, the platform determines that the user's search intention is to watch a movie. However, the platform can obtain the user's search requirement only when the search sentence strongly matches the static rule, that is, the search sentence must take the form 'i want to see … …'. A search sentence that has a very similar meaning to 'i want to see … …' but is worded differently cannot be matched, so the platform cannot obtain the user's search requirement.
Accordingly, there is a need in the art for a way of determining a user's search intention that relaxes the strong matching of the prior art.
Disclosure of Invention
The embodiments of the application aim to provide a natural language processing method and device that not only relax the strong-matching mode of intention recognition used in the prior art but also improve the accuracy of user intention recognition.
The natural language processing method and device provided by the embodiments of the application are implemented as follows:
a method of natural language processing, the method comprising:
acquiring natural language content input by a user;
carrying out syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationship among descriptive words in the natural language content;
acquiring user intention corresponding to the dependency characteristic data by using a machine learning model component; the machine learning model component is trained according to the corresponding relation between the plurality of historical dependency relation characteristic data and the historical user intention.
A method of natural language processing, the method comprising:
acquiring natural language content input by a user;
extracting dynamic intention descriptors in the natural language content;
performing synonym expansion on the dynamic intention descriptor by using descriptors with the same meaning as the dynamic intention descriptor;
and carrying out user intention matching on the natural language content by utilizing static wildcard rules.
A natural language processing apparatus comprising a processor and a memory for storing processor-executable instructions, the processor implementing when executing the instructions:
acquiring natural language content input by a user;
carrying out syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationship among descriptive words in the natural language content;
acquiring user intention corresponding to the dependency characteristic data by using a machine learning model component; the machine learning model component is trained according to the corresponding relation between the plurality of historical dependency relation characteristic data and the historical user intention.
A natural language processing apparatus comprising a processor and a memory for storing processor-executable instructions, the processor implementing when executing the instructions:
acquiring natural language content input by a user;
extracting dynamic intention descriptors in the natural language content;
performing synonym expansion on the dynamic intention descriptor by using descriptors with the same meaning as the dynamic intention descriptor;
and carrying out user intention matching on the natural language content by utilizing static wildcard rules.
A computer readable storage medium having stored thereon computer instructions that when executed perform the steps of:
acquiring natural language content input by a user;
carrying out syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationship among descriptive words in the natural language content;
acquiring user intention corresponding to the dependency characteristic data by using a machine learning model component; the machine learning model component is trained according to the corresponding relation between the plurality of historical dependency relation characteristic data and the historical user intention.
A computer readable storage medium having stored thereon computer instructions that when executed perform the steps of:
acquiring natural language content input by a user;
extracting dynamic intention descriptors in the natural language content;
expanding the description modes of the dynamic intention descriptor by using descriptors with the same meaning as the dynamic intention descriptor;
and carrying out user intention matching on the natural language content by utilizing static wildcard rules.
A method of natural language processing, the method comprising:
acquiring natural language content input by a user;
determining user intention corresponding to the natural language content by using a machine learning model component; the machine learning model component is trained according to a plurality of historical user intentions;
based on the user intention, corresponding processing is performed.
The natural language processing method and device provided by the application perform syntactic structure analysis on the natural language input by a user and obtain the dependency relationship characteristic data of that natural language. A machine learning model is then used to obtain the user intention corresponding to the dependency relationship characteristic data. Compared with the prior-art approach of matching intention information in natural language content by static rules, the technical solution of the application makes flexible use of the dependency relationship characteristic data in the natural language content, and this data expresses the user's intention information more accurately. The natural language processing approach provided by the embodiments of the application therefore not only relaxes the strong-matching mode of intention recognition used in the prior art but also improves the accuracy of user intention recognition.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a user interface diagram of an application scenario provided by the present application;
FIG. 2 is a method flow diagram of one embodiment of a natural language processing method provided by the present application;
FIG. 3 is a schematic diagram of a syntactic structure analysis provided by the present application;
FIG. 4 is a schematic block diagram of a natural language processing device according to an embodiment of the present application.
Detailed Description
In order to make the technical solution of the present application better understood by those skilled in the art, the technical solution of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, shall fall within the scope of the application.
In order to facilitate understanding of the technical solution provided by the embodiments of the present application by those skilled in the art, a technical environment in which the technical solution is implemented is first described below.
In recent years, user intention recognition technology has been widely applied to chat robots, and the ability to recognize user intention accurately is an important index of a chat robot's performance. Users usually express themselves in natural language when talking to chat robots (e.g., intelligent customer service). Natural language is tied to a user's personal habits of expression, and different users often have different habits; for example, the same meaning of wanting something can be expressed in various ways such as "want", "try", and "craving". Natural language is therefore irregular and follows no uniform rules of expression, which makes recognizing a user's intention from natural language a great challenge for chat robots. As described above, in the prior art a chat robot usually recognizes the intention expressed by the user's natural language input through static rule matching. The chat robot can recognize the intention only when its backend holds data that is completely consistent with the natural language input by the user. Even if the meaning expressed by the natural language is close to a standard rule, the intention cannot be recognized. Consequently, when a user converses with a chat robot in the related art, the chat robot often fails to recognize the intention the user expresses.
Based on technical requirements such as those described above, the natural language processing method provided by the application can perform feature extraction on the natural language input by the user, acquire the characteristic data of that natural language, and determine the user intention of the natural language according to the characteristic data.
The following describes a specific implementation of the method according to the present embodiment through a specific application scenario.
As shown in FIG. 1, when a user chats with an intelligent customer service R on an e-commerce platform, the user makes the request "please help me recommend several articles about scientific skin care" to the intelligent customer service R. After receiving this natural language content, a background server of the e-commerce platform performs syntactic structure analysis on it to acquire its dependency relationship characteristic data. In one example, the modification relations between the descriptors in the natural language content are represented by a dependency tree. The descriptor corresponding to the root node of the dependency tree is "recommend", that is, the syntactic core word of the natural language content is "recommend"; the dependency word of "recommend" is "article"; and the dynamic entity descriptors in the natural language content are "scientific" and "skin care". Based on this, the dependency relationship characteristic data of the natural language content can be extracted as {syntactic core word = recommend, dependency word = article, dynamic entity descriptor = scientific, skin care}. The dependency relationship characteristic data is then input into a pre-trained machine learning model, which yields the user's expressed intention "recommend @sys.any article", where @sys.any is a wildcard; for this natural language content, @sys.any = scientific skin care. Once the user's expressed intention is known, the user's needs can be satisfied accordingly; in the present scenario, a plurality of articles on scientific skin care can be presented to the user.
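The data flow of this scenario can be illustrated with a minimal Python sketch. The class `DependencyFeatures`, the function `fill_intent`, and the hard-coded template are hypothetical names introduced only for illustration; in the scenario the intention template would come from the pre-trained model, which is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class DependencyFeatures:
    """Hypothetical container for the dependency relationship characteristic data."""
    core_word: str                       # syntactic core word
    dependency_word: str                 # dependency word of the core word
    entity_words: list = field(default_factory=list)  # dynamic entity descriptors


def fill_intent(template: str, features: DependencyFeatures) -> dict:
    """Bind the @sys.any wildcard of an intention template to the entity words."""
    return {"intent": template, "@sys.any": " ".join(features.entity_words)}


features = DependencyFeatures("recommend", "article", ["scientific", "skin care"])
# Assumption: the template below stands in for the output of the trained model.
result = fill_intent("recommend @sys.any article", features)
print(result)
```

Here the wildcard slot is simply the concatenation of the dynamic entity descriptors, which matches the binding @sys.any = scientific skin care described in the scenario.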
The natural language processing method according to the present application will be described in detail with reference to FIG. 2. FIG. 2 is a method flow diagram of one embodiment of a natural language processing method provided by the present application. Although the application provides the method steps shown in the following embodiments or figures, the method may include more or fewer steps based on routine or non-inventive effort. For steps that have no logically necessary causal relationship, the execution order is not limited to the order provided by the embodiments of the present application. When the method is executed in an actual natural language processing procedure or device, it may be executed sequentially or in parallel (e.g., in a parallel-processor or multithreaded environment) according to the embodiments or the figures.
As shown in fig. 2, the method may include:
S201: Acquiring natural language content input by a user.
S203: Carrying out syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationship among descriptive words in the natural language content.
S205: acquiring user intention corresponding to the dependency characteristic data by using a machine learning model component; the machine learning model component is trained according to the corresponding relation between the plurality of historical dependency relation characteristic data and the historical user intention.
In this embodiment, natural language content input by a user is acquired first. Natural language is the counterpart of logical language: natural language is a tool for interaction between human brains, whereas a logical language is a tool for interaction between the human brain and a computer, such as a programming language (C, VB, etc.). In this embodiment, the user may input natural language content in various application scenarios, including when the user talks with intelligent customer service, when the user talks with a personal virtual manager, and when the user enters natural language on any platform to express a search requirement. The natural language content may include phrases, sentences, or any combination of the two. The natural language content can include text entered by the user, or text obtained by converting the user's voice content, for example by performing recognition on the voice content. The user's intention to watch the XX movie may be expressed in various forms, for example: "I want to watch XX movie", "help me find XX movie", "XX movie high definition", "want to watch XX movie high definition", etc.
In this embodiment, the servers on each intelligent interaction platform may process the natural language content input by the user, where the servers may include a single server, and may also include a server cluster formed by multiple servers, which is not limited herein. After receiving the natural language content input by the user, the server can perform syntactic structure analysis on the natural language content to acquire dependency relationship characteristic data of the natural language content. The dependency characteristic data can be used for expressing the dependency relationship among the descriptors in the natural language content, and further expressing the core descriptors in the natural language content.
In one embodiment of the application, the modification relations among the descriptors in the natural language content can be obtained by syntactic structure analysis, and the dependency relationship characteristic data of the natural language content can be obtained from those modification relations. In particular, at least one descriptor may be extracted from the natural language content. In one example, the natural language content "I want to query the weather in Suzhou tomorrow" is word-segmented to obtain a plurality of descriptors. At least one descriptor may then be extracted from them, for example by removing redundant descriptors such as function words and punctuation marks. Thus, descriptors such as "I", "want", "query", "Suzhou", "tomorrow", and "weather" can be extracted from the natural language content "I want to query the weather in Suzhou tomorrow". Then, the modification relations between the descriptors may be determined: in this natural language content, "query" is the predicate of "I", "weather" is the object of "query", and so on. In one embodiment, the modification relations between descriptors may be obtained with a graph-based method, such as Eisner's algorithm. In another embodiment, the modification relations can be obtained with a transition-based method, such as the arc-eager algorithm, the arc-standard algorithm, the arc-hybrid algorithm, or the easy-first algorithm. Of course, in other embodiments, the modification relations may also be obtained by machine learning, for example with a convolutional neural network model. The application does not limit the way in which the modification relations between descriptors are obtained.
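As an illustration of the transition-based family mentioned above, the following Python sketch applies a sequence of arc-standard transitions to the example sentence. It only replays a given transition sequence; in a real parser a trained classifier would choose each transition. The function name `arc_standard`, the transition sequence, and the resulting head assignment (for instance, treating "want" as a dependent of "query") are assumptions made for illustration and are not taken from the application.

```python
def arc_standard(words, transitions):
    """Replay arc-standard transitions and return {dependent_index: head_index}.

    Index 0 is the virtual ROOT node; words are numbered from 1.
    """
    stack = [0]
    buffer = list(range(1, len(words) + 1))
    heads = {}
    for action in transitions:
        if action == "SHIFT":
            # Move the next word from the buffer onto the stack.
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":
            # The top of the stack becomes the head of the item below it.
            dependent = stack.pop(-2)
            heads[dependent] = stack[-1]
        elif action == "RIGHT-ARC":
            # The item below the top becomes the head of the top item.
            dependent = stack.pop()
            heads[dependent] = stack[-1]
        else:
            raise ValueError(f"unknown transition: {action}")
    if buffer or stack != [0]:
        raise ValueError("transition sequence does not yield a complete tree")
    return heads


words = ["I", "want", "query", "Suzhou", "tomorrow", "weather"]
# Assumed transition sequence (hypothetical oracle output).
transitions = [
    "SHIFT", "SHIFT", "SHIFT", "LEFT-ARC", "LEFT-ARC",
    "SHIFT", "SHIFT", "SHIFT", "LEFT-ARC", "LEFT-ARC",
    "RIGHT-ARC", "RIGHT-ARC",
]
heads = arc_standard(words, transitions)
core = [words[d - 1] for d, h in heads.items() if h == 0]
print(heads, core)
```

The word whose head is index 0 is the one attached to the virtual ROOT node, which corresponds to the syntactic core word discussed in the following paragraphs.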
In this embodiment, after the modification relations between the descriptors in the natural language content are obtained, a dependency tree of the descriptors in the natural language content may be constructed according to the modification relations. Based on the dependency tree, the syntactic core word of the natural language content may be determined and used as dependency relationship characteristic data of the natural language content. Specifically, the descriptor corresponding to the root node of the dependency tree may be used as the syntactic core word of the natural language content. For example, for the natural language content "Asian Development Bank president Zuoguangfu hosted this seminar.", the descriptors "Asia", "development", "bank", "president", "Zuoguangfu", "host", an aspect marker, "this", a measure word, "seminar", and the final period can be extracted. Analyzing the modification relations among these descriptors shows that "Zuoguangfu" is the subject of "host", "seminar" is the object of "host", "Asia", "development", "bank", and "president" are in compound-noun relations, and so on.
After the modification relations among the plurality of descriptors are determined, a dependency tree corresponding to the descriptors can be determined based on those relations. As shown in FIG. 3, the modification relations between the descriptors can be expressed by directed arcs. The part of speech of each descriptor is marked below it: NR is a proper noun, NN is a common noun, VV is a verb, AS is an aspect marker, DT is a determiner, M is a measure word, and PU is punctuation. In FIG. 3, a line indicates that two descriptors have a modification relation; the directed arc points to the modified descriptor, and the relation is marked on the line: ROOT is the root node, NMOD is a compound-noun modification relation, SBJ is a subject relation, VMOD is a verb modification relation, OBJ is an object relation, and M is a modification relation. In one embodiment, the dependency tree may be set up according to the following rules: each descriptor is regarded as a node, a virtual auxiliary node (the ROOT node) is inserted at the head of the sentence, all nodes are connected through directed arcs to form a tree, and the following conditions are satisfied:
every node except the ROOT node has exactly one incoming edge;
every node except the leaf nodes has at least one outgoing edge;
the ROOT node has exactly one outgoing edge, and the corresponding directed arc points to the syntactic core word, which governs the whole sentence;
directed arcs must not cross: if a directed arc exists between two nodes a and b, the horizontal projection of any directed arc between two nodes lying between a and b must fall within the projection of the directed arc between a and b.
From the descriptor that the directed arc of the root node points to, the syntactic core word of the natural language content "Asian Development Bank president Zuoguangfu hosted this seminar." can be determined to be "host". In the same way, the syntactic core word of "I want to query the weather in Suzhou tomorrow" can be determined to be "query".
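The tree conditions listed above can be checked mechanically. The following Python sketch is one possible check, assuming the tree is stored as a mapping from each word index to its head index, with 0 standing for the ROOT node; the function name `check_dependency_tree` and the head assignments for the example sentence are hypothetical. The condition that non-leaf nodes have outgoing edges holds by definition and is not checked separately.

```python
def check_dependency_tree(heads):
    """Return True if heads describes a single-rooted, non-crossing tree.

    heads[i] is the head index of word i (1-based); 0 denotes the ROOT node.
    """
    n = len(heads)
    # Every word must have exactly one incoming arc (one entry per word).
    if sorted(heads) != list(range(1, n + 1)):
        return False
    if any(not 0 <= h <= n or h == d for d, h in heads.items()):
        return False
    # The ROOT node must have exactly one outgoing arc.
    if sum(1 for h in heads.values() if h == 0) != 1:
        return False
    # Every word must reach ROOT, which rules out cycles.
    for d in heads:
        seen, node = set(), d
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node]
    # Directed arcs must not cross each other.
    arcs = [(min(d, h), max(d, h)) for d, h in heads.items()]
    return not any(a1 < a2 < b1 < b2 for a1, b1 in arcs for a2, b2 in arcs)


words = ["I", "want", "query", "Suzhou", "tomorrow", "weather"]
heads = {1: 3, 2: 3, 3: 0, 4: 6, 5: 6, 6: 3}   # assumed analysis of the example
crossing = {1: 3, 2: 4, 3: 0, 4: 3}            # arcs 1-3 and 2-4 cross
core_word = [words[d - 1] for d, h in heads.items() if h == 0][0]
print(check_dependency_tree(heads), check_dependency_tree(crossing), core_word)
```

The syntactic core word is read off as the single word whose head is the ROOT node.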
In this embodiment, the syntactic core word of the natural language content is used as dependency relationship characteristic data, and subsequent machine learning is performed on this data. In other words, the key information in the natural language content is learned, which reduces data redundancy and ensures that genuinely useful data is learned.
In this embodiment, the dependency characteristic data may be used to characterize the intent characteristics of natural language content. In one embodiment of the application, the dependency characteristic data may further include at least one of:
the part of speech of the syntactic core word, the dependency words of the syntactic core word, the parts of speech of the dependency words, dynamic entity descriptors, the parts of speech of the dynamic entity descriptors, the distance between a dynamic entity descriptor and the syntactic core word, and the synonym set of a dynamic entity descriptor.
The syntactic core word and its part of speech, the dependency words of the syntactic core word and their parts of speech, and the dynamic entity descriptors and their parts of speech all play an important role in expressing the intention characteristics of natural language content. For example, a syntactic core word that is a verb generally expresses the user's intention better than one that is a noun. The dependency words of the syntactic core word may include the descriptors that have a modification relation with it. For example, in the natural language content "Asian Development Bank president Zuoguangfu hosted this seminar." described above, the descriptors that have a modification relation with the syntactic core word "host" (i.e., its dependency words) include "Zuoguangfu", "seminar", and ".". The parts of speech of these dependency words are noun, noun, and punctuation, respectively. In this embodiment, the dynamic entity descriptors may include the entity words among the descriptors of the natural language content, for example nouns from various fields. For example, the dynamic entity descriptors in the natural language content "I want to query the weather in Suzhou tomorrow" may include "Suzhou", "tomorrow", and "weather". As another example, the dynamic entity descriptors in the natural language content "I want the latest quote for an Apple phone" may include "Apple", "phone", and "quote". In this embodiment, the dependency relationship characteristic data may further include the part of speech of each dynamic entity descriptor, the distance between the dynamic entity descriptor and the syntactic core word, and the like. Typically, the closer a descriptor is to the syntactic core word, the better it expresses the user's intention. Based on this, the user intention of the natural language content may be determined from the characteristic data.
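The feature items described above can be assembled from a parsed sentence roughly as in the following Python sketch. The function `extract_features`, the part-of-speech tags assigned to the example (including the pronoun tag "PN"), the choice of NR and NN as entity tags, and the use of token-position difference as the "distance" are all assumptions for illustration; the application does not specify how the distance is measured.

```python
def extract_features(words, pos_tags, heads, entity_tags=("NR", "NN")):
    """Collect dependency relationship characteristic data from a parsed sentence.

    heads maps each 1-based word index to its head index; 0 denotes ROOT.
    """
    core = next(d for d, h in heads.items() if h == 0)
    dependents = sorted(d for d, h in heads.items() if h == core)
    entities = [i for i in range(1, len(words) + 1) if pos_tags[i - 1] in entity_tags]
    return {
        "core_word": words[core - 1],
        "core_pos": pos_tags[core - 1],
        "dependency_words": [(words[d - 1], pos_tags[d - 1]) for d in dependents],
        "entity_words": [
            # Assumption: distance is the difference in token positions.
            {"word": words[i - 1], "pos": pos_tags[i - 1], "distance": abs(i - core)}
            for i in entities
        ],
    }


words = ["I", "want", "query", "Suzhou", "tomorrow", "weather"]
pos_tags = ["PN", "VV", "VV", "NR", "NN", "NN"]   # assumed tags
heads = {1: 3, 2: 3, 3: 0, 4: 6, 5: 6, 6: 3}      # assumed dependency analysis
features = extract_features(words, pos_tags, heads)
print(features)
```

A synonym set for each dynamic entity descriptor could be added as a further field of the same record.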
In addition, in the present embodiment, the dependency relationship characteristic data may include a set of synonyms for the dynamic entity descriptor. In practical applications, many things have more than one name; for example, "sun umbrella" and "beach umbrella" are two expressions of the same thing, and the same holds for alternative names of a shirt, a waistcoat, or a scarf. Thus, the set of synonyms for the dynamic entity descriptor may also be used as dependency relationship characteristic data.
In this embodiment, after the natural language content input by the user is acquired, static wildcard rule matching may also be performed on the natural language content first. The static wildcard rules may include a plurality of preset wildcard patterns, such as "I want * articles" and "I want to see * movies", where the symbol "*" is a wildcard. When static rule matching is performed, the dynamic intention descriptors in the natural language content can be extracted. A dynamic intention descriptor may be a descriptor in the natural language content that is a verb and can express the user's intention, for example verbs with obvious intention characteristics such as "want" and "try". In this embodiment, the description modes that have the same meaning as the dynamic intention descriptor may be acquired, and the dynamic intention descriptor may be replaced by them. For example, the dynamic intention descriptor "want" has various description modes with the same meaning, such as "try" and "desire". To generalize the dynamic intention descriptor, it may be replaced by these description modes with the same meaning, so that static wildcard rule matching also covers them. For example, suppose the user says "I want to see XX movie" using a wording of "want" that differs from the one in the preset rule. If the dynamic intention descriptor in the natural language content is not expanded with its synonyms, the static wildcard rules cannot match the appropriate user intention. With the expansion, the dynamic intention descriptor "want" covers its synonymous description modes, and the content can be matched to the preset general intention "I want to watch * movie".
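The combination of synonym expansion and static wildcard rules can be sketched in Python with regular expressions. As a simplification, this sketch expands the intention verb on the rule side rather than rewriting the user's input; the synonym list, the rule table, and the function names `rule_to_regex` and `match_intent` are hypothetical.

```python
import re

# Hypothetical synonym set for the dynamic intention descriptor "want".
SYNONYMS = {"want": ["want", "would like", "wish", "hope"]}
# Hypothetical static wildcard rule; "*" is the wildcard.
RULES = {"watch_movie": "I want to see * movie"}


def rule_to_regex(rule, synonyms):
    """Compile a wildcard rule, expanding intention verbs into their synonyms."""
    parts = []
    for token in rule.split():
        if token == "*":
            parts.append(r"(.+?)")                      # wildcard slot
        elif token in synonyms:
            alternatives = "|".join(re.escape(s) for s in synonyms[token])
            parts.append(f"(?:{alternatives})")         # any synonymous wording
        else:
            parts.append(re.escape(token))
    return re.compile(r"\s+".join(parts), re.IGNORECASE)


def match_intent(text, rules, synonyms):
    """Return (intent, wildcard_value) for the first matching rule, else None."""
    for intent, rule in rules.items():
        found = rule_to_regex(rule, synonyms).fullmatch(text.strip())
        if found:
            return intent, found.group(1)
    return None


expanded = match_intent("I would like to see XX movie", RULES, SYNONYMS)
unexpanded = match_intent("I would like to see XX movie", RULES, {})
print(expanded, unexpanded)
```

Without the synonym set the rule only matches the literal wording "I want to see", which reproduces the strong-matching limitation that the expansion is meant to relax.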
In this embodiment, after the dependency feature data of the natural language content is obtained, the dependency feature data may be processed by a machine learning model component to obtain the user intent corresponding to the dependency feature data. The machine learning model component is trained on the correspondence between a plurality of historical dependency feature data and historical user intents.
In this embodiment, in constructing the machine learning model component by machine learning, a plurality of historical natural language contents and the historical user intents respectively corresponding to them may be obtained. After the historical natural language content is obtained, the dependency feature data in it can be extracted in the same manner as in the above embodiments, which is not repeated here. After the dependency feature data is extracted, a machine learning model component provided with training parameters can be constructed. The machine learning model component is then trained on the correspondence between the dependency feature data and the historical user intents, using the dependency feature data of the historical natural language content as input data and the historical user intents as output data, and the training parameters are adjusted until the machine learning model component meets a preset requirement. In this embodiment, the machine learning manner may include a K-nearest neighbor algorithm, a perceptron algorithm, a decision tree, a support vector machine, logistic regression, maximum entropy, and the like, as well as generative models such as naive Bayes and hidden Markov models. Of course, in other embodiments, the machine learning model component may include a deep learning model component, such as a convolutional neural network model component or a recurrent neural network model component. The application is not limited in this regard.
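The training loop described above can be sketched with a small multiclass perceptron, one of the algorithms the embodiment lists. The records in `HISTORY`, the feature names, the intent names, and the stopping rule (training accuracy reaching `target_accuracy`) are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical historical records: (dependency feature data, user intent).
HISTORY = [
    ({"core": "see", "core_pos": "VERB", "dependent": "movie"}, "watch_movie"),
    ({"core": "read", "core_pos": "VERB", "dependent": "article"}, "search_article"),
    ({"core": "check", "core_pos": "VERB", "dependent": "weather"}, "view_weather"),
]

def featurize(record):
    """Flatten one feature record into sparse indicator features."""
    return [f"{name}={value}" for name, value in sorted(record.items())]

class PerceptronIntentModel:
    def __init__(self):
        # weights[intent][feature] are the adjustable training parameters.
        self.weights = defaultdict(dict)
        self.labels = []

    def score(self, feats, label):
        w = self.weights[label]
        return sum(w.get(f, 0.0) for f in feats)

    def predict(self, record):
        feats = featurize(record)
        return max(self.labels, key=lambda label: self.score(feats, label))

    def train(self, history, target_accuracy=1.0, max_epochs=50):
        self.labels = sorted({intent for _, intent in history})
        for _ in range(max_epochs):
            errors = 0
            for record, intent in history:
                guess = self.predict(record)
                if guess != intent:
                    errors += 1
                    # Move weights toward the true intent, away from the guess.
                    for f in featurize(record):
                        self.weights[intent][f] = self.weights[intent].get(f, 0.0) + 1.0
                        self.weights[guess][f] = self.weights[guess].get(f, 0.0) - 1.0
            # Stop once the preset requirement (training accuracy) is met.
            if 1.0 - errors / len(history) >= target_accuracy:
                break
        return self

model = PerceptronIntentModel().train(HISTORY)
```

Any of the other listed algorithms could take the perceptron's place; only the pairing of historical feature data with historical intents is essential to the scheme.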
When the machine learning model component is trained with the dependency feature data, the number of historical natural language contents is large, so the amount of dependency feature data extracted from them is also large. As described above, the dependency feature data can include dynamic entity descriptors, which are important for identifying the user intent. In a typical entity extraction approach, after entity information is extracted, the type of the entity usually needs to be labeled; for example, the entity "dress" is extracted and its category label is set to "clothing". In the embodiment of the present application, after the dynamic entity descriptor in the natural language content is extracted, its category label may be set to a unified preset label, such as "KEYWORD" or "TAB", so that no specific type needs to be set. The reason is that the same entity has different types in different fields (for example, the entity descriptor "apple" may have the entity category "company name" or "fruit name" depending on the field), and setting an entity-specific category label brings redundant information to subsequent intent recognition, which can cause intent recognition errors.
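A minimal sketch of the unified labeling step is given below, assuming the dynamic entity descriptors have already been identified by an upstream extractor. The function name `label_entities` and the token representation are hypothetical.

```python
UNIFIED_LABEL = "KEYWORD"  # one preset label shared by every entity descriptor

def label_entities(tokens, entity_words):
    """Tag each dynamic entity descriptor with the unified preset label.

    Non-entity tokens get None; no domain-specific category such as
    "company name" or "fruit name" is ever assigned.
    """
    return [(tok, UNIFIED_LABEL if tok in entity_words else None) for tok in tokens]
```

For instance, "apple" receives the same label whether the surrounding text concerns fruit or a company, so the label carries no field-specific information into intent recognition.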
In one embodiment of the application, the historical user intent may include at least one preset type. For example, for a personal assistant, the historical user intent may include the following categories: setting a wake-up alarm, viewing mail, viewing weather, and so on. In training the machine learning model component, the dependency feature data can be used as the input of the machine learning model component and the intents of these specific preset types as its output, and the machine learning model component is trained continuously until it meets a preset requirement. In addition, a wildcard may be set in the historical user intent, such as "I want * articles" and "I want to see * movies" in the above example. Setting a wildcard in the historical user intent bases the user intent on a unified expression, and the wildcard can be replaced with a plurality of entity information to form a plurality of pieces of information corresponding to the same expression intent. For example, for the expression intent "I want * articles", the wildcard "*" can be replaced with various entity descriptors such as "sports", "emotion", "health", and "finance" to construct various user intents that all belong to the need to search for articles.
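The wildcard substitution described above can be sketched in a few lines of Python. The function name `expand_template` is hypothetical; the template and entity descriptors are the ones used in the example above.

```python
def expand_template(template, entities):
    """Replace the first "*" wildcard with each entity descriptor in turn."""
    return [template.replace("*", entity, 1) for entity in entities]

expanded = expand_template(
    "I want * articles", ["sports", "emotion", "health", "finance"]
)
```

Each resulting sentence keeps the same intent template, so all of them can be paired with the single intent of searching for articles.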
The natural language processing method provided by the application performs syntactic structure processing on the natural language input by the user and obtains the dependency feature data in the natural language. A machine learning model can then be used to obtain the user intent corresponding to the dependency feature data. Compared with the prior-art approach of matching intent information in natural language content with static rules, the technical solution of the application makes flexible use of the dependency feature data in the natural language content, and this data expresses the user's intent information more accurately. The natural language processing approach provided by the embodiments of the application therefore both relaxes the strong-matching intent recognition of the prior art and improves the accuracy of user intent recognition.
In another aspect, the present application further provides a natural language processing device. FIG. 4 is a schematic block diagram of an embodiment of the natural language processing device provided by the present application. As shown in FIG. 4, the natural language processing device may include a processor and a memory for storing instructions executable by the processor, where the processor, when executing the instructions, implements:
Acquiring natural language content input by a user;
carrying out syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationship among descriptive words in the natural language content;
acquiring user intention corresponding to the dependency characteristic data by using a machine learning model component; the machine learning model component is trained according to the corresponding relation between the plurality of historical dependency relation characteristic data and the historical user intention.
Optionally, in an embodiment of the present application, when the processor implements the step of performing syntactic structure analysis on the natural language content to obtain the dependency relationship feature data of the natural language content, the step may include:
extracting at least one descriptor from the natural language content;
determining a modification relation between the at least one descriptor;
and determining a syntactic core word in the natural language content according to the modification relation, and taking the syntactic core word as dependency relation characteristic data of the natural language content.
Optionally, in an embodiment of the present application, when the processor implements the step of determining, according to the modification relation, a syntactic core word in the natural language content, the step may include:
Constructing a dependency relationship tree of the at least one descriptor according to the modification relationship;
and taking the descriptive word corresponding to the root node of the dependency relationship tree as a syntactic core word of the natural language content.
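As a sketch of the two steps above, the following Python code builds a dependency tree from modification relations and takes the root as the syntactic core word. The descriptors and relations are hand-written, hypothetical parser output for "I want to see action movies"; the code assumes each descriptor appears only once in the sentence.

```python
# Hypothetical descriptors and (modifier, head) modification relations.
DESCRIPTORS = ["I", "want", "see", "action", "movies"]
RELATIONS = [
    ("I", "want"),
    ("see", "want"),
    ("movies", "see"),
    ("action", "movies"),
]

def build_dependency_tree(descriptors, relations):
    """Return (root, children) where children maps each head to its modifiers."""
    children = {d: [] for d in descriptors}
    has_head = set()
    for modifier, head in relations:
        children[head].append(modifier)
        has_head.add(modifier)
    # The root is the only descriptor that modifies nothing else.
    roots = [d for d in descriptors if d not in has_head]
    if len(roots) != 1:
        raise ValueError("expected exactly one root descriptor")
    return roots[0], children

core_word, tree = build_dependency_tree(DESCRIPTORS, RELATIONS)
```

Here `core_word` is the descriptor at the root node, which the embodiment uses as the syntactic core word.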
Optionally, in an embodiment of the present application, the dependency characteristic data may further include at least one of:
the part of speech of the syntactic core word, a dependency word of the syntactic core word, the part of speech of the dependency word, a dynamic entity descriptor, the part of speech of the dynamic entity descriptor, the distance between the dynamic entity descriptor and the syntactic core word, and the synonym set of the dynamic entity descriptor.
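The listed items could be gathered into one feature record as in the following Python sketch. The input format (per-token part-of-speech tags and head indices with -1 marking the root) and the use of token-position difference as the distance are assumptions for illustration; this passage does not fix how the distance is measured.

```python
def dependency_features(tokens, pos_tags, heads, entity_index, synonyms):
    """Collect dependency feature data for one parsed sentence.

    tokens: descriptors; pos_tags: part of speech per token;
    heads: index of each token's head, with -1 marking the root;
    entity_index: position of the dynamic entity descriptor.
    """
    core_index = heads.index(-1)  # root of the dependency tree
    dependents = [i for i, h in enumerate(heads) if h == core_index]
    entity = tokens[entity_index]
    return {
        "core": tokens[core_index],
        "core_pos": pos_tags[core_index],
        "dependents": [tokens[i] for i in dependents],
        "dependent_pos": [pos_tags[i] for i in dependents],
        "entity": entity,
        "entity_pos": pos_tags[entity_index],
        # Assumed distance measure: difference in token positions.
        "entity_core_distance": abs(entity_index - core_index),
        "entity_synonyms": sorted(synonyms.get(entity, [])),
    }

features = dependency_features(
    tokens=["I", "want", "beach umbrella"],
    pos_tags=["PRON", "VERB", "NOUN"],
    heads=[1, -1, 1],
    entity_index=2,
    synonyms={"beach umbrella": ["sun umbrella"]},
)
```

A record of this shape is the kind of input that a machine learning model component could consume after being flattened into indicator features.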
Optionally, in an embodiment of the present application, after implementing the step of acquiring the natural language content input by the user, the processor may further implement:
extracting dynamic intention descriptors in the natural language content;
acquiring a description mode which has the same meaning as the dynamic intention descriptor;
and matching the natural language content by using static wildcard rules, including matching the description modes which have the same meaning as the dynamic intention descriptor.
Optionally, in an embodiment of the present application, the machine learning model component may be trained in the following manner:
Acquiring a plurality of historical natural language contents and historical user intentions respectively corresponding to the plurality of historical natural language contents;
extracting dependency relationship characteristic data of the plurality of historical natural language contents respectively;
constructing a machine learning model component, wherein training parameters are arranged in the machine learning model component;
and training the machine learning model component by using the dependency relationship characteristic data of the historical natural language content as input data of the machine learning model component and the historical user intention as output data and utilizing the corresponding relationship between the dependency relationship characteristic data and the historical user intention, and adjusting the training parameters until the machine learning model component reaches a preset requirement.
Optionally, in an embodiment of the present application, the historical user intentions respectively corresponding to the plurality of historical natural language contents may include at least one preset type, and wildcards are set in the historical user intentions.
Optionally, in an embodiment of the present application, after implementing the step of acquiring the plurality of historical natural language contents and the historical user intentions respectively corresponding to the plurality of historical natural language contents, the processor may further implement:
Extracting dynamic entity description words in the plurality of historical natural language contents;
setting the category labels of the dynamic entity descriptors as unified preset labels.
Optionally, in an embodiment of the present application, the natural language content includes text content input by a user, and/or text content converted according to voice content input by the user.
Another aspect of the present application provides another embodiment of a natural language processing apparatus, where the apparatus includes a processor and a memory for storing instructions executable by the processor, and the processor, when executing the instructions, implements:
acquiring natural language content input by a user;
extracting dynamic intention descriptors in the natural language content;
performing synonym expansion on the dynamic intention descriptor by using the descriptor with the same meaning as the dynamic intention descriptor;
and carrying out user intention matching on the natural language content by utilizing static wildcard rules.
Another aspect of the application also provides a computer readable storage medium having stored thereon computer instructions that when executed perform the steps of:
acquiring natural language content input by a user;
Carrying out syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationship among descriptive words in the natural language content;
acquiring user intention corresponding to the dependency characteristic data by using a machine learning model component; the machine learning model component is trained according to the corresponding relation between the plurality of historical dependency relation characteristic data and the historical user intention.
Another aspect of the application also provides a computer readable storage medium having stored thereon computer instructions that when executed perform the steps of:
acquiring natural language content input by a user;
extracting dynamic intention descriptors in the natural language content;
performing synonym expansion on the dynamic intention descriptor by using the descriptor with the same meaning as the dynamic intention descriptor;
and carrying out user intention matching on the natural language content by utilizing static wildcard rules.
The computer readable storage medium may include a physical device for storing information, which typically digitizes the information and then stores it in a medium by electrical, magnetic, or optical means. The computer readable storage medium according to the present embodiment may include: devices that store information using electrical energy, such as various memories, e.g., RAM, ROM, and USB flash disks; devices that store information using magnetic energy, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, and bubble memories; and devices that store information optically, such as CDs or DVDs. Of course, there are other kinds of readable storage media, such as quantum memory and graphene memory.
Although the application provides method operation steps as described in the embodiments or flowcharts, more or fewer operation steps may be included based on conventional or non-inventive means. The order of steps recited in the embodiments is merely one of many possible execution orders and does not represent the only one. When implemented by an actual device or client product, the steps may be executed sequentially or in parallel (e.g., in a parallel-processor or multi-threaded processing environment) according to the methods shown in the embodiments or figures.
Those skilled in the art will also appreciate that, in addition to implementing a controller purely as computer readable program code, it is entirely possible to logically program the method steps so that the controller achieves the same functionality in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller can therefore be regarded as a hardware component, and the means included in it for implementing various functions can also be regarded as structures within the hardware component. The means for implementing the various functions may even be regarded both as software modules implementing the method and as structures within the hardware component.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, classes, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a mobile terminal, a server, a network device, or the like) to execute the method described in the embodiments, or in some parts of the embodiments, of the present application.
The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. The application is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable electronic devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
Although the present application has been described by way of examples, one of ordinary skill in the art appreciates that there are many variations and modifications that do not depart from the spirit of the application, and it is intended that the appended claims encompass such variations and modifications as fall within the spirit of the application.

Claims (22)

1. A method of natural language processing, the method comprising:
acquiring natural language content input by a user;
carrying out syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationship among descriptors in the natural language content; the descriptors comprise a syntactic core word, a dependency word corresponding to the syntactic core word, and a dynamic entity descriptor; the dynamic entity descriptor is an entity word among the descriptors; the dependency relationship characteristic data comprises the syntactic core word and the distance between the dynamic entity descriptor and the syntactic core word; the dependency word of the syntactic core word comprises a descriptor having a modification relation with the syntactic core word;
Acquiring user intention corresponding to the dependency characteristic data by using a machine learning model component; the machine learning model component is trained according to the corresponding relation between the plurality of historical dependency relation characteristic data and the historical user intention.
2. The method of claim 1, wherein the parsing the natural language content to obtain dependency characteristic data of the natural language content comprises:
extracting at least one descriptor from the natural language content;
determining a modification relation between the at least one descriptor;
and determining a syntactic core word in the natural language content according to the modification relation, and taking the syntactic core word as dependency relation characteristic data of the natural language content.
3. The method of claim 2, wherein said determining syntactic core words in the natural language content from the modifier relation comprises:
constructing a dependency relationship tree of the at least one descriptor according to the modification relationship;
and taking the descriptive word corresponding to the root node of the dependency relationship tree as a syntactic core word of the natural language content.
4. The method of claim 2, wherein the dependency characteristic data further comprises at least one of:
the part of speech of the syntactic core word, the dependency of the syntactic core word, the part of speech of the dependency, the dynamic entity descriptor, the part of speech of the dynamic entity descriptor, and the synonym set of the dynamic entity descriptor.
5. The method of claim 1, wherein after the obtaining the user-entered natural language content, the method further comprises:
extracting dynamic intention descriptors in the natural language content;
performing synonym expansion on the dynamic intention descriptor by using the descriptor with the same meaning as the dynamic intention descriptor;
and carrying out user intention matching on the natural language content by utilizing static wildcard rules.
6. The method of claim 1, wherein the machine learning model component is trained in a manner comprising:
acquiring a plurality of historical natural language contents and historical user intentions respectively corresponding to the plurality of historical natural language contents;
extracting dependency relationship characteristic data of the plurality of historical natural language contents respectively;
Constructing a machine learning model component, wherein training parameters are arranged in the machine learning model component;
and training the machine learning model component by using the dependency relationship characteristic data of the historical natural language content as input data of the machine learning model component and the historical user intention as output data and utilizing the corresponding relationship between the dependency relationship characteristic data and the historical user intention, and adjusting the training parameters until the machine learning model component reaches a preset requirement.
7. The method of claim 6, wherein the historical user intents respectively corresponding to the plurality of historical natural language contents include at least one preset type, and wherein wildcards are provided in the historical user intents.
8. The method of claim 6, wherein after the obtaining a plurality of historical natural language content and the historical user intent to which the plurality of historical natural language content respectively correspond, the method further comprises:
extracting dynamic entity description words in the plurality of historical natural language contents;
setting the category labels of the dynamic entity descriptors as unified preset labels.
9. The method of claim 1, wherein the natural language content comprises user-entered text content and/or text content converted from user-entered speech content.
10. A method of natural language processing, the method comprising:
acquiring natural language content input by a user;
extracting dynamic intention descriptors in the natural language content;
performing synonym expansion on the dynamic intention descriptor by using the descriptor with the same meaning as the dynamic intention descriptor;
carrying out user intention matching on the natural language content by utilizing static wildcard rules, including: carrying out syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationship among descriptors in the natural language content; the descriptors comprise a syntactic core word, a dependency word corresponding to the syntactic core word, and a dynamic entity descriptor; the dynamic entity descriptor is an entity word among the descriptors; the dependency relationship characteristic data comprises the syntactic core word and the distance between the dynamic entity descriptor and the syntactic core word; the dependency word of the syntactic core word comprises a descriptor having a modification relation with the syntactic core word; acquiring user intention corresponding to the dependency relationship characteristic data by using a machine learning model component; the machine learning model component is trained according to the correspondence between a plurality of historical dependency relationship characteristic data and historical user intentions.
11. A natural language processing apparatus comprising a processor and a memory for storing processor-executable instructions, the processor, when executing the instructions, implementing:
acquiring natural language content input by a user;
carrying out syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationship among descriptors in the natural language content; the descriptors comprise a syntactic core word, a dependency word corresponding to the syntactic core word, and a dynamic entity descriptor; the dynamic entity descriptor is an entity word among the descriptors; the dependency relationship characteristic data comprises the syntactic core word and the distance between the dynamic entity descriptor and the syntactic core word; the dependency word of the syntactic core word comprises a descriptor having a modification relation with the syntactic core word;
acquiring user intention corresponding to the dependency characteristic data by using a machine learning model component; the machine learning model component is trained according to the corresponding relation between the plurality of historical dependency relation characteristic data and the historical user intention.
12. The apparatus of claim 11, wherein the processor, when performing the step of parsing the natural language content to obtain dependency characteristic data for the natural language content, comprises:
extracting at least one descriptor from the natural language content;
determining a modification relation between the at least one descriptor;
and determining a syntactic core word in the natural language content according to the modification relation, and taking the syntactic core word as dependency relation characteristic data of the natural language content.
13. The apparatus of claim 12, wherein the processor, when implementing the step of determining syntactic core words in the natural language content according to the modifier relation, comprises:
constructing a dependency relationship tree of the at least one descriptor according to the modification relationship;
and taking the descriptive word corresponding to the root node of the dependency relationship tree as a syntactic core word of the natural language content.
14. The apparatus of claim 12, wherein the dependency characteristic data further comprises at least one of:
the part of speech of the syntactic core word, a dependency word of the syntactic core word, the part of speech of the dependency word, a dynamic entity descriptor, the part of speech of the dynamic entity descriptor, the distance between the dynamic entity descriptor and the syntactic core word, and the synonym set of the dynamic entity descriptor.
15. The apparatus of claim 11, wherein the processor, after implementing the step of acquiring the natural language content input by the user, further implements:
extracting dynamic intention descriptors in the natural language content;
acquiring a description mode which has the same meaning as the dynamic intention descriptor;
and matching the natural language content by using static wildcard rules, including matching the description modes which have the same meaning as the dynamic intention descriptor.
16. The apparatus of claim 11, wherein the machine learning model component is trained in a manner comprising:
acquiring a plurality of historical natural language contents and historical user intentions respectively corresponding to the plurality of historical natural language contents;
extracting dependency relationship characteristic data of the plurality of historical natural language contents respectively;
constructing a machine learning model component, wherein training parameters are arranged in the machine learning model component;
and training the machine learning model component by using the dependency relationship characteristic data of the historical natural language content as input data of the machine learning model component and the historical user intention as output data and utilizing the corresponding relationship between the dependency relationship characteristic data and the historical user intention, and adjusting the training parameters until the machine learning model component reaches a preset requirement.
17. The apparatus of claim 16, wherein the historical user intents respectively corresponding to the plurality of historical natural language contents include at least one preset type, and wildcards are provided in the historical user intents.
18. The apparatus of claim 16, wherein the processor, after implementing the step of acquiring a plurality of historical natural language contents and the historical user intents respectively corresponding to the plurality of historical natural language contents, further implements:
extracting dynamic entity description words in the plurality of historical natural language contents;
setting the category labels of the dynamic entity descriptors as unified preset labels.
19. The apparatus of claim 11, wherein the natural language content comprises user-entered text content and/or text content converted from user-entered speech content.
20. A natural language processing apparatus comprising a processor and a memory for storing processor-executable instructions, the processor, when executing the instructions, implementing:
acquiring natural language content input by a user;
extracting dynamic intention descriptors in the natural language content;
Performing synonym expansion on the dynamic intention descriptor by using the descriptor with the same meaning as the dynamic intention descriptor;
carrying out user intention matching on the natural language content by utilizing static wildcard rules, including: carrying out syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationship among descriptors in the natural language content; the descriptors comprise a syntactic core word, a dependency word corresponding to the syntactic core word, and a dynamic entity descriptor; the dynamic entity descriptor is an entity word among the descriptors; the dependency relationship characteristic data comprises the syntactic core word and the distance between the dynamic entity descriptor and the syntactic core word; the dependency word of the syntactic core word comprises a descriptor having a modification relation with the syntactic core word; acquiring user intention corresponding to the dependency relationship characteristic data by using a machine learning model component; the machine learning model component is trained according to the correspondence between a plurality of historical dependency relationship characteristic data and historical user intentions.
21. A computer readable storage medium having stored thereon computer instructions, the instructions when executed performing the steps of:
acquiring natural language content input by a user;
carrying out syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used for representing the dependency relationship among descriptors in the natural language content; the descriptors comprise a syntactic core word, a dependency word corresponding to the syntactic core word, and a dynamic entity descriptor; the dynamic entity descriptor is an entity word among the descriptors; the dependency relationship characteristic data comprises the syntactic core word and the distance between the dynamic entity descriptor and the syntactic core word; the dependency word of the syntactic core word comprises a descriptor having a modification relation with the syntactic core word;
acquiring user intention corresponding to the dependency characteristic data by using a machine learning model component; the machine learning model component is trained according to the corresponding relation between the plurality of historical dependency relation characteristic data and the historical user intention.
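For illustration only, the steps of claim 21 can be sketched in Python. This is a minimal sketch under stated assumptions, not the patented implementation: the `Token` structure, the hand-written parses, the feature strings, and the count-based `IntentModel` are all hypothetical stand-ins. A real system would obtain the parse from a dependency parser and would use a trained statistical model.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    head: int                # index of the governing token; -1 marks the syntactic core word
    is_entity: bool = False  # True for a dynamic entity descriptor (assumed to be pre-tagged)


def dependency_features(tokens):
    """Collect the core word, its direct dependents, and each entity-to-core distance."""
    core_idx = next(i for i, t in enumerate(tokens) if t.head == -1)
    feats = [f"core={tokens[core_idx].text}"]
    for i, t in enumerate(tokens):
        if t.head == core_idx:
            feats.append(f"dep={t.text}")
        if t.is_entity:
            # Distance is measured in token positions here; this is an assumption.
            feats.append(f"entity_dist={abs(i - core_idx)}")
    return feats


class IntentModel:
    """Toy stand-in for the machine learning model component (feature counts per intent)."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def fit(self, samples):
        for feats, intent in samples:
            self.counts[intent].update(feats)
        return self

    def predict(self, feats):
        # Pick the intent whose historical samples share the most features with the query.
        return max(self.counts, key=lambda intent: sum(self.counts[intent][f] for f in feats))


# Hypothetical historical data: hand-written parses paired with intents.
play = [Token("play", -1), Token("the", 2), Token("song", 0), Token("Hello", 2, True)]
buy = [Token("buy", -1), Token("a", 2), Token("ticket", 0), Token("to", 4), Token("Beijing", 2, True)]
model = IntentModel().fit([
    (dependency_features(play), "play_music"),
    (dependency_features(buy), "buy_ticket"),
])

query = [Token("play", -1), Token("song", 0), Token("Yesterday", 1, True)]
print(model.predict(dependency_features(query)))  # prints "play_music"
```

The features are plain strings so that any off-the-shelf classifier could replace the toy model without changing the extraction step.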
22. A computer readable storage medium having stored thereon computer instructions, the instructions when executed performing the steps of:
acquiring natural language content input by a user;
extracting dynamic intention descriptors from the natural language content;
expanding the ways of describing each dynamic intention descriptor by using descriptors that have the same meaning as the dynamic intention descriptor;
performing user intention matching on the natural language content by using a static wildcard rule; wherein the following steps are included: performing syntactic structure analysis on the natural language content to obtain dependency relationship characteristic data of the natural language content, wherein the dependency relationship characteristic data is used to represent the dependency relationships among descriptors in the natural language content; the descriptors comprise a syntactic core word, a dependency word corresponding to the syntactic core word, and a dynamic entity descriptor; the dynamic entity descriptor is an entity word among the descriptors; the dependency relationship characteristic data comprises the syntactic core word and the distance between the dynamic entity descriptor and the syntactic core word; the dependency words of the syntactic core word comprise descriptors that have a modification relationship with the syntactic core word; obtaining the user intention corresponding to the dependency relationship characteristic data by using a machine learning model component, wherein the machine learning model component is trained on the correspondence between a plurality of historical dependency relationship characteristic data and historical user intentions.
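As a rough illustration of the first steps of claim 22, the following Python sketch expands a dynamic intention descriptor with same-meaning words and then applies wildcard rules. Several points are assumptions rather than statements of the claimed method: the "static wild rule" is read here as a static wildcard rule, the synonym table and rule list are hypothetical, and the claim does not spell out how expansion and rule matching interact, so the sketch simply maps each synonym back to its canonical descriptor before matching.

```python
from fnmatch import fnmatchcase

# Hypothetical synonym table: canonical dynamic intention descriptor -> same-meaning words.
SYNONYMS = {"buy": ["purchase", "order"]}

# Hypothetical static wildcard rules: (pattern, intent label).
STATIC_RULES = [
    ("*buy*ticket*", "buy_ticket"),
    ("*weather*", "query_weather"),
]


def expand_descriptor(descriptor):
    """Return the descriptor together with its same-meaning descriptors."""
    return [descriptor] + SYNONYMS.get(descriptor, [])


def normalize(text):
    """Lowercase the text and map every expanded form back to its canonical descriptor."""
    canonical = {alt: key for key in SYNONYMS for alt in expand_descriptor(key)}
    return " ".join(canonical.get(word, word) for word in text.lower().split())


def match_static_rule(text):
    """Return the intent of the first wildcard rule that matches, or None."""
    normalized = normalize(text)
    for pattern, intent in STATIC_RULES:
        if fnmatchcase(normalized, pattern):
            return intent
    return None


print(match_static_rule("Purchase a ticket to Beijing"))  # prints "buy_ticket"
```

Normalizing to a canonical descriptor keeps the rule list short: one pattern covers every synonym instead of requiring one pattern per wording.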
CN201810164982.6A 2018-02-28 2018-02-28 Natural language processing method and device Active CN110309252B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810164982.6A CN110309252B (en) 2018-02-28 2018-02-28 Natural language processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810164982.6A CN110309252B (en) 2018-02-28 2018-02-28 Natural language processing method and device

Publications (2)

Publication Number Publication Date
CN110309252A CN110309252A (en) 2019-10-08
CN110309252B CN110309252B (en) 2023-11-24

Family

ID=68073648

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810164982.6A Active CN110309252B (en) 2018-02-28 2018-02-28 Natural language processing method and device

Country Status (1)

Country Link
CN (1) CN110309252B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111062211A (en) * 2019-12-27 2020-04-24 中国联合网络通信集团有限公司 Information extraction method and device, electronic equipment and storage medium
CN111310059B (en) * 2020-04-01 2023-11-21 东软睿驰汽车技术(沈阳)有限公司 User intention positioning method and device based on aggregated resources
CN116628229B (en) * 2023-07-21 2023-11-10 支付宝(杭州)信息技术有限公司 Method and device for generating text corpus by using knowledge graph

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102866989A (en) * 2012-08-30 2013-01-09 北京航空航天大学 Viewpoint extracting method based on word dependence relationship
CN104424216A (en) * 2013-08-23 2015-03-18 佳能株式会社 Method and device for intention digging
KR20150111678A (en) * 2014-03-26 2015-10-06 포항공과대학교 산학협력단 Device for analyzing natural language incrementally, adaptive answering machine and method using the device
CN105335348A (en) * 2014-08-07 2016-02-17 阿里巴巴集团控股有限公司 Object statement based dependency syntax analysis method and apparatus and server
CN106326386A (en) * 2016-08-16 2017-01-11 百度在线网络技术(北京)有限公司 Search result displaying method and device
CN106339366A (en) * 2016-08-08 2017-01-18 北京百度网讯科技有限公司 Method and device for requirement identification based on artificial intelligence (AI)
CN106528531A (en) * 2016-10-31 2017-03-22 北京百度网讯科技有限公司 Artificial intelligence-based intention analysis method and apparatus

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10692015B2 (en) * 2016-07-15 2020-06-23 Io-Tahoe Llc Primary key-foreign key relationship determination through machine learning

Also Published As

Publication number Publication date
CN110309252A (en) 2019-10-08

Similar Documents

Publication Publication Date Title
CN109815333B (en) Information acquisition method and device, computer equipment and storage medium
Hussain et al. An approach to detect abusive bangla text
US9262406B1 (en) Semantic frame identification with distributed word representations
KR102491172B1 (en) Natural language question-answering system and learning method
CN112417102A (en) Voice query method, device, server and readable storage medium
Mehmood et al. A precisely xtreme-multi channel hybrid approach for roman urdu sentiment analysis
Gokul et al. Sentence similarity detection in Malayalam language using cosine similarity
CN112632226B (en) Semantic search method and device based on legal knowledge graph and electronic equipment
CN111460090A (en) Vector-based document retrieval method and device, computer equipment and storage medium
EP3598436A1 (en) Structuring and grouping of voice queries
CN111046656A (en) Text processing method and device, electronic equipment and readable storage medium
CN112115232A (en) Data error correction method and device and server
CN110309252B (en) Natural language processing method and device
US11983502B2 (en) Extracting fine-grained topics from text content
KR101545050B1 (en) Method for automatically classifying answer type and apparatus, question-answering system for using the same
Gharbieh et al. Deep learning models for multiword expression identification
CN115730597A (en) Multi-level semantic intention recognition method and related equipment thereof
Yang et al. Improving word representations with document labels
Jang et al. A novel density-based clustering method using word embedding features for dialogue intention recognition
CN113221553A (en) Text processing method, device and equipment and readable storage medium
CN112906368B (en) Industry text increment method, related device and computer program product
Sun et al. Fine-grained emotion analysis based on mixed model for product review
Tapsai et al. TLS-ART: Thai language segmentation by automatic ranking trie
CN117290478A (en) Knowledge graph question-answering method, device, equipment and storage medium
CN112528653A (en) Short text entity identification method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant