CN107918778B - Information matching method and related device - Google Patents

Publication number
CN107918778B
Authority
CN
China
Prior art keywords
information
matching degree
branch
calculating
label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610887444.0A
Other languages
Chinese (zh)
Other versions
CN107918778A (en
Inventor
张一昌
赵争超
张建伟
蔡仁贵
林君
肖谦
潘林林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201610887444.0A (patent CN107918778B)
Priority to TW106127140A (patent TW201814556A)
Priority to PCT/CN2017/103858 (patent WO2018068648A1)
Publication of CN107918778A
Application granted
Publication of CN107918778B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application provide an information matching method and a related device. The method comprises the following steps: acquiring first information and second information to be matched; acquiring a label category tree, wherein the label category tree comprises at least two layers, each layer comprises at least one label node, and the parent label node of each label node is the parent category of that label node; acquiring a first branch and a second branch from the label category tree, wherein the lowest-layer label node of the first branch matches the content of the first information, and the lowest-layer label node of the second branch matches the content of the second information; and calculating the matching degree of the first information and the second information at least according to the matching degrees of the first branch and the second branch at each layer. The matching degree calculated in this way can reflect the relevance between pieces of information, which improves the matching accuracy.

Description

Information matching method and related device
Technical Field
The present application relates to the field of computer technologies, and in particular, to an information matching method and a related apparatus.
Background
Information matching is a commonly used computer technology for obtaining the matching degree between multiple pieces of information. It is widely applied in internet scenarios. For example, when buyers enter many pieces of evaluation information on an e-commerce or similar website, information matching is used to obtain the matching degree between each piece of evaluation information and the merchant subscription information, so that the evaluation information a merchant is interested in can be located quickly.
One common information matching method at present is as follows: the pieces of information to be matched are segmented into words, it is determined whether they share any identical segmentation results, and the matching degree of the pieces of information is calculated from the shared segmentation results.
This method can only determine whether the pieces of information share identical word segmentation results; it cannot reflect whether the pieces of information are related. For example, suppose the evaluation information input by the buyer is "poor service" and the merchant subscription information is "customer service attitude". Both describe service and are related to some degree, yet the matching degree calculated by this method is 0, so the matching accuracy is low.
Disclosure of Invention
The technical problem to be solved by the application is to provide an information matching method and a related device, so that the calculated matching degree reflects the relevance between pieces of information and the matching accuracy is improved.
To this end, the technical solution for solving the technical problem is as follows:
The application provides an information matching method, which comprises the following steps:
acquiring merchant subscription information and user evaluation information to be matched;
acquiring a label category tree, wherein the label category tree comprises at least two layers, each layer comprises at least one label node, and the parent label node of each label node is the parent category of that label node;
acquiring a first branch and a second branch from the label category tree, wherein the label node at the lowest layer of the first branch is matched with the content of the user evaluation information, and the label node at the lowest layer of the second branch is matched with the content of the merchant subscription information;
and calculating the matching degree of the merchant subscription information and the user evaluation information at least according to the matching degree of the first branch and the second branch corresponding to each layer.
Optionally, calculating the matching degree between the merchant subscription information and the user evaluation information at least according to the matching degree corresponding to each layer of the first branch and the second branch respectively, including:
calculating a first matching degree at least according to the matching degree of the first branch and the second branch corresponding to each layer;
and calculating the matching degree of the merchant subscription information and the user evaluation information at least according to the first matching degree.
Optionally, calculating a first matching degree at least according to the matching degrees respectively corresponding to each layer of the first branch and the second branch, including:
and calculating a first matching degree at least according to the matching degree of the first branch and the second branch respectively corresponding to each layer and the weight value of each layer.
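The branch comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method as claimed: the tree contents, the child-to-parent dictionary `PARENT`, the rule that a layer's matching degree is 1.0 when both branches hold the same label node and 0.0 otherwise, and the normalised weighted sum are all hypothetical choices, since the text does not fix how the per-layer matching degree or the weights are defined.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical label category tree stored as child -> parent links.
# The top-layer node has no parent (None).
PARENT: Dict[str, Optional[str]] = {
    "service": None,
    "customer service": "service",
    "logistics": "service",
    "customer service attitude": "customer service",
    "customer service response speed": "customer service",
    "delivery speed": "logistics",
}


def branch(lowest_node: str) -> List[str]:
    """Return the branch from the top layer down to the given label node."""
    path: List[str] = []
    node: Optional[str] = lowest_node
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]


def first_matching_degree(
    b1: Sequence[str], b2: Sequence[str], weights: Sequence[float]
) -> float:
    """Weighted, normalised sum of per-layer matching degrees.

    Assumption: a layer scores 1.0 if both branches contain the same
    label node at that layer, and 0.0 otherwise.
    """
    total = 0.0
    for layer, weight in enumerate(weights):
        same = layer < len(b1) and layer < len(b2) and b1[layer] == b2[layer]
        total += weight * (1.0 if same else 0.0)
    return total / sum(weights)


layer_weights = [0.2, 0.3, 0.5]  # hypothetical weight value for each layer
attitude = branch("customer service attitude")
response = branch("customer service response speed")
delivery = branch("delivery speed")
```

With these hypothetical weights, `first_matching_degree(attitude, response, layer_weights)` is 0.5 because the two branches share their top two layers, while `first_matching_degree(attitude, delivery, layer_weights)` is 0.2 because only the top layer is shared. Two pieces of information with no words in common can therefore still receive a non-zero matching degree.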
Optionally, the method further includes:
acquiring a trained statistical model;
calculating the emotion index of the user evaluation information according to the statistical model;
calculating the approximation degree between the emotion index of the user evaluation information and the target emotion index;
calculating the matching degree of the user evaluation information and the merchant subscription information according to the matching degree of the first branch and the second branch corresponding to each layer, respectively, including:
and calculating the matching degree of the user evaluation information and the merchant subscription information at least according to the approximation degree and the matching degree of the first branch and the second branch corresponding to each layer respectively.
Optionally, the method further includes:
and calculating the emotion index of the merchant subscription information according to the statistical model, wherein the emotion index of the merchant subscription information is used as the target emotion index.
Optionally, calculating the matching degree between the user evaluation information and the merchant subscription information at least according to the approximation degree and the matching degree of the first branch and the second branch corresponding to each layer, respectively, includes:
if the approximation degree is greater than or equal to a first threshold value, calculating the matching degree of the user evaluation information and the merchant subscription information according to the matching degree of the first branch and the second branch corresponding to each layer;
and if the approximation degree is smaller than the first threshold value, the matching degree of the user evaluation information and the merchant subscription information is 0.
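The threshold rule above can be sketched as follows. The text does not define how the approximation degree is computed or what range the emotion index takes, so this sketch assumes emotion indices in [0, 1] and uses one minus their absolute difference; the function names and the default threshold of 0.5 are likewise hypothetical.

```python
def approximation_degree(emotion_index: float, target_emotion_index: float) -> float:
    """Assumed measure: 1 minus the absolute difference of two indices in [0, 1]."""
    return 1.0 - abs(emotion_index - target_emotion_index)


def gated_matching_degree(
    branch_matching_degree: float,
    emotion_index: float,
    target_emotion_index: float,
    first_threshold: float = 0.5,  # hypothetical value of the first threshold
) -> float:
    """Keep the branch-based matching degree only if the emotions are close enough."""
    degree = approximation_degree(emotion_index, target_emotion_index)
    if degree >= first_threshold:
        return branch_matching_degree
    return 0.0
```

Under these assumptions, an evaluation whose emotion index is far from the target emotion index is given a matching degree of 0 even when its label branch matches well.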
Optionally, obtaining the trained statistical model includes:
acquiring a category corresponding to the user evaluation information;
and acquiring the trained statistical model corresponding to the category.
Optionally, the obtaining of the category corresponding to the user evaluation information includes:
acquiring a scene category tree, wherein the scene category tree comprises at least two layers, each layer comprises at least one scene node, and the parent scene node of each scene node is the parent category of that scene node;
and acquiring the scene node matched with the user evaluation information from the scene category tree, determining the parent scene node one level or multiple levels above the matched scene node, and taking that parent scene node as the category corresponding to the user evaluation information.
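The category lookup above can be sketched as a walk up the scene category tree. The scene names, the `SCENE_PARENT` dictionary, and the `levels_up` parameter are hypothetical; the text only states that a parent scene node one level or multiple levels above the matched node is used as the category.

```python
from typing import Dict, Optional

# Hypothetical scene category tree stored as child -> parent links.
SCENE_PARENT: Dict[str, Optional[str]] = {
    "e-commerce": None,
    "clothing": "e-commerce",
    "electronics": "e-commerce",
    "dress": "clothing",
    "mobile phone": "electronics",
}


def category_for(matched_scene_node: str, levels_up: int = 1) -> str:
    """Return the parent scene node `levels_up` levels above the matched node.

    The walk stops early at the top layer if the tree is shallower than requested.
    """
    node = matched_scene_node
    for _ in range(levels_up):
        parent = SCENE_PARENT[node]
        if parent is None:
            break
        node = parent
    return node
```

The returned category could then serve as the key for choosing among several trained statistical models, for example one model per category.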
Optionally, the method further includes:
acquiring a word vector of the user evaluation information and a word vector of the merchant subscription information;
calculating the matching degree of the word vector of the user evaluation information and the word vector of the merchant subscription information as a second matching degree;
calculating the matching degree of the user evaluation information and the merchant subscription information according to the matching degree of the first branch and the second branch corresponding to each layer, respectively, including:
and calculating the matching degree of the user evaluation information and the merchant subscription information at least according to the matching degree and the second matching degree of the first branch and the second branch corresponding to each layer respectively.
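A minimal sketch of the second matching degree follows. It assumes cosine similarity as the matching degree between the two word vectors and a linear blend with a hypothetical coefficient `alpha` as the way the two matching degrees are combined; the text specifies neither choice.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_matching_degree(
    first_degree: float, second_degree: float, alpha: float = 0.5
) -> float:
    """Assumed combination: a linear blend of the branch-based and vector-based degrees."""
    return alpha * first_degree + (1.0 - alpha) * second_degree
```

Here `first_degree` stands for the matching degree derived from the two branches and `second_degree` for the word-vector matching degree.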
Optionally, the method further includes:
obtaining the matching degree among a plurality of label nodes in the label category tree;
and performing machine learning according to the matching degree among the label nodes, and generating or correcting the label category tree according to the result of the machine learning.
The application also provides an information matching method, which comprises the following steps:
acquiring merchant subscription information and user evaluation information to be matched;
acquiring a trained statistical model;
calculating the emotion index of the user evaluation information according to the statistical model;
and calculating the matching degree of the user evaluation information and the merchant subscription information at least according to the approximation degree of the emotion index of the user evaluation information and the target emotion index.
Optionally, the method further includes:
acquiring an initial matching degree of the user evaluation information and the merchant subscription information;
calculating the matching degree of the user evaluation information and the merchant subscription information at least according to the approximation degree of the emotion index of the user evaluation information and the target emotion index, wherein the calculating step comprises the following steps:
and calculating the matching degree of the user evaluation information and the merchant subscription information at least according to the approximation degree and the initial matching degree.
Optionally, calculating the matching degree between the user evaluation information and the merchant subscription information at least according to the approximation degree and the initial matching degree includes:
if the approximation degree is greater than or equal to a first threshold value, calculating the matching degree of the user evaluation information and the merchant subscription information at least according to the initial matching degree;
and if the approximation degree is smaller than the first threshold value, the matching degree of the user evaluation information and the merchant subscription information is 0.
Optionally, obtaining the trained statistical model includes:
acquiring a category corresponding to the user evaluation information;
and acquiring the trained statistical model corresponding to the category.
Optionally, the obtaining of the category corresponding to the user evaluation information includes:
acquiring a scene category tree, wherein the scene category tree comprises at least two layers, each layer comprises at least one scene node, and the parent scene node of each scene node is the parent category of that scene node;
and acquiring the scene node matched with the user evaluation information from the scene category tree, determining the parent scene node one level or multiple levels above the matched scene node, and taking that parent scene node as the category corresponding to the user evaluation information.
Optionally, the method further includes:
calculating the emotion index of the merchant subscription information according to the statistical model, and taking the emotion index of the merchant subscription information as the target emotion index.
The application also provides an information input method, which comprises the following steps:
a client acquires user evaluation information or merchant subscription information input by a user;
and the client sends the user evaluation information or the merchant subscription information to a computing unit, wherein the computing unit is used for calculating the matching degree of the user evaluation information and the merchant subscription information.
The application also provides an information matching method, which comprises the following steps:
acquiring first information and second information to be matched;
acquiring a label category tree, wherein the label category tree comprises at least two layers, each layer comprises at least one label node, and the parent label node of each label node is the parent category of that label node;
acquiring a first branch and a second branch from the label category tree, wherein the label node at the lowest layer of the first branch is matched with the content of the first information, and the label node at the lowest layer of the second branch is matched with the content of the second information;
and calculating the matching degree of the first information and the second information at least according to the matching degree of the first branch and the second branch corresponding to each layer respectively.
Optionally, calculating the matching degree of the first information and the second information according to at least the matching degree of the first branch and the second branch corresponding to each layer respectively, includes:
calculating a first matching degree at least according to the matching degree of the first branch and the second branch corresponding to each layer;
and calculating the matching degree of the first information and the second information at least according to the first matching degree.
Optionally, calculating a first matching degree at least according to the matching degrees respectively corresponding to each layer of the first branch and the second branch, including:
and calculating a first matching degree at least according to the matching degree of the first branch and the second branch respectively corresponding to each layer and the weight value of each layer.
Optionally, the method further includes:
acquiring a trained statistical model;
calculating the emotion index of the first information according to the statistical model;
calculating the approximation degree between the emotion index of the first information and the target emotion index;
calculating the matching degree of the first information and the second information according to the matching degree of the first branch and the second branch corresponding to each layer respectively, including:
and calculating the matching degree of the first information and the second information according to the approximation degree and the matching degree of the first branch and the second branch corresponding to each layer.
Optionally, the method further includes:
and calculating the emotion index of the second information according to the statistical model, wherein the emotion index of the second information is used as the target emotion index.
Optionally, calculating the matching degree of the first information and the second information according to the approximation degree and the matching degree of the first branch and the second branch respectively corresponding to each layer includes:
if the approximation degree is greater than or equal to a first threshold value, calculating the matching degree of the first information and the second information according to the matching degree of the first branch and the second branch corresponding to each layer;
and if the approximation degree is smaller than the first threshold value, the matching degree of the first information and the second information is 0.
Optionally, obtaining the trained statistical model includes:
acquiring a category corresponding to the first information;
and acquiring the trained statistical model corresponding to the category.
Optionally, the obtaining of the category corresponding to the first information includes:
acquiring a scene category tree, wherein the scene category tree comprises at least two layers, each layer comprises at least one scene node, and the parent scene node of each scene node is the parent category of that scene node;
and acquiring the scene node matched with the first information from the scene category tree, determining the parent scene node one level or multiple levels above the matched scene node, and taking that parent scene node as the category corresponding to the first information.
Optionally, the training characteristics of the trained statistical model include word segmentation results of input information;
the method further comprises the following steps: performing word segmentation on the first information to obtain a word segmentation result of the first information;
calculating an emotion index of the first information according to the statistical model, including: and inputting the word segmentation result of the first information into the statistical model to obtain the emotion index of the first information.
Optionally, the word segmentation result of the input information is a word segmentation result obtained by segmenting every two adjacent characters in the input information;
the segmenting the first information includes: and performing word segmentation on every two adjacent characters in the first information.
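Segmenting every two adjacent characters amounts to taking character bigrams. The sketch below shows one way to do this; the function name and the choice to ignore whitespace are assumptions, and the sample string in the comment is a hypothetical input rather than one taken from the text.

```python
from typing import List


def adjacent_pair_segments(text: str) -> List[str]:
    """Split text into overlapping segments of two adjacent characters.

    Whitespace is dropped first (an assumption; the text does not say
    how whitespace or punctuation is handled).
    """
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]


# Hypothetical input: adjacent_pair_segments("服务很差")
# yields ["服务", "务很", "很差"].
```

The resulting segments would then be the word segmentation result that is input into the statistical model.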
Optionally, the training features of the trained statistical model further include context emotion features;
the method further comprises the following steps: extracting emotional features of the context of the first information;
inputting the word segmentation result of the first information into the statistical model to obtain the emotion index of the first information, wherein the obtaining of the emotion index of the first information comprises the following steps: and inputting the word segmentation result of the first information and the emotional characteristics of the context of the first information into the statistical model to obtain the emotional index of the first information.
Optionally, the emotional features of the context include any one or more of the following:
the emotion index of the previous sentence, the topic similarity between the previous sentence and the current sentence, the overall emotion distribution of the preceding text, and the emotion distribution of at least one related sentence in the preceding text, wherein the topic similarity between the at least one related sentence and the current sentence is greater than a second threshold value.
Optionally, the trained statistical model includes a first statistical model and a second statistical model after training, the training features of the first statistical model include word segmentation results of input information, and the training features of the second statistical model include context emotion features.
Optionally, the trained statistical model is a trained maximum entropy model.
Optionally, the method further includes:
acquiring a word vector of the first information and a word vector of the second information;
calculating the matching degree of the word vector of the first information and the word vector of the second information as a second matching degree;
calculating the matching degree of the first information and the second information according to the matching degree of the first branch and the second branch corresponding to each layer respectively, including:
and calculating the matching degree of the first information and the second information at least according to the matching degree and the second matching degree of the first branch and the second branch corresponding to each layer respectively.
Optionally, the method further includes:
obtaining the matching degree among a plurality of label nodes in the label category tree;
and performing machine learning according to the matching degree among the label nodes, and generating or correcting the label category tree according to the result of the machine learning.
The present application further provides an information matching apparatus, including:
the information acquisition unit is used for acquiring the merchant subscription information and the user evaluation information to be matched;
the category tree obtaining unit is used for acquiring a label category tree, wherein the label category tree comprises at least two layers, each layer comprises at least one label node, and the parent label node of each label node is the parent category of that label node;
the branch acquisition unit is used for acquiring a first branch and a second branch from the tag category tree, wherein the lowest layer of tag nodes of the first branch are matched with the content of the user evaluation information, and the lowest layer of tag nodes of the second branch are matched with the content of the merchant subscription information;
and the matching degree calculation unit is used for calculating the matching degree of the merchant subscription information and the user evaluation information at least according to the matching degree of the first branch and the second branch corresponding to each layer.
Optionally, the matching degree calculating unit is specifically configured to calculate a first matching degree at least according to matching degrees respectively corresponding to each layer of the first branch and the second branch, and calculate a matching degree between the merchant subscription information and the user evaluation information at least according to the first matching degree.
Optionally, when the first matching degree is calculated at least according to the matching degree of the first branch and the second branch respectively corresponding to each layer, the matching degree calculating unit is specifically configured to calculate the first matching degree at least according to the matching degree of the first branch and the second branch respectively corresponding to each layer and the weight value of each layer.
Optionally, the apparatus further includes:
the model acquisition unit is used for acquiring the trained statistical model;
the emotion calculating unit is used for calculating the emotion index of the user evaluation information according to the statistical model;
the approximation calculation unit is used for calculating the approximation of the emotion index of the user evaluation information and the target emotion index;
the matching degree calculation unit is specifically configured to calculate the matching degree between the user evaluation information and the merchant subscription information at least according to the matching degree and the approximation degree of the first branch and the second branch corresponding to each layer.
Optionally, the emotion calculating unit is further configured to calculate an emotion index of the merchant subscription information according to the statistical model, where the emotion index of the merchant subscription information is used as the target emotion index.
Optionally, when the matching degree between the user evaluation information and the merchant subscription information is calculated at least according to the approximation degree and the matching degree respectively corresponding to each layer of the first branch and the second branch, the matching degree calculating unit is specifically configured to:
if the approximation degree is greater than or equal to a first threshold value, calculate the matching degree of the user evaluation information and the merchant subscription information according to the matching degree of the first branch and the second branch corresponding to each layer;
and if the approximation degree is smaller than the first threshold value, set the matching degree of the user evaluation information and the merchant subscription information to 0.
Optionally, the model obtaining unit is specifically configured to obtain a category corresponding to the user evaluation information, and obtain a trained statistical model corresponding to the category.
Optionally, when the category corresponding to the user evaluation information is obtained, the model obtaining unit is specifically configured to:
acquire a scene category tree, wherein the scene category tree comprises at least two layers, each layer comprises at least one scene node, and the parent scene node of each scene node is the parent category of that scene node;
and acquire the scene node matched with the user evaluation information from the scene category tree, determine the parent scene node one level or multiple levels above the matched scene node, and take that parent scene node as the category corresponding to the user evaluation information.
Optionally, the apparatus further includes: the word vector acquisition unit is used for acquiring the word vector of the user evaluation information and the word vector of the merchant subscription information;
the matching degree calculation unit is also used for calculating the matching degree of the word vector of the user evaluation information and the word vector of the merchant subscription information as a second matching degree;
when the matching degree of the user evaluation information and the merchant subscription information is calculated at least according to the matching degree of the first branch and the second branch corresponding to each layer, the matching degree calculation unit is specifically configured to calculate the matching degree of the user evaluation information and the merchant subscription information at least according to the second matching degree and the matching degree of the first branch and the second branch corresponding to each layer.
Optionally, the apparatus further includes:
and the correcting unit is used for acquiring the matching degree among a plurality of label nodes in the label category tree, performing machine learning according to the matching degree among the plurality of label nodes, and generating or correcting the label category tree according to the machine learning result.
The present application further provides an information matching apparatus, including:
the information acquisition unit is used for acquiring the merchant subscription information and the user evaluation information to be matched;
the model acquisition unit is used for acquiring the trained statistical model;
the emotion calculating unit is used for calculating the emotion index of the user evaluation information according to the statistical model;
and the matching degree calculation unit is used for calculating the matching degree of the user evaluation information and the merchant subscription information at least according to the approximation degree of the emotion index of the user evaluation information and the target emotion index.
Optionally, the apparatus further includes:
the matching degree obtaining unit is used for obtaining the initial matching degree of the user evaluation information and the merchant subscription information;
when the matching degree between the user evaluation information and the merchant subscription information is calculated at least according to the approximation degree between the emotion index of the user evaluation information and the target emotion index, the matching degree calculating unit is specifically configured to calculate the matching degree between the user evaluation information and the merchant subscription information at least according to the approximation degree and the initial matching degree.
Optionally, when the matching degree between the user evaluation information and the merchant subscription information is calculated at least according to the approximation degree and the initial matching degree, the matching degree calculating unit is specifically configured to:
if the approximation degree is greater than or equal to a first threshold value, calculate the matching degree of the user evaluation information and the merchant subscription information at least according to the initial matching degree;
and if the approximation degree is smaller than the first threshold value, set the matching degree of the user evaluation information and the merchant subscription information to 0.
Optionally, the model obtaining unit is specifically configured to obtain a category corresponding to the user evaluation information, and obtain a trained statistical model corresponding to the category.
Optionally, when the category corresponding to the user evaluation information is obtained, the model obtaining unit is specifically configured to:
obtain a scene category tree, wherein the scene category tree comprises at least two layers, each layer comprises at least one scene node, and the parent scene node of each scene node is the parent category of the scene node;
and acquire a scene node matched with the user evaluation information from the scene category tree, determine the parent scene node one or more levels above the matched scene node, and take that parent scene node as the category corresponding to the user evaluation information.
Optionally, the emotion calculating unit is further configured to calculate an emotion index of the merchant subscription information according to the statistical model, and use the emotion index of the merchant subscription information as the target emotion index.
The present application further provides a client, including:
the information acquisition unit is used for acquiring user evaluation information or merchant subscription information input by a user;
and the sending unit is used for sending the user evaluation information or the merchant subscription information to the calculating unit, and the calculating unit is used for calculating the matching degree of the user evaluation information and the merchant subscription information.
The present application further provides an information matching apparatus, including:
the information acquisition unit is used for acquiring first information and second information to be matched;
the category tree obtaining unit is used for obtaining a label category tree, wherein the label category tree comprises at least two layers, each layer comprises at least one label node, and the parent label node of each label node is the parent category of the label node;
the branch acquisition unit is used for acquiring a first branch and a second branch from the label category tree, wherein the label node at the lowest layer of the first branch is matched with the content of the first information, and the label node at the lowest layer of the second branch is matched with the content of the second information;
and the matching degree calculation unit is used for calculating the matching degree of the first information and the second information at least according to the matching degree of the first branch and the second branch corresponding to each layer respectively.
Optionally, the matching degree calculating unit is specifically configured to calculate a first matching degree at least according to matching degrees respectively corresponding to each layer of the first branch and the second branch; and calculating the matching degree of the first information and the second information at least according to the first matching degree.
Optionally, when the first matching degree is calculated at least according to the matching degree of the first branch and the second branch respectively corresponding to each layer, the matching degree calculating unit is specifically configured to calculate the first matching degree at least according to the matching degree of the first branch and the second branch respectively corresponding to each layer and the weight value of each layer.
Optionally, the apparatus further includes:
the model acquisition unit is used for acquiring the trained statistical model;
the emotion calculating unit is used for calculating an emotion index of the first information according to the statistical model;
the approximation calculation unit is used for calculating the approximation of the emotion index of the first information and the target emotion index;
when the matching degree of the first information and the second information is calculated at least according to the matching degree of the first branch and the second branch corresponding to each layer, the matching degree calculation unit is specifically configured to calculate the matching degree of the first information and the second information at least according to the matching degree of the first branch and the second branch corresponding to each layer and the approximation degree.
Optionally, the emotion calculating unit is further configured to calculate an emotion index of the second information according to the statistical model, where the emotion index of the second information is used as the target emotion index.
Optionally, when the matching degree of the first information and the second information is calculated at least according to the approximation degree and the matching degree of the first branch and the second branch corresponding to each layer, the matching degree calculating unit is specifically configured to:
if the approximation degree is greater than or equal to a first threshold, calculating the matching degree of the first information and the second information according to the matching degree of the first branch and the second branch corresponding to each layer;
and if the approximation degree is smaller than the first threshold value, the matching degree of the first information and the second information is 0.
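The threshold rule above can be sketched in Python as follows. This is a minimal illustration only: the function name, the parameter names, and the threshold value 0.6 used in the usage note are hypothetical, and how the approximation degree itself is computed is left outside the sketch.

```python
def gated_matching_degree(approximation: float,
                          branch_matching_degree: float,
                          first_threshold: float) -> float:
    """Keep the branch-based matching degree only when the emotion
    approximation degree reaches the first threshold; otherwise return 0."""
    if approximation >= first_threshold:
        return branch_matching_degree
    return 0.0
```

With an assumed threshold of 0.6, `gated_matching_degree(0.8, 0.5, 0.6)` keeps the branch-based value 0.5, while `gated_matching_degree(0.3, 0.5, 0.6)` yields 0.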
Optionally, the model obtaining unit is specifically configured to obtain a category corresponding to the first information, and obtain a trained statistical model corresponding to the category.
Optionally, when the category corresponding to the first information is obtained, the model obtaining unit is specifically configured to:
obtain a scene category tree, wherein the scene category tree comprises at least two layers, each layer comprises at least one scene node, and the parent scene node of each scene node is the parent category of the scene node;
and acquire a scene node matched with the first information from the scene category tree, determine the parent scene node one or more levels above the matched scene node, and take that parent scene node as the category corresponding to the first information.
Optionally, the training characteristics of the trained statistical model include word segmentation results of input information;
the device further comprises: the word segmentation unit is used for performing word segmentation on the first information to obtain a word segmentation result of the first information;
the emotion calculating unit is specifically configured to input the word segmentation result of the first information to the statistical model, so as to obtain an emotion index of the first information.
Optionally, the word segmentation result of the input information is a word segmentation result obtained by segmenting every two adjacent characters in the input information;
when the first information is segmented, the segmentation unit is specifically configured to segment every two adjacent characters in the first information.
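Segmenting every two adjacent characters amounts to taking overlapping character pairs. A minimal Python sketch is given below; the function name is hypothetical, and dropping whitespace before pairing is an assumption of this sketch rather than something stated in the application.

```python
def segment_adjacent_pairs(text: str) -> list:
    """Return every pair of adjacent characters in the text, in order.

    Whitespace is removed first (an assumption of this sketch)."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]
```

For a four-character input such as "abcd" this yields the three overlapping pairs "ab", "bc" and "cd"; an input with fewer than two characters yields an empty list.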
Optionally, the training features of the trained statistical model further include context emotion features;
the device further comprises: the emotion extraction unit is used for extracting the emotion characteristics of the context of the first information;
when the word segmentation result of the first information is input to the statistical model to obtain the emotion index of the first information, the emotion calculation unit is specifically configured to input the word segmentation result of the first information and the emotion characteristics of the context of the first information to the statistical model to obtain the emotion index of the first information.
Optionally, the emotional features of the context include any one or more of the following:
the emotion index of the previous sentence, the topic similarity of the previous sentence and the current sentence, the overall emotion distribution of the preceding text, and the emotion distribution of at least one related sentence in the preceding text, wherein the topic similarity of the at least one related sentence and the current sentence is greater than a second threshold value.
Optionally, the trained statistical model includes a first statistical model and a second statistical model after training, the training features of the first statistical model include word segmentation results of input information, and the training features of the second statistical model include context emotion features.
Optionally, the trained statistical model is a trained maximum entropy model.
Optionally, the apparatus further includes: a word vector acquiring unit, configured to acquire a word vector of the first information and a word vector of the second information;
the matching degree calculation unit is further used for calculating the matching degree of the word vector of the first information and the word vector of the second information as a second matching degree;
and when the matching degree of the first information and the second information is calculated at least according to the matching degree of the first branch and the second branch corresponding to each layer, the matching degree calculating unit is specifically configured to calculate the matching degree of the first information and the second information at least according to the matching degree of the first branch and the second branch corresponding to each layer and the second matching degree.
Optionally, the apparatus further includes: a correcting unit, configured to acquire the matching degree among a plurality of label nodes in the label category tree, perform machine learning according to the matching degree among the plurality of label nodes, and generate or correct the label category tree according to the machine learning result.
According to the technical scheme, when the first information and the second information are matched, they are not matched directly after word segmentation; instead, the first branch corresponding to the first information and the second branch corresponding to the second information are obtained from the label category tree. The lowest label node of the first branch matches the content of the first information, and the parent label node of each label node in the label category tree is the parent category of that label node, so the first branch includes not only the label node matching the content of the first information but also the layer-by-layer parent categories of the matching label node. Similarly, the second branch includes not only the label node matching the content of the second information but also the layer-by-layer parent categories of the matching label node. Therefore, the matching degree of the first information and the second information, calculated according to the matching degree of the first branch and the second branch at each corresponding layer, reflects not only how well the contents of the first information and the second information match but also how well their layer-by-layer parent categories match, that is, the relevance between the layer-by-layer parent categories of the first information and the second information, thereby improving the matching accuracy.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings can be obtained by those skilled in the art according to the drawings.
FIG. 1 is a schematic flow chart of an embodiment of a method provided herein;
FIG. 2 is a schematic diagram of a tree of tag categories provided herein;
FIG. 3 is a schematic flow chart diagram of another embodiment of a method provided herein;
FIG. 4 is a schematic diagram of a scene category tree provided herein;
FIG. 5 is a schematic flow chart diagram of another embodiment of a method provided herein;
FIG. 6 is a schematic diagram of an embodiment of an apparatus provided herein;
FIG. 7 is a schematic diagram of another embodiment of an apparatus provided herein;
FIG. 8 is a schematic diagram of another embodiment of an apparatus provided herein;
FIG. 9 is a schematic diagram of another embodiment of an apparatus provided herein;
FIG. 10 is a schematic diagram of another embodiment of an apparatus provided herein;
FIG. 11 is a schematic structural diagram of another embodiment of the apparatus provided in the present application.
Detailed Description
The evaluation information refers to feedback information input by a user on a network platform such as a website or an APP (application). For example, after a buyer purchases a product on an e-commerce site, the buyer can evaluate the product and the services provided by the merchant, such as logistics. By inputting merchant subscription information, a merchant can have the evaluation information it is interested in extracted and pushed to it. The specific process comprises the following steps: the buyer inputs a plurality of pieces of evaluation information, and the merchant inputs merchant subscription information; the merchant subscription information and the evaluation information are each segmented into words; whether the merchant subscription information and the evaluation information share the same word segmentation results is judged; and the matching degree between the pieces of information is calculated according to the shared word segmentation results.
Obviously, this information matching method can only determine whether the evaluation information and the merchant subscription information share the same word segmentation results, and cannot reflect whether the two are related; for example, it cannot determine the correlation between the parent categories of the evaluation information and the merchant subscription information. For example, the evaluation information input by the buyer is "poor service" and the merchant subscription information is "customer service attitude". Although the parent category of both "poor service" and "customer service attitude" is service, so that the two are related to some extent, the matching degree calculated by this information matching method is 0. The matching accuracy is therefore low, and the merchant has to obtain the related evaluation information through an additional algorithm, which wastes system resources.
The embodiment of the application provides an information matching method and a related device, so that the calculated matching degree can reflect the relevance among information, particularly the relevance among layer-by-layer father categories of a plurality of pieces of information, and the matching accuracy is improved.
In order to make those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, an embodiment of the present application provides a method embodiment of an information matching method, where the method of the embodiment includes:
S101: Acquire first information and second information to be matched.
The first information and/or the second information may be words, phrases and other information input by a user. For example, the first information may be user rating information input by a buyer, and the second information may be merchant subscription information input by a merchant.
S102: Acquire a label category tree.
The label category tree in the embodiment of the application comprises at least two layers, each layer comprises at least one label node, and a parent label node of each label node is a parent category of the label node.
For example, the label category tree shown in FIG. 2 includes three layers. The first layer includes one label node: "service", which is the root node of the label category tree. The second layer includes two label nodes: "before sale" and "after sale". The third layer includes four label nodes: "customer service attitude", "response speed", "cashback", and "warranty". The label category tree is refined layer by layer as the layer number increases, that is, the parent label node of each label node is the parent category of the label node. For example, "before sale" is the parent category of "customer service attitude", and "service" is the parent category of "before sale".
S103: Acquire a first branch and a second branch from the tag category tree. The first branch and/or the second branch comprise at least one tag node.
The label node at the lowest layer of the first branch matches the content of the first information, and the parent label node of each label node in the label category tree is the parent category of that label node. Therefore, if the first information matches a label node other than the root node, the first branch includes not only the label node matching the content of the first information but also the layer-by-layer parent categories of the matching label node.
The obtaining process of the first branch may include: matching the first information with each node in the label category tree to obtain a matched label node, and taking the matched label node and its layer-by-layer parent nodes as the first branch. Before matching with the label category tree, the first information may be segmented into words, and the word segmentation results are matched with the label category tree.
For example, the first information is "the service is not good". Segmenting the first information yields the word segmentation results "service" and "bad", which are matched with each node in the label category tree to obtain the matched label node "service". Since the label node "service" is the root node and has no parent node, "service" alone is taken as the first branch. For another example, the first information is "the customer service attitude is not good". The matched label node "customer service attitude" is obtained in a similar manner, and "customer service attitude" together with its layer-by-layer parent nodes "before sale" and "service" is taken as the first branch.
Similarly, the label node at the lowest layer of the second branch matches the content of the second information. If the second information matches a label node other than the root node, the second branch includes not only the label node matching the content of the second information but also the layer-by-layer parent categories of the matching label node. The process of acquiring the second branch is similar to the process of acquiring the first branch, and may include: matching the second information with each node in the label category tree to obtain a matched node, and taking the matched node and its layer-by-layer parent nodes as the second branch. Before matching with the label category tree, the second information may be segmented into words, and the word segmentation results are matched with the label category tree.
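The branch acquisition described above can be sketched in Python as follows. The sketch makes several assumptions that are not taken from the application: the tree is stored as child-to-parent links, "cashback" and "warranty" are placed under "after sale", and a label is considered matched when it appears as a substring of the information, with the longest such label chosen. The application itself matches word segmentation results against the nodes, so the matching step here is only a stand-in.

```python
# Child -> parent links for a label category tree like that of FIG. 2.
# The root has parent None. Placement of "cashback" and "warranty"
# under "after sale" is an assumption of this sketch.
PARENT = {
    "service": None,
    "before sale": "service",
    "after sale": "service",
    "customer service attitude": "before sale",
    "response speed": "before sale",
    "cashback": "after sale",
    "warranty": "after sale",
}


def match_label(information: str):
    """Return the longest label contained in the information, or None.

    Substring matching stands in for word segmentation plus matching."""
    hits = [label for label in PARENT if label in information]
    return max(hits, key=len) if hits else None


def branch_for(information: str) -> list:
    """Return the branch from the root down to the matched label node."""
    label = match_label(information)
    branch = []
    while label is not None:
        branch.append(label)
        label = PARENT[label]
    branch.reverse()  # root first, matched node last
    return branch
```

Under these assumptions, `branch_for("poor service")` gives `["service"]`, and `branch_for("customer service attitude is not good")` gives `["service", "before sale", "customer service attitude"]`.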
S104: Calculate the matching degree of the first information and the second information at least according to the matching degree of the first branch and the second branch corresponding to each layer respectively.
Specifically, the step may include: calculating a first matching degree at least according to the matching degrees of the first branch and the second branch at each corresponding layer; and calculating the matching degree of the first information and the second information at least according to the first matching degree. In the embodiment of the application, the first matching degree can be used directly as the matching degree of the first information and the second information, or the matching degree of the first information and the second information can be calculated from the first matching degree in combination with other parameters.
The first branch comprises at least one layer of label nodes, the second branch comprises at least one layer of label nodes, the label nodes corresponding to each layer of the first branch and the second branch are matched to obtain the matching degree corresponding to each layer, and the matching degree of the first information and the second information is calculated according to the matching degree corresponding to each layer.
For example, the first branch comprises in sequence: "service", and the second branch comprises in sequence: "service" and "before sale". The matching degree of the first layer is 100% and the matching degree of the second layer is 0, and the first matching degree is calculated from the matching degrees of the two layers. For example, 1/2 of the sum of the matching degrees of the two layers is taken as the matching degree of the first information and the second information, which gives 50% in this example. For another example, the first branch comprises in sequence: "service", "before sale", "customer service attitude", and the second branch comprises in sequence: "service", "before sale", "response speed". Taking 1/3 of the sum of the matching degrees of the three layers as the matching degree of the first information and the second information gives 67%.
When the first matching degree is calculated according to the matching degree corresponding to each layer, the weight value of each layer may also be considered, for example, the first matching degree Tagsim is:
Figure BDA0001128636660000181
where w_i is the weight value of the i-th layer and P_i is the matching degree of the first branch and the second branch at the i-th layer. When P_i = 100%, the function I is equal to 1; when P_i ≠ 100%, the function I is equal to 0. The weight values of the layers may all be equal to 1, or may increase layer by layer, and the weight values may be set and/or adjusted in a machine learning manner. It should be noted that the above formula is only one optional way of calculating the first matching degree, and those skilled in the art can extend and modify it. For example, the function I may be equal to another value when P_i = 100%, or the function I may be equal to 1 when another condition is satisfied, for example, when P_i is greater than a certain value, which is not limited in the embodiment of the present application.
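One possible reading of the weighted first matching degree is sketched below in Python. The exact formula is not reproduced in this text, so the normalization by the total weight is an assumption, chosen because with all weights equal to 1 it reproduces the 50% and 67% figures of the examples above. The function name and the treatment of a missing layer as a mismatch are likewise assumptions of this sketch.

```python
def tagsim(first_branch, second_branch, weights=None):
    """Weighted share of layers on which the two branches carry the same label.

    Assumed form: sum(w_i * I(P_i)) / sum(w_i), where I(P_i) is 1 when the
    labels at layer i are identical and 0 otherwise. A layer that exists in
    only one branch counts as a mismatch."""
    depth = max(len(first_branch), len(second_branch))
    if depth == 0:
        return 0.0
    if weights is None:
        weights = [1.0] * depth  # equal weights, as in the worked examples
    if len(weights) < depth:
        raise ValueError("one weight per layer is required")
    total = sum(weights[:depth])
    matched = 0.0
    for i in range(depth):
        same = (i < len(first_branch) and i < len(second_branch)
                and first_branch[i] == second_branch[i])
        if same:
            matched += weights[i]
    return matched / total
```

With equal weights, the branches "service" and "service", "before sale" give 0.5, and the two three-layer branches of the second example give 2/3. With weights increasing layer by layer, for example 1, 2, 3, a mismatch on the lowest layer costs more, and the second example drops to 0.5.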
According to the technical scheme, when the first information and the second information are matched, they are not matched directly after word segmentation; instead, the first branch corresponding to the first information and the second branch corresponding to the second information are obtained from the label category tree. The first branch includes the label node matching the content of the first information and also the layer-by-layer parent categories of the matching label node. Similarly, the second branch includes the label node matching the content of the second information and also the layer-by-layer parent categories of the matching label node. Therefore, the matching degree of the first information and the second information, calculated according to the matching degree of the first branch and the second branch at each corresponding layer, reflects both how well the contents of the first information and the second information match and how well their layer-by-layer parent categories match, that is, the relevance between the layer-by-layer parent categories of the first information and the second information, so the matching accuracy is improved.
It can be seen that the embodiment of the present application is in effect equivalent to adding at least one layer of category labels to the first information and the second information, and calculating the matching degree of the first information and the second information according to the matching degree of the category labels at each corresponding layer. Therefore, by applying the embodiment of the application, the matching degree can be calculated between pieces of information that are related through their categories, for example, the matching degree between synonyms, or between a plurality of pieces of information belonging to the same category.
For example, the evaluation information input by the buyer is "poor service" and the merchant subscription information is "customer service attitude". Although "poor service" and "customer service attitude" both describe service and are related to some extent, the matching degree is 0 when the two are matched directly, and the matching accuracy is low. When the matching degree is calculated through the embodiment of the application, the first branch sequentially comprises: "service", and the second branch sequentially comprises: "service" and "before sale". The matching degree of the first layer is 100% and the matching degree of the second layer is 0, so the finally calculated matching degree may be 50%. It can be seen that the matching degree calculated in the embodiment of the present application can reflect the correlation between the two, so that the matching accuracy is improved.
It should be noted that, in the embodiment of the present application, besides the user evaluation information and the merchant subscription information, the first information and the second information may also be information in other application scenarios. For example, the first information is chat information input by a user in a WeChat group or a DingTalk group, and the second information is specific subscription information, such as a subscription word or a subscription phrase input by a group administrator, which is not limited in the embodiment of the present application. This is explained below by way of a specific example.
For a WeChat group of a movie interest group, the tag category tree includes two layers. The first layer includes one tag node: "movie". The second layer includes two tag nodes: "comedy" and "action". The label category tree is refined layer by layer as the layer number increases, that is, the parent label node of each label node is the parent category of the label node. For example, "movie" is the parent category of "comedy" and "action". Suppose the subscription word input by the group administrator is "movie", and the chat information input by the user matches the label node "comedy" but does not contain the word "movie". When the two are matched directly, the matching degree is 0, and the matching accuracy is low. When the matching degree is calculated through the embodiment of the application, the first branch sequentially comprises: "movie", "comedy", and the second branch comprises: "movie". The finally calculated matching degree is 50%, so that the matching accuracy is improved.
It should be noted that, if the first information and/or the second information matches multiple branches in the tag category tree, each branch matched by the first information may be paired with each branch matched by the second information, the matching degree of each pair of branches is calculated, and the highest calculated matching degree is used as the matching degree between the first information and the second information.
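Selecting the highest matching degree over all branch pairs can be sketched in Python as follows. The function names are hypothetical, and the per-pair score used here is the unweighted share of matching layers, which is only one of the ways the application allows.

```python
from itertools import product


def layer_match(first_branch, second_branch):
    """Unweighted share of layers on which both branches carry the same label."""
    depth = max(len(first_branch), len(second_branch))
    if depth == 0:
        return 0.0
    hits = sum(1 for a, b in zip(first_branch, second_branch) if a == b)
    return hits / depth


def best_matching_degree(first_branches, second_branches):
    """Score every pair of candidate branches and keep the highest score."""
    return max(
        (layer_match(a, b) for a, b in product(first_branches, second_branches)),
        default=0.0,
    )
```

For instance, if the first information matches both the "customer service attitude" branch and the "warranty" branch while the second information matches "service", "after sale", the pair that shares "after sale" scores higher and determines the result.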
In the information matching method described in the background art, since it is only determined whether the same word segmentation result exists, the matching degree between synonyms cannot be calculated, and the matching accuracy is therefore low. To solve this problem, an information matching approach based on word embedding (word vector) technology has also been proposed, in which word vectors of the information are calculated by methods such as word2vec (a two-layer neural network for processing text), and the matching degree is calculated according to the similarity between the word vectors. Therefore, when the matching degree of the first information and the second information is calculated, the similarity between the word vectors of the first information and the second information can also be taken into account. This will be explained in detail below.
The method may further comprise: acquiring a word vector of the first information and a word vector of the second information; calculating the matching degree of the word vector of the first information and the word vector of the second information as a second matching degree; in S104, the matching degree of the first information and the second information is calculated at least according to the first matching degree, that is, the matching degree of the first branch and the second branch corresponding to each layer, and the second matching degree.
In specific implementation, after the first information is segmented into words, word vectors of each word are extracted, the word vectors of each word are added to obtain word vectors of the first information, word vectors of second information can be obtained in a similar mode, and the matching degree of the word vectors of the first information and the word vectors of the second information is calculated in a mode of calculating cosine similarity and the like. The word vector may be a word vector extracted by using word2vec and other technologies.
When the matching degree of the first information and the second information is calculated according to the first matching degree and the second matching degree, the sum of the first matching degree and the second matching degree can be used as the final matching degree, and a corresponding weight value can also be set for each. For example, the matching degree sim of the first information and the second information may be: sim = λ1·Vecsim + λ2·Tagsim, where Tagsim is the first matching degree, Vecsim is the second matching degree, and λ1 and λ2 are weight values that can be set and/or adjusted in a machine learning manner.
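The second matching degree and its combination with the first matching degree can be sketched in Python as follows. The three-dimensional word vectors are made-up toy values, not output of word2vec, and the function names and the default weights of 0.5 are hypothetical. The combination follows the form sim = λ1·Vecsim + λ2·Tagsim.

```python
import math

# Toy word vectors (hypothetical values for illustration only).
WORD_VECTORS = {
    "service": [1.0, 0.0, 0.0],
    "poor": [0.0, 1.0, 0.0],
    "attitude": [1.0, 1.0, 0.0],
}


def sum_vectors(vectors):
    """Add word vectors component by component to get an information vector."""
    return [sum(components) for components in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_sim(vecsim, tagsim, lam1=0.5, lam2=0.5):
    """Weighted sum of the second (vector) and first (branch) matching degrees."""
    return lam1 * vecsim + lam2 * tagsim
```

A usage sketch: build the vector of the first information with `sum_vectors([WORD_VECTORS["poor"], WORD_VECTORS["service"]])`, compute `cosine` against the vector of the second information to obtain Vecsim, and pass it with Tagsim to `combined_sim`.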
The principle of the word embedding technology is to use machine learning to learn from a large amount of information, so that each word is represented by a corresponding word vector. A word vector actually represents the contexts in which the word appears, so the matching degree calculated from word vectors has low accuracy in some cases. In one case, some words appear in the same contexts but differ greatly in semantics, so the word vector often cannot accurately represent the semantics of the word. For example, the semantics of "good" and "bad" are opposite, but the cosine similarity between their word vectors is high. In another case, the same word may express different meanings in different contexts. For example, "very thin" is a positive word when describing a mobile phone and a negative word when describing a down jacket, but the matching degrees calculated from the word vectors are the same in both cases. In addition, since it is difficult to interpret the meaning of each numerical value in a word vector, the word vector itself cannot be adjusted to solve the above problems.
In order to solve the above problem, the embodiment of the present application may further calculate an emotion index of the information according to the statistical model, where the emotion index may indicate whether the information is a positive word, a negative word, or a neutral word, and the emotion index is considered when calculating the final degree of matching.
Specifically, as shown in fig. 3, the method according to the embodiment of the present application may further include:
S301: Acquire a trained statistical model.
The statistical model can be obtained by training on a large amount of training data, each item of which is labeled with a corresponding emotion index. For example, the training data may be 200,000 sentences, each labeled with a corresponding emotion index.
Optionally, the statistical model may be any suitable mathematical model, such as a maximum entropy model. According to a large number of experiments by the inventors, the emotion index calculated with a maximum entropy model fits the semantics more closely, which can improve the accuracy of information matching.
S302: Calculate the emotion index of the first information according to the statistical model.
The first information is input into the trained statistical model to obtain the emotion index of the first information. The interval in which the emotion index falls indicates whether the emotion corresponding to the first information is positive, negative, or neutral.
S303: Calculate the approximation degree between the emotion index of the first information and a target emotion index.
In this embodiment of the application, the target emotion index may be a preset emotion index, or may be calculated from the second information. For example, the emotion index of the second information is calculated according to the statistical model and used as the target emotion index. The target emotion index can indicate whether the target emotion is positive, negative, or neutral.
The approximation degree may be expressed in any form, such as a difference or a ratio, or may be determined according to whether the emotion indicated by the emotion index of the first information is the same as the emotion indicated by the target emotion index. For example, if both indicated emotions are negative, the approximation degree between the emotion index of the first information and the target emotion index is high.
In S104, the matching degree of the first information and the second information is calculated according to the approximation degree and the matching degrees of the first branch and the second branch at each layer.
In this embodiment, when the matching degree between the first information and the second information is calculated, the approximation degree between the emotion index of the first information and the target emotion index is also considered, and the greater the approximation degree is, that is, the closer the emotion of the first information is to the target emotion, the higher the calculated matching degree is, and vice versa, so that the problem of low matching accuracy caused by the fact that the contexts are the same but the semantics are very different is solved. For example, for "big" and "small", since the emotion is very different, the calculated matching degree is lower and is consistent with the semantics, thereby improving the matching accuracy.
For example, in this embodiment, suppose that the merchant cares about negative evaluation information in the user evaluation information. The target emotion index may then be preset to an emotion index corresponding to a negative result; the closer the user evaluation information is to the target emotion index, the higher the finally calculated matching degree, so that the negative evaluation information the merchant cares about is extracted.
When the matching degree is specifically calculated, the following method can be adopted:
If the approximation degree is greater than or equal to a first threshold, the matching degree of the first information and the second information is calculated according to the matching degrees of the first branch and the second branch at each layer. For example, if the emotion indicated by the emotion index of the first information and the emotion indicated by the target emotion index are both negative, sim = Tagsim, where sim is the matching degree of the first information and the second information and Tagsim is the first matching degree.
If the approximation degree is smaller than the first threshold, the matching degree of the first information and the second information is 0. For example, if the emotion indicated by the emotion index of the first information differs from the emotion indicated by the target emotion index, sim = 0. In this case the matching degree may also be another low value, which is not limited in this embodiment of the application.
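The threshold rule above can be sketched as follows. The numeric scale of the emotion index (here assumed to lie in [-1, 1] with a neutral band around 0), the default threshold of 0.5, and all function names are assumptions made for illustration; the application does not fix them.

```python
def indicated_emotion(index, neutral_band=0.2):
    """Map a numeric emotion index to a label.

    The [-1, 1] scale and the width of the neutral band are assumptions of this sketch.
    """
    if index > neutral_band:
        return "positive"
    if index < -neutral_band:
        return "negative"
    return "neutral"


def approximation_degree(first_index, target_index):
    """Return 1.0 when both indices indicate the same emotion, otherwise 0.0.

    A difference or ratio of the two indices would be an equally valid choice.
    """
    same = indicated_emotion(first_index) == indicated_emotion(target_index)
    return 1.0 if same else 0.0


def gated_matching_degree(tagsim, approximation, first_threshold=0.5, low_value=0.0):
    """Keep the first matching degree only if the approximation degree reaches the threshold."""
    if approximation >= first_threshold:
        return tagsim
    return low_value


# Both indices indicate a negative emotion, so the first matching degree is kept.
kept = gated_matching_degree(0.7, approximation_degree(-0.8, -0.6))
# The emotions differ, so the matching degree falls back to the low value.
dropped = gated_matching_degree(0.7, approximation_degree(0.8, -0.6))
```

The `low_value` parameter reflects the remark that a value other than 0 may be used when the approximation degree is below the threshold.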
In the embodiment of the application, for different meanings of the same word expressed in different environments, a plurality of statistical models corresponding to categories can be set, and each statistical model can calculate the sentiment index of the first information under the category. Different statistical models are obtained by training according to training data corresponding to different scene categories, for example, for the same sentence, the emotion indexes marked under different scene categories are different, so that the emotion indexes calculated by different statistical models correspond to the scene categories.
Specifically, obtaining the trained statistical model may include: and acquiring a category corresponding to the first information, and acquiring a trained statistical model corresponding to the category. The category corresponding to the first information may refer to a category to which an evaluation object of the first information belongs, for example, a buyer purchases a clothing product on an e-commerce website and inputs user evaluation information for evaluating the clothing product, that is, the category corresponding to the user evaluation information is a clothing category.
The category corresponding to the first information can be obtained by means of a scene category tree. Specifically, obtaining the category corresponding to the first information includes: obtaining a scene category tree, where the scene category tree includes at least two layers, each layer includes at least one scene node, and the parent scene node of each scene node is the parent category of that scene node; and obtaining the scene node matching the first information from the scene category tree, determining the parent scene node one or more levels above the matched scene node, and taking that parent scene node as the category corresponding to the first information. The parent scene node one or more levels above may be the root scene node; that is, the root scene node is directly taken as the corresponding category.
For example, a buyer purchases a skirt on an e-commerce website and inputs user evaluation information evaluating the skirt. The matching scene node "skirt" is obtained from the scene category tree, the root scene node corresponding to it, "clothing", is determined, the trained statistical model corresponding to the clothing category is obtained, and the emotion index of the first information is calculated with that model. Thus, when the emotion index of "very thin" is calculated, the statistical model is selected according to whether the corresponding scene category is mobile phones or clothing, so that the emotion index reflects the scene category and the accuracy of information matching is improved.
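The category lookup described above can be sketched with a child-to-parent map. The tree contents, the model identifiers, and the function names below are hypothetical; the actual scene category tree is the one shown in fig. 4, which is not reproduced here.

```python
# Hypothetical scene category tree stored as child -> parent; root nodes map to None.
SCENE_PARENT = {
    "clothing": None,
    "skirt": "clothing",
    "down jacket": "clothing",
    "electronics": None,
    "mobile phone": "electronics",
}

# Hypothetical identifiers of per-category trained statistical models.
MODELS_BY_CATEGORY = {
    "clothing": "clothing-model",
    "electronics": "electronics-model",
}


def root_category(node, parent_of):
    """Walk up from the matched scene node until the root scene node is reached."""
    if node not in parent_of:
        raise KeyError(f"unknown scene node: {node}")
    while parent_of[node] is not None:
        node = parent_of[node]
    return node


def select_model(evaluation_object):
    """Pick the statistical model trained for the root category of the evaluation object."""
    return MODELS_BY_CATEGORY[root_category(evaluation_object, SCENE_PARENT)]


model_for_skirt = select_model("skirt")
model_for_phone = select_model("mobile phone")
```

Using the root node as the category is only one of the options in the text; stopping one level up instead would require replacing the loop with a single parent lookup.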
Optionally, the training features of the statistical model in this embodiment include word segmentation results of input information;
The method further comprises: performing word segmentation on the first information to obtain a word segmentation result of the first information. Calculating the emotion index of the first information according to the statistical model then comprises: inputting the word segmentation result of the first information into the statistical model to obtain the emotion index of the first information.
A large number of experiments by the inventors show that word segmentation can be performed in a bigram mode, that is, every two adjacent characters in the first information are taken as one unit, which yields the word segmentation result of the first information. For example, a four-character phrase meaning "the service is not good" is segmented into three overlapping two-character units (characters 1-2, 2-3, and 3-4), the first meaning "service" and the last meaning "not good". Word segmentation performed in this way gives higher information matching accuracy.
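Character-level bigram segmentation can be sketched in a few lines. The function name and the sample inputs are illustrative assumptions; the Chinese phrase is a hypothetical input meaning "the service is not good", chosen by way of example.

```python
def bigram_segment(text):
    """Return every pair of adjacent characters in text, in order.

    Inputs shorter than two characters yield an empty list.
    """
    return [text[i:i + 2] for i in range(len(text) - 1)]


latin_units = bigram_segment("abcd")
# Hypothetical example input meaning "the service is not good".
chinese_units = bigram_segment("服务不好")
```

A phrase of n characters always produces n - 1 overlapping units, so no dictionary is needed for this segmentation step.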
Besides the word segmentation result, the training features of the statistical model may also include emotional features of the context, so that the emotion index is calculated from both the words and the context information. Specifically, the method further comprises: extracting emotional features of the context of the first information. Inputting the word segmentation result of the first information into the statistical model to obtain the emotion index of the first information then comprises: inputting the word segmentation result of the first information and the emotional features of the context of the first information into the statistical model to obtain the emotion index of the first information.
Wherein the emotional characteristics of the context comprise any one or more of the following:
the emotion index of the previous sentence; the topic similarity between the previous sentence and the current sentence; the overall emotion distribution of the preceding text; and the emotion distribution of at least one related sentence in the preceding text, where the topic similarity between each related sentence and the current sentence is greater than a second threshold. These are described in turn. The emotion index of the previous sentence may indicate whether the emotion of the previous sentence is positive, negative, or neutral. The topic similarity between the previous sentence and the current sentence can indicate whether the two sentences describe the same or similar topics. The overall emotion distribution of the preceding text may refer to the numbers of preceding sentences whose emotions are positive, negative, and neutral, respectively. A related sentence is a sentence that describes the same or a similar topic as the current sentence, and the emotion distribution of the related sentences in the preceding text may refer to the numbers of positive, negative, and neutral sentences among them.
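The four context features listed above can be assembled as in the following sketch. It assumes that each preceding sentence has already been given an emotion label and a topic similarity to the current sentence; the input layout, the default second threshold of 0.5, the neutral fallback for an empty context, and the function name are all assumptions of this illustration.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def context_emotional_features(preceding, second_threshold=0.5):
    """Build context features from the preceding sentences.

    preceding: list of (emotion_label, topic_similarity_to_current_sentence),
    in reading order, so the last entry is the previous sentence.
    """
    overall = Counter(emotion for emotion, _ in preceding)
    # Related sentences: topic similarity strictly greater than the second threshold.
    related = Counter(emotion for emotion, sim in preceding if sim > second_threshold)
    # With no preceding text, fall back to a neutral previous sentence (assumption).
    previous_emotion, previous_similarity = preceding[-1] if preceding else ("neutral", 0.0)
    return {
        "previous_emotion": previous_emotion,
        "previous_topic_similarity": previous_similarity,
        "overall_distribution": {label: overall[label] for label in LABELS},
        "related_distribution": {label: related[label] for label in LABELS},
    }


features = context_emotional_features(
    [("positive", 0.9), ("negative", 0.2), ("negative", 0.7)]
)
```

In this example the second sentence is excluded from the related distribution because its topic similarity (0.2) does not exceed the threshold.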
The embodiment of the application can specifically adopt two statistical models to calculate the emotion index of the first information. That is, the trained statistical model includes a first trained statistical model and a second trained statistical model, the training features of the first statistical model include word segmentation results of input information, and the training features of the second statistical model include emotional features of a context.
A specific embodiment provided by the present application is described below by taking a scenario corresponding to an e-commerce website as an example.
Referring to fig. 5, another method embodiment of an information matching method is provided in the embodiment of the present application, where the method of the embodiment includes:
S501: Acquire user evaluation information input by a buyer and merchant subscription information input by a merchant. The user evaluation information evaluates a skirt purchased by the buyer; that is, the evaluation object is the skirt.
For example, the user evaluation information is "slow response speed", and the merchant subscription information is "customer service attitude".
S502: a tree of label categories is obtained as shown in figure 2. The tag category tree in the embodiment of the present application may be modified by manual addition or the like.
S503: Acquire a first branch and a second branch from the tag category tree. The lowest-layer tag node of the first branch matches the user evaluation information, and the first branch is specifically: service, pre-sale, response speed. The lowest-layer tag node of the second branch matches the merchant subscription information, and the second branch is specifically: service, pre-sale, customer service attitude.
S503: Calculate a first matching degree at least according to the matching degrees of the first branch and the second branch at each layer.
For example, the calculation formula of the first matching degree is:
$$Tagsim = \sum_{i} w_i \cdot I(P_i)$$
where $w_i$ is the weight value of the i-th layer, $P_i$ is the matching degree of the first branch and the second branch at the i-th layer, and I is an indicator function: I equals 1 when $P_i = 100\%$ and equals 0 when $P_i \neq 100\%$.
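The definitions of the layer weight, the per-layer matching degree, and the function I can be illustrated with the branches from S503. This sketch assumes that the first matching degree is the weighted sum of the per-layer indicator values, which is one reading of those definitions. It also assumes that two tag nodes match 100% exactly when they are identical; the weights are made-up values.

```python
def indicator(p):
    """Function I: 1 when the per-layer matching degree is 100% (1.0), otherwise 0."""
    return 1 if p == 1.0 else 0


def first_matching_degree(layer_matches, weights):
    """Weighted sum of the indicator over all layers (assumed form of the formula)."""
    if len(layer_matches) != len(weights):
        raise ValueError("one weight is required per layer")
    return sum(w * indicator(p) for w, p in zip(weights, layer_matches))


first_branch = ["service", "pre-sale", "response speed"]
second_branch = ["service", "pre-sale", "customer service attitude"]

# Assumption: identical tag nodes match 100%, different ones match 0%.
layer_matches = [1.0 if a == b else 0.0 for a, b in zip(first_branch, second_branch)]
weights = [0.5, 0.3, 0.2]  # hypothetical layer weights, top layer first

tagsim = first_matching_degree(layer_matches, weights)
```

With these inputs only the two upper layers contribute, so the result is the sum of the first two weights.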
S504: and respectively obtaining the word vector of the user evaluation information and the word vector of the merchant subscription information, and calculating the matching degree of the word vectors as a second matching degree.
S505: a scene category tree as shown in fig. 4 is obtained. The scene category tree in the embodiment of the present application may be modified by manual addition or the like.
S506: Acquire the scene node matching the evaluation object, namely "skirt", from the scene category tree, and determine the root scene node corresponding to that scene node, namely "clothing".
S507: and acquiring a maximum entropy model A and a maximum entropy model B after training corresponding to the clothing. The training characteristics of the maximum entropy model A comprise word segmentation results based on bigram patterns, and the training characteristics of the maximum entropy model B comprise emotional characteristics of the context.
S508: and segmenting words of the user evaluation information based on a bigram mode, and inputting a segmentation result into the maximum entropy model A to obtain the sentiment index of the user evaluation information.
S509: and extracting the emotional characteristics of the context of the user evaluation information, and inputting the emotional characteristics of the context and the emotional index obtained in the step S508 into the maximum entropy model B to obtain the modified emotional index.
As shown in table 1, the emotional characteristics of the context include the following items:
the emotion index of the previous sentence (whether it is positive, negative, or neutral, and the corresponding strength); whether the previous sentence and the current sentence describe the same topic; the numbers of positive, negative, and neutral sentences in the preceding text; and the numbers of positive, negative, and neutral sentences among those preceding sentences that describe the same topic as the current sentence.
TABLE 1
Figure BDA0001128636660000261
S510: and calculating the matching degree of the user evaluation information and the merchant subscription information according to the modified emotion index, the first matching degree and the second matching degree.
Here, if the target emotion is negative and the emotion indicated by the modified emotion index obtained in S509 is not negative, the matching degree is 0.
If the feeling indicated by the modified emotion index obtained in S509 is negative, the matching degree is:
$$sim = \lambda_1 \cdot Vecsim + \lambda_2 \cdot Tagsim$$
where Tagsim is the first matching degree calculated in S503, Vecsim is the second matching degree calculated in S504, and $\lambda_1$ and $\lambda_2$ are the corresponding weight values.
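Step S510 can be sketched as a single function that combines the emotion check with the weighted sum of the two matching degrees. The function name, the string emotion labels, and the default weights of 0.5 are assumptions for illustration; in practice the weights would be set or learned.

```python
def final_matching_degree(vecsim, tagsim, modified_emotion,
                          target_emotion="negative", lambda1=0.5, lambda2=0.5):
    """Return 0 if the modified emotion differs from the target emotion,
    otherwise the weighted sum lambda1 * Vecsim + lambda2 * Tagsim."""
    if modified_emotion != target_emotion:
        return 0.0
    return lambda1 * vecsim + lambda2 * tagsim


# A negative evaluation is matched against the subscription with the weighted sum.
matched = final_matching_degree(vecsim=0.6, tagsim=0.8, modified_emotion="negative")
# A positive evaluation is filtered out when the target emotion is negative.
filtered = final_matching_degree(vecsim=0.6, tagsim=0.8, modified_emotion="positive")
```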
Referring to fig. 6, another embodiment of an information matching method is provided in the present application. The method of the embodiment comprises the following steps:
S601: Acquire first information and second information to be matched.
The first information and/or the second information may be words, phrases and other information input by a user. For example, the first information may be user rating information input by a buyer, and the second information may be merchant subscription information input by a merchant.
S602: Acquire a trained statistical model.
S603: Calculate the emotion index of the first information according to the statistical model.
S604: Calculate the matching degree of the first information and the second information at least according to the approximation degree between the emotion index of the first information and a target emotion index.
Optionally, the method further includes: acquiring initial matching degree of the first information and the second information; step S604 includes: and calculating the matching degree of the first information and the second information at least according to the approximation degree and the initial matching degree.
The initial matching degree may be the first matching degree in the above embodiment, that is, the matching degree of the first branch and the second branch corresponding to each layer respectively.
Optionally, calculating a matching degree of the first information and the second information according to at least the approximation degree and the initial matching degree, including:
if the approximation degree is larger than or equal to a first threshold value, calculating the matching degree of the first information and the second information at least according to the initial matching degree;
and if the approximation degree is smaller than the first threshold value, the matching degree of the first information and the second information is 0.
Optionally, obtaining the trained statistical model includes:
acquiring a category corresponding to the first information; and acquiring the trained statistical model corresponding to the category.
Optionally, the obtaining of the category corresponding to the first information includes:
obtaining a scene category tree, where the scene category tree includes at least two layers, each layer includes at least one scene node, and the parent scene node of each scene node is the parent category of that scene node;
and obtaining the scene node matching the first information from the scene category tree, determining the parent scene node one or more levels above the matched scene node, and taking that parent scene node as the category corresponding to the first information.
Optionally, the method further includes:
calculating the emotion index of the second information according to the statistical model, and taking the emotion index of the second information as the target emotion index.
For the related content of this embodiment, please refer to the related description in the embodiments shown in fig. 1, 3, and 5, which is not repeated herein.
Referring to fig. 7, an embodiment of an information input method is also provided. The method of the embodiment comprises the following steps:
S701: The client acquires the first information or the second information.
S702: The client sends the first information or the second information to a computing unit, where the computing unit is configured to calculate the matching degree of the first information and the second information.
The calculating unit may adopt any one of the embodiments of the information matching method to calculate the matching degree of the first information and the second information. For the related content of this embodiment, please refer to the related description in the embodiments shown in fig. 1, 3, and 5, which is not repeated herein.
Corresponding to the above method embodiments, the present application further provides corresponding apparatus embodiments, which are specifically described below.
Referring to fig. 8, an apparatus embodiment of an information matching apparatus is provided. The apparatus of this embodiment includes:
an information obtaining unit 801, configured to obtain merchant subscription information and user evaluation information to be matched.
The category tree obtaining unit 802 is configured to obtain a label category tree, where the label category tree includes at least two layers, each layer includes at least one label node, and a parent label node of each label node is a parent category of the label node.
A branch obtaining unit 803, configured to obtain a first branch and a second branch from the tag category tree, where a tag node at the lowest layer of the first branch matches with the content of the user evaluation information, and a tag node at the lowest layer of the second branch matches with the content of the merchant subscription information.
A matching degree calculation unit 804, configured to calculate, according to at least matching degrees respectively corresponding to each layer of the first branch and the second branch, a matching degree of the merchant subscription information and the user evaluation information.
Optionally, the matching degree calculating unit is specifically configured to calculate a first matching degree at least according to matching degrees respectively corresponding to each layer of the first branch and the second branch, and calculate a matching degree between the merchant subscription information and the user evaluation information at least according to the first matching degree.
Optionally, when the first matching degree is calculated at least according to the matching degree of the first branch and the second branch respectively corresponding to each layer, the matching degree calculating unit is specifically configured to calculate the first matching degree at least according to the matching degree of the first branch and the second branch respectively corresponding to each layer and the weight value of each layer.
Optionally, the apparatus further includes:
a model obtaining unit, configured to obtain a trained statistical model;
an emotion calculating unit, configured to calculate an emotion index of the user evaluation information according to the statistical model;
an approximation calculating unit, configured to calculate the approximation degree between the emotion index of the user evaluation information and a target emotion index;
the matching degree calculating unit is specifically configured to calculate the matching degree between the user evaluation information and the merchant subscription information at least according to the approximation degree and the matching degrees of the first branch and the second branch at each layer.
Optionally, the emotion calculating unit is further configured to calculate an emotion index of the merchant subscription information according to the statistical model, where the emotion index of the merchant subscription information is used as the target emotion index.
Optionally, when the matching degree between the user evaluation information and the merchant subscription information is calculated at least according to the approximation degree and the matching degrees of the first branch and the second branch at each layer, the matching degree calculating unit is specifically configured to:
if the approximation degree is greater than or equal to a first threshold, calculate the matching degree of the user evaluation information and the merchant subscription information according to the matching degrees of the first branch and the second branch at each layer;
and if the approximation degree is smaller than the first threshold, set the matching degree of the user evaluation information and the merchant subscription information to 0.
Optionally, the model obtaining unit is specifically configured to obtain a category corresponding to the user evaluation information, and obtain a trained statistical model corresponding to the category.
Optionally, when the category corresponding to the user evaluation information is obtained, the model obtaining unit is specifically configured to:
obtain a scene category tree, where the scene category tree includes at least two layers, each layer includes at least one scene node, and the parent scene node of each scene node is the parent category of that scene node;
and obtain the scene node matching the user evaluation information from the scene category tree, determine the parent scene node one or more levels above the matched scene node, and take that parent scene node as the category corresponding to the user evaluation information.
Optionally, the apparatus further includes: a word vector obtaining unit, configured to obtain the word vector of the user evaluation information and the word vector of the merchant subscription information;
the matching degree calculating unit is further configured to calculate the matching degree between the word vector of the user evaluation information and the word vector of the merchant subscription information as a second matching degree;
when the matching degree of the user evaluation information and the merchant subscription information is calculated at least according to the matching degrees of the first branch and the second branch at each layer, the matching degree calculating unit is specifically configured to calculate the matching degree of the user evaluation information and the merchant subscription information at least according to the matching degrees of the first branch and the second branch at each layer and the second matching degree.
Optionally, the apparatus further includes:
a correcting unit, configured to obtain the matching degrees among a plurality of label nodes in the label category tree, perform machine learning according to those matching degrees, and generate or correct the label category tree according to the machine learning result.
Referring to fig. 9, another apparatus embodiment of an information matching apparatus is provided in the present application. The apparatus of this embodiment includes:
an information obtaining unit 901, configured to obtain merchant subscription information and user evaluation information to be matched;
a model obtaining unit 902, configured to obtain a trained statistical model;
an emotion calculating unit 903, configured to calculate an emotion index of the user evaluation information according to the statistical model;
a matching degree calculating unit 904, configured to calculate a matching degree between the user evaluation information and the merchant subscription information at least according to the approximation degree between the emotion index of the user evaluation information and a target emotion index.
Optionally, the apparatus further includes:
a matching degree obtaining unit, configured to obtain an initial matching degree of the user evaluation information and the merchant subscription information;
when the matching degree between the user evaluation information and the merchant subscription information is calculated at least according to the approximation degree between the emotion index of the user evaluation information and the target emotion index, the matching degree calculating unit is specifically configured to calculate the matching degree between the user evaluation information and the merchant subscription information at least according to the approximation degree and the initial matching degree.
Optionally, when the matching degree between the user evaluation information and the merchant subscription information is calculated at least according to the approximation degree and the initial matching degree, the matching degree calculating unit is specifically configured to:
if the approximation degree is greater than or equal to a first threshold, calculate the matching degree of the user evaluation information and the merchant subscription information at least according to the initial matching degree;
and if the approximation degree is smaller than the first threshold, set the matching degree of the user evaluation information and the merchant subscription information to 0.
Optionally, the model obtaining unit is specifically configured to obtain a category corresponding to the user evaluation information, and obtain a trained statistical model corresponding to the category.
Optionally, when the category corresponding to the user evaluation information is obtained, the model obtaining unit is specifically configured to:
obtain a scene category tree, where the scene category tree includes at least two layers, each layer includes at least one scene node, and the parent scene node of each scene node is the parent category of that scene node;
and obtain the scene node matching the user evaluation information from the scene category tree, determine the parent scene node one or more levels above the matched scene node, and take that parent scene node as the category corresponding to the user evaluation information.
Optionally, the emotion calculating unit is further configured to calculate an emotion index of the merchant subscription information according to the statistical model, and use the emotion index of the merchant subscription information as the target emotion index.
Referring to fig. 10, an embodiment of an apparatus of a client is provided. The apparatus of this embodiment includes:
an information obtaining unit 1001 configured to obtain user evaluation information or merchant subscription information input by a user;
the sending unit 1002 is configured to send the user evaluation information or the merchant subscription information to a calculating unit, where the calculating unit is configured to calculate a matching degree between the user evaluation information and the merchant subscription information.
Referring to fig. 11, another apparatus embodiment of an information matching apparatus is provided in the present application. The apparatus of this embodiment includes:
an information acquisition unit 1101 configured to acquire first information and second information to be matched;
a category tree obtaining unit 1102, configured to obtain a label category tree, where the label category tree includes at least two layers, each layer includes at least one label node, and a parent label node of each label node is a parent category of the label node;
a branch obtaining unit 1103, configured to obtain a first branch and a second branch from the tag category tree, where a tag node at a lowest layer of the first branch matches with the content of the first information, and a tag node at a lowest layer of the second branch matches with the content of the second information;
a matching degree calculation unit 1104, configured to calculate the matching degree between the first information and the second information at least according to the matching degree of the first branch and the second branch at each corresponding layer.
Optionally, the matching degree calculation unit is specifically configured to calculate a first matching degree at least according to the matching degree of the first branch and the second branch at each corresponding layer, and to calculate the matching degree between the first information and the second information at least according to the first matching degree.
Optionally, when calculating the first matching degree at least according to the matching degree of the first branch and the second branch at each corresponding layer, the matching degree calculation unit is specifically configured to calculate the first matching degree at least according to the matching degree of the first branch and the second branch at each corresponding layer and the weight value of each layer.
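By way of non-limiting illustration, the layer-weighted calculation of the first matching degree may be sketched as follows. The sketch assumes that each branch is listed from the top layer down to its lowest-layer label node, that the per-layer matching degree is 1 for identical labels and 0 otherwise, and that the result is normalized by the total weight. The function names, the example labels and the weight values are hypothetical and are not taken from the embodiments.

```python
from typing import Callable, Sequence


def exact_label_match(a: str, b: str) -> float:
    """Hypothetical per-layer matching degree: 1.0 for identical labels, else 0.0."""
    return 1.0 if a == b else 0.0


def first_matching_degree(
    first_branch: Sequence[str],
    second_branch: Sequence[str],
    layer_weights: Sequence[float],
    layer_match: Callable[[str, str], float] = exact_label_match,
) -> float:
    """Weighted sum of per-layer matching degrees, divided by the total weight.

    Layers that exist in only one branch contribute a matching degree of 0.
    """
    depth = max(len(first_branch), len(second_branch))
    if len(layer_weights) < depth:
        raise ValueError("one weight is required per layer")
    total = sum(layer_weights[:depth])
    if total == 0:
        return 0.0
    shared = min(len(first_branch), len(second_branch))
    score = 0.0
    for layer in range(shared):
        score += layer_weights[layer] * layer_match(
            first_branch[layer], second_branch[layer]
        )
    return score / total


# Assumed example: the two branches agree on the top two layers only.
review_branch = ["food", "hot pot", "spicy hot pot"]
subscription_branch = ["food", "hot pot", "seafood hot pot"]
degree = first_matching_degree(review_branch, subscription_branch, [0.2, 0.3, 0.5])
```

In this sketch, giving the lower layers larger weights makes agreement on the more specific labels count for more than agreement on the broad parent categories; other weightings are equally possible.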
Optionally, the apparatus further includes:
a model acquisition unit, configured to acquire a trained statistical model;
an emotion calculating unit, configured to calculate an emotion index of the first information according to the statistical model;
an approximation calculation unit, configured to calculate an approximation degree between the emotion index of the first information and a target emotion index;
when calculating the matching degree between the first information and the second information at least according to the matching degree of the first branch and the second branch at each corresponding layer, the matching degree calculation unit is specifically configured to calculate the matching degree between the first information and the second information at least according to the approximation degree and the matching degree of the first branch and the second branch at each corresponding layer.
Optionally, the emotion calculating unit is further configured to calculate an emotion index of the second information according to the statistical model, where the emotion index of the second information is used as the target emotion index.
Optionally, when calculating the matching degree between the first information and the second information at least according to the approximation degree and the matching degree of the first branch and the second branch at each corresponding layer, the matching degree calculation unit is specifically configured to:
if the approximation degree is greater than or equal to a first threshold value, calculate the matching degree between the first information and the second information according to the matching degree of the first branch and the second branch at each corresponding layer;
if the approximation degree is smaller than the first threshold value, set the matching degree between the first information and the second information to 0.
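The threshold rule above may be sketched as follows, by way of example only. The embodiments do not fix how the approximation degree is measured; the sketch assumes emotion indices in the range [0, 1] and uses 1 minus their absolute difference. The function names and the default threshold of 0.6 are hypothetical.

```python
def approximation_degree(emotion_index: float, target_emotion_index: float) -> float:
    """Assumed measure for indices in [0, 1]: 1 minus the absolute difference."""
    return 1.0 - abs(emotion_index - target_emotion_index)


def gated_matching_degree(
    branch_matching_degree: float,
    emotion_index: float,
    target_emotion_index: float,
    first_threshold: float = 0.6,  # hypothetical value
) -> float:
    """Keep the branch-based matching degree only if the emotions are close enough."""
    if approximation_degree(emotion_index, target_emotion_index) >= first_threshold:
        return branch_matching_degree
    return 0.0


# A positive review against a subscription that targets positive reviews.
kept = gated_matching_degree(0.8, emotion_index=0.9, target_emotion_index=0.8)
# A negative review against the same subscription is filtered out.
dropped = gated_matching_degree(0.8, emotion_index=0.1, target_emotion_index=0.9)
```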
Optionally, the model acquisition unit is specifically configured to obtain a category corresponding to the first information, and to obtain a trained statistical model corresponding to the category.
Optionally, when obtaining the category corresponding to the first information, the model acquisition unit is specifically configured to:
obtain a scene category tree, where the scene category tree includes at least two layers, each layer includes at least one scene node, and the parent scene node of each scene node is the parent category of that scene node;
obtain, from the scene category tree, a scene node matching the first information, determine the parent scene node one or more levels above the matched scene node, and use that parent scene node as the category corresponding to the first information.
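As an illustrative sketch only, the lookup of a category in the scene category tree may be expressed as a walk up parent links. The tree contents, the child-to-parent dictionary representation and the function name are assumptions made for the example; how the scene node matching the first information is found is outside the scope of the sketch and the node is simply passed in.

```python
from typing import Dict, Optional

# Hypothetical scene category tree stored as child -> parent; the root maps to None.
SCENE_PARENT: Dict[str, Optional[str]] = {
    "dining": None,
    "hot pot": "dining",
    "spicy hot pot": "hot pot",
}


def category_of(scene_node: str, levels_up: int = 1) -> str:
    """Return the ancestor `levels_up` levels above the node, stopping at the root."""
    if scene_node not in SCENE_PARENT:
        raise KeyError(f"unknown scene node: {scene_node}")
    current = scene_node
    for _ in range(levels_up):
        parent = SCENE_PARENT[current]
        if parent is None:
            break  # already at the root; there is no higher category
        current = parent
    return current


one_level = category_of("spicy hot pot")       # parent category
two_levels = category_of("spicy hot pot", 2)   # grandparent category
```

The returned category could then serve as the key for selecting the trained statistical model that corresponds to it.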
Optionally, the training features of the trained statistical model include word segmentation results of input information;
the apparatus further includes: a word segmentation unit, configured to perform word segmentation on the first information to obtain a word segmentation result of the first information;
the emotion calculating unit is specifically configured to input the word segmentation result of the first information to the statistical model, so as to obtain the emotion index of the first information.
Optionally, the word segmentation result of the input information is a word segmentation result obtained by segmenting every two adjacent characters in the input information;
when performing word segmentation on the first information, the word segmentation unit is specifically configured to segment every two adjacent characters in the first information.
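Segmenting every two adjacent characters corresponds to taking overlapping character pairs. A minimal sketch is given below; the function name and the sample text are hypothetical, and the sketch treats every character, including punctuation and spaces, as a segmentable character.

```python
from typing import List


def segment_adjacent_pairs(text: str) -> List[str]:
    """Return every pair of adjacent characters, in order of appearance."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


pairs = segment_adjacent_pairs("味道很好")
```

For a four-character input such as the one above, the result contains three overlapping pairs; an input shorter than two characters yields an empty list.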
Optionally, the training features of the trained statistical model further include context emotion features;
the apparatus further includes: an emotion extraction unit, configured to extract the emotional features of the context of the first information;
when inputting the word segmentation result of the first information to the statistical model to obtain the emotion index of the first information, the emotion calculating unit is specifically configured to input the word segmentation result of the first information and the emotional features of the context of the first information to the statistical model to obtain the emotion index of the first information.
Optionally, the emotional features of the context include any one or more of the following:
the emotion index of the previous sentence, the topic similarity between the previous sentence and the current sentence, the overall emotion distribution of the preceding text, and the emotion distribution of at least one related sentence in the preceding text, where the topic similarity between the at least one related sentence and the current sentence is greater than a second threshold value.
Optionally, the trained statistical model includes a first statistical model and a second statistical model after training, the training features of the first statistical model include word segmentation results of input information, and the training features of the second statistical model include context emotion features.
Optionally, the trained statistical model is a trained maximum entropy model.
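For illustration, inference with a maximum entropy model over binary features can be written as a softmax over summed feature weights. The sketch below shows only the scoring step; training is not shown. The weight table, the label names and the choice of reading the emotion index as the probability of the "positive" label are assumptions for the example, not details from the embodiments.

```python
import math
from typing import Dict, Iterable, Tuple


def maxent_probabilities(
    features: Iterable[str],
    weights: Dict[Tuple[str, str], float],
    labels: Iterable[str],
) -> Dict[str, float]:
    """p(label | features) proportional to exp(sum of weights[(feature, label)])."""
    label_list = list(labels)
    feature_list = list(features)
    scores = {
        label: sum(weights.get((f, label), 0.0) for f in feature_list)
        for label in label_list
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {label: math.exp(score - top) for label, score in scores.items()}
    norm = sum(exps.values())
    return {label: value / norm for label, value in exps.items()}


# Hypothetical learned weight: the character pair "很好" favors the positive label.
weights = {("很好", "positive"): 2.0}
probs = maxent_probabilities(["味道", "道很", "很好"], weights, ["positive", "negative"])
emotion_index = probs["positive"]
```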
Optionally, the apparatus further includes: a word vector acquiring unit, configured to acquire a word vector of the first information and a word vector of the second information;
the matching degree calculation unit is further configured to calculate the matching degree between the word vector of the first information and the word vector of the second information as a second matching degree;
when calculating the matching degree between the first information and the second information at least according to the matching degree of the first branch and the second branch at each corresponding layer, the matching degree calculation unit is specifically configured to calculate the matching degree between the first information and the second information at least according to the second matching degree and the matching degree of the first branch and the second branch at each corresponding layer.
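One possible way to obtain and use the second matching degree is sketched below. The sketch assumes that cosine similarity is used as the matching degree between the two word vectors and that the branch-based matching degree and the second matching degree are combined by a weighted average; both choices, the function names and the default weight are hypothetical. The word vectors themselves are taken as given.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_matching_degree(
    branch_degree: float,
    second_degree: float,
    branch_weight: float = 0.5,  # hypothetical weighting
) -> float:
    """Weighted average of the branch-based degree and the word-vector degree."""
    return branch_weight * branch_degree + (1.0 - branch_weight) * second_degree


second = cosine_similarity([1.0, 0.0], [1.0, 0.0])
overall = combined_matching_degree(0.5, second)
```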
Optionally, the apparatus further includes: a correcting unit, configured to acquire the matching degree among a plurality of label nodes in the label category tree, perform machine learning according to the matching degree among the plurality of label nodes, and generate or correct the label category tree according to the machine learning result.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a logical division, and other divisions are possible in an actual implementation; for instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (35)

1. An information matching method, comprising:
acquiring merchant subscription information and user evaluation information to be matched;
acquiring a label category tree, wherein the label category tree comprises at least two layers, each layer comprises at least one label node, and a parent label node of each label node is a parent category of the label node;
acquiring a first branch and a second branch from the label category tree, wherein the label node at the lowest layer of the first branch is matched with the content of the user evaluation information, and the label node at the lowest layer of the second branch is matched with the content of the merchant subscription information;
and calculating the matching degree of the merchant subscription information and the user evaluation information at least according to the matching degree of the first branch and the second branch corresponding to each layer.
2. The method according to claim 1, wherein calculating the matching degree of the merchant subscription information and the user evaluation information at least according to the matching degree of the first branch and the second branch corresponding to each layer comprises:
calculating a first matching degree at least according to the matching degree of the first branch and the second branch corresponding to each layer;
and calculating the matching degree of the merchant subscription information and the user evaluation information at least according to the first matching degree.
3. The method of claim 2, wherein calculating a first degree of matching based on at least the degree of matching of the first branch and the second branch at each level comprises:
and calculating a first matching degree at least according to the matching degree of the first branch and the second branch respectively corresponding to each layer and the weight value of each layer.
4. The method of claim 1, further comprising:
acquiring a trained statistical model;
calculating the emotion index of the user evaluation information according to the statistical model;
calculating an approximation degree between the emotion index of the user evaluation information and a target emotion index;
wherein calculating the matching degree of the user evaluation information and the merchant subscription information according to the matching degree of the first branch and the second branch corresponding to each layer comprises:
calculating the matching degree of the user evaluation information and the merchant subscription information at least according to the approximation degree and the matching degree of the first branch and the second branch corresponding to each layer.
5. The method of claim 4, further comprising:
and calculating the emotion index of the merchant subscription information according to the statistical model, wherein the emotion index of the merchant subscription information is used as the target emotion index.
6. The method of claim 4, wherein calculating the matching degree of the user evaluation information and the merchant subscription information at least according to the approximation degree and the matching degree of the first branch and the second branch corresponding to each layer comprises:
if the approximation degree is greater than or equal to a first threshold value, calculating the matching degree of the user evaluation information and the merchant subscription information according to the matching degree of the first branch and the second branch corresponding to each layer;
and if the approximation degree is smaller than the first threshold value, the matching degree of the user evaluation information and the merchant subscription information is 0.
7. The method of claim 4, wherein obtaining the trained statistical model comprises:
acquiring a category corresponding to the user evaluation information;
and acquiring the trained statistical model corresponding to the category.
8. The method according to claim 7, wherein obtaining the category corresponding to the user evaluation information comprises:
acquiring a scene category tree, wherein the scene category tree comprises at least two layers, each layer comprises at least one scene node, and a parent scene node of each scene node is a parent category of the scene node;
and acquiring, from the scene category tree, a scene node matching the user evaluation information, determining the parent scene node one or more levels above the matched scene node, and taking that parent scene node as the category corresponding to the user evaluation information.
9. The method of claim 1, further comprising:
acquiring a word vector of the user evaluation information and a word vector of the merchant subscription information;
calculating the matching degree of the word vector of the user evaluation information and the word vector of the merchant subscription information as a second matching degree;
and calculating the matching degree of the user evaluation information and the merchant subscription information at least according to the second matching degree and the matching degree of the first branch and the second branch corresponding to each layer.
10. The method of claim 1, further comprising:
obtaining the matching degree among a plurality of label nodes in the label category tree;
and performing machine learning according to the matching degree among the label nodes, and generating or correcting the label category tree according to the result of the machine learning.
11. An information matching method, comprising:
acquiring merchant subscription information and user evaluation information to be matched;
acquiring a trained statistical model, which comprises: acquiring a category corresponding to the user evaluation information; and acquiring a trained statistical model corresponding to the category; wherein acquiring the category corresponding to the user evaluation information comprises: acquiring a scene category tree, wherein the scene category tree comprises at least two layers, each layer comprises at least one scene node, and a parent scene node of each scene node is a parent category of the scene node; and acquiring, from the scene category tree, a scene node matching the user evaluation information, determining the parent scene node one or more levels above the matched scene node, and taking that parent scene node as the category corresponding to the user evaluation information;
calculating the emotion index of the user evaluation information according to the statistical model;
calculating the matching degree of the user evaluation information and the merchant subscription information at least according to an approximation degree between the emotion index of the user evaluation information and a target emotion index, wherein the target emotion index comprises an index calculated based on the merchant subscription information.
12. The method of claim 11, further comprising:
acquiring initial matching degree of the user evaluation information and the merchant subscription information;
wherein calculating the matching degree of the user evaluation information and the merchant subscription information at least according to the approximation degree between the emotion index of the user evaluation information and the target emotion index comprises:
calculating the matching degree of the user evaluation information and the merchant subscription information at least according to the approximation degree and the initial matching degree.
13. The method of claim 12, wherein calculating the matching degree of the user evaluation information and the merchant subscription information at least according to the approximation degree and the initial matching degree comprises:
if the approximation degree is greater than or equal to a first threshold value, calculating the matching degree of the user evaluation information and the merchant subscription information at least according to the initial matching degree;
and if the approximation degree is smaller than the first threshold value, the matching degree of the user evaluation information and the merchant subscription information is 0.
14. The method of claim 11, further comprising:
calculating the emotion index of the merchant subscription information according to the statistical model, and taking the emotion index of the merchant subscription information as the target emotion index.
15. An information input method, comprising:
acquiring, by a client, user evaluation information or merchant subscription information input by a user;
sending, by the client, the user evaluation information or the merchant subscription information to a computing unit, wherein the computing unit is used for calculating the matching degree of the user evaluation information and the merchant subscription information, and the calculating comprises: acquiring merchant subscription information and user evaluation information to be matched; acquiring a label category tree, wherein the label category tree comprises at least two layers, each layer comprises at least one label node, and a parent label node of each label node is a parent category of the label node; acquiring a first branch and a second branch from the label category tree, wherein the label node at the lowest layer of the first branch is matched with the content of the user evaluation information, and the label node at the lowest layer of the second branch is matched with the content of the merchant subscription information; and calculating the matching degree of the merchant subscription information and the user evaluation information at least according to the matching degree of the first branch and the second branch corresponding to each layer.
16. An information matching method, comprising:
acquiring first information and second information to be matched;
acquiring a label category tree, wherein the label category tree comprises at least two layers, each layer comprises at least one label node, and a parent label node of each label node is a parent category of the label node;
acquiring a first branch and a second branch from the label category tree, wherein the label node at the lowest layer of the first branch is matched with the content of the first information, and the label node at the lowest layer of the second branch is matched with the content of the second information;
and calculating the matching degree of the first information and the second information at least according to the matching degree of the first branch and the second branch corresponding to each layer respectively.
17. The method of claim 16, wherein calculating the matching degree of the first information and the second information according to at least the matching degree of the first branch and the second branch corresponding to each layer respectively comprises:
calculating a first matching degree at least according to the matching degree of the first branch and the second branch corresponding to each layer;
and calculating the matching degree of the first information and the second information at least according to the first matching degree.
18. The method of claim 17, wherein calculating a first degree of matching based on at least the degree of matching of the first branch and the second branch at each level comprises:
and calculating a first matching degree at least according to the matching degree of the first branch and the second branch respectively corresponding to each layer and the weight value of each layer.
19. The method of claim 16, further comprising:
acquiring a trained statistical model;
calculating the emotion index of the first information according to the statistical model;
calculating an approximation degree between the emotion index of the first information and a target emotion index;
wherein calculating the matching degree of the first information and the second information according to the matching degree of the first branch and the second branch corresponding to each layer comprises:
calculating the matching degree of the first information and the second information according to the approximation degree and the matching degree of the first branch and the second branch corresponding to each layer.
20. The method of claim 19, further comprising:
and calculating the emotion index of the second information according to the statistical model, wherein the emotion index of the second information is used as the target emotion index.
21. The method of claim 19, wherein calculating the matching degree of the first information and the second information according to at least the matching degree and the approximation degree of the first branch and the second branch corresponding to each layer respectively comprises:
if the approximation degree is greater than or equal to a first threshold value, calculating the matching degree of the first information and the second information according to the matching degree of the first branch and the second branch corresponding to each layer;
and if the approximation degree is smaller than the first threshold value, the matching degree of the first information and the second information is 0.
22. The method of claim 19, wherein obtaining the trained statistical model comprises:
acquiring a category corresponding to the first information;
and acquiring the trained statistical model corresponding to the category.
23. The method of claim 22, wherein obtaining the category corresponding to the first information comprises:
acquiring a scene category tree, wherein the scene category tree comprises at least two layers, each layer comprises at least one scene node, and a parent scene node of each scene node is a parent category of the scene node;
and acquiring, from the scene category tree, a scene node matching the first information, determining the parent scene node one or more levels above the matched scene node, and taking that parent scene node as the category corresponding to the first information.
24. The method of claim 19, wherein the training features of the trained statistical model comprise word segmentation results of input information;
the method further comprises the following steps: performing word segmentation on the first information to obtain a word segmentation result of the first information;
calculating an emotion index of the first information according to the statistical model, including: and inputting the word segmentation result of the first information into the statistical model to obtain the emotion index of the first information.
25. The method according to claim 24, wherein the word segmentation result of the input information is a word segmentation result obtained by segmenting every two adjacent characters in the input information;
the segmenting the first information includes: and performing word segmentation on every two adjacent characters in the first information.
26. The method of claim 24, wherein the training features of the trained statistical model further comprise contextual emotional features;
the method further comprises the following steps: extracting emotional features of the context of the first information;
inputting the word segmentation result of the first information into the statistical model to obtain the emotion index of the first information, wherein the obtaining of the emotion index of the first information comprises the following steps: and inputting the word segmentation result of the first information and the emotional characteristics of the context of the first information into the statistical model to obtain the emotional index of the first information.
27. The method of claim 26, wherein the contextual emotional characteristics comprise any one or more of:
the emotion index of the previous sentence, the topic similarity between the previous sentence and the current sentence, the overall emotion distribution of the preceding text, and the emotion distribution of at least one related sentence in the preceding text, wherein the topic similarity between the at least one related sentence and the current sentence is greater than a second threshold value.
28. The method of claim 26, wherein the trained statistical model comprises a first trained statistical model and a second trained statistical model, wherein the training features of the first statistical model comprise the word segmentation results of the input information, and the training features of the second statistical model comprise emotional features of the context.
29. The method of any one of claims 19 to 28, wherein the trained statistical model is a trained maximum entropy model.
30. The method of claim 16, further comprising:
acquiring a word vector of the first information and a word vector of the second information;
calculating the matching degree of the word vector of the first information and the word vector of the second information as a second matching degree;
wherein calculating the matching degree of the first information and the second information according to the matching degree of the first branch and the second branch corresponding to each layer comprises:
calculating the matching degree of the first information and the second information at least according to the second matching degree and the matching degree of the first branch and the second branch corresponding to each layer.
31. The method of claim 16, further comprising:
obtaining the matching degree among a plurality of label nodes in the label category tree;
and performing machine learning according to the matching degree among the label nodes, and generating or correcting the label category tree according to the result of the machine learning.
32. An information matching apparatus, comprising:
the information acquisition unit is used for acquiring the merchant subscription information and the user evaluation information to be matched;
the category tree obtaining unit is used for obtaining a label category tree, wherein the label category tree comprises at least two layers, each layer comprises at least one label node, and a parent label node of each label node is a parent category of the label node;
the branch acquisition unit is used for acquiring a first branch and a second branch from the tag category tree, wherein the lowest layer of tag nodes of the first branch are matched with the content of the user evaluation information, and the lowest layer of tag nodes of the second branch are matched with the content of the merchant subscription information;
and the matching degree calculation unit is used for calculating the matching degree of the merchant subscription information and the user evaluation information at least according to the matching degree of the first branch and the second branch corresponding to each layer.
33. An information matching apparatus, comprising:
the information acquisition unit is used for acquiring the merchant subscription information and the user evaluation information to be matched;
the model acquisition unit is used for acquiring a trained statistical model, which comprises: acquiring a category corresponding to the user evaluation information; and acquiring a trained statistical model corresponding to the category; wherein acquiring the category corresponding to the user evaluation information comprises: acquiring a scene category tree, wherein the scene category tree comprises at least two layers, each layer comprises at least one scene node, and a parent scene node of each scene node is a parent category of the scene node; and acquiring, from the scene category tree, a scene node matching the user evaluation information, determining the parent scene node one or more levels above the matched scene node, and taking that parent scene node as the category corresponding to the user evaluation information;
the emotion calculating unit is used for calculating the emotion index of the user evaluation information according to the statistical model;
the matching degree calculation unit is used for calculating the matching degree of the user evaluation information and the merchant subscription information at least according to an approximation degree between the emotion index of the user evaluation information and a target emotion index, wherein the target emotion index comprises an index calculated based on the merchant subscription information.
34. A client, comprising:
the information acquisition unit is used for acquiring user evaluation information or merchant subscription information input by a user;
the sending unit is used for sending the user evaluation information or the merchant subscription information to a calculating unit, wherein the calculating unit is used for calculating the matching degree of the user evaluation information and the merchant subscription information, and the calculating comprises: acquiring merchant subscription information and user evaluation information to be matched; acquiring a label category tree, wherein the label category tree comprises at least two layers, each layer comprises at least one label node, and a parent label node of each label node is a parent category of the label node; acquiring a first branch and a second branch from the label category tree, wherein the label node at the lowest layer of the first branch is matched with the content of the user evaluation information, and the label node at the lowest layer of the second branch is matched with the content of the merchant subscription information; and calculating the matching degree of the merchant subscription information and the user evaluation information at least according to the matching degree of the first branch and the second branch corresponding to each layer.
35. An information matching apparatus, comprising:
an information acquisition unit, configured to acquire first information and second information to be matched;
a category tree acquisition unit, configured to acquire a label category tree, wherein the label category tree comprises at least two layers, each layer comprises at least one label node, and the parent label node of each label node is the parent category of that label node;
a branch acquisition unit, configured to acquire a first branch and a second branch from the label category tree, wherein the lowest-layer label node of the first branch matches the content of the first information, and the lowest-layer label node of the second branch matches the content of the second information;
and a calculating unit, configured to calculate the matching degree between the first information and the second information at least according to the matching degree between the first branch and the second branch at each corresponding layer.
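The branch-based matching recited in this claim can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the sample label category tree, the function names `branch` and `branch_matching_degree`, the exact-equality test used as the per-layer matching degree, and the per-layer weights are all hypothetical and are not specified by the claim, which only requires that the overall matching degree depend on the per-layer matching of the two branches.

```python
from typing import Dict, List, Optional

# Hypothetical label category tree stored as child -> parent.
# None marks a node in the top layer.
LABEL_PARENT: Dict[str, Optional[str]] = {
    "food": None,
    "taste": "food",
    "service": "food",
    "spicy": "taste",
    "sweet": "taste",
    "waiter": "service",
}


def branch(lowest_node: str) -> List[str]:
    """Return the branch from the top layer down to the given lowest-layer node."""
    path: List[str] = []
    node: Optional[str] = lowest_node
    while node is not None:
        path.append(node)
        node = LABEL_PARENT[node]
    return path[::-1]


def branch_matching_degree(
    first: List[str], second: List[str], weights: List[float]
) -> float:
    """Assumed scheme: weighted sum of per-layer matches.

    A layer contributes its weight when both branches reach that layer
    and carry the same label there; otherwise it contributes nothing.
    """
    score = 0.0
    for layer, weight in enumerate(weights):
        if (
            layer < len(first)
            and layer < len(second)
            and first[layer] == second[layer]
        ):
            score += weight
    return score
```

For example, `branch("spicy")` yields `["food", "taste", "spicy"]`. With assumed weights such as `[0.2, 0.3, 0.5]`, two branches that share only the top layer score lower than two that also share the middle layer, which reflects the idea that agreement at deeper layers indicates a closer match.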
CN201610887444.0A 2016-10-11 2016-10-11 Information matching method and related device Active CN107918778B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201610887444.0A CN107918778B (en) 2016-10-11 2016-10-11 Information matching method and related device
TW106127140A TW201814556A (en) 2016-10-11 2017-08-10 Information matching method and related device
PCT/CN2017/103858 WO2018068648A1 (en) 2016-10-11 2017-09-28 Information matching method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610887444.0A CN107918778B (en) 2016-10-11 2016-10-11 Information matching method and related device

Publications (2)

Publication Number Publication Date
CN107918778A CN107918778A (en) 2018-04-17
CN107918778B true CN107918778B (en) 2022-03-15

Family

ID=61891935

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610887444.0A Active CN107918778B (en) 2016-10-11 2016-10-11 Information matching method and related device

Country Status (3)

Country Link
CN (1) CN107918778B (en)
TW (1) TW201814556A (en)
WO (1) WO2018068648A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034938B (en) * 2018-06-11 2022-07-05 广东因特利信息科技股份有限公司 Information rapid screening and matching method and device, electronic equipment and storage medium
CN109062986A (en) * 2018-06-29 2018-12-21 深圳市彬讯科技有限公司 Label classification processing method and device
CN109255000B (en) * 2018-07-17 2022-10-11 土巴兔集团股份有限公司 Dimension management method and device for label data
TWI682292B (en) * 2018-08-24 2020-01-11 內秋應智能科技股份有限公司 Intelligent voice device for recursive integrated dialogue
CN109614494B (en) * 2018-12-29 2021-10-26 东软集团股份有限公司 Text classification method and related device
CN110335131B (en) * 2019-06-04 2023-12-05 创新先进技术有限公司 Financial risk control method and device based on similarity matching of trees
CN111797898B (en) * 2020-06-03 2022-03-15 武汉大学 Online comment automatic reply method based on deep semantic matching

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102326144A (en) * 2008-12-12 2012-01-18 阿迪吉欧有限责任公司 The information that the usability interest worlds are confirmed is offered suggestions
CN103886034A (en) * 2014-03-05 2014-06-25 北京百度网讯科技有限公司 Method and equipment for building indexes and matching inquiry input information of user
CN104636386A (en) * 2013-11-14 2015-05-20 华为技术有限公司 Information monitoring method and device
CN104933084A (en) * 2015-05-04 2015-09-23 上海智臻网络科技有限公司 Method, apparatus and device for acquiring answer information
CN105550269A (en) * 2015-12-10 2016-05-04 复旦大学 Product comment analyzing method and system with learning supervising function
CN105740228A (en) * 2016-01-25 2016-07-06 云南大学 Internet public opinion analysis method

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103679462B (en) * 2012-08-31 2019-01-15 阿里巴巴集团控股有限公司 Comment data processing method and apparatus, and search method and system
CN103810192A (en) * 2012-11-09 2014-05-21 腾讯科技(深圳)有限公司 User interest recommending method and device
CN103207914B (en) * 2013-04-16 2016-02-24 武汉理工大学 The preference vector evaluated based on user feedback generates method and system
US20150186790A1 (en) * 2013-12-31 2015-07-02 Soshoma Inc. Systems and Methods for Automatic Understanding of Consumer Evaluations of Product Attributes from Consumer-Generated Reviews
CN103778214B (en) * 2014-01-16 2017-08-01 北京理工大学 Item attribute clustering method based on user comments
CN105095288B (en) * 2014-05-14 2020-02-07 腾讯科技(深圳)有限公司 Data analysis method and data analysis device
CN105786838B (en) * 2014-12-22 2019-07-12 阿里巴巴集团控股有限公司 Information matching processing method and apparatus
CN105183847A (en) * 2015-09-07 2015-12-23 北京京东尚科信息技术有限公司 Feature information collecting method and device for web review data
CN105354183A (en) * 2015-10-19 2016-02-24 Tcl集团股份有限公司 Analytic method, apparatus and system for internet comments of household electrical appliance products

Also Published As

Publication number Publication date
WO2018068648A1 (en) 2018-04-19
TW201814556A (en) 2018-04-16
CN107918778A (en) 2018-04-17

Similar Documents

Publication Publication Date Title
CN107918778B (en) Information matching method and related device
CN111708950B (en) Content recommendation method and device and electronic equipment
CN110377740B (en) Emotion polarity analysis method and device, electronic equipment and storage medium
CN105389722B (en) Malicious order identification method and device
US20180336193A1 (en) Artificial Intelligence Based Method and Apparatus for Generating Article
CN105022754B (en) Object classification method and device based on social network
CN111931062A (en) Training method and related device of information recommendation model
CN106339507B (en) Streaming Media information push method and device
CN110781407A (en) User label generation method and device and computer readable storage medium
CN110874439B (en) Recommendation method based on comment information
CN110569354B (en) Barrage emotion analysis method and device
CN109992781B (en) Text feature processing method and device and storage medium
CN108228576A (en) Text interpretation method and device
CN111159409A (en) Text classification method, device, equipment and medium based on artificial intelligence
CN110895656A (en) Text similarity calculation method and device, electronic equipment and storage medium
CN113392179A (en) Text labeling method and device, electronic equipment and storage medium
CN111813993A (en) Video content expanding method and device, terminal equipment and storage medium
CN114492669B (en) Keyword recommendation model training method, recommendation device, equipment and medium
CN114780709A (en) Text matching method and device and electronic equipment
JP7181999B2 (en) SEARCH METHOD AND SEARCH DEVICE, STORAGE MEDIUM
CN110162769B (en) Text theme output method and device, storage medium and electronic device
CN113704509B (en) Multimedia recommendation method and device, electronic equipment and storage medium
CN111460808B (en) Synonymous text recognition and content recommendation method and device and electronic equipment
CN107704632A (en) Chinese label recommendation correction method based on synonyms and antonyms
CN115630639A (en) Keyword extraction method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant