CN107315731A - Text similarity computing method - Google Patents
Text similarity computing method
- Publication number
- CN107315731A CN107315731A CN201610268995.9A CN201610268995A CN107315731A CN 107315731 A CN107315731 A CN 107315731A CN 201610268995 A CN201610268995 A CN 201610268995A CN 107315731 A CN107315731 A CN 107315731A
- Authority
- CN
- China
- Prior art keywords
- text
- phrase
- computing method
- similarity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A text similarity calculation method, including: step (S1), according to preset classification topics that classify texts by user intent, creating, from history texts, an intent recognition classification model for the phrases in the history texts, the model reflecting the probability of each phrase under each classification topic; step (S2), segmenting a target text, which is an object of the similarity calculation, into target phrases corresponding to the phrases in the intent recognition classification model, and, based on the model, summing and normalizing the probabilities of the target phrases to obtain an intent classification vector of the target text, the vector reflecting the probability of the target text under each classification topic; and step (S3), computing the similarity of two target texts from their intent classification vectors using the cosine method.
Description
Technical field
The present invention relates to a text similarity calculation method, and more particularly to a text similarity calculation method that utilizes an intent recognition classification model.
Background technology
Text similarity refers to algorithms that calculate whether two pieces of text (e.g., two questions) are similar. As one of the most basic algorithms it has a wide range of applications, and it lies at the core of a series of problems such as search engines, text ranking, and related-question mining. If the pairwise similarity between texts can be calculated effectively, this whole series of problems can be solved as well.
Intent recognition refers to recognizing the intention behind an utterance or action. For example, in a question-and-answer dialogue, every sentence of the questioner carries a certain intent, and the answering party replies according to that intent. Intent recognition is widely used in scenarios such as related-question search engines and chatbots. In chatbots in particular, it is the core module of the whole system. To answer user questions, all questions are divided in advance into classification topics, each topic grouping questions by user intent (taking dialogues between a company's customer service and its users as an example, a topic is a service point, e.g., questions about returns and exchanges, or about delivery addresses). Each time a user asks a question, the question is mapped to some topic, and the answer corresponding to that topic is then given.
Machine learning is a science of artificial intelligence. The main subject of study in this field is artificial intelligence, in particular how to improve the performance of specific algorithms through learning from experience. Common machine learning methods can be divided into supervised learning, semi-supervised learning, and unsupervised learning.
Supervised learning learns a function from a given training data set; when new data arrive, the function can predict an outcome for them. The training set of supervised learning must contain inputs and outputs, which may also be called features and targets. The targets in the training set can be labeled in advance.
A topic model is a method of modeling the latent topics of text. Given a training corpus, it automatically divides the corpus into different topics and can predict which topic a new piece of text belongs to.
LR (logistic regression) is a commonly used supervised learning algorithm.
Bag of words is a document representation method.
For example, there is a dictionary:
{"John": 1, "likes": 2, "to": 3, "watch": 4, "movies": 5, "also": 6, "football": 7, "games": 8, "Mary": 9, "too": 10}
One text:
John likes to watch movies. Mary likes too.
According to this dictionary, the text can be converted into the following vector:
[1,1,1,1,1,0,0,0,1,1]
Here, 1 indicates that the corresponding dictionary word occurred in the text, and 0 indicates that it did not.
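The encoding above can be sketched in a few lines (Python is used here only for illustration; the dictionary and sentence are the ones from the example):

```python
import re

# The example dictionary from the text (word -> index, 1-based).
dictionary = {"John": 1, "likes": 2, "to": 3, "watch": 4, "movies": 5,
              "also": 6, "football": 7, "games": 8, "Mary": 9, "too": 10}

def bag_of_words(text, dictionary):
    """Return a binary presence vector ordered by the dictionary entries."""
    tokens = set(re.findall(r"[A-Za-z]+", text))
    return [1 if word in tokens else 0 for word in dictionary]

print(bag_of_words("John likes to watch movies. Mary likes too.", dictionary))
# -> [1, 1, 1, 1, 1, 0, 0, 0, 1, 1]
```

This reproduces the vector shown above: each position is 1 exactly when the corresponding dictionary word appears in the text.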
Many methods for calculating text similarity already exist, for example, converting the texts into term vectors and computing the cosine of the angle between them, or a series of algorithms such as BM25 (BM stands for Best Matching, an optimal matching criterion) and LCS (Longest Common Subsequence).
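As an illustration of one of these literal baselines, here is a minimal dynamic-programming sketch of LCS (the text only names the algorithm; this particular implementation is not part of the patent):

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b,
    via the classic O(len(a)*len(b)) dynamic-programming table."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
    return dp[len(a)][len(b)]

print(lcs_length("ABCBDAB", "BDCABA"))  # -> 4 (e.g. the subsequence "BCBA")
```

Note that LCS scores only surface-level overlap, which is exactly the limitation the next paragraphs discuss.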
However, existing text similarity algorithms can often only reflect similarity in some particular aspect, and they are essentially strongly tied to the literal surface form of the text. On the one hand, when two texts match on a core word and when they match only on common stop words, the algorithms give the same similarity and cannot distinguish the two cases. On the other hand, if two texts contain synonyms, they may express the same meaning yet receive a very low similarity because their surface forms differ. As for general topic models, each topic is generated by automatic clustering, so, on the one hand, the generated topics are often incomprehensible to humans, and, on the other hand, unrelated questions can be clustered into the same topic, so the results rarely meet expectations. In addition, in practice multiple similarity algorithms usually have to be fused together, and even then the results are hard to make satisfactory.
Summary of the invention
In view of the problem that the prior art is essentially strongly correlated with the literal surface of the text and cannot truly judge text similarity from the semantic level, the object of the present invention is to provide a text similarity calculation method with higher accuracy and better results that completely avoids the prior-art drawback of calculating similarity from the literal surface.
A text similarity calculation method according to one aspect of the present invention includes: step (S1), according to preset classification topics that classify texts by user intent, creating, from history texts, an intent recognition classification model for the phrases in the history texts, the model reflecting the probability of each phrase under each classification topic; step (S2), segmenting a target text, which is an object of the similarity calculation, into target phrases corresponding to the phrases in the intent recognition classification model, and, based on the model, summing and normalizing the probabilities of the target phrases to obtain an intent classification vector of the target text, the vector reflecting the probability of the target text under each classification topic; and step (S3), computing the similarity of two target texts from their intent classification vectors using the cosine method.
In the text similarity calculation method according to one aspect of the present invention, the formula of the cosine method is:

cos θ = Σ_{i=1}^{n} (A_i × B_i) / ( √(Σ_{i=1}^{n} A_i²) × √(Σ_{i=1}^{n} B_i²) )

where cos θ denotes the similarity, i indexes the classification topics of the intent classification vector and takes positive integer values from 1 to n, A denotes the first target text, B denotes the second target text, and A_i and B_i denote the probabilities of the first and second target texts, respectively, under classification topic i.
In the text similarity calculation method according to one aspect of the present invention, the intent recognition classification model is created using the bag-of-words method combined with the logistic regression algorithm.
In the text similarity calculation method according to one aspect of the present invention, the classification topics are service points of dialogues between customer service and users.
In the text similarity calculation method according to one aspect of the present invention, the history texts are texts from history consultation logs of dialogues between customer service and users.
In the text similarity calculation method according to one aspect of the present invention, the phrases are a subset of phrases filtered out of the history texts as needed.
In the text similarity calculation method according to one aspect of the present invention, the number of classification topics is the dimension of the intent classification vector.
In the text similarity calculation method according to one aspect of the present invention, the probabilities are the numerical components of the intent classification vector.
In summary, the above technical solution realizes a text similarity calculation method with higher accuracy and better results, avoiding the prior-art drawback of calculating similarity entirely from the literal surface of the text.
Brief description of the drawings
Fig. 1 is the general block diagram of the text similarity calculation method of the present invention.
Fig. 2 is a flow chart of step S1, creating the intent recognition classification model, of the text similarity calculation method of the present invention.
Fig. 3 is a flow chart of step S2, obtaining the intent classification vector of a target text, of the text similarity calculation method of the present invention.
Embodiments
The present invention is a text similarity calculation method that makes use of an intent recognition classification model. According to classification topics divided in advance, the intent recognition classification model can map a text to the corresponding classification topics and thereby obtain information about its semantic level. Text similarity is then calculated on this basis, yielding better results.
To make the object, technical solutions, and advantages of the present invention clearer, the present invention is described in detail below with reference to specific embodiments and the accompanying drawings.
Fig. 1 is the general block diagram of the text similarity calculation method of the present invention. As shown in Fig. 1, the method includes: step S1 of creating the intent recognition classification model; step S2 of obtaining the intent classification vector of a target text; and step S3 of calculating the similarity.
Fig. 2 is a flow chart of step S1, creating the intent recognition classification model, of the text similarity calculation method of the present invention.
As shown in Fig. 2, in step S1 of creating the intent recognition classification model, first, the classification topics that classify by user intent are preset (step S1-1). Taking dialogues between a company's customer service and its users as an example, a classification topic is a service point, and each user question (text) corresponds to one of these service points. For example, assume here that three classification topics are defined: "about freight", "about returns and exchanges", and "about delivery addresses".
Next, history texts are obtained (in the customer-service example, the texts from the history consultation logs), and each history text is segmented into words to determine the modeling phrases (step S1-2). That is, taking the bag-of-words method described above as an example, each text can be cut into phrases corresponding one-to-one to the dictionary entries of the bag of words, and these are used as the modeling phrases. Here, not all phrases need to serve as modeling phrases; a subset of actually useful phrases can be filtered out as needed and used as the modeling phrases.
Then, for each determined modeling phrase, according to the preset classification topics, the intent recognition classification model for each phrase is created using known algorithms (for example, converting each text into a vector using the bag-of-words method, and then performing model training using LR, i.e., logistic regression) (step S1-3).
Here, the output of the intent recognition classification model is a vector (also called a topic vector). The dimension of the vector equals the number of classification topics defined above (in this example, 3). The value of each dimension represents the probability that the text or phrase belongs to the corresponding classification topic; the larger the probability, the more likely the text or phrase belongs to that topic, and the values across all dimensions sum to 1.
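The model-creation step just described can be sketched end-to-end. The patent trains a logistic-regression classifier over bag-of-words features; as a self-contained stand-in, the sketch below estimates each phrase's per-topic probabilities by simple frequency counting over labeled history texts, which produces a table of the same shape as Table 1 (the topic names and example texts are invented for illustration):

```python
from collections import Counter, defaultdict

TOPICS = ["freight", "returns", "address"]   # hypothetical topic names

# (tokenized text, topic label) pairs standing in for the history log.
history = [
    (["freight", "who", "pays"], "freight"),
    (["free", "shipping", "freight"], "freight"),
    (["return", "the", "item"], "returns"),
    (["where", "deliver"], "address"),
]

def phrase_topic_model(history, topics):
    """For each phrase, estimate a probability distribution over topics
    by counting how often the phrase appears in texts of each topic."""
    counts = defaultdict(Counter)
    for tokens, label in history:
        for tok in set(tokens):
            counts[tok][label] += 1
    model = {}
    for tok, c in counts.items():
        total = sum(c.values())
        model[tok] = [c[t] / total for t in topics]   # each row sums to 1
    return model

model = phrase_topic_model(history, TOPICS)
print(model["freight"])   # -> [1.0, 0.0, 0.0]
```

The output rows play the same role as the rows of Table 1: one probability distribution over topics per phrase.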
The following Table 1 shows an example of the intent recognition classification model created for phrases. (Table 1 is only an illustration; the values are not actual values. Moreover, the intent recognition classification model is an existing kind of machine learning algorithm; there is more than one such algorithm, and different algorithms have different logic.)
【Table 1】

| Phrase | About freight | About returns and exchanges | About delivery addresses |
|---|---|---|---|
| thing | 0.33 | 0.33 | 0.33 |
| delivery | 0.45 | 0.10 | 0.45 |
| free shipping | 0.80 | 0.10 | 0.10 |
| where | 0.15 | 0.05 | 0.80 |
| freight | 0.80 | 0.10 | 0.10 |
| ... | ... | ... | ... |
Fig. 3 is a flow chart of step S2, obtaining the intent classification vector of a target text, of the text similarity calculation method of the present invention.
As shown in Fig. 3, in step S2 of obtaining the intent classification vector of a target text, first, a target text, i.e., an object of the similarity assessment, is obtained (step S2-1).
Then, using the intent recognition classification model created above, the intent classification vector of the target text is obtained (step S2-2). Specifically, the input of the intent recognition classification model is the target text, and the output is a vector (also called a topic vector) whose dimension equals the number of classification topics defined above (in this example, 3). The value of each dimension represents the probability that the text belongs to the corresponding classification topic; the larger the probability, the more likely the text belongs to that topic, and the values across all dimensions sum to 1.
For example, suppose the target text is "who pays the freight for the item sent out". Word segmentation according to the bag-of-words method cuts it into "thing", "delivery", "freight", and "who pays". Then, according to the per-phrase intent recognition classification model of Table 1, the intent classification vector of the target text, i.e., the probability of the target text under each classification topic, is obtained by summing and normalizing. A concrete calculation (the sum-and-normalize algorithm) runs as follows.
First step, calculate the (unnormalized) probability of the text belonging to each topic:
probability of belonging to classification topic 1 (e.g., "about freight"): P1 = 0.33 + 0.45 + 0.80;
probability of belonging to classification topic 2 (e.g., "about returns and exchanges"): P2 = 0.33 + 0.10 + 0.10;
...
probability of belonging to classification topic n: Pn = xxx + xxx + xxx;
Second step, normalize each probability:
final probability of topic 1 = P1 / (P1 + P2 + ... + Pn);
final probability of topic 2 = P2 / (P1 + P2 + ... + Pn);
...
final probability of topic n = Pn / (P1 + P2 + ... + Pn);
Here, again, this is only an example; the values are not actual values, and this is not the only possible algorithm.
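Taking the illustrative Table 1 values at face value, the sum-and-normalize step can be sketched as follows (the handling of out-of-vocabulary phrases, which are simply skipped, is an assumption of this sketch, consistent with only three phrases contributing to the sums above):

```python
table1 = {                      # phrase -> [freight, returns, address]
    "thing":    [0.33, 0.33, 0.33],
    "delivery": [0.45, 0.10, 0.45],
    "freight":  [0.80, 0.10, 0.10],
}

def intent_vector(phrases, model):
    """Sum per-topic probabilities over known phrases, then normalize to 1."""
    n = len(next(iter(model.values())))
    sums = [0.0] * n
    for p in phrases:
        if p in model:                       # unknown phrases are skipped
            sums = [s + v for s, v in zip(sums, model[p])]
    total = sum(sums)
    return [s / total for s in sums] if total else sums

# "who pays" is not in the table, so only three phrases contribute,
# exactly as in the worked example above (P1 = 0.33 + 0.45 + 0.80, ...).
vec = intent_vector(["thing", "delivery", "freight", "who pays"], table1)
print([round(v, 3) for v in vec])   # -> [0.528, 0.177, 0.294]
```

The result is the intent classification vector of the target text: its entries sum to 1 and give the (illustrative) probability of the text under each topic.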
Then, it is judged whether all target texts to be assessed for similarity have been obtained. If not ("No"), the process returns to step S2-1 to obtain the next target text; if so ("Yes"), the process proceeds to step S3.
The following Table 2 shows an example of the intent classification vectors of the obtained target texts. (Table 2 is also only an illustration; the values are not actual values and do not exactly match Table 1.)
【Table 2】
In step S3 of calculating the similarity, the similarity of two texts is computed according to the following cosine formula (formula 1):

cos θ = Σ_{i=1}^{n} (A_i × B_i) / ( √(Σ_{i=1}^{n} A_i²) × √(Σ_{i=1}^{n} B_i²) )   (formula 1)

where cos θ denotes the similarity, i indexes the dimensions of the vectors, i.e., the classification topics, taking positive integer values from 1 to n (in this example, n = 3), A denotes the first target text, B denotes the second target text, and A_i and B_i denote the probability values of the first and second target texts, respectively, under classification topic i.
Then, according to Table 2, the similarity computed by the above formula between "who pays the freight for the item sent out" and "free shipping on merchandise" is 0.9967, while the similarity between "who pays the freight for the item sent out" and "where is the item delivered from" is 0.0819.
It can be seen that when two texts express the same intent, their closeness is well reflected in a high similarity. Conversely, when their intents are far apart, the text similarity is low. Moreover, the similarity no longer tracks the surface form: texts are not more similar merely because their wording is closer.
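Formula 1 above can be sketched directly; the three vectors here are invented placeholders standing in for rows of Table 2 (whose actual values are not reproduced in this text), not the patent's own numbers:

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (formula 1)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

q1 = [0.53, 0.18, 0.29]   # hypothetical intent vector, e.g. "who pays the freight"
q2 = [0.80, 0.10, 0.10]   # hypothetical intent vector, e.g. "is shipping free"
q3 = [0.10, 0.10, 0.80]   # hypothetical intent vector, e.g. "where is it delivered"

print(round(cosine(q1, q2), 4))   # similar intents -> value near 1
print(round(cosine(q2, q3), 4))   # different intents -> noticeably lower
```

Because the intent vectors are non-negative and sum to 1, texts dominated by the same topic score near 1 and texts dominated by different topics score much lower, mirroring the 0.9967 versus 0.0819 contrast above.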
Thus, the similarity obtained by the above method of the present invention rests on an understanding of the text at the semantic level and operates at a higher level of abstraction than common similarity calculation methods. It does not simply derive similarity from whether the surface forms of the texts agree, but judges from the true intent of the texts whether the two express the same meaning. Compared with common literal similarity algorithms, it avoids the drawback, mentioned above, of calculating similarity entirely from the surface form. Compared with general topic models, the intent recognition classification model has a higher accuracy rate and better results.
The specific embodiments described above further explain the object, technical solutions, and beneficial effects of the present invention. It should be understood that the foregoing are only specific examples of the present invention and are not intended to limit it. Any modification, equivalent substitution, improvement, etc. made within the spirit and principles of the present invention shall be included in the scope of protection of the present invention.
Claims (8)
1. A text similarity calculation method, comprising:
step (S1), according to preset classification topics that classify by user intent, creating, from history texts, an intent recognition classification model for the phrases in the history texts, the intent recognition classification model reflecting the probability of each phrase under each classification topic;
step (S2), segmenting a target text, which is an object of the similarity calculation, into target phrases corresponding to the phrases in the above intent recognition classification model, and, based on the intent recognition classification model, summing and normalizing the probabilities of the target phrases to obtain an intent classification vector of the target text, the intent classification vector reflecting the probability of the target text under each classification topic; and
step (S3), computing the similarity of two target texts from their intent classification vectors using the cosine method.
2. The text similarity calculation method according to claim 1, characterized in that the formula of the cosine method is:

cos θ = Σ_{i=1}^{n} (A_i × B_i) / ( √(Σ_{i=1}^{n} A_i²) × √(Σ_{i=1}^{n} B_i²) )

where cos θ denotes the similarity, i indexes the classification topics of the intent classification vector and takes positive integer values from 1 to n, A denotes the first target text, B denotes the second target text, and A_i and B_i denote the probabilities of the first and second target texts, respectively, under classification topic i.
3. The text similarity calculation method according to claim 1, characterized in that the intent recognition classification model is created using the bag-of-words method combined with the logistic regression algorithm.
4. The text similarity calculation method according to claim 1, characterized in that the classification topics are service points of dialogues between customer service and users.
5. The text similarity calculation method according to claim 1, characterized in that the history texts are texts from history consultation logs of dialogues between customer service and users.
6. The text similarity calculation method according to claim 1, characterized in that the phrases are a subset of phrases filtered out of the history texts as needed.
7. The text similarity calculation method according to claim 1, characterized in that the number of classification topics is the dimension of the intent classification vector.
8. The text similarity calculation method according to claim 1, characterized in that the probabilities are the numerical components of the intent classification vector.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610268995.9A CN107315731A (en) | 2016-04-27 | 2016-04-27 | Text similarity computing method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107315731A true CN107315731A (en) | 2017-11-03 |
Family
ID=60184590
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610268995.9A Pending CN107315731A (en) | 2016-04-27 | 2016-04-27 | Text similarity computing method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107315731A (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108334891A (en) * | 2017-12-15 | 2018-07-27 | 北京奇艺世纪科技有限公司 | A kind of Task intent classifier method and device |
CN108388914A (en) * | 2018-02-26 | 2018-08-10 | 中译语通科技股份有限公司 | A kind of grader construction method, grader based on semantic computation |
CN109284486A (en) * | 2018-08-14 | 2019-01-29 | 重庆邂智科技有限公司 | Text similarity measure, device, terminal and storage medium |
CN109344857A (en) * | 2018-08-14 | 2019-02-15 | 重庆邂智科技有限公司 | Text similarity measurement method and device, terminal and storage medium |
CN109635105A (en) * | 2018-10-29 | 2019-04-16 | 厦门快商通信息技术有限公司 | A kind of more intension recognizing methods of Chinese text and system |
CN110019715A (en) * | 2017-12-08 | 2019-07-16 | 阿里巴巴集团控股有限公司 | Response determines method, apparatus, equipment, medium and system |
CN111373391A (en) * | 2017-11-29 | 2020-07-03 | 三菱电机株式会社 | Language processing device, language processing system, and language processing method |
CN111428010A (en) * | 2019-01-10 | 2020-07-17 | 北京京东尚科信息技术有限公司 | Man-machine intelligent question and answer method and device |
CN112527985A (en) * | 2020-12-04 | 2021-03-19 | 杭州远传新业科技有限公司 | Unknown problem processing method, device, equipment and medium |
CN115187153A (en) * | 2022-09-14 | 2022-10-14 | 杭银消费金融股份有限公司 | Data processing method and system applied to business risk tracing |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1755687A (en) * | 2004-09-30 | 2006-04-05 | 微软公司 | Forming intent-based clusters and employing same by search engine |
CN101621391A (en) * | 2009-08-07 | 2010-01-06 | 北京百问百答网络技术有限公司 | Method and system for classifying short texts based on probability topic |
CN102662987A (en) * | 2012-03-14 | 2012-09-12 | 华侨大学 | Classification method of web text semantic based on Baidu Baike |
CN102681983A (en) * | 2011-03-07 | 2012-09-19 | 北京百度网讯科技有限公司 | Alignment method and device for text data |
CN102880723A (en) * | 2012-10-22 | 2013-01-16 | 深圳市宜搜科技发展有限公司 | Searching method and system for identifying user retrieval intention |
CN103823844A (en) * | 2014-01-26 | 2014-05-28 | 北京邮电大学 | Question forwarding system and question forwarding method on the basis of subjective and objective context and in community question-and-answer service |
CN104050256A (en) * | 2014-06-13 | 2014-09-17 | 西安蒜泥电子科技有限责任公司 | Initiative study-based questioning and answering method and questioning and answering system adopting initiative study-based questioning and answering method |
CN104408153A (en) * | 2014-12-03 | 2015-03-11 | 中国科学院自动化研究所 | Short text hash learning method based on multi-granularity topic models |
CN104516986A (en) * | 2015-01-16 | 2015-04-15 | 青岛理工大学 | Method and device for recognizing sentence |
CN104731958A (en) * | 2015-04-03 | 2015-06-24 | 北京航空航天大学 | User-demand-oriented cloud manufacturing service recommendation method |
CN104951433A (en) * | 2015-06-24 | 2015-09-30 | 北京京东尚科信息技术有限公司 | Method and system for intention recognition based on context |
CN105653738A (en) * | 2016-03-01 | 2016-06-08 | 北京百度网讯科技有限公司 | Search result broadcasting method and device based on artificial intelligence |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107315731A (en) | Text similarity computing method | |
CN109493166B (en) | Construction method for task type dialogue system aiming at e-commerce shopping guide scene | |
CN104951433B (en) | The method and system of intention assessment is carried out based on context | |
CN104111933B (en) | Obtain business object label, set up the method and device of training pattern | |
CN104820629B (en) | A kind of intelligent public sentiment accident emergent treatment system and method | |
CN103150333B (en) | Opinion leader identification method in microblog media | |
CN109189904A (en) | Individuation search method and system | |
CN110147445A (en) | Intension recognizing method, device, equipment and storage medium based on text classification | |
CN109767318A (en) | Loan product recommended method, device, equipment and storage medium | |
CN107977415A (en) | Automatic question-answering method and device | |
CN103116588A (en) | Method and system for personalized recommendation | |
CN106126751A (en) | A kind of sorting technique with time availability and device | |
CN105844424A (en) | Product quality problem discovery and risk assessment method based on network comments | |
CN102193936A (en) | Data classification method and device | |
CN105022754A (en) | Social network based object classification method and apparatus | |
Seret et al. | A new SOM-based method for profile generation: Theory and an application in direct marketing | |
CN105787025A (en) | Network platform public account classifying method and device | |
CN109766557A (en) | A kind of sentiment analysis method, apparatus, storage medium and terminal device | |
CN104750674A (en) | Man-machine conversation satisfaction degree prediction method and system | |
CN106844407A (en) | Label network production method and system based on data set correlation | |
CN112215629B (en) | Multi-target advertisement generating system and method based on construction countermeasure sample | |
CN104572915A (en) | User event relevance calculation method based on content environment enhancement | |
CN110134866A (en) | Information recommendation method and device | |
Catapang et al. | A bilingual chatbot using support vector classifier on an automatic corpus engine dataset | |
CN114036289A (en) | Intention identification method, device, equipment and medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | |
Application publication date: 2017-11-03