CN108763258B - Document theme parameter extraction method, product recommendation method, device and storage medium - Google Patents

Document theme parameter extraction method, product recommendation method, device and storage medium

Info

Publication number
CN108763258B
CN108763258B
Authority
CN
China
Prior art keywords
product
theme
target
topic
topics
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810287788.7A
Other languages
Chinese (zh)
Other versions
CN108763258A (en)
Inventor
王义文
王健宗
肖京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201810287788.7A priority Critical patent/CN108763258B/en
Priority to PCT/CN2018/100312 priority patent/WO2019192122A1/en
Publication of CN108763258A publication Critical patent/CN108763258A/en
Application granted granted Critical
Publication of CN108763258B publication Critical patent/CN108763258B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 30/00 Commerce
    • G06Q 30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q 30/0241 Advertisements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 30/00 Commerce
    • G06Q 30/06 Buying, selling or leasing transactions
    • G06Q 30/0601 Electronic shopping [e-shopping]
    • G06Q 30/0631 Item recommendations

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a document theme parameter extraction method in which a trained relevant topic model, obtained by training on a document training set, yields the distribution of a target document on the topics, the relationship distribution between any two of a plurality of topics, and the distribution between products and topics. The invention also provides a product recommendation method, comprising: acquiring an input product description, and processing the product description to obtain the distribution of the product description on the topics, the relationships between topics in the relevant topic model, and the probability distribution between products and topics. The invention also provides an electronic device and a storage medium. The invention avoids finding only products with similar content and improves accuracy, thereby achieving more accurate product recommendation.

Description

Document theme parameter extraction method, product recommendation method, equipment and storage medium
Technical Field
The invention relates to the field of artificial intelligence, in particular to a document theme parameter extraction method, a product recommendation method, equipment and a storage medium.
Background
The rapid development of the internet has generated massive amounts of information and has gradually made big data an inevitable trend in information technology, so valuable data must be extracted from this information quickly and effectively. In current product recommendation, however, products containing given keywords are found from a large number of products according to similarity of content or keywords and are recommended to the user, while products that are not similar to the user's description in content but are related to it in topic are missed. For example, the keyword 'health' is not similar in content to the keyword 'gene' but is related to it in topic; with the prior art, inputting the keyword 'health' cannot find products related to 'gene', which reduces recommendation accuracy.
Disclosure of Invention
In view of the above, it is necessary to provide a document theme parameter extraction method, a product recommendation method, and an electronic device that avoid finding only products with similar content and improve accuracy, thereby achieving more accurate product recommendation.
A document theme parameter extraction method, the method comprising:
preprocessing a target document to obtain a word set of the target document;
and inputting the word set of the target document into a trained related topic model CTM to obtain the distribution of the target document on the topic, the relationship distribution between any two topics in a plurality of topics and the distribution between products and topics, wherein the trained related topic model is obtained by training based on a document sample set, and the trained related topic model comprises a plurality of topics.
According to the preferred embodiment of the present invention, the preprocessing the target document to obtain the word set of the target document includes:
removing special words in the target document to obtain a processed document;
and performing word segmentation on the processed document to obtain a tuple set.
According to a preferred embodiment of the invention, the method further comprises:
in the tuple set, removing high-frequency tuples whose occurrence frequency in the text corpus ranks within a preset top number and low-frequency tuples whose occurrence frequency is lower than a preset number of times, and determining the processed tuple set as the word set of the target document.
A method of product recommendation, the method comprising:
acquiring an input product description, and taking the acquired product description as a target document;
processing the product description by using the document theme parameter extraction method in any embodiment to obtain the distribution of the product description on the theme, the relation between themes in the relevant theme model and the probability distribution between the product and the theme;
recommending target products related to the topics of the product description to the user based on the distribution of the product description on the topics, the relation between the topics in the relevant topic model and the probability distribution between the products and the topics.
According to the preferred embodiment of the present invention, the recommending of the target product associated with the topics of the product description to the user, based on the distribution of the product description on the topics, the relationships between topics in the relevant topic model, and the probability distribution between products and topics, comprises one or more of the following combinations:
obtaining at least one target topic contained in the product description based on the distribution of the product description on the topics, determining, according to the relationships between topics in the relevant topic model, the topic most strongly associated with each of the at least one target topic, and, according to the probability distribution between products and topics in the relevant topic model, determining the products in which the determined topic ranks within a preset top number by proportion as a part of the target product;
obtaining the topic with the highest proportion in the product description based on the distribution of the product description on the topics, determining, according to the relationships between topics in the relevant topic model, the target topic most strongly associated with that highest-proportion topic, and, according to the probability distribution between products and topics in the relevant topic model, determining the products in which the target topic ranks within a preset top number by proportion as a part of the target product;
and obtaining at least one target topic contained in the product description based on the distribution of the product description on the topics, determining, according to the probability distribution between products and topics in the relevant topic model, the products containing the at least one target topic, and taking the determined products as a part of the target product.
According to a preferred embodiment of the present invention, the recommending of a target product associated with the topics of the product description to the user, based on the distribution of the product description on the topics, the relationships between topics in the relevant topic model, and the probability distribution between products and topics, further comprises:
obtaining at least one target topic contained in the product description based on the distribution of the product description on the topics, determining a first topic associated with the at least one target topic according to the relationships between topics in the relevant topic model, then determining a second topic associated only with the first topic, and, according to the probability distribution between products and topics in the relevant topic model, determining the products in which the second topic ranks within a preset top number by proportion as a part of the target product.
According to a preferred embodiment of the invention, the method further comprises: displaying the product categories associated with the topics in the product description, and displaying the recommendation scheme for each product category.
According to a preferred embodiment of the invention, the method further comprises: the method comprises the steps of obtaining a product selected by a user according to a recommended target product, determining a theme contained in the selected product, and taking a product with the theme contained in the selected product in a preset digit number as a part of the target product.
An electronic device, comprising a memory and a processor, wherein the memory is used for storing at least one instruction, and the processor is used for executing the at least one instruction to realize the document theme parameter extraction method in any embodiment and/or the product recommendation method in any embodiment.
A computer-readable storage medium storing at least one instruction which, when executed by a processor, implements the document theme parameter extraction method of any of the embodiments, and/or the product recommendation method of any of the embodiments.
According to the above technical scheme, the document theme parameter extraction method provided by the invention obtains, from the trained relevant topic model obtained by training on the document training set, the distribution of the target document on the topics, the relationship distribution between any two of a plurality of topics, and the distribution between products and topics. The product recommendation method acquires an input product description and processes it to obtain the distribution of the product description on the topics, the relationships between topics in the relevant topic model, and the probability distribution between products and topics. Based on the relevant topic model, the method and the device can find products that are dissimilar in content but related in topic, and thus recommend products that are closely related in topic; this avoids finding only products with similar content, improves accuracy, and achieves more accurate product recommendation.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings described below are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a first preferred embodiment of the document theme parameter extraction method of the present invention.
FIG. 2 is a flowchart illustrating a product recommendation method according to a first preferred embodiment of the present invention.
FIG. 3 is a block diagram of a first preferred embodiment of the document theme parameter extraction apparatus according to the present invention.
FIG. 4 is a block diagram of a product recommendation device according to a first preferred embodiment of the present invention.
FIG. 5 is a block diagram of a preferred embodiment of an electronic device in accordance with at least one embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
The terms "first," "second," and "third," etc. in the description and claims of the present invention and the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "comprises" and any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
FIG. 1 is a flow chart of the document theme parameter extraction method according to the first preferred embodiment of the present invention. The order of the steps in the flow chart may be changed and some steps may be omitted according to different needs.
S10, the electronic equipment preprocesses the target document to obtain a word set of the target document.
Preferably, the preprocessing the target document to obtain the word set of the target document includes:
(1) And removing the special words in the target document to obtain a processed document.
Further, the special words include website links, user name marks, special characters, place name marks, punctuation marks, and the like.
(2) And performing word segmentation on the processed document to obtain a tuple set.
Word segmentation is performed on the processed document by extracting n-grams (n is a positive integer, for example n < 4). For example, Chinese text corpora can be segmented with the Chinese Lexical Analysis System (ICTCLAS) tool. For text corpora whose words are separated by spaces (e.g. English), segmentation can be done directly on the spaces; for languages such as Chinese and Japanese, which do not use spaces to separate words, a segmentation tool is required.
Further, unigrams, bigrams and trigrams (one-, two- and three-element tuples) are extracted from the text corpus to form the tuple set.
Preferably, after obtaining the tuple set, the method further includes: in the tuple set, removing high-frequency tuples (i.e. high-frequency words) whose occurrence frequency in the text corpus ranks within a preset top number (e.g. the top 50) and low-frequency tuples (i.e. low-frequency words) whose occurrence frequency is lower than a preset number of times (e.g. 3 times), and determining the processed tuple set as the word set of the target document.
In an alternative embodiment, considering the linguistic characteristics of words, a certain proportion of high-frequency tuples (usually stop words and the like) and low-frequency tuples (usually person names, non-words and the like) are removed, and only the remaining medium-frequency tuples are taken as candidate words for the emotion dictionary. High-frequency tuples are usually stop words, which co-occur with many different words and therefore express emotional characteristics weakly; low-frequency tuples are typically non-words, usernames and the like, which carry no linguistic meaning and therefore need to be removed. In this way, the medium-frequency tuples with an intermediate number of occurrences are used as part of the candidate words.
In other implementations, after word segmentation is performed with a word segmentation technique, the candidate word set is generated by combining n-grams, so that n-grams that cannot form valid words can be removed. The word segmentation technique itself is prior art and is not limited by the present invention. This can improve dictionary accuracy, and the processing does not hinder the effectiveness of the overall flow.
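To make the preprocessing flow above concrete, the following is a minimal Python sketch (not part of the patent): the regular expressions, the toy corpus frequency table, and cut-off values such as the top 50 and 3 occurrences are illustrative assumptions only.

```python
# Illustrative preprocessing sketch: special-word removal, n-gram extraction,
# and frequency filtering as described above. All thresholds are examples.
import re
from collections import Counter

def preprocess(document, corpus_counts, top_k=50, min_count=3, max_n=3):
    # Remove special "words": URLs, @user mentions, punctuation and the like.
    text = re.sub(r"https?://\S+|@\S+", " ", document.lower())
    text = re.sub(r"[^\w\s]", " ", text)
    tokens = text.split()  # space-separated text; Chinese would need a segmenter such as ICTCLAS or jieba

    # Build unigrams, bigrams and trigrams (n < 4).
    ngrams = []
    for n in range(1, max_n + 1):
        ngrams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    # Drop high-frequency tuples (the top_k most frequent in the corpus, typically stop words)
    # and low-frequency tuples (fewer than min_count corpus occurrences).
    high_freq = {w for w, _ in corpus_counts.most_common(top_k)}
    return [g for g in ngrams
            if g not in high_freq and corpus_counts.get(g, 0) >= min_count]

# Toy usage with an invented corpus frequency table.
corpus_counts = Counter({"the": 900, "high": 120, "yield": 80, "annualized return": 5})
print(preprocess("High yield product with an annualized return above 5%",
                 corpus_counts, top_k=1))
# -> ['high', 'yield', 'annualized return']
```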
S11, the electronic equipment inputs the word set of the target document into a trained related Topic Model CTM (Correlated Topic Model) to obtain the distribution of the target document on the Topic, the relationship distribution between any two topics in a plurality of topics and the distribution between products and topics, wherein the trained related Topic Model is obtained by training based on a document sample set, and the trained related Topic Model comprises a plurality of topics.
In the present invention, the Correlated Topic Model (CTM) models topic proportions with a logistic-normal distribution whose covariance matrix captures the associations between topics, and is used to find the topic distribution of documents and the associations between topics.
The related topic model is a generative probability model that can automatically extract implicit semantic topics from discrete data sets, where a topic refers to content that frequently co-occurs in the data set. The related topic model describes the relationships among all variables through a probabilistic graphical model, and the probability distributions related to the topics are calculated through sampling or variational inference methods.
The relevant topic model can automatically discover the topics implicit in a document collection, where a topic is a probability distribution over words. It provides a convenient tool for unsupervised analysis of documents and for prediction on new documents. Its basic idea is that a document is a random mixture of several topics, and each topic is a multinomial distribution over words. In a document set, topics are probability distributions over the vocabulary of the corpus; if a corpus has K topics, these K topics occupy different proportions in each document. Therefore, by training the relevant topic model on the document set, the distribution among the plurality of topics and the distribution relationship between products and topics can be obtained.
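To illustrate the logistic-normal mechanism just described, the short sketch below (a simplified illustration, not the patent's implementation) draws per-document topic proportions from a multivariate normal with a full covariance matrix and maps them through a softmax; the number of topics and the covariance values are purely illustrative.

```python
# Simplified illustration of the CTM assumption: topic proportions come from a
# multivariate normal draw (logistic-normal), so the covariance matrix lets
# topics be positively or negatively correlated with each other.
import numpy as np

K = 3                                    # number of topics (illustrative)
mu = np.zeros(K)
Sigma = np.array([[1.0, 0.8, 0.1],       # topics 0 and 1 strongly associated
                  [0.8, 1.0, 0.1],
                  [0.1, 0.1, 1.0]])

rng = np.random.default_rng(0)
eta = rng.multivariate_normal(mu, Sigma)         # logistic-normal draw
theta = np.exp(eta) / np.exp(eta).sum()          # per-document topic proportions
print("topic proportions:", theta.round(3))
```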
Preferably, the process of training the relevant topic model is as follows:
(a1) Acquiring a document sample set, and dividing it into a training set and a test set. For example, 70% of the document samples are used as the training set and 30% as the test set.
(a2) Configuring the optimal number of topics for the training set.
The optimal number of topics is used to represent the number of topics in the relevant topic model.
(a3) Modeling the documents in the training set with the relevant topic model, based on the training set and the optimal number of topics, to obtain the parameters of the relevant topic model.
(a4) Inputting the word sets corresponding to the document samples in the test set into the trained relevant topic model to obtain the topic representation of each document in the test set.
(a5) Evaluating the accuracy of the trained relevant topic model; if the accuracy is lower than a preset accuracy, for example 99%, increasing the samples in the training set and/or adjusting the optimal number of topics step by step, and repeating the training steps until the accuracy of the trained relevant topic model is greater than or equal to the preset accuracy, for example 99%.
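A hedged sketch of steps (a1)-(a5) is given below. It assumes the third-party tomotopy package, whose CTModel class implements a correlated topic model; the method names follow that package's API as understood here, and the accuracy measure is a placeholder because the description does not fix a concrete metric.

```python
# Sketch of steps (a1)-(a5), assuming the tomotopy package (pip install tomotopy).
import numpy as np
import tomotopy as tp

def train_ctm(train_docs, num_topics, iterations=500):
    # (a2)/(a3): configure the optimal topic number and fit the correlated topic model.
    mdl = tp.CTModel(k=num_topics)
    for words in train_docs:          # each document is a pre-processed word set
        mdl.add_doc(words)
    mdl.train(iterations)
    return mdl

def topic_representations(mdl, test_docs):
    # (a4): topic representation of each document in the test set.
    dists = []
    for words in test_docs:
        doc = mdl.make_doc(words)
        dist, _ = mdl.infer(doc)
        dists.append(dist)
    return np.array(dists)

def evaluate(mdl, test_docs):
    # (a5): placeholder "accuracy"; here, how concentrated the inferred topic mixtures are.
    reps = topic_representations(mdl, test_docs)
    return float(reps.max(axis=1).mean())

# (a1)/(a5): split the samples 70/30, then enlarge the training set and/or adjust
# the topic number step by step until evaluate(...) reaches the preset threshold.
```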
According to the above, a trained relevant topic model is obtained by training on the document training set, and the distribution of the target document on the topics, the relationship distribution between any two of the plurality of topics, and the distribution between products and topics are obtained. The invention can therefore extract the topic parameter information of a document, so that the relevance among document topic parameters can be used subsequently to recommend topic-related products to the user.
FIG. 2 is a flowchart illustrating a product recommendation method according to a first preferred embodiment of the present invention. The order of the steps in the flow chart may be changed and some steps may be omitted according to different needs.
S20, the electronic equipment acquires the input product description and takes the acquired product description as a target document.
In alternative embodiments, the product description includes, but is not limited to, combinations of one or more of the following: words, phrases, sections, etc. The form of the product description comprises one or more of a voice form and a text form.
Preferably, the products include, but are not limited to: financial products, online purchased goods, and the like.
For example, the financial products of a bank are classified into a plurality of modules, such as a high-yield module, a flexible-access module, a one-month fixed-term module, and the like. When purchasing a financial product, the user can input a description of the financial product he or she wants to buy, for example by voice, so as to find financial products whose topics are similar to the input product description.
S21, the electronic equipment processes the product description to obtain the distribution of the product description on the topics, the relationships between topics in the relevant topic model, and the probability distribution between products and topics.
In a preferred embodiment, the electronic device processes the product description by using the document theme parameter extraction method.
In an alternative embodiment, the training samples for training the relevant topic model include product descriptions of individual products. A product description is taken as a sample of a document. The relevant topic model is trained using the method of the first preferred embodiment.
Further, the distribution of the product description on the topics is used to represent the proportion of each topic contained in the product description. For example, the product description contains three topics, topic A, topic B and topic C, each accounting for a certain proportion of the description.
Further, the relationships between topics are used to represent the degree of association between any two topics in the relevant topic model. For example, with three topics, the association between topic A and topic B is 0.2, the association between topic A and topic C is 0.8, the association between topic B and topic C is 0.4, and so on.
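The three quantities discussed above can be held in simple in-memory structures, as in the illustrative sketch below; the topic names and the description proportions are invented for the example, while the pairwise association values 0.2, 0.8 and 0.4 are taken from the example above.

```python
# Illustrative data structures for the three model outputs described above.
import numpy as np

# Distribution of the product description on the topics (proportions sum to 1).
description_topic_dist = {"topic_A": 0.5, "topic_B": 0.3, "topic_C": 0.2}

# Degree of association between any two topics in the relevant topic model.
topics = ["topic_A", "topic_B", "topic_C"]
topic_relation = np.array([[1.0, 0.2, 0.8],    # A-B: 0.2, A-C: 0.8
                           [0.2, 1.0, 0.4],    # B-C: 0.4
                           [0.8, 0.4, 1.0]])

# Probability distribution between products and topics: one row per product.
product_topic = {
    "financial_product_A": np.array([0.6, 0.1, 0.3]),
    "financial_product_C": np.array([0.7, 0.1, 0.2]),
}
```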
S22, the electronic equipment recommends target products associated with the topics of the product description to the user based on the distribution of the product description on the topics, the relationships between topics in the relevant topic model, and the probability distribution between products and topics.
Preferably, the recommending of target products associated with the topics of the product description to the user, based on the distribution of the product description on the topics, the relationships between topics in the relevant topic model, and the probability distribution between products and topics, comprises one or more of the following combinations (an illustrative code sketch of these combinations is given after the examples below):
(1) Obtaining at least one target topic contained in the product description based on the distribution of the product description on the topics, determining, according to the relationships between topics in the relevant topic model, the topic most strongly associated with each of the at least one target topic, and, according to the probability distribution between products and topics in the relevant topic model, determining the products in which the determined topic ranks within a preset top number by proportion as a part of the target products.
For example, the financial product description input by the user contains two topics, high yield and short term. The topic most strongly associated with the high-yield topic is an annualized return above 5%, and the topic most strongly associated with the short-term topic is withdrawable at any time. The annualized-return-above-5% topic has the highest proportion in financial product A and financial product C, and the short-term topic has the highest proportion in financial product A and financial product D, so financial product A, financial product C and financial product D are the target products. In this way, for each topic in the product description, the products most strongly associated with that topic can be recommended to the user, achieving personalized product recommendation.
(2) Obtaining the topic with the highest proportion in the product description based on the distribution of the product description on the topics, determining, according to the relationships between topics in the relevant topic model, the target topic most strongly associated with that highest-proportion topic, and, according to the probability distribution between products and topics in the relevant topic model, determining the products in which the target topic ranks within a preset top number by proportion as a part of the target products.
For example, the financial product description input by the user contains two topics, high yield and short term, of which high yield has the highest proportion, and the topic most strongly associated with the high-yield topic is an annualized return above 5%. The annualized-return-above-5% topic has the highest proportion in financial product A and financial product C, so financial product A and financial product C are the target products.
(3) Obtaining at least one target topic contained in the product description based on the distribution of the product description on the topics, determining, according to the probability distribution between products and topics in the relevant topic model, the products containing the at least one target topic, and taking the determined products as a part of the target products.
(4) Obtaining at least one target topic contained in the product description based on the distribution of the product description on the topics, determining a first topic associated with the at least one target topic according to the relationships between topics in the relevant topic model, then determining a second topic associated only with the first topic, and, according to the probability distribution between products and topics in the relevant topic model, determining the products in which the second topic ranks within a preset top number by proportion as a part of the target products. In this way, indirect relationships among topics are exploited, so that indirectly but strongly associated topics are found and personalized products are recommended to the user.
For example, the product description contains topic A; in the relevant topic model, topic C is associated with topic A, and topic D is associated only with topic C, which indicates that topic D is strongly associated with topic C, so the products in which topic D ranks within the preset top number by proportion are taken as a part of the target products.
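The sketch referred to above outlines strategies (1)-(4) in terms of the illustrative structures from the earlier example; top_n stands for the 'preset number' of top-ranked products, and all function and variable names are assumptions of this sketch rather than part of the patent.

```python
# Sketch of the recommendation strategies; structures are the illustrative
# numpy arrays / dicts shown earlier (topics, topic_relation, product_topic).

def top_products_for_topic(product_topic, topics, topic, top_n=2):
    # Products in which the given topic ranks within the top_n by proportion.
    idx = topics.index(topic)
    ranked = sorted(product_topic.items(), key=lambda kv: kv[1][idx], reverse=True)
    return [name for name, _ in ranked[:top_n]]

def most_associated(topic_relation, topics, topic):
    # The topic with the highest degree of association with the given topic.
    idx = topics.index(topic)
    row = topic_relation[idx].copy()
    row[idx] = -1.0                  # ignore self-association
    return topics[int(row.argmax())]

def recommend(description_topic_dist, topic_relation, topics, product_topic, top_n=2):
    targets = [t for t, p in description_topic_dist.items() if p > 0]
    results = set()
    # Strategy (1): for every target topic, find its most associated topic and take
    # the products in which that topic ranks within the top_n by proportion.
    for t in targets:
        assoc = most_associated(topic_relation, topics, t)
        results.update(top_products_for_topic(product_topic, topics, assoc, top_n))
    # Strategy (2): use only the highest-proportion topic of the description.
    main_topic = max(description_topic_dist, key=description_topic_dist.get)
    assoc = most_associated(topic_relation, topics, main_topic)
    results.update(top_products_for_topic(product_topic, topics, assoc, top_n))
    # Strategies (3) and (4) follow the same pattern: (3) keeps products containing
    # any target topic; (4) walks one more hop, to a topic associated only with the
    # first associated topic, before ranking products.
    return results

# Usage: recommend(description_topic_dist, topic_relation, topics, product_topic)
```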
Preferably, the products associated with the topics in the product description are displayed by category, and the recommendation scheme for each category of products is displayed, for example the category of products most associated with topic A, the category of products most associated with topic C, and so on, so that the user can intuitively see the products associated with the topics of interest and can conveniently make a personalized selection according to the recommended product scheme.
Preferably, the method further comprises: the method comprises the steps of obtaining a product selected by a user according to a recommended target product, determining a theme contained in the selected product, and taking a product with the theme contained in the selected product in a preset digit as a part of the target product. Therefore, the product which is interested by the user can be recommended, the requirements of the user can be better met, and the personalized recommendation of the product is realized.
With the method and the device, products that are dissimilar in content but related in topic can be found based on the relevant topic model, so that products closely related in topic are recommended; this avoids finding only products with similar content, improves accuracy, and achieves more accurate product recommendation.
Through the above embodiments, the invention provides a document theme parameter extraction method that obtains, from a relevant topic model trained on a document training set, the distribution of the target document on the topics, the relationship distribution between any two of a plurality of topics, and the distribution between products and topics. An input product description is acquired and processed to obtain the distribution of the product description on the topics, the relationships between topics in the relevant topic model, and the probability distribution between products and topics. Based on the relevant topic model, products that are dissimilar in content but related in topic can be found, so that products closely related in topic are recommended; this avoids finding only products with similar content, improves accuracy, and achieves more accurate product recommendation.
Referring to FIG. 3, a program block diagram of a first preferred embodiment of the document theme parameter extraction apparatus of the invention is shown. The document theme parameter extraction apparatus 3 includes, but is not limited to, one or more of the following modules: a preprocessing module 30, a calculation module 31 and a training module 32. A module referred to in the present invention is a series of computer program segments that can be executed by the processor of the document theme parameter extraction apparatus 3 and that perform a fixed function, and that are stored in the memory. The functions of the modules are described in detail in the following embodiments.
The preprocessing module 30 preprocesses the target document to obtain a word set of the target document.
Preferably, the preprocessing performed by the preprocessing module 30 on the target document to obtain the word set of the target document includes:
(1) And removing the special words in the target document to obtain a processed document.
Further, the special words include website links, user name marks, special characters, place name marks, punctuation marks, and the like.
(2) And performing word segmentation on the processed document to obtain a tuple set.
Word segmentation is performed on the processed document by extracting n-grams (n is a positive integer, for example n < 4). For example, Chinese text corpora can be segmented with the Chinese Lexical Analysis System (ICTCLAS) tool. For text corpora whose words are separated by spaces (e.g. English), segmentation can be done directly on the spaces; for languages such as Chinese and Japanese, which do not use spaces to separate words, a segmentation tool is required.
Further, unigrams, bigrams and trigrams (one-, two- and three-element tuples) are extracted from the text corpus to form the tuple set.
Preferably, after obtaining the tuple set, the preprocessing module 30 is further specifically configured to: in the tuple set, remove high-frequency tuples (i.e. high-frequency words) whose occurrence frequency in the text corpus ranks within a preset top number (e.g. the top 50) and low-frequency tuples (i.e. low-frequency words) whose occurrence frequency is lower than a preset number of times (e.g. 3 times), and determine the processed tuple set as the word set of the target document.
In an alternative embodiment, considering the linguistic characteristics of words, a certain proportion of high-frequency tuples (usually stop words and the like) and low-frequency tuples (usually person names, non-words and the like) are removed, and only the remaining medium-frequency tuples are taken as candidate words for the emotion dictionary. High-frequency tuples are usually stop words, which co-occur with many different words and therefore express emotional characteristics weakly; low-frequency tuples are typically non-words, usernames and the like, which carry no linguistic meaning and therefore need to be removed. In this way, the medium-frequency tuples with an intermediate number of occurrences are used as part of the candidate words.
In other implementations, after word segmentation is performed with a word segmentation technique, the candidate word set is generated by combining n-grams, so that n-grams that cannot form valid words can be removed. The word segmentation technique itself is prior art and is not limited by the present invention. This can improve dictionary accuracy, and the processing does not hinder the effectiveness of the overall flow.
The calculation module 31 inputs the word set of the target document into a trained related Topic Model CTM (Correlated Topic Model) to obtain the distribution of the target document on the Topic, the relationship distribution between any two topics in the multiple topics, and the distribution between the product and the Topic, where the trained related Topic Model is obtained by training based on the document sample set, and the trained related Topic Model includes multiple topics.
In the present invention, the Correlated Topic Model (CTM) models topic proportions with a logistic-normal distribution whose covariance matrix captures the associations between topics, and is used to find the topic distribution of documents and the associations between topics.
The related topic model is a generative probability model that can automatically extract implicit semantic topics from discrete data sets, where a topic refers to content that frequently co-occurs in the data set. The related topic model describes the relationships among all variables through a probabilistic graphical model, and the probability distributions related to the topics are calculated through sampling or variational inference methods.
The relevant topic model can automatically discover the topics implicit in a document collection, where a topic is a probability distribution over words. It provides a convenient tool for unsupervised analysis of documents and for prediction on new documents. Its basic idea is that a document is a random mixture of several topics, and each topic is a multinomial distribution over words. In a document set, topics are probability distributions over the vocabulary of the corpus; if a corpus has K topics, these K topics occupy different proportions in each document. Therefore, by training the relevant topic model on the document set, the distribution among the plurality of topics and the distribution relationship between products and topics can be obtained.
Preferably, the process of training the relevant topic model by the training module 32 is as follows:
(a1) Acquiring a document sample set, and dividing it into a training set and a test set. For example, 70% of the document samples are used as the training set and 30% as the test set.
(a2) Configuring the optimal number of topics for the training set.
The optimal number of topics is used to represent the number of topics in the relevant topic model.
(a3) Modeling the documents in the training set with the relevant topic model, based on the training set and the optimal number of topics, to obtain the parameters of the relevant topic model.
(a4) Inputting the word sets corresponding to the document samples in the test set into the trained relevant topic model to obtain the topic representation of each document in the test set.
(a5) Evaluating the accuracy of the trained relevant topic model; if the accuracy is lower than a preset accuracy, for example 99%, increasing the samples in the training set and/or adjusting the optimal number of topics step by step, and repeating the training steps until the accuracy of the trained relevant topic model is greater than or equal to the preset accuracy, for example 99%.
Referring to FIG. 4, a program block diagram of a first preferred embodiment of the product recommendation device of the present invention is shown. The product recommendation device 4 includes, but is not limited to, one or more of the following modules: an acquisition module 40, a data calculation module 41, a recommendation module 42 and a display module 43. A module referred to in the present invention is a series of computer program segments, stored in the memory, that can be executed by the processor of the product recommendation device 4 and that perform a fixed function. The functions of the modules are described in detail in the following embodiments.
The obtaining module 40 obtains the input product description, and takes the obtained product description as a target document.
In alternative embodiments, the product description includes, but is not limited to, combinations of one or more of the following: words, phrases, sections, etc. The form of the product description comprises one or more of a voice form and a text form.
Preferably, the products include, but are not limited to: financial products, online purchased goods, and the like.
For example, the financial products of a bank are classified into a plurality of modules, such as a high-yield module, a flexible-access module, a one-month fixed-term module, and the like. When purchasing a financial product, the user can input a description of the financial product he or she wants to buy, for example by voice, so as to find financial products whose topics are similar to the input product description.
The data calculation module 41 processes the product description to obtain the distribution of the product description on the topics, the relationship between the topics in the relevant topic model, and the probability distribution between the products and the topics.
In a preferred embodiment, the electronic device processes the product description by using the document theme parameter extraction method.
In an alternative embodiment, the training samples for training the relevant topic model include product descriptions of individual products. A product description is taken as a sample of a document. The relevant topic model is trained using the method of the first preferred embodiment.
Further, the distribution of the product description on the topics is used to represent the proportion of each topic contained in the product description. For example, the product description contains three topics, topic A, topic B and topic C, each accounting for a certain proportion of the description.
Further, the relationships between topics are used to represent the degree of association between any two topics in the relevant topic model. For example, with three topics, the association between topic A and topic B is 0.2, the association between topic A and topic C is 0.8, the association between topic B and topic C is 0.4, and so on.
The recommendation module 42 recommends target products associated with the topics of the product description to the user based on the distribution of the product description on the topics, the relationships between topics in the relevant topic model, and the probability distribution between products and topics.
Preferably, the recommendation module 42 recommending target products associated with the topics of the product description to the user, based on the distribution of the product description on the topics, the relationships between topics in the relevant topic model, and the probability distribution between products and topics, includes one or more of the following combinations:
(1) Obtaining at least one target topic contained in the product description based on the distribution of the product description on the topics, determining, according to the relationships between topics in the relevant topic model, the topic most strongly associated with each of the at least one target topic, and, according to the probability distribution between products and topics in the relevant topic model, determining the products in which the determined topic ranks within a preset top number by proportion as a part of the target products.
For example, the financial product description input by the user contains two topics, high yield and short term. The topic most strongly associated with the high-yield topic is an annualized return above 5%, and the topic most strongly associated with the short-term topic is withdrawable at any time. The annualized-return-above-5% topic has the highest proportion in financial product A and financial product C, and the short-term topic has the highest proportion in financial product A and financial product D, so financial product A, financial product C and financial product D are the target products. In this way, for each topic in the product description, the products most strongly associated with that topic can be recommended to the user, achieving personalized product recommendation.
(2) Obtaining the topic with the highest proportion in the product description based on the distribution of the product description on the topics, determining, according to the relationships between topics in the relevant topic model, the target topic most strongly associated with that highest-proportion topic, and, according to the probability distribution between products and topics in the relevant topic model, determining the products in which the target topic ranks within a preset top number by proportion as a part of the target products.
For example, the financial product description input by the user contains two topics, high yield and short term, of which high yield has the highest proportion, and the topic most strongly associated with the high-yield topic is an annualized return above 5%. The annualized-return-above-5% topic has the highest proportion in financial product A and financial product C, so financial product A and financial product C are the target products.
(3) Obtaining at least one target topic contained in the product description based on the distribution of the product description on the topics, determining, according to the probability distribution between products and topics in the relevant topic model, the products containing the at least one target topic, and taking the determined products as a part of the target products.
(4) Obtaining at least one target topic contained in the product description based on the distribution of the product description on the topics, determining a first topic associated with the at least one target topic according to the relationships between topics in the relevant topic model, then determining a second topic associated only with the first topic, and, according to the probability distribution between products and topics in the relevant topic model, determining the products in which the second topic ranks within a preset top number by proportion as a part of the target products. In this way, indirect relationships among topics are exploited, so that indirectly but strongly associated topics are found and personalized products are recommended to the user.
For example, the product description contains topic A; in the relevant topic model, topic C is associated with topic A, and topic D is associated only with topic C, which indicates that topic D is strongly associated with topic C, so the products in which topic D ranks within the preset top number by proportion are taken as a part of the target products.
Preferably, the display module 43 displays the products associated with the topics in the product description by category and displays the recommendation scheme for each category of products, for example the category of products most associated with topic A, the category of products most associated with topic C, and so on, so that the user can intuitively see the products associated with the topics of interest and can conveniently make a personalized selection according to the recommended product scheme.
Preferably, the recommendation module 42 is further configured to: acquire the product selected by the user from the recommended target products, determine the topics contained in the selected product, and take the products in which those topics rank within a preset top number by proportion as a part of the target product. In this way, recommendation is combined with the products the user is interested in, the user's needs are better met, and personalized product recommendation is achieved.
With the method and the device, products that are dissimilar in content but related in topic can be found based on the relevant topic model, so that products closely related in topic are recommended; this avoids finding only products with similar content, improves accuracy, and achieves more accurate product recommendation.
Through the above embodiments, the invention provides a document theme parameter extraction method that obtains, from a relevant topic model trained on a document training set, the distribution of the target document on the topics, the relationship distribution between any two of a plurality of topics, and the distribution between products and topics. An input product description is acquired and processed to obtain the distribution of the product description on the topics, the relationships between topics in the relevant topic model, and the probability distribution between products and topics. Based on the relevant topic model in the embodiments, products that are dissimilar in content but related in topic can be found, so that products closely related in topic are recommended; this avoids finding only products with similar content, improves accuracy, and achieves more accurate product recommendation.
The integrated unit implemented in the form of a software program module may be stored in a computer-readable storage medium. The software program module is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to execute some steps of the method according to each embodiment of the present invention.
As shown in fig. 5, the electronic device 5 comprises at least one transmitting means 51, at least one memory 52, at least one processor 53, at least one receiving means 54 and at least one communication bus. Wherein the communication bus is used for realizing connection communication among the components.
The electronic device 5 is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like. The electronic device 5 may also comprise a network device and/or a user device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a cloud based on cloud computing and consisting of a large number of hosts or network servers, where cloud computing is a kind of distributed computing: a super virtual computer composed of a group of loosely coupled computers.
The electronic device 5 may be, but not limited to, any electronic product that can perform human-computer interaction with a user through a keyboard, a touch pad, a voice control device, or the like, for example, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), an intelligent wearable device, a camera device, a monitoring device, or other terminals.
The Network where the electronic device 5 is located includes, but is not limited to, the internet, a wide area Network, a metropolitan area Network, a local area Network, a Virtual Private Network (VPN), and the like.
The receiving device 54 and the transmitting device 51 may be wired transmitting ports, or may be wireless devices, for example, including antenna devices, for performing data communication with other devices.
The memory 52 is used to store program code. The memory 52 may be a circuit with a storage function but no independent physical form within an integrated circuit, such as a RAM (Random-Access Memory) or a FIFO (First In First Out) buffer. Alternatively, the memory 52 may be a memory with a physical form, such as a memory module, a TF card (Trans-flash Card), a smart media card, a secure digital card, a flash memory card, and so on.
The processor 53 may include one or more microprocessors or digital processors. The processor 53 may call program code stored in the memory 52 to perform the associated functions. For example, the modules described in FIG. 3 are program code stored in the memory 52 and executed by the processor 53 to implement a document theme parameter extraction method; and/or the modules described in FIG. 4 are program code stored in the memory 52 and executed by the processor 53 to implement a product recommendation method. The processor 53, also called a Central Processing Unit (CPU), is a very-large-scale integrated circuit serving as the operation core and control unit of the device.
Embodiments of the present invention also provide a computer-readable storage medium, on which computer instructions are stored, and when the instructions are executed by an electronic device including one or more processors, the instructions cause the electronic device to perform the document theme parameter extraction method and/or the product recommendation method according to the above method embodiments.
As shown in fig. 1, the memory 52 of the electronic device 5 stores a plurality of instructions to implement a document theme parameter extraction method, and the processor 53 can execute the plurality of instructions to implement:
preprocessing a target document to obtain a word set of the target document; and inputting the word set of the target document into a trained related topic model CTM to obtain the distribution of the target document on the topic, the relationship distribution between any two topics in a plurality of topics and the distribution between products and topics, wherein the trained related topic model is obtained by training based on a document sample set, and comprises a plurality of topics.
In an alternative embodiment of the present invention, the processor 53 may execute the plurality of instructions further comprising:
removing special words in the target document to obtain a processed document;
and performing word segmentation on the processed document to obtain a tuple set.
In an alternative embodiment of the present invention, the processor 53 may execute the plurality of instructions further comprising:
in the tuple set, removing high-frequency tuples whose occurrence frequency in the text corpus ranks within a preset top number and low-frequency tuples whose occurrence frequency is lower than a preset number of times, and determining the processed tuple set as the word set of the target document.
In any embodiment, a plurality of instructions corresponding to the document theme parameter extraction method are stored in the memory 52 and executed by the processor 53, which will not be described in detail herein.
As shown in fig. 2, the memory 52 of the electronic device 5 stores a plurality of instructions to implement a product recommendation method, and the processor 53 can execute the plurality of instructions to implement:
acquiring an input product description, and taking the acquired product description as a target document; processing the product description by using the document theme parameter extraction method in any embodiment to obtain the distribution of the product description on the theme, the relation between the themes in the relevant theme model and the probability distribution between the product and the theme; and recommending target products related to the topics of the product description to the user based on the distribution of the product description on the topics, the relation between the topics in the relevant topic model and the probability distribution between the products and the topics.
In an alternative embodiment of the present invention, the processor 53 may execute the plurality of instructions further comprising:
obtaining at least one target topic contained in the product description based on the distribution of the product description over topics, determining, for each target topic of the at least one target topic, the topic with the highest degree of association with that target topic according to the relationships between topics in the correlated topic model, and, according to the probability distribution between products and topics in the correlated topic model, determining products in which the determined topic's proportion ranks within a preset top number as a part of the target products (this option is sketched in the example that follows the third option below);
obtaining the topic with the highest proportion in the product description based on the distribution of the product description over topics, determining the target topic with the highest degree of association with that highest-proportion topic according to the relationships between topics in the correlated topic model, and, according to the probability distribution between products and topics in the correlated topic model, determining products in which the target topic's proportion ranks within a preset top number as a part of the target products;
obtaining at least one target topic contained in the product description based on the distribution of the product description over topics, determining products containing the at least one target topic according to the probability distribution between products and topics in the correlated topic model, and taking the determined products as a part of the target products.
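Under the same assumed arrays as in the earlier sketch, the following illustrates the first of the three options (the second and third differ only in which topic is used to rank products); the 0.15 presence threshold and the top-2 cut-off stand in for the preset values and are not taken from the patent.

```python
import numpy as np

theta_query = np.array([0.7, 0.2, 0.1])
topic_corr = np.array([[1.0, 0.6, 0.1],
                       [0.6, 1.0, 0.3],
                       [0.1, 0.3, 1.0]])
phi_products = np.array([[0.8, 0.1, 0.1],
                         [0.2, 0.7, 0.1],
                         [0.1, 0.2, 0.7],
                         [0.5, 0.4, 0.1]])
product_ids = ["P1", "P2", "P3", "P4"]

target_products = set()
target_topics = np.where(theta_query > 0.15)[0]   # target topics in the description
for t in target_topics:
    corr = topic_corr[t].copy()
    corr[t] = -np.inf                              # exclude the topic itself
    best = int(np.argmax(corr))                    # most associated topic
    top = np.argsort(phi_products[:, best])[::-1][:2]  # preset top number = 2
    target_products.update(product_ids[i] for i in top)
print(sorted(target_products))
```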
In an alternative embodiment of the present invention, the plurality of instructions executed by the processor 53 further comprise:
obtaining at least one target topic contained in the product description based on the distribution of the product description over topics, determining a first topic associated with the at least one target topic according to the relationships between topics in the correlated topic model, then determining a second topic associated only with the first topic, and, according to the probability distribution between products and topics in the correlated topic model, determining products in which the second topic's proportion ranks within a preset top number as a part of the target products.
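A sketch of this two-step topic chain with the same assumed arrays; "associated" is approximated here as "most strongly correlated", which is an interpretation rather than the patent's definition.

```python
import numpy as np

theta_query = np.array([0.7, 0.2, 0.1])
topic_corr = np.array([[1.0, 0.6, 0.1],
                       [0.6, 1.0, 0.3],
                       [0.1, 0.3, 1.0]])
phi_products = np.array([[0.8, 0.1, 0.1],
                         [0.2, 0.7, 0.1],
                         [0.1, 0.2, 0.7],
                         [0.5, 0.4, 0.1]])
product_ids = ["P1", "P2", "P3", "P4"]

target_topic = int(np.argmax(theta_query))         # a target topic of the description

def most_associated(topic, exclude):
    corr = topic_corr[topic].copy()
    corr[list(exclude)] = -np.inf
    return int(np.argmax(corr))

first_topic = most_associated(target_topic, {target_topic})
second_topic = most_associated(first_topic, {first_topic, target_topic})
top_products = np.argsort(phi_products[:, second_topic])[::-1][:2]
print([product_ids[i] for i in top_products])
```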
In an alternative embodiment of the present invention, the plurality of instructions executed by the processor 53 further comprise: displaying the product categories associated with the topics in the product description, and displaying the manner in which each product category is recommended.
In an alternative embodiment of the present invention, the plurality of instructions executed by the processor 53 further comprise: acquiring a product selected by the user from the recommended target products, determining the topics contained in the selected product, and taking products in which those topics' proportions rank within a preset top number as a part of the target products.
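A sketch of this feedback step with the same assumed arrays; the 0.3 threshold used to read off the selected product's topics and the top-2 cut-off are illustrative values only.

```python
import numpy as np

phi_products = np.array([[0.8, 0.1, 0.1],
                         [0.2, 0.7, 0.1],
                         [0.1, 0.2, 0.7],
                         [0.5, 0.4, 0.1]])
product_ids = ["P1", "P2", "P3", "P4"]

selected = product_ids.index("P2")                 # product chosen by the user
selected_topics = np.where(phi_products[selected] > 0.3)[0]  # topics it contains

extra = set()
for t in selected_topics:
    ranked = np.argsort(phi_products[:, t])[::-1][:2]        # preset top number = 2
    extra.update(product_ids[i] for i in ranked if i != selected)
print(sorted(extra))                                # additional target products
```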
The above-described characteristic means of the present invention may also be implemented by an integrated circuit that controls the electronic device to implement the functions of the document theme parameter extraction method described in any of the above embodiments. That is, the integrated circuit according to the present invention is mounted in the electronic device and causes the electronic device to perform the following functions: preprocessing a target document to obtain a word set of the target document; and inputting the word set of the target document into a trained correlated topic model (CTM) to obtain the distribution of the target document over topics, the relationship distribution between any two of a plurality of topics, and the distribution between products and topics, wherein the trained correlated topic model is obtained by training on a document sample set and comprises a plurality of topics.
By mounting the integrated circuit of the present invention in the electronic device, the electronic device can perform the functions that can be realized by the document theme parameter extraction method of any of the above embodiments; details are not repeated here.
The above-described characteristic means of the present invention may also be implemented by an integrated circuit that controls the electronic device to implement the functions of the product recommendation method described in any of the above embodiments. That is, the integrated circuit according to the present invention is mounted in the electronic device and causes the electronic device to perform the following functions: acquiring an input product description and taking the acquired product description as a target document; processing the product description by using the document theme parameter extraction method of any of the above embodiments to obtain the distribution of the product description over topics, the relationships between topics in the correlated topic model, and the probability distribution between products and topics; and recommending, to the user, target products related to the topics of the product description based on the distribution of the product description over topics, the relationships between topics in the correlated topic model, and the probability distribution between products and topics.
By mounting the integrated circuit of the present invention in the electronic device, the electronic device can perform the functions that can be realized by the product recommendation method of any of the above embodiments; details are not repeated here.
It should be noted that, for simplicity of description, the above method embodiments are described as a series of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the invention. Further, those skilled in the art will appreciate that the embodiments described in this specification are preferred embodiments, and that the acts and modules involved are not necessarily required by the invention.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; the division into units is only one kind of logical functional division, and other divisions are possible in practice; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling, or communication connection shown or discussed between components may be implemented through some interfaces, or as indirect coupling or communication connection between devices or units, and may be in electrical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
The above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. A method for recommending products, the method comprising:
acquiring an input product description, and taking the acquired product description as a target document;
processing the product description by using a preset document theme parameter extraction method to obtain the distribution of the product description over topics, the relationships between topics in a correlated topic model, and the probability distribution between products and topics, wherein the preset document theme parameter extraction method comprises: preprocessing a target document to obtain a word set of the target document; and inputting the word set of the target document into a trained correlated topic model (CTM) to obtain the distribution of the target document over topics, the relationship distribution between any two of a plurality of topics, and the distribution between products and topics, wherein the trained correlated topic model is obtained by training on a document sample set and comprises a plurality of topics;
recommending, to the user, target products related to the topics of the product description based on the distribution of the product description over topics, the relationships between topics in the correlated topic model, and the probability distribution between products and topics, wherein recommending the target products comprises: obtaining at least one target topic contained in the product description based on the distribution of the product description over topics, determining a first topic associated with the at least one target topic according to the relationships between topics in the correlated topic model, then determining a second topic associated with the first topic, and, according to the probability distribution between products and topics in the correlated topic model, determining products in which the second topic's proportion ranks within a preset top number as a part of the target products.
2. The product recommendation method of claim 1, wherein recommending, to the user, the target products related to the topics of the product description based on the distribution of the product description over topics, the relationships between topics in the correlated topic model, and the probability distribution between products and topics further comprises one or more of the following:
obtaining at least one target topic contained in the product description based on the distribution of the product description over topics, determining, for each target topic of the at least one target topic, the topic with the highest degree of association with that target topic according to the relationships between topics in the correlated topic model, and, according to the probability distribution between products and topics in the correlated topic model, determining products in which the determined topic's proportion ranks within a preset top number as a part of the target products;
obtaining the topic with the highest proportion in the product description based on the distribution of the product description over topics, determining the target topic with the highest degree of association with that highest-proportion topic according to the relationships between topics in the correlated topic model, and, according to the probability distribution between products and topics in the correlated topic model, determining products in which the target topic's proportion ranks within a preset top number as a part of the target products;
obtaining at least one target topic contained in the product description based on the distribution of the product description over topics, determining products containing the at least one target topic according to the probability distribution between products and topics in the correlated topic model, and taking the determined products as a part of the target products.
3. The product recommendation method of claim 1, further comprising: displaying the product categories associated with the topics in the product description, and displaying the manner in which each product category is recommended.
4. The product recommendation method of claim 1, further comprising: acquiring a product selected by the user from the recommended target products, determining the topics contained in the selected product, and taking products in which those topics' proportions rank within a preset top number as a part of the target products.
5. The product recommendation method of claim 1, wherein said preprocessing the target document to obtain the word set of the target document comprises:
removing special words in the target document to obtain a processed document;
and performing word segmentation on the processed document to obtain a tuple set.
6. The product recommendation method of claim 5, further comprising:
in the tuple set, removing high-frequency tuples whose occurrence frequency ranks within a preset top number and low-frequency tuples whose occurrence frequency in the text corpus is lower than a preset count, and determining the processed tuple set as the word set of the target document.
7. An electronic device, comprising a memory configured to store at least one instruction and a processor configured to execute the at least one instruction to implement the product recommendation method of any of claims 1-6.
8. A computer-readable storage medium storing at least one instruction which, when executed by a processor, implements a product recommendation method as recited in any one of claims 1-6.
CN201810287788.7A 2018-04-03 2018-04-03 Document theme parameter extraction method, product recommendation method, device and storage medium Active CN108763258B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810287788.7A CN108763258B (en) 2018-04-03 2018-04-03 Document theme parameter extraction method, product recommendation method, device and storage medium
PCT/CN2018/100312 WO2019192122A1 (en) 2018-04-03 2018-08-14 Document topic parameter extraction method, product recommendation method and device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810287788.7A CN108763258B (en) 2018-04-03 2018-04-03 Document theme parameter extraction method, product recommendation method, device and storage medium

Publications (2)

Publication Number Publication Date
CN108763258A CN108763258A (en) 2018-11-06
CN108763258B true CN108763258B (en) 2023-01-10

Family

ID=63980754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810287788.7A Active CN108763258B (en) 2018-04-03 2018-04-03 Document theme parameter extraction method, product recommendation method, device and storage medium

Country Status (2)

Country Link
CN (1) CN108763258B (en)
WO (1) WO2019192122A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113763084A (en) * 2020-09-21 2021-12-07 北京沃东天骏信息技术有限公司 Product recommendation processing method, device, equipment and storage medium
CN113538020B (en) * 2021-07-05 2024-03-26 深圳索信达数据技术有限公司 Method and device for acquiring association degree of group of people features, storage medium and electronic device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101226557A (en) * 2008-02-22 2008-07-23 中国科学院软件研究所 Method and system for processing efficient relating subject model data
CN105389377A (en) * 2015-11-18 2016-03-09 清华大学 Topic mining based event cluster acquisition method
CN105426514A (en) * 2015-11-30 2016-03-23 扬州大学 Personalized mobile APP recommendation method
CN107220232A (en) * 2017-04-06 2017-09-29 北京百度网讯科技有限公司 Keyword extracting method and device, equipment and computer-readable recording medium based on artificial intelligence

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9639881B2 (en) * 2013-05-20 2017-05-02 TCL Research America Inc. Method and system for personalized video recommendation based on user interests modeling
CN104679778B (en) * 2013-11-29 2019-03-26 腾讯科技(深圳)有限公司 A kind of generation method and device of search result
US9817904B2 (en) * 2014-12-19 2017-11-14 TCL Research America Inc. Method and system for generating augmented product specifications
CN107730346A (en) * 2017-09-25 2018-02-23 北京京东尚科信息技术有限公司 The method and apparatus of article cluster


Also Published As

Publication number Publication date
WO2019192122A1 (en) 2019-10-10
CN108763258A (en) 2018-11-06

Similar Documents

Publication Publication Date Title
US11093854B2 (en) Emoji recommendation method and device thereof
CN106649818B (en) Application search intention identification method and device, application search method and server
CN110297988B (en) Hot topic detection method based on weighted LDA and improved Single-Pass clustering algorithm
US9519634B2 (en) Systems and methods for determining lexical associations among words in a corpus
CN109299280B (en) Short text clustering analysis method and device and terminal equipment
JP5544602B2 (en) Word semantic relationship extraction apparatus and word semantic relationship extraction method
CN111324771B (en) Video tag determination method and device, electronic equipment and storage medium
CN110134777B (en) Question duplication eliminating method and device, electronic equipment and computer readable storage medium
CN111126067B (en) Entity relationship extraction method and device
CN110569354A (en) Barrage emotion analysis method and device
CN108763258B (en) Document theme parameter extraction method, product recommendation method, device and storage medium
CN117313861A (en) Model pre-training data acquisition method, model pre-training method, device and equipment
CN107729509B (en) Discourse similarity determination method based on recessive high-dimensional distributed feature representation
CN115964474A (en) Policy keyword extraction method and device, storage medium and electronic equipment
Bobicev et al. Can anonymous posters on medical forums be reidentified?
CN114676699A (en) Entity emotion analysis method and device, computer equipment and storage medium
CN113420544A (en) Hot word determination method and device, electronic equipment and storage medium
CN109298796B (en) Word association method and device
Al Oudah et al. Wajeez: An extractive automatic arabic text summarisation system
CN112329478A (en) Method, device and equipment for constructing causal relationship determination model
CN112445959A (en) Retrieval method, retrieval device, computer-readable medium and electronic device
CN113536802A (en) Method, device, equipment and storage medium for judging emotion of text data in languages
JP7326637B2 (en) CHUNKING EXECUTION SYSTEM, CHUNKING EXECUTION METHOD, AND PROGRAM
Nikolić et al. Modelling the System of Receiving Quick Answers for e-Government Services: Study for the Crime Domain in the Republic of Serbia
KR102309802B1 (en) Analysis method for trend of sns

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant