CN108763258A - Document topic parameter extraction method, product recommendation method, device and storage medium - Google Patents

Document topic parameter extraction method, product recommendation method, device and storage medium

Info

Publication number
CN108763258A
Authority
CN
China
Prior art keywords
theme
product
document
distribution
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810287788.7A
Other languages
Chinese (zh)
Other versions
CN108763258B (en
Inventor
王义文
王健宗
肖京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201810287788.7A (CN108763258B)
Priority to PCT/CN2018/100312 (WO2019192122A1)
Publication of CN108763258A
Application granted
Publication of CN108763258B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241: Advertisements
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/06: Buying, selling or leasing transactions
    • G06Q30/0601: Electronic shopping [e-shopping]
    • G06Q30/0631: Item recommendations

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a document topic parameter extraction method: a target document is input into a correlated topic model trained on a document training set, yielding the distribution of the target document over topics, the relationship distribution between any two of the multiple topics, and the distribution between products and topics. The present invention also provides a product recommendation method: an input product description is obtained and processed to yield the distribution of the product description over topics, together with the relationships between topics and the probability distribution between products and topics in the correlated topic model. The present invention further provides an electronic device and a storage medium. The invention avoids finding only products with similar content and improves accuracy, thereby achieving more accurate product recommendation.

Description

Document topic parameter extraction method, product recommendation method, device and storage medium
Technical field
The present invention relates to the field of artificial intelligence, and in particular to a document topic parameter extraction method, a product recommendation method, a device, and a storage medium.
Background
The rapid development of the Internet has catalyzed the generation of massive amounts of information and made big data an inevitable trend in current information technology, so valuable data must be extracted quickly and effectively from all kinds of information. Current product recommendation either relies on content similarity or uses keywords to find products containing those keywords among a massive number of products and recommend them to the user. This misses products whose content is dissimilar to the user's description but whose topic is related. For example, "health" and "gene" are unrelated as keywords but related in topic; with the prior art, entering the keyword "health" cannot find products related to "gene", which reduces the accuracy of recommendation.
Summary of the invention
In view of the foregoing, it is necessary to provide a document topic parameter extraction method, a product recommendation method, and an electronic device that avoid finding only products with similar content and improve accuracy, thereby achieving more accurate product recommendation.
A document topic parameter extraction method, the method comprising:
Preprocessing a target document to obtain the word set of the target document;
Inputting the target document into a trained correlated topic model (CTM) to obtain the distribution of the target document over topics, the relationship distribution between any two of the multiple topics, and the distribution between products and topics, wherein the trained correlated topic model is obtained by training on a document sample set and includes multiple topics.
According to a preferred embodiment of the present invention, preprocessing the target document to obtain the word set of the target document comprises:
Removing the special words in the target document to obtain a processed document;
Segmenting the processed document to obtain an n-gram set.
According to a preferred embodiment of the present invention, the method further comprises:
In the n-gram set, removing the high-frequency n-grams whose occurrence counts in the text corpus rank within a preset number of top positions and the low-frequency n-grams that occur fewer than a preset number of times, and determining the processed n-gram set as the word set of the target document.
A product recommendation method, the method comprising:
Obtaining an input product description and taking the obtained product description as the target document;
Processing the product description using the document topic parameter extraction method of any embodiment to obtain the distribution of the product description over topics, together with the relationships between topics and the probability distribution between products and topics in the correlated topic model;
Recommending to the user target products associated with the topics of the product description, based on the distribution of the product description over topics and on the relationships between topics and the probability distribution between products and topics in the correlated topic model.
According to a preferred embodiment of the present invention, recommending to the user target products associated with the topics of the product description, based on the distribution of the product description over topics and the relationships between the topics of products, comprises one or a combination of the following:
Obtaining, based on the distribution of the product description over topics, at least one target topic contained in the product description; determining, according to the relationships between topics in the correlated topic model, the topic with the highest degree of association with each target topic of the at least one target topic; and determining, according to the probability distribution of products and topics in the correlated topic model, the products ranked within a preset number of top positions by proportion of the determined topic as part of the target products;
Obtaining, based on the distribution of the product description over topics, the topic with the highest proportion in the product description; determining, according to the relationships between topics in the correlated topic model, the target topic with the highest degree of association with that highest-proportion topic; and determining, according to the probability distribution of products and topics in the correlated topic model, the products ranked within a preset number of top positions by proportion of the target topic as part of the target products;
Obtaining, based on the distribution of the product description over topics, at least one target topic contained in the product description; determining, according to the probability distribution of products and topics in the correlated topic model, the products that contain the at least one target topic; and taking the determined products as part of the target products.
According to a preferred embodiment of the present invention, recommending to the user target products associated with the topics of the product description, based on the distribution of the product description over topics and the relationships between the topics of products, further comprises:
Obtaining, based on the distribution of the product description over topics, at least one target topic contained in the product description; determining, according to the relationships between topics in the correlated topic model, a first topic associated with the at least one target topic, and then a second topic associated with the first topic; and determining, according to the probability distribution of products and topics in the correlated topic model, the products ranked within a preset number of top positions by proportion of the second topic as part of the target products.
According to a preferred embodiment of the present invention, the method further comprises: displaying by category the products associated with the topics in the product description, and displaying the manner in which each category of product is recommended.
According to a preferred embodiment of the present invention, the method further comprises: obtaining the product that the user selects from the recommended target products, determining the topics contained in the selected product, and taking the products ranked within a preset number of top positions by proportion of those topics as part of the target products.
An electronic device, comprising a memory and a processor, the memory being configured to store at least one instruction, and the processor being configured to execute the at least one instruction to implement the document topic parameter extraction method of any embodiment and/or the product recommendation method of any embodiment.
A computer-readable storage medium storing at least one instruction which, when executed by a processor, implements the document topic parameter extraction method of any embodiment and/or the product recommendation method of any embodiment.
As can be seen from the above technical solutions, the present invention provides a document topic parameter extraction method in which a target document is input into a correlated topic model trained on a document training set, yielding the distribution of the target document over topics, the relationship distribution between any two of the multiple topics, and the distribution between products and topics. An input product description is obtained and processed to yield the distribution of the product description over topics, together with the relationships between topics and the probability distribution between products and topics in the correlated topic model. Because it is based on the correlated topic model, the present invention can find products whose content is dissimilar but whose topics are related, and can recommend products that are closely related in topic. This avoids finding only products with similar content and improves accuracy, thereby achieving more accurate product recommendation.
Description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show merely embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a flowchart of the first preferred embodiment of the document topic parameter extraction method of the present invention.
Fig. 2 is a flowchart of the first preferred embodiment of the product recommendation method of the present invention.
Fig. 3 is a program module diagram of the first preferred embodiment of the document topic parameter extraction device of the present invention.
Fig. 4 is a program module diagram of the first preferred embodiment of the product recommendation device of the present invention.
Fig. 5 is a structural schematic diagram of a preferred embodiment of the electronic device in at least one example of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
To make the above objects, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the drawings and specific embodiments.
To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The terms "first", "second", "third", and the like in the specification, claims, and drawings of the present invention are used to distinguish different objects, not to describe a particular order. In addition, the term "comprise" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to such a process, method, product, or device.
Fig. 1 is a flowchart of the first preferred embodiment of the document topic parameter extraction method of the present invention. Depending on requirements, the order of the steps in the flowchart may be changed, and some steps may be omitted.
S10: the electronic device preprocesses the target document to obtain the word set of the target document.
Preferably, preprocessing the target document to obtain the word set of the target document comprises:
(1) Removing the special words in the target document to obtain a processed document.
Further, the special words include website links, user-name tags, special characters, place-name tags, punctuation marks, and the like.
(2) Segmenting the processed document to obtain an n-gram set.
The processed document is segmented by extracting n-grams (n is a positive integer, for example n less than 4). For example, a Chinese text corpus is segmented with a tool based on the Chinese lexical analysis system ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System). A space-delimited text corpus (such as English) can be segmented directly on spaces; text corpora such as Chinese and Japanese, by contrast, do not use spaces as delimiters and therefore require a segmentation tool such as the one mentioned above.
Further, unigrams, bigrams, and trigrams are extracted from the text corpus, giving three classes of n-gram sets in total.
Preferably, after the n-gram set is obtained, the method further comprises: in the n-gram set, removing the high-frequency n-grams (i.e., high-frequency words) whose occurrence counts in the text corpus rank within a preset number of top positions (for example, the top 50) and the low-frequency n-grams (i.e., low-frequency words) that occur fewer than a preset number of times (for example, 3 times), and determining the processed n-gram set as the word set of the target document.
In an alternative embodiment, considering the linguistic characteristics of words, a certain proportion of high-frequency n-grams (usually stop words and the like) and low-frequency n-grams (usually personal names, non-words, and the like) are removed, and only the remaining mid-frequency n-grams are taken as candidate words for the sentiment dictionary. High-frequency n-grams are usually stop words; they have a high chance of co-occurring with all kinds of words and therefore carry little affective signal. Low-frequency n-grams are usually non-words or user names, which have no linguistic meaning and therefore need to be removed. In this way, the mid-frequency n-grams, whose occurrence counts lie in the middle range, are used as part of the candidate words.
In other implementations, after segmentation with a word segmentation technique, the candidate word set is generated in combination with n-grams, and n-grams that do not form words can be removed. The word segmentation technique is prior art, and the present invention places no restriction on it. This can improve the precision of the dictionary, and this processing does not affect the validity of the overall method.
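The preprocessing in step S10 can be sketched as follows. This is a minimal illustration for space-delimited text only, not the patented implementation: the regular expressions for "special words" and the function names `clean`, `ngrams`, and `build_word_sets` are assumptions introduced here, place-name tags are not handled, and a Chinese corpus would need a real segmentation tool in place of `str.split`. The defaults (top 50, fewer than 3 occurrences, n up to 3) follow the example values given in the text.

```python
import re
from collections import Counter

# Hypothetical patterns: the text names the categories of "special words"
# (links, user-name tags, special characters, punctuation) but not the rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_WORD_RE = re.compile(r"[^\w\s]")


def clean(text):
    """Strip links, @user tags, punctuation and special characters."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return NON_WORD_RE.sub(" ", text).lower()


def ngrams(tokens, max_n=3):
    """Return all n-grams for n = 1..max_n as tuples of tokens."""
    return [tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


def build_word_sets(corpus, top_k=50, min_count=3, max_n=3):
    """Return one filtered n-gram list per document in the corpus."""
    docs = [ngrams(clean(doc).split(), max_n) for doc in corpus]
    counts = Counter(gram for doc in docs for gram in doc)
    # High-frequency n-grams: the top_k most frequent across the corpus.
    high_freq = {gram for gram, _ in counts.most_common(top_k)}
    # Keep mid-frequency n-grams: not in the top_k, at least min_count uses.
    keep = {gram for gram, c in counts.items()
            if c >= min_count and gram not in high_freq}
    return [[gram for gram in doc if gram in keep] for doc in docs]
```

Counting frequencies over the whole corpus rather than per document matches the wording "occurrence counts in the text corpus"; on a very small corpus the example thresholds would remove nearly everything, so they would need to be lowered.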
S11: the electronic device inputs the target document into the trained correlated topic model (CTM, Correlated Topic Model) to obtain the distribution of the target document over topics, the relationship distribution between any two of the multiple topics, and the distribution between products and topics. The trained correlated topic model is obtained by training on document samples and includes multiple topics.
In the present invention, the correlated topic model (CTM) models topic proportions using the covariance matrix of a logistic normal distribution, in order to discover the distribution of document topics and the associations between topics.
The correlated topic model is a generative probabilistic model that can automatically extract implicit semantic topics from a discrete data set, where a topic refers to content that frequently co-occurs in the data set. The correlated topic model describes the relationships among the variables with a probabilistic graphical model, and computes the topic-related probability distributions by sampling or variational inference.
The correlated topic model can automatically discover the topics implicit in a document collection, where a topic is a probability distribution over words. It provides a convenient tool for analyzing documents and predicting new documents in an unsupervised manner. Its basic idea is that a document is a random mixture of several topics, and each topic is a multinomial distribution over words. In a document set, a topic is a probability distribution over the vocabulary of the corpus; assuming a corpus has K topics, the proportions of the K topics differ from document to document. Therefore, by training the correlated topic model on a document set, the distribution among multiple topics and the distribution relationship between products and topics can be obtained.
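For reference, the generative process usually associated with a correlated topic model can be written as below. This is the standard textbook formulation, not notation taken from the patent; the symbols $\mu$, $\Sigma$, $\eta_d$, $\theta_d$, $\beta_k$, $z_{d,n}$, and $w_{d,n}$ are introduced here for illustration.

```latex
% Standard CTM generative process for document d with K topics
% (symbols are illustrative; the patent does not define them).
\begin{align*}
\eta_d &\sim \mathcal{N}(\mu, \Sigma)
  && \text{logistic-normal prior; } \Sigma \text{ is the } K \times K \text{ topic covariance} \\
\theta_{d,k} &= \frac{\exp(\eta_{d,k})}{\sum_{j=1}^{K} \exp(\eta_{d,j})}
  && \text{topic proportions of document } d \\
z_{d,n} \mid \theta_d &\sim \operatorname{Mult}(\theta_d)
  && \text{topic assignment of the } n\text{-th word} \\
w_{d,n} \mid z_{d,n}, \beta &\sim \operatorname{Mult}(\beta_{z_{d,n}})
  && \text{word drawn from the assigned topic's distribution}
\end{align*}
```

Under this reading, the off-diagonal entries of $\Sigma$ play the role of the relationship between any two topics, and $\theta_d$ is the distribution of a document over topics.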
Preferably, the correlated topic model is trained as follows:
(a1) Obtain a document sample set and divide it into a training set and a test set. For example, 70% of the document samples serve as the training set and 30% as the test set.
(a2) Configure the optimal number of topics for the training set.
The optimal number of topics indicates the number of topics in the correlated topic model.
(a3) Based on the training set and the optimal number of topics, model the documents in the training set with the correlated topic model to obtain the parameters of the correlated topic model.
(a4) Input the word sets corresponding to the document samples in the test set into the trained correlated topic model to obtain the topic representation of each document in the test set.
(a5) Evaluate the accuracy of the trained correlated topic model. If the accuracy is below a preset accuracy, for example 99%, add samples to the training set and/or adjust the optimal number of topics stepwise, and repeat the above training steps until the accuracy of the trained correlated topic model is greater than or equal to the preset accuracy, for example 99%.
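The control flow of steps (a1) to (a5) can be sketched as below. The sketch is only a skeleton under stated assumptions: `fit` and `evaluate` are caller-supplied placeholders, since the text specifies neither the training algorithm's interface nor how accuracy is measured; `topic_step` and `max_rounds` are hypothetical parameters; and only the stepwise topic-number adjustment is shown, not the option of adding training samples.

```python
import random


def split_samples(samples, train_ratio=0.7, seed=0):
    """Step (a1): shuffle and split samples into (train, test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def train_until_accurate(samples, fit, evaluate, num_topics=10,
                         topic_step=5, target=0.99, max_rounds=10):
    """Repeat steps (a2)-(a5) until the target accuracy is reached.

    fit(train, k) -> model and evaluate(model, test) -> accuracy are
    placeholders supplied by the caller; they are not defined in the text.
    Returns (model, topic_count, accuracy) of the last round, or None
    when max_rounds is 0.
    """
    train, test = split_samples(samples)
    last = None
    k = num_topics
    for _ in range(max_rounds):
        model = fit(train, k)              # step (a3)
        accuracy = evaluate(model, test)   # steps (a4) and (a5)
        last = (model, k, accuracy)
        if accuracy >= target:
            break
        k += topic_step                    # stepwise topic-number adjustment
    return last
```

The `max_rounds` cap is a design choice added here so that the loop terminates even if the preset accuracy is never reached.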
In the present invention, the target document is input into a correlated topic model trained on a document training set, yielding the distribution of the target document over topics, the relationship distribution between any two of the multiple topics, and the distribution between products and topics. The present invention can thus extract the topic parameter information of a document, so that the correlations between document topic parameters can later be used to recommend topic-related products to the user.
Fig. 2 is a flowchart of the first preferred embodiment of the product recommendation method of the present invention. Depending on requirements, the order of the steps in the flowchart may be changed, and some steps may be omitted.
S20: the electronic device obtains an input product description and takes the obtained product description as the target document.
In an alternative embodiment, the product description includes, but is not limited to, one or a combination of the following: a character, a word, a passage of text, and the like. The form of the product description includes one or a combination of spoken form and written form.
Preferably, the products include, but are not limited to, wealth-management products, online shopping goods, and the like.
For example, a bank's wealth-management products are currently categorized into multiple modules, such as a high-yield module, a withdraw-anytime module, and a one-month fixed-term module, each holding a different type of wealth-management product. When buying a wealth-management product, the user can input a description of the product he or she wants to buy, for example by speaking a passage of text, so that wealth-management products similar in topic to the user's product description can be found.
S21: the electronic device processes the product description to obtain the distribution of the product description over topics, together with the relationships between topics and the probability distribution between products and topics in the correlated topic model.
In a preferred embodiment, the electronic device processes the product description using the document topic parameter extraction method.
In an alternative embodiment, the training samples for the correlated topic model include the product description of each product, with each product description serving as one document sample. The correlated topic model is trained using the method of the first preferred embodiment.
Further, the distribution of the product description over topics indicates the proportions of the topics contained in the product description. For example, the product description contains three topics, topic A, topic B, and topic C, with the proportions topic A : topic B : topic C = 16 : 2 : 1.
Further, the relationship between the topics of products indicates the degree of association between any two topics in the correlated topic model. For example, with three topics, the degree of association between topic A and topic B is 0.2, between topic A and topic C is 0.8, and between topic B and topic C is 0.4.
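The two quantities described above can be represented with plain dictionaries, as in the sketch below, which reuses the example figures from the text (proportions 16 : 2 : 1 and association degrees 0.2, 0.8, 0.4). The helper names `normalize`, `association`, and `most_associated`, and the choice of `frozenset` keys for symmetric pairs, are assumptions made for illustration.

```python
def normalize(weights):
    """Turn raw topic weights into proportions that sum to 1."""
    total = sum(weights.values())
    return {topic: w / total for topic, w in weights.items()}


def association(assoc, a, b):
    """Look up a symmetric association degree regardless of argument order."""
    return assoc[frozenset((a, b))]


def most_associated(assoc, topic, topics):
    """Return the other topic with the highest association to `topic`."""
    others = [t for t in topics if t != topic]
    return max(others, key=lambda t: association(assoc, topic, t))


# Example figures from the text: A : B : C = 16 : 2 : 1.
proportions = normalize({"A": 16, "B": 2, "C": 1})

# Pairwise association degrees from the text, stored once per unordered pair.
assoc = {frozenset("AB"): 0.2, frozenset("AC"): 0.8, frozenset("BC"): 0.4}

dominant = max(proportions, key=proportions.get)            # topic "A"
closest = most_associated(assoc, dominant, proportions)     # topic "C"
```

With these figures, topic A dominates the description and topic C is the topic most strongly associated with it, which is the lookup the recommendation step relies on.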
S22: the electronic device recommends to the user target products associated with the topics of the product description, based on the distribution of the product description over topics and on the relationships between topics and the probability distribution between products and topics in the correlated topic model.
Preferably, recommending to the user target products associated with the topics of the product description, based on the distribution of the product description over topics and the relationships between the topics of products, comprises one or a combination of the following:
(1) Based on the distribution of the product description over topics, obtain at least one target topic contained in the product description; according to the relationships between topics in the correlated topic model, determine the topic with the highest degree of association with each target topic of the at least one target topic; and according to the probability distribution of products and topics in the correlated topic model, determine the products ranked within a preset number of top positions by proportion of the determined topic as part of the target products.
For example, the description of a wealth-management product input by the user contains two topics, high yield and short term. The topic most associated with the high-yield topic is an annualized return of 5% or more, and the topic most associated with the short-term topic is withdraw-anytime. The annualized-return-of-5%-or-more topic has the highest proportion in wealth-management products A and C, and the short-term topic has the highest proportion in wealth-management products A and D, so wealth-management products A, C, and D are the target products. In this way, for each topic in the product description, the products with the highest degree of association with that topic can be recommended to the user, achieving personalized product recommendation.
(2) Based on the distribution of the product description over topics, obtain the topic with the highest proportion in the product description; according to the relationships between topics in the correlated topic model, determine the target topic with the highest degree of association with that highest-proportion topic; and according to the probability distribution of products and topics in the correlated topic model, determine the products ranked within a preset number of top positions by proportion of the target topic as part of the target products.
For example, the description of a wealth-management product input by the user contains two topics, high yield and short term, of which high yield has the highest proportion, and the topic most associated with the high-yield topic is an annualized return of 5% or more. The annualized-return-of-5%-or-more topic has the highest proportion in wealth-management products A and C, so wealth-management products A and C are the target products.
(3) Based on the distribution of the product description over topics, obtain at least one target topic contained in the product description; according to the probability distribution of products and topics in the correlated topic model, determine the products that contain the at least one target topic; and take the determined products as part of the target products.
(4) Based on the distribution of the product description over topics, obtain at least one target topic contained in the product description; according to the relationships between topics in the correlated topic model, determine a first topic associated with the at least one target topic, and then determine a second topic that is associated only with the first topic; and according to the probability distribution of products and topics in the correlated topic model, determine the products ranked within a preset number of top positions by proportion of the second topic as part of the target products. This captures indirect relationships between topics, so that indirectly but strongly associated topics can be found and personalized products recommended to the user.
For example, the product description contains topic A. In the correlated topic model, topic C is related to topic A, and topic D is associated only with topic C, which indicates that topic D is strongly associated with topic C. Therefore, the products ranked within a preset number of top positions by proportion of topic D are taken as part of the target products.
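Strategies (1) and (2) above can be sketched as follows. The data structures, function names, and every number in the sample tables are illustrative assumptions; the tables are chosen only so that the result mirrors the wealth-management example (products A, C, and D for strategy (1), products A and C for strategy (2)). A trained model would supply the real association degrees and product-topic proportions.

```python
def most_associated(assoc, topic):
    """Return the other topic with the highest association to `topic`."""
    others = {t: v for t, v in assoc[topic].items() if t != topic}
    return max(others, key=others.get)


def top_products(product_topic, topic, n=2):
    """Return the n products in which `topic` has the largest proportion."""
    ranked = sorted(product_topic,
                    key=lambda p: product_topic[p].get(topic, 0.0),
                    reverse=True)
    return ranked[:n]


def recommend_per_topic(target_topics, assoc, product_topic, n=2):
    """Strategy (1): follow each target topic to its closest topic."""
    result = []
    for topic in target_topics:
        related = most_associated(assoc, topic)
        for product in top_products(product_topic, related, n):
            if product not in result:  # keep order, drop duplicates
                result.append(product)
    return result


def recommend_dominant(doc_topics, assoc, product_topic, n=2):
    """Strategy (2): follow only the highest-proportion topic."""
    dominant = max(doc_topics, key=doc_topics.get)
    return top_products(product_topic, most_associated(assoc, dominant), n)


# Illustrative (made-up) association degrees between topics.
assoc = {
    "high_yield": {"yield_5pct": 0.9, "withdraw_anytime": 0.2, "short_term": 0.1},
    "short_term": {"withdraw_anytime": 0.8, "yield_5pct": 0.3, "high_yield": 0.1},
}
# Illustrative (made-up) proportion of each topic within each product.
product_topic = {
    "A": {"yield_5pct": 0.5, "withdraw_anytime": 0.5},
    "B": {"yield_5pct": 0.2, "withdraw_anytime": 0.2},
    "C": {"yield_5pct": 0.7, "withdraw_anytime": 0.1},
    "D": {"yield_5pct": 0.1, "withdraw_anytime": 0.6},
}

per_topic = recommend_per_topic(["high_yield", "short_term"], assoc, product_topic)
dominant_only = recommend_dominant({"high_yield": 0.7, "short_term": 0.3},
                                   assoc, product_topic)
```

Strategy (4) would add one more hop: from the first associated topic to a second topic whose only association is with the first. It is omitted here to keep the sketch short.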
Preferably, the products associated with the topics in the product description are displayed by category, together with the manner in which each category of product is recommended. For example, the category of products most associated with topic A, the category of products most associated with topic C, and so on. In this way, the user can see intuitively which products are associated with the topics he or she is interested in, and can make a personalized selection from the recommended products.
Preferably, the method further comprises: obtaining the product that the user selects from the recommended target products, determining the topics contained in the selected product, and taking the products ranked within a preset number of top positions by proportion of those topics as part of the target products. Recommendations can thus take into account the products the user is interested in, fit the user's needs more closely, and achieve personalized product recommendation.
Because it is based on the correlated topic model, the above embodiment can find products whose content is dissimilar but whose topics are related, and can recommend products that are closely related in topic. This avoids finding only products with similar content and improves accuracy, thereby achieving more accurate product recommendation.
Through the above embodiments, the present invention provides a document topic parameter extraction method in which a target document is input into a correlated topic model trained on a document training set, yielding the distribution of the target document over topics, the relationship distribution between any two of the multiple topics, and the distribution between products and topics. An input product description is obtained and processed to yield the distribution of the product description over topics, together with the relationships between topics and the probability distribution between products and topics in the correlated topic model. Because it is based on the correlated topic model, the present invention can find products whose content is dissimilar but whose topics are related, and can recommend products that are closely related in topic. This avoids finding only products with similar content and improves accuracy, thereby achieving more accurate product recommendation.
FIG. 3 is a program module diagram of the first preferred embodiment of the document topic parameter extraction device of the present invention. The document topic parameter extraction device 3 includes, but is not limited to, one or more of the following modules: a preprocessing module 30, a computing module 31, and a training module 32. A module (unit) in the present invention refers to a series of computer program segments that are stored in memory, can be executed by the processor of the document topic parameter extraction device 3, and perform a fixed function. The function of each module is described in detail in the following embodiments.
The preprocessing module 30 preprocesses the target document to obtain the word set of the target document.
Preferably, the preprocessing by which the preprocessing module 30 obtains the word set of the target document includes:
(1) Remove the special words in the target document to obtain a processed document.
Further, the special words include website links, user-name tags, special characters, place-name tags, punctuation marks, and the like.
(2) Segment the processed document into words to obtain an n-gram set.
The processed document is segmented by extracting n-grams, where n is a positive integer (for example, n is less than 4). For example, word segmentation of a Chinese text corpus is performed with a tool based on the Chinese lexical analysis system ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System). A text corpus delimited by spaces (such as English) can be segmented directly on spaces, whereas text corpora in Chinese, Japanese, and other languages that do not use spaces as separators are segmented with a word segmentation tool such as the one mentioned above.
Further, three classes of n-gram sets are extracted from the text corpus: unigrams, bigrams, and trigrams.
Preferably, after the n-gram set is obtained, the preprocessing module 30 is further configured to: remove from the n-gram set the high-frequency n-grams (i.e., high-frequency words) whose occurrence counts in the text corpus rank within a preset number of top positions (for example, the top 50) and the low-frequency n-grams (i.e., low-frequency words) that occur fewer than a preset number of times (for example, 3 times), and determine the processed n-gram set as the word set of the target document.
In an alternative embodiment, considering the linguistic characteristics of words, a certain proportion of high-frequency n-grams (usually stop words and the like) and low-frequency n-grams (usually personal names, non-words, and the like) are removed, and only the remaining mid-frequency n-grams are taken as candidate words for the sentiment dictionary. High-frequency n-grams are usually stop words; they co-occur readily with all kinds of words and therefore express sentiment characteristics only weakly. Low-frequency n-grams are usually non-words or user names, which carry no linguistic meaning and therefore need to be removed. In this way, the n-grams with intermediate occurrence counts are used as part of the candidate words.
In other implementations, after segmentation with a word segmentation technique, the candidate word set is generated in combination with n-grams, so that n-grams that do not form words can be removed. Word segmentation is existing technology, and the present invention places no restriction on it. This can improve dictionary precision, and this processing does not affect the validity of the overall method.
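The preprocessing steps above can be illustrated with a minimal Python sketch. It assumes a space-delimited corpus (Chinese text would first need a segmenter such as ICTCLAS), and the regular expressions, function names, and parameter names (`top_k`, `min_count`) are hypothetical choices for illustration, not part of the described method.

```python
import re
from collections import Counter

# Hypothetical patterns for "special words": links, @user tags, punctuation.
URL_RE = re.compile(r"https?://\S+")
USER_RE = re.compile(r"@\w+")
PUNCT_RE = re.compile(r"[^\w\s]")


def clean(text):
    """Step (1): strip links, user-name tags and punctuation, then lowercase."""
    text = URL_RE.sub(" ", text)
    text = USER_RE.sub(" ", text)
    return PUNCT_RE.sub(" ", text).lower()


def ngrams(tokens, max_n=3):
    """Step (2): unigrams, bigrams and trigrams, each returned as a tuple."""
    return [tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


def build_word_set(docs, top_k=50, min_count=3):
    """Drop the top_k most frequent n-grams and those seen fewer than min_count times."""
    counts = Counter(g for d in docs for g in ngrams(clean(d).split()))
    high = {g for g, _ in counts.most_common(top_k)}
    return {g for g, c in counts.items() if c >= min_count and g not in high}
```

The defaults mirror the examples in the text (top 50, fewer than 3 occurrences); on a small corpus both thresholds would need to be lowered, otherwise the resulting word set is empty.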
The computing module 31 inputs the target document into the trained correlated topic model CTM (Correlated Topic Model) to obtain the distribution of the target document over topics, the relationship distribution between any two of the multiple topics, and the distribution between products and topics. The trained correlated topic model is obtained by training on a document sample set and includes multiple topics.
In the present invention, the correlated topic model CTM (Correlated Topic Model) models the topic proportions with a logistic normal distribution, whose covariance matrix is used to discover the topic distribution of documents and the correlations between topics.
The correlated topic model is a generative probabilistic model that can automatically extract the latent semantic topics from a discrete data set, where a topic refers to content that frequently co-occurs in the data set. The correlated topic model describes the relationships among its variables with a probabilistic graphical model and computes the topic-related probability distributions by sampling or variational inference.
The correlated topic model can automatically discover the topics latent in a document collection, where a topic is a probability distribution over words. It provides a convenient tool for analyzing documents and predicting new documents in an unsupervised manner. Its basic idea is that a document is a random mixture of several topics, where each topic is a multinomial distribution over words. In a document set, a topic is a probability distribution over the vocabulary of the corpus; assuming that a corpus has K topics, the proportions of the K topics differ from document to document. Therefore, by training the correlated topic model on a document set, the distribution among the multiple topics and the distribution relationship between products and topics can be obtained.
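As a rough illustration of the logistic normal prior described above, the following Python sketch draws a vector eta from a multivariate normal with mean `mu` and covariance `sigma`, then maps it to topic proportions with a softmax. The helper names and the example matrix are assumptions made for illustration; reading the "degree of association" between two topics as the correlation implied by `sigma` is one possible interpretation, not something the text specifies.

```python
import math
import random


def cholesky(a):
    """Lower-triangular factor of a small positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def softmax(eta):
    """Map a real vector to positive proportions that sum to 1."""
    m = max(eta)
    exps = [math.exp(x - m) for x in eta]
    total = sum(exps)
    return [e / total for e in exps]


def sample_topic_proportions(mu, sigma, rng):
    """Draw eta ~ N(mu, sigma) and return theta = softmax(eta)."""
    low = cholesky(sigma)
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    eta = [m + sum(low[i][k] * z[k] for k in range(i + 1))
           for i, m in enumerate(mu)]
    return softmax(eta)


def correlation(sigma, i, j):
    """Correlation between topics i and j implied by the covariance matrix."""
    return sigma[i][j] / math.sqrt(sigma[i][i] * sigma[j][j])


# Illustrative covariance: topics 0 and 1 tend to co-occur, topic 2 is independent.
sigma = [[1.0, 0.8, 0.0],
         [0.8, 1.0, 0.0],
         [0.0, 0.0, 1.0]]
theta = sample_topic_proportions([0.0, 0.0, 0.0], sigma, random.Random(0))
```

The off-diagonal entries of the covariance matrix are what distinguish this prior from one in which topic proportions are drawn independently of one another.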
Preferably, the training module 32 trains the correlated topic model as follows:
(a1) Obtain a document sample set and divide it into a training set and a test set. For example, 70% of the document samples serve as the training set and 30% as the test set.
(a2) Configure the optimal number of topics for the training set.
The optimal number of topics indicates the number of topics in the correlated topic model.
(a3) Based on the training set and the optimal number of topics, model the documents in the training set with the correlated topic model to obtain the parameters of the correlated topic model.
(a4) Input the word sets corresponding to the document samples in the test set into the trained correlated topic model to obtain the topic representation of each document in the test set.
(a5) Evaluate the accuracy of the trained correlated topic model. If the accuracy is below a preset accuracy, for example 99%, increase the number of samples in the training set and/or adjust the optimal number of topics stepwise, and repeat the above training steps until the accuracy of the trained correlated topic model is greater than or equal to the preset accuracy, for example 99%.
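Steps (a1) to (a5) can be sketched as a simple loop. In the Python sketch below, `fit` and `evaluate` are hypothetical callbacks standing in for the model training and accuracy evaluation, which the text does not specify; the `max_rounds` guard is an added assumption to keep the loop finite, and only the stepwise topic-count adjustment is shown (adding training samples is omitted).

```python
import random


def split_samples(samples, train_ratio=0.7, seed=0):
    """Step (a1): shuffle and split the samples into training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def train_until_accurate(samples, fit, evaluate, num_topics=10,
                         step=5, target=0.99, max_rounds=20):
    """Steps (a2)-(a5): refit with a stepwise topic count until accuracy >= target.

    fit(train_docs, num_topics) -> model and evaluate(model, test_docs) -> float
    are hypothetical callbacks supplied by the caller.
    """
    train_docs, test_docs = split_samples(samples)
    for _ in range(max_rounds):
        model = fit(train_docs, num_topics)
        accuracy = evaluate(model, test_docs)
        if accuracy >= target:
            return model, num_topics, accuracy
        num_topics += step  # stepwise adjustment of the topic count
    raise RuntimeError("target accuracy not reached within max_rounds")
```

Passing the training and evaluation routines in as callbacks keeps the loop independent of any particular topic-model library.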
FIG. 4 is a program module diagram of the first preferred embodiment of the product recommendation device of the present invention. The product recommendation device 4 includes, but is not limited to, one or more of the following modules: an acquisition module 40, a data computation module 41, a recommending module 42, and a display module 43. A module (unit) in the present invention refers to a series of computer program segments that are stored in memory, can be executed by the processor of the product recommendation device 4, and perform a fixed function. The function of each module is described in detail in the following embodiments.
The acquisition module 40 obtains the input product description and uses the obtained product description as the target document.
In an alternative embodiment, the product description includes, but is not limited to, one or a combination of the following: a character, a word, a passage of text, and the like. The product description may take spoken form, written form, or a combination of the two.
Preferably, the products include, but are not limited to, wealth-management products, online shopping goods, and the like.
For example, a bank's wealth-management products are currently grouped into modules, such as a high-yield module, a withdraw-at-any-time module, a one-month fixed-term module, and other types of wealth-management products. When purchasing a wealth-management product, the user can input a description of the product he or she wants to buy, for example by speaking a passage, so that wealth-management products similar in topic to the product description input by the user are found.
The data computation module 41 processes the product description to obtain the distribution of the product description over topics, together with the relationships between topics and the probability distribution between products and topics in the correlated topic model.
In a preferred embodiment, the electronic device processes the product description using the document topic parameter extraction method.
In an alternative embodiment, the training samples used to train the correlated topic model include the product description of each product, with each product description used as one document sample. The correlated topic model is trained using the method of the first preferred embodiment.
Further, the distribution of the product description over topics indicates the weights of the topics contained in the product description. For example, the product description contains three topics, topic A, topic B, and topic C, with the weight ratio topic A : topic B : topic C = 16:2:1.
Further, the relationship between topics indicates the degree of association between any two topics in the correlated topic model. For example, with three topics, the degree of association between topic A and topic B is 0.2, between topic A and topic C is 0.8, and between topic B and topic C is 0.4.
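The two examples above can be written out numerically. The short Python sketch below normalizes the 16:2:1 weights into proportions and looks up the degree of association between two topics; the function names and the choice of a symmetric lookup table keyed by unordered topic pairs are assumptions made for illustration.

```python
def normalise(weights):
    """Turn raw topic weights into proportions that sum to 1."""
    total = sum(weights.values())
    return {topic: w / total for topic, w in weights.items()}


def association(table, a, b):
    """Look up the (assumed symmetric) degree of association between two topics."""
    return table[frozenset((a, b))]


# Weights and association degrees taken from the examples in the text.
proportions = normalise({"A": 16, "B": 2, "C": 1})
table = {
    frozenset(("A", "B")): 0.2,
    frozenset(("A", "C")): 0.8,
    frozenset(("B", "C")): 0.4,
}
most_related_to_a = max("BC", key=lambda t: association(table, "A", t))
```

With these figures, topic A accounts for 16/19 of the description, and topic C is the topic most strongly associated with topic A.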
The recommending module 42 recommends to the user target products associated with the topics of the product description, based on the distribution of the product description over topics and on the relationships between topics and the probability distribution between products and topics in the correlated topic model.
Preferably, the recommending module 42 recommends target products associated with the topics of the product description in one or a combination of the following ways:
(1) Based on the distribution of the product description over topics, obtain at least one target topic contained in the product description; according to the relationships between topics in the correlated topic model, determine the topic with the highest degree of association with each of the at least one target topic; and, according to the probability distribution of products and topics in the correlated topic model, take the products ranked within a preset number of top positions by proportion of the determined topic as part of the target products.
For example, the description of a wealth-management product input by the user contains two topics, high yield and short term. The topic most associated with the high-yield topic is an annualized return of 5% or more, and the topic most associated with the short-term topic is withdrawal at any time. The topic of an annualized return of 5% or more has the highest proportion in wealth-management products A and C, and the short-term topic has the highest proportion in wealth-management products A and D, so wealth-management products A, C, and D are the target products. In this way, the products most associated with each topic in the product description are recommended to the user, achieving personalized product recommendation.
(2) Based on the distribution of the product description over topics, obtain the topic with the highest proportion in the product description; according to the relationships between topics in the correlated topic model, determine the target topic with the highest degree of association with that highest-proportion topic; and, according to the probability distribution of products and topics in the correlated topic model, take the products ranked within a preset number of top positions by proportion of the target topic as part of the target products.
For example, the description of a wealth-management product input by the user contains two topics, high yield and short term, of which high yield has the highest proportion. The topic most associated with the high-yield topic is an annualized return of 5% or more, which has the highest proportion in wealth-management products A and C, so wealth-management products A and C are the target products.
(3) Based on the distribution of the product description over topics, obtain at least one target topic contained in the product description; according to the probability distribution of products and topics in the correlated topic model, determine the products that contain the at least one target topic; and take the determined products as part of the target products.
(4) Based on the distribution of the product description over topics, obtain at least one target topic contained in the product description; according to the relationships between topics in the correlated topic model, determine a first topic associated with the at least one target topic, and then determine a second topic associated only with the first topic; and, according to the probability distribution of products and topics in the correlated topic model, take the products ranked within a preset number of top positions by proportion of the second topic as part of the target products. This captures the indirect relationships between topics, so that indirectly but strongly associated topics are found and personalized products are recommended to the user.
For example, the product description contains topic A. In the correlated topic model, topic C is related to topic A, and topic D is associated only with topic C, which indicates that topic D is strongly associated with topic C. Therefore, the products ranked within the preset number of top positions by proportion of topic D are taken as part of the target products.
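Recommendation ways (1) and (4) can be sketched as follows. This is a minimal Python illustration under stated assumptions: topic associations are held in a nested dictionary, the product-topic distribution in a dictionary of per-product topic proportions, and two topics count as "associated" when their degree of association reaches a hypothetical `threshold`. All names and sample figures are invented for illustration.

```python
def top_products(product_topics, topic, k):
    """Products ranked by their proportion of `topic`, best k first."""
    ranked = sorted(product_topics,
                    key=lambda p: product_topics[p].get(topic, 0.0),
                    reverse=True)
    return ranked[:k]


def linked(assoc, topic, threshold):
    """Topics whose association with `topic` is at least `threshold`."""
    return sorted(t for t, s in assoc.get(topic, {}).items() if s >= threshold)


def recommend_by_association(targets, assoc, product_topics, k=2):
    """Way (1): top-k products of the topic most associated with each target."""
    picked = []
    for topic in targets:
        related = max(assoc[topic], key=assoc[topic].get)
        for product in top_products(product_topics, related, k):
            if product not in picked:
                picked.append(product)
    return picked


def recommend_indirect(targets, assoc, product_topics, k=2, threshold=0.5):
    """Way (4): target -> first topic -> second topic linked only to the first."""
    picked = []
    for topic in targets:
        for first in linked(assoc, topic, threshold):
            for second in linked(assoc, first, threshold):
                only_first = linked(assoc, second, threshold) == [first]
                if second != topic and only_first:
                    for product in top_products(product_topics, second, k):
                        if product not in picked:
                            picked.append(product)
    return picked


# Invented sample data loosely following the wealth-management example.
assoc = {"high_yield": {"return_5pct": 0.9, "anytime": 0.1},
         "short_term": {"return_5pct": 0.2, "anytime": 0.8}}
product_topics = {"A": {"return_5pct": 0.5, "anytime": 0.4},
                  "B": {"return_5pct": 0.1, "anytime": 0.1},
                  "C": {"return_5pct": 0.4},
                  "D": {"anytime": 0.5}}
targets = recommend_by_association(["high_yield", "short_term"],
                                   assoc, product_topics)
```

With this sample data the result is products A, C, and D, matching the outcome of the example for way (1). In `recommend_indirect`, the target topic itself is excluded from the candidate second topics, since it is trivially linked to the first topic.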
Preferably, the display module 43 displays by category the products associated with the topics in the product description, together with the way in which each category of products is recommended. For example, the product category most associated with topic A, the product category most associated with topic C, and so on. The user can thus see at a glance which products are associated with the topics of interest and can make a personalized selection from the recommended products.
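One simple way to prepare such a categorized display is to group each recommended product under the topic in which it has the largest proportion. The Python sketch below assumes this grouping rule and a dictionary of per-product topic proportions; both are illustrative assumptions, since the text does not fix how categories are formed.

```python
from collections import defaultdict


def group_by_topic(recommended, product_topics):
    """Group each recommended product under the topic it has the largest share of."""
    groups = defaultdict(list)
    for product in recommended:
        shares = product_topics[product]
        groups[max(shares, key=shares.get)].append(product)
    return dict(groups)
```

The returned dictionary maps each topic to the list of products to show under it, in the order in which the products were recommended.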
Preferably, the recommending module 42 is further configured to: obtain the product that the user selects from the recommended target products, determine the topics contained in the selected product, and take the products ranked within a preset number of top positions by proportion of those topics as part of the target products. Recommendations can thus take into account the products the user is interested in, fit the user's needs more closely, and achieve personalized product recommendation.
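The feedback step above can be sketched in Python as follows. The `min_share` cut-off used to decide which topics a selected product "contains" is a hypothetical parameter, as are the function and argument names.

```python
def expand_with_feedback(chosen, product_topics, recommended, k=2, min_share=0.3):
    """Add the top-k products for each topic that the chosen product contains."""
    result = list(recommended)
    topics = [t for t, share in product_topics[chosen].items() if share >= min_share]
    for topic in topics:
        ranked = sorted(product_topics,
                        key=lambda p: product_topics[p].get(topic, 0.0),
                        reverse=True)
        for product in ranked[:k]:
            if product not in result:
                result.append(product)
    return result
```

Products already in the recommendation list are not added a second time, so the list grows only with products newly surfaced by the user's selection.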
In the above embodiments, the correlated topic model makes it possible to find products whose content is dissimilar but whose topics are related, so that products closely related by topic are recommended. This avoids searching only for products with similar content and improves accuracy, achieving more accurate product recommendation.
Through the above embodiments, the present invention provides a document topic parameter extraction method: a correlated topic model is trained on a document training set, and the trained model yields the distribution of the target document over topics, the relationship distribution between any two of the multiple topics, and the distribution between products and topics. An input product description is obtained and processed to obtain the distribution of the product description over topics, together with the relationships between topics and the probability distribution between products and topics in the correlated topic model. Based on the correlated topic model, the present invention can find products whose content is dissimilar but whose topics are related, so that products closely related by topic are recommended. This avoids searching only for products with similar content and improves accuracy, achieving more accurate product recommendation.
The above integrated unit, when implemented in the form of a software program module, may be stored in a computer-readable storage medium. The software program module is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute some of the steps of the methods of the embodiments of the present invention.
As shown in FIG. 5, the electronic device 5 includes at least one sending device 51, at least one memory 52, at least one processor 53, at least one receiving device 54, and at least one communication bus. The communication bus implements connection and communication among these components.
The electronic device 5 is a device that can automatically perform numerical computation and/or information processing according to preset or stored instructions. Its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like. The electronic device 5 may also include a network device and/or user equipment. The network device includes, but is not limited to, a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing, where cloud computing is a form of distributed computing: a super virtual computer composed of a group of loosely coupled computers.
The electronic device 5 may be, but is not limited to, any electronic product capable of human-computer interaction with a user via a keyboard, touchpad, voice-control device, or the like, for example a terminal such as a tablet computer, smartphone, personal digital assistant (PDA), smart wearable device, camera device, or monitoring device.
The network in which the electronic device 5 is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a virtual private network (VPN), and the like.
The receiving device 54 and the sending device 51 may be wired transmission ports or wireless devices, for example including antenna devices, for data communication with other devices.
The memory 52 stores program code. The memory 52 may be a circuit with a storage function that has no physical form within an integrated circuit, such as RAM (Random-Access Memory) or FIFO (First In First Out). Alternatively, the memory 52 may be a memory with physical form, such as a memory module, TF card (Trans-flash Card), smart media card, secure digital card, flash card, or other storage device.
The processor 53 may include one or more microprocessors or digital processors. The processor 53 can call the program code stored in the memory 52 to perform the relevant functions. For example, the modules described in FIG. 3 are program code stored in the memory 52 and executed by the processor 53 to implement a document topic parameter extraction method; and/or the modules described in FIG. 4 are program code stored in the memory 52 and executed by the processor 53 to implement a product recommendation method. The processor 53, also called a central processing unit (CPU), is a very-large-scale integrated circuit that serves as the computing core and the control unit.
An embodiment of the present invention also provides a computer-readable storage medium storing computer instructions that, when executed by an electronic device including one or more processors, cause the electronic device to execute the document topic parameter extraction method and/or the product recommendation method described in the method embodiments above.
With reference to FIG. 1, the memory 52 in the electronic device 5 stores a plurality of instructions to implement a document topic parameter extraction method, and the processor 53 can execute the plurality of instructions to:
Preprocess a target document to obtain the word set of the target document; and input the target document into a trained correlated topic model CTM to obtain the distribution of the target document over topics, the relationship distribution between any two of the multiple topics, and the distribution between products and topics, where the trained correlated topic model is obtained by training on a document sample set and includes multiple topics.
In an alternative embodiment of the present invention, the plurality of instructions executable by the processor 53 further include:
Removing the special words in the target document to obtain a processed document;
Segmenting the processed document into words to obtain an n-gram set.
In an alternative embodiment of the present invention, the plurality of instructions executable by the processor 53 further include:
Removing from the n-gram set the high-frequency n-grams whose occurrence counts in the text corpus rank within a preset number of top positions and the low-frequency n-grams that occur fewer than a preset number of times, and determining the processed n-gram set as the word set of the target document.
The plurality of instructions corresponding to the document topic parameter extraction method described in any embodiment are stored in the memory 52 and executed by the processor 53; the details are not repeated here.
With reference to FIG. 2, the memory 52 in the electronic device 5 stores a plurality of instructions to implement a product recommendation method, and the processor 53 can execute the plurality of instructions to:
Obtain an input product description and use the obtained product description as the target document; process the product description using the document topic parameter extraction method described in any embodiment to obtain the distribution of the product description over topics, together with the relationships between topics and the probability distribution between products and topics in the correlated topic model; and, based on the distribution of the product description over topics and on the relationships between topics and the probability distribution between products and topics in the correlated topic model, recommend to the user target products associated with the topics of the product description.
In an alternative embodiment of the present invention, the plurality of instructions executable by the processor 53 further include:
Based on the distribution of the product description over topics, obtaining at least one target topic contained in the product description; according to the relationships between topics in the correlated topic model, determining the topic with the highest degree of association with each of the at least one target topic; and, according to the probability distribution of products and topics in the correlated topic model, taking the products ranked within a preset number of top positions by proportion of the determined topic as part of the target products;
Based on the distribution of the product description over topics, obtaining the topic with the highest proportion in the product description; according to the relationships between topics in the correlated topic model, determining the target topic with the highest degree of association with that highest-proportion topic; and, according to the probability distribution of products and topics in the correlated topic model, taking the products ranked within a preset number of top positions by proportion of the target topic as part of the target products;
Based on the distribution of the product description over topics, obtaining at least one target topic contained in the product description; according to the probability distribution of products and topics in the correlated topic model, determining the products that contain the at least one target topic; and taking the determined products as part of the target products.
In an alternative embodiment of the present invention, the plurality of instructions executable by the processor 53 further include:
Based on the distribution of the product description over topics, obtaining at least one target topic contained in the product description; according to the relationships between topics in the correlated topic model, determining a first topic associated with the at least one target topic, and then determining a second topic associated with the first topic; and, according to the probability distribution of products and topics in the correlated topic model, taking the products ranked within a preset number of top positions by proportion of the second topic as part of the target products.
In an alternative embodiment of the present invention, the plurality of instructions executable by the processor 53 further include: displaying by category the products associated with the topics in the product description, together with the way in which each category of products is recommended.
In an alternative embodiment of the present invention, the plurality of instructions executable by the processor 53 further include: obtaining the product that the user selects from the recommended target products, determining the topics contained in the selected product, and taking the products ranked within a preset number of top positions by proportion of those topics as part of the target products.
The characteristic means of the present invention described above may be implemented by an integrated circuit, which controls the implementation of the functions of the document topic parameter extraction method described in any of the above embodiments. That is, the integrated circuit of the present invention is installed in the electronic device and causes the electronic device to perform the following functions: preprocessing a target document to obtain the word set of the target document; and inputting the target document into a trained correlated topic model CTM to obtain the distribution of the target document over topics, the relationship distribution between any two of the multiple topics, and the distribution between products and topics, where the trained correlated topic model is obtained by training on a document sample set and includes multiple topics.
All functions achievable by the document topic parameter extraction method described in any embodiment can be provided to the electronic device by installing the integrated circuit of the present invention in it; the details are not repeated here.
The characteristic means of the present invention described above may be implemented by an integrated circuit, which controls the implementation of the functions of the product recommendation method described in any of the above embodiments. That is, the integrated circuit of the present invention is installed in the electronic device and causes the electronic device to perform the following functions: obtaining an input product description and using the obtained product description as the target document; processing the product description using the document topic parameter extraction method described in any embodiment to obtain the distribution of the product description over topics, together with the relationships between topics and the probability distribution between products and topics in the correlated topic model; and, based on the distribution of the product description over topics and on the relationships between topics and the probability distribution between products and topics in the correlated topic model, recommending to the user target products associated with the topics of the product description.
All functions achievable by the product recommendation method described in any embodiment can be provided to the electronic device by installing the integrated circuit of the present invention in it; the details are not repeated here.
It should be noted that for each method embodiment above-mentioned, for simple description, therefore it is all expressed as a series of Combination of actions, but those skilled in the art should understand that, the present invention is not limited by the described action sequence because According to the present invention, certain steps can be performed in other orders or simultaneously.Secondly, those skilled in the art should also know It knows, embodiment described in this description belongs to preferred embodiment, and involved action and module are not necessarily of the invention It is necessary.
In the above-described embodiments, it all emphasizes particularly on different fields to the description of each embodiment, there is no the portion being described in detail in some embodiment Point, it may refer to the associated description of other embodiment.
In several embodiments provided herein, it should be understood that disclosed device, it can be by another way It realizes.For example, the apparatus embodiments described above are merely exemplary, for example, the unit division, it is only a kind of Division of logic function, formula that in actual implementation, there may be another division manner, such as multiple units or component can combine or can To be integrated into another system, or some features can be ignored or not executed.Another point, shown or discussed is mutual Coupling, direct-coupling or communication connection can be by some interfaces, the INDIRECT COUPLING or communication connection of device or unit, Can be electrical or other forms.
The unit illustrated as separating component may or may not be physically separated, aobvious as unit The component shown may or may not be physical unit, you can be located at a place, or may be distributed over multiple In network element.Some or all of unit therein can be selected according to the actual needs to realize the mesh of this embodiment scheme 's.
In addition, each functional unit in various embodiments of the present invention can be integrated in a processing unit, also may be used It, can also be during two or more units be integrated in one unit to be that each unit physically exists alone.It is above-mentioned integrated The form that hardware had both may be used in unit is realized, can also be realized in the form of SFU software functional unit.
If the integrated unit is realized in the form of SFU software functional unit and sells or use as independent product When, it can be stored in a computer read/write memory medium.Based on this understanding, technical scheme of the present invention is substantially The all or part of the part that contributes to existing technology or the technical solution can be in the form of software products in other words It embodies, which is stored in a storage medium, including some instructions are used so that a computer Equipment (can be personal computer, server or network equipment etc.) execute each embodiment the method for the present invention whole or Part steps.And storage medium above-mentioned includes:USB flash disk, read-only memory (ROM, Read-Only Memory), arbitrary access are deposited Reservoir (RAM, Random Access Memory), mobile hard disk, magnetic disc or CD etc. are various can to store program code Medium.
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A document topic parameter extraction method, characterized in that the method comprises:
preprocessing a target document to obtain a word set of the target document;
inputting the target document into a trained correlated topic model (CTM) to obtain the distribution of the target document over topics, the relationship between any two of a plurality of topics, and the distribution between products and topics, wherein the trained correlated topic model is obtained by training on a document sample set and comprises a plurality of topics.
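For orientation only, the quantities named in claim 1 can be related to the standard formulation of a correlated topic model. The claim itself gives no formulas, so the notation below is an assumption: $\eta_d$ is the latent topic vector of document $d$, $\mu$ and $\Sigma$ are the mean and covariance of a $K$-dimensional Gaussian, $\theta_{d,k}$ is the share of topic $k$ in document $d$, and $\rho_{ij}$ is one possible measure of the relationship between topics $i$ and $j$.

```latex
\eta_d \sim \mathcal{N}(\mu, \Sigma), \qquad
\theta_{d,k} = \frac{\exp(\eta_{d,k})}{\sum_{j=1}^{K} \exp(\eta_{d,j})}, \qquad
\rho_{ij} = \frac{\Sigma_{ij}}{\sqrt{\Sigma_{ii}\,\Sigma_{jj}}}
```

Under this reading, the distribution of the document over topics corresponds to $\theta_d$, and the pairwise topic relationship can be read off the covariance $\Sigma$. The distribution between products and topics is specific to this patent and is not formalized here.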
2. The document topic parameter extraction method according to claim 1, characterized in that preprocessing the target document to obtain the word set of the target document comprises:
removing special words from the target document to obtain a processed document;
segmenting the processed document into words to obtain a tuple set.
3. The document topic parameter extraction method according to claim 2, characterized in that the method further comprises:
removing from the tuple set the high-frequency tuples whose occurrence counts in a text corpus rank within a preset number of top positions and the low-frequency tuples whose occurrence counts are below a preset number, and determining the processed tuple set as the word set of the target document.
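The preprocessing steps of claims 2 and 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: "special words" are approximated by punctuation, tuples are treated as whitespace-separated tokens (Chinese text would need a word segmenter instead), and the names `preprocess`, `corpus_counts`, `top_n`, and `min_count` are hypothetical.

```python
import re
from collections import Counter


def preprocess(document, corpus_counts, top_n=2, min_count=2):
    """Return the word set of `document` as a list of tokens.

    corpus_counts: Counter of token occurrences over the whole text corpus.
    top_n:         how many of the most frequent corpus tokens to drop (assumed preset).
    min_count:     tokens seen fewer than this many times are dropped (assumed preset).
    """
    # Claim 2, step 1: remove special words (approximated here by punctuation).
    cleaned = re.sub(r"[^\w\s]", " ", document.lower())
    # Claim 2, step 2: segment the processed document into tuples (tokens).
    tokens = cleaned.split()
    # Claim 3: drop the top-ranked high-frequency tokens and the low-frequency tokens.
    high_freq = {tok for tok, _ in corpus_counts.most_common(top_n)}
    return [t for t in tokens
            if t not in high_freq and corpus_counts[t] >= min_count]


# Usage with a toy corpus count.
counts = Counter({"the": 10, "a": 8, "loan": 5, "insurance": 4, "rare": 1})
print(preprocess("The loan, a rare insurance!", counts))
```

Filtering both ends of the frequency range keeps tokens that are neither too common to discriminate between topics nor too rare to estimate reliably.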
4. A product recommendation method, characterized in that the method comprises:
obtaining an input product description and taking the obtained product description as the target document;
processing the product description using the document topic parameter extraction method according to any one of claims 1 to 3 to obtain the distribution of the product description over topics, the relationships between topics in the correlated topic model, and the probability distribution between products and topics;
recommending to a user, based on the distribution of the product description over topics, the relationships between topics in the correlated topic model, and the probability distribution between products and topics, target products associated with the topics of the product description.
5. The product recommendation method according to claim 4, characterized in that recommending to the user, based on the distribution of the product description over topics and the relationships between the topics of products, target products associated with the topics of the product description comprises one or a combination of the following:
obtaining, based on the distribution of the product description over topics, at least one target topic contained in the product description; determining, according to the relationships between topics in the correlated topic model, the topic with the highest degree of association with each of the at least one target topic; and determining, according to the probability distribution of products and topics in the correlated topic model, the products in which the share of the determined topic ranks within a preset number of top positions as part of the target products;
obtaining, based on the distribution of the product description over topics, the topic with the highest share in the product description; determining, according to the relationships between topics in the correlated topic model, the target topic with the highest degree of association with that highest-share topic; and determining, according to the probability distribution of products and topics in the correlated topic model, the products in which the share of the target topic ranks within a preset number of top positions as part of the target products;
obtaining, based on the distribution of the product description over topics, at least one target topic contained in the product description; determining, according to the probability distribution of products and topics in the correlated topic model, the products that contain the at least one target topic; and taking the determined products as part of the target products.
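The three alternatives of claim 5 can be sketched in a few lines. This is an illustrative reading under assumptions, not the patent's implementation: topic relationships are held in a hypothetical square matrix `topic_corr`, product-topic probabilities in a dictionary `product_topic`, and the cut-offs `threshold`, `min_share`, and `n` stand in for the unspecified preset values.

```python
def target_topics(doc_topics, threshold=0.15):
    """Topics whose share in the product description reaches an assumed threshold."""
    return [k for k, w in enumerate(doc_topics) if w >= threshold]


def most_related_topic(topic, topic_corr):
    """Topic with the highest association to `topic`, excluding `topic` itself."""
    candidates = [k for k in range(len(topic_corr)) if k != topic]
    return max(candidates, key=lambda k: topic_corr[topic][k])


def top_products(topic, product_topic, n):
    """The n products in which `topic` has the largest share."""
    ranked = sorted(product_topic, key=lambda p: product_topic[p][topic], reverse=True)
    return ranked[:n]


def recommend_via_each_target(doc_topics, topic_corr, product_topic, n=2, threshold=0.15):
    """First alternative: most related topic of every target topic, then its top products."""
    result = []
    for t in target_topics(doc_topics, threshold):
        related = most_related_topic(t, topic_corr)
        for p in top_products(related, product_topic, n):
            if p not in result:
                result.append(p)
    return result


def recommend_via_dominant_topic(doc_topics, topic_corr, product_topic, n=2):
    """Second alternative: start from the single highest-share topic of the description."""
    dominant = max(range(len(doc_topics)), key=lambda k: doc_topics[k])
    return top_products(most_related_topic(dominant, topic_corr), product_topic, n)


def recommend_direct(doc_topics, product_topic, threshold=0.15, min_share=0.3):
    """Third alternative: products that themselves contain a target topic."""
    targets = target_topics(doc_topics, threshold)
    return [p for p, shares in product_topic.items()
            if any(shares[t] >= min_share for t in targets)]
```

The first two alternatives surface products from a neighbouring topic rather than the description's own topics, while the third stays within the description's topics; the claim allows them to be combined.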
6. The product recommendation method according to claim 4, characterized in that recommending to the user, based on the distribution of the product description over topics and the relationships between the topics of products, target products associated with the topics of the product description further comprises:
obtaining, based on the distribution of the product description over topics, at least one target topic contained in the product description; determining, according to the relationships between topics in the correlated topic model, a first topic associated with the at least one target topic; then determining a second topic associated only with the first topic; and determining, according to the probability distribution of products and topics in the correlated topic model, the products in which the share of the second topic ranks within a preset number of top positions as part of the target products.
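The two-step topic expansion of claim 6 might look like the sketch below. The interpretation is an assumption: a topic counts as "associated" when its entry in a hypothetical association matrix `topic_corr` reaches an assumed cut-off `min_corr`, and a second topic is one linked to a first topic but not itself a target or first topic. All function and parameter names are illustrative.

```python
def associated(topic, topic_corr, min_corr):
    """Topics whose association with `topic` reaches the assumed cut-off."""
    return {k for k in range(len(topic_corr))
            if k != topic and topic_corr[topic][k] >= min_corr}


def second_hop_topics(targets, topic_corr, min_corr=0.5):
    """Second topics: linked to a first topic, but neither a target nor a first topic."""
    first = set()
    for t in targets:
        first |= associated(t, topic_corr, min_corr)
    first -= set(targets)
    second = set()
    for f in first:
        for s in associated(f, topic_corr, min_corr):
            if s not in targets and s not in first:
                second.add(s)
    return sorted(second)


def recommend_second_hop(targets, topic_corr, product_topic, n=1, min_corr=0.5):
    """Top-n products (by topic share) for every second topic, without duplicates."""
    result = []
    for s in second_hop_topics(targets, topic_corr, min_corr):
        ranked = sorted(product_topic, key=lambda p: product_topic[p][s], reverse=True)
        for p in ranked[:n]:
            if p not in result:
                result.append(p)
    return result
```

Compared with the single-step alternatives of claim 5, this reaches products that are only indirectly related to the description, which may broaden the recommendation.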
7. The product recommendation method according to claim 4, characterized in that the method further comprises: displaying by category the products associated with the topics in the product description, and displaying the manner in which each category of products is recommended.
8. The product recommendation method according to claim 4, characterized in that the method further comprises: obtaining a product selected by the user from the recommended target products, determining the topics contained in the selected product, and taking the products in which the share of the topics contained in the selected product ranks within a preset number of top positions as part of the target products.
9. An electronic device, characterized in that the electronic device comprises a memory and a processor, the memory being configured to store at least one instruction and the processor being configured to execute the at least one instruction to implement the document topic parameter extraction method according to any one of claims 1 to 3 and/or the product recommendation method according to any one of claims 4 to 8.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores at least one instruction which, when executed by a processor, implements the document topic parameter extraction method according to any one of claims 1 to 3 and/or the product recommendation method according to any one of claims 4 to 8.
CN201810287788.7A 2018-04-03 2018-04-03 Document theme parameter extraction method, product recommendation method, device and storage medium Active CN108763258B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810287788.7A CN108763258B (en) 2018-04-03 2018-04-03 Document theme parameter extraction method, product recommendation method, device and storage medium
PCT/CN2018/100312 WO2019192122A1 (en) 2018-04-03 2018-08-14 Document topic parameter extraction method, product recommendation method and device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810287788.7A CN108763258B (en) 2018-04-03 2018-04-03 Document theme parameter extraction method, product recommendation method, device and storage medium

Publications (2)

Publication Number Publication Date
CN108763258A true CN108763258A (en) 2018-11-06
CN108763258B CN108763258B (en) 2023-01-10

Family

ID=63980754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810287788.7A Active CN108763258B (en) 2018-04-03 2018-04-03 Document theme parameter extraction method, product recommendation method, device and storage medium

Country Status (2)

Country Link
CN (1) CN108763258B (en)
WO (1) WO2019192122A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113763084A (en) * 2020-09-21 2021-12-07 北京沃东天骏信息技术有限公司 Product recommendation processing method, device, equipment and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113538020B (en) * 2021-07-05 2024-03-26 深圳索信达数据技术有限公司 Method and device for acquiring association degree of group of people features, storage medium and electronic device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101226557A (en) * 2008-02-22 2008-07-23 中国科学院软件研究所 Method and system for processing efficient relating subject model data
US20140344103A1 (en) * 2013-05-20 2014-11-20 TCL Research America Inc. System and method for personalized video recommendation based on user interests modeling
CN105389377A (en) * 2015-11-18 2016-03-09 清华大学 Topic mining based event cluster acquisition method
CN105426514A (en) * 2015-11-30 2016-03-23 扬州大学 Personalized mobile APP recommendation method
CN107220232A (en) * 2017-04-06 2017-09-29 北京百度网讯科技有限公司 Keyword extracting method and device, equipment and computer-readable recording medium based on artificial intelligence

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104679778B (en) * 2013-11-29 2019-03-26 腾讯科技(深圳)有限公司 A kind of generation method and device of search result
US9817904B2 (en) * 2014-12-19 2017-11-14 TCL Research America Inc. Method and system for generating augmented product specifications
CN107730346A (en) * 2017-09-25 2018-02-23 北京京东尚科信息技术有限公司 The method and apparatus of article cluster

Also Published As

Publication number Publication date
WO2019192122A1 (en) 2019-10-10
CN108763258B (en) 2023-01-10

Similar Documents

Publication Publication Date Title
CN108334533B (en) Keyword extraction method and device, storage medium and electronic device
US11093854B2 (en) Emoji recommendation method and device thereof
CN108363790A (en) For the method, apparatus, equipment and storage medium to being assessed
CN108829822A (en) The recommended method and device of media content, storage medium, electronic device
EP2581843B1 (en) Bigram Suggestions
CN108388660B (en) Improved E-commerce product pain point analysis method
CN103605691B (en) Device and method used for processing issued contents in social network
CN108304373A (en) Construction method, device, storage medium and the electronic device of semantic dictionary
CN108733675B (en) Emotion evaluation method and device based on large amount of sample data
CN102609424B (en) Method and equipment for extracting assessment information
CN108733644A (en) A kind of text emotion analysis method, computer readable storage medium and terminal device
CN111666757A (en) Commodity comment emotional tendency analysis method, device and equipment and readable storage medium
CN109902157A (en) A kind of training sample validation checking method and device
CN105159927B (en) Method and device for selecting subject term of target text and terminal
CN114443847A (en) Text classification method, text processing method, text classification device, text processing device, computer equipment and storage medium
CN109522275B (en) Label mining method based on user production content, electronic device and storage medium
CN108763258A (en) Document subject matter parameter extracting method, Products Show method, equipment and storage medium
CN107357782A (en) One kind identification user's property method for distinguishing and terminal
CN111737607B (en) Data processing method, device, electronic equipment and storage medium
CN111460808B (en) Synonymous text recognition and content recommendation method and device and electronic equipment
CN109033241A (en) News recommended method, device and electronic equipment
CN107291686B (en) Method and system for identifying emotion identification
CN108288172A (en) Advertisement DSP orientations launch the method and terminal of advertisement
CN107944589A (en) The Forecasting Methodology and prediction meanss of ad click rate
CN109033078B (en) The recognition methods of sentence classification and device, storage medium, processor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant