CN110020109A - Method and device for information popularization - Google Patents

Method and device for information popularization

Info

Publication number
CN110020109A
Authority
CN
China
Prior art keywords
data
vector
user
group
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710817091.1A
Other languages
Chinese (zh)
Other versions
CN110020109B (en)
Inventor
邵佳帅
陈海勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201710817091.1A
Publication of CN110020109A
Application granted
Publication of CN110020109B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G06F16/337 Profile generation, learning or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Databases & Information Systems (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a method and device for information popularization, relating to the field of computer information processing. The method comprises: constructing group vector data from basic data; generating user vector data according to the historical data of a user and the group vectors; calculating similarity data between the user vector data and the group vector data; and carrying out information popularization to the user according to the similarity data. The method and device disclosed in the application can identify user groups with similar interests and thereby make accurate commodity recommendations and improve the customer experience.

Description

Method and device for information popularization
Technical field
The present invention relates to the field of computer information processing, and in particular to a method and device for information popularization.
Background art
With the growth and spread of online shopping, competition between shopping websites has become fiercer. To survive over the long term, an enterprise must first attract users and then manage them so that they become loyal users. Managing users well is a hard problem. Now that user behavior data are recorded and data mining algorithms have matured, an enterprise can manage users by a variety of methods, of which the most common and most central is precision marketing: recommending the right commodity to the right person at the right time. Precision marketing to users, or a supplier's need to sell its commodities to the right people, must be realized through user portraits. User interest similarity measures how similar users are in their interest in purchasing a certain category or brand. By identifying specific interests, the commodities that target users like can easily be recommended to them in a targeted way. For example, when recommending clothes, if the user's interest group is "street corner", products of that style can be weighted so as to obtain a better recommendation effect and increase website sales.
Existing technical solutions carry out information popularization according to user interest. For example, operations staff may promote articles similar to those a user has browsed, or commodities similar to those a user has bought. Because this approach is limited to data the user has already browsed, the information promoted is small in quantity and narrow in range, and the expected effect is not achieved.
Therefore, a new method and device for information popularization are needed.
The above information is only used to enhance understanding of the background of the invention, and it may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary of the invention
In view of this, the present invention provides a method and device for information popularization that can identify user groups with similar interests and thereby make accurate commodity recommendations and improve the customer experience.
Other features and advantages of the invention will become apparent from the following detailed description, or will be learned in part through practice of the invention.
According to one aspect of the invention, a method for information popularization is proposed. The method comprises: constructing group vector data from basic data; generating user vector data according to the historical data of a user and the group vectors; calculating the similarity between the user vector data and the group vector data to generate similarity data; and carrying out information popularization to the user according to the similarity data.
In an exemplary embodiment of the disclosure, constructing the group vectors from basic data comprises: performing data processing on the basic data to obtain processed data; performing data mining on the processed data to obtain interest tag data; and generating the group vector data from the interest tag data by means of sparse vectors.
In an exemplary embodiment of the disclosure, performing data processing on the basic data to obtain processed data comprises: extracting, from the basic data, stock keeping unit (SKU) data containing a predetermined tag; performing word segmentation on the SKU data within a predetermined time period to obtain segmentation data; and performing reverse statistics on the segmentation data to obtain the processed data.
In an exemplary embodiment of the disclosure, performing data mining on the processed data to obtain interest tag data comprises: ranking the processed data by word frequency to generate first ranking data; ranking the processed data by inverse document frequency to generate second ranking data; and processing the processed data using the first ranking data and the second ranking data to obtain the interest tag data, where the interest tag data comprise multiple classification data and each classification data comprises multiple sub-classification data.
In an exemplary embodiment of the disclosure, generating the group vector data from the interest tag data by means of sparse vectors comprises: numbering the sub-classification data in each classification data of the interest tag data to generate number data; and expressing the classification data as vectors using the sparse vectors and the number data to generate the group vector data.
In an exemplary embodiment of the disclosure, numbering the sub-classification data in each classification data of the interest tag data to generate number data further comprises: numbering the sub-classification data in each classification data of the interest tag data by means of the word2vec algorithm to generate the number data.
In an exemplary embodiment of the disclosure, generating user vector data according to the historical data of the user and the group vectors comprises: performing word segmentation on the historical data of the user; and generating the user vector data from the number data and the segmented historical data.
In an exemplary embodiment of the disclosure, calculating the similarity data between the user vector data and the group vector data comprises: calculating the similarity between the user vector data and the group vector data by the cosine angle algorithm to generate the similarity data.
In an exemplary embodiment of the disclosure, carrying out information popularization to the user according to the similarity data comprises: sorting the similarity data in order; and carrying out information popularization to predetermined users according to the similarity ranking.
In an exemplary embodiment of the disclosure, carrying out information popularization to the user according to the similarity data further comprises: sorting the similarity data in order by category within the classification data; and carrying out information popularization to predetermined users by category according to the similarity ranking.
According to one aspect of the invention, a device for information popularization is proposed. The device comprises: a basic data module for constructing group vector data from basic data; a vector data module for generating user vector data according to the historical data of a user and the group vectors; a similarity module for calculating the similarity between the user vector data and the group vector data to generate similarity data; and a promotion module for carrying out information popularization to the user according to the similarity data.
According to one aspect of the invention, an electronic device is proposed. The electronic device comprises: one or more processors; and a storage device for storing one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method described above.
According to one aspect of the invention, a computer-readable medium is proposed, on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method described above.
The method and device for information popularization according to the invention can identify user groups with similar interests and thereby make accurate commodity recommendations and improve the customer experience.
It should be understood that the above general description and the following detailed description are merely exemplary and do not limit the invention.
Brief description of the drawings
The above and other objects, features, and advantages of the invention will become more apparent from the detailed description of its example embodiments with reference to the accompanying drawings. The drawings described below are only some embodiments of the invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a system architecture for a method for information popularization according to an exemplary embodiment.
Fig. 2 is a flow chart of a method for information popularization according to an exemplary embodiment.
Fig. 3 is a flow chart of a method for information popularization according to another exemplary embodiment.
Fig. 4 is a schematic diagram of a method for information popularization according to an exemplary embodiment.
Fig. 5 is a schematic diagram of a method for information popularization according to another exemplary embodiment.
Fig. 6 is a schematic diagram of a method for information popularization according to another exemplary embodiment.
Fig. 7 is a flow chart of a method for information popularization according to another exemplary embodiment.
Fig. 8 is a block diagram of a device for information popularization according to an exemplary embodiment.
Fig. 9 is a block diagram of an electronic device according to an exemplary embodiment.
Fig. 10 is a schematic diagram of a computer-readable medium according to an exemplary embodiment.
Detailed description
Example embodiments will now be described more fully with reference to the drawings. However, the example embodiments can be implemented in a variety of forms and should not be understood as limited to the embodiments set forth herein; rather, these embodiments are provided so that the invention will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art. Identical reference numerals denote the same or similar parts in the figures, and repeated descriptions of them are omitted.
In addition, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are provided to give a full understanding of the embodiments of the invention. However, those skilled in the art will appreciate that the technical solution of the invention may be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so on. In other cases, well-known methods, devices, implementations, or operations are not shown or described in detail so as not to obscure aspects of the invention.
The block diagrams shown in the drawings are only functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flow charts shown in the drawings are merely illustrative. They need not include all of the content and operations/steps, and the steps need not be executed in the order described. For example, some operations/steps may be decomposed and others may be merged in whole or in part, so the actual order of execution may change according to the actual situation.
It should be understood that although the terms first, second, third, and so on may be used herein to describe various components, these components should not be limited by these terms. The terms are used only to distinguish one component from another. Therefore, a first component discussed below could be called a second component without departing from the teaching of the disclosed concept. As used herein, the term "and/or" includes any one of the associated listed items and all combinations of one or more of them.
Those skilled in the art will understand that the drawings are schematic diagrams of example embodiments, and that the modules or processes in the drawings are not necessarily required to implement the invention and therefore cannot be used to limit its scope of protection.
Example embodiments of the disclosure are described in detail below with reference to the drawings.
Fig. 1 is a system architecture for a method for information popularization according to an exemplary embodiment.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 in order to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, and social platform software.
The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers.
The server 105 may be a server that provides various services, for example a back-end management server that supports the shopping websites users browse with the terminal devices 101, 102, 103. The back-end management server may analyze and otherwise process received data such as product information query requests, and feed the processing results (such as pushed information or product information) back to the terminal devices.
It should be noted that the method for generating promotion messages provided by the embodiments of the application is generally executed by the server 105; correspondingly, the web page displaying the pushed message is generally located on the client (terminal device 101).
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are only schematic. There may be any number of terminal devices, networks, and servers according to implementation needs.
Fig. 2 is a flow chart of a method for information popularization according to an exemplary embodiment.
As shown in Fig. 2, in S202, group vector data are constructed from basic data. The basic data may be, for example, the data of all products; for example, SKUs (stock keeping units) whose commodity titles contain a style name may be extracted from the commodities of the clothing category in the database. Suppose that word segmentation of the basic data yields 100 user interests, for example 100 clothing styles. Data mining may then yield the combined set of product words, brand words, and qualifiers for these 100 clothing styles, for example 1000 words in total. The 1000 words are numbered, for example in the order product words, brand words, qualifiers. Assuming there are 300 product words, 300 brand words, and 400 qualifiers, the product words may for example be numbered 1~300, the brand words 301~600, and the qualifiers 601~1000. Each word corresponds to exactly one number. The order described above is merely illustrative, and the invention is not limited to it. The words of each style are then represented as a vector in which an element is 1 where the style has the corresponding word and 0 where it does not; the vectors so constructed serve as the group vector data.
In S204, user vector data are generated according to the historical data of the user and the group vectors. The historical data of the user may be, for example, the user's shopping or browsing history. In this embodiment, the user's historical purchase orders in the clothing category may, for example, be extracted from the database, and the historical data may be converted by word segmentation into product words, brand words, and qualifiers. The user vector data are then constructed so as to correspond to the group vectors established above.
In S206, the similarity between the user vector data and the group vector data is calculated to generate similarity data. The similarity between each user vector and each group vector is calculated. The similarity may be calculated, for example, as the cosine of the angle between the two vectors: the larger the cosine value, the higher the similarity between the user and the group vector, and the calculated value may be used, for example, as the similarity.
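The cosine calculation described in S206 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `cosine_similarity`, the toy four-dimensional vectors, and the choice to return 0.0 for an all-zero vector are assumptions made for the example.

```python
import math


def cosine_similarity(user_vec, group_vec):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(u * g for u, g in zip(user_vec, group_vec))
    norm_user = math.sqrt(sum(u * u for u in user_vec))
    norm_group = math.sqrt(sum(g * g for g in group_vec))
    if norm_user == 0 or norm_group == 0:
        # Assumption: a vector with no non-zero elements matches nothing.
        return 0.0
    return dot / (norm_user * norm_group)


# Toy vectors; real vectors in the text have 1000 elements.
user_vec = [0.5, 0.0, 0.5, 0.0]
same_direction_group = [1.0, 0.0, 1.0, 0.0]
unrelated_group = [0.0, 1.0, 0.0, 1.0]

print(cosine_similarity(user_vec, same_direction_group))
print(cosine_similarity(user_vec, unrelated_group))
```

Because the cosine depends only on direction, a user whose purchases are spread over the same words as a group scores close to 1 regardless of how much the user has bought.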
In S208, information popularization is carried out to the user according to the similarity data. For example, the similarity data may be sorted in order, and information popularization may be carried out to predetermined users according to the similarity ranking. As another example, the similarity data may be sorted in order by category within the classification data, and information popularization may be carried out to predetermined users by category according to the similarity ranking. For example, in the group of each interest style, the users whose cosine values (similarity results) rank in the TOP500w may be chosen as the interest user group for that style, and information popularization is then carried out to these groups in a targeted way.
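The per-style ranking in S208 can be illustrated with a short sketch. The function name `top_users_per_group`, the user identifiers, and the scores are hypothetical, and a cut-off of 2 stands in for the much larger cut-off mentioned in the text.

```python
def top_users_per_group(similarities, top_n):
    """Rank users by similarity within each group and keep the top_n.

    similarities maps group name -> {user id: similarity score}.
    """
    audiences = {}
    for group, scores in similarities.items():
        ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
        audiences[group] = [user for user, _ in ranked[:top_n]]
    return audiences


# Made-up similarity scores for two interest styles.
similarities = {
    "allusion": {"u1": 0.91, "u2": 0.12, "u3": 0.55},
    "street corner": {"u1": 0.05, "u2": 0.80, "u3": 0.60},
}
audiences = top_users_per_group(similarities, top_n=2)
print(audiences)
```

A user can appear in more than one audience, which matches the idea of ranking independently within each interest style.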
According to the method for information popularization of the invention, group vectors are constructed from the product words, brand words, and qualifiers of products, and user data are then compared with the group vector data to obtain similarities. In this way, user groups with similar interests can be obtained under different classifications (categories), so that accurate commodity recommendations can be made and the customer experience improved.
It will be clearly understood that the disclosure describes how to form and use particular examples, but the principles of the invention are not limited to any details of these examples. Rather, based on the teaching of the disclosure, these principles can be applied to many other embodiments.
In the invention, a user's interest may be, for example, a sustained purchase demand for a certain class of commodities over a period of time, shown in behaviors such as purchasing, clicking, browsing, and searching. For example, a user who has recently become interested in mountain climbing will for some time browse and buy the clothes and equipment needed for mountain climbing on the website, so "mountain-climbing" is this user's interest. In shopping for women's clothing, the interest may be, for example, a female user's interest in different clothing styles, such as "Joker", "tide", "individual character", "allusion", "street corner", "brief", "Great Britain", and "national wind". An object of the invention is, given a specified interest such as the clothing style "national wind", to find the users who have this interest, rank them by degree of interest, and then carry out information popularization.
Fig. 3 is a flow chart of a method for information popularization according to another exemplary embodiment. Fig. 3 is an exemplary description of constructing the group vector data from basic data in Fig. 2.
As shown in Fig. 3, in S302, data processing is performed on the basic data to obtain processed data. This includes: extracting, from the basic data, SKU data containing a predetermined tag; performing word segmentation on the SKU data within a predetermined time period to obtain segmentation data; and performing reverse statistics on the segmentation data to obtain the processed data.
For example, SKUs whose commodity titles contain a style name are extracted from the commodities of the clothing category in the database. If a commodity title contains the word "allusion", the SKU is tagged "allusion". Data such as the order volume and the number of orders placed over the most recent year are calculated for the tagged SKUs. A word segmentation tool (a common processing approach in the prior art) is used to segment these SKUs and obtain the product words, brand words, and qualifiers of the commodity titles. Existing segmentation tools can segment a title and identify each type of information in it. A product word indicates what product the commodity is; a qualifier is an adjective. For example, for the title "Apple Watch Sport Series 2 smartwatch (38 mm deep space grey aluminum metal case, black sports type watchband, GPS, 50 m waterproof, MP0D2CH/A)", the product word is smartwatch, the brand word is Apple, and the qualifiers are deep space grey, aluminum metal, black, sports type, and waterproof. Reverse statistics are then computed on the order volume of the product words, brand words, and qualifiers obtained by segmentation, including information such as the number of orders, the number of SKUs, and their proportions.
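One way to read the reverse statistics step is as an aggregation from SKUs back to the words found in their titles. The sketch below assumes the titles have already been segmented; the record layout (`orders`, `words`), the function name `reverse_statistics`, and the sample SKUs are all hypothetical.

```python
from collections import defaultdict


def reverse_statistics(skus):
    """Aggregate order volume and SKU count per (word type, word)."""
    stats = defaultdict(lambda: {"orders": 0, "skus": 0})
    for sku in skus:
        for word_type, words in sku["words"].items():
            for word in set(words):  # count each word once per SKU
                stats[(word_type, word)]["orders"] += sku["orders"]
                stats[(word_type, word)]["skus"] += 1
    total_orders = sum(sku["orders"] for sku in skus)
    for entry in stats.values():
        # Share of all orders that involve a SKU carrying this word.
        entry["order_share"] = entry["orders"] / total_orders if total_orders else 0.0
    return dict(stats)


# Two made-up SKUs tagged with the same style, already segmented.
skus = [
    {"orders": 30, "words": {"product": ["dress"], "brand": ["brand_a"],
                             "qualifier": ["silk", "stand-up collar"]}},
    {"orders": 10, "words": {"product": ["cheongsam"], "brand": ["brand_a"],
                             "qualifier": ["silk"]}},
]
stats = reverse_statistics(skus)
print(stats[("qualifier", "silk")])
```

The resulting per-word shares are the kind of input that the word-frequency calculation in the next step works from.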
In S304, data mining is performed on the processed data to obtain interest tag data. For example, the processed data are ranked by word frequency to generate first ranking data; the processed data are ranked by inverse document frequency to generate second ranking data; and the processed data are processed using the first ranking data and the second ranking data to obtain the interest tag data, where the interest tag data comprise multiple classification data and each classification data comprises multiple sub-classification data.
Applying data mining to the processed data above yields the product word group, brand word group, and qualifier group of each interest. For example, the product word group of the interest "street corner" is military uniform trousers, street dance wear, T-shirts, street dance trousers, beggar's trousers, trousers with braces, drop-crotch trousers, camouflage color trousers, and baggy trousers; the brand word group is beautiful street corner, ccqueen, seven sound of laughing, Ma Ma a kind of thick silk, pass, and Hua Yige; the qualifier group is street corner, hip-hop, individual character, mashed up, broken hole, yellow peach color, handsome, cowboy, neutrality, colorant match striped, even cap, American, rock and roll wind, and blended.
The proportion TF (word frequency) of each segmentation result is calculated, which amounts to a vertical comparison of the importance of each segmented word within the same interest. Example: theme A1 includes three qualifiers (B1, B2, B3); the commodities containing B1 account for 80%, those containing B2 for 10%, and those containing B3 for 25%. By vertical comparison, B1 is very important.
In this embodiment, word frequency = (number of times a word occurs in the article) / (total number of words in the article).
The distribution IDF (inverse document frequency) of the same seg (segmented word) across different themes is calculated, which amounts to a horizontal comparison of the importance of a segmented word across themes. Example: with 10 themes in total, if the segmented word B1 appears in 9 themes (A1, A2, A3, A4, A5, A6, A7, A8, A9), the inverse document frequency is as follows:
In the present embodiment,
TF-IDF value = TF * IDF. For the product words, brand words, and qualifiers of each style, the top 20 results may, for example, be taken. After some dirty data are removed by simple manual screening, the product word group, brand word group, and qualifier group related to each interest are obtained and may serve, for example, as the interest tag data. The different interests serve as the classification data in the interest tags, as shown in Fig. 4.
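The TF-IDF ranking can be sketched with the qualifiers B1, B2, B3 from the example above. Two assumptions are made here because the source does not spell them out: the inverse document frequency is taken as the natural logarithm of (total themes / themes containing the word), which is one common variant, and the theme counts for B2 and B3 are invented for illustration. The names `tf_idf` and `top_words` are also hypothetical.

```python
import math


def tf_idf(tf, themes_total, themes_with_word):
    """TF * IDF, with IDF assumed to be log(total themes / themes with word)."""
    idf = math.log(themes_total / themes_with_word)
    return tf * idf


def top_words(scores, k=20):
    """Return the k words with the highest scores, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:k]]


# TF values from the text's example for theme A1.
tf_in_a1 = {"B1": 0.80, "B2": 0.10, "B3": 0.25}
# B1 appears in 9 of 10 themes (from the text); B2 and B3 counts are made up.
themes_with = {"B1": 9, "B2": 2, "B3": 1}

scores = {word: tf_idf(tf, 10, themes_with[word]) for word, tf in tf_in_a1.items()}
print(top_words(scores, k=2))
```

Under these assumptions B1, despite its high word frequency, ranks last because it appears in almost every theme and so distinguishes A1 poorly; this is the effect the horizontal comparison is meant to capture.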
In S306, the group vector data are generated from the interest tag data by means of sparse vectors.
For example, the classification data may contain 100 user interests, that is, 100 clothing styles, and the product word group, brand word group, and qualifier group of each style have been mined in step three.
The product words, brand words, and qualifiers of all styles obtained from the interest tag data are combined into the full set of words, for example 1000 words in total. The 1000 words are numbered in the order product words, brand words, qualifiers. Assuming there are 300 product words, 300 brand words, and 400 qualifiers, the product words are numbered 1~300, the brand words 301~600, and the qualifiers 601~1000. Each word corresponds to exactly one number. The correspondence between product words and numbers is shown in Fig. 5.
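The numbering scheme can be sketched as follows, assuming the three word lists are already available. The function name `number_words` and the placeholder words (`product_0`, `brand_0`, and so on) are hypothetical; only the 300/300/400 split and the resulting ranges come from the text.

```python
def number_words(product_words, brand_words, qualifiers):
    """Assign consecutive 1-based numbers: product words, then brand words, then qualifiers."""
    numbering = {}
    groups = (("product", product_words), ("brand", brand_words), ("qualifier", qualifiers))
    for word_type, words in groups:
        for word in words:
            key = (word_type, word)
            if key not in numbering:  # keep one number per word
                numbering[key] = len(numbering) + 1
    return numbering


# Placeholder vocabularies with the sizes used in the text.
product_words = [f"product_{i}" for i in range(300)]
brand_words = [f"brand_{i}" for i in range(300)]
qualifiers = [f"qualifier_{i}" for i in range(400)]

numbering = number_words(product_words, brand_words, qualifiers)
print(numbering[("brand", "brand_0")])
```

Keying on the pair (word type, word) is a design choice for the sketch: it keeps a word such as a style name distinct when it occurs both as a brand word and as a qualifier.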
According to the method for information popularization of the invention, processing and mining the data by word frequency and inverse document frequency makes it possible to accurately mine the product word, brand word, and qualifier tags related to each interest.
In an exemplary embodiment of the disclosure, generating the group vector data from the interest tag data by means of sparse vectors comprises: numbering the sub-classification data in each classification data of the interest tag data to generate number data; and expressing the classification data as vectors using the sparse vectors and the number data to generate the group vector data. Each style is expressed as a sparse vector of length 1000, in which an element is 1 where the style has the corresponding word and 0 where it does not; the sparse representation is shown in Fig. 6. For example, the qualifiers of the interest style "allusion" are: dancing girl, princess, classic wind, Chinese style, gradual change, beam chest formula, spun gold side, pseudo-classic, jag, the oblique flap, disjunctor, stand-up collar, silk, cheongsam formula, and mandarin collar. The element values are then updated with the TF-IDF results calculated in step three, which gives the group vector data.
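A sparse group vector of this kind can be sketched as below, first with 0/1 elements and then with the elements replaced by weights such as TF-IDF values. The six-word numbering, the weight values, and the function name `build_group_vector` are made up for illustration; a dense Python list is used for readability, although a real 1000-element vector would more likely be stored in a sparse format.

```python
def build_group_vector(style_words, numbering, weights=None):
    """Build a style vector: 1.0 (or a supplied weight) where the style has the word, else 0.0."""
    vector = [0.0] * len(numbering)
    for key in style_words:
        index = numbering[key] - 1  # numbers are 1-based, list indices 0-based
        vector[index] = 1.0 if weights is None else weights.get(key, 0.0)
    return vector


# Hypothetical six-word numbering standing in for the 1000 words in the text.
numbering = {
    ("product", "cheongsam"): 1,
    ("product", "trousers"): 2,
    ("brand", "brand_a"): 3,
    ("qualifier", "silk"): 4,
    ("qualifier", "stand-up collar"): 5,
    ("qualifier", "hip-hop"): 6,
}
style_words = [("product", "cheongsam"), ("qualifier", "silk"), ("qualifier", "stand-up collar")]

binary_vector = build_group_vector(style_words, numbering)
# Made-up TF-IDF weights for the same words.
weights = {("product", "cheongsam"): 0.4, ("qualifier", "silk"): 0.3,
           ("qualifier", "stand-up collar"): 0.2}
weighted_vector = build_group_vector(style_words, numbering, weights)
print(binary_vector)
print(weighted_vector)
```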
Alternatively, for example, the sub-classification data in each item of classification data in the interest tag data may be numbered by the word2vec algorithm to generate the number data. That is, when group vectors are constructed to calculate user interest groups, the words may also be vectorized by a deep learning method, after which the similarity between the user vector and the group vector is calculated. Assuming the words of each style form one line, each word can be vectorized with the word2vec tool, and the vectors of the words in each style are then summed to obtain the group vector of that style. The user vector can be obtained in the same way.
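The summation of word vectors can be sketched as below. In practice the embeddings would come from a trained word2vec model; here a small hand-written table stands in for it, so the table values, the function name `sum_word_vectors`, and the choice to skip out-of-vocabulary words are all assumptions.

```python
def sum_word_vectors(words, embeddings):
    """Sum the embedding vectors of the given words, skipping unknown words."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for word in words:
        vec = embeddings.get(word)
        if vec is None:
            continue  # assumption: out-of-vocabulary words contribute nothing
        total = [t + v for t, v in zip(total, vec)]
    return total

# Hand-written 3-dimensional stand-in for word2vec output.
embeddings = {
    "silk": [1.0, 0.0, 2.0],
    "cheongsam": [0.5, 1.0, 0.0],
    "hoodie": [0.0, 3.0, 1.0],
}
classical_vector = sum_word_vectors(["silk", "cheongsam"], embeddings)
user_vector = sum_word_vectors(["silk", "hoodie", "unknown"], embeddings)
```

Because the style vector and the user vector are built by the same function, they live in the same space and can be compared directly.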
In an exemplary embodiment of the present disclosure, generating the user vector data according to the historical data of the user and the group vector comprises: performing word segmentation on the historical data of the user; and generating the user vector data from the number data and the segmented historical data. The value of each element of the sparse vector is updated to the TF-IDF result calculated in the third step.
For example, the user's historical purchase orders in the clothing category can be extracted from a database. The information on the goods the user bought is converted into product words, brand words, and qualifiers. Using the label numbering, the user's purchase sparse vector is constructed, likewise with 1000 dimensions; the value of each label is the share of the corresponding product word, brand word, or qualifier in the user's overall clothing purchases.
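One way to build such a purchase vector is sketched below. The text does not say whether shares are computed per word type or over all words together; this sketch assumes the latter. The function name `user_purchase_vector` and the sample orders are hypothetical.

```python
from collections import Counter

def user_purchase_vector(purchased_item_words, vocab):
    """Return {word number: share of that word among the user's tagged words}."""
    counts = Counter(
        vocab[word]
        for words in purchased_item_words
        for word in words
        if word in vocab  # words without a number are ignored (assumption)
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {number: count / total for number, count in counts.items()}

# Hypothetical vocabulary and three purchased items, already segmented.
vocab = {"dress": 1, "hoodie": 2, "BrandA": 3, "silk": 4}
orders = [["dress", "BrandA", "silk"], ["dress", "silk"], ["hoodie", "unlisted"]]
user_vec = user_purchase_vector(orders, vocab)
```

The resulting values sum to 1, so users with many orders and users with few orders are represented on the same scale.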
In an exemplary embodiment of the present disclosure, calculating the similarity data of the user vector data and the group vector data comprises: calculating the similarity between the user vector data and the group vector data by the cosine (included-angle) algorithm to generate the similarity data.
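The cosine of the angle between a user vector u and a group vector g is u·g / (|u| |g|). A minimal sketch over sparse dictionary vectors is given below; the function name `cosine_similarity`, the sample vectors, and the convention of returning 0 for an all-zero vector are assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors given as {index: value}."""
    dot = sum(value * b.get(index, 0.0) for index, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an empty vector is similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical user and two hypothetical style (group) vectors.
user = {1: 0.5, 4: 0.5}
classical = {1: 1.0, 4: 1.0}
sporty = {2: 1.0, 5: 1.0}

sim_classical = cosine_similarity(user, classical)
sim_sporty = cosine_similarity(user, sporty)
```

Here the user shares all nonzero dimensions with the "classical" vector and none with the "sporty" vector, so the first similarity is close to 1 and the second is 0.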
Fig. 7 is a flowchart of a method for information popularization according to another exemplary embodiment.
As shown in Fig. 7, in S702, user interests are defined.
In S704, the basic data on users' interest-related behavior is processed.
In S706, interest tags are mined.
In S708, interest group vectors are constructed.
In S710, user vectors are constructed.
In S712, user groups with similar interests are calculated.
In S714, the interested users of each interest are output.
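The last two steps, S712 and S714, can be sketched as a ranking over precomputed similarities. The function name `top_users_per_interest`, the `top_n` cut-off, the tie-breaking by user identifier, and the sample scores are all assumptions made for illustration.

```python
def top_users_per_interest(similarities, top_n):
    """similarities: {interest: {user_id: score}} -> {interest: [user_id, ...]}."""
    result = {}
    for interest, scores in similarities.items():
        # Highest similarity first; ties broken by user id for determinism.
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        result[interest] = [user_id for user_id, _ in ranked[:top_n]]
    return result

# Hypothetical similarity scores between three users and two interests.
similarities = {
    "classical": {"u1": 0.91, "u2": 0.40, "u3": 0.75},
    "sporty": {"u1": 0.10, "u2": 0.88, "u3": 0.88},
}
audiences = top_users_per_interest(similarities, top_n=2)
```

Each interest then has its own ordered audience, which matches the idea of promoting to users in order of similarity within each classification.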
According to the method for information popularization of the present invention, group vectors are constructed from the product words, brand words, qualifiers, etc. of products, and user data is then compared with the group vector data to obtain similarities. This provides a complete workflow for finding user groups with similar interests in e-commerce.
Those skilled in the art will understand that all or part of the steps of the above embodiments may be implemented as a computer program executed by a CPU. When the computer program is executed by the CPU, it performs the functions defined by the above method provided by the present invention. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
In addition, it should be noted that the above drawings are merely schematic illustrations of the processing included in the method according to exemplary embodiments of the present invention and are not intended to be limiting. It is easy to understand that the processing shown in the drawings does not indicate or limit the chronological order of these processes. It is also easy to understand that these processes may be executed, for example, synchronously or asynchronously in multiple modules.
The following are device embodiments of the present invention, which can be used to carry out the method embodiments of the present invention. For details not disclosed in the device embodiments, please refer to the method embodiments of the present invention.
Fig. 8 is a block diagram of a device for information popularization according to an exemplary embodiment.
The basic data module 802 is used to construct group vector data from basic data.
The vector data module 804 is used to generate user vector data according to the historical data of the user and the group vector.
The similarity module 806 is used to calculate the similarity between the user vector data and the group vector data and to generate similarity data.
The promotion module 808 is used to carry out information popularization to the user according to the similarity data.
According to the device for information popularization of the present invention, group vectors are constructed from the product words, brand words, qualifiers, etc. of products, and user data is then compared with the group vector data to obtain similarities. This provides a complete solution for finding user groups with similar interests in e-commerce.
Fig. 9 is a block diagram of an electronic device according to an exemplary embodiment.
The electronic device 200 according to this embodiment of the present invention is described below with reference to Fig. 9. The electronic device 200 shown in Fig. 9 is merely an example and should not impose any limitation on the functions or scope of use of embodiments of the present invention.
As shown in Fig. 9, the electronic device 200 takes the form of a general-purpose computing device. The components of the electronic device 200 may include, but are not limited to: at least one processing unit 210, at least one storage unit 220, a bus 230 connecting the different system components (including the storage unit 220 and the processing unit 210), a display unit 240, and the like.
The storage unit stores program code that can be executed by the processing unit 210, so that the processing unit 210 performs the steps according to various exemplary embodiments of the present invention described in the above electronic prescription circulation processing method section of this specification. For example, the processing unit 210 may perform the steps shown in Figs. 2, 3, and 7.
The storage unit 220 may include readable media in the form of volatile storage units, such as a random access memory unit (RAM) 2201 and/or a cache storage unit 2202, and may further include a read-only memory unit (ROM) 2203.
The storage unit 220 may also include a program/utility 2204 having a set of (at least one) program modules 2205. Such program modules 2205 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data; each or some combination of these examples may include an implementation of a network environment.
The bus 230 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 200 may also communicate with one or more external devices 300 (such as a keyboard, a pointing device, or a Bluetooth device), with one or more devices that enable a user to interact with the electronic device 200, and/or with any device (such as a router or a modem) that enables the electronic device 200 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 250. The electronic device 200 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 260. The network adapter 260 may communicate with the other modules of the electronic device 200 through the bus 230. It should be understood that, although not shown in the drawings, other hardware and/or software modules may be used in conjunction with the electronic device 200, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
From the above description of the embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Therefore, the technical solution according to embodiments of the present disclosure may be embodied in the form of a software product. The software product may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and includes instructions that cause a computing device (such as a personal computer, a server, or a network device) to execute the above electronic prescription circulation processing method according to embodiments of the present disclosure.
Fig. 10 is a schematic diagram of a computer-readable medium according to an exemplary embodiment.
Referring to Fig. 10, a program product 400 for implementing the above method according to an embodiment of the present invention is described. It may take the form of a portable compact disc read-only memory (CD-ROM), include program code, and run on a terminal device such as a personal computer. However, the program product of the present invention is not limited to this. In this document, a readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
The computer-readable storage medium may include a data signal propagated in baseband or as part of a carrier wave, in which readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. The readable storage medium may also be any readable medium other than a readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the readable storage medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the above.
Program code for carrying out the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. In cases involving a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
The above computer-readable medium carries one or more programs which, when executed by a device, cause the computer-readable medium to implement the following functions: constructing group vector data from basic data; generating user vector data according to the historical data of a user and the group vector; calculating the similarity between the user vector data and the group vector data to generate similarity data; and carrying out information popularization to the user according to the similarity data.
Those skilled in the art will understand that the above modules may be distributed in the device as described in the embodiments, or may be modified accordingly and located in one or more devices different from that of this embodiment. The modules of the above embodiments may be combined into one module or further split into multiple submodules.
From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Therefore, the technical solution according to embodiments of the present invention may be embodied in the form of a software product. The software product may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and includes instructions that cause a computing device (such as a personal computer, a server, a mobile terminal, or a network device) to execute the method according to embodiments of the present invention.
From the above detailed description, those skilled in the art will readily understand that the method and device for information popularization according to embodiments of the present invention have one or more of the following advantages.
According to some embodiments, the method for information popularization of the present invention constructs group vectors from the product words, brand words, qualifiers, etc. of products and then compares user data with the group vector data to obtain similarities. It can thereby obtain user groups with similar interests and make accurate product recommendations under different classifications (categories), improving the customer experience.
According to other embodiments, the method for information popularization of the present invention processes and mines data by term frequency and inverse document frequency, and can thereby accurately mine the product-word, brand-word, and qualifier labels related to an interest.
Exemplary embodiments of the present invention have been specifically shown and described above. It should be understood that the present invention is not limited to the detailed structures, arrangements, or implementation methods described here; on the contrary, the present invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
In addition, the structures, proportions, sizes, etc. shown in the drawings of this specification are intended only to accompany the content disclosed in the specification, for the understanding and reading of those skilled in the art, and are not intended to limit the conditions under which the present disclosure can be implemented; they therefore have no substantive technical significance. Any structural modification, change in proportional relationship, or adjustment of size that does not affect the technical effects the present disclosure can produce or the purposes it can achieve shall still fall within the scope covered by the technical content disclosed herein. Likewise, terms such as "upper", "first", "second", and "a" cited in this specification are used only for clarity of description and not to limit the implementable scope of the present disclosure; changes or adjustments to their relative relationships, without substantive change to the technical content, shall also be regarded as within the implementable scope of the present invention.

Claims (13)

1. A method for information popularization, characterized by comprising:
constructing group vector data from basic data;
generating user vector data according to historical data of a user and the group vector;
calculating the similarity between the user vector data and the group vector data to generate similarity data; and
carrying out information popularization to the user according to the similarity data.
2. The method according to claim 1, characterized in that constructing the group vector from the basic data comprises:
performing data processing on the basic data to obtain process data;
performing data mining on the process data to obtain interest tag data; and
generating the group vector data from the interest tag data by means of sparse vectors.
3. The method according to claim 2, characterized in that performing data processing on the basic data to obtain the process data comprises:
extracting stock keeping unit data containing predetermined labels from the basic data;
performing word segmentation on the stock keeping unit data within a predetermined time to obtain word segmentation data; and
performing inverted statistics on the word segmentation data to obtain the process data.
4. The method according to claim 2, characterized in that performing data mining on the process data to obtain the interest tag data comprises:
sorting the process data by term frequency to generate first sorting data;
sorting the process data by inverse document frequency to generate second sorting data; and
processing the process data by means of the first sorting data and the second sorting data to obtain the interest tag data, the interest tag data comprising multiple items of classification data, each item of classification data comprising multiple items of sub-classification data.
5. The method according to claim 4, characterized in that generating the group vector data from the interest tag data by means of sparse vectors comprises:
numbering the sub-classification data in each item of classification data in the interest tag data to generate number data; and
expressing the classification data as vectors by means of the sparse vectors and the number data to generate the group vector data.
6. The method according to claim 5, characterized in that numbering the sub-classification data in each item of classification data in the interest tag data to generate the number data further comprises:
numbering the sub-classification data in each item of classification data in the interest tag data by the word2vec algorithm to generate the number data.
7. The method according to claim 5, characterized in that generating the user vector data according to the historical data of the user and the group vector comprises:
performing word segmentation on the historical data of the user; and
generating the user vector data from the number data and the segmented historical data.
8. The method according to claim 1, characterized in that calculating the similarity data of the user vector data and the group vector data comprises:
calculating the similarity between the user vector data and the group vector data by the cosine (included-angle) algorithm to generate the similarity data.
9. The method according to claim 1, characterized in that carrying out information popularization to the user according to the similarity data comprises:
arranging the similarity data in order; and
carrying out information popularization to the predetermined users in order of similarity.
10. The method according to claim 4, characterized in that carrying out information popularization to the user according to the similarity data further comprises:
arranging the similarity data in order according to the classifications in the classification data; and
carrying out information popularization to the predetermined users by classification, in order of similarity.
11. A device for information popularization, characterized by comprising:
a basic data module for constructing group vector data from basic data;
a vector data module for generating user vector data according to historical data of a user and the group vector;
a similarity module for calculating the similarity between the user vector data and the group vector data to generate similarity data; and
a promotion module for carrying out information popularization to the user according to the similarity data.
12. An electronic device, characterized by comprising:
one or more processors; and
a storage device for storing one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method according to any one of claims 1-10.
13. A computer-readable medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1-10.
CN201710817091.1A 2017-09-12 2017-09-12 Method and device for information popularization Active CN110020109B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710817091.1A CN110020109B (en) 2017-09-12 2017-09-12 Method and device for information popularization

Publications (2)

Publication Number Publication Date
CN110020109A true CN110020109A (en) 2019-07-16
CN110020109B CN110020109B (en) 2021-12-07

Family

ID=67186227

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710817091.1A Active CN110020109B (en) 2017-09-12 2017-09-12 Method and device for information popularization

Country Status (1)

Country Link
CN (1) CN110020109B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160132830A1 (en) * 2014-11-12 2016-05-12 Adp, Llc Multi-level score based title engine
CN105956146A (en) * 2016-05-12 2016-09-21 腾讯科技(深圳)有限公司 Article information recommending method and device
CN106649774A (en) * 2016-12-27 2017-05-10 北京百度网讯科技有限公司 Artificial intelligence-based object pushing method and apparatus
CN106959966A (en) * 2016-01-12 2017-07-18 腾讯科技(深圳)有限公司 A kind of information recommendation method and system

Also Published As

Publication number Publication date
CN110020109B (en) 2021-12-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant