CN109359244B - Personalized information recommendation method and device - Google Patents

Publication number
CN109359244B
Authority
CN
China
Prior art keywords
label
user
behavior
shop
commodity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811276173.0A
Other languages
Chinese (zh)
Other versions
CN109359244A (en)
Inventor
张梦菲
方金云
韩聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS
Priority to CN201811276173.0A
Publication of CN109359244A
Application granted
Publication of CN109359244B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241Advertisements
    • G06Q30/0251Targeted advertisements
    • G06Q30/0269Targeted advertisements based on user profile or attribute
    • G06Q30/0271Personalized advertisement

Landscapes

  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a personalized information recommendation method and device. The method comprises the following steps: collecting multidimensional attribute characteristics of a user, multidimensional attribute characteristics of a commodity and multidimensional attribute characteristics of a shop; extracting label information and calculating the weight of a label based on the obtained multidimensional attribute characteristics of the user, the multidimensional attribute characteristics of the commodity and the multidimensional attribute characteristics of the shop so as to construct a user portrait, a commodity portrait and a shop portrait; providing recommendation information for the user based on the constructed user representation, commodity representation or shop representation. The method of the invention can accurately and effectively provide personalized information recommendation for the user.

Description

Personalized information recommendation method and device
Technical Field
The invention relates to the technical field of information processing, in particular to a personalized information recommendation method and device.
Background
A recommendation system uses an algorithm to recommend items that meet a user's needs, based on the user's historical behavior data or on item data. Under information overload, the recommendation system predicts the user's interests and demands and helps the user quickly find items of interest. A user portrait is, simply put, a set of labels attached to the user's information. As a by-product of the recommendation system, the user portrait labels the user, depicts the user's social attributes, behavior habits and preferences, alleviates the cold start problem for new users, and provides the data basis of the recommendation system.
However, the portrait technology of existing recommendation systems constructs only a portrait of the user, with coarse granularity and insufficient comprehensiveness, and omits feature mining and portrait construction for commodities and shops, so that it is difficult to recommend commodities or shops accurately.
Therefore, the prior art needs to be improved to provide a more comprehensive and complete portrait construction method, covering multi-dimensional features, for recommendation systems.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a method and a device for recommending personalized information, which provide richer characteristics for a recommendation system by establishing a multi-dimensional label system.
According to a first aspect of the present invention, a personalized information recommendation method is provided. The method comprises the following steps:
step 1: collecting multidimensional attribute characteristics of a user, multidimensional attribute characteristics of a commodity and multidimensional attribute characteristics of a shop;
step 2: extracting label information and calculating the weight of a label based on the obtained multidimensional attribute characteristics of the user, the multidimensional attribute characteristics of the commodity and the multidimensional attribute characteristics of the shop so as to construct a user portrait, a commodity portrait and a shop portrait;
step 3: providing recommendation information for the user based on the constructed user portrait, commodity portrait or shop portrait.
In one embodiment, the multi-dimensional attribute characteristics of the user include at least one of gender, age, business nature, industry of interest, search behavior, collection behavior, browsing behavior, shopping cart behavior, like behavior, consumption behavior, comment behavior, user device, user location.
In one embodiment, the multi-dimensional attribute characteristics of the commodity include at least one of a category, a brand, a price, an industry, an affiliate, and a click-through rate of the commodity.
In one embodiment, the multidimensional attribute characteristics of the store include at least one of store location, hosted goods, click through rate, industry information, brand information, multi-label classification of posts.
In one embodiment, the user representation includes at least one of a gender tag, an age tag, a business property tag, an industry of interest tag, a user interest preference tag, a device tag, a user location tag, a consumption capability tag, a loyalty tag, an activity tag, a user value tag.
In one embodiment, the weight of the user interest preference tag is calculated according to the following formula:
Figure BDA0001847028570000021
where w(i) is the weight value corresponding to a browsing, searching, collecting, shopping cart, like, consuming or commenting behavior, i is the index of the behavior,
Figure BDA0001847028570000022
is the attenuation factor; staytimefactor(st) is the stay time factor of the browsing behavior (the stay time factor is 1 for the searching, collecting, shopping cart, like, consuming and commenting behaviors); deepfactor(j) is the depth factor of the searching behavior (the depth factor is 1 for the browsing, collecting, shopping cart, like, consuming and commenting behaviors); m is the number of behaviors; n is the number of days between the time the behavior occurred and the current date; t is the stay time when browsing a web page; d is the access depth after searching a keyword; and alpha is a constant.
In one embodiment, the weight of the user value label is calculated according to the following formula:
Figure BDA0001847028570000023
where σ is a sigmoid function, i is an index indicating consumption ability, loyalty, or liveness, x_i is the weight of the consumption capability tag, loyalty tag, or activity tag, and w_i is the weight of the user value relative to consumption ability, loyalty and liveness.
In one embodiment, the merchandise representation includes at least one of a category tag, a brand tag, a price tag, an industry tag, an affiliate tag, a heat tag of the merchandise.
In one embodiment, the merchandise representation is obtained by:
taking the top K results of the multi-classification of the commodity in the multi-dimensional attribute characteristics of the commodity to obtain the classification labels of the commodity, wherein K is an integer greater than or equal to 2;
directly obtaining a brand label of the commodity according to brand information of the commodity;
obtaining a price interval label of the commodity according to the price of the commodity;
acquiring an industry label of the commodity according to the industry information of the commodity;
obtaining an affiliated word label of the commodity according to the affiliated word information of the commodity;
and dividing the click rate of the commodity by the sum of the click rates of all commodities to obtain the click rate share of the commodity as the popularity label of the commodity.
In one embodiment, the shop representation includes at least one of a shop location tag, a shop hosted merchandise tag, a shop industry tag, a shop heat tag, a shop brand tag.
In one embodiment, the shop representation is obtained according to the following steps:
extracting a shop position label from the position characteristics in the multidimensional attribute characteristics of the shop;
carrying out word segmentation on the characteristics of the commodities hosted by the shops and merging the characteristics with the multi-label classification of the posts to obtain the labels of the commodities hosted by the shops;
dividing the click rate of the shop by the sum of the click rates of all shops to obtain the click rate share of the shop as the shop popularity label;
extracting the shop industry type from the shop industry information as a shop industry label;
and merging the brand information of the commodities in the shop to obtain the shop brand label.
According to a second aspect of the present invention, a personalized information recommendation apparatus is provided. The device includes:
a feature extraction module: collecting multidimensional attribute characteristics of a user, multidimensional attribute characteristics of a commodity and multidimensional attribute characteristics of a shop;
a portrait construction module: extracting label information and calculating the weight of a label based on the obtained multidimensional attribute characteristics of the user, the multidimensional attribute characteristics of the commodity and the multidimensional attribute characteristics of the shop so as to construct a user portrait, a commodity portrait and a shop portrait;
a real-time recommendation module: and providing recommendation information for the user based on the constructed user portrait, commodity portrait or shop portrait.
Compared with the prior art, the invention has the advantages that: by collecting multi-source user behavior data, portraits are respectively established for users, commodities and shops, label weight values in the portraits are dynamically updated, the established portraits are comprehensive and complete, and the established portraits have good timeliness and provide data support for a personalized recommendation system.
Drawings
The following drawings illustrate and explain the invention by way of example only and do not limit its scope:
FIG. 1 shows a flow diagram of a portrait construction method in a recommendation system according to one embodiment of the invention;
FIG. 2 shows a diagram of a portrait construction apparatus in a recommendation system according to one embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions, design methods, and advantages of the present invention more apparent, the present invention will be further described in detail by specific embodiments with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
According to an embodiment of the present invention, there is provided a portrait construction method in a recommendation system, which is capable of constructing a multi-dimensional label system covering users, commodities, shops, and the like. As shown in FIG. 1, the method includes the following steps:
step S110, acquiring data and preprocessing the data.
The acquired data comprises user behavior logs and social network data, wherein the user behavior logs comprise information such as user identification, URL, source URL, IP, commodity or shop ID, session ID, user-agent information and data source identification of the user behavior logs, and the social network data comprises social relations, post publishing, comments, approval data and the like of common users and shop users.
In practical application, a multi-source user behavior log can be obtained from a log server, and the multi-source user behavior log comprises multi-site user behavior logs of different provinces and countries. The social network data can be obtained by inquiring a service database, wherein the service database also comprises data such as user information, commodity information, shop information, user collection information, shopping cart information, order information, approval information and the like.
The data preprocessing process comprises processing the unstructured user behavior logs to obtain user behavior logs in a unified, standardized format; processing the post contents (such as new shop arrivals, shop inventory, user purchases, trending updates and the like) and the comment information in the obtained social network data; and extracting the multi-label classification of the posts and of the comment information to obtain the social network attribute characteristics of users and shops.
The data preprocessing process also detects user behavior logs, filters crawlers to delete irrelevant web page information, filters abnormal data and the like.
In one embodiment, after preprocessing the multi-source user behavior log, the extracted key fields include one or more of a user identifier, a URL, a source URL, an IP, a commodity or shop ID, a session ID, a time when the behavior occurs, user-agent information, a data source identifier of the multi-source user behavior log, and the like.
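The preprocessing described above can be illustrated with a minimal sketch. The tab-separated log layout, the field names in FIELDS and the crawler pattern BOT_PATTERN are assumptions made for illustration only; the source does not specify a log format or a crawler detection rule.

```python
import re
from typing import Optional

# Hypothetical unified log layout: one tab-separated record per line.
FIELDS = ["user_id", "url", "referrer", "ip", "item_id",
          "session_id", "timestamp", "user_agent", "source"]

# Illustrative crawler markers; a real deployment would maintain its own list.
BOT_PATTERN = re.compile(r"bot|spider|crawler", re.IGNORECASE)

def parse_log_line(line: str) -> Optional[dict]:
    """Return the key fields of one log line, or None if it should be filtered."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != len(FIELDS):
        return None  # abnormal record: wrong number of fields
    record = dict(zip(FIELDS, parts))
    if not record["user_id"]:
        return None  # abnormal record: no user identifier
    if BOT_PATTERN.search(record["user_agent"]):
        return None  # crawler traffic
    return record
```

Lines that come back as None are dropped before feature extraction; everything else is passed on as a dictionary of key fields.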
In one embodiment, social network attribute characteristics of users and shops are obtained by the following process:
performing word segmentation on the text information and comment information in the post content to obtain a plurality of word segmentation results;
filtering the word segmentation results;
converting the filtered word segmentation results into word vectors to obtain word embeddings (Word Embedding) of the post text information and comment information;
inputting the word embeddings of the post text information and comment information into a text-classification convolutional neural network (TextCNN) model to perform multi-label short text classification, and outputting probability values of a plurality of labels to obtain the multi-label classification of the text information and the multi-label classification of the comment information in the post content;
classifying the pictures in the post content with ResNet to obtain probability values for 5 classifications of the post pictures;
combining the multi-label classification of the text information with the picture classification results of the post content, and normalizing the weights to obtain the multi-label classification of the post, wherein each label has its own weight;
and storing the users' browsing, praise and comment behaviors on posts, the shops' publishing behaviors, and the multi-label classification of the posts to obtain a social network attribute feature library of users and shops.
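The step that combines text labels and picture labels and normalizes the weights can be sketched as follows. The source does not say how the two sets of probabilities are combined; this sketch assumes that probabilities for the same label are summed and that the result is rescaled so the weights add up to 1. The function name merge_post_labels is hypothetical.

```python
def merge_post_labels(text_probs: dict, image_probs: dict) -> dict:
    """Merge label probabilities from text and picture classifiers (assumed rule)."""
    merged = dict(text_probs)
    for label, prob in image_probs.items():
        # Assumption: a label predicted by both classifiers accumulates both scores.
        merged[label] = merged.get(label, 0.0) + prob
    total = sum(merged.values())
    if total == 0:
        return {}
    # Normalize so that the label weights of one post sum to 1.
    return {label: prob / total for label, prob in merged.items()}
```

For a post whose text is classified as {"shoes": 0.6, "sale": 0.2} and whose picture is classified as {"shoes": 0.2}, this rule gives "shoes" a larger normalized weight than "sale".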
Step S120, extracting the multi-dimensional attribute features of the user, the commodities and the shops.
And analyzing the preprocessed user behavior data, the social network attribute characteristics of the user and the shops and the data in the service database to extract the multidimensional attribute characteristics of the user, the commodity and the shops.
In one embodiment, the multi-dimensional attribute features of the user comprise static attribute features and dynamic attribute features of the user, wherein the static attribute features comprise gender, age, business property, industry of interest and the like, and the dynamic attribute features comprise dynamic behavior features of searching, collecting, browsing, shopping cart, praise, consuming, commenting and the like, and at least one of user equipment, location features and the like obtained by analyzing the key fields of the multi-source user behavior log.
For the dynamic attribute characteristics, the search behavior characteristics comprise commodity classification, keywords and search keyword depth searched by the user; behavior characteristics of collection, shopping cart, praise and comment respectively comprise commodity ID, commodity classification, multi-label classification of posts, industry of commodities and the like; the browsing behavior characteristics comprise browsed commodity ID, commodity classification, commodity industry, post multi-label classification, equipment information, region information, residence time, recent access interval, access frequency, browsing depth, login times and the like; the consumption behavior characteristics include the total amount of the order, the average consumption amount, the consumption frequency, the recent consumption interval, the order return ratio and the like. In addition, for the user device and location features in the dynamic attribute features, the user device and location features can be obtained by analyzing key fields in the user behavior log, for example, the device information in the user-agent includes a browser type, an operating system type, a mobile phone type, and the like used by the user.
In one embodiment, the multi-dimensional attribute features of the merchandise include multiple categories of the merchandise, brand, price, industry, attached words, click rate (e.g., day/week/month click rate), wherein the attached words refer to at least one of the attribute features of style, material, style, and the like of the merchandise.
For the multi-dimensional attribute features of the commodity, the multi-classification, brand, attached words and similar features can be obtained by extracting the commodity title and applying word segmentation, filtering, named entity recognition, text classification and other processing to the title text; alternatively, these results can be obtained by directly querying an established commodity database by commodity ID. The industry of the commodity can be obtained by querying the database table of the shop to which the commodity belongs. The daily/weekly/monthly click rate of the commodity can be obtained by statistical analysis of the user behavior log.
In one embodiment, the multidimensional attribute features of the store include at least one of location features, primary commodities, day/week/month click-through rates, industry information, brand information, multi-label classifications of posts, and the like.
For the multidimensional attribute characteristics of the shops, the positions of the shops can be obtained through longitude and latitude information of the shops; the main business is the main business introduction in the shop description information, and the characteristic can be obtained by extracting the information of the main business introduction; the shop day/week/month click rate can be obtained by obtaining a user behavior log and performing statistical analysis; the shop industry information and the brand information are obtained by inquiring a shop table in a business database; the multi-tag classification of the post may be obtained by querying the store social network attribute feature repository in step S110.
And storing the obtained multidimensional attribute characteristics of the user, the commodity and the shop to obtain a characteristic library of the user, the commodity and the shop.
Step S130, respectively constructing a user portrait, a commodity portrait and a shop portrait.
In the step, label information is extracted based on the multidimensional attribute characteristics of the user, the multidimensional attribute characteristics of the commodity and the multidimensional attribute characteristics of the shop, and the weight of the label is calculated and updated, so that the user portrait, the commodity portrait and the shop portrait can be obtained respectively.
In one embodiment, the user representation includes at least one of a static attribute tag and a dynamically updated user interest preference tag, a device tag, a location tag, a consumption capability tag, a loyalty tag, an activity tag, a user value tag, and the like.
Specifically, the static attribute tags directly extract corresponding static attributes according to static attribute features (such as gender, age, business property, concerned industry, and the like), and the weight of each static attribute tag is 1. For example, gender is a tag that may take the values "male", "female", "indeterminate"; the business property is also a label that can take the values "wholesaler", "foreign trade company", "web merchant", "brick-and-mortar store", "chain store", "individual consumer", "others", etc.
The dynamically updated user interest preference labels comprise shop, industry, commodity, classification and brand attribute labels extracted from the dynamic behavior features, and the weight of each label is calculated to obtain the user's degree of interest in that label. According to one embodiment of the invention, the weight of a user's preference label is calculated or updated from the label information corresponding to the dynamic behaviors (searching, collecting, browsing, shopping cart, praise, consumption, comment and the like), together with the weight value and attenuation factor of each dynamic behavior, the stay time factor of the browsing behavior and the depth factor of the searching behavior, using the following formula:
Figure BDA0001847028570000071
where score is the interest level of the user in the tag, w (i) is the weight value corresponding to the dynamic behavior,
Figure BDA0001847028570000072
is the attenuation factor; staytimefactor(st) is the stay time factor of the browsing behavior (the stay time factor is 1 for the searching, collecting, shopping cart, praise, consuming and commenting behaviors); deepfactor(j) is the depth factor of the searching behavior (the depth factor is 1 for the browsing, collecting, shopping cart, praise, consuming and commenting behaviors); m is the number of dynamic behaviors; n is the number of days between the time the behavior occurred and the current date; t is the stay time when the web page is browsed; d is the access depth after searching a certain keyword; and alpha is a constant.
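As an illustration of how such a score could be computed, the sketch below sums, over the m recorded behaviors, the behavior weight multiplied by a decay term, a stay time factor and a depth factor. The formula images are not reproduced in this text, so the product form, the exponential decay exp(-alpha * n), the logarithmic shapes of the two factors and every value in BEHAVIOR_WEIGHTS are assumptions, not the patented formula.

```python
import math
from dataclasses import dataclass

# Hypothetical per-behavior weights w(i); the source gives no values.
BEHAVIOR_WEIGHTS = {
    "browse": 1.0, "search": 1.5, "collect": 2.0, "cart": 3.0,
    "like": 2.0, "consume": 5.0, "comment": 2.5,
}

@dataclass
class Behavior:
    kind: str                  # one of the keys of BEHAVIOR_WEIGHTS
    days_ago: int              # n: days between the behavior and the current date
    stay_seconds: float = 0.0  # t: stay time, relevant to browsing only
    depth: int = 0             # d: access depth, relevant to searching only

def stay_time_factor(b: Behavior) -> float:
    """Assumed shape: grows slowly with stay time; 1 for non-browsing behaviors."""
    if b.kind != "browse":
        return 1.0
    return 1.0 + math.log1p(b.stay_seconds / 60.0)

def depth_factor(b: Behavior) -> float:
    """Assumed shape: grows slowly with access depth; 1 for non-search behaviors."""
    if b.kind != "search":
        return 1.0
    return 1.0 + math.log1p(b.depth)

def interest_score(behaviors, alpha: float = 0.1) -> float:
    """Sum of weight * decay * stay factor * depth factor over all behaviors."""
    total = 0.0
    for b in behaviors:
        decay = math.exp(-alpha * b.days_ago)  # assumed exponential attenuation
        total += BEHAVIOR_WEIGHTS[b.kind] * decay * stay_time_factor(b) * depth_factor(b)
    return total
```

With this form, a purchase made today contributes its full weight, while the same purchase made a month ago contributes much less, which is the time-sensitivity the attenuation factor is meant to provide.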
The value of the dynamically updated user device label is a hash set of the user's historical device features, with weight 1; when the user generates a behavior on a new device, that device is added to the hash set. The historical device features can be android, IOS, Winphone, Symbian, Blackberry, PC and the like. For example, the device tag of user A takes the value "PC, android", and the device tag of user B takes the value "IOS".
For the dynamically updated location label, a hash set of the user's footprint is extracted from the location features in the dynamic attribute features, with weight 1; when the user browses a website from a new city or country, the new location is added to the hash set. For example, the location tag value of user A is "china, guangdong, shenzhen".
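The device and location labels can be sketched as two sets that grow as new values are observed. The substring rules in DEVICE_RULES are a hypothetical simplification of user-agent parsing, and the function names are illustrative.

```python
# Hypothetical substring rules; real user-agent parsing needs a fuller rule set.
DEVICE_RULES = [
    ("windows phone", "Winphone"),  # checked first: such user-agents may also mention Android
    ("android", "android"),
    ("iphone", "IOS"),
    ("ipad", "IOS"),
    ("blackberry", "Blackberry"),
    ("symbian", "Symbian"),
]

def device_from_user_agent(user_agent: str) -> str:
    """Map a user-agent string to a coarse device feature; default to PC."""
    ua = user_agent.lower()
    for marker, device in DEVICE_RULES:
        if marker in ua:
            return device
    return "PC"

def update_footprint(portrait: dict, user_agent: str, location: str) -> dict:
    """Add the device and location of one behavior to the portrait's sets."""
    portrait.setdefault("device", set()).add(device_from_user_agent(user_agent))
    portrait.setdefault("location", set()).add(location)
    return portrait
```

Because both labels are sets, a repeated device or location leaves the portrait unchanged, and only a genuinely new value extends it.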
For the dynamically updated consumption capability, loyalty and liveness labels, the DeepFM algorithm, a CTR (click-through rate) estimation model, is used to cross-combine the dynamic behavior features in the dynamic attribute features (searching, collecting, browsing, shopping cart, praise, consumption, comment and the like) with the user device and location features obtained by analyzing the key fields of the user behavior log, and the weight values of the consumption capability, loyalty and liveness labels are predicted through model training.
According to one embodiment of the present invention, the user value label weight may be obtained from the consumption capability label, the loyalty label, and the activity label, and the calculation formula is as follows:
Figure BDA0001847028570000081
where σ is a sigmoid function that maps the user value to between 0 and 1, i is an index indicating consumption ability, loyalty, or liveness, x_i is the weight value of the consumption capability label, loyalty label, or liveness label learned by the DeepFM model, and w_i is the weight of the user value relative to consumption ability, loyalty and liveness, which can be defined manually.
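A minimal sketch of this calculation follows. The formula image is not reproduced in this text, so the form used here, a sigmoid applied to the weighted sum of the three label weights with no bias term, is an assumption based on the symbol definitions; the dictionary keys and example numbers are hypothetical.

```python
import math

COMPONENTS = ("consumption", "loyalty", "liveness")

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def user_value(x: dict, w: dict) -> float:
    """Assumed form: sigmoid of the weighted sum of the three label weights.

    x holds the label weights predicted by the model (x_i);
    w holds the manually defined importance of each component (w_i).
    """
    return sigmoid(sum(w[name] * x[name] for name in COMPONENTS))
```

For example, user_value({"consumption": 0.8, "loyalty": 0.5, "liveness": 0.3}, {"consumption": 2.0, "loyalty": 1.0, "liveness": 1.0}) yields a value strictly between 0 and 1 that rises as any of the three label weights rises.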
In one embodiment, the merchandise representation includes at least one of a category label, a brand label, a price label, an industry label, an affiliate label, a heat label, etc. of the merchandise.
Specifically, various types of labels for the merchandise representation may be obtained according to the following steps: taking the first K results of the multi-classification of the commodities in the multi-dimensional attribute characteristics of the commodities to obtain the classification labels of the commodities; directly obtaining a brand label of the commodity according to brand information of the commodity; obtaining a price interval label of the commodity according to the price of the commodity; obtaining an industry label of the commodity according to the industry information of the commodity; obtaining an affiliated word label of the commodity according to the affiliated word information of the commodity; and dividing the daily/weekly/monthly click rate of the commodity by the sum of the click rates of all the commodities to obtain the click rate of the commodity as the popularity label of the commodity.
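The label extraction steps for a commodity can be sketched as one function. The input layout (a dictionary with categories, brand, price, industry, affiliated_words and clicks), the price interval boundaries in PRICE_BOUNDS and the interval names are assumptions for illustration; the source does not define them.

```python
from bisect import bisect_right

# Hypothetical price interval boundaries and their labels.
PRICE_BOUNDS = [50, 200, 1000]
PRICE_LABELS = ["<50", "50-200", "200-1000", ">=1000"]

def commodity_portrait(item: dict, total_clicks: int, k: int = 2) -> dict:
    """Build the labels of one commodity from its multi-dimensional features."""
    # Rank the multi-classification results by probability and keep the top K.
    ranked = sorted(item["categories"].items(), key=lambda kv: kv[1], reverse=True)
    return {
        "category": [label for label, _ in ranked[:k]],
        "brand": item["brand"],
        "price": PRICE_LABELS[bisect_right(PRICE_BOUNDS, item["price"])],
        "industry": item["industry"],
        "affiliated_words": list(item["affiliated_words"]),
        # Heat: this commodity's clicks as a share of all commodities' clicks.
        "heat": item["clicks"] / total_clicks if total_clicks else 0.0,
    }
```

The same click-share calculation applies to the shop heat label, with shop clicks in place of commodity clicks.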
In one embodiment, the shop portrait includes at least one of a shop location tag, a shop main-business goods tag, a shop industry tag, a shop heat tag, a shop brand tag, and the like.
According to one embodiment of the present invention, the various types of tags for the shop portrait may be obtained according to the following steps: extracting a shop location label from the location features in the multidimensional attribute features of the shop; performing word segmentation on the main-business goods features of the shop, filtering out useless words, and combining the result with the multi-label classification of posts to obtain the main-business goods labels of the shop; dividing the daily/weekly/monthly click rate of the shop by the sum of the click rates of all shops to obtain the click rate share of the shop as the heat label of the shop; extracting the major category of the shop's industry from the shop industry information as the industry label of the shop; and merging the brand information of the commodities in the shop to obtain the brand labels of the shop.
According to one embodiment of the invention, a user representation, a merchandise representation, a shop representation may be stored offline to a database in the form of triple data, e.g., < user, tag, weight >, < merchandise, tag, weight >, < shop, tag, weight >.
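The triple storage can be sketched with a single table keyed by entity type, entity ID and tag. SQLite is used here only to keep the example self-contained; the table name portrait, the column names and the helper functions are assumptions, not the storage design of the source.

```python
import sqlite3

def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the portrait store and create the triple table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS portrait ("
        "entity_type TEXT, entity_id TEXT, tag TEXT, weight REAL, "
        "PRIMARY KEY (entity_type, entity_id, tag))"
    )
    return conn

def save_portrait(conn, entity_type: str, entity_id: str, tags: dict) -> None:
    """Insert or update <entity, tag, weight> triples for one entity."""
    conn.executemany(
        "INSERT OR REPLACE INTO portrait VALUES (?, ?, ?, ?)",
        [(entity_type, entity_id, tag, weight) for tag, weight in tags.items()],
    )
    conn.commit()

def load_portrait(conn, entity_type: str, entity_id: str) -> dict:
    """Return the stored tags of one entity as a tag -> weight mapping."""
    rows = conn.execute(
        "SELECT tag, weight FROM portrait WHERE entity_type = ? AND entity_id = ?",
        (entity_type, entity_id),
    )
    return dict(rows.fetchall())
```

Saving a tag that already exists overwrites its weight, which matches the idea of dynamically updating label weights while keeping one row per triple.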
Step S140, carrying out personalized information recommendation for the user.
In this step, for a new user in the recommendation system, a multi-source user behavior log can be collected from the log server in real time and processed in real time to extract the key fields reflecting user behavior; the stored user portraits, commodity portraits and shop portraits are then used for real-time commodity recommendation, shop recommendation and the like, so as to provide the user with personalized tag information of interest.
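The source does not state how the stored portraits are matched at recommendation time. One simple possibility, shown purely as an assumption, is to rank commodities or shops by the cosine similarity between the user's tag weights and each candidate's tag weights; the function names below are hypothetical.

```python
import math

def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse tag -> weight vectors."""
    dot = sum(weight * b[tag] for tag, weight in a.items() if tag in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(user_tags: dict, candidates: dict, top_n: int = 3) -> list:
    """Rank candidate portraits (id -> tags) by similarity to the user portrait."""
    scored = [(cid, cosine(user_tags, tags)) for cid, tags in candidates.items()]
    scored = [(cid, score) for cid, score in scored if score > 0]  # drop unrelated items
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

The same function serves for commodity recommendation and shop recommendation, since both portraits are stored as tag and weight pairs.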
According to another aspect of the present invention, a portrait construction apparatus in a recommendation system is provided, which is capable of implementing the portrait construction method of the present invention.
Referring to FIG. 2, in one embodiment, a portrait construction apparatus 200 of the present invention includes a data acquisition module 210, a data preprocessing module 220, a social network information processing module 230, a feature extraction module 240, a portrait construction module 250, a portrait storage module 260, and a real-time recommendation module 270. The functions performed by the modules correspond to the steps of the portrait construction method described above and, for brevity, are only summarized here.
The data obtaining module 210 is configured to obtain a multi-source user behavior log from a log server in an offline or real-time manner, and query social network data from a business database.
The data preprocessing module 220 is configured to preprocess unstructured data of the multi-source user behavior log to obtain uniformly and standardly formatted user behavior data, and filter crawlers and abnormal information.
And the social network information processing module 230 is configured to process the post content and the comment information in the social network data to obtain social network attribute features of the user and the shop.
And the feature extraction module 240 is configured to perform data analysis on the preprocessed user behavior data, the social network attribute features of the user and the shop, and the data in the service database, extract and store the multidimensional attribute features of the user, the commodity, and the shop, and obtain a feature library of the user, the commodity, and the shop.
For example, dynamically changing user, commodity, and shop attribute features may be stored in a distributed Nosql database to facilitate updating the computation, while static user, commodity, and shop attribute features may be stored in a Sql database.
The portrait construction module 250 is configured to extract label information based on the multi-dimensional attribute features of users, commodities, and shops, and to calculate and update the label weights, thereby obtaining user portraits, commodity portraits, and shop portraits.
The portrait storage module 260 is configured to store the user portraits, commodity portraits, and shop portraits, providing data support for real-time commodity recommendation and shop recommendation.
For example, three sets of triples, namely <user, label, weight>, <commodity, label, weight>, and <shop, label, weight>, are obtained from the user portrait, the commodity portrait, and the shop portrait, respectively, and stored in the database.
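The triple layout can be sketched as follows. This is only an illustration under stated assumptions: SQLite from the Python standard library stands in for the database, and the table name `portrait`, the `kind` column, and the helper names are hypothetical and not taken from the patent.

```python
import sqlite3

def store_triples(conn, kind, portraits):
    """Store {entity_id: {label: weight}} as (kind, entity, label, weight) rows.

    `kind` is "user", "commodity", or "shop"; an existing weight for the same
    entity and label is overwritten, which models a weight update.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS portrait ("
        "kind TEXT, entity TEXT, label TEXT, weight REAL, "
        "PRIMARY KEY (kind, entity, label))"
    )
    rows = [(kind, entity, label, weight)
            for entity, labels in portraits.items()
            for label, weight in labels.items()]
    conn.executemany("INSERT OR REPLACE INTO portrait VALUES (?, ?, ?, ?)", rows)
    conn.commit()

def load_labels(conn, kind, entity):
    """Return the {label: weight} mapping stored for one entity."""
    cur = conn.execute(
        "SELECT label, weight FROM portrait WHERE kind = ? AND entity = ?",
        (kind, entity),
    )
    return dict(cur.fetchall())
```

Keying each row on (kind, entity, label) lets a later weight recalculation replace a single triple without rewriting the whole portrait.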
The real-time recommendation module 270 is configured to process, in real time, the multi-source user behavior logs obtained from the log server so as to extract the multi-dimensional attribute features of the user, the commodity, or the shop, and to perform real-time commodity recommendation and shop recommendation using the user portrait, commodity portrait, and shop portrait data stored offline, thereby recommending label information to the user.
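The description does not state how the stored portraits are matched at recommendation time. One simple possibility, shown here purely as an assumption, is to score each commodity (or shop) by the weighted overlap between its labels and the user's labels; the function name `recommend` and the dictionary layout are hypothetical.

```python
def recommend(user_labels, item_portraits, top_n=3):
    """Rank items by the sum of user_weight * item_weight over shared labels.

    user_labels:    {label: weight} for one user
    item_portraits: {item_id: {label: weight}} for commodities or shops
    Items that share no label with the user are not recommended.
    """
    scores = {}
    for item_id, item_labels in item_portraits.items():
        score = sum(weight * item_labels[label]
                    for label, weight in user_labels.items()
                    if label in item_labels)
        if score > 0:
            scores[item_id] = score
    # Highest score first; ties are broken by item id for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]
```

Other matching schemes, such as cosine similarity or a learned ranking model, would fit the same interface.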
In another embodiment, the portrait construction apparatus of the present invention may be implemented using distributed storage and computing technologies such as Hadoop, Spark, and HBase.
For example, the data acquisition module stores the data acquired offline on the HDFS of the computing cluster for offline computation with Spark. The data acquisition module may further include:
a Flume log collection subunit, configured to collect logs from the Agent;
a Kafka middleware storage subunit, wherein the Flume log collection subunit distributes the collected logs to the Kafka cluster for real-time consumption by Spark Streaming.
The feature extraction module may store the dynamically changing attribute features of users, commodities, and shops in a distributed HBase database, and store the static attribute features of users, commodities, and shops in a MySQL database.
The portrait storage module may store the user portraits, commodity portraits, and shop portraits in the HBase cluster.
The portrait construction method provided by the present invention collects and analyzes comprehensive user behavior logs, social network information, and the like to obtain the multi-dimensional attribute features of users, commodities, and shops, and dynamically updates the portrait labels and weights of users, commodities, and shops. Storing the portraits in a distributed manner provides accurate data for real-time recommendation, so that the method offers good timeliness and comprehensiveness.
It should be noted that, although the steps are described in a specific order, the steps are not necessarily performed in the specific order, and in fact, some of the steps may be performed concurrently or even in a changed order as long as the required functions are achieved.
The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therewith for causing a processor to implement various aspects of the present invention.
The computer readable storage medium may be a tangible device that retains and stores instructions for use by an instruction execution device. The computer readable storage medium may include, for example, but is not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (12)

1. A personalized information recommendation method, comprising the following steps:
step 1: based on multi-source user behavior logs acquired from a log server and social network data queried from a business database, performing data analysis on the user behavior data and the social network attribute features of users and shops to obtain multi-dimensional attribute features of the user, multi-dimensional attribute features of the commodity, and multi-dimensional attribute features of the shop;
step 2: extracting label information and calculating label weights based on the obtained multi-dimensional attribute features of the user, the commodity, and the shop, so as to construct a user portrait, a commodity portrait, and a shop portrait; wherein the user portrait comprises at least one of a gender label, an age label, a business nature label, an industry-of-interest label, a user interest preference label, a device label, a user location label, a consumption capability label, a loyalty label, an activity label, and a user value label; and the weight of the user interest preference label is calculated according to the following formula:
Figure FDA0002953129740000011
wherein w(i) is the weight value corresponding to a browsing behavior, searching behavior, collecting behavior, shopping cart behavior, like behavior, consuming behavior, or commenting behavior, i is the index of the behavior,
Figure FDA0002953129740000012
is the attenuation factor; staytimefactor(st) is the stay-time factor of a browsing behavior, and the stay-time factor of a searching, collecting, shopping cart, like, consuming, or commenting behavior is 1; deepfactor(j) is the depth factor of a searching behavior, and the depth factor of a browsing, collecting, shopping cart, like, consuming, or commenting behavior is 1; m is the number of behaviors; n is the number of days between the time the behavior occurred and the current date; t is the stay time when browsing a web page; d is the access depth after searching a keyword; and α is a constant;
step 3: providing recommendation information for the user based on the constructed user portrait, commodity portrait, or shop portrait.
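The formula of claim 1 is given only as a figure and is not reproduced in this text, so the following Python sketch is an illustration under explicit assumptions rather than the claimed formula. It assumes that the weight is a sum over the m behaviors of w(i) multiplied by the attenuation factor, the stay-time factor, and the depth factor. The exponential form of the attenuation, the concrete behavior weights, and the shapes of the stay-time and depth factors are all hypothetical.

```python
import math

# Hypothetical behavior weights w(i); the claim names the behaviors but gives no values.
BEHAVIOR_WEIGHTS = {
    "browse": 1.0, "search": 2.0, "collect": 3.0, "cart": 4.0,
    "like": 2.0, "consume": 5.0, "comment": 3.0,
}

def attenuation(n_days, alpha):
    # ASSUMPTION: exponential decay in the number of days n; the actual
    # attenuation factor is defined in a figure not available here.
    return math.exp(-alpha * n_days)

def stay_time_factor(behavior, stay_seconds):
    # Per the claim, the factor is 1 for every behavior other than browsing.
    if behavior != "browse":
        return 1.0
    # ASSUMPTION: linear in stay time, capped at 1 after 60 seconds.
    return min(stay_seconds / 60.0, 1.0)

def depth_factor(behavior, depth):
    # Per the claim, the factor is 1 for every behavior other than searching.
    if behavior != "search":
        return 1.0
    # ASSUMPTION: grows logarithmically with the access depth d.
    return 1.0 + math.log1p(depth)

def interest_weight(events, alpha=0.1):
    """Sum the decayed, factor-adjusted weights of all behaviors on one label."""
    total = 0.0
    for e in events:
        b = e["behavior"]
        total += (BEHAVIOR_WEIGHTS[b]
                  * attenuation(e["days_ago"], alpha)
                  * stay_time_factor(b, e.get("stay_seconds", 0.0))
                  * depth_factor(b, e.get("depth", 0)))
    return total
```

Under these assumptions, a purchase made today contributes its full behavior weight, while the same purchase made 30 days ago contributes a smaller amount, which is the qualitative effect an attenuation factor is meant to capture.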
2. The method of claim 1, wherein the multi-dimensional attribute features of the user include at least one of gender, age, business nature, industry of interest, search behavior, collection behavior, browsing behavior, shopping cart behavior, like behavior, consumption behavior, comment behavior, user device, and user location.
3. The method of claim 1, wherein the multi-dimensional attribute features of the commodity comprise at least one of a category, a brand, a price, an industry, an affiliated word, and a click-through rate of the commodity.
4. The method of claim 1, wherein the multi-dimensional attribute features of the shop comprise at least one of shop location, hosted commodities, click-through rate, industry information, brand information, and multi-label classification of posts.
5. The method of claim 1, wherein the weight of the user value label is calculated according to the following formula:
Figure FDA0002953129740000021
where σ is the sigmoid function, i is an index denoting consumption capability, loyalty, or activity, x_i represents the weight of the consumption capability label, loyalty label, or activity label, and w_i is the weight of the user value with respect to consumption capability, loyalty, and activity.
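The formula of claim 5 is likewise given only as a figure. A natural reading of the symbol definitions, taken here as an assumption, is that the user value weight is the sigmoid of the weighted sum of the three label weights, σ(Σ w_i · x_i). The sketch below implements that reading; the dictionary keys and the example coefficients are hypothetical.

```python
import math

def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def user_value_weight(x, w):
    """ASSUMED form: sigmoid of the weighted sum of the three label weights.

    x: {"consumption": x_1, "loyalty": x_2, "activity": x_3}  label weights
    w: same keys, weight of the user value with respect to each label
    """
    return sigmoid(sum(w[k] * x[k] for k in w))

# Hypothetical coefficients; the claim does not give values for w_i.
W = {"consumption": 0.5, "loyalty": 0.3, "activity": 0.2}
```

Because the sigmoid is monotonic and bounded, the resulting weight always lies strictly between 0 and 1 and increases with any of the three label weights when the coefficients are positive.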
6. The method of claim 1, wherein the commodity portrait includes at least one of a category label, a brand label, a price label, an industry label, an affiliated word label, and a popularity label of the commodity.
7. The method of claim 6, wherein the commodity portrait is obtained by:
taking the top K results of the multi-class classification of the commodity in the multi-dimensional attribute features of the commodity to obtain the category labels of the commodity, wherein K is an integer greater than or equal to 2;
directly obtaining a brand label of the commodity according to brand information of the commodity;
obtaining a price interval label of the commodity according to the price of the commodity;
acquiring an industry label of the commodity according to the industry information of the commodity;
obtaining an affiliated word label of the commodity according to the affiliated word information of the commodity;
and dividing the click count of the commodity by the sum of the click counts of all commodities to obtain the click-through rate of the commodity as the popularity label of the commodity.
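The steps of claim 7 can be sketched in Python as follows. The input dictionary layout, the price boundaries, and the interval names are assumptions made for illustration; the claim only states that a price interval label is derived from the price and that the popularity label is the commodity's share of all clicks.

```python
from bisect import bisect_right

# Hypothetical price boundaries and interval names; the claim does not define them.
PRICE_BOUNDS = (50, 200, 1000)
PRICE_LABELS = ("low", "medium", "high", "premium")

def commodity_portrait(item, total_clicks, k=2):
    """Build the labels of one commodity from its attribute features.

    item["category_scores"] maps each candidate category to its classifier
    score; the top K categories (K >= 2 in the claim) become category labels.
    """
    scores = item["category_scores"]
    categories = sorted(scores, key=scores.get, reverse=True)[:k]
    return {
        "category": categories,
        "brand": item["brand"],
        "price": PRICE_LABELS[bisect_right(PRICE_BOUNDS, item["price"])],
        "industry": item["industry"],
        "affiliated_words": list(item["affiliated_words"]),
        # Popularity: this commodity's clicks divided by the clicks of all commodities.
        "popularity": item["clicks"] / total_clicks if total_clicks else 0.0,
    }
```

Guarding against a zero total keeps the sketch usable on an empty click log, a case the claim does not address.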
8. The method of claim 1, wherein the shop portrait includes at least one of a shop location label, a shop hosted-commodity label, a shop industry label, a shop popularity label, and a shop brand label.
9. The method of claim 8, wherein the shop portrait is obtained according to the steps of:
extracting a shop location label from the location features in the multi-dimensional attribute features of the shop;
performing word segmentation on the features of the commodities hosted by the shop and merging the result with the multi-label classification of the posts to obtain the shop hosted-commodity labels;
dividing the click count of the shop by the sum of the click counts of all shops to obtain the click-through rate of the shop as the shop popularity label;
extracting the shop industry type from the shop industry information as a shop industry label;
and merging the brand information of the commodities in the shop to obtain the shop brand label.
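The label-merging steps of claim 9 can be illustrated with a short Python sketch. Whitespace splitting stands in for a real word segmenter, which the claim does not name, and the input field names are hypothetical.

```python
def shop_portrait(shop, total_shop_clicks):
    """Build the labels of one shop from its attribute features."""
    # ASSUMPTION: whitespace splitting as a stand-in for word segmentation.
    goods_words = {word
                   for name in shop["hosted_commodities"]
                   for word in name.lower().split()}
    return {
        "location": shop["location"],
        # Segmented commodity words merged with the post classification labels.
        "hosted_commodity": sorted(goods_words | set(shop["post_labels"])),
        "industry": shop["industry"],
        # Popularity: this shop's clicks divided by the clicks of all shops.
        "popularity": shop["clicks"] / total_shop_clicks if total_shop_clicks else 0.0,
        # Brand labels: the distinct brands of the commodities in the shop.
        "brand": sorted(set(shop["commodity_brands"])),
    }
```

Using sets for the merge removes labels that appear both in the commodity names and in the post classification.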
10. A personalized information recommendation device, comprising:
a feature extraction module, configured to perform, based on multi-source user behavior logs acquired from a log server and social network data queried from a business database, data analysis on the user behavior data and the social network attribute features of users and shops to obtain multi-dimensional attribute features of the user, multi-dimensional attribute features of the commodity, and multi-dimensional attribute features of the shop;
a portrait construction module, configured to extract label information and calculate label weights based on the obtained multi-dimensional attribute features of the user, the commodity, and the shop, so as to construct a user portrait, a commodity portrait, and a shop portrait;
a real-time recommendation module, configured to provide recommendation information for the user based on the constructed user portrait, commodity portrait, or shop portrait.
11. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 9.
12. A computer device comprising a memory and a processor, on which memory a computer program is stored which is executable on the processor, characterized in that the steps of the method of any of claims 1 to 9 are implemented when the processor executes the program.
CN201811276173.0A 2018-10-30 2018-10-30 Personalized information recommendation method and device Active CN109359244B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811276173.0A CN109359244B (en) 2018-10-30 2018-10-30 Personalized information recommendation method and device

Publications (2)

Publication Number Publication Date
CN109359244A CN109359244A (en) 2019-02-19
CN109359244B true CN109359244B (en) 2021-07-20

Family

ID=65347458

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811276173.0A Active CN109359244B (en) 2018-10-30 2018-10-30 Personalized information recommendation method and device

Country Status (1)

Country Link
CN (1) CN109359244B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009116552A (en) * 2007-11-05 2009-05-28 Yahoo Japan Corp Behavior attribute acquisition system and method for controlling the system
CN101493832A (en) * 2009-03-06 2009-07-29 辽宁般若网络科技有限公司 Website content combine recommendation system and method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104966220A (en) * 2015-07-23 2015-10-07 北京按钮云商科技有限公司 Commodity information push and user habit analysis method and system
CN106776619B (en) * 2015-11-20 2020-09-04 百度在线网络技术(北京)有限公司 Method and device for determining attribute information of target object
CN108154401B (en) * 2018-01-15 2022-03-29 阿里巴巴(中国)有限公司 User portrait depicting method, device, medium and computing equipment
CN108230051A (en) * 2018-02-12 2018-06-29 昆山数泰数据技术有限公司 A kind of user based on label Weight algorithm is to the determining method of commodity attention rate

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant