CN109710876B - Information recommendation method and device and computer readable storage medium - Google Patents

Info

Publication number
CN109710876B
Authority
CN
China
Prior art keywords
access
party
accessed object
clustering
accessed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811605413.7A
Other languages
Chinese (zh)
Other versions
CN109710876A (en
Inventor
曲之琳
李琳
吴耀华
李小海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Migu Cultural Technology Co Ltd
China Mobile Communications Group Co Ltd
Original Assignee
Migu Cultural Technology Co Ltd
China Mobile Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Migu Cultural Technology Co Ltd, China Mobile Communications Group Co Ltd filed Critical Migu Cultural Technology Co Ltd
Priority to CN201811605413.7A priority Critical patent/CN109710876B/en
Publication of CN109710876A publication Critical patent/CN109710876A/en
Application granted granted Critical
Publication of CN109710876B publication Critical patent/CN109710876B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The embodiments of the invention disclose an information recommendation method and device and a computer-readable storage medium. The information recommendation method comprises: acquiring real-time stream data, the real-time stream data comprising a mapping relationship between the identifier of each accessed object and the login time of each accessing party to each accessed object; determining the access retention amount of each accessing party for each accessed object according to the mapping relationship; clustering the accessing parties according to the access retention amount to obtain a clustering result; and recommending, according to the clustering result, the accessed object corresponding to the cluster center to each accessing party.

Description

Information recommendation method and device and computer readable storage medium
Technical Field
The invention relates to the field of big data analysis and calculation, in particular to an information recommendation method and device and a computer-readable storage medium.
Background
The rapid growth of the internet has produced massive amounts of information. To help users efficiently find the information they need within it, user demand can be inferred by analyzing user interest through retention analysis, so that information the user is interested in can be recommended in a targeted manner.
The existing mainstream retention analysis method collects data and uploads it to a high-throughput distributed publish-subscribe messaging system, which processes all action stream data in a website, such as access data from web browsing, searching, or access to other objects. The retention analysis is written in the Hive scripting language and obtains recommendation information from user activity identifiers, user follow relationships, or analyzed user characteristics.
However, the existing retention analysis method is an offline calculation model in T+1 mode: retention analysis is performed on offline data, which leads to low accuracy. In addition, the existing method derives recommendation information directly from the retention analysis result, so information recommendation lacks flexibility.
Disclosure of Invention
To solve the above technical problems, embodiments of the present invention aim to provide an information recommendation method and apparatus, and a computer-readable storage medium, which can make targeted recommendations based on accessed objects and improve the flexibility of information recommendation.
The technical scheme of the invention is realized as follows:
In a first aspect, an embodiment of the present invention provides an information recommendation method, where the method includes:
acquiring real-time stream data, where the real-time stream data includes a mapping relationship between the identifier of each accessed object and the login time of each accessing party to each accessed object;
determining the access retention amount of each accessing party for each accessed object according to the mapping relationship;
clustering the accessing parties according to the access retention amount to obtain a clustering result;
and recommending, according to the clustering result, the accessed object corresponding to the cluster center to each accessing party.
In a second aspect, an embodiment of the present invention provides an information recommendation apparatus, where the apparatus includes:
an acquisition unit, configured to acquire real-time stream data, where the real-time stream data includes a mapping relationship between the identifier of each accessed object and the login time of each accessing party to each accessed object;
a determining unit, configured to determine the access retention amount of each accessing party for each accessed object according to the mapping relationship;
a clustering unit, configured to cluster the accessing parties according to the access retention amount of each accessing party for each accessed object to obtain a clustering result;
and a recommending unit, configured to recommend, according to the clustering result, the accessed object corresponding to the cluster center to each accessing party.
In a third aspect, an embodiment of the present invention provides an information recommendation apparatus, where the information recommendation apparatus at least includes a processor, a memory storing instructions executable by the processor, a communication interface, and a bus for connecting the processor, the memory, and the communication interface, and when the instructions are executed, the processor implements the information recommendation method provided in the foregoing embodiment.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the information recommendation method provided in the foregoing embodiment.
Embodiments of the invention provide an information recommendation method, an information recommendation device, and a computer-readable storage medium. The information recommendation method includes: acquiring real-time stream data, where the real-time stream data includes a mapping relationship between the identifier of each accessed object and the login time of each accessing party to each accessed object; determining the access retention amount of each accessing party for each accessed object according to the mapping relationship; clustering the accessing parties according to the access retention amount to obtain a clustering result; and recommending, according to the clustering result, the accessed object corresponding to the cluster center to each accessing party. In other words, on the one hand, the embodiments acquire real-time stream data and can therefore perform effective retention analysis in real time; on the other hand, after the access retention information is obtained, the accessing parties are clustered using that information and recommendations are made according to the clustering result, so that targeted recommendations can be made for each accessing party, improving the flexibility and accuracy of information recommendation.
Drawings
Fig. 1 is a first schematic flowchart of the implementation of an information recommendation method according to an embodiment of the present invention;
Fig. 2 is a second schematic flowchart of the implementation of an information recommendation method according to an embodiment of the present invention;
Fig. 3 is a third schematic flowchart of the implementation of an information recommendation method according to an embodiment of the present invention;
Fig. 4 is a fourth schematic flowchart of the implementation of an information recommendation method according to an embodiment of the present invention;
Fig. 5 is a fifth schematic flowchart of the implementation of an information recommendation method according to an embodiment of the present invention;
Fig. 6 is a sixth schematic flowchart of the implementation of an information recommendation method according to an embodiment of the present invention;
Fig. 7 is a seventh schematic flowchart of the implementation of an information recommendation method according to an embodiment of the present invention;
Fig. 8 is a first schematic structural diagram of an information recommendation device according to an embodiment of the present invention;
Fig. 9 is a second schematic structural diagram of an information recommendation device according to an embodiment of the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
Example one
An embodiment of the present invention provides an information recommendation method applied to an information recommendation device. Fig. 1 is a first schematic flowchart of the implementation of the information recommendation method. As shown in fig. 1, the method implemented by the information recommendation device may include steps 101 to 104, as follows:
Step 101, acquiring real-time stream data.
In the embodiment of the present invention, current retention analysis uses an offline calculation model, which involves a large amount of data to compute and cannot deliver effective retention analysis in real time. To solve these problems, the information recommendation device acquires real-time stream data, where the real-time stream data includes a mapping relationship between the identifier of each accessed object and the login time of each accessing party to each accessed object.
The information recommendation device includes a real-time streaming system and a service system. The real-time streaming system collects the data generated by the service system in real time, passes it to a stream processing framework for processing and statistics, stores it, and displays the statistical results visually in real time. Retention analysis is a statistical method for judging whether each accessing party keeps accessing each accessed object over a long period; it most directly monitors whether an accessing party has become a stable accessing party, so that the operators and developers of each accessed object can understand how attractive each accessed object is.
In the embodiment of the present invention, the identifier of each accessed object indicates the page or element of the accessed object that the accessing party accesses.
It should be noted that an access by an accessing party may be of the registration access type or the login access type, and the two types are processed differently. Therefore, in the embodiment of the present invention, the access type of the accessing party needs to be determined before the real-time stream data is acquired.
Further, the acquiring of the real-time stream data by the information recommendation device in the embodiment of the present invention may include:
acquiring access information of each accessing party accessing each accessed object; when the access type is determined to be login access, acquiring the login time for each accessed object; and acquiring the real-time stream data according to the login time and the identifier of each accessed object.
It should be noted that the access information includes the identifier of each accessed object and the access type, where the access type indicates whether the access behavior of each accessing party is a login access or a registration access. When the access type is login, the identifier and the login time are acquired, and the generated real-time stream data may take the form Login-accessing party-login time-identifier of the accessed object. When the access type is registration, the user has just registered, and the next login time needs to be recorded in order to obtain the real-time stream data.
Illustratively, the real-time stream data may be Login-a-2018-06-26-A page.
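The record layout in the example above can be made concrete with a small parser. The following is a minimal sketch only: the field order is inferred from the example "Login-a-2018-06-26-A page", and the names `StreamRecord`, `RECORD_PATTERN`, and `parse_record` are hypothetical and do not come from the patent. It handles only the login layout that includes an accessed-object identifier.

```python
import re
from dataclasses import dataclass
from datetime import date

# Assumed layout, inferred from the example record:
# <access type>-<accessing party>-<YYYY-MM-DD>-<accessed object identifier>
RECORD_PATTERN = re.compile(
    r"^(?P<access_type>[A-Za-z]+)-(?P<party>[^-]+)-"
    r"(?P<day>\d{4}-\d{2}-\d{2})-(?P<object_id>.+)$"
)


@dataclass(frozen=True)
class StreamRecord:
    access_type: str   # e.g. "Login"
    party: str         # accessing party
    login_date: date   # login time, at day granularity
    object_id: str     # identifier of the accessed object


def parse_record(line: str) -> StreamRecord:
    """Split one stream record into its four assumed fields."""
    match = RECORD_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized stream record: {line!r}")
    return StreamRecord(
        access_type=match["access_type"],
        party=match["party"],
        login_date=date.fromisoformat(match["day"]),
        object_id=match["object_id"],
    )


record = parse_record("Login-a-2018-06-26-A page")
print(record)
```

A regular expression is used instead of a plain split on "-" because the date field itself contains hyphens.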
Step 102, determining the access retention amount of each accessing party for each accessed object according to the mapping relationship.
In the embodiment of the invention, after the information recommendation device acquires the real-time stream data, it determines the access retention amount of each accessing party for each accessed object according to the mapping relationship.
The access retention amount may be expressed as: accessing party, first login time, login time difference, accessed object, number of accesses. The login time difference is determined from the login times of each accessing party to each accessed object, and the number of accesses is the number of times each accessing party accesses each accessed object at the login time.
Illustratively, the access retention amount may be a-2018-06-24-10-A page-n, which indicates that accessing party a accessed the A page, corresponding to the identifier of the accessed object, n times on the 10th day after first logging in to the accessed object on 2018-06-24.
The determining, by the information recommendation device, of the access retention amount of each accessing party for each accessed object according to the mapping relationship may include: performing deduplication on the mapping relationships corresponding to the real-time stream data within a preset time period to obtain deduplicated mapping relationships; and performing retention analysis on the deduplicated mapping relationships to obtain the access retention amount of each accessing party for each accessed object, where the access retention amount includes the retention amount of the accessed object and the retention amount of the accessing party.
Step 103, clustering the accessing parties according to the access retention amount to obtain a clustering result.
In the embodiment of the invention, after determining the access retention amount of each accessing party for each accessed object according to the mapping relationship, the information recommendation device clusters the accessing parties according to that access retention amount to obtain a clustering result.
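Before clustering, the retention amounts have to be turned into one numeric vector per accessing party. The text does not specify this representation, so the following is only one plausible sketch: it assumes each accessing party is described by its access count for every accessed object, and the function name `build_feature_vectors` is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def build_feature_vectors(
    retention: Iterable[Tuple[str, str, int]],
) -> Tuple[List[str], Dict[str, Tuple[float, ...]]]:
    """Turn (accessing party, accessed object, access count) rows into
    one count vector per accessing party (assumed feature layout)."""
    rows = list(retention)
    # Fix a stable column order over all accessed objects.
    object_ids = sorted({object_id for _, object_id, _ in rows})
    column = {object_id: j for j, object_id in enumerate(object_ids)}
    totals: Dict[str, List[float]] = {}
    for party, object_id, count in rows:
        vector = totals.setdefault(party, [0.0] * len(object_ids))
        vector[column[object_id]] += count
    return object_ids, {party: tuple(v) for party, v in totals.items()}


object_ids, vectors = build_feature_vectors(
    [("a", "A page", 3), ("a", "B page", 1), ("b", "B page", 4)]
)
print(object_ids)  # ['A page', 'B page']
print(vectors)     # {'a': (3.0, 1.0), 'b': (0.0, 4.0)}
```

Other features, such as the login time difference, could be appended to each vector in the same way.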
It should be noted that the information recommendation device may analyze the access retention amount of each accessing party for each accessed object with the K-means clustering algorithm in order to cluster the accessing parties.
The K-means algorithm repeatedly moves the center point of each class, that is, the centroid, to the mean position of the members the class contains, and then reassigns the members. K is a hyperparameter of the algorithm and denotes the number of classes. K-means can automatically assign samples to different classes, but it cannot decide how many classes to use. K must be a positive integer smaller than the number of samples in the training set.
The number of classes may be specified by the problem at hand. The parameters of K-means are the centroid positions of the classes and the positions of the observations within them. As with generalized linear models and decision trees, the optimal K-means parameters are those that minimize a cost function.
The cost function is the sum of the distortions of the classes, where the distortion of each class equals the sum of the squared distances between the class centroid and the positions of its members. The more compact the members of a class are, the smaller its distortion; the more dispersed they are, the greater its distortion. Solving for the parameters that minimize the cost function is a process of repeatedly reassigning the observations contained in each class and moving the centroid of each class. In practice, the initial centroid positions may be chosen at random from the positions of the observations. At each iteration, the K-means algorithm assigns each observation to its nearest class and then moves each centroid to the mean position of all members of that class. The K-means cost function is given by equation (1):
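$$V = \sum_{i=1}^{k} \sum_{x_j \in S_i} \lVert x_j - u_i \rVert^2 \tag{1}$$
where V is the cost function, k is the number of clusters, x_j denotes a data point in the data set, S_i denotes the set of data points assigned to the i-th cluster, and u_i denotes the center of the i-th cluster.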
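The cost described above, the sum over all clusters of the squared distances between each member and its cluster center, can be evaluated directly. The following is a small sketch with made-up sample points; the function names are illustrative and not part of the patent.

```python
from typing import Sequence

Point = Sequence[float]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cost(clusters: Sequence[Sequence[Point]], centers: Sequence[Point]) -> float:
    """Sum of per-cluster distortions: for every cluster, add the squared
    distance from each member to that cluster's center."""
    return sum(
        squared_distance(x, center)
        for members, center in zip(clusters, centers)
        for x in members
    )


# Two clusters with hypothetical members and their centers.
clusters = [[(0.0, 0.0), (2.0, 0.0)], [(10.0, 10.0)]]
centers = [(1.0, 0.0), (10.0, 10.0)]
print(kmeans_cost(clusters, centers))  # 2.0
```

The first cluster contributes 1 + 1 and the second contributes 0, so compact clusters yield a small cost, matching the description of distortion.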
Step 104, recommending, according to the clustering result, the accessed object corresponding to the cluster center to each accessing party.
In the embodiment of the invention, after clustering the accessing parties, the information recommendation device recommends the accessed object corresponding to the cluster center to each accessing party according to the clustering result.
It should be noted that the clustering result may be the K cluster centers obtained by the K-means clustering algorithm. From the K cluster centers, the accessed objects corresponding to each accessing party can be obtained, that is, the accessed objects that each accessing party is interested in, and these are recommended to the corresponding accessing party.
Illustratively, the K-means algorithm determines that accessing party a belongs to cluster center A, which indicates that the page or element corresponding to cluster center A is one that accessing party a is interested in; the page or element corresponding to cluster center A is therefore recommended to accessing party a as the information to be recommended.
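How an accessed object "corresponds to" a cluster center is not spelled out in the text. The sketch below assumes that each cluster center is a vector of per-object access counts and that the object with the largest value in the center is the one to recommend; this interpretation, the function names, and the sample values are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence


def nearest_center(vector: Sequence[float], centers: Sequence[Sequence[float]]) -> int:
    """Index of the cluster center closest to the vector (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(vector, centers[i]))


def recommend(
    party_vectors: Dict[str, Sequence[float]],
    centers: Sequence[Sequence[float]],
    object_ids: Sequence[str],
    top_n: int = 1,
) -> Dict[str, List[str]]:
    """For each accessing party, recommend the objects weighted most
    heavily in the center of the cluster the party belongs to."""
    recommendations: Dict[str, List[str]] = {}
    for party, vector in party_vectors.items():
        center = centers[nearest_center(vector, centers)]
        ranked = sorted(range(len(object_ids)), key=lambda j: center[j], reverse=True)
        recommendations[party] = [object_ids[j] for j in ranked[:top_n]]
    return recommendations


object_ids = ["A page", "B page"]
centers = [(5.0, 1.0), (0.5, 4.0)]
party_vectors = {"a": (4.0, 0.0), "b": (1.0, 3.0)}
print(recommend(party_vectors, centers, object_ids))
# {'a': ['A page'], 'b': ['B page']}
```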
Fig. 2 is a second schematic flowchart of the implementation of the information recommendation method according to an embodiment of the present invention. As shown in fig. 2, the process in which the information recommendation device determines the access retention amount of each accessing party for each accessed object according to the mapping relationship, that is, step 102, may include step 102a and step 102b, as follows:
Step 102a, determining the number of times each accessing party accesses each accessed object according to the mapping relationship.
In the embodiment of the invention, the information recommendation device determines the number of times each accessing party accesses each accessed object according to the mapping relationship between the identifier of each accessed object and the login time of each accessing party to each accessed object.
It should be noted that if an accessing party accesses an accessed object on a certain date, the number of accesses is 1; if the accessing party accesses the accessed object again on that date, the number of accesses is 2. Accumulating in this way yields the number of times each accessing party accesses each accessed object.
Illustratively, the real-time stream data is Login-a-2018-06-30-A page; if accessing party a accesses the A page n more times on 2018-06-30, the number of times accessing party a accesses the A page on 2018-06-30 is n + 1.
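The accumulation of access counts per accessing party, date, and accessed object can be sketched with a counter. The tuple layout and the name `count_accesses` are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable, Tuple

# (accessing party, login date, accessed object identifier)
AccessEvent = Tuple[str, str, str]


def count_accesses(events: Iterable[AccessEvent]) -> Counter:
    """Count how often each party accessed each object on each date."""
    return Counter(events)


# Party a accesses the A page once and then twice more on the same day.
events = [("a", "2018-06-30", "A page")] * 3 + [("b", "2018-06-30", "A page")]
counts = count_accesses(events)
print(counts[("a", "2018-06-30", "A page")])  # 3
```

With n = 2 repeat visits after the first access, the count is n + 1 = 3, as in the example above.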
Step 102b, determining the access retention amount of each accessing party for each accessed object according to the number of accesses and the login time at which each accessing party accesses each accessed object.
In the embodiment of the present invention, after obtaining the number of accesses and the login time of each accessing party for each accessed object, the information recommendation device may obtain the access retention amount of each accessing party for each accessed object from the number of accesses and the login time.
Illustratively, the real-time stream data is Login-a-2018-06-30-A page, the corresponding first access time is 2018-06-24, the access time difference is 6 days, and the number of accesses is 3. The access retention amount a-2018-06-24-6-A page-3 is then obtained, indicating that accessing party a accessed the A page 3 times on the 6th day after the first access on 2018-06-24.
Fig. 3 is a third schematic flowchart of the implementation of the information recommendation method according to an embodiment of the present invention. As shown in fig. 3, the process in which the information recommendation device determines the access retention amount of each accessing party for each accessed object according to the number of accesses and the login time, that is, step 102b, includes step 102b1, step 102b2, and step 102b3, as follows:
Step 102b1, searching a preset database, for each accessing party, for the first login time at which the accessing party accessed each accessed object.
In the embodiment of the invention, the preset database in the information recommendation device stores the time at which each accessing party first accessed each accessed object or the time at which each accessing party registered with each accessed object.
For example, the preset database may store information such as: Login-a-2018-06-24-A page, Login-b-2018-05-24-A page, and Login-c-2018-04-24-A page.
Further, the searching of the preset database by the information recommendation device for the first login time at which each accessing party accessed each accessed object includes:
acquiring, from the preset database, the historical identifier and the historical login time of each accessing party accessing each accessed object; and when the historical identifier is determined to be consistent with the identifier of the accessed object, taking the historical login time as the first login time at which the accessing party accessed the accessed object.
It should be noted that if the acquired real-time stream data has the form Login-accessing party-login time-identifier of the accessed object, the information recommendation device needs to find, in the preset database, the historical data stream corresponding to the accessing party's access to the accessed object, where the historical data stream includes the historical login time corresponding to the historical identifier of the accessed object that the accessing party accessed for the first time. If the acquired real-time stream data has the form Register-accessing party-registration time, the next access time is recorded, and processing is then performed based on that next access time to obtain the corresponding retention amount.
If the historical identifier is consistent with the identifier of the accessed object, the accessing party has accessed the same identifier before, and the historical login time of the first access to that identifier can be looked up in the database. If no historical identifier consistent with the identifier of the accessed object exists in the preset database, the accessing party is accessing that identifier of the accessed object for the first time; in that case, the information about the accessing party's access to the accessed object needs to be recorded and stored in the preset database to facilitate the next calculation.
Illustratively, the real-time stream data is Login-a-2018-06-30-A page. If the corresponding historical data stream found in the preset database is Login-a-2018-06-24-A page, the historical identifier is determined to be consistent with the identifier of the accessed object, and accordingly the first login time for accessing the A page is 2018-06-24.
When the real-time stream data is Login-a-2018-06-30-A page and the preset database contains no record of accessing party a accessing the A page, the information about this access, namely Login-a-2018-06-30-A page, is recorded in the preset database.
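The look-up-or-record behavior of step 102b1 can be sketched with an in-memory dictionary standing in for the preset database. The dictionary, its key layout, and the name `first_login` are assumptions for illustration; the text does not describe the actual storage.

```python
from datetime import date
from typing import Dict, Tuple

# Stand-in for the preset database:
# (accessing party, accessed object identifier) -> first login date
FirstLoginDB = Dict[Tuple[str, str], date]


def first_login(db: FirstLoginDB, party: str, object_id: str, login_date: date) -> date:
    """Return the stored first login date for this party and object.
    If none is stored, record the current login date as the first one."""
    return db.setdefault((party, object_id), login_date)


db: FirstLoginDB = {("a", "A page"): date(2018, 6, 24)}

# Known party/object pair: the historical login time is returned.
print(first_login(db, "a", "A page", date(2018, 6, 30)))  # 2018-06-24

# Unknown pair: the current access is stored as the first login.
print(first_login(db, "a", "B page", date(2018, 6, 30)))  # 2018-06-30
```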
Step 102b2, obtaining the login time difference from the login time at which each accessing party accesses each accessed object and the first login time.
In the embodiment of the invention, after the information recommendation device obtains the login time in the real-time stream data and the first login time, it can obtain the login time difference by subtracting the first login time from the login time.
Illustratively, the first login time found in the preset database is 2018-06-24 and the login time at which the accessing party accesses the accessed object is 2018-06-30, so the corresponding login time difference is 6 days.
Step 102b3, determining the access retention amount of each accessing party for each accessed object according to the login time difference and the number of accesses.
In the embodiment of the invention, after the information recommendation device obtains the login time difference and the number of accesses, it can determine the access retention amount of each accessing party for each accessed object from them.
Illustratively, if the real-time stream data is Login-a-2018-06-30-A page, the A page is accessed once more on 2018-06-30, and the corresponding historical data stream in the preset database is Login-a-2018-06-24-A page, then the login time difference is 6 days and the number of accesses is 2, and the retention amount of accessing party a for the A page is determined to be a-2018-06-24-6-A page-2.
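Steps 102b2 and 102b3 amount to a date subtraction followed by assembling the retention record. The sketch below follows the field order given earlier for the access retention amount (accessing party, first login time, login time difference, accessed object, number of accesses); the function name and the string format are illustrative assumptions.

```python
from datetime import date


def retention_record(
    party: str,
    object_id: str,
    first_login: date,
    login_date: date,
    access_count: int,
) -> str:
    """Build '<party>-<first login>-<day difference>-<object>-<count>'."""
    # Login time difference in whole days.
    diff_days = (login_date - first_login).days
    return f"{party}-{first_login.isoformat()}-{diff_days}-{object_id}-{access_count}"


print(retention_record("a", "A page", date(2018, 6, 24), date(2018, 6, 30), 2))
# a-2018-06-24-6-A page-2
```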
Fig. 4 is a fourth schematic flowchart of the implementation of the information recommendation method according to an embodiment of the present invention. As shown in fig. 4, before the information recommendation device clusters the accessing parties according to the access retention amount, that is, before step 103, the method may further include step 105, as follows:
Step 105, determining the type of each accessing party according to whether the access retention amount of the accessing party for each accessed object within a preset time length is lower than a preset value.
In the embodiment of the invention, before clustering the accessing parties, the information recommendation device can determine the type of each accessing party according to whether the accessing party's access retention amount for each accessed object within the preset time length is lower than the preset value.
It should be noted that the accessing party type may include loyal accessing parties and non-loyal accessing parties. A loyal accessing party is one whose access retention amount for each accessed object within the preset time length is higher than the preset value; a non-loyal accessing party is one whose access retention amount for each accessed object within the preset time length is lower than the preset value.
In the embodiment of the invention, the information recommendation device can obtain the number of times each accessing party accesses each accessed object from the access retention amount within the preset time length, and can then determine the type of each accessing party by comparing the number of accesses with the preset value: when the number of accesses is higher than the preset value, the accessing party is determined to be a loyal accessing party, and when it is lower than the preset value, the accessing party is determined to be a non-loyal accessing party.
It should be noted that the preset time length may be set according to actual requirements; for example, it may be set to one month.
Illustratively, the access retention amounts of accessing party a for the accessed objects within the preset time length are a-2018-06-24-n-A page-n, a-2018-06-24-n-B page-q, and a-2018-06-24-n-C page-x, so the number of times accessing party a accesses the accessed objects is n + q + x.
Before determining the type of each accessing party by comparing the number of accesses with the preset value, the information recommendation device needs to define active accessing parties, for example, parties that log in on more than 20 days per month; the average number of times these parties access each accessed object within the preset time period is then obtained from the retention amounts of the active accessing parties.
Illustratively, the preset time period may be set to one month, and the corresponding preset value is 20.
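The type determination in step 105 can be sketched as a threshold comparison on the summed access counts. The labels "loyal" and "non-loyal", the function names, and the sample counts are illustrative; the text does not say how a count exactly equal to the preset value is treated, so this sketch assumes it counts as non-loyal.

```python
from typing import Dict


def total_accesses(per_object_counts: Dict[str, int]) -> int:
    """Sum the access counts over all accessed objects (n + q + x)."""
    return sum(per_object_counts.values())


def classify_party(per_object_counts: Dict[str, int], preset_value: float) -> str:
    """Loyal if the total exceeds the preset value, otherwise non-loyal.
    Assumption: a total equal to the preset value is treated as non-loyal."""
    return "loyal" if total_accesses(per_object_counts) > preset_value else "non-loyal"


# Hypothetical one-month access counts per accessing party.
monthly_counts = {
    "a": {"A page": 12, "B page": 7, "C page": 5},
    "b": {"A page": 3, "B page": 1},
}
types = {
    party: classify_party(counts, preset_value=20)
    for party, counts in monthly_counts.items()
}
print(types)  # {'a': 'loyal', 'b': 'non-loyal'}
```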
Correspondingly, clustering the accessing parties according to the access retention amount of each accessing party for each accessed object includes:
when the type of an accessing party is determined to be a preset type, monitoring the accessed objects corresponding to the preset type; and when a change in the accessed objects corresponding to the preset type is detected, clustering the accessing parties of the preset type according to their access retention amounts for those accessed objects.
In the embodiment of the invention, the preset type indicates that the accessing party's access retention amount for each accessed object within the preset time length is higher than the preset value.
As described above, an accessing party whose access retention amount for each accessed object within the preset time length is higher than the preset value is a loyal accessing party; that is, the embodiment of the present invention monitors whether the accessed objects corresponding to loyal accessing parties change.
It should be noted that, in the embodiment of the present invention, the type of each accessing party may be determined from the number of accesses. A change in the type of an accessing party indicates that the number of times the accessing party accesses the accessed objects has changed, so the information previously recommended to that accessing party may no longer match its interests. The information to be recommended therefore needs to be determined again for the accessed objects corresponding to the changed preset type.
Fig. 5 is a schematic flowchart of a fifth implementation of the information recommendation method according to the embodiment of the present invention. As shown in fig. 5, the process by which the information recommendation device clusters the visitors according to the access reservation amounts of the visitors to the visited objects, that is, step 103, may include step 103a and step 103b, as follows:
Step 103a, setting an initial centroid according to each accessed object.
In the embodiment of the invention, the information recommendation device needs to set the initial centroid in the process of clustering each visitor according to the visiting reservation quantity of each visitor to each visited object, so that clustering can be performed based on the initial centroid.
Step 103b, clustering the access parties according to the access reservation amount and the initial centroid to obtain a clustering result.
In the embodiment of the invention, after the information recommendation device sets the initial centroid, the information recommendation device can cluster each visitor through the initial centroid and the access reservation quantity of each visitor to each accessed object by the K-means clustering algorithm.
It should be noted that the K-means clustering algorithm may include the following steps:
1) Determine an initial centroid for each cluster, that is, select K data points from the data set as the K initial centroids.
2) Assign each sample to the nearest cluster according to the minimum-distance principle.
It should be noted that the distance between two points is measured with the Euclidean distance formula. For example, for two points T0(x1, y1) and T1(x2, y2), the Euclidean distance d between T0 and T1 is given by formula (2):
$$d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \tag{2}$$
3) Use the sample mean of each cluster as the new cluster center.
4) Repeat steps 2) and 3) until the cluster centers no longer change, which yields K clusters.
It should be noted that the K-means clustering algorithm is an unsupervised learning method: it can perform classification statistics directly, without prior training and learning on big data.
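The four K-means steps listed above can be sketched in plain Python as follows. This is a minimal sketch, not the embodiment's implementation: the function names, the `max_iter` safety limit, the handling of empty clusters, and the sample vectors are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def euclidean(p: Point, q: Point) -> float:
    """Euclidean distance, generalising the two-dimensional formula to any dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def k_means(samples: List[Point], initial_centroids: List[Point],
            max_iter: int = 100) -> Tuple[List[List[float]], List[int]]:
    centroids = [list(map(float, c)) for c in initial_centroids]  # step 1
    labels: List[int] = []
    for _ in range(max_iter):
        # Step 2: assign every sample to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: euclidean(s, centroids[j]))
            for s in samples
        ]
        # Step 3: the mean of each cluster becomes the new centre.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [s for s, lab in zip(samples, labels) if lab == j]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(old)  # assumption: an empty cluster keeps its centre
        # Step 4: stop once the centres no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical retention vectors, one row per access party, one column per page.
samples = [[10, 10, 5, 10], [20, 5, 10, 20], [5, 10, 20, 1], [11, 9, 5, 10]]
centroids, labels = k_means(samples, initial_centroids=samples[:2])
print(labels)  # [0, 1, 0, 0]
```

Choosing the first K samples as initial centroids is only one of several possible initialisations; the result of K-means generally depends on that choice.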
Fig. 6 is a schematic flowchart of a sixth implementation of the information recommendation method according to the embodiment of the present invention. As shown in fig. 6, the process by which the information recommendation device recommends, to each visitor, the accessed object corresponding to the cluster center according to the clustering result, that is, step 104, may include step 104a, step 104b and step 104c, as follows:
Step 104a, determining the clustering center to which each access party belongs according to the clustering result.
In the embodiment of the present invention, before acquiring an accessed object corresponding to a clustering center, an information recommendation device needs to determine the clustering center to which each accessing party belongs according to a clustering result.
Illustratively, the clustering result may be that the visitor a [10, 10, 5, 10] corresponds to the clustering center A and the visitor b [20, 5, 10, 20] corresponds to the clustering center B, where a vector such as [10, 10, 5, 10] represents the retention amounts of one visitor for the respective visited objects.
Step 104b, acquiring the accessed object corresponding to the cluster center from the accessed objects.
Step 104c, recommending each accessed object to each access party.
In the embodiment of the invention, the information recommendation device can acquire each accessed object corresponding to the clustering center from the clustering result, and recommend each accessed object corresponding to the clustering center to each access party as recommendation information.
It should be noted that, in the embodiment of the present invention, after the accessed objects of a loyalty access party change, that is, when the loyalty access party becomes a non-loyalty access party, the access party is further classified by the clustering algorithm, so that the accessed objects in which the changed access party is interested can be obtained and recommendation for the changed access party can be implemented.
It can be understood that the embodiment of the invention distinguishes different types of the access party through the access retention amount, and then carries out information recommendation based on the type of the access party, so that the recommended information can be adjusted in time according to the types of different access objects, and the information can be recommended in a targeted manner.
According to the information recommendation method provided by the embodiment of the invention, on one hand, the remaining quantities of different objects are analyzed according to the access time and the access page, so that the processing process can be simplified, the data processing quantity is reduced, and the specific page on the interested target object is determined; on the other hand, the access objects are classified, targeted recommendation is performed based on the types of the access objects, and the accuracy of information recommendation can be improved.
Example two
Based on the same inventive concept of the first embodiment, in order to better explain the information recommendation method provided by the embodiment of the present invention, the information recommendation method is applied to the information recommendation device by taking an example in which the identifier of each accessed object corresponds to an access page, fig. 7 is a schematic diagram illustrating an implementation flow of the information recommendation method provided by the embodiment of the present invention, and as shown in fig. 7, the information recommendation method implemented by the information recommendation device may include the following steps:
Step 201, establishing a database according to the behavior data of the access party.
In the embodiment of the invention, the access party behaviors comprise login access behaviors and registration access behaviors; correspondingly, the access party behavior data comprise first login data and registration data. Establishing the database according to the access party behavior data includes establishing a login table and establishing a registration table.
Illustratively, the database may be the non-relational database Redis.
It should be noted that establishing the database by the information recommendation device includes the following. First, the registration times of the different access parties are stored: the access party and the specific date on which it registered are stored, and the key is set to "access party-specific date of registration", which yields the corresponding registration table. Second, the first login times of the different access parties are stored; three columns of data are required, namely the access party, the specific date of its first login and the specific page it visited, and the key is set to "access party-specific date of first login-specific page visited", which yields the corresponding login table.
Illustratively, the registration table is shown in Table 1 and the login table is shown in Table 2.
TABLE 1
Register-a-2018-06-10
Register-b-2018-05-10
Register-c-2018-04-10
Register-d-2018-04-10
TABLE 2
Login-a-2018-06-24-A page
Login-b-2018-05-24-A page
Login-c-2018-04-24-A page
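The key layout of Tables 1 and 2 can be sketched as follows. The sketch uses plain Python dictionaries as a stand-in for the key-value database; the helper names `register_key` and `login_key`, and the choice to store the date as the value, are assumptions made for illustration.

```python
from typing import Dict


def register_key(party: str, registration_date: str) -> str:
    """Key of a registration-table entry, e.g. 'Register-a-2018-06-10'."""
    return f"Register-{party}-{registration_date}"


def login_key(party: str, first_login_date: str, page: str) -> str:
    """Key of a login-table entry, e.g. 'Login-a-2018-06-24-A page'."""
    return f"Login-{party}-{first_login_date}-{page}"


# Dictionaries stand in for the key-value store (an assumption of this sketch).
registration_table: Dict[str, str] = {}
login_table: Dict[str, str] = {}

for party, day in [("a", "2018-06-10"), ("b", "2018-05-10"),
                   ("c", "2018-04-10"), ("d", "2018-04-10")]:
    registration_table[register_key(party, day)] = day

for party, day in [("a", "2018-06-24"), ("b", "2018-05-24"), ("c", "2018-04-24")]:
    login_table[login_key(party, day, "A page")] = day

print(sorted(login_table))
```

Encoding the access party, date and page in the key lets a later lookup by access party and page be expressed as a simple key match.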
Step 202, obtaining a real-time access data stream.
In the embodiment of the invention, the information recommendation device acquires the behavior data of the access party from the real-time data stream; the behavior data comprises the access time and the access information, and the behavior type of the access party is determined from the behavior data. When the behavior type is the login access type, the corresponding real-time data stream record consists of the Login marker, the access party, the login time and the specific page accessed; when the behavior type is determined to be the registration type, the registration time is recorded for use at subsequent logins.
Illustratively, the real-time data stream may be a Login-a-2018-06-26-A page.
And step 203, obtaining the retention amount of the access page of the access party according to the database and the real-time data stream.
In the embodiment of the present invention, if the real-time data stream received by the information recommendation apparatus is Login-a-2018-06-26-A page, the login table of step 201 is searched by the visitor a to obtain the first login time 2018-06-24 of the visitor a. The difference between the login time 2018-06-26 and the first login time 2018-06-24 is calculated to be 2, and the retention amount of the visitor a accessing the A page is obtained from this difference as a-2018-06-24-2-A page-1. If the visitor a accesses the A page again on the same day, the count is accumulated and the retention amount becomes a-2018-06-24-2-A page-2; accumulating in this way, the retention amount of the visitor a accessing the A page is a-2018-06-24-2-A page-n.
If the real-time data stream received by the information recommendation device is Login-a-2018-06-30-A page, the visitor a logged in to the A page at the login time 2018-06-30. The first login time 2018-06-24 is found by the visitor a, the difference between the login time 2018-06-30 and the first login time 2018-06-24 is calculated to be 6, and the retention amount of the visitor a accessing the A page is obtained as a-2018-06-24-6-A page-1.
If the data stream received by the information recommendation device is Register-d-2018-04-10, the visitor d has just registered and has not logged in yet; its next login time needs to be recorded so that the values can be accumulated.
If the real-time data stream received by the information recommendation device is Login-a-2018-07-24-B page, the login table of step 201 is searched and shows that the visitor a has not visited the B page before, so the real-time data stream is recorded in the login table.
In the embodiment of the present invention, by processing the real-time data stream, the information recommendation apparatus can obtain the number of times an access party accesses a page at a given time difference. For example, the retention amount a-2018-06-24-10-A page-n indicates that the visitor a first logged in to the A page on 2018-06-24 and that, 10 days later, it visited the A page n times on that day.
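The retention bookkeeping walked through above can be sketched as follows. This is a sketch under stated assumptions: the tuple-keyed dictionaries, the function name `handle`, and the decision to ignore non-Login records are illustrative choices, not details given in the embodiment.

```python
from collections import defaultdict
from datetime import date
from typing import DefaultDict, Dict, Tuple

# first_login[(party, page)] -> date of the first login to that page
first_login: Dict[Tuple[str, str], date] = {("a", "A page"): date(2018, 6, 24)}
# retention[(party, first login date, day difference, page)] -> visit count
retention: DefaultDict[Tuple[str, str, int, str], int] = defaultdict(int)


def handle(record: str) -> None:
    """Process one stream record such as 'Login-a-2018-06-26-A page'."""
    if not record.startswith("Login-"):
        return  # e.g. Register records carry no page visit to count
    _kind, party, year, month, day, page = record.split("-", 5)
    login_date = date(int(year), int(month), int(day))
    key = (party, page)
    if key not in first_login:
        first_login[key] = login_date  # first visit to this page: just record it
        return
    diff = (login_date - first_login[key]).days
    retention[(party, first_login[key].isoformat(), diff, page)] += 1


for rec in ["Login-a-2018-06-26-A page", "Login-a-2018-06-26-A page",
            "Login-a-2018-06-30-A page", "Register-d-2018-04-10",
            "Login-a-2018-07-24-B page"]:
    handle(rec)

print(dict(retention))
```

Each key of `retention` corresponds to one retention string of the form a-2018-06-24-2-A page-n, with the count n stored as the value.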
Step 204, processing the retention amount of the access page of the access party to obtain the retention amount of the access page within the preset time period of the access party.
Illustratively, the retention amounts of the visitors accessing the pages within the preset time period are: a-2018-06-24-10-A page-n, b-2018-05-24-10-A page-m, c-2018-04-24-10-A page-p, a-2018-06-24-10-B page-q, b-2018-05-24-10-B page-r, c-2018-04-24-10-B page-s, a-2018-06-24-10-C page-x, b-2018-05-24-10-C page-y, c-2018-04-24-10-C page-z.
According to these retention amounts, 10 days after the visitors first logged in to the page, the total retention amount of the A page is n + m + p, the total retention amount of the B page is q + r + s, and the total retention amount of the C page is x + y + z. Assuming the accessed object has only the three pages A, B and C, the total retention amount of the accessed object is all = n + m + p + q + r + s + x + y + z. The retention rate of the A page is (n + m + p)/all, the retention rate of the B page is (q + r + s)/all, and the retention rate of the C page is (x + y + z)/all.
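The per-page totals and retention rates described above can be worked through with concrete numbers. In the following sketch the counts are made-up stand-ins for n, m, p, q, r, s, x, y and z, and the function name `page_retention_rates` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (access party, page, visit count on day 10 after first login).
# The counts are invented values standing in for n, m, p, q, r, s, x, y, z.
records: List[Tuple[str, str, int]] = [
    ("a", "A page", 3), ("b", "A page", 2), ("c", "A page", 1),
    ("a", "B page", 4), ("b", "B page", 1), ("c", "B page", 1),
    ("a", "C page", 2), ("b", "C page", 3), ("c", "C page", 3),
]


def page_retention_rates(rows: List[Tuple[str, str, int]]) -> Dict[str, float]:
    """Sum the counts per page and divide each sum by the overall total."""
    totals: Dict[str, int] = defaultdict(int)
    for _party, page, count in rows:
        totals[page] += count
    overall = sum(totals.values())
    if overall == 0:
        return {}
    return {page: total / overall for page, total in totals.items()}


rates = page_retention_rates(records)
print(rates)
```

With these values the page totals are 6, 6 and 8 out of 20, so the rates come out as roughly 0.3, 0.3 and 0.4 and sum to 1.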
From the retention amounts of the pages accessed by the visitors, the corresponding user retention amount and retention rate after first registration can also be obtained: after the visitors register with the accessed object for the first time, deduplicated statistics are used; if the visitors retained after 10 days are the three visitors a, b and c, the visitor retention amount is 3, the number of visitors registered at the beginning is a + b + c + d, and the total user retention rate is 3/(a + b + c + d).
Step 205, determining the loyalty access party by analyzing the retention amount of the access page within the preset time period of the access party.
In the embodiment of the invention, the analyzing, by the information recommendation device, the retention amount of the access page in the preset period of time of the access party comprises the following steps: counting the number of visiting days of the visiting party in a preset time period; and accumulating and counting the times of the access party accessing all the pages on the accessed object every day, sequentially obtaining the times of the access party accessing all the pages on the accessed object in any time period, and obtaining a time table for accessing all the pages on the target object.
Illustratively, if the real-time access statistics data stream is a-2018-06-24-N-A page-n, a-2018-06-24-N-B page-q, a-2018-06-24-N-C page-x, through a-2018-06-24-N-N page-z, the number of times the access party accesses all pages on the target object on that day is n + q + x + … + z.
Assuming that the preset time period is one month, an access party whose number of access days in the month is greater than 20 is set as an active access party, and the average number of times the accessed object is accessed per day within the month is taken from the count table corresponding to the active access parties. Using each access party's number of accesses and this average, the access parties of the accessed object can be divided into loyalty access parties and non-loyalty access parties: when an access party's number of accesses is greater than or equal to the average, it is determined to be a loyalty access party; when its number of accesses is less than the average, it is determined to be a non-loyalty access party.
For example, the average number of times of accessing the accessed object every day in one month is set to 20, if the number of times of accessing the accessed object every day in one month by the accessing party a is greater than or equal to 20, the corresponding accessing party a is determined to be a loyalty accessing party, otherwise, the accessing party a is determined to be a non-loyalty accessing party.
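The thresholding described above can be sketched as follows. The text leaves the exact averaging open, so this sketch assumes one reading: each access party's mean number of visits per login day is computed, and the threshold is the mean of those values over the active access parties. The function name, the labels and the sample data are likewise assumptions.

```python
from statistics import mean
from typing import Dict, List

ACTIVE_DAYS = 20  # a party with more login days than this in the month is "active"


def loyalty_types(daily_counts: Dict[str, List[int]]) -> Dict[str, str]:
    """daily_counts[party] lists the visit count for each login day of the month."""
    per_party_mean = {p: mean(c) for p, c in daily_counts.items() if c}
    active = [p for p, c in daily_counts.items() if len(c) > ACTIVE_DAYS]
    if not active:
        raise ValueError("no active access party, so no threshold can be derived")
    # Assumed reading: threshold = average of the active parties' daily means.
    threshold = mean(per_party_mean[p] for p in active)
    return {
        p: "loyal" if m >= threshold else "non-loyal"
        for p, m in per_party_mean.items()
    }


# Hypothetical month: a, b and d are active; c logged in on only 5 days.
daily = {"a": [30] * 22, "b": [10] * 25, "c": [40] * 5, "d": [20] * 21}
types = loyalty_types(daily)
print(types)  # {'a': 'loyal', 'b': 'non-loyal', 'c': 'loyal', 'd': 'loyal'}
```

In this made-up data the derived threshold is 20, matching the example value in the text; party d sits exactly on the threshold and is counted as loyal because the comparison is "greater than or equal to".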
Each loyalty access party is monitored in real time to determine whether it has become a non-loyalty access party; once it has, the non-loyalty access parties are clustered.
Step 206, when it is detected that a loyalty access party has become a non-loyalty access party, applying the K-means algorithm to the non-loyalty access parties to obtain the information to be recommended.
In the embodiment of the invention, the information recommendation device performs a K-means clustering algorithm on non-loyalty access parties, selects the non-loyalty access parties, performs personalized recommendation on the non-loyalty access parties by using access pages corresponding to the selected non-loyalty access parties in a time-sharing manner, performs retention calculation on data after the personalized recommendation, checks whether the number of access days in a fixed time period is increased, and respectively compares the influence effect of personalized recommendation on the number of the access parties by using different pages, thereby obtaining a page which can attract the non-loyalty access parties most, and taking the page as a home page of the personalized recommendation.
Illustratively, consider the data set of Table 3. Table 3 covers Z types of pages, and each row of data represents the retention amounts of one particular user visiting the pages at a certain time difference.
TABLE 3
A page | B page | C page | …… | Z page
10 | 10 | 5 | …… | 10
20 | 5 | 10 | …… | 20
…… | …… | …… | …… | ……
5 | 10 | 20 | …… | 1
Assuming that data for 2000 persons are counted, the data in the above form are taken as the data set. Setting k to Z, the data set correspondingly has Z centroids in total; 1500 samples are then selected as the test set, and correspondingly the output has 1500 rows in total.
Through clustering statistics on the data set, the Z centroid coordinates output in the experiment are obtained as follows: the type 0 cluster center coordinates are [a, b, c …]; the type 1 cluster center coordinates are [d, e, f …]; the type 2 cluster center coordinates are [i, j, k …]; ……; the type Z-1 cluster center coordinates are [x, y, z …], where the cluster centers 0 to Z-1 respectively represent the Z classes into which the data are divided.
The clustering output obtained in the experiment was: visitor a [10, 10, 5 … 10] belongs to the type 1 cluster center; visitor b [20, 5, 10 … 20] belongs to the type 1 cluster center; ……; visitor z [5, 10, 20 … 1] belongs to the type 1 cluster center.
According to the experimental result, the page of the accessed object that most attracts an access party is the page corresponding to the coordinates of its cluster center, so that customized recommendation is performed for the access party. In addition, in practice an access party usually has more than one interest. If the top three pages of interest to the access party are desired, then each time a cluster center is obtained while the code runs, the distance between the coordinate point of the access party's accessed pages and that cluster center can be calculated and stored, and the distance is updated in each iteration. In the final output, the distances between the coordinate point and each center point are compared, and the top three pages are output.
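The top-three selection described above can be sketched as a ranking of cluster centers by distance. The sketch assumes that the final cluster centers are already available and that a list `center_pages` maps each center to its page; that mapping, the function name `top_pages`, and the sample coordinates are hypothetical.

```python
import math
from typing import List, Sequence


def top_pages(visitor: Sequence[float],
              centers: List[Sequence[float]],
              center_pages: List[str],
              n: int = 3) -> List[str]:
    """Return the pages of the n cluster centers nearest to the visitor's vector."""
    ranked = sorted(range(len(centers)),
                    key=lambda j: math.dist(visitor, centers[j]))
    return [center_pages[j] for j in ranked[:n]]


# Hypothetical final centers and the page assumed to correspond to each one.
centers = [[10, 10, 5, 10], [20, 5, 10, 20], [5, 10, 20, 1], [0, 0, 0, 0]]
center_pages = ["A page", "B page", "C page", "Z page"]

print(top_pages([12, 9, 6, 11], centers, center_pages))
# ['A page', 'B page', 'C page']
```

Ranking once against the final centers gives the same ordering as storing the distances during the last iteration, since only the final centers matter for the output.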
Meanwhile, if it is difficult to recommend personalized page content for every individual access party, the K-means algorithm can be used to group all the access parties into a limited number of classes. This classifies all the access parties, and personalized content is then matched and recommended per class, which reduces the difficulty while still matching the content the users like.
Step 207, recommending the information to be recommended to the non-loyalty access party.
By the embodiment of the invention, the remaining amount of the accessed object accessed by the access party is calculated according to the login time and the login times of the access party, so that the processing process can be simplified, and specific interested pages corresponding to different access parties can be mined; on the other hand, the embodiment of the invention classifies the access parties, and then carries out targeted recommendation based on the changed non-loyalty access parties, so that the recommendation information can be adjusted in real time, and the accuracy of information recommendation is improved.
Example three
Based on the same inventive concept of the first embodiment to the second embodiment, the embodiment of the present invention provides an information recommendation apparatus, fig. 8 is a schematic structural diagram of a composition of the information recommendation apparatus provided by the embodiment of the present invention, as shown in fig. 8, in the embodiment of the present invention, an information recommendation apparatus 1000 includes an obtaining unit 1001, a determining unit 1002, a clustering unit 1003 and a recommending unit 1004, wherein,
the acquiring unit 1001 is configured to acquire real-time stream data; the real-time streaming data comprises: mapping relation between the identification of each accessed object and the login time of each access party to each accessed object;
the determining unit 1002 is configured to determine, according to the mapping relationship, an access reservation amount of each accessed object for each accessing party;
the clustering unit 1003 is configured to cluster the visitors according to the access reservation amount to obtain a clustering result;
the recommending unit 1004 is configured to recommend, to each visitor, an accessed object corresponding to a clustering center according to the clustering result.
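The cooperation of the four units can be sketched as a simple pipeline. This is a structural illustration only: the class name, the callable signatures and the toy stand-in functions below are assumptions and do not reproduce the actual logic of the units.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, str]
PAGES = ["A page", "B page"]  # hypothetical accessed objects


@dataclass
class InformationRecommendationApparatus:
    acquire: Callable[[], List[Record]]                         # acquiring unit 1001
    determine: Callable[[List[Record]], Dict[str, List[int]]]   # determining unit 1002
    cluster: Callable[[Dict[str, List[int]]], Dict[str, int]]   # clustering unit 1003
    recommend: Callable[[Dict[str, int]], Dict[str, str]]       # recommending unit 1004

    def run(self) -> Dict[str, str]:
        return self.recommend(self.cluster(self.determine(self.acquire())))


# Toy stand-ins, for illustration only.
def acquire() -> List[Record]:
    return [{"party": "a", "page": "A page"}, {"party": "a", "page": "A page"},
            {"party": "b", "page": "B page"}]


def determine(records: List[Record]) -> Dict[str, List[int]]:
    amounts: Dict[str, List[int]] = {}
    for r in records:
        amounts.setdefault(r["party"], [0] * len(PAGES))[PAGES.index(r["page"])] += 1
    return amounts


def cluster(amounts: Dict[str, List[int]]) -> Dict[str, int]:
    # Placeholder for a real clustering step: pick the index of the largest amount.
    return {p: v.index(max(v)) for p, v in amounts.items()}


def recommend(assignment: Dict[str, int]) -> Dict[str, str]:
    return {p: PAGES[j] for p, j in assignment.items()}


apparatus = InformationRecommendationApparatus(acquire, determine, cluster, recommend)
result = apparatus.run()
print(result)  # {'a': 'A page', 'b': 'B page'}
```

Passing the units in as callables keeps each one replaceable, for example swapping the placeholder `cluster` for a K-means implementation.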
In other embodiments, the determining unit 1002 may include:
a first determining unit 1002a, configured to determine, according to the mapping relationship, the number of times that each accessing party accesses each accessed object; and determining the access reservation quantity of each visitor to each accessed object according to the access times and the login time of each visitor accessing each accessed object.
In other embodiments, the apparatus 1000 may further include:
a judging unit 1005, configured to judge the type of each accessing party according to whether the access retention amount of the accessing party for each accessed object within a predetermined time length is lower than a preset value.
In other embodiments, the clustering unit 1003 may include:
a first clustering unit 1003a, configured to set an initial centroid according to each accessed object; and clustering the visitors according to the visit reservation quantity and the initial centroid.
In other embodiments, the recommending unit 1004 may include:
a second determining unit 1004a, configured to determine, according to a clustering result, a clustering center to which each of the access parties belongs;
a first obtaining unit 1004b, configured to obtain an accessed object corresponding to the cluster center from the access objects;
a first recommending unit 1004c, configured to recommend each accessed object to each accessing party.
In other embodiments, the first determining unit 1002a may include:
the searching unit 1002a1 is configured to search, according to the each accessing party, a first login time at which the each accessing party accesses the each accessed object from a preset database;
a third determining unit 1002a2, configured to obtain a difference between login times according to the login time when each visitor accesses each accessed object and the first login time;
the third determining unit 1002a3 is configured to determine, according to the login time difference and the number of times of access, an access reservation amount of each access party for each accessed object.
By the embodiment of the invention, the remaining amount of the accessed object accessed by the access party is calculated according to the login time and the login times of the access party, so that the processing process can be simplified, and specific interested pages corresponding to different access parties can be mined; on the other hand, the embodiment of the invention classifies the access parties, and then carries out targeted recommendation based on the changed non-loyalty access parties, so that the recommendation information can be adjusted in real time, and the accuracy of information recommendation is improved.
Example four
Based on the same inventive concept of the first embodiment to the third embodiment, fig. 9 is a schematic structural diagram of a composition of an information recommendation device provided by the first embodiment of the present invention, as shown in fig. 9, the information recommendation device at least includes a processor 01, a memory 02, a communication interface 03 and a communication bus 04, where the communication bus 04 is used for realizing connection and communication between the processor and the memory; the communication interface 03 is used for receiving and transmitting information; the processor 01 is configured to execute the information recommendation program stored in the memory 02 to implement the steps of the information recommendation methods provided in the first to second embodiments.
By the embodiment of the invention, the remaining amount of the accessed object accessed by the access party is calculated according to the login time and the login times of the access party, so that the processing process can be simplified, and specific interested pages corresponding to different access parties can be mined; on the other hand, the embodiment of the invention classifies the access parties, and then carries out targeted recommendation based on the changed non-loyalty access parties, so that the recommendation information can be adjusted in real time, and the accuracy of information recommendation is improved.
Based on the foregoing embodiments, an embodiment of the present invention provides a computer-readable storage medium, on which an information recommendation program is stored, where the information recommendation program, when executed by a processor, implements an information recommendation method in one or more embodiments described above.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.

Claims (8)

1. An information recommendation method, characterized in that the method comprises:
acquiring real-time stream data; the real-time streaming data comprises: mapping relation between the identification of each accessed object and the login time of each access party to each accessed object;
determining the access reserve amount of each accessed object of each access party according to the mapping relation;
judging the type of each access party according to whether the access reservation amount of each accessed object in the preset time length of each access party is lower than a preset value;
clustering each visitor according to the visit reservation quantity to obtain a clustering result;
recommending an accessed object corresponding to a clustering center to each access party according to the clustering result;
the clustering the visitors according to the visit reservation quantity to obtain a clustering result comprises the following steps:
when the type of each access party is determined to be a preset type, monitoring each accessed object corresponding to the preset type, wherein the preset type is used for representing that the access retention amount of each accessed object within a preset time length by the access party is higher than the preset value;
and when determining that each accessed object corresponding to the preset type changes, clustering each access party corresponding to the preset type according to the access reservation quantity of each accessed object corresponding to the preset type to obtain a clustering result.
2. The method according to claim 1, wherein said determining the access reservation amount of each visitor to each visited object according to the mapping relationship comprises:
determining the access times of each accessed object accessed by each access party according to the mapping relation;
and determining the access reservation quantity of each visitor to each accessed object according to the access times and the login time of each visitor accessing each accessed object.
3. The method of claim 1, wherein clustering the visitors according to the access reservation amount to obtain a clustering result comprises:
setting an initial centroid according to each accessed object;
and clustering the access parties according to the access reservation quantity and the initial centroid to obtain the clustering result.
4. The method of claim 1, wherein recommending, to the visitors, the visited object corresponding to the cluster center according to the clustering result comprises:
determining a clustering center to which each access party belongs according to the clustering result;
acquiring an accessed object corresponding to the clustering center from the accessed objects;
and recommending the accessed object to each access party.
5. The method of claim 2, wherein the determining the access reservation amount of each visitor to each visited object according to the number of visits and the login time of each visitor to each visited object comprises:
searching the first login time of each access party for accessing each accessed object from a preset database according to each access party;
acquiring a login time difference according to the login time of each accessed object accessed by each access party and the first login time;
and determining the access reservation quantity of each access party to each access object according to the login time difference and the access times.
6. An information recommendation apparatus, characterized in that the apparatus comprises:
an acquisition unit configured to acquire real-time streaming data, the real-time streaming data comprising a mapping relation between the identifier of each accessed object and the login time of each access party to each accessed object;
a determining unit configured to determine the access retention amount of each access party for each accessed object according to the mapping relation, and to judge the type of each access party according to whether the access retention amount of that access party for each accessed object within a preset time length is lower than a preset value;
a clustering unit configured to cluster the access parties according to the access retention amount to obtain a clustering result; and
a recommending unit configured to recommend, to each access party, the accessed object corresponding to the cluster center according to the clustering result;
wherein the clustering unit is specifically configured to: when the type of an access party is determined to be a preset type, monitor each accessed object corresponding to the preset type, the preset type indicating that the access retention amount of the access party for each accessed object within the preset time length is higher than the preset value; and, when it is determined that the accessed objects corresponding to the preset type have changed, cluster the access parties corresponding to the preset type according to their access retention amounts for the accessed objects corresponding to the preset type, to obtain the clustering result.
7. An information recommendation device comprising at least a processor, a memory storing instructions executable by the processor, a communication interface, and a bus connecting the processor, the memory and the communication interface, wherein, when the instructions are executed, the processor performs the method of any one of claims 1 to 5.
8. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method of any one of claims 1 to 5.
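
The steps recited in claims 3 to 5 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claims do not give a formula for the access retention amount, so retention() below uses a placeholder (number of accesses weighted by days since the first login). The choice of one unit-vector initial centroid per accessed object, the k-means-style update loop, and all function, variable and object names are likewise assumptions made for illustration only.

```python
from datetime import datetime
from math import dist


def retention(first_login, login, visits):
    """Placeholder retention score (assumption, not from the claims):
    number of accesses weighted by days elapsed since the first login."""
    days = (login - first_login).total_seconds() / 86400.0
    return visits * (1.0 + days)


def cluster(vectors, objects, iterations=10):
    """K-means-style clustering seeded with one centroid per accessed object."""
    n = len(objects)
    # Assumed seeding: initial centroid k is the unit vector on object k.
    centroids = [[1.0 if i == k else 0.0 for i in range(n)] for k in range(n)]
    assignment = {}
    for _ in range(iterations):
        # Assign each access party to its nearest centroid.
        assignment = {
            party: min(range(n), key=lambda k: dist(vec, centroids[k]))
            for party, vec in vectors.items()
        }
        # Move each centroid to the mean of its members; empty clusters stay put.
        for k in range(n):
            members = [vectors[p] for p, c in assignment.items() if c == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def recommend(assignment, objects):
    """Recommend to each party the object that seeded its cluster center."""
    return {party: objects[k] for party, k in assignment.items()}


# Hypothetical data: three access parties, two accessed objects.
objects = ["video_a", "video_b"]
t0, t1 = datetime(2018, 12, 1), datetime(2018, 12, 11)
vectors = {
    "u1": [retention(t0, t1, 5), 0.0],
    "u2": [0.0, retention(t0, t1, 3)],
    "u3": [retention(t0, t1, 4), retention(t0, t1, 1)],
}
recs = recommend(cluster(vectors, objects), objects)
```

In this sketch each access party is represented by a vector of retention scores, one entry per accessed object, and the accessed object "corresponding to" a cluster center is taken to be the object from which that center was seeded. Other readings of the claims are possible, for example taking the object with the largest coordinate in the final centroid.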
CN201811605413.7A 2018-12-26 2018-12-26 Information recommendation method and device and computer readable storage medium Active CN109710876B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811605413.7A CN109710876B (en) 2018-12-26 2018-12-26 Information recommendation method and device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811605413.7A CN109710876B (en) 2018-12-26 2018-12-26 Information recommendation method and device and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN109710876A CN109710876A (en) 2019-05-03
CN109710876B CN109710876B (en) 2021-08-06

Family

ID=66257781

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811605413.7A Active CN109710876B (en) 2018-12-26 2018-12-26 Information recommendation method and device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN109710876B (en)

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100416569C (en) * 2006-01-10 2008-09-03 西安交通大学 Web page metadata based formalized description method for user access behaviors
US7689457B2 (en) * 2007-03-30 2010-03-30 Amazon Technologies, Inc. Cluster-based assessment of user interests
CN101685521A (en) * 2008-09-23 2010-03-31 北京搜狗科技发展有限公司 Method for showing advertisements in webpage and system
CN102637178A (en) * 2011-02-14 2012-08-15 北京瑞信在线系统技术有限公司 Music recommending method, music recommending device and music recommending system
CN102158365A (en) * 2011-05-20 2011-08-17 北京邮电大学 User clustering method and system in weblog mining
CN104063801B (en) * 2014-06-23 2016-05-25 有米科技股份有限公司 A kind of moving advertising recommend method based on cluster
CN106528778A (en) * 2016-11-04 2017-03-22 广州华多网络科技有限公司 Method and device for obtaining user retention data
CN107483613B (en) * 2017-08-31 2020-07-14 江西博瑞彤芸科技有限公司 Information pushing method
CN107944059A (en) * 2017-12-29 2018-04-20 深圳市中润四方信息技术有限公司西安分公司 A kind of user behavior analysis method and system based on stream calculation

Also Published As

Publication number Publication date
CN109710876A (en) 2019-05-03

Similar Documents

Publication Publication Date Title
CN107862022B (en) Culture resource recommendation system
CN111159564A (en) Information recommendation method and device, storage medium and computer equipment
CN106790487B (en) Method, device and system for displaying help information
CN111666351A (en) Fuzzy clustering system based on user behavior data
EP4083857A1 (en) Information prediction model training method and apparatus, information prediction method and apparatus, storage medium, and device
CN112445690B (en) Information acquisition method and device and electronic equipment
CN112149352B (en) Prediction method for marketing activity clicking by combining GBDT automatic characteristic engineering
CN114187036B (en) Internet advertisement intelligent recommendation management system based on behavior characteristic recognition
CN112463859B (en) User data processing method and server based on big data and business analysis
US20160078454A1 (en) Method of website optimisation for a website hosted on a server system, and a server system
CN110287173A (en) Automatically generate significant user segment
CN106843941A (en) Information processing method, device and computer equipment
CN104391843A (en) System and method for recommending files
CN111400546A (en) Video recall method and video recommendation method and device
CN109145109B (en) User group message propagation abnormity analysis method and device based on social network
CN104731937B (en) The processing method and processing device of user behavior data
CN111159559A (en) Method for constructing recommendation engine according to user requirements and user behaviors
CN116701772B (en) Data recommendation method and device, computer readable storage medium and electronic equipment
CN111325255B (en) Specific crowd delineating method and device, electronic equipment and storage medium
CN109710876B (en) Information recommendation method and device and computer readable storage medium
CN111833080B (en) Information pushing method, device, electronic equipment and computer readable storage medium
KR20020020584A (en) Internet survey system and method and media for storing program source thereof
CN112308419A (en) Data processing method, device, equipment and computer storage medium
CN114969494A (en) Effective behavior determination method, device, equipment and storage medium
CN118312657B (en) Knowledge base-based intelligent large model analysis recommendation system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant