CN111984866B - Ranking list generation method and device for data - Google Patents


Info

Publication number
CN111984866B
Authority
CN
China
Prior art keywords
content data
value
heat
heat value
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010844694.2A
Other languages
Chinese (zh)
Other versions
CN111984866A (en
Inventor
张雪纯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd filed Critical Beijing QIYI Century Science and Technology Co Ltd
Priority to CN202010844694.2A priority Critical patent/CN111984866B/en
Publication of CN111984866A publication Critical patent/CN111984866A/en
Application granted granted Critical
Publication of CN111984866B publication Critical patent/CN111984866B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73Querying
    • G06F16/735Filtering based on additional data, e.g. user or group profiles
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention provides a method and a device for generating a ranking list of data, wherein the method comprises the following steps: continuously obtaining a heat value of content data counted according to time granularity; updating the heat value of the content data recorded in a first database according to the obtained heat value, wherein the first database is used for recording: the popularity value and attribute information of the content data conforming to a preset ranking list rule; according to the attribute information recorded in the first database, determining candidate content data to be subjected to ranking list; and according to the popularity value of the candidate content data recorded in the first database, ranking the candidate content data to obtain a ranking list. When the scheme provided by the embodiment of the invention is applied to generate the ranking list of the data, the real-time performance of the generated ranking list can be improved.

Description

Ranking list generation method and device for data
Technical Field
The invention relates to the technical field of data analysis, in particular to a ranking list generation method and device for data.
Background
When a platform provides content data such as video and text to users, it generally generates a ranking list of the content data so that users can choose content according to the list. Specifically, the ranking list is determined by ranking the popularity values of the content data, so that it reflects the relative popularity of each piece of content data.
In the prior art, the platform generates the ranking list by referring to the popularity value of every piece of content data it stores. Because the platform stores a large amount of content data, generating a ranking list usually takes a long time, so a ranking list is typically generated only once a day or even less often, and its real-time performance is therefore poor.
Disclosure of Invention
The embodiment of the invention aims to provide a method and a device for generating a ranking list of data, so as to improve the real-time performance of the generated ranking list. The specific technical scheme is as follows:
In a first aspect, an embodiment of the present invention provides a method for generating a ranking list of data, where the method includes:
continuously obtaining a heat value of content data counted according to time granularity;
updating the heat value of the content data recorded in a first database according to the obtained heat value, wherein the first database is used for recording the popularity value and attribute information of content data conforming to a preset ranking list rule;
determining, according to the attribute information recorded in the first database, candidate content data to be ranked;
and ranking the candidate content data according to the popularity value of the candidate content data recorded in the first database to obtain a ranking list.
In one embodiment of the present invention, the method further includes:
for each candidate content data, a preset number of latest heat values of the candidate content data are obtained, and real-time change information of the heat values of the candidate content data is generated according to the obtained latest heat values.
In one embodiment of the present invention, the obtaining, for each candidate content data, a preset number of latest heat values of the candidate content data, and generating real-time variation information of the heat values of the candidate content data according to the obtained latest heat values includes:
real-time variation information of the heat value of each candidate content data is generated in the following manner:
obtaining a preset number of latest heat values of candidate content data;
and obtaining a curve of a preset line type representing the change of the heat value along with time according to the obtained latest heat value and the generation time of each latest heat value, and taking the curve as real-time change information of the heat value of the candidate content data.
In one embodiment of the present invention, the preset line type is a parabola, and
the step of obtaining a curve of a preset line type representing the change of the heat value over time according to the obtained latest heat values and the generation time of each latest heat value comprises:
estimating a heat value corresponding to an intermediate moment according to two of the obtained latest heat values, wherein the intermediate moment is a moment located in the middle of the generation moments of the two heat values;
generating a parabola representing the change of the heat value over time according to the two heat values, the estimated heat value, the generation moments of the two heat values, and the intermediate moment, and taking the parabola as a reference parabola;
estimating a time-varying heat value of the candidate content data according to the reference parabola;
and updating the reference parabola based on the estimated heat value.
In one embodiment of the present invention, the estimating the heat value corresponding to the intermediate time according to two heat values in the obtained latest heat values includes:
calculating the heat value $y_3$ corresponding to the intermediate time according to the following expression:
$y_3 = (y_1 + y_2)/2 \pm \mathrm{interval}$
where $y_1$ and $y_2$ are the two heat values and interval is a preset value: when the difference between $y_1$ and $y_2$ is less than or equal to a preset threshold, interval is any value in a first preset range; when the difference between $y_1$ and $y_2$ is greater than the preset threshold, interval is any value in a second preset range; the first preset range and the second preset range are two different ranges.
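The midpoint estimate and the parabola through the resulting three points can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the random choice of sign and of the value inside each range, and the use of the exact midpoint time are all assumptions, since the text only states that interval is "any value" in the applicable range.

```python
import random


def estimate_midpoint(y1, y2, threshold, small_range, large_range, rng=random):
    """Return y3 = (y1 + y2) / 2 +/- interval.

    Assumption: interval is drawn uniformly from the applicable preset
    range and the sign is chosen at random.
    """
    low, high = small_range if abs(y1 - y2) <= threshold else large_range
    interval = rng.uniform(low, high)
    sign = rng.choice((-1, 1))
    return (y1 + y2) / 2 + sign * interval


def fit_parabola(points):
    """Return (a, b, c) of y = a*t^2 + b*t + c through three (t, y) points.

    Uses the expanded Lagrange interpolation formula; the three times
    must be distinct.
    """
    (t1, y1), (t2, y2), (t3, y3) = points
    d1 = (t1 - t2) * (t1 - t3)
    d2 = (t2 - t1) * (t2 - t3)
    d3 = (t3 - t1) * (t3 - t2)
    a = y1 / d1 + y2 / d2 + y3 / d3
    b = -(y1 * (t2 + t3) / d1 + y2 * (t1 + t3) / d2 + y3 * (t1 + t2) / d3)
    c = y1 * t2 * t3 / d1 + y2 * t1 * t3 / d2 + y3 * t1 * t2 / d3
    return a, b, c


# Hypothetical usage: two heat values generated at t = 0 s and t = 300 s.
t1, y1, t2, y2 = 0, 100, 300, 104
y_mid = estimate_midpoint(y1, y2, threshold=10, small_range=(1, 2), large_range=(5, 10))
reference_parabola = fit_parabola([(t1, y1), ((t1 + t2) / 2, y_mid), (t2, y2)])
```

Adding or subtracting interval keeps the three points from being collinear, which is what gives the fitted curve a visible bend instead of a straight line.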
In one embodiment of the present invention, the estimating the heat value of the candidate content data over time according to the reference parabola includes:
calculating the heat value of the content data along with the time change according to the reference parabola and the time for estimating the heat value of the content data;
when the difference between the calculated heat value and the latest heat value of the content data is smaller than a preset difference threshold value, the calculated heat value is adjusted based on the preset error value, and the adjusted heat value is used as the heat value of the candidate content data changing along with time.
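As a rough sketch of this estimation step, the function below evaluates a reference parabola at the estimation time and applies the adjustment when the result is too close to the latest heat value. The text does not say how the preset error value is applied, so adding it to the calculated value is an assumption, as are the function and parameter names.

```python
def estimate_heat(coeffs, t, latest_heat, diff_threshold, error_value):
    """Evaluate y = a*t^2 + b*t + c and adjust it if it barely moved.

    coeffs is the (a, b, c) tuple of the reference parabola.
    """
    a, b, c = coeffs
    predicted = a * t * t + b * t + c
    if abs(predicted - latest_heat) < diff_threshold:
        # Assumption: "adjusting based on the preset error value"
        # is modeled here as adding that value.
        predicted += error_value
    return predicted
```

For example, with a flat parabola `(0, 0, 100)`, a latest heat value of 99, a difference threshold of 5, and an error value of 3, the calculated value 100 is within the threshold and is adjusted to 103.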
In a second aspect, an embodiment of the present invention provides a ranking list generating apparatus for data, where the apparatus includes:
the heat value obtaining module is used for continuously obtaining the heat value of the content data counted according to the time granularity;
the heat value updating module is used for updating the heat value of the content data recorded in the first database according to the obtained heat value, and the first database is used for recording: the popularity value and attribute information of the content data conforming to a preset ranking list rule;
the data determining module is used for determining candidate content data to be subjected to ranking according to the attribute information recorded in the first database;
and the ranking list obtaining module is used for ranking the candidate content data according to the popularity value of the candidate content data recorded in the first database to obtain the ranking list.
In one embodiment of the present invention, the apparatus further includes:
the information generation module is used for obtaining a preset number of latest heat values of each candidate content data, and generating real-time change information of the heat values of the candidate content data according to the obtained latest heat values.
In one embodiment of the present invention, the information generating module includes:
a heat value obtaining sub-module, configured to obtain a preset number of latest heat values of the candidate content data;
and the information determination submodule is used for obtaining a curve of a preset line type representing the change of the heat value along with time according to the obtained latest heat value and the generation moment of each latest heat value, and the curve is used as real-time change information of the heat value of the candidate content data.
In one embodiment of the present invention, the preset line type is a parabola, and
the information determination submodule includes:
a first heat value estimation unit, configured to estimate a heat value corresponding to an intermediate time according to two of the obtained latest heat values, where the intermediate time is a time located in the middle of the generation times of the two heat values;
a reference parabola determining unit, configured to generate, as a reference parabola, a parabola representing the change of the heat value over time according to the two heat values, the estimated heat value, the generation times of the two heat values, and the intermediate time;
a second heat value estimation unit for estimating a heat value of the candidate content data over time according to the reference parabola;
and a reference parabola updating unit for updating the reference parabola based on the estimated heat value.
In one embodiment of the present invention, the first heat value estimation unit is specifically configured to calculate the heat value $y_3$ corresponding to the intermediate time according to the following expression:
$y_3 = (y_1 + y_2)/2 \pm \mathrm{interval}$
where $y_1$ and $y_2$ are the two heat values and interval is a preset value: when the difference between $y_1$ and $y_2$ is less than or equal to a preset threshold, interval is any value in a first preset range; when the difference between $y_1$ and $y_2$ is greater than the preset threshold, interval is any value in a second preset range; the first preset range and the second preset range are two different ranges.
In one embodiment of the present invention, the second heat value estimation unit is specifically configured to calculate a heat value of the content data that varies with time according to the reference parabola and a time when the heat value of the content data is estimated; when the difference between the calculated heat value and the latest heat value of the content data is smaller than a preset difference threshold value, the calculated heat value is adjusted based on the preset error value, and the adjusted heat value is used as the heat value of the candidate content data changing along with time.
In a third aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
and a processor, configured to implement the method steps described in the first aspect when executing the program stored in the memory.
In a fourth aspect, embodiments of the present invention provide a computer-readable storage medium having stored therein a computer program which, when executed by a processor, implements the method steps of the first aspect described above.
In view of the above, when the scheme provided by the embodiment of the invention is applied to generate a ranking list of data, candidate content data to be ranked is determined by referring to the attribute information and the popularity value of the content data recorded in the first database after obtaining the popularity value of the content data counted according to the time granularity, so as to generate the ranking list. And because the content data with the heat value and the attribute information which accord with the preset ranking list rule are recorded in the first database, compared with the prior art, when the ranking list of the data is generated, the quantity of the referenced content data is greatly reduced, so that the instantaneity of the ranking list generation is improved.
In addition, since the ranking list is generated according to the popularity value and the attribute information of the content data, and the popularity value and the attribute information of the content data are recorded in the first database, the popularity value and the attribute information of the content data can be directly obtained from the first database when the ranking list is generated, so that the instantaneity of generating the ranking list is further improved.
Finally, the heat value of the content data counted according to the time granularity is continuously obtained, and the heat value of the content data recorded in the first database is updated according to the obtained heat value, so that the heat value of the content data recorded in the first database can be updated more timely. And because the ranking list is generated based on the popularity value and the attribute information of the content data recorded in the first database, the real-time performance of the generated ranking list is higher.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a first method for generating a ranking list of data according to an embodiment of the present invention;
FIG. 2 is a flowchart of a second method for generating a ranking list of data according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a ranking list generating system according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a ranking list generating apparatus according to a first embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a ranking list generating apparatus according to a second embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Because of poor real-time performance of the ranking list generated in the prior art, in order to solve the technical problem, the embodiment of the invention provides a ranking list generation method and device for data.
In one embodiment of the present invention, there is provided a ranking list generating method of data, the method including:
continuously obtaining a heat value of content data counted according to time granularity;
updating the heat value of the content data recorded in a first database according to the obtained heat value, wherein the first database is used for recording: the popularity value and attribute information of the content data conforming to a preset ranking list rule;
determining, according to the attribute information recorded in the first database, candidate content data to be ranked;
and carrying out ranking on the candidate content data according to the heat value of the candidate content data recorded in the first database to obtain the ranking list.
As can be seen from the above, when the ranking list of the data is generated by applying the scheme provided by the embodiment, the candidate content data to be ranked is determined by referring to the attribute information and the popularity value of the content data recorded in the first database after obtaining the popularity value of the content data counted according to the time granularity, so as to generate the ranking list. And because the content data with the heat value and the attribute information which accord with the preset ranking list rule are recorded in the first database, compared with the prior art, when the ranking list of the data is generated, the quantity of the referenced content data is greatly reduced, so that the instantaneity of the ranking list generation is improved.
In addition, since the ranking list is generated according to the popularity value and the attribute information of the content data, and the popularity value and the attribute information of the content data are recorded in the first database, the popularity value and the attribute information of the content data can be directly obtained from the first database when the ranking list is generated, so that the instantaneity of generating the ranking list is further improved.
Finally, the heat value of the content data counted according to the time granularity is continuously obtained, and the heat value of the content data recorded in the first database is updated according to the obtained heat value, so that the heat value of the content data recorded in the first database can be updated more timely, namely, the real-time property of the heat value of the content data recorded in the first database is higher. Therefore, the real-time performance of the ranking list generated based on the popularity value and the attribute information of the content data recorded in the first database is high.
Referring to fig. 1, fig. 1 is a flowchart of a first method for generating a ranking list of data according to an embodiment of the present invention, where the method includes S101-S104.
The execution body of the embodiment may be an electronic device. Specifically, the electronic device may be a server or the like.
S101: continuously obtaining the popularity value of the content data counted according to the time granularity.
The content data may be video, audio, text, etc.
The heat value of the content data may be used to reflect a browsing index or a playing index of the content data, that is, may reflect how frequently the content data is browsed or played in a period of time. For example: when the browsing index or playing index of the content data is higher, the heat value of the content data is higher; the lower the browsing index or playback index of the content data is, the lower the popularity value of the content data is.
The time granularity is the period at which the heat value of the content data is counted. For example, when the time granularity is 5 minutes, the popularity value of the content data is counted every 5 minutes.
Continuously obtaining the heat value of the content data counted according to the time granularity can be understood as: each time the popularity value of the content data has been counted according to the time granularity, the counted popularity value is obtained. Because the popularity values are obtained continuously, they are highly real-time. Specifically, the time granularity may be 3 minutes, 5 minutes, 10 minutes, or the like.
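Counting by time granularity can be illustrated with a small sketch. It assumes, purely for illustration, that the heat value of a piece of content is the number of play or browse events falling in each window; the event format and the function name are hypothetical and not taken from the patent.

```python
from collections import Counter


def count_heat(events, granularity_seconds=300):
    """Count events per (window_start, content_id).

    events: iterable of (content_id, unix_timestamp) tuples.
    The default of 300 seconds corresponds to a 5-minute granularity.
    """
    counts = Counter()
    for content_id, timestamp in events:
        # Align each timestamp to the start of its window.
        window_start = timestamp - timestamp % granularity_seconds
        counts[(window_start, content_id)] += 1
    return counts
```

With a 5-minute granularity, events at 0 s and 299 s fall in the same window, while an event at 300 s starts the next one.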
Specifically, Kafka may count the popularity value of the content data according to the time granularity to generate statistical popularity values of the content data. Kafka may send the generated statistical popularity values to the Spark framework in the form of a data stream; because the popularity value of content data reflects the play data of the content data, this data stream may also be called a real-time play data stream. Spark is a distributed framework, and the electronic device may be deployed in the Spark framework. Because Spark Streaming in the Spark framework can process real-time data streams, the real-time play data stream sent by Kafka can be obtained through Spark Streaming.
S102: updating the heat value of the content data recorded in the first database according to the obtained heat value.
The first database is used for recording the popularity value and attribute information of content data that conforms to a preset ranking list rule. The heat value and the attribute information may be stored in a table in the first database; because the table stores the heat value and attribute information of content data, it may be referred to as a content data information table.
The specific content of the attribute information of different content data may be different. For example: when the content data is video, the attribute information of the content data may include type information of the video, duration information of the video, etc., and the type information of the video may include a type of a television series, a type of a movie, etc.; when the content data is audio, the attribute information of the content data may include audio type information, audio duration information, and the like, and the audio type information may include a popular music type, a classical music type, and the like; when the content data is text, the attribute information of the content data may include text type information, text length information, etc., and the text type information may include a prose type, a novel type, etc. Since the attribute information of the content data can reflect specific information of the content data, the attribute information of the content data may be referred to as metadata of the content data.
Specifically, the electronic device may scan the attribute information of content data in the original database at a high frequency, and store in the first database the attribute information of the content data that conforms to the preset ranking list rule. The original database records the attribute information of all content data of the platform. The platform is a platform that provides content data to users; for example, it may be a video website, a text website, and so on.
The preset ranking list rule can be understood as a constraint that content data in a ranking list must satisfy. The rule may be set empirically by staff. For example, the rule may be: the content data in the ranking list is uploaded by the platform and comes from the platform's own content database, that is, it does not come from a content database uploaded by users or by other platforms. The preset ranking list rule may also be the rule that is common to the specific rules the platform presets for each ranking list, namely the constraints that the content data in every ranking list must satisfy.
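One way to picture the scan is as a filter from the original database into the first database. The sketch below models both databases as in-memory dictionaries and the preset rule as a predicate; the `source` field, the initial heat value of 0, and all names are assumptions made for illustration.

```python
def build_first_database(original_db, rule):
    """Copy rule-conforming records from original_db into a new first database.

    original_db maps content_id -> attribute dict; rule is a predicate on
    an attribute dict. Assumption: each copied record starts with heat 0.
    """
    first_db = {}
    for content_id, attrs in original_db.items():
        if rule(attrs):
            first_db[content_id] = {**attrs, "heat": 0}
    return first_db


def platform_only(attrs):
    """Hypothetical preset rule: keep content uploaded by the platform itself."""
    return attrs.get("source") == "platform"
```

Calling `build_first_database(original_db, platform_only)` on each scan would keep the first database limited to content that can appear in some ranking list.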
When updating the heat value of the content data recorded in the first database, the heat value of the content data recorded in the first database may be updated immediately after each time the heat value of the content data counted according to the time granularity is obtained.
Alternatively, the heat value of the content data recorded in the first database may be updated at preset time intervals. In this case, the heat value of a piece of content data may be obtained several times within one preset time interval, so the recorded heat value may be updated based on the most recently obtained heat value; or the several obtained heat values of the content data may be summed, and the recorded heat value updated based on that sum.
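The two interval-based options, using the last obtained value or the sum of the obtained values, can be sketched as below. The function name and the `strategy` parameter are hypothetical.

```python
def value_for_update(heat_values, strategy="latest"):
    """Pick the value written to the first database at the end of an interval.

    heat_values: heat values of one piece of content, in the order obtained.
    """
    if not heat_values:
        raise ValueError("no heat values were obtained in this interval")
    if strategy == "latest":
        return heat_values[-1]
    if strategy == "sum":
        return sum(heat_values)
    raise ValueError(f"unknown strategy: {strategy}")
```

If three values 3, 5, and 4 arrive within one interval, the "latest" strategy writes 4 and the "sum" strategy writes 12.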
Specifically, when updating the popularity value of the content data recorded in the first database, the identity of the content data to be updated of the popularity value may be determined from the first database according to the identity of the content data corresponding to the obtained popularity value, and then the popularity value of the content data corresponding to the determined identity may be updated in the first database.
Since the data amount of the obtained heat value may be much smaller than the data amount of the heat value of the content data recorded in the first database described above, only the heat value of a part of the content data recorded in the first database is often updated when the heat value of the content data recorded in the first database is updated. And because the first database records the heat value of the content data conforming to the rule of the preset ranking list, when the content data corresponding to the obtained heat value is not recorded in the first database, the content data is the content data not conforming to the rule of the preset ranking list. When the popularity value of the content data recorded in the first database is updated according to the obtained popularity value of the content data, the popularity value of the content data which accords with the preset ranking list rule and corresponds to the obtained popularity value in the first database is updated.
For example, assume that the heat values recorded in the first database are: heat value $H_1$ of content data 1, heat value $H_2$ of content data 2, ..., heat value $H_{10}$ of content data 10. Content data 1, content data 2, ..., content data 10 are therefore all content data conforming to the preset ranking list rule. Assume further that the obtained heat values are: heat value $H_{12}$ of content data 1 and heat value $H_{112}$ of content data 11. The first database records a heat value for content data 1 but not for content data 11, which means that, among the obtained heat values, content data 1 conforms to the preset ranking list rule and content data 11 does not. Therefore, only the heat value of content data 1 recorded in the first database is updated, to $H_{12}$.
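The identifier-based update in this example can be sketched as follows, with the first database modeled as a dictionary keyed by content identifier. The record layout and function name are assumptions; the point is only that obtained values for content not recorded in the first database are skipped.

```python
def update_first_database(first_db, obtained):
    """Update recorded heat values in place and return the updated IDs.

    first_db: content_id -> record dict with a "heat" field.
    obtained: content_id -> newly obtained heat value.
    """
    updated_ids = []
    for content_id, heat in obtained.items():
        record = first_db.get(content_id)
        if record is None:
            # Not in the first database, so the content does not conform
            # to the preset ranking list rule; ignore it.
            continue
        record["heat"] = heat
        updated_ids.append(content_id)
    return updated_ids
```

Returning the updated identifiers is a design choice made here so that a later step can restrict candidate selection to content whose heat value actually changed.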
And S103, determining candidate content data to be subjected to ranking according to the attribute information recorded in the first database.
Since the leaderboard may include different types of leaderboards, for example: when the content data is video, the leaderboard may include a television series leaderboard, a movie leaderboard, and the like; when the content data is text, the leaderboard may include a novel leaderboard, a prose leaderboard, and the like. Taking a television series ranking list as an example, the video types of the videos in the television series ranking list are all television series types. Therefore, when determining the candidate content data to be ranked, the content data whose attribute information is the same attribute information may be used as the candidate content data to be ranked based on the attribute information of the content data whose heat value is updated in the first database.
In one embodiment of the present invention, when the candidate content data to be ranked is determined according to the attribute information of the content data whose popularity value has been updated in the first database, the content data whose popularity value has been updated may further be screened based on the specific rule of the ranking list to be generated, so as to determine the candidate content data to be ranked.
The specific rule of the ranking list to be generated is a constraint condition that constrains the content data in the ranking list to be generated. For example, if the ranking list to be generated is a television series ranking list for area A, the specific rule of the ranking list is: the type of the content data in the ranking list is the television series type, and the play control information indicates that the content data can be provided to users in area A. The play control information of content data can be understood as information used for determining whether the content data can be provided to users in different areas.
Based on the specific rule of the ranking list to be generated, the attribute information required of the content data in the ranking list can be determined. The candidate content data to be ranked can then be determined by comparing this attribute information with the attribute information of the content data whose popularity value has been updated in the first database.
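As an illustration of this screening, the sketch below filters records by content type and by playable area. The record layout (`type`, `playable_areas`) and the function name `select_candidates` are assumptions made for the example; the patent does not specify how attribute information or play control information is stored.

```python
def select_candidates(first_db, required_type, area):
    """Return ids of content whose attributes satisfy the ranking list rule.

    first_db: dict mapping content id -> {"type": str, "playable_areas": set}
              (hypothetical record layout).
    required_type: content type required by the ranking list, e.g. "tv_series".
    area: area whose users the ranking list is shown to, e.g. "A".
    """
    return [
        content_id
        for content_id, record in first_db.items()
        if record["type"] == required_type and area in record["playable_areas"]
    ]
```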
S104: ranking the candidate content data according to the popularity values of the candidate content data recorded in the first database, to obtain a ranking list.
Specifically, when ranking is performed, the candidate content data may be ordered from the highest to the lowest popularity value recorded in the first database, so as to obtain the ranking list.
Alternatively, when ranking is performed, the candidate content data with the highest heat values may first be selected based on the heat values recorded in the first database, and the selected candidate content data may then be ordered from the highest to the lowest heat value, so as to obtain the ranking list.
Specifically, after the candidate content data is ranked, a string composed of the identifiers of the candidate content data arranged in ranked order may be obtained. The identifiers in the string are separated by a preset symbol, the separated string is stored, and the ranking list is generated based on the stored string. The preset symbol may be a comma, a semicolon, etc., and the identifier of candidate content data may be its ID number, name, etc.
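The ranking and string storage described for S104 can be sketched as below. The function names and the optional `top_n` cut-off are hypothetical, and a comma is used as the preset symbol purely as an example.

```python
def rank_candidates(heat_by_id, top_n=None):
    """Order content ids from highest to lowest heat value."""
    ranked = sorted(heat_by_id, key=lambda cid: heat_by_id[cid], reverse=True)
    return ranked if top_n is None else ranked[:top_n]


def serialize_ranking(ranked_ids, sep=","):
    """Join ranked identifiers into one string using the preset symbol."""
    return sep.join(ranked_ids)


def parse_ranking(stored, sep=","):
    """Recover the ranked identifiers from the stored string."""
    return stored.split(sep) if stored else []
```

This sketch assumes that identifiers never contain the separator; if names rather than ID numbers are used as identifiers, a symbol that cannot appear in a name would have to be chosen.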
In one embodiment of the present invention, the candidate content data may be screened according to the attribute information of the candidate content data based on a specific rule of the ranking list to be generated, and the screened candidate content data may be ranked based on the popularity value of the screened candidate content data, so as to obtain the ranking list.
As can be seen from the above, when a ranking list of data is generated by applying the scheme provided by this embodiment, after the popularity values of the content data counted according to the time granularity are obtained, the candidate content data to be ranked is determined by referring to the attribute information and the popularity values of the content data recorded in the first database, and the ranking list is generated accordingly. Because the first database records the popularity values and attribute information of content data that conforms to the preset ranking list rule, the quantity of content data referenced when the ranking list is generated is greatly reduced compared with the prior art, which improves the real-time performance of ranking list generation.
In addition, since the ranking list is generated according to the popularity values and attribute information of the content data, both of which are recorded in the first database, they can be obtained directly from the first database when the ranking list is generated, which further improves the real-time performance of ranking list generation.
Finally, the heat value of the content data counted according to the time granularity is continuously obtained, and the heat value of the content data recorded in the first database is updated according to the obtained heat value, so that the heat value of the content data recorded in the first database can be updated more timely, namely, the real-time property of the heat value of the content data recorded in the first database is higher. Therefore, the real-time performance of the ranking list generated based on the popularity value and the attribute information of the content data recorded in the first database is high.
Referring to fig. 2, fig. 2 is a flowchart of a second ranking list generation method for data according to an embodiment of the present invention. After S104 described above, the method may further include S105.
S105: for each candidate content data, a preset number of latest heat values of the candidate content data are obtained, and real-time change information of the heat values of the candidate content data is generated according to the obtained latest heat values.
The above-mentioned preset number may be set empirically by a worker, for example: the preset number may be 3, 5, etc.
The above-mentioned preset number of latest heat values can be understood as: the popularity value of the content data generated at each generation time in the latest time period including the current time.
Specifically, a preset number of latest heat values of the candidate content data may be obtained from the second database. The second database stores therein heat values of content data generated at a plurality of generation times.
In an embodiment of the present invention, the second database may be a Couchbase database. The second database may store, for content data conforming to the preset ranking list rule, the identifier of the content data, a preset number of latest popularity values, and the attribute information. For example, the preset number may be 5, 6, etc.
In one embodiment of the present invention, after the step S101, the obtained heat value may be further added to the heat value of the content data recorded in the second database, so that a preset number of latest heat values of the candidate content data may be obtained from the second database.
Specifically, when the obtained heat value is added to the second database, the heat values of the content data corresponding to the obtained heat value may first be read from the backup database of the second database. The obtained heat value is then added to the read heat values, the heat value with the earliest generation time among the read heat values is deleted, and the resulting heat values are written to both the second database and the backup database.
Specifically, the backup database may be a Redis database.
When the real-time change information is generated, the heat values of the content data are read from the second database, whereas the backup database of the second database is used when heat values are written. The two databases thereby achieve read-write separation during generation of the real-time change information.
The real-time change information is used for reflecting the real-time change of the heat value of the candidate content data. Specifically, the real-time change information of the heat value of the candidate content data may be represented by a curve of a preset line type, where the preset line type may be a parabola, a polyline, or the like.
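The write path described above (read from the backup database, append the new value, drop the earliest value, write to both databases) can be sketched as a fixed-size window. Both databases are modelled here as plain dictionaries, and the function name and the `(generation_time, heat)` tuple layout are assumptions for illustration only.

```python
def append_heat_value(backup_db, second_db, content_id, heat, generated_at,
                      preset_number=5):
    """Add a heat value and keep only the latest preset_number values.

    backup_db, second_db: dicts mapping content id -> list of
        (generation_time, heat) tuples (hypothetical stand-ins for the
        backup database and the second database).
    """
    # Read the existing history from the backup database (write side).
    history = list(backup_db.get(content_id, []))
    history.append((generated_at, heat))
    # Order by generation time and drop the earliest entries beyond the window.
    history.sort(key=lambda item: item[0])
    history = history[-preset_number:]
    # Write the result to both databases; readers use second_db only.
    backup_db[content_id] = history
    second_db[content_id] = list(history)
    return history
```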
Specifically, in generating the real-time variation information of the heat value of the candidate content data, the relationship of the heat value of the candidate content data varying with time may be determined as the real-time variation information of the heat value of the candidate content data according to the latest heat value obtained and the generation time of each heat value.
In this way, the real-time change information of the heat value of the candidate content data is generated according to the preset number of latest heat values of the candidate content data, and the real-time change condition of the heat value of the candidate content data can be accurately reflected by the preset number of latest heat values of the candidate content data, so that the real-time change information of the heat value of the candidate content data can be accurately generated.
In one embodiment of the present invention, the above S105 (for each candidate content data, obtaining a preset number of latest heat values of the candidate content data and generating real-time change information of the heat value of the candidate content data according to the obtained latest heat values) may be implemented according to the following steps A1 to A2.
Step A1: a preset number of latest heat values of the candidate content data are obtained.
The above-mentioned preset number may be set empirically by a worker, for example: the preset number may be 3, 5, etc.
Specifically, a preset number of latest heat values of the candidate content data may be obtained from the second database. The second database records: the heat value of the content data generated at a plurality of generation times.
Step A2: obtaining, according to the obtained latest heat values and the generation time of each latest heat value, a curve of a preset line type representing the change of the heat value with time, and taking the curve as the real-time change information of the heat value of the candidate content data.
The preset line type may be a parabola, a polyline, etc.
Specifically, curve fitting may be performed according to a curve of a preset line type based on the obtained latest heat value and the generation time of each latest heat value, so as to obtain a curve of a preset line type representing the time-dependent change of the heat value, as real-time change information of the heat value of the candidate content data.
In this way, the curve representing the preset linear shape of the change of the heat value along with time is used as the real-time change information of the heat value of the candidate content data, so that the real-time change information of the heat value of the candidate content data can be intuitively reflected. In addition, the curve of the preset line type representing the change of the heat value along with time is obtained according to the obtained latest heat value and the generation time of each latest heat value, so that the curve of the preset line type can be accurately obtained.
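For the polyline case, one simple way to realise step A2 is linear interpolation between consecutive heat values. The sketch below is an assumption about how such a curve could be built; the function name `polyline` and the clamping behaviour outside the observed time range are not taken from the patent.

```python
def polyline(points):
    """Build a polyline curve from (generation_time, heat) pairs.

    Returns a function that maps a time to an interpolated heat value.
    Assumes at least two points with distinct generation times.
    """
    if len(points) < 2:
        raise ValueError("at least two heat values are required")
    pts = sorted(points)

    def value_at(t):
        # Outside the observed range, clamp to the nearest end point (assumption).
        if t <= pts[0][0]:
            return pts[0][1]
        if t >= pts[-1][0]:
            return pts[-1][1]
        # Find the segment containing t and interpolate linearly within it.
        for (t0, y0), (t1, y1) in zip(pts, pts[1:]):
            if t0 <= t <= t1:
                return y0 + (y1 - y0) * (t - t0) / (t1 - t0)

    return value_at
```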
In one embodiment of the present invention, when the preset line type is a parabola, the above step A2 (obtaining, according to the obtained latest heat values and the generation time of each latest heat value, a curve of the preset line type representing the change of the heat value with time) may be implemented according to the following steps A21 to A24.
Step A21: and estimating the heat value corresponding to the middle moment according to two heat values in the obtained latest heat values.
The intermediate time is a time located intermediate between the generation times of the two heat values.
Specifically, the two heat values may be the two heat values with the latest generation times and the smallest interval between them among the obtained latest heat values. Alternatively, the two heat values may be any two of the obtained latest heat values.
When estimating the heat value corresponding to the intermediate time, an average value of the two heat values may be calculated as the heat value corresponding to the intermediate time.
In one embodiment of the present invention, the heat value $y_3$ corresponding to the intermediate time may also be calculated according to the following expression:
$$y_3 = \frac{y_1 + y_2}{2} \pm \mathrm{interval}$$
where $y_1$ and $y_2$ are the two heat values and interval is a preset value. When the difference between $y_1$ and $y_2$ is less than or equal to a preset threshold, interval is any value in a first preset range; when the difference between $y_1$ and $y_2$ is greater than the preset threshold, interval is any value in a second preset range. The first preset range and the second preset range are two different ranges.
The first preset range and the second preset range may have an overlapping area or may not have an overlapping area, which is not limited in the embodiment of the present invention.
Specifically, the preset threshold, the first preset range, and the second preset range may be set by a worker according to experience. For example, the preset threshold may be 2, the first preset range may be [1,3], and the second preset range may be [2,5].
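The expression for the intermediate heat value can be sketched as follows, using the example threshold and ranges given above as defaults. Several details are assumptions of this sketch, not statements of the patent: the difference is taken as an absolute difference, the interval is drawn uniformly from the selected range, and the sign of the offset is chosen at random.

```python
import random


def estimate_mid_heat(y1, y2, threshold=2.0, first_range=(1.0, 3.0),
                      second_range=(2.0, 5.0), rng=random):
    """Estimate the heat value at the time midway between y1 and y2.

    Implements y3 = (y1 + y2) / 2 +/- interval, where the range that
    interval is drawn from depends on how far apart y1 and y2 are.
    """
    # Assumption: "difference" means the absolute difference.
    if abs(y1 - y2) <= threshold:
        low, high = first_range
    else:
        low, high = second_range
    interval = rng.uniform(low, high)   # assumption: uniform draw
    sign = rng.choice((1, -1))          # assumption: random sign
    return (y1 + y2) / 2 + sign * interval
```

Passing a seeded `random.Random` instance as `rng` makes the estimate reproducible.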
Step A22: and generating a parabola representing the change of the heat value along with time according to the two heat values, the estimated heat value, the generation moment and the middle moment of the two heat values, and taking the parabola as a reference parabola.
Specifically, when the parabola representing the change of the heat value with time is generated, curve fitting may be performed, according to a preset parabolic form, on the two heat values, the estimated heat value, the generation times of the two heat values, and the intermediate time, so as to obtain the parabola. Alternatively, the parameter terms in the basic formula of the parabola may be calculated from the heat values and the corresponding times, so as to obtain the parabola representing the change of the heat value with time.
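The second option, calculating the parameter terms directly, can be illustrated with the basic form $y = at^2 + bt + c$: three points with distinct times determine the coefficients exactly. The sketch below uses Newton divided differences; the function names are hypothetical, and the choice of this particular basic form is an assumption.

```python
def fit_parabola(p1, p2, p3):
    """Return (a, b, c) of y = a*t**2 + b*t + c through three points.

    Each point is a (time, heat) pair; the three times must be distinct.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d1 = (y2 - y1) / (x2 - x1)      # first divided difference on p1, p2
    d2 = (y3 - y2) / (x3 - x2)      # first divided difference on p2, p3
    a = (d2 - d1) / (x3 - x1)       # second divided difference
    b = d1 - a * (x1 + x2)
    c = y1 - a * x1 * x1 - b * x1
    return a, b, c


def reference_parabola(t1, y1, t2, y2, y_mid):
    """Fit the reference parabola from two heat values and the estimated
    heat value y_mid at the intermediate time (t1 + t2) / 2."""
    return fit_parabola((t1, y1), ((t1 + t2) / 2, y_mid), (t2, y2))
```

Note that if the intermediate heat value were exactly the average of the two heat values, the three points would be collinear and the fitted coefficient `a` would be zero; the interval offset in the expression for the intermediate value is what gives the curve its curvature.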
Step A23: and estimating the heat value of the candidate content data along with time according to the reference parabola.
Since the reference parabola is a parabola representing the time-varying heat value, when the time-varying heat value of the candidate content data is estimated according to the reference parabola, the time-varying heat value of the candidate content data can be obtained according to the reference parabola and the time of estimating the heat value of the candidate content data.
Specifically, the time of estimating the heat value of the candidate content data may be substituted into the calculation formula of the reference parabola, so as to obtain the heat value of the content data corresponding to the estimated time, so that the heat value of the candidate content data changing with time can be obtained.
In one embodiment of the present invention, the heat value y of the candidate content data over time may be estimated according to the following expression:
$$y = A\,(X - X_1 + X_2)^2 + B\,(X - X_1 + X_2) + C$$
where $X$ is the time for which the heat value of the candidate content data is estimated, $X_1$ and $X_2$ are preset parameters of the reference parabola, and $A$, $B$, and $C$ are preset coefficients.
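Evaluating this expression is a direct substitution, as the following sketch shows. The function name and the default values of zero for the two preset parameters are assumptions for illustration.

```python
def estimate_heat(x, a, b, c, x1=0.0, x2=0.0):
    """Evaluate y = a*(x - x1 + x2)**2 + b*(x - x1 + x2) + c.

    x: time for which the heat value is estimated.
    a, b, c: coefficients of the reference parabola.
    x1, x2: preset parameters of the reference parabola.
    """
    shifted = x - x1 + x2
    return a * shifted ** 2 + b * shifted + c
```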
Step A24: and updating the reference parabola based on the estimated heat value.
Specifically, when updating the reference parabola, curve fitting can be performed according to a preset parabola type according to the estimated heat value and the time of the estimated heat value, the parabola representing the change of the heat value along with time is obtained again, and the obtained parabola is spliced with the reference parabola, so that the updated reference parabola is obtained.
In this way, since the reference parabola is updated based on the estimated heat value, which is estimated based on the reference parabola, and since the reference parabola is a curve representing the change of the heat value of the content data with time, the heat value of the content data with time can be estimated based on the reference parabola more accurately, and the parabola can be updated more accurately based on the estimated heat value.
In one embodiment of the present invention, the estimation of the time-varying heat value of the candidate content data according to the reference parabola in the above step a23 may also be implemented in the following manner.
Calculating the heat value of the content data along with the time change according to the reference parabola and the time for estimating the heat value of the content data; when the difference between the calculated heat value and the latest heat value of the content data is smaller than a preset difference threshold value, the calculated heat value is adjusted based on the preset error value, and the adjusted heat value is used as the heat value of the candidate content data changing along with time.
The preset difference threshold and the preset error value may be set empirically by a worker.
Specifically, when the calculated heat value is adjusted based on the preset error value, the sum of the heat value and the preset error value may be calculated as the heat value of the candidate content data that varies with time; the difference between the heat value and the preset error value can also be calculated as the heat value of the candidate content data changing with time.
Thus, when the difference between the calculated heat value and the latest heat value is small, the calculated heat value is adjusted based on the preset error value, so that a more accurate time-varying heat value of the candidate content data can be obtained.
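The adjustment can be sketched as follows. The function name is hypothetical, the difference is taken as an absolute difference (an assumption), and the `add` flag selects between the sum and the difference variants described above.

```python
def adjust_estimate(calculated, latest, diff_threshold, error_value, add=True):
    """Adjust a calculated heat value that is too close to the latest value.

    If |calculated - latest| is below diff_threshold, the preset error value
    is added to (or subtracted from) the calculated heat value; otherwise the
    calculated heat value is returned unchanged.
    """
    if abs(calculated - latest) < diff_threshold:
        return calculated + error_value if add else calculated - error_value
    return calculated
```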
Referring to fig. 3, fig. 3 is a schematic diagram of a ranking list generation system for data according to an embodiment of the present invention. The system shown in fig. 3 includes an attribute information layer, a play index layer, a ranking list calculation layer, and an interface service layer.
In one implementation, the ranking list generation system of the data may be implemented based on a Spark framework.
Specifically, the above-mentioned functional layers may be implemented by the same electronic device, or may be implemented by different electronic devices.
The attribute information layer comprises a first database and is used for determining, at a preset time interval, the attribute information of content data conforming to a preset ranking list rule, and storing the determined attribute information in the first database.
The attribute information layer may further include an original database, and the attribute information layer scans attribute information of content data in the original database according to a preset time interval, and stores the attribute information of the content data conforming to a preset ranking list rule in the first database.
The attribute information layer may also send playback control information of the content data in the original database to the interface service layer.
And the play index layer is used for continuously obtaining the heat value of the content data counted according to the time granularity, and updating the heat value of the content data recorded in the first database according to the obtained heat value.
In one implementation, the popularity value of the content data is obtained by Kafka according to temporal granularity statistics.
Since the attribute information of the content data in the first database is obtained by scanning the original database by the attribute information layer, and the heat value of the content data is updated by the play index layer based on the obtained heat value of the content data, the attribute information and the heat value of the content data in the first database can be considered to be hierarchically spliced. Because the attribute information and the heat value of the content data in the first database are obtained by layering and splicing, different layers are not interfered with each other, and therefore the data obtaining efficiency is improved.
When the heat values recorded in the first database are updated, the real-time play data stream sent by Kafka may be obtained based on Spark Streaming, and the heat values of the content data recorded in the first database are updated according to the heat values of the content data carried in the obtained real-time play data stream.
The ranking list data determining layer is used for determining candidate content data to be ranked according to the attribute information recorded in the first database.
Specifically, the candidate content data to be ranked may be determined, based on the Spark framework, according to the attribute information of the content data whose popularity value has been updated in the first database. Since the candidate content data is the content data to be ranked, the database storing the determined candidate content data may be referred to as a ranking pool.
The interface service layer is used for ranking the candidate content data according to the popularity values of the candidate content data recorded in the first database, to obtain a ranking list.
The interface service layer may be further configured to typeset the candidate content data according to the popularity value of the candidate content data recorded in the first database and the play control information of the received content data, so as to obtain a ranking list.
The interface service layer may further include a second database and a data backup database of the second database, and the interface service layer is further configured to generate real-time change information of the popularity value of the candidate content data in the ranking list based on a plurality of latest popularity values of the content data recorded in the second database.
Because each functional layer performs a different function and the layers jointly generate the ranking list of the data, the ranking list can be generated efficiently in a high-QPS (queries per second) scenario, and a fault in one functional layer does not affect the operation of the other functional layers.
Corresponding to the ranking list generation method of the data, the embodiment of the invention also provides a ranking list generation device of the data.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a first ranking list generating apparatus for data according to an embodiment of the present invention, where the apparatus includes the following modules 401 to 404.
A popularity value obtaining module 401, configured to continuously obtain popularity values of content data according to time granularity statistics;
a heat value updating module 402, configured to update a heat value of content data recorded in a first database according to the obtained heat value, where the first database is used for recording: the popularity value and attribute information of the content data conforming to a preset ranking list rule;
A data determining module 403, configured to determine candidate content data to be ranked according to attribute information recorded in the first database;
and a ranking list obtaining module 404, configured to rank the candidate content data according to the popularity values of the candidate content data recorded in the first database, so as to obtain a ranking list.
As can be seen from the above, when a ranking list of data is generated by applying the scheme provided by this embodiment, after the popularity values of the content data counted according to the time granularity are obtained, the candidate content data to be ranked is determined by referring to the attribute information and the popularity values of the content data recorded in the first database, and the ranking list is generated accordingly. Because the first database records the popularity values and attribute information of content data that conforms to the preset ranking list rule, the quantity of content data referenced when the ranking list is generated is greatly reduced compared with the prior art, which improves the real-time performance of ranking list generation.
In addition, since the ranking list is generated according to the popularity values and attribute information of the content data, both of which are recorded in the first database, they can be obtained directly from the first database when the ranking list is generated, which further improves the real-time performance of ranking list generation.
Finally, the heat value of the content data counted according to the time granularity is continuously obtained, and the heat value of the content data recorded in the first database is updated according to the obtained heat value, so that the heat value of the content data recorded in the first database can be updated more timely. And because the ranking list is generated based on the popularity value and the attribute information of the content data recorded in the first database, the real-time performance of the generated ranking list is higher.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a second ranking list generating apparatus for data according to an embodiment of the present invention. On the basis of the above embodiment, the apparatus further includes an information generating module 405.
The information generating module 405 is configured to obtain, for each candidate content data, a preset number of latest popularity values of the candidate content data, and generate real-time variation information of the popularity value of the candidate content data according to the obtained latest popularity values.
In this way, the real-time change information of the heat value of the candidate content data is generated according to the preset number of latest heat values of the candidate content data, and the real-time change condition of the heat value of the candidate content data can be accurately reflected by the preset number of latest heat values of the candidate content data, so that the real-time change information of the heat value of the candidate content data can be accurately generated.
In one embodiment of the present invention, the information generating module 405 includes:
a heat value obtaining sub-module, configured to obtain a preset number of latest heat values of the candidate content data;
and the information determination submodule is used for obtaining a curve of a preset line type representing the change of the heat value along with time according to the obtained latest heat value and the generation moment of each latest heat value, and the curve is used as real-time change information of the heat value of the candidate content data.
In this way, the curve representing the preset linear shape of the change of the heat value along with time is used as the real-time change information of the heat value of the candidate content data, so that the real-time change information of the heat value of the candidate content data can be intuitively reflected. In addition, the curve of the preset line type representing the change of the heat value along with time is obtained according to the obtained latest heat value and the generation time of each latest heat value, so that the curve of the preset line type can be accurately obtained.
In one embodiment of the present invention, the predetermined line shape is a parabolic shape,
the information determination submodule includes:
a first heat value estimation unit, configured to estimate a heat value corresponding to an intermediate time according to two heat values in the obtained latest heat values, where the intermediate time is a time located in the middle of the generation times of the two heat values;
A reference parabola determining unit, configured to generate a parabola representing a change of the heat value with time as a reference parabola according to the two heat values, the estimated heat value, the generation time of the two heat values, and the intermediate time;
a second heat value estimation unit for estimating a heat value of the candidate content data over time according to the reference parabola;
and a reference parabola updating unit for updating the reference parabola based on the estimated heat value.
In this way, since the reference parabola is updated based on the estimated heat value, which is estimated based on the reference parabola, and since the reference parabola is a curve representing the change of the heat value of the content data with time, the heat value of the content data with time can be estimated based on the reference parabola more accurately, and the parabola can be updated more accurately based on the estimated heat value.
In one embodiment of the present invention, the first heat value estimation unit is specifically configured to calculate the heat value $y_3$ corresponding to the intermediate time according to the following expression:
$$y_3 = \frac{y_1 + y_2}{2} \pm \mathrm{interval}$$
where $y_1$ and $y_2$ are the two heat values and interval is a preset value. When the difference between $y_1$ and $y_2$ is less than or equal to a preset threshold, interval is any value in a first preset range; when the difference between $y_1$ and $y_2$ is greater than the preset threshold, interval is any value in a second preset range. The first preset range and the second preset range are two different ranges.
In one embodiment of the present invention, the second heat value estimation unit is specifically configured to calculate a heat value of the content data that varies with time according to the reference parabola and a time when the heat value of the content data is estimated; when the difference between the calculated heat value and the latest heat value of the content data is smaller than a preset difference threshold value, the calculated heat value is adjusted based on the preset error value, and the adjusted heat value is used as the heat value of the candidate content data changing along with time.
Thus, when the difference between the calculated heat value and the latest heat value is small, the calculated heat value is adjusted based on the preset error value, so that a more accurate time-varying heat value of the candidate content data can be obtained.
Corresponding to the ranking list generation method of the data, the embodiment of the invention also provides electronic equipment.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, which includes a processor 601, a communication interface 602, a memory 603, and a communication bus 604, wherein the processor 601, the communication interface 602, and the memory 603 communicate with each other through the communication bus 604,
a memory 603 for storing a computer program;
the processor 601 is configured to implement the ranking list generating method of data according to the embodiment of the present invention when executing the program stored in the memory 603.
The communication bus of the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be classified as an address bus, a data bus, a control bus, or the like. For ease of illustration, the bus is represented by only one bold line in the figure, but this does not mean that there is only one bus or only one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include a random access memory (Random Access Memory, RAM) or a non-volatile memory (Non-Volatile Memory, NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; it may also be a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In still another embodiment of the present invention, a computer readable storage medium is provided, where a computer program is stored, and when the computer program is executed by a processor, the method for generating a ranking list of data provided by the embodiment of the present invention is implemented.
In yet another embodiment of the present invention, a computer program product containing instructions is also provided; when the instructions are executed on a computer, they cause the computer to implement the ranking list generation method for data provided by the embodiment of the present invention.
As can be seen from the above, when a ranking list of data is generated using the scheme provided by this embodiment, the heat values of content data counted at the time granularity are obtained, and the candidate content data to be ranked are then determined by referring to the attribute information and the heat values of the content data recorded in the first database, so as to generate the ranking list. Because the first database records the content data whose heat value and attribute information conform to the preset ranking list rule, the amount of content data consulted when generating the ranking list is greatly reduced compared with the prior art, which improves the real-time performance of ranking list generation.
In addition, since the ranking list is generated according to the heat value and the attribute information of the content data, and both are recorded in the first database, they can be obtained directly from the first database when the ranking list is generated, which further improves the real-time performance of ranking list generation.
Finally, the heat values of the content data counted at the time granularity are obtained continuously, and the heat values recorded in the first database are updated according to the obtained heat values, so the heat values recorded in the first database are kept up to date. Because the ranking list is generated based on the heat values and attribute information recorded in the first database, the generated ranking list reflects the current heat values more closely.
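The flow summarized above (update the first database, select candidates by attribute information, sort by heat value) can be sketched in Python as follows. This is an in-memory illustration only: the class `FirstDatabase`, the function `generate_ranking_list`, the attribute keys, and the choice to ignore updates for content that is not already recorded are all assumptions, not details taken from the embodiment.

```python
from dataclasses import dataclass, field


@dataclass
class FirstDatabase:
    """Hypothetical in-memory stand-in for the first database."""
    heat: dict = field(default_factory=dict)        # content id -> latest heat value
    attributes: dict = field(default_factory=dict)  # content id -> attribute dict

    def update_heat(self, content_id, heat_value):
        # Assumption: only content already recorded (that is, content that
        # conforms to the preset ranking list rule) is updated.
        if content_id in self.attributes:
            self.heat[content_id] = heat_value


def generate_ranking_list(db, wanted_attributes, top_n):
    """Select candidates by attribute information, then sort by heat value."""
    candidates = [
        cid for cid, attrs in db.attributes.items()
        if cid in db.heat
        and all(attrs.get(k) == v for k, v in wanted_attributes.items())
    ]
    # Highest heat value first.
    return sorted(candidates, key=lambda cid: db.heat[cid], reverse=True)[:top_n]
```

Because both the filter and the sort read only from the small first database, no scan of the full content catalogue is needed each time the list is regenerated, which is the real-time benefit the paragraphs above describe.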
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, wireless, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)).
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a related manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the apparatus, electronic device, and computer-readable storage medium embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the corresponding description of the method embodiments.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention are included in the protection scope of the present invention.

Claims (8)

1. A ranking list generation method for data, the method comprising:
continuously obtaining a heat value of content data counted according to time granularity;
updating the heat value of the content data recorded in a first database according to the obtained heat value, wherein the first database is used for recording: the heat value and attribute information of the content data conforming to a preset ranking list rule;
according to the attribute information recorded in the first database, determining candidate content data to be ranked in a ranking list;
according to the heat value of the candidate content data recorded in the first database, ranking the candidate content data to obtain the ranking list;
for each candidate content data, obtaining a preset number of latest heat values of the candidate content data, and generating real-time change information of the heat values of the candidate content data according to the obtained latest heat values;
wherein obtaining, for each candidate content data, a preset number of latest heat values of the candidate content data and generating real-time change information of the heat value of the candidate content data according to the obtained latest heat values comprises:
generating the real-time change information of the heat value of each candidate content data in the following manner:
obtaining a preset number of latest heat values of candidate content data;
obtaining a curve of a preset line type representing the change of the heat value along with time according to the obtained latest heat value and the generation time of each latest heat value, and taking the curve as real-time change information of the heat value of the candidate content data;
the preset line type is parabolic,
the step of obtaining a curve of a preset line type representing the change of the heat value along with time according to the obtained latest heat value and the generation time of each latest heat value, comprising:
estimating a heat value corresponding to a middle moment according to two heat values in the obtained latest heat values, wherein the middle moment is a moment positioned in the middle of the generation moments of the two heat values;
generating a parabola representing the change of the heat value along with time according to the two heat values, the estimated heat value, the generation moment of the two heat values and the intermediate moment, and taking the parabola as a reference parabola;
estimating a time-varying heat value of the candidate content data according to the reference parabola;
and updating the reference parabola based on the estimated heat value.
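The construction of the reference parabola recited above can be illustrated with a short Python sketch. It fits y = a·t² + b·t + c through the two observed heat values and the estimated heat value at the intermediate time, using Newton divided differences; the function names `fit_reference_parabola` and `heat_at` are hypothetical, and the sketch covers only fitting and evaluation, not the subsequent update of the reference parabola.

```python
def fit_reference_parabola(t1, y1, t2, y2, y_mid):
    """Return (a, b, c) of y = a*t^2 + b*t + c through the two observed
    heat values and the estimated heat value at the intermediate time."""
    if t1 == t2:
        raise ValueError("the two generation times must differ")
    t_mid = (t1 + t2) / 2  # intermediate time, halfway between t1 and t2
    # Newton divided differences over (t1, y1), (t_mid, y_mid), (t2, y2).
    slope_a = (y_mid - y1) / (t_mid - t1)
    slope_b = (y2 - y_mid) / (t2 - t_mid)
    a = (slope_b - slope_a) / (t2 - t1)
    b = slope_a - a * (t1 + t_mid)
    c = y1 - slope_a * t1 + a * t1 * t_mid
    return a, b, c


def heat_at(coeffs, t):
    """Evaluate the reference parabola at time t."""
    a, b, c = coeffs
    return a * t * t + b * t + c
```

For example, `fit_reference_parabola(0.0, 0.0, 2.0, 4.0, 1.0)` passes through (0, 0), (1, 1) and (2, 4) and therefore yields the coefficients of y = t², which `heat_at` can then extrapolate to later times.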
2. The method according to claim 1, wherein estimating the corresponding heat value at the intermediate time based on two of the obtained latest heat values comprises:
calculating the heat value $y_3$ corresponding to the intermediate time according to the following expression:
$y_3 = (y_1 + y_2)/2 \pm \mathit{interval}$
wherein $y_1$ and $y_2$ are respectively the two heat values, and interval is a preset value; when the difference between $y_1$ and $y_2$ is smaller than or equal to a preset threshold, interval is any value in a first preset range; when the difference between $y_1$ and $y_2$ is larger than the preset threshold, interval is any value in a second preset range; and the first preset range and the second preset range are two different ranges.
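As an illustration of the expression above, the following Python sketch computes the intermediate heat value. The function name, the uniform random choice of interval within the selected range, and the random choice of sign are assumptions; the claim states only that interval is "any value" in the applicable range and does not say how the sign is chosen.

```python
import random


def estimate_midpoint_heat(y1, y2, threshold, first_range, second_range, rng=random):
    """Return (y1 + y2) / 2 plus or minus an interval from the applicable range.

    first_range and second_range are hypothetical (low, high) tuples.
    """
    # The first range applies when the two heat values are close together,
    # the second range when they differ by more than the threshold.
    low, high = first_range if abs(y1 - y2) <= threshold else second_range
    interval = rng.uniform(low, high)  # assumption: uniform pick in the range
    sign = rng.choice((1, -1))         # assumption: sign chosen at random
    return (y1 + y2) / 2 + sign * interval
```

Passing a seeded `random.Random` instance as `rng` makes the estimate reproducible, which is convenient when testing the parabola fit that consumes this value.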
3. The method of claim 1, wherein estimating a time-varying heat value of the candidate content data based on the reference parabola comprises:
calculating the heat value of the content data along with the time change according to the reference parabola and the time for estimating the heat value of the content data;
when the difference between the calculated heat value and the latest heat value of the content data is smaller than a preset difference threshold value, the calculated heat value is adjusted based on the preset error value, and the adjusted heat value is used as the heat value of the candidate content data changing along with time.
4. A ranking list generating apparatus for data, the apparatus comprising:
the heat value obtaining module is used for continuously obtaining the heat value of the content data counted according to the time granularity;
the heat value updating module is used for updating the heat value of the content data recorded in the first database according to the obtained heat value, and the first database is used for recording: the heat value and attribute information of the content data conforming to a preset ranking list rule;
the data determining module is used for determining candidate content data to be ranked in a ranking list according to the attribute information recorded in the first database;
the ranking list obtaining module is used for ranking the candidate content data according to the heat value of the candidate content data recorded in the first database to obtain the ranking list;
the apparatus further comprises:
the information generation module is used for obtaining a preset number of latest heat values of each candidate content data, and generating real-time change information of the heat values of the candidate content data according to the obtained latest heat values;
the information generation module comprises:
a heat value obtaining sub-module, configured to obtain a preset number of latest heat values of the candidate content data;
The information determination submodule is used for obtaining a curve of a preset line type representing the change of the heat value along with time according to the obtained latest heat value and the generation moment of each latest heat value, and the curve is used as real-time change information of the heat value of the candidate content data;
the preset line type is parabolic,
the information determination submodule includes:
a first heat value estimation unit, configured to estimate a heat value corresponding to an intermediate time according to two heat values in the obtained latest heat values, where the intermediate time is a time located in the middle of the generation times of the two heat values;
a reference parabola determining unit, configured to generate a parabola representing a change of the heat value with time as a reference parabola according to the two heat values, the estimated heat value, the generation time of the two heat values, and the intermediate time;
a second heat value estimation unit for estimating a heat value of the candidate content data over time according to the reference parabola;
and a reference parabola updating unit for updating the reference parabola based on the estimated heat value.
5. The apparatus of claim 4, wherein
the first heat value estimation unit is specifically configured to calculate the heat value $y_3$ corresponding to the intermediate time according to the following expression:
$y_3 = (y_1 + y_2)/2 \pm \mathit{interval}$
wherein $y_1$ and $y_2$ are respectively the two heat values, and interval is a preset value; when the difference between $y_1$ and $y_2$ is smaller than or equal to a preset threshold, interval is any value in a first preset range; when the difference between $y_1$ and $y_2$ is larger than the preset threshold, interval is any value in a second preset range; and the first preset range and the second preset range are two different ranges.
6. The apparatus of claim 4, wherein
the second heat value estimation unit is specifically configured to calculate the time-varying heat value of the content data according to the reference parabola and the time for which the heat value of the content data is to be estimated; when the difference between the calculated heat value and the latest heat value of the content data is smaller than a preset difference threshold, the calculated heat value is adjusted based on a preset error value, and the adjusted heat value is used as the time-varying heat value of the candidate content data.
7. An electronic device, comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the method steps of any one of claims 1-3 when executing the program stored in the memory.
8. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a computer program which, when executed by a processor, implements the method steps of any of claims 1-3.
CN202010844694.2A 2020-08-20 2020-08-20 Ranking list generation method and device for data Active CN111984866B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010844694.2A CN111984866B (en) 2020-08-20 2020-08-20 Ranking list generation method and device for data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010844694.2A CN111984866B (en) 2020-08-20 2020-08-20 Ranking list generation method and device for data

Publications (2)

Publication Number Publication Date
CN111984866A CN111984866A (en) 2020-11-24
CN111984866B true CN111984866B (en) 2023-09-05

Family

ID=73442422

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010844694.2A Active CN111984866B (en) 2020-08-20 2020-08-20 Ranking list generation method and device for data

Country Status (1)

Country Link
CN (1) CN111984866B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105589895A (en) * 2014-11-13 2016-05-18 深圳市腾讯计算机系统有限公司 Resource ranking data generation method and device
US9485536B1 (en) * 2008-09-03 2016-11-01 The Directv Group, Inc. Method and system for updating programming listing data for a broadcasting system
CN107797446A (en) * 2016-09-05 2018-03-13 欧姆龙株式会社 Model predictive control apparatus, control method, message handling program and recording medium
CN108460094A (en) * 2018-01-30 2018-08-28 上海天旦网络科技发展有限公司 The method and system of storage statistical data
CN110096637A (en) * 2019-04-16 2019-08-06 广州虎牙信息科技有限公司 Method, apparatus, storage medium and the terminal device that more lists generate
CN110633376A (en) * 2019-08-22 2019-12-31 北京奇艺世纪科技有限公司 Media object sorting method, device, equipment and storage medium
CN111367922A (en) * 2020-02-25 2020-07-03 香港乐蜜有限公司 Data updating method and related equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9965503B2 (en) * 2015-08-12 2018-05-08 International Business Machines Corporation Data cube generation
US10884704B2 (en) * 2017-09-21 2021-01-05 International Business Machines Corporation Sorting a table in analytical databases
US11281647B2 (en) * 2017-12-06 2022-03-22 International Business Machines Corporation Fine-grained scalable time-versioning support for large-scale property graph databases

Also Published As

Publication number Publication date
CN111984866A (en) 2020-11-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant