CN106375796B - Audience rating statistical method and system - Google Patents


Info

Publication number
CN106375796B
Authority
CN
China
Prior art keywords
information
target
program
tensor
time period
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610811426.4A
Other languages
Chinese (zh)
Other versions
CN106375796A (en
Inventor
徐佳宏
李益永
邓宏栋
成学文
韩涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Ipanel TV Inc
Original Assignee
Shenzhen Ipanel TV Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Ipanel TV Inc
Priority to CN201610811426.4A
Publication of CN106375796A
Application granted
Publication of CN106375796B
Legal status: Active
Anticipated expiration


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/258Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive collaborative data
    • H04N21/25866Management of end-user data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/1805Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815Journaling file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462Approximate or statistical queries
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44204Monitoring of content usage, e.g. the number of times a movie has been viewed, copied or the amount which has been watched
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44213Monitoring of end-user related data
    • H04N21/44222Analytics of user selections, e.g. selection of programs or purchase activity

Abstract

The application discloses an audience rating statistical method and an audience rating statistical system, wherein the audience rating statistical method utilizes a cluster log database generated by a three-network integration system to collect the watching behaviors of all television audiences, so that the audience rating of a target program or a target channel is accurately reflected through statistics. Specifically, cluster logs in a target time period are extracted from a cluster log database, extracted cluster log information is screened according to preset parameters to obtain target cluster logs, an information tensor is generated and subjected to completion processing, information lost in the generation process of the cluster log database is completed, viewing behaviors of all television audiences classified according to the preset parameters in the target time period are obtained, and then the audience rating of a target program or a target channel can be calculated by using the information tensor of the completion information, so that the accurate audience rating of the target program or the target channel is obtained.

Description

Audience rating statistical method and system
Technical Field
The present application relates to the technical field of audience rating statistics, and more particularly, to an audience rating statistics method and system.
Background
The audience rating is the percentage of the number of people (or user clusters) watching a certain television channel (or a certain television program) in a certain time period relative to the total number of television viewers (or the total number of user clusters). Audience rating statistics are of great significance for analyzing the television audience market: they are an important reference for program production, scheduling and adjustment, a main index for program evaluation, and a powerful tool for making and evaluating media plans and improving advertisement placement.
In the prior art, there are mainly two methods for counting audience ratings: the diary method and the audience meter method. The main idea of both methods is to select a portion of television viewers as a sample, and to obtain the audience rating of a certain program or channel by collecting the viewing behaviors of the sample and relating them to the total size of the sample. Since the sample is generally small relative to the total audience, the audience rating counted in the prior art is not the true audience rating; that is, there is an error between the two, and the smaller the sample, the larger the error.
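The definition above can be written as a formula. The symbols $N_{\text{watch}}$ and $N_{\text{total}}$ are introduced here for illustration only and do not appear in the original text; they stand for the number of viewers (or user clusters) watching the channel or program in the time period, and the total number of television viewers (or user clusters), respectively:

```latex
\text{audience rating} = \frac{N_{\text{watch}}}{N_{\text{total}}} \times 100\%
```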
Disclosure of Invention
In order to solve the technical problems, the invention provides an audience rating statistical method and system to realize accurate statistics of audience ratings.
In order to achieve the technical purpose, the embodiment of the invention provides the following technical scheme:
an audience rating statistics method comprising:
extracting cluster logs in a certain time period from a cluster log database, wherein the cluster logs comprise user attributes and watching behaviors, and the cluster log database is generated through a three-network integration system;
screening the extracted cluster log information according to preset parameters to obtain a target cluster log, wherein the preset parameters are one of the user attributes or one of the watching behaviors;
calculating the time of each user cluster for watching the program in each time interval according to the target cluster log, and storing the time in an information tensor mode, wherein the information tensor comprises a program name, the watching time corresponding to the program name and the number of users in the user cluster;
completing the information tensor to complete the lost information to obtain an information tensor of the completed information;
and calculating the audience rating of the target program or the target channel according to the information tensor of the completion information.
Preferably, the calculating the audience rating of the target program or the target channel according to the information tensor of the completion information includes:
and calculating the audience rating of the target program or the target channel in the target time slot according to the information tensor of the completion information.
Preferably, the target time period is a certain target time interval or a certain target day.
Preferably, the calculating the audience rating of the target program in the target time slot according to the information tensor of the completion information includes:
searching the information tensor of the completion information by taking the name of the target program and the target time period as parameters to obtain the information tensor of the target program;
searching the information tensor of the completion information by taking the target time period as a parameter to obtain a total information tensor in the target time period;
and calculating the ratio of the information tensor of the target program to the total information tensor to obtain the audience rating of the target program in a target time period.
Preferably, the calculating the audience rating of the target channel in the target time period according to the information tensor of the completion information includes:
generating a program set of the target channel, wherein the program set comprises all programs of the target channel;
extracting one program in the program set as a target program, calculating the audience rating of the target program in a target time period according to the information tensor of the completion information, and accumulating;
and judging whether the program set is empty, if so, taking the accumulated audience rating as the audience rating of the target channel in the target time period, and if not, returning to the step of extracting one program in the program set as the target program.
Preferably, the completing the information tensor to obtain the information tensor of the completion information includes:
classifying the information tensor by time periods to obtain the information tensor of each time period;
substituting the information tensor of each time interval into the information completion model for calculation to obtain the information tensor of the completion information of each time interval;
summarizing the information tensor of the completion information of each time period to obtain the information tensor of the completion information;
the information completion model is as follows:
objective function: Trace(X);
solved under the constraints: X is an (m + r)-dimensional symmetric positive semidefinite matrix;
X is a non-negative matrix;
X[1:m, m+1:m+r] = a[m, j, r];
wherein Trace(X) represents the sum of the diagonal elements of the matrix X; X being a symmetric positive semidefinite matrix means that all eigenvalues of X are greater than or equal to 0, where an eigenvalue of a matrix A is a value k for which the linear system Ax = kx has a nonzero solution x; a[m, j, r] represents the information tensor, wherein m represents the program name, j represents the time period, and r represents the number of users in the user cluster.
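Read as an optimization problem, the information completion model can be sketched as follows. Two points are assumptions rather than statements from the text: the objective is taken to be minimized (the text names the objective function but not the direction), and m and r are also used as the sizes of the program and user-count dimensions, with the block constraint applied to the slice of the information tensor for time period j:

```latex
\begin{aligned}
\min_{X} \quad & \operatorname{Trace}(X) \\
\text{s.t.} \quad & X \in \mathbb{R}^{(m+r)\times(m+r)}, \quad X = X^{\mathsf{T}}, \quad X \succeq 0, \\
& X_{pq} \ge 0 \quad \text{for all } p, q, \\
& X[1{:}m,\; m{+}1{:}m{+}r] = a[\,\cdot\,, j, \,\cdot\,].
\end{aligned}
```

Under these assumptions the problem has the same form as trace-norm matrix completion, in which minimizing the trace of a positive semidefinite matrix whose off-diagonal block holds the data favors a low-rank completion of that block.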
An audience rating statistics system comprising:
the extraction module is used for extracting cluster logs in a certain time period from a cluster log database, wherein the cluster logs comprise user attributes and watching behaviors, and the cluster log database is generated through a three-network integration system;
the classification module is used for screening the extracted cluster log information according to preset parameters to obtain a target cluster log, wherein the preset parameters are one of the user attributes or one of the watching behaviors;
the information tensor generation module is used for calculating the time of each user cluster for watching the program in each time period according to the target cluster log and storing the time in an information tensor mode, wherein the information tensor comprises a program name, the watching time corresponding to the program name and the number of users in the user cluster;
the information completion module is used for performing completion processing on the information tensor, completing the lost information and obtaining the information tensor of the completed information;
and the calculating module is used for calculating the audience rating of the target program or the target channel according to the information tensor of the completion information.
Preferably, the calculating module is specifically configured to calculate the audience rating of the target program or the target channel in the target time period according to the information tensor of the completion information.
Preferably, the target time period is a certain target time interval or a certain target day.
Preferably, the calculating module is configured to calculate the audience rating of the target program in the target time period, and includes:
the first searching unit is used for searching the information tensor of the completion information by taking the name of the target program and the target time period as parameters to obtain the information tensor of the target program;
the second searching unit is used for searching the information tensor of the completion information by taking the target time period as a parameter to obtain the total information tensor in the target time period;
and the first calculating unit is used for calculating the ratio of the information tensor of the target program to the total information tensor to obtain the audience rating of the target program in a target time period.
Preferably, the calculating module is configured to calculate the audience rating of the target channel in the target time period, and includes: a first sub-module, a second sub-module, and a third sub-module, wherein,
the first submodule is used for generating a program set of the target channel, wherein the program set comprises all programs of the target channel, and for extracting one program in the program set as a target program; the second submodule is used for calculating the audience rating of the target program in the target time period, the audience rating being accumulated in the third submodule;
the third submodule is used for judging whether the program set is empty, and if so, the accumulated audience rating is taken as the audience rating of the target channel in the target time period;
the second sub-module includes:
a target program determining unit, configured to generate a program set of a target channel, where the program set includes all programs of the target channel, extract one program in the program set as a target program,
the first searching unit is used for searching the information tensor of the completion information by taking the name of the target program and the target time period as parameters to obtain the information tensor of the target program;
the second searching unit is used for searching the information tensor of the completion information by taking the target time period as a parameter to obtain the total information tensor in the target time period;
and the first calculating unit is used for calculating the ratio of the information tensor of the target program to the total information tensor to obtain the audience rating of the target program in a target time period.
Preferably, the information completing module includes:
the classification unit is used for classifying the information tensor by time periods to obtain the information tensor of each time period;
the second calculation unit is used for substituting the information tensor of each time interval into the information completion model for calculation to obtain the information tensor of the completion information of each time interval;
the summarizing unit is used for summarizing the information tensor of the completion information of each time interval to obtain the information tensor of the completion information;
the information completion model is as follows:
objective function: Trace(X);
solved under the constraints: X is an (m + r)-dimensional symmetric positive semidefinite matrix;
X is a non-negative matrix;
X[1:m, m+1:m+r] = a[m, j, r];
wherein Trace(X) represents the sum of the diagonal elements of the matrix X; X being a symmetric positive semidefinite matrix means that all eigenvalues of X are greater than or equal to 0, where an eigenvalue of a matrix A is a value k for which the linear system Ax = kx has a nonzero solution x; a[m, j, r] represents the information tensor, wherein m represents the program name, j represents the time period, and r represents the number of users in the user cluster.
It can be seen from the above technical solutions that the embodiments of the present invention provide an audience rating statistical method and system, wherein the audience rating statistical method collects the viewing behaviors of all television viewers by using a cluster log database generated by a three-network integration system, so that the resulting statistics accurately reflect the audience rating of a target program or a target channel. Specifically, the audience rating statistical method extracts the cluster logs in a target time period from the cluster log database, screens the extracted cluster log information according to preset parameters to obtain target cluster logs, then generates an information tensor and completes the information lost while the three-network integration system generated the cluster log database, thereby obtaining the information tensor of the completed information and, with it, the viewing behaviors of all television viewers classified by the preset parameters in the target time period. This information tensor of the completed information can then be used to calculate the audience rating of the target program or the target channel, so that an accurate audience rating of the target program or the target channel is obtained.
Moreover, the audience rating counting method can be completely implemented and completed by a computer without manual participation, so that the audience rating obtained by counting is more objective and accurate, and the influence of subjective factors on audience rating counting is avoided.
Furthermore, the target cluster log is obtained by classifying the extracted cluster log information according to the preset parameters, so that audience rating statistics of different attribute characteristics can be realized.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a schematic flowchart of an audience rating statistics method according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of an audience rating statistics method according to another embodiment of the present application;
Fig. 3 is a schematic structural diagram of an audience rating statistics system according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of an audience rating statistics system according to another embodiment of the present application;
Fig. 5 is a schematic structural diagram of an audience rating statistics system according to another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An embodiment of the present application provides an audience rating statistical method, as shown in fig. 1, including:
S101: extracting cluster logs in a certain time period from a cluster log database, wherein the cluster logs comprise user attributes and watching behaviors, and the cluster log database is generated through a three-network integration system;
It should be noted that three-network integration refers to the process in which, as the telecommunication network, the broadcast network and the internet evolve toward the broadband communication network, the digital television network and the next-generation internet, the three networks are technically upgraded so that their technical functions tend to be consistent, their service ranges tend to be the same, the networks are interconnected and resources are shared, and various services such as voice, data, and broadcast television can be provided to users. The cluster log database generated in the three-network integration process can serve as the basis for accurate audience rating statistics. The cluster log includes user attributes and viewing behavior, wherein the user attributes include the user IP and the family member (such as dad, mom, grandpa, grandma, or child). The viewing behavior comprises: viewing channel, viewing mode (on demand, live, replay, etc.), viewed program title, viewing start time, viewing end time, etc.
It should be further noted that, in a preferred embodiment of the present application, after the cluster logs within a certain time period are extracted from the cluster log database, they are stored as a table in a distributed database, which facilitates rapid processing of large-scale data. In other embodiments of the present application, the extracted cluster logs may instead be stored on a Hadoop distributed cluster, although this is less favorable for rapid audience rating statistics. The present application does not limit this; the choice depends on the actual situation.
S102: screening the extracted cluster log information according to preset parameters to obtain a target cluster log;
The preset parameter is one of the user attributes or one of the viewing behaviors. For example, when the audience rating of a certain program or channel in Beijing needs to be counted, the preset parameter may be set to the IP addresses corresponding to the Beijing area, and the obtained target cluster log includes the user data of the whole Beijing area. When the audience rating of a certain program or channel among child users needs to be counted, the preset parameter may be set to children, and the obtained target cluster log includes the user data of all users whose attribute is child. When the audience rating of a certain program or channel during live broadcast needs to be counted, the preset parameter may be set to live broadcast, and the obtained target cluster log includes all user data whose viewing mode is live. The specific value of the preset parameter is not limited in the present application and is determined according to the actual situation.
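The screening step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LogRecord` layout, its field names, and the `screen_logs` helper are hypothetical and are not taken from the patent, and the preset parameter is matched by simple equality (a regional filter such as the Beijing example would additionally need an IP-to-region lookup, which is not shown).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    # Hypothetical record layout; the field names are illustrative only.
    user_ip: str
    family_member: str  # user attribute, e.g. "dad", "mom", "child"
    channel: str
    mode: str           # viewing mode: "live", "on_demand" or "replay"
    program: str
    start: str          # viewing start time, "HH:MM"
    end: str            # viewing end time, "HH:MM"


def screen_logs(records, field, value):
    """Keep only the records whose `field` equals the preset parameter `value`."""
    return [r for r in records if getattr(r, field) == value]


logs = [
    LogRecord("10.0.0.1", "child", "CH1", "live", "Cartoon", "08:00", "08:30"),
    LogRecord("10.0.0.2", "dad", "CH2", "on_demand", "News", "08:10", "08:40"),
    LogRecord("10.0.0.3", "child", "CH1", "on_demand", "Cartoon", "09:00", "09:20"),
]

# Preset parameter is a user attribute (child viewers).
children_logs = screen_logs(logs, "family_member", "child")
# Preset parameter is a viewing behavior (live broadcast).
live_logs = screen_logs(logs, "mode", "live")
```

Selecting on a single field at a time mirrors the statement that the preset parameter is one user attribute or one viewing behavior.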
S103: and calculating the time of each user cluster for watching the program in each time interval according to the target cluster log, and storing the time in an information tensor mode, wherein the information tensor comprises the program name, the watching time corresponding to the program name and the number of users in the user cluster.
The user cluster refers to all users in each target cluster log, such as all members in a family.
S104: and completing the information tensor to complete the lost information to obtain the information tensor of the completed information.
Because communication information may be lost between computer clusters, the information of some users is lost while the three-network integration system generates the cluster log database. Such loss can sometimes seriously affect certain audience rating statistics (such as the audience rating of a certain region or a certain time period). The information tensor therefore needs to be completed to restore the lost information, so that the audience rating statistics are not made inaccurate by the loss of some users' information.
S105: and calculating the audience rating of the target program or the target channel according to the information tensor of the completion information.
Optionally, in step S103, calculating, according to the target cluster log, a time for each user cluster to view the program in each time period, and storing the time in an information tensor manner, specifically including:
The time each user spends watching each program in each time interval (each hour) is calculated according to the target cluster log. The data statistics method is as follows: a third-order tensor a[m, n, r] or a fourth-order tensor a[d, m, n, r] is established, wherein d represents the date, m represents the name of a program, n represents the time interval, and r represents the number of users in the user cluster. Taking a user cluster with 1 user as an example, and computing that user's data from 8:00 to 9:00: if the name of the program watched in this interval is i, then m = i; if the program is watched for 30 min, the viewing time is recorded under the second parameter n, that is, the third-order tensor entry a[i, 9, 1] is recorded as 30. If several programs are watched in the interval, the viewing time of each program is recorded separately. When the time period is a certain target day, a fourth-order tensor a[d, m, n, r] needs to be established to record the time each user cluster spends watching programs in each time interval; the specific process is similar to the above and is not described again here.
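A minimal sketch of the third-order tensor construction follows, using a sparse dictionary keyed by (program, interval, cluster size). Several choices here are assumptions made for illustration: the `build_tensor` and `to_minutes` helpers are hypothetical, the interval index is taken as the starting hour plus one so that 8:00 to 9:00 maps to index 9 as in the a[i, 9, 1] example, minutes from different clusters of the same size are summed into one entry, and each viewing record is assumed to start and end on the same day.

```python
from collections import defaultdict


def to_minutes(hhmm):
    """Convert "HH:MM" to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def build_tensor(views):
    """Build a sparse tensor a[(program, interval, cluster_size)] = minutes.

    views: iterable of (program, start "HH:MM", end "HH:MM", cluster_size).
    A viewing record that crosses an hour boundary is split per interval.
    """
    a = defaultdict(int)
    for program, start, end, size in views:
        s, e = to_minutes(start), to_minutes(end)
        while s < e:
            hour = s // 60
            slot_end = min(e, (hour + 1) * 60)
            # Assumed indexing: the interval hour:00 to (hour+1):00 has index hour + 1.
            a[(program, hour + 1, size)] += slot_end - s
            s = slot_end
    return dict(a)


a = build_tensor([
    ("i", "08:00", "08:30", 1),  # 30 min of program i between 8:00 and 9:00
    ("k", "08:30", "09:15", 1),  # crosses the 9:00 boundary
])
```

With this input, the entry for program i in interval 9 holds 30, matching the worked example in the text, and the second record is split into 30 minutes in interval 9 and 15 minutes in interval 10.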
It should be noted that in the present application, a time interval refers to a length of time obtained by dividing one day in steps of one hour, while a time period refers to a length of time comprising one or more time intervals; in some embodiments, the length of a time period may exceed one day (24 time intervals). The specific length of the time period is not limited in the present application and is determined according to the actual situation.
Based on the foregoing embodiment, in an embodiment of the present application, as shown in fig. 2, the calculating the audience rating of the target program or the target channel according to the information tensor of the completion information includes:
S1051: calculating the audience rating of the target program or the target channel in the target time period according to the information tensor of the completion information.
The target time period is a certain target time interval or a certain target day.
It should be noted that the concept of the target time period is similar to that of the time period, and refers to the span of time over which the audience rating is to be counted. For example, to count the audience rating of a certain program in prime time (20:00-22:00), the target time period may be set to a time period comprising the two time intervals 20:00-21:00 and 21:00-22:00. The concept of the target time period is introduced to make the audience rating statistics of the target program or the target channel within one day more specific: the statistics for the whole day are refined into statistics for each target time period, so that program or channel managers can clearly know the audience rating in each target time period and take corresponding scheduling measures according to the audience ratings of the target program or the target channel in different target time periods.
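The relation between a target time period and its hourly time intervals can be illustrated with a small helper. The function name `intervals_in_period` is hypothetical, and the indexing convention (the interval from h:00 to (h+1):00 has index h + 1) is an assumption chosen to agree with the a[i, 9, 1] example for 8:00 to 9:00.

```python
def intervals_in_period(start_hour, end_hour):
    """Return the hourly interval indices covering start_hour:00 to end_hour:00.

    Assumed indexing: the interval h:00 to (h+1):00 has index h + 1.
    """
    if not 0 <= start_hour < end_hour <= 24:
        raise ValueError("expected 0 <= start_hour < end_hour <= 24")
    return list(range(start_hour + 1, end_hour + 1))


# Prime time 20:00-22:00 covers the two intervals 20:00-21:00 and 21:00-22:00.
prime_time = intervals_in_period(20, 22)
whole_day = intervals_in_period(0, 24)
```

A target day is then simply the period made up of all 24 intervals.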
On the basis of the foregoing embodiment, in another embodiment of the present application, the calculating the audience rating of the target program in the target time slot according to the information tensor of the completion information includes:
searching the information tensor of the completion information by taking the name of the target program and the target time period as parameters to obtain the information tensor of the target program;
searching the information tensor of the completion information by taking the target time period as a parameter to obtain a total information tensor in the target time period;
and calculating the ratio of the information tensor of the target program to the total information tensor to obtain the audience rating of the target program in a target time period.
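The three steps above can be sketched as follows. The text does not say how a ratio of two tensors is reduced to a single number, so this sketch assumes each tensor is reduced to total user-minutes (viewing minutes multiplied by the number of users in the cluster); the sparse dictionary layout and the helper names `user_minutes` and `program_rating` are likewise hypothetical.

```python
def user_minutes(a, intervals, program=None):
    """Sum minutes * cluster size over the given intervals.

    a: sparse tensor {(program, interval, cluster_size): minutes}.
    If `program` is given, only that program's entries are counted.
    """
    return sum(
        minutes * size
        for (prog, interval, size), minutes in a.items()
        if interval in intervals and (program is None or prog == program)
    )


def program_rating(a, program, intervals):
    """Ratio of the target program's viewing to all viewing in the target period."""
    total = user_minutes(a, intervals)
    if total == 0:
        return 0.0
    return user_minutes(a, intervals, program) / total


# Completed information tensor (illustrative values only).
a = {
    ("News", 21, 2): 30,
    ("Drama", 21, 1): 60,
    ("Drama", 22, 3): 20,
    ("News", 9, 1): 45,   # outside the target period, ignored below
}
rating = program_rating(a, "Drama", {21, 22})
```

The first lookup corresponds to calling `user_minutes` with the program name and the target period, the second to calling it with the target period only, and the rating is their quotient.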
Correspondingly, the calculating the audience rating of the target channel in the target time period according to the information tensor of the completion information includes:
generating a program set of the target channel, wherein the program set comprises all programs of the target channel;
extracting one program in the program set as a target program, calculating the audience rating of the target program in a target time period according to the information tensor of the completion information, and accumulating;
and judging whether the program set is empty, if so, taking the accumulated audience rating as the audience rating of the target channel in the target time period, and if not, returning to the step of extracting one program in the program set as the target program.
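The accumulation loop for a channel can be sketched as below. As assumptions for illustration, the tensor is a sparse dictionary keyed by (program, interval, cluster size), each program's share is measured in user-minutes (minutes multiplied by cluster size), and the function name `channel_rating` is hypothetical.

```python
def channel_rating(a, channel_programs, intervals):
    """Accumulate per-program ratings until the channel's program set is empty."""
    total = sum(
        minutes * size
        for (_, interval, size), minutes in a.items()
        if interval in intervals
    )
    remaining = set(channel_programs)  # program set of the target channel
    accumulated = 0.0
    while remaining:                   # loop until the program set is empty
        program = remaining.pop()      # extract one program as the target program
        watched = sum(
            minutes * size
            for (prog, interval, size), minutes in a.items()
            if prog == program and interval in intervals
        )
        accumulated += watched / total if total else 0.0
    return accumulated


# Completed information tensor (illustrative values only).
a = {
    ("News", 21, 2): 30,
    ("Drama", 21, 1): 60,
    ("Drama", 22, 3): 20,
    ("Film", 21, 1): 60,   # broadcast on a different channel
}
rating = channel_rating(a, {"News", "Drama"}, {21, 22})
```

Here the channel carrying News and Drama receives 180 of the 240 user-minutes in the target period.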
The method for calculating the audience rating of the target channel in the target time period according to the information tensor of the completion information may further include:
generating a program set of the target channel;
extracting one program in the program set as a target program;
searching the information tensor of the completion information by taking the name of the target program and the target time period as parameters to obtain the information tensor of the target program, and putting the information tensor into a tensor set;
judging whether the program set is empty, if so, taking the tensor set as an information tensor of the target channel in a target time period, and if not, returning to the step of extracting one program in the program set as a target program;
searching the information tensor of the completion information by taking the target time period as a parameter to obtain a total information tensor in the target time period;
and calculating the ratio of the information tensor of the target channel in the target time period to the total information tensor to obtain the audience rating of the target channel in the target time period.
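Under one reading of the text, the two methods give the same channel rating. The notation below is introduced only for this illustration and is an assumption: J is the set of periods in the target time period, P_c is the program set of the target channel, T(p) is the total viewing time of program p within J, and the "ratio of information tensors" is taken to be a ratio of summed viewing times.

```latex
T(p) = \sum_{j \in J} \sum_{k} a[p, j, k],
\qquad
T_{\mathrm{total}} = \sum_{p} T(p)

% First method: accumulate the per-program ratings.
R_c^{(1)} = \sum_{p \in P_c} \frac{T(p)}{T_{\mathrm{total}}}

% Second method: collect the channel's tensor entries first, then take one ratio.
R_c^{(2)} = \frac{\sum_{p \in P_c} T(p)}{T_{\mathrm{total}}}

% The denominator does not depend on p, so R_c^{(1)} = R_c^{(2)}.
```

The second method queries the total information tensor once instead of once per program, which may matter when the program set is large.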
The specific method for calculating the audience rating of the target channel in the target time period according to the information tensor of the completion information is not limited, and is specifically determined according to the actual situation.
On the basis of the foregoing embodiment, in another embodiment of the present application, the completing the information tensor to obtain the information tensor of the completion information includes:
classifying the information tensor by time periods to obtain the information tensor of each time period;
substituting the information tensor of each time interval into the information completion model for calculation to obtain the information tensor of the completion information of each time interval;
summarizing the information tensor of the completion information of each time period to obtain the information tensor of the completion information;
the information completion model is as follows:
an objective function: Trace(X);
solving under the constraint conditions: X is an (m+r)-dimensional symmetric positive semi-definite matrix;
X is a non-negative matrix;
X[1:m, m+1:m+r] = a[m, j, r];
wherein Trace(X) represents the sum of the diagonal elements of the matrix X; X being a symmetric positive semi-definite matrix means that all eigenvalues of X are greater than or equal to 0, an eigenvalue being a solution k of the linear system Ax = kx; a[m, j, r] represents the information tensor, wherein m represents the program name, j represents the time period, and r represents the number of users in the user cluster.
Specifically, substituting the information tensor of each time interval into the information completion model for calculation, and obtaining the information tensor of the completion information of each time interval includes:
determining a matrix X by using X[1:m, m+1:m+r] = a[m, j, r], then substituting X into Trace(X) and solving under the conditions that X is an (m+r)-dimensional symmetric positive semi-definite matrix and X is a non-negative matrix, so that the information tensor of the completion information in period j is obtained; traversing the 24 periods then yields the information tensor of the completion information.
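For a single period j, the completion model can be written as a semidefinite program. The formulation below is a sketch under assumptions that the text does not spell out: the objective is minimized, A_j denotes the m-by-r slice a[:, j, :] of the information tensor, the equality constraint is enforced only on the set Omega_j of entries that were actually observed, and W_1 and W_2 are free diagonal blocks.

```latex
\begin{aligned}
\min_{X} \quad & \operatorname{Trace}(X) \\
\text{s.t.} \quad
& X = \begin{pmatrix} W_1 & A_j \\ A_j^{\top} & W_2 \end{pmatrix} \succeq 0,
  \qquad X \in \mathbb{R}^{(m+r) \times (m+r)}, \\
& X_{pq} \ge 0 \quad \text{for all } p, q, \\
& (A_j)_{ik} = a[i, j, k] \quad \text{for } (i, k) \in \Omega_j .
\end{aligned}
```

Without the non-negativity constraint, this is the standard semidefinite form of nuclear-norm minimization used for low-rank matrix completion, since the minimum of Trace(W_1) + Trace(W_2) over positive semi-definite block matrices with off-diagonal block A_j equals twice the nuclear norm of A_j. Under this reading, the completed viewing times for period j are read from the upper-right block of the optimal X.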
Correspondingly, an embodiment of the present application further provides an audience rating statistics system, as shown in fig. 3, including:
an extraction module 100, configured to extract cluster logs within a certain time period from a cluster log database, wherein the cluster logs comprise user attributes and viewing behaviors, and the cluster log database is generated by a three-network integration system;
the classification module 200 is configured to filter the extracted cluster log information according to a preset parameter, to obtain a target cluster log, where the preset parameter is one of the user attributes or one of the viewing behaviors;
the information tensor generation module 300 is configured to calculate, according to the target cluster log, a time for each user cluster to watch the program in each time period, and store the time in an information tensor manner, where the information tensor includes a program name, a watching time corresponding to the program name, and the number of users in the user cluster;
an information completing module 400, configured to complete the information tensor, complete the missing information, and obtain an information tensor of the complete information;
and a calculating module 500, configured to calculate an audience rating of the target program or the target channel according to the information tensor of the completion information.
It should be noted that three-network convergence refers to the process in which, as the telecommunication network, the broadcast television network and the Internet evolve toward the broadband communication network, the digital television network and the next-generation Internet, the three networks are technically upgraded so that their technical functions tend to be consistent and their service ranges tend to be the same; the networks are interconnected and share resources, and can provide users with various services such as voice, data, and broadcast television. The cluster log database generated in the three-network integration process serves as the basis for accurate audience rating statistics. The cluster log includes user attributes and viewing behaviors, wherein the user attributes include the user IP and family members (such as dad, mom, grandpa, grandma, or child). The viewing behaviors include: viewing channel, viewing mode (on demand, live or rewind, etc.), title of the viewed program, viewing start time, viewing end time, and the like.
The preset parameter is one of the user attributes or one of the viewing behaviors. For example, when the audience rating of a certain program or channel in Beijing needs to be counted, the preset parameter may be set to the IP addresses corresponding to the Beijing area, and the obtained target cluster log includes the user data of the whole Beijing area. When the audience rating of a certain program or channel among child users needs to be counted, the preset parameter may be set to the user attribute "child", and the obtained target cluster log includes all user data whose user attribute is child. When the audience rating of a certain program or channel during live broadcast needs to be counted, the preset parameter may be set to live broadcast, and the obtained target cluster log includes all user data whose viewing mode is live. The specific value of the preset parameter is not limited in the present application, and is determined according to the actual situation.
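The screening step can be illustrated with a small filter. The record layout and the field names (`ip_region`, `member`, `mode`, `program`) are hypothetical; the text only states that a cluster log carries user attributes and viewing behaviors.

```python
from typing import Any, Dict, List

LogRecord = Dict[str, Any]


def screen_logs(logs: List[LogRecord], field: str, value: Any) -> List[LogRecord]:
    """Keep only the records whose `field` equals `value` (the preset parameter)."""
    return [record for record in logs if record.get(field) == value]


cluster_logs = [
    {"ip_region": "Beijing", "member": "child", "mode": "live", "program": "cartoon"},
    {"ip_region": "Beijing", "member": "dad", "mode": "on_demand", "program": "news"},
    {"ip_region": "Shenzhen", "member": "child", "mode": "live", "program": "cartoon"},
]

beijing_logs = screen_logs(cluster_logs, "ip_region", "Beijing")  # by region
child_logs = screen_logs(cluster_logs, "member", "child")         # by user attribute
live_logs = screen_logs(cluster_logs, "mode", "live")             # by viewing behavior
```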
It should be further noted that, in a preferred embodiment of the present application, after the cluster log within a certain time period is extracted from the cluster log database, the cluster log is stored in the form of a table of a distributed database, which facilitates rapid processing of large-scale data. In other embodiments of the present application, however, the extracted cluster log may instead be stored on a Hadoop distributed cluster, which is not conducive to rapid audience rating statistics. The present application does not limit this, and it is determined according to the actual situation.
In addition, calculating the time each user cluster spends watching programs in each time period according to the target cluster log, and storing the time in the form of an information tensor, specifically includes:
calculating the time each user spends watching programs in each period (each hour) according to the target cluster log. The data statistics method is as follows: a third-order tensor a[m, n, r] or a fourth-order tensor a[d, m, n, r] is established, wherein d represents the date, m represents the program name, n represents the period, and r represents the number of users in the user cluster. Taking a user cluster containing 1 user as an example, and computing that user's data from 8:00 to 9:00: if the name of the program viewed during this period is i, then m = i; the period is recorded in the second parameter n; and if the program is watched for 30 min, the third-order tensor entry is recorded as a[i, 9, 1] = 30. If several programs are watched in the period, the viewing time of each program is counted separately. When the time period is a certain target day, a fourth-order tensor a[d, m, n, r] needs to be established to record the time each user cluster spends watching programs in each period; the specific process is similar to the above and is not described again here.
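The tensor construction can be sketched as follows. The input format (cluster id, program name, start and end in minutes since midnight) and the sparse dictionary storage are assumptions made for illustration, as is the name `build_tensor`; a viewing session that spans several hours is split across the hourly periods it touches.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (cluster id, program name, start minute, end minute), minutes since midnight.
Session = Tuple[str, str, int, int]


def build_tensor(sessions: Iterable[Session]) -> Dict[Tuple[str, int, str], float]:
    """Accumulate minutes watched per (program, hourly period, user cluster)."""
    tensor: Dict[Tuple[str, int, str], float] = defaultdict(float)
    for cluster, program, start, end in sessions:
        t = start
        while t < end:
            period = t // 60 + 1  # assumed numbering: 8:00-9:00 is period 9
            boundary = min(end, (t // 60 + 1) * 60)  # end of session or of the hour
            tensor[(program, period, cluster)] += boundary - t
            t = boundary
    return dict(tensor)


sessions = [
    ("family_a", "i", 8 * 60 + 10, 8 * 60 + 40),       # 8:10-8:40
    ("family_a", "news", 20 * 60 + 30, 21 * 60 + 15),  # 20:30-21:15, spans two periods
]
a = build_tensor(sessions)
```

The first session reproduces the example in the text: 30 minutes of program i are stored under period 9 for that user cluster.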
The user cluster refers to all users in each target cluster log, such as all members in a family.
Based on the foregoing embodiment, in an embodiment of the present application, the calculating module 500 is specifically configured to calculate the audience rating of the target program or the target channel in the target time period according to the information tensor of the completion information.
Because communication information may be lost among computer clusters, the information of some users is lost while the three-network integration system generates the cluster log database. The loss of user information can sometimes seriously affect part of the audience rating statistics (such as the audience rating of a certain region or a certain time period). The information tensor therefore needs to be completed to restore the lost information, so that the loss of some users' information does not make the audience rating statistics inaccurate.
It should be noted that in the present application, a period refers to a length of time obtained by dividing one day in steps of one hour, and a time period refers to a length of time comprising one or more periods; in some embodiments, the length of a time period may exceed one day (24 periods). A time period may thus comprise a certain period, several target periods, a certain target day, or several target days. The specific length of the time period is not limited in the present application, and is determined according to the actual situation.
On the basis of the foregoing embodiment, in another embodiment of the present application, the calculating module 500, when configured to calculate the audience rating of the target program in the target time period, includes, as shown in fig. 4:
the first searching unit 510 is configured to search the information tensor of the completion information by using the name of the target program and the target time period as parameters, and obtain the information tensor of the target program;
a second searching unit 520, configured to search the information tensor of the completion information by using the target time period as a parameter, and obtain a total information tensor in the target time period;
the first calculating unit 530 is configured to calculate a ratio of the information tensor of the target program to the total information tensor, and obtain an audience rating of the target program in a target time period.
The concept of the target time period is similar to that of the time period: it refers to the span of time over which the audience rating is to be counted. For example, to count the audience rating of a certain program in prime time (20:00-22:00), the target time period may be set to a time period comprising the two periods 20:00-21:00 and 21:00-22:00. The concept of the target time period is introduced to make the audience rating statistics of the target program or the target channel within one day more specific: the statistics for the day are refined into statistics for each target time period, so that program or channel managers can clearly know the audience rating of each target time period and take corresponding scheduling measures according to the audience ratings of the target program or the target channel in different target time periods.
Accordingly, in another embodiment of the present application, the calculating module 500 is configured to calculate the audience rating of the target channel in the target time period, and includes: a first sub-module, a second sub-module, and a third sub-module, wherein,
the first submodule is used for generating a program set of the target channel, the program set comprises all programs of the target channel, one program in the program set is extracted as a target program, the second submodule is used for calculating the audience rating of the target program in a target time period, and the audience rating is accumulated in the third submodule;
the third submodule is used for judging whether the program set is empty, and if so, the accumulated audience rating is taken as the audience rating of the target channel in the target time period;
the second sub-module includes:
a target program determining unit, configured to generate a program set of a target channel, where the program set includes all programs of the target channel, extract one program in the program set as a target program,
the first searching unit 510 is configured to search the information tensor of the completion information by using the name of the target program and the target time period as parameters, and obtain the information tensor of the target program;
a second searching unit 520, configured to search the information tensor of the completion information by using the target time period as a parameter, and obtain a total information tensor in the target time period;
the first calculating unit 530 is configured to calculate a ratio of the information tensor of the target program to the total information tensor, and obtain an audience rating of the target program in a target time period.
It should be noted that, the calculating module 500 may also calculate the audience rating of the target channel in the target time period by another method, which specifically includes:
generating a program set of the target channel;
extracting one program in the program set as a target program;
searching the information tensor of the completion information by taking the name of the target program and the target time period as parameters to obtain the information tensor of the target program, and putting the information tensor into a tensor set;
judging whether the program set is empty, if so, taking the tensor set as an information tensor of the target channel in a target time period, and if not, returning to the step of extracting one program in the program set as a target program;
searching the information tensor of the completion information by taking the target time period as a parameter to obtain a total information tensor in the target time period;
and calculating the ratio of the information tensor of the target channel in the target time period to the total information tensor to obtain the audience rating of the target channel in the target time period.
The specific method for calculating the audience rating of the target channel in the target time period by the calculating module 500 according to the information tensor of the completion information is not limited in the present application, and is specifically determined according to the actual situation.
On the basis of the above embodiments, in an embodiment of the present application, as shown in fig. 5, the information completing module 400 includes:
a classifying unit 410, configured to classify the information tensor by time periods to obtain an information tensor of each time period;
the second calculating unit 420 is configured to substitute the information tensor of each time interval into the information completion model to calculate, and obtain the information tensor of the completion information of each time interval;
the summarizing unit 430 is configured to summarize the information tensor of the completion information of each time period to obtain an information tensor of the completion information;
the information completion model is as follows:
an objective function: Trace(X);
solving under the constraint conditions: X is an (m+r)-dimensional symmetric positive semi-definite matrix;
X is a non-negative matrix;
X[1:m, m+1:m+r] = a[m, j, r];
wherein Trace(X) represents the sum of the diagonal elements of the matrix X; X being a symmetric positive semi-definite matrix means that all eigenvalues of X are greater than or equal to 0, an eigenvalue being a solution k of the linear system Ax = kx; a[m, j, r] represents the information tensor, wherein m represents the program name, j represents the time period, and r represents the number of users in the user cluster.
Specifically, substituting the information tensor of each time interval into the information completion model for calculation, and obtaining the information tensor of the completion information of each time interval includes:
determining a matrix X by using X[1:m, m+1:m+r] = a[m, j, r], then substituting X into Trace(X) and solving under the conditions that X is an (m+r)-dimensional symmetric positive semi-definite matrix and X is a non-negative matrix, so that the information tensor of the completion information in period j is obtained; traversing the 24 periods then yields the information tensor of the completion information.
In summary, the embodiments of the present application provide an audience rating statistical method and system. The audience rating statistical method collects the viewing behaviors of all television viewers by using the cluster log database generated by the three-network integration system, so as to compute an audience rating that accurately reflects a target program or a target channel. Specifically, the method extracts the cluster logs within a target time period from the cluster log database, screens the extracted cluster log information according to preset parameters to obtain target cluster logs, then generates an information tensor and completes the information lost when the three-network integration system generated the cluster log database, obtaining the information tensor of the completion information and thereby the viewing behaviors of all television viewers, classified by the preset parameters, within the target time period. The information tensor of the completion information can then be used to calculate the audience rating of the target program or the target channel, so that an accurate audience rating for the target program or the target channel is obtained.
Moreover, the audience rating counting method can be completely implemented and completed by a computer without manual participation, so that the audience rating obtained by counting is more objective and accurate, and the influence of subjective factors on audience rating counting is avoided.
Furthermore, the target cluster log is obtained by classifying the extracted cluster log information according to the preset parameters, so that audience rating statistics of different attribute characteristics can be realized.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (4)

1. An audience rating statistical method is applied to a three-network integration system, and comprises the following steps:
extracting cluster logs in a certain time period from a cluster log database, wherein the cluster logs comprise user attributes and watching behaviors, and the cluster log database is generated through a three-network integration system;
screening the extracted cluster log information according to preset parameters to obtain a target cluster log, wherein the preset parameters are one of the user attributes or one of the watching behaviors;
calculating the time of each user cluster for watching the program in each time interval according to the target cluster log, and storing the time in an information tensor mode, wherein the information tensor comprises a program name, the watching time corresponding to the program name and the number of users in the user cluster;
completing the information tensor to complete the lost information to obtain an information tensor of the completed information;
calculating the audience rating of a target program or a target channel according to the information tensor of the completion information;
the completing the information tensor to complete the lost information and obtain the information tensor of the completion information comprises:
classifying the information tensor by time periods to obtain the information tensor of each time period;
substituting the information tensor of each time interval into the information completion model for calculation to obtain the information tensor of the completion information of each time interval;
summarizing the information tensor of the completion information of each time period to obtain the information tensor of the completion information;
the information completion model is as follows:
an objective function: Trace(X);
solving under the constraint conditions: X is an (m+r)-dimensional symmetric positive semi-definite matrix;
X is a non-negative matrix;
X[1:m, m+1:m+r] = a[m, j, r];
wherein Trace(X) represents the sum of the diagonal elements of the matrix X; X being a symmetric positive semi-definite matrix means that all eigenvalues of X are greater than or equal to 0, an eigenvalue being a solution k of the linear system Ax = kx; a[m, j, r] represents the information tensor, wherein m represents a program name, j represents a time period, and r represents the number of users in the user cluster;
the calculating the audience rating of the target program or the target channel according to the information tensor of the completion information comprises:
calculating the audience rating of a target program or a target channel in a target time period according to the information tensor of the completion information;
the calculating the audience rating of the target program in the target time period according to the information tensor of the completion information comprises:
searching the information tensor of the completion information by taking the name of the target program and the target time period as parameters to obtain the information tensor of the target program;
searching the information tensor of the completion information by taking the target time period as a parameter to obtain a total information tensor in the target time period;
calculating the ratio of the information tensor of the target program to the total information tensor to obtain the audience rating of the target program in a target time period;
or comprises the following steps:
generating a program set of the target channel, wherein the program set comprises all programs of the target channel;
extracting one program in the program set as a target program, calculating the audience rating of the target program in a target time period according to the information tensor of the completion information, and accumulating;
and judging whether the program set is empty, if so, taking the accumulated audience rating as the audience rating of the target channel in the target time period, and if not, returning to the step of extracting one program in the program set as the target program.
2. The audience rating statistics method of claim 1, wherein the target time period is a target period, several target periods, or one or more target days.
3. An audience rating statistic system, applied to a three-network convergence system, the audience rating statistic system comprising:
an extraction module, used for extracting cluster logs in a certain time period from a cluster log database, wherein the cluster logs comprise user attributes and watching behaviors, and the cluster log database is generated through a three-network integration system;
the classification module is used for screening the extracted cluster log information according to preset parameters to obtain a target cluster log, wherein the preset parameters are one of the user attributes or one of the watching behaviors;
the information tensor generation module is used for calculating the time of each user cluster for watching the program in each time period according to the target cluster log and storing the time in an information tensor mode, wherein the information tensor comprises a program name, the watching time corresponding to the program name and the number of users in the user cluster;
the information completion module is used for performing completion processing on the information tensor, completing the lost information and obtaining the information tensor of the completed information;
the calculation module is used for calculating the audience rating of the target program or the target channel according to the information tensor of the completion information;
the information completion module comprises:
the classification unit is used for classifying the information tensor by time periods to obtain the information tensor of each time period;
the second calculation unit is used for substituting the information tensor of each time interval into the information completion model for calculation to obtain the information tensor of the completion information of each time interval;
the summarizing unit is used for summarizing the information tensor of the completion information of each time interval to obtain the information tensor of the completion information;
the information completion model is as follows:
an objective function: Trace(X);
solving under the constraint conditions: X is an (m+r)-dimensional symmetric positive semi-definite matrix;
X is a non-negative matrix;
X[1:m, m+1:m+r] = a[m, j, r];
wherein Trace(X) represents the sum of the diagonal elements of the matrix X; X being a symmetric positive semi-definite matrix means that all eigenvalues of X are greater than or equal to 0, an eigenvalue being a solution k of the linear system Ax = kx; a[m, j, r] represents the information tensor, wherein m represents a program name, j represents a time period, and r represents the number of users in the user cluster;
the calculation module is specifically configured to calculate an audience rating of a target program or a target channel in a target time period according to the information tensor of the completion information;
the calculating module, when used for calculating the audience rating of the target program in the target time period, comprises:
the first searching unit is used for searching the information tensor of the completion information by taking the name of the target program and the target time period as parameters to obtain the information tensor of the target program;
the second searching unit is used for searching the information tensor of the completion information by taking the target time period as a parameter to obtain the total information tensor in the target time period;
the first calculating unit is used for calculating the ratio of the information tensor of the target program to the total information tensor to obtain the audience rating of the target program in a target time period;
or
the calculating module, when used for calculating the audience rating of the target channel in the target time period, comprises: a first sub-module, a second sub-module, and a third sub-module, wherein,
the first submodule is used for generating a program set of the target channel, the program set comprises all programs of the target channel, one program in the program set is extracted as a target program, the second submodule is used for calculating the audience rating of the target program in a target time period, and the audience rating is accumulated in the third submodule;
the third submodule is used for judging whether the program set is empty, and if so, the accumulated audience rating is taken as the audience rating of the target channel in the target time period;
the second sub-module includes:
a target program determining unit, configured to generate a program set of a target channel, where the program set includes all programs of the target channel, extract one program in the program set as a target program,
the first searching unit is used for searching the information tensor of the completion information by taking the name of the target program and the target time period as parameters to obtain the information tensor of the target program;
the second searching unit is used for searching the information tensor of the completion information by taking the target time period as a parameter to obtain the total information tensor in the target time period;
and the first calculating unit is used for calculating the ratio of the information tensor of the target program to the total information tensor to obtain the audience rating of the target program in a target time period.
4. The audience rating statistics system of claim 3, wherein the target time period is a target period, several target periods, or one or more target days.
CN201610811426.4A 2016-09-08 2016-09-08 Audience rating statistical method and system Active CN106375796B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610811426.4A CN106375796B (en) 2016-09-08 2016-09-08 Audience rating statistical method and system

Publications (2)

Publication Number Publication Date
CN106375796A CN106375796A (en) 2017-02-01
CN106375796B true CN106375796B (en) 2019-12-31

Family

ID=57899442

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610811426.4A Active CN106375796B (en) 2016-09-08 2016-09-08 Audience rating statistical method and system

Country Status (1)

Country Link
CN (1) CN106375796B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019191875A1 (en) 2018-04-02 2019-10-10 The Nielsen Company (Us), Llc Processor systems to estimate audience sizes and impression counts for different frequency intervals
CN110636344A (en) * 2018-06-22 2019-12-31 上海淘播播电子商务有限公司 Program evaluation method based on new media multi-source cross-screen data analysis
CN109819243B (en) * 2019-01-16 2020-06-23 中央电视台 Method and device for evaluating television station channel column performance

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101527812A (en) * 2008-03-07 2009-09-09 上海贝尔阿尔卡特股份有限公司 Method for automatically counting customer event information and program reception information in network television system
CN102572500A (en) * 2010-12-16 2012-07-11 康佳集团股份有限公司 Network TV program rating collecting system and method
CN104394436A (en) * 2014-11-28 2015-03-04 北京国双科技有限公司 Audience rating monitoring method and device of network television live channel

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8346708B2 (en) * 2009-01-22 2013-01-01 Nec Laboratories America, Inc. Social network analysis with prior knowledge and non-negative tensor factorization
WO2012034606A2 (en) * 2010-09-15 2012-03-22 Telefonica, S.A. Multiverse recommendation method for context-aware collaborative filtering
CN104331411B (en) * 2014-09-19 2018-01-09 华为技术有限公司 The method and apparatus of recommended project

Also Published As

Publication number Publication date
CN106375796A (en) 2017-02-01

Similar Documents

Publication Publication Date Title
US11470403B2 (en) Methods and apparatus for determining audience metrics across different media platforms
CN103686237B (en) Recommend the method and system of video resource
CN102263999B (en) Face-recognition-based method and system for automatically classifying television programs
CN106992974B (en) Live video information monitoring method, device and equipment
CN103891299B (en) Method and system for providing efficient and accurate estimates of tv viewership ratings
CN104394436B (en) The monitoring method and device of the audience ratings of Internet TV live television channel
CN106375796B (en) Audience rating statistical method and system
CN1719909A (en) Method for measuring audio-video frequency content change
CN108449609B (en) Live broadcast room event identification method and device, electronic equipment and machine readable medium
CN101355686A (en) Method and system for statistic of audience rating
CN109525865B (en) Block chain-based audience rating monitoring method and computer-readable storage medium
CN104584571A (en) Generating a sequence of audio fingerprints at a set top box
CN103763585A (en) User characteristic information obtaining method and device and terminal device
CN104811810A (en) Real-time regional audience rating and audience share statistical system based on intelligent television and method thereof
CN105843876A (en) Multimedia resource quality assessment method and apparatus
CN103997662A (en) Program pushing method and system
CN105306972A (en) Television program recommending method and server
CN106534984B (en) Television program pushing method and device
CN104410873B (en) The detection method and device of television channel number of users
Menkovski et al. Optimized online learning for QoE prediction
CN111601167A (en) Method and platform for accurately positioning television program audience
US20230091980A1 (en) Analytics in video/audio content distribution networks
CN112804566A (en) Program recommendation method, device and computer readable storage medium
CN104410874A (en) A method, a device, and a system for detecting video viscosity information
CN111601168B (en) Television program market performance analysis method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant