CN114679600A

CN114679600A - Data processing method and device

Info

Publication number: CN114679600A
Application number: CN202210295349.7A
Authority: CN
Inventors: 孙袁袁
Original assignee: Shanghai Bilibili Technology Co Ltd
Current assignee: Shanghai Bilibili Technology Co Ltd
Priority date: 2022-03-24
Filing date: 2022-03-24
Publication date: 2022-06-28
Anticipated expiration: 2042-03-24
Also published as: WO2023179162A1; CN114679600B

Abstract

Embodiments of the application provide a data processing method and device. The data processing method is applied to a heartbeat server and comprises: receiving historical heartbeat data, wherein the historical heartbeat data comprises a user identifier of a user to be verified; acquiring a user portrait and live broadcast behavior data of the user to be verified based on the user identifier; determining a first brushing score of the user to be verified according to the user portrait; determining a second brushing score of the user to be verified according to the live broadcast behavior data; and determining a brushing verification result of the user to be verified according to the first brushing score and/or the second brushing score.

Description

Data processing method and device

Technical Field

The embodiment of the application relates to the technical field of computers, in particular to a data processing method. One or more embodiments of the present application also relate to a data processing apparatus, a computing device, and a computer-readable storage medium.

Background

With the progress of network communication technology and the increasing speed of broadband networks, live broadcasting has been increasingly developed and applied. In existing live broadcast systems, popularity is an important index for ranking the rooms of a live broadcast platform: generally, the higher the popularity, the higher the ranking, and the more likely the anchor is to be watched by users. The real-time number of viewers in a live broadcast room is a key input to the popularity calculation, so some anchors, in order to raise their popularity, simulate viewing of the live broadcast room through illegal means and forge the number of online viewers, that is, they improve their popularity ranking by brushing. Therefore, accurately judging whether brushing exists in a live broadcast room is an important means of maintaining the ecological stability of the live broadcast platform.

Currently, the number of viewers in a live broadcast room is determined according to the number of connections provided by a content distribution network or according to heartbeat data of the live broadcast platform, and whether that number is abnormal is judged by whether it is too high or whether users have filed reports. However, this way of determining whether a live broadcast room is being brushed is too one-dimensional: it is limited to the number of connections provided by the content distribution network or the heartbeat data of the live broadcast platform. As a result, the accuracy of the determination is low, and the effect on maintaining the ecological stability of the live broadcast platform is small.

Disclosure of Invention

In view of this, the present application provides a data processing method. One or more embodiments of the present application also relate to a data processing apparatus, a computing device, and a computer-readable storage medium, so as to solve the technical defect in the prior art that the accuracy of determining whether a live broadcast room is being brushed is low.

According to a first aspect of the embodiments of the present application, there is provided a data processing method applied to a heartbeat server, including:

receiving historical heartbeat data, wherein the historical heartbeat data comprises a user identifier of a user to be verified;

acquiring user portrait and live broadcast behavior data of the user to be verified based on the user identification;

determining a first brushing score of the user to be verified according to the user portrait, and determining a second brushing score of the user to be verified according to the live broadcast behavior data;

and determining a brushing verification result of the user to be verified according to the first brushing score and the second brushing score.

According to a second aspect of the embodiments of the present application, there is provided a data processing apparatus applied to a heartbeat server, including:

the receiving module is configured to receive historical heartbeat data, wherein the historical heartbeat data comprises a user identifier of a user to be verified;

the acquisition module is configured to acquire a user portrait and live broadcast behavior data of the user to be verified based on the user identification;

the first determining module is configured to determine a first brushing score of the user to be verified according to the user portrait and determine a second brushing score of the user to be verified according to the live broadcast behavior data;

the second determining module is configured to determine a brushing verification result of the user to be verified according to the first brushing score and the second brushing score.

According to a third aspect of embodiments herein, there is provided a computing device comprising:

a memory and a processor;

the memory is configured to store computer-executable instructions and the processor is configured to execute the computer-executable instructions, wherein the processor implements the steps of the data processing method when executing the computer-executable instructions.

According to a fourth aspect of embodiments herein, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the data processing method.

An embodiment of the application implements a data processing method and device. The data processing method is applied to a heartbeat server: historical heartbeat data is received; a user portrait and live broadcast behavior data of a user to be verified are obtained based on the user identifier contained in the historical heartbeat data; a first brushing score of the user to be verified is determined according to the user portrait; a second brushing score of the user to be verified is determined according to the live broadcast behavior data; and a brushing verification result of the user to be verified is determined according to the first brushing score and the second brushing score.

By combining the user portrait of the user to be verified with the live broadcast behavior data generated while watching live broadcasts, the application comprehensively judges the brushing verification result of the user to be verified, so that whether the user to be verified is a brushing user is determined more accurately. This improves the accuracy of the judgment and helps maintain the ecological stability of the live broadcast platform.

Drawings

FIG. 1 is a system architecture diagram of a data processing process provided by one embodiment of the present application;

FIG. 2 is a flow chart of a data processing method provided by an embodiment of the present application;

FIG. 3 is a flowchart illustrating a processing procedure of applying a data processing method to an anti-brushing scene in the live broadcast field according to an embodiment of the present application;

FIG. 4 is a block diagram of a data processing apparatus according to an embodiment of the present application;

FIG. 5 is a block diagram of a computing device according to an embodiment of the present application.

Detailed Description

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, this application can be implemented in many ways other than those described herein, and those skilled in the art can make similar extensions without departing from the spirit of this application; the application is therefore not limited to the specific implementations disclosed below.

The terminology used in the one or more embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the present application. As used in one or more embodiments of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present application refers to and encompasses any and all possible combinations of one or more of the associated listed items.

It will be understood that, although the terms first, second, etc. may be used in one or more embodiments of the present application to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first aspect may be termed a second aspect, and, similarly, a second aspect may be termed a first aspect, without departing from the scope of one or more embodiments of the present application. The word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination", depending on the context.

First, the noun terms to which one or more embodiments of the present application relate are explained.

Live streaming: live audiovisual data transmission that can be transmitted as a steady and continuous stream over a network for viewing by an audience.

Live broadcast popularity: a composite value calculated from the number of online viewers, the number of bullet-screen comments, the number of gifts, and the like, combined in certain proportions; the live broadcast platform ranks rooms according to popularity.

Number of live broadcast viewers: the real number of people watching a live broadcast room in real time.

Brushing: creating false viewing counts by simulating normal user access.

Anti-brushing: identifying illegal access requests by technical means and rejecting them.

Brushing user: a user who performs illegal brushing.

User portrait: a labeled user model abstracted from information such as the user's social attributes, living habits, and consumption behaviors.

UID: a unique identifier of the user account.

In the present application, a data processing method is provided. One or more embodiments of the present application relate to a data processing apparatus, a computing device, and a computer-readable storage medium, which are described in detail in the following embodiments one by one.

Referring to fig. 1, fig. 1 illustrates a system architecture diagram of a data processing process provided according to an embodiment of the present application.

In current live broadcast systems, popularity is an important index for ranking the live broadcast rooms of a live broadcast platform. In general, the higher the popularity of a live broadcast room, the higher it is ranked and the more likely the anchor in that room is to be watched by users. The basic data used in determining popularity is the number of viewers in the live broadcast room. Accurately judging whether brushing exists among the viewing users of a live broadcast room, and applying certain punishment measures to rooms where brushing occurs, is therefore of great significance for maintaining the ecology of the live broadcast platform.

At present, whether brushing exists in a live broadcast room is mostly judged from the heartbeat data of viewing users. Heartbeat data is data sent between a client and a heartbeat server at regular intervals; because it resembles a heartbeat, it is called heartbeat data.

The heartbeat server is a server that receives heartbeat data sent by clients; in a live broadcast scenario it can count the current number of viewers in a live broadcast room based on the received heartbeat data. Heartbeat data may be generated as follows: the client receives a live broadcast room viewing request from the user; the client generates a play address acquisition request based on the viewing request and sends it to a scheduling server, which returns a play address to the client; the client then periodically generates heartbeat data based on the play address and reports it to the heartbeat server. From the periodically received heartbeat data, the heartbeat server can determine whether a transmission connection is established between the client and the server, for example whether the client is playing the content of the target live broadcast room, and can count the number of users currently watching the target live broadcast room.

As can be seen, in the current calculation of the number of live broadcast viewers, the player (client) generally reports heartbeat data to the heartbeat server at regular intervals to indicate that someone is currently watching on that client. The reported heartbeat data generally includes: the room number of the live broadcast room + the play address + the UID + the user IP + the device fingerprint + the live broadcast partition. After receiving the heartbeat data, the heartbeat server aggregates the heartbeats of each live broadcast room and determines how many viewing users reported heartbeat data in the current time period.
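The aggregation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's implementation: the `Heartbeat` record, its field names, and the `count_viewers` helper are hypothetical, and counting distinct UIDs per room within a time window is only one assumed way of summarising heartbeats.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Heartbeat:
    # Field names are illustrative; the text lists these fields only by description.
    room_id: str
    play_url: str
    uid: str
    ip: str
    device_fingerprint: str
    partition: str
    timestamp: float  # seconds; assumed to be stamped when the server receives it


def count_viewers(heartbeats, window_start, window_end):
    """Count distinct UIDs per room among heartbeats in [window_start, window_end)."""
    viewers = defaultdict(set)
    for hb in heartbeats:
        if window_start <= hb.timestamp < window_end:
            viewers[hb.room_id].add(hb.uid)
    return {room: len(uids) for room, uids in viewers.items()}
```

Repeated heartbeats from the same UID within the window count once, which is why a set is used per room.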

However, besides heartbeat data generated in the above way, some illegal users, in order to increase the number of viewers in a live broadcast room and push the room up the ranking, intercept the play address returned to the client by the scheduling server and generate fake heartbeat data based on it; that is, they inflate the number of viewers by various brushing means. Inflating the viewer count in this way harms the ecology of the live broadcast platform: the platform cannot determine the real number of viewers in a live broadcast room, genuinely high-quality live broadcast rooms cannot be recommended to users, and the viewing experience of users suffers.

To address these problems, many live broadcast platforms add a UID (user identification) field to the heartbeat data that the client reports to the heartbeat server, and judge whether a heartbeat comes from a logged-in user by checking whether the UID field is present. If users who are not logged in account for a high proportion of the total number of viewing users in a live broadcast room, it is determined that brushing may exist in that room. However, this method is still easily defeated: an illegal user can randomly generate UIDs and then generate heartbeat data based on them. If the authenticity of the UID in every heartbeat were verified, the server would face high query pressure when the number of viewers is large. Moreover, an illegal user can crack the UIDs of real users and generate heartbeat data for viewing that no real user actually performed.
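The UID-presence check described at the start of this paragraph might look like the sketch below. The function names and the `threshold` value are assumptions made for illustration; the text only says that the proportion of users who are not logged in is "high".

```python
def anonymous_ratio(uids):
    """Share of heartbeats whose UID field is missing or empty."""
    if not uids:
        return 0.0
    missing = sum(1 for uid in uids if not uid)
    return missing / len(uids)


def may_be_brushed(uids, threshold=0.8):
    # threshold is an assumed value, not one given in the text.
    return anonymous_ratio(uids) > threshold
```

As the paragraph notes, this check alone is weak: randomly generated UIDs pass it, because it tests only whether the field is present.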

Therefore, in the data processing method provided by the embodiment of the application, the heartbeat server receives historical heartbeat data, wherein the historical heartbeat data comprises a user identifier of a user to be verified; a user portrait and live broadcast behavior data of the user to be verified are obtained based on the user identifier; a first brushing score of the user to be verified is determined according to the user portrait; a second brushing score of the user to be verified is determined according to the live broadcast behavior data; and a brushing verification result of the user to be verified is determined according to the first brushing score and the second brushing score.

The user portrait includes but is not limited to the user level, the real-name authentication result, the mobile phone number bound to the account, the device fingerprint, and the like; the live broadcast behavior data includes but is not limited to viewing category concentration, live entry concentration, bullet-screen comment/gift-giving/comment/lottery operations, the number of views from a single IP, the regional distribution of viewing IPs, viewing duration statistics, and the like.

In this way, the brushing verification result of the user to be verified is judged comprehensively by combining the user portrait of the user to be verified with the live broadcast behavior data generated while watching live broadcasts, so that whether the user to be verified is a brushing user is determined more accurately. This helps improve the accuracy of the judgment and helps maintain the ecological stability of the live broadcast platform.

Referring to fig. 2, fig. 2 is a flowchart illustrating a data processing method according to an embodiment of the present application, including the following steps:

Step 202, receiving historical heartbeat data, wherein the historical heartbeat data includes a user identifier of a user to be verified.

Specifically, in the operation process of the live broadcast platform, in order to recommend a high-quality live broadcast room to a user and improve the viewing experience of the user, the live broadcast rooms are generally ranked based on the number of people watching the live broadcast rooms; the method for counting the number of people watching in the live broadcast room generally comprises the following steps: the client regularly reports heartbeat data to the heartbeat server, and the reported heartbeat data generally comprises a live broadcast room identifier of a live broadcast room watched by a user, a playing address of the live broadcast room and a user identifier; after the heartbeat server receives the heartbeat data, the heartbeat data can be summarized based on the live broadcast room identification, the number of watching persons in each current live broadcast room is calculated, and then the live broadcast rooms can be sequenced based on the number of watching persons.

Heartbeat data is data exchanged periodically between the client and the heartbeat server to inform the other side of the current device state; by periodically sending heartbeat data to the heartbeat server, it can be determined that a transmission link remains established between the client and the heartbeat server. In the embodiment of the application, heartbeat data can be used to count the audience of a target live broadcast room: after the heartbeat server receives heartbeat data, it can count the current number of viewers in the live broadcast room based on the live broadcast room identifier in the heartbeat data.

In practical application, the heartbeat data may be generated when a real user watches a live broadcast room, that is, heartbeat data reported periodically by a target client to the heartbeat server. For example, viewer A watches a live broadcast on client A; the client obtains the play address of the target live broadcast room based on the user's viewing request and periodically reports heartbeat data to the heartbeat server according to the play address, so that the heartbeat server can determine that viewer A is currently watching the target live broadcast room.

Alternatively, the heartbeat data may be generated by an illegal user based on a real play address or in other ways. Specifically, the illegal user intercepts the play address returned to a client and generates simulated heartbeat data based on it. For example, the illegal user intercepts the play address that the scheduling server returns to client A, on which viewer A is watching the target live broadcast room, and generates simulated heartbeat data from that address, so that the viewer count later computed from the heartbeat data is not genuine. To prevent this, the authenticity of the user needs to be verified, thereby implementing the anti-brushing function.

Therefore, in the embodiment of the application, the heartbeat server can receive historical heartbeat data, that is, heartbeat data received by the heartbeat server within a certain period before the current time. The historical heartbeat data includes a user identifier of the user to be verified, which may specifically be a user identity field such as a user ID or a user number. The user portrait and live broadcast behavior data of the user to be verified can then be obtained according to the user identifier, and the authenticity of the user to be verified can be comprehensively verified in advance according to them to generate a corresponding brushing verification result. In an actual live broadcast scenario, the number of viewers in a live broadcast room can then be counted according to the pre-generated brushing verification results, which ensures the accuracy of the count.

Step 204, acquiring the user portrait and the live broadcast behavior data of the user to be verified based on the user identifier.

Specifically, the user portrait is the portrait data of the user to be verified, namely the basic attribute data of the user to be verified, including but not limited to the user level, the real-name authentication result, the mobile phone number bound to the account, the device fingerprint, and the like. The live broadcast behavior data is the behavior data generated by the user to be verified while watching live broadcasts, including but not limited to viewing category concentration, live entry concentration, bullet-screen comment/gift-giving/comment/lottery operations, the number of views from a single IP, the regional distribution of viewing IPs, viewing duration statistics, and the like.

The user level may be the result of the live broadcast platform grading users according to their activity. For example, a user who logs in to the live broadcast platform every day has a higher level than a user who only registered an account and has not logged in for a long time. The higher the user level, the higher the credibility of the user and the lower the likelihood that the user is brushing.

As for the real-name authentication result, live broadcast platforms currently generally require users to perform real-name authentication, and many platform operations are bound to the real name; for example, a user who wants to become an anchor must complete real-name authentication. Moreover, even if a brushing user holds multiple accounts, a real identity can be used for authentication only once, so the other accounts remain without real-name authentication. A user who has passed real-name authentication is therefore more credible than a user who has not performed it or has not passed it.

Regarding the mobile phone number bound to the account, the user usually needs to bind the mobile phone number when applying for the account, and particularly, the platform requires to verify the mobile phone number when the account is in a dangerous state. Therefore, the credibility of the user with the account number bound with the mobile phone number is higher than that of the user without the mobile phone number bound with the account number.

Regarding the device fingerprint, the live broadcast platform can generate a unique device number from the device information of the user's viewing device and store it. If a device fingerprint reported by a user appears repeatedly, multiple users are sharing the same viewing device, and the probability of brushing is high.
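A minimal sketch of detecting repeated device fingerprints, assuming reports arrive as `(uid, fingerprint)` pairs. The `shared_fingerprints` helper and the `max_users` cut-off are hypothetical and are not taken from the application.

```python
from collections import defaultdict


def shared_fingerprints(reports, max_users=1):
    """Return fingerprints reported by more than max_users distinct UIDs.

    reports: iterable of (uid, fingerprint) pairs.
    """
    users = defaultdict(set)
    for uid, fingerprint in reports:
        users[fingerprint].add(uid)
    return {fp: sorted(uids) for fp, uids in users.items() if len(uids) > max_users}
```

A fingerprint that maps to several UIDs suggests that multiple accounts are multiplexing one viewing device.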

Regarding viewing category concentration, the live broadcast platform as a whole includes live broadcast rooms of many types, such as game, movie, and e-commerce rooms. The types of live broadcast rooms that a normal user watches are usually concentrated on a few fixed types. A brushing user, by contrast, accepts brushing requests from various anchors whose rooms are of different types, so the same user, that is, the same UID, may be seen watching dozens of types of live broadcast rooms or even more.
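One way to quantify this is to count the distinct categories each UID has watched, as in the hedged sketch below. The helper names and the `max_categories` cut-off are assumptions; the text gives "dozens" only as an informal indication.

```python
def distinct_categories(watch_log):
    """Map each UID to the number of distinct live categories it watched.

    watch_log: iterable of (uid, category) pairs.
    """
    seen = {}
    for uid, category in watch_log:
        seen.setdefault(uid, set()).add(category)
    return {uid: len(categories) for uid, categories in seen.items()}


def scattered_uids(watch_log, max_categories=10):
    # max_categories is an assumed cut-off, not a value from the text.
    counts = distinct_categories(watch_log)
    return sorted(uid for uid, n in counts.items() if n > max_categories)
```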

Regarding live entry concentration, a live entry is the entry point through which a user enters a live broadcast room. There are typically several live entries:

(1) the recommendation page, which recommends live broadcast rooms to the user; the rooms displayed on it are those that the user may be interested in, computed from information such as the popularity of rooms across the whole platform and the user's viewing history;

(2) the category ranking list: an anchor selects a live broadcast type when starting a broadcast, rooms are ranked by popularity within each type, and the user can freely pick a room under any type from the category ranking list;

(3) the follow page: a user who is interested in an anchor can click follow, and can then enter the followed anchor's live broadcast room directly from the follow page;

(4) the search page, where the user can search for a live broadcast room directly and then click a search result to enter it;

(5) direct entry, that is, clicking the address of a live broadcast room to enter it directly.

Regarding bullet-screen comment/gift-giving/comment/lottery operations, a normal user watching a live broadcast produces behaviors such as sending bullet-screen comments, giving gifts, posting comments, and drawing lotteries. A brushing user merely watches in order to generate heartbeat reports and thereby raise the viewer count, and has no need to take part in these behaviors. Even if a brushing user wanted to imitate them, doing so would be difficult because the operations are costly and complex.

Regarding the number of views from a single IP, the number of live broadcast rooms viewed from the same user IP should be limited for a single account; if a single user IP views more than a normal number of live broadcast rooms at the same time, brushing is indicated. For example, a user with IP X watches live broadcast rooms 10 times at the same time (rooms are not de-duplicated, so opening 10 windows on the same live broadcast room also counts as 10 views), which is far beyond the frequency of a normal user.
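The simultaneous-view count per IP can be sketched with a simple sweep over session start and end times. The session tuple layout `(ip, start, end)` and the `peak_concurrent_views` name are assumptions for illustration; sessions are deliberately not de-duplicated by room, matching the example above.

```python
def peak_concurrent_views(sessions):
    """Return the maximum number of simultaneous viewing sessions per IP.

    sessions: iterable of (ip, start, end) with numeric times; one entry per window.
    """
    events = {}
    for ip, start, end in sessions:
        events.setdefault(ip, []).extend([(start, 1), (end, -1)])
    peaks = {}
    for ip, evs in events.items():
        # At equal times (t, -1) sorts before (t, 1), so back-to-back sessions do not overlap.
        evs.sort()
        current = peak = 0
        for _, delta in evs:
            current += delta
            peak = max(peak, current)
        peaks[ip] = peak
    return peaks
```

An IP whose peak is far above what a normal viewer would open, such as the 10 windows in the example, would then be a candidate for further checks.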

Regarding the regional distribution of viewing IPs, a normal user watches live broadcasts from a limited set of IP addresses. A brushing user, however, views multiple live broadcast rooms with one account and, to circumvent single-IP limits, purchases multiple servers, each with a different egress IP. Therefore, whether brushing exists can be determined by locating the address positions of the IPs from which a single account views. The address position is located by GPS to province-city-operator-cell granularity, for example Jiangsu-Suzhou-Telecom-XX cell.

Regarding viewing duration statistics, the time a single UID spends viewing live broadcast rooms can be counted. A brushing user watches multiple live broadcast rooms simultaneously, and the viewing durations of the different rooms accumulate, so it can be judged whether the total viewing duration is consistent with the behavior of a normal UID. For example, a UID that views 10 hours in a day is considered normal. A brushing user, however, enters many live broadcast rooms and the durations add up: watching 10 rooms for only 3 hours each already gives 30 hours, yet a day has only 24 hours, so the probability that this UID is a brushing user is relatively high.
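The arithmetic in this example can be sketched as follows, assuming one day's viewing records are available as `(uid, room_id, hours)` tuples. The helper names are hypothetical; the 24-hour limit follows the reasoning in the text.

```python
def total_watch_hours(records):
    """Sum one day's viewing hours per UID across all rooms.

    records: iterable of (uid, room_id, hours) for a single day.
    """
    totals = {}
    for uid, _room_id, hours in records:
        totals[uid] = totals.get(uid, 0.0) + hours
    return totals


def exceeds_day(records, limit_hours=24.0):
    """Return UIDs whose accumulated viewing time exceeds the hours in a day."""
    totals = total_watch_hours(records)
    return sorted(uid for uid, hours in totals.items() if hours > limit_hours)
```

With the figures from the text, a UID that watches 10 rooms for 3 hours each accumulates 30 hours and is flagged, while a UID with 10 hours in total is not.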

After the user portrait and the live broadcast behavior data of the user to be verified are obtained, whether the user to be verified is a brushing user or not can be further determined according to the user portrait and the live broadcast behavior data.

Step 206, determining a first brushing score of the user to be verified according to the user portrait, and determining a second brushing score of the user to be verified according to the live broadcast behavior data.

Specifically, the first brushing score is determined according to the user portrait of the user to be verified, and the second brushing score is determined according to the live broadcast behavior data of the user to be verified.

In the embodiment of the application, after the user portrait and the live broadcast behavior data of the user to be verified are obtained, the first brushing score of the user to be verified can be determined according to the user portrait, the second brushing score can be determined according to the live broadcast behavior data, and whether the user to be verified is a brushing user is comprehensively determined according to the first brushing score and the second brushing score.

In specific implementation, determining the first brushing score of the user to be verified according to the user portrait includes:

determining at least one first brushing sub-score of the user to be verified according to at least one item of user attribute data of the user to be verified, wherein the user attribute data is contained in the user portrait;

and determining the first brushing score of the user to be verified according to the at least one first brushing sub-score.

Further, determining the first brushing score of the user to be verified according to the at least one first brushing sub-score includes:

adding the at least one first brushing sub-score to generate the first brushing score of the user to be verified.

Specifically, the user portrait of the user to be verified includes but is not limited to user attribute data such as the user level, the real-name authentication result, the mobile phone number bound to the account, and the device fingerprint, so that a first brushing sub-score of the user to be verified can be determined for each item of user attribute data, and the first brushing score of the user to be verified can then be comprehensively determined from the multiple first brushing sub-scores.

For the user attribute of user grade, users to be verified are classified by grade: the higher the user grade, the higher the credibility of the user to be verified. Assuming that there are N1 user grades and that the credibility of the user to be verified increases by m1 points for each increment of the user grade, the credibility of the user under this attribute is: UID credibility 1 = N1 × m1.

For the user attribute of real-name authentication result, if real-name authentication is passed, the credibility of the user to be verified is a1 points; if real-name authentication has not been performed or has not been passed, the credibility of the user to be verified is a2 points. Under this attribute, the credibility of the user is therefore: UID credibility 2 = a1 or a2.

For the user attribute of the mobile phone number bound to the account, if the account is bound to a mobile phone number, the credibility of the user to be verified is b1 points; if the account is not bound to a mobile phone number, the credibility of the user to be verified is b2 points. Under this attribute, the credibility of the user is therefore: UID credibility 3 = b1 or b2.

For the user attribute of device fingerprint, let the preset number of device fingerprints be X1, and let the number of device fingerprints of different devices used by the same UID in a historical time interval be Y1. If Y1 is less than X1, UID credibility 4 = 100; otherwise, UID credibility 4 = 100 - (Y1 - X1) × c1, where c1 is a preset parameter.
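The four user-attribute credibilities can be written out as small functions. The following Python sketch simply transcribes the formulas above; the function names are hypothetical, and the numeric values used in the example calls are illustrative only, since the application does not specify m1, a1, a2, b1, b2, X1 or c1.

```python
def grade_credibility(n1: int, m1: float) -> float:
    """UID credibility 1 = N1 x m1."""
    return n1 * m1


def real_name_credibility(passed: bool, a1: float, a2: float) -> float:
    """UID credibility 2: a1 if real-name authentication passed, else a2."""
    return a1 if passed else a2


def phone_credibility(bound: bool, b1: float, b2: float) -> float:
    """UID credibility 3: b1 if a phone number is bound, else b2."""
    return b1 if bound else b2


def fingerprint_credibility(y1: int, x1: int, c1: float) -> float:
    """UID credibility 4: 100 below the preset count, penalised above it."""
    if y1 < x1:
        return 100.0
    return 100.0 - (y1 - x1) * c1


# Illustrative values only (not taken from the application).
cred_1 = grade_credibility(n1=6, m1=5.0)
cred_2 = real_name_credibility(passed=True, a1=20.0, a2=0.0)
cred_3 = phone_credibility(bound=False, b1=10.0, b2=0.0)
cred_4 = fingerprint_credibility(y1=5, x1=3, c1=10.0)
```

The device-fingerprint function shows the threshold-and-penalty shape that recurs for several live behavior sub-data items later in the description.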

In practical application, the additive inverse (negative) of the UID credibility corresponding to each user attribute can be used as a first brushing sub-score of the user to be verified. Therefore, after at least one UID credibility of the user to be verified is calculated, the negatives of the UID credibility values can be added to generate the first brushing score of the user to be verified.

Alternatively, different weights may be set for different user attributes, so that after the first brushing sub-scores corresponding to the different user attributes are calculated, they are weighted and summed based on the weights of the corresponding user attributes to generate the first brushing score of the user to be verified.
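Both aggregation variants can be illustrated with one function. This is a sketch under a stated assumption: each sub-score is taken as the negative of the corresponding UID credibility, so a lower credibility yields a higher brushing score. The function name, the attribute keys and the weight values are hypothetical.

```python
from typing import Mapping, Optional


def brushing_score(credibilities: Mapping[str, float],
                   weights: Optional[Mapping[str, float]] = None) -> float:
    """Sum the negated credibilities, optionally weighting each attribute."""
    total = 0.0
    for name, credibility in credibilities.items():
        sub_score = -credibility  # assumed reading: sub-score = -credibility
        weight = 1.0 if weights is None else weights[name]
        total += weight * sub_score
    return total


# Illustrative credibilities for the four user attributes.
creds = {"grade": 30.0, "real_name": 20.0, "phone": 10.0, "fingerprint": 80.0}
plain = brushing_score(creds)
weighted = brushing_score(
    creds,
    {"grade": 1.0, "real_name": 1.0, "phone": 1.0, "fingerprint": 2.0},
)
```

Under this reading the scores are negative numbers, so any preset brushing thresholds would have to be chosen on the same scale.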

In specific implementation, determining the second brushing score of the user to be verified according to the live broadcast behavior data includes:

determining at least one second brushing sub-score of the user to be verified according to at least one item of live behavior sub-data of the user to be verified, wherein the live behavior sub-data is contained in the live broadcast behavior data;

and determining the second brushing score of the user to be verified according to the second brushing sub-score.

Further, determining the second brushing score of the user to be verified according to the second brushing sub-score includes:

adding the at least one second brushing sub-score to generate the second brushing score of the user to be verified.

Specifically, the live broadcast behavior data of the user to be verified includes, but is not limited to, live behavior sub-data such as live broadcast classification concentration, live broadcast entrance concentration, barrage/gift-giving/comment/lottery operations, single-IP viewing count, viewing IP region distribution, and viewing duration statistics. A second brushing sub-score of the user to be verified can therefore be determined from each item of live behavior sub-data, and the second brushing score of the user to be verified can then be determined comprehensively from the plurality of second brushing sub-scores.

For the live behavior sub-data of live broadcast classification concentration, the heartbeat server can collect the heartbeat data reported by the user to be verified within a preset time interval and obtain the live broadcast type of each live room watched by the user to be verified. Let the preset threshold for the number of live broadcast types be X2, and let Y2 be the number of distinct live broadcast types across the live rooms watched by the user to be verified within the preset time interval. If X2 is less than Y2, UID credibility 5 = 100; otherwise, UID credibility 5 = 100 - (Y2 - X2) × c2, where c2 is a preset parameter.

For the live behavior sub-data of live broadcast entrance concentration, the heartbeat server can collect the heartbeat data reported by the user to be verified within a preset time interval and obtain the entry mode used when the user to be verified entered each live room. Assuming that the user to be verified watched X3 live rooms within the preset time interval and entered Y3 of them by direct entry, UID credibility 6 = 100 - (Y3/X3) × c3, where c3 is a preset parameter.
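The entrance-concentration formula can be sketched as follows. The Python below assumes the entry mode of each watched room is available as a string and that direct entry is labelled `"direct"`; both the label and the choice to raise an error when no room was watched are assumptions, as the application does not address that case.

```python
from typing import Sequence


def entrance_credibility(entry_modes: Sequence[str], c3: float,
                         direct_mode: str = "direct") -> float:
    """UID credibility 6 = 100 - (Y3 / X3) x c3."""
    x3 = len(entry_modes)  # number of live rooms watched in the interval
    if x3 == 0:
        raise ValueError("no live rooms watched in the interval")
    y3 = sum(1 for mode in entry_modes if mode == direct_mode)
    return 100.0 - (y3 / x3) * c3


# Two of four rooms entered directly; c3 = 50 is an illustrative value.
modes = ["direct", "direct", "recommend", "search"]
cred_6 = entrance_credibility(modes, c3=50.0)
```

The larger the share of rooms entered directly, the larger the deduction from 100.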

For the live behavior sub-data of barrage/gift-giving/comment/lottery operations, the heartbeat server can query whether the user to be verified performed such operations within a preset time interval. The more often they occur, the higher the credibility of the user to be verified: each occurrence increases the credibility by m2 points, so if the user to be verified performed such operations N2 times within the preset time interval, UID credibility 7 = N2 × m2.

For the live behavior sub-data of single-IP viewing count, the heartbeat server can determine the number of live rooms that the user to be verified watched through a single IP within the preset time interval. Let the preset number of live rooms watched within the preset time interval be X4, and let the number of live rooms actually watched by the user to be verified be Y4. If Y4 is less than X4, UID credibility 8 = 100; otherwise, UID credibility 8 = 100 - (Y4 - X4) × c4, where c4 is a preset parameter.

For the live behavior sub-data of viewing IP region distribution, the heartbeat server can parse the IPs reported by the user to be verified within the preset time interval to determine the locations of those IPs. Let the preset number of different IPs used to watch live broadcasts within the preset time interval be X5, and let the number of different IPs actually used by the user to be verified to watch live broadcasts be Y5. If Y5 is less than X5, UID credibility 9 = 100; otherwise, UID credibility 9 = 100 - (Y5 - X5) × c5, where c5 is a preset parameter.

For the live behavior sub-data of viewing duration statistics, the user to be verified reports heartbeat data at regular intervals, so the length of time for which a single UID continuously reports heartbeat data in one live room can be determined, and the live viewing duration of the user to be verified is derived from it. The overall live viewing duration of the user to be verified across the different live rooms watched within a preset time interval can then be calculated. Let the preset live viewing duration within the preset time interval be X6, and let the actual live viewing duration of the user to be verified within the preset time interval be Y6. If Y6 is less than X6, UID credibility 10 = 100; otherwise, UID credibility 10 = 100 - (Y6 - X6) × c6, where c6 is a preset parameter.
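One possible way to derive the viewing duration from heartbeats is sketched below. It assumes that each heartbeat is a `(uid, room_id)` pair and that every heartbeat stands for one reporting interval of viewing; this counting rule, the function names and the sample values of X6 and c6 are assumptions for illustration, not details given in the application.

```python
from typing import Dict, Iterable, Tuple


def viewing_seconds_by_room(heartbeats: Iterable[Tuple[str, str]],
                            uid: str, interval_s: int) -> Dict[str, int]:
    """Accumulate one reporting interval per heartbeat, per live room."""
    seconds: Dict[str, int] = {}
    for beat_uid, room_id in heartbeats:
        if beat_uid == uid:
            seconds[room_id] = seconds.get(room_id, 0) + interval_s
    return seconds


def duration_credibility(y6: float, x6: float, c6: float) -> float:
    """UID credibility 10: 100 below the preset duration, penalised above."""
    if y6 < x6:
        return 100.0
    return 100.0 - (y6 - x6) * c6


beats = [("u1", "r1")] * 3 + [("u1", "r2")] * 2 + [("u2", "r1")]
per_room = viewing_seconds_by_room(beats, "u1", interval_s=60)
total = sum(per_room.values())  # overall duration across all rooms
cred_10 = duration_credibility(total, x6=240, c6=0.5)
```

Because the durations of different rooms are summed, a UID that watches many rooms in parallel can exceed X6 even when each individual room is watched only briefly.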

In practical application, the additive inverse (negative) of the UID credibility corresponding to each item of live behavior sub-data can be used as a second brushing sub-score of the user to be verified. Therefore, after at least one UID credibility of the user to be verified is calculated, the negatives of the UID credibility values can be added to generate the second brushing score of the user to be verified.

Alternatively, different weights may be set for different live behavior sub-data, so that after the second brushing sub-scores corresponding to the different live behavior sub-data are calculated, they are weighted and summed based on the weights of the corresponding live behavior sub-data to generate the second brushing score of the user to be verified.

Step 208: determine a brushing verification result of the user to be verified according to the first brushing score and the second brushing score.

Specifically, the brushing verification result indicates whether the user to be verified is a brushing user, that is, whether brushing exists for the user to be verified.

In practical applications, the brushing verification result may be either that the user is a brushing user (brushing exists) or that the user is not a brushing user (brushing does not exist).

After the first brushing score and the second brushing score of the user to be verified are determined, whether the user to be verified is a brushing user can be judged comprehensively based on both scores.

In specific implementation, determining the brushing verification result of the user to be verified according to the first brushing score and the second brushing score includes:

if the first brushing score is greater than or equal to a first preset brushing threshold, determining that the brushing verification result of the user to be verified is that brushing exists;

if the first brushing score is smaller than a second preset brushing threshold, determining that the brushing verification result of the user to be verified is that brushing does not exist;

and if the first brushing score is smaller than the first preset brushing threshold and greater than or equal to the second preset brushing threshold, determining the brushing verification result of the user to be verified according to the second brushing score.

Further, determining the brushing verification result of the user to be verified according to the second brushing score includes:

if the second brushing score is greater than or equal to the first preset brushing threshold, determining that the brushing verification result of the user to be verified is that brushing exists.

Specifically, after the first brushing score and the second brushing score of the user to be verified are determined, the first brushing score can be compared with the first preset brushing threshold. If the first brushing score is greater than or equal to the first preset brushing threshold, the user to be verified can be directly determined to be a brushing user; if the first brushing score is smaller than the second preset brushing threshold, the user to be verified can be directly determined not to be a brushing user.

If the first brushing score is smaller than the first preset brushing threshold and greater than or equal to the second preset brushing threshold, the user to be verified can be treated as a suspected brushing user. To ensure the accuracy of the brushing verification result, the result is then further verified using the second brushing score. Specifically, the second brushing score can be compared with the first preset brushing threshold, and the user to be verified is determined to be a brushing user when the second brushing score is greater than or equal to the first preset brushing threshold.
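The three-way decision can be expressed compactly. The sketch below follows the comparisons as described, including the comparison of the second brushing score against the first preset brushing threshold; the function name and the threshold values used in the example are hypothetical, and the second threshold is assumed not to exceed the first.

```python
def is_brushing_user(first_score: float, second_score: float,
                     first_threshold: float, second_threshold: float) -> bool:
    """Return True when the brushing verification result is 'brushing exists'."""
    if first_score >= first_threshold:
        return True   # directly determined to be a brushing user
    if first_score < second_threshold:
        return False  # directly determined not to be a brushing user
    # Suspected brushing user: fall back to the second brushing score.
    return second_score >= first_threshold


# Illustrative thresholds only.
FIRST_T, SECOND_T = 80.0, 40.0
clear_brusher = is_brushing_user(90.0, 0.0, FIRST_T, SECOND_T)
clear_normal = is_brushing_user(30.0, 100.0, FIRST_T, SECOND_T)
suspect_confirmed = is_brushing_user(60.0, 85.0, FIRST_T, SECOND_T)
suspect_cleared = is_brushing_user(60.0, 70.0, FIRST_T, SECOND_T)
```

Note that the second brushing score only matters in the middle band; a user whose first brushing score falls below the second threshold is cleared regardless of live behavior.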

In addition, in the embodiment of the application, the authenticity of the user to be verified is comprehensively verified in advance according to the user portrait and the live broadcast behavior data, and a corresponding brushing verification result is generated. If the brushing verification result of the user to be verified is that brushing exists, the user identifier of the user to be verified is added to a brushing user list, which stores the user identifiers (UIDs) of brushing users. In an actual live broadcast scene, the number of viewers in a live room can then be counted against this pre-generated brushing user list, which helps ensure the accuracy of the statistical result.

In specific implementation, the process of counting the number of watching people in the live broadcasting room can be specifically realized by the following modes:

acquiring target heartbeat data, wherein the target heartbeat data comprises a live broadcast room identifier of a target live broadcast room and a user identifier of a live broadcast watching user;

and counting the number of online users in the target live broadcast room based on the user identification of the live broadcast watching user and the brushing user list.

Further, counting the number of online users in the target live broadcast room based on the user identifier of the live broadcast watching user and the brushing user list, including:

comparing the user identification of the live watching user with the user identification contained in the brushing amount user list;

and if the comparison finds no match, determining the live watching user as an online user of the target live broadcast room, so as to count the number of online users of the target live broadcast room.

Specifically, the target heartbeat data is heartbeat data reported in real time by a user while watching a live broadcast, and includes a room number, a play address, a UID, a user IP, a device fingerprint, a live broadcast type and the like. After receiving the heartbeat data, the heartbeat server can aggregate the number of heartbeats associated with the live room. During aggregation, whether a UID belongs to a brushing user can be determined from the UID contained in the heartbeat data and the brushing user list, so that the real number of viewers in the live room is determined.

Specifically, the UID contained in the heartbeat data can be compared with the UIDs contained in the brushing user list, and whether the user corresponding to the UID is a brushing user is determined from the comparison result. If the comparison finds a match, that is, the brushing user list contains the UID, the corresponding user is determined to be a brushing user and is not counted in the number of viewers of the live room; otherwise, the user is counted in the number of viewers.

In practical application, the process of determining the brushing verification result of the user to be verified from historical heartbeat data can be repeated at a certain period, so that the brushing user list is continuously updated and the accuracy of the statistical result for the number of viewers in the live room is maintained.
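The counting step can be sketched as a filter over incoming heartbeats. The Python below assumes each heartbeat is reduced to a `(room_id, uid)` pair and that a UID reporting several heartbeats in the same room is counted once; the deduplication rule and the function name are assumptions, as the application only states that UIDs on the brushing user list are excluded.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def count_online_users(heartbeats: Iterable[Tuple[str, str]],
                       brushing_uids: Set[str]) -> Dict[str, int]:
    """Count distinct non-brushing UIDs per live room."""
    viewers: Dict[str, Set[str]] = defaultdict(set)
    for room_id, uid in heartbeats:
        if uid not in brushing_uids:  # no match in the brushing user list
            viewers[room_id].add(uid)
    return {room_id: len(uids) for room_id, uids in viewers.items()}


beats = [("r1", "u1"), ("r1", "u1"), ("r1", "u2"),
         ("r1", "bot1"), ("r2", "bot1")]
counts = count_online_users(beats, brushing_uids={"bot1"})
```

A room watched only by listed UIDs does not appear in the result at all, which a caller can treat as zero online users.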

Referring to fig. 3, the data processing method provided in the embodiment of the present application is further described by taking an application of the data processing method in an anti-brush scene in a live broadcast field as an example. Fig. 3 shows a flow chart of a processing procedure of applying the data processing method provided by an embodiment of the present application to an anti-brushing scene in the live broadcast field, and specifically includes the following steps:

Step 302: receive historical heartbeat data, wherein the historical heartbeat data includes a user identifier of a user to be verified.

Step 304: acquire the user portrait and live broadcast behavior data of the user to be verified based on the user identifier.

Step 306: determine at least one first brushing sub-score of the user to be verified according to at least one item of user attribute data of the user to be verified contained in the user portrait.

Step 308: add the at least one first brushing sub-score to generate a first brushing score of the user to be verified.

Step 310: determine at least one second brushing sub-score of the user to be verified according to at least one item of live behavior sub-data of the user to be verified contained in the live broadcast behavior data.

Step 312: add the at least one second brushing sub-score to generate a second brushing score of the user to be verified.

Step 314: determine whether the first brushing score is greater than or equal to a first preset brushing threshold.

If not, go to step 316; if yes, go to step 320.

Step 316: determine whether the first brushing score is smaller than a second preset brushing threshold.

If not, go to step 318; if yes, no further processing is required.

Step 318: determine whether the second brushing score is greater than or equal to the first preset brushing threshold.

If yes, go to step 320; if not, no further processing is required.

Step 320: add the user identifier of the user to be verified to the brushing user list.

Corresponding to the above method embodiment, the present application further provides an embodiment of a data processing apparatus, and fig. 4 shows a schematic structural diagram of a data processing apparatus provided in an embodiment of the present application. As shown in fig. 4, the apparatus includes:

a receiving module 402, configured to receive historical heartbeat data, where the historical heartbeat data includes a user identifier of a user to be verified;

an obtaining module 404 configured to obtain the user portrait and the live broadcast behavior data of the user to be verified based on the user identifier;

a first determining module 406, configured to determine a first brushing score of the user to be verified according to the user portrait, and determine a second brushing score of the user to be verified according to the live broadcast behavior data;

a second determining module 408, configured to determine a brushing verification result of the user to be verified according to the first brushing score and the second brushing score.

Optionally, the first determining module 406 is further configured to:

Optionally, the second determining module 408 is further configured to:

Optionally, the data processing apparatus further includes an adding module configured to:

if the brushing verification result of the user to be verified is determined to be that brushing exists, add the user identifier of the user to be verified to a brushing user list.

Optionally, the data processing apparatus further includes a statistics module configured to:

Optionally, the statistics module is further configured to:

Optionally, the user portrait is basic attribute data of the user to be verified, and the basic attribute data includes a user grade, a real-name authentication result, a mobile phone number bound to the account, and a device fingerprint;

the live broadcast behavior data is behavior data generated while the user to be verified watches live broadcasts, and includes live broadcast classification concentration, live broadcast entrance concentration, barrage/gift-giving/comment/lottery operations, single-IP viewing count, viewing IP region distribution, and viewing duration statistics.

The above is a schematic configuration of a data processing apparatus of the present embodiment. It should be noted that the technical solution of the data processing apparatus and the technical solution of the data processing method belong to the same concept, and details that are not described in detail in the technical solution of the data processing apparatus can be referred to the description of the technical solution of the data processing method.

FIG. 5 illustrates a block diagram of a computing device 500 provided according to an embodiment of the present application. The components of the computing device 500 include, but are not limited to, a memory 510 and a processor 520. Processor 520 is coupled to memory 510 via bus 530, and database 550 is used to store data.

Computing device 500 also includes access device 540, access device 540 enabling computing device 500 to communicate via one or more networks 560. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. The access device 540 may include one or more of any type of network interface, e.g., a Network Interface Card (NIC), wired or wireless, such as an IEEE802.11 Wireless Local Area Network (WLAN) wireless interface, a worldwide interoperability for microwave access (Wi-MAX) interface, an ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a bluetooth interface, a Near Field Communication (NFC) interface, and so forth.

In one embodiment of the application, the above-described components of computing device 500 and other components not shown in FIG. 5 may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 5 is for purposes of example only and is not limiting as to the scope of the present application. Those skilled in the art may add or replace other components as desired.

Computing device 500 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smartphone), wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 500 may also be a mobile or stationary server.

The processor 520 is configured to execute computer-executable instructions, and the steps of the data processing method are implemented when the processor executes the computer-executable instructions.

The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the data processing method belong to the same concept, and details that are not described in detail in the technical solution of the computing device can be referred to the description of the technical solution of the data processing method.

An embodiment of the present application also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the data processing method.

The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the technical solution of the data processing method, and details that are not described in detail in the technical solution of the storage medium can be referred to the description of the technical solution of the data processing method.

The foregoing description of specific embodiments of the present application has been presented. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

The computer instructions comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.

It should be noted that, for the sake of simplicity, the foregoing method embodiments are described as a series of combinations of acts, but it should be understood by those skilled in the art that the present application is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that acts and modules referred to are not necessarily required to implement the embodiments of the application.

In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

The preferred embodiments of the present application disclosed above are intended only to aid in the explanation of the application. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the application. The application is limited only by the claims and their full scope and equivalents.

Claims

1. A data processing method is applied to a heartbeat server and comprises the following steps:

2. The data processing method of claim 1, wherein determining a first brush score for the user to be authenticated from the user representation comprises:

3. The data processing method of claim 2, wherein determining the first brushing score of the user to be verified from the first brushing sub-score comprises:

4. The data processing method of claim 1, wherein determining the second credit score of the user to be authenticated from the live behavior data comprises:

5. The data processing method of claim 4, wherein determining the second brushing score of the user to be verified from the second brushing sub-score comprises:

6. The data processing method of claim 1, wherein the determining the swipe validation result of the user to be validated according to the first swipe score and the second swipe score comprises:

7. The data processing method according to claim 6, wherein the determining the result of the swipe validation of the user to be validated according to the second swipe score comprises:

8. The data processing method of claim 1, further comprising:

9. The data processing method of claim 8, further comprising:

10. The data processing method of claim 9, wherein the counting the number of online users in the target live broadcast room based on the user identifier of the live viewing user and the brushing volume user list comprises:

11. The data processing method of claim 1, wherein the user representation is basic attribute data of the user to be verified, including a user level, a real-name authentication result, a mobile phone number bound to an account, and an equipment fingerprint;

12. A data processing device, applied to a heartbeat server, includes:

a first determination module configured to determine a first credit score of the user to be verified according to the user representation and determine a second credit score of the user to be verified according to the live broadcast behavior data;

13. A computing device, comprising:

a memory and a processor;

the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions, wherein the processor, when executing the computer-executable instructions, implements the steps of the data processing method according to any one of claims 1 to 11.

14. A computer-readable storage medium, characterized in that it stores computer instructions which, when executed by a processor, implement the steps of the data processing method of any one of claims 1 to 11.