CN109426700B

CN109426700B - Data processing method, data processing device, storage medium and electronic device

Info

Publication number: CN109426700B
Application number: CN201710754737.6A
Authority: CN
Inventors: 黄宙舒; 连博; 冯少伟; 张智; 万明月
Original assignee: Tencent Technology Beijing Co Ltd
Current assignee: Tencent Technology Beijing Co Ltd
Priority date: 2017-08-28
Filing date: 2017-08-28
Publication date: 2023-04-25
Anticipated expiration: 2037-08-28
Also published as: CN109426700A

Abstract

The invention discloses a data processing method, a data processing device, a storage medium and an electronic device. Wherein the method comprises the following steps: acquiring behavior data of a client; processing the behavior data into model data of a data processing model, wherein the data processing model is provided with the model data and is used for carrying out hierarchical processing on the security level of a first request, and the first request is used for a client to request to trigger a target event; grading the security level of the first request through a data processing model with model data to obtain a grading result of the first request, wherein the grading result is used for indicating the security level of the first request; and performing an operation corresponding to the grading result on the first request. The invention solves the technical problem of low data processing efficiency in the related technology.

Description

Data processing method, data processing device, storage medium and electronic device

Technical Field

The present invention relates to the field of computers, and in particular, to a data processing method, apparatus, storage medium, and electronic apparatus.

Background

Hotlinking is the act in which a provider's content is illegally obtained by a party other than the provider and supplied to users, harming the provider's interests. Multimedia hotlinking is increasingly severe: hotlinking parties crack synchronous authentication to obtain multimedia playback addresses, which damages the copyright of resource owners and increases their bandwidth costs.

Most existing multimedia anti-hotlinking schemes are synchronous: authentication is performed when a multimedia playback request is made. Specifically, the client encrypts the playback-related parameters with a secret key and sends them to the playback backend, which authenticates the request and blocks abnormal requests that fail authentication. Synchronous anti-hotlinking uses few resources and authenticates quickly. However, because the parameter-encryption algorithm resides on the client, it is easy to crack. Once it is cracked, normal users and hotlinking users can no longer be distinguished effectively, and authentication can only be strengthened by upgrading the client's algorithm, which is costly and slow to take effect, so data processing efficiency is low.

The prior art adds statistics on the number of user behaviors on top of synchronous anti-hotlinking. The backend checks the user's behaviors within a certain time interval: if the user has behaviors other than playback, the request is allowed to pass; if the user has no behaviors other than playback, the request is not allowed to pass and is blocked. However, a hotlinker can pass this anti-hotlinking system merely by sending a single behavior request (for example, an advertisement report) while playing the hotlinked content.

If the decision is based on the number of user behaviors rather than on whether such behaviors exist, the threshold on that number cannot be chosen accurately: if the threshold is too strict, normal users are misjudged, and if it is too loose, a large share of hotlinking requests is missed. The synchronous anti-hotlinking scheme therefore suffers from high upgrade cost, ease of simulation, and the inability to set an accurate threshold, and its data processing efficiency is low.

Existing asynchronous anti-hotlinking schemes can only identify, from a pre-learned model and a given data model, whether a request is a hotlinking request; they cannot give the security level of the request. Moreover, current schemes that identify hotlinking requests with big data can only predict whether already-identifiable behaviors are hotlinking. They cannot discover and identify new hotlinking patterns well and treat such requests as normal, so some novel suspected hotlinking requests are let through and the range of requests that can be blocked is limited. In an increasingly complex hotlinking countermeasure environment, new hotlinking situations cannot be discovered intelligently, which results in low data processing efficiency.

In addition, because authentication in the prior art takes place before multimedia playback, the anti-hotlinking backend cannot control subsequent playback once authentication has passed. A hotlinker only needs to pass synchronous authentication to play normally, without spending much effort on any step other than playback, which reduces the security of data processing.

For the above-mentioned problem of inefficiency of data processing, no effective solution has been proposed at present.

Disclosure of Invention

The embodiment of the invention provides a data processing method, a data processing device, a storage medium and an electronic device, which are used for at least solving the technical problem of low data processing efficiency in the related art.

According to an aspect of an embodiment of the present invention, there is provided a data processing method. The data processing method comprises the following steps: acquiring behavior data of a client; processing the behavior data into model data of a data processing model, wherein the data processing model is provided with the model data and is used for carrying out hierarchical processing on the security level of a first request, and the first request is used for a client to request to trigger a target event; grading the security level of the first request through a data processing model with model data to obtain a grading result of the first request, wherein the grading result is used for indicating the security level of the first request; and performing an operation corresponding to the grading result on the first request.

According to another aspect of the embodiment of the invention, a data processing apparatus is also provided. The data processing apparatus includes: the acquisition unit is used for acquiring behavior data of the client; the first processing unit is used for processing the behavior data into model data of a data processing model, wherein the data processing model is provided with the model data and is used for carrying out hierarchical processing on the security level of a first request, and the first request is used for a client to request to trigger a target event; the second processing unit is used for carrying out grading processing on the security level of the first request through a data processing model with model data to obtain a grading result of the first request, wherein the grading result is used for indicating the security level of the first request; and the operation unit is used for performing an operation corresponding to the grading result on the first request.

According to another aspect of the embodiment of the present invention, there is also provided a storage medium including a stored program, where the program executes the data processing method of the embodiment of the present invention.

According to another aspect of the embodiment of the invention, an electronic device is also provided. The electronic device comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the data processing method of the embodiment of the invention through the computer program.

In the embodiment of the invention, behavior data of a client is acquired; processing the behavior data into model data of a data processing model, wherein the data processing model is provided with the model data and is used for carrying out hierarchical processing on the security level of a first request, and the first request is used for a client to request to trigger a target event; grading the security level of the first request through a data processing model with model data to obtain a grading result of the first request, wherein the grading result is used for indicating the security level of the first request; and performing an operation corresponding to the grading result on the first request. The security level of the request of the client can be identified, and then the operation corresponding to the security level is adopted, so that the technical effect of improving the data processing efficiency is achieved, and the technical problem of low data processing efficiency in the related technology is solved.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:

FIG. 1 is a schematic diagram of a hardware environment of a data processing method according to an embodiment of the present invention;

FIG. 2 is a flow chart of a data processing method according to an embodiment of the invention;

FIG. 3 is a flow chart of a method of performing an operation corresponding to a grading result on a first request according to an embodiment of the invention;

FIG. 4 is a flow chart of another data processing method according to an embodiment of the present invention;

FIG. 5 is a flow chart of a method of adapting a data processing model by test results according to an embodiment of the invention;

FIG. 6 is a flow chart of another data processing method according to an embodiment of the invention;

FIG. 7 is a flow chart of another data processing method according to an embodiment of the present invention;

FIG. 8 is a flow chart of another data processing method according to an embodiment of the present invention;

FIG. 9 is a schematic diagram of a data processing system according to an embodiment of the present invention;

FIG. 10 is a schematic diagram of a policy scoring system architecture according to an embodiment of the invention;

FIG. 11 is a schematic diagram of another policy scoring system architecture according to an embodiment of the invention;

FIG. 12 is a schematic diagram of a hierarchy logic in accordance with an embodiment of the present invention;

FIG. 13 is a logic diagram of a decision tree according to an embodiment of the present invention;

FIG. 14 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention; and

Fig. 15 is a block diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some embodiments of the present invention, not all of them. All other embodiments that those skilled in the art can obtain based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

Example 1

According to an embodiment of the present invention, an embodiment of a data processing method is provided.

Optionally, in this embodiment, the above data processing method may be applied to a hardware environment constituted by the server 102 and the terminal 104 as shown in fig. 1. FIG. 1 is a schematic diagram of a hardware environment of a data processing method according to an embodiment of the present invention. As shown in fig. 1, the server 102 is connected to the terminal 104 via a network. The network includes, but is not limited to, a wide area network, a metropolitan area network, or a local area network, and the terminal 104 is not limited to a PC, a mobile phone, a tablet computer, or the like. The data processing method according to the embodiment of the present invention may be executed by the server 102, by the terminal 104, or by both the server 102 and the terminal 104. When the terminal 104 executes the data processing method of the embodiment of the present invention, the method may be executed by a client installed on the terminal.

Fig. 2 is a flow chart of a data processing method according to an embodiment of the present invention. As shown in fig. 2, the method may include the steps of:

Step S202, behavior data of a client is obtained.

In the technical solution provided in step S202 in the present application, behavior data of a client is obtained.

In this embodiment, the client may be a video client, a music client, a document client, an animation client, etc., without limitation herein. The behavior data of the client may be acquired through the data processing system. The behavior data is sample data for adjusting, that is, training, the data processing model, and the data processing model can be used to process requests sent by the client.

Optionally, the behavior data in this embodiment may be data generated by the client while running, data generated when the client calls each processing module, or data generated while the client sends request data. For example, the behavior data may cover behaviors such as video playback authentication, in-playback data, playback instruction reporting, advertisement reporting, and heartbeat reporting; it may also be the video ID, user IP, channel information, platform information, login information and the like requested during video playback, or environment information of the player collected by the client.

Optionally, this embodiment may further enrich the behavior data, for example by obtaining behavior data on screenshot reporting while the player plays an advertisement and on the user opening the home page, a detail page, and the like.

Step S204, the behavior data are processed into model data of a data processing model.

In the technical solution provided in the above step S204 of the present application, the behavior data is processed into model data of a data processing model, where the data processing model with the model data is used for grading the security level of the first request, and the first request is used by the client to request triggering of a target event.

After the behavior data of the client is acquired, the behavior data is processed into model data of a data processing model. Optionally, the behavior data is sorted and filtered to retain the data used for adjusting the data processing model, that data is abstracted into multidimensional vector data, and the vector data may be classified and graded. Optionally, the vector data is trained on, tested, and fed back through a machine learning model to obtain the final model data.

The data processing model of this embodiment may use, but is not limited to, a decision tree model as its basic model to classify and predict the various states. For example, assuming the depth of the tree model is 8, the maximum number of leaf nodes is 2^(8-1) = 128. Optionally, a large amount of the latest data is sampled at preset intervals to train the data processing model, so that the data processing model continuously learns and adjusts and intelligently adjusts its judging strategy, thereby achieving the best blocking effect on the request data. Meanwhile, in the process of training the data processing model, model parameters can be adjusted through dynamic verification of data, so that the recognition effect of the data processing model is further improved.
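The leaf-count arithmetic above can be checked with a short sketch. It assumes a binary tree whose root counts as level 1, which is the convention under which a depth of 8 gives 128 leaves; the function name `max_leaf_nodes` is hypothetical and does not come from the source.

```python
def max_leaf_nodes(depth: int) -> int:
    """Upper bound on the number of leaves of a binary tree.

    Assumption: depth counts levels, with the root at level 1, so each
    additional level can at most double the number of leaves.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return 2 ** (depth - 1)


if __name__ == "__main__":
    print(max_leaf_nodes(8))  # 128
```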

Step S206, grading the security level of the first request through a data processing model with model data to obtain grading results of the first request.

In the technical solution provided in step S206 of the present application, the security level of the first request is graded by using the data processing model with the model data, so as to obtain a grading result of the first request, where the grading result is used to indicate the security level of the first request.

After the behavior data is processed into the model data of the data processing model, the security level of the first request is graded through the data processing model having the model data. The first request may be a request by the client to play a video, to play music, to pay, or the like, which is not limited herein. The grading result of the first request indicates the security level of the first request, which may represent the security of the behavior corresponding to the first request. Optionally, the security level may be defined by scores, with different score thresholds representing different security levels: for example, the higher the score, the higher the security level, and the lower the score, the lower the security level.
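One way to picture score thresholds that separate security levels is the following minimal sketch. The boundary values 40 and 70 and the three level names are assumptions chosen for illustration; the source only states that a higher score means a higher security level.

```python
from bisect import bisect_right

# Hypothetical score boundaries and level names (not given in the source).
# Scores below 40 fall in the lowest level, scores of 70 or more in the highest.
THRESHOLDS = [40, 70]
LEVELS = ["hotlinking", "suspected_hotlinking", "normal"]


def security_level(score: float) -> str:
    """Map a score to a security level; a higher score gives a higher level."""
    return LEVELS[bisect_right(THRESHOLDS, score)]


if __name__ == "__main__":
    for score in (10, 55, 95):
        print(score, security_level(score))
```

Keeping the boundaries in a sorted list means a new level can be introduced by adding one threshold and one name, without touching the lookup logic.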

In this embodiment, hotlinking refers to the act in which a provider's content is illegally acquired by a party other than the provider and supplied to users, affecting the provider's interests. Every request sent by the client has some possibility of being a hotlinking request, and this possibility corresponds to the security level of the request.

Step S208, performing an operation corresponding to the grading result on the first request.

In the technical solution provided in step S208, the operation corresponding to the grading result is performed on the first request.

After the security level of the first request is graded through the data processing model with the model data to obtain the grading result of the first request, an operation corresponding to the grading result is performed on the first request. The operation is the strategy adopted when processing the first request, and different operations can be adopted for different grading results. For example, when the data processing model determines that the security level of the first request corresponds to hotlinking behavior, the first request is blocked directly and not allowed to pass. When the data processing model determines that the security level of the first request corresponds to suspected hotlinking behavior, the data of the first request can be manually analyzed and abstracted, and related rules can be applied to obtain a new operation for processing the first request; the operation instructions for processing requests are then updated, which forms a virtuous circle and improves data processing efficiency.
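The mapping from grading result to operation can be sketched as a simple dispatch table. The level names, the `Request` shape, and the three handler functions are hypothetical stand-ins; the source does not prescribe how the operations are implemented.

```python
from typing import Callable, Dict

Request = Dict[str, str]  # hypothetical request shape with an "id" field


def block(request: Request) -> str:
    # Hotlinking behavior: refuse the request outright.
    return f"blocked:{request['id']}"


def review(request: Request) -> str:
    # Suspected hotlinking: hand the request data over for manual analysis.
    return f"queued_for_review:{request['id']}"


def allow(request: Request) -> str:
    # Normal behavior: let the request pass.
    return f"allowed:{request['id']}"


# One operation per grading result (level names are assumptions).
OPERATIONS: Dict[str, Callable[[Request], str]] = {
    "hotlinking": block,
    "suspected_hotlinking": review,
    "normal": allow,
}


def perform_operation(request: Request, grading_result: str) -> str:
    """Perform the operation that corresponds to the grading result."""
    return OPERATIONS[grading_result](request)


if __name__ == "__main__":
    print(perform_operation({"id": "r1"}, "hotlinking"))
```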

According to this embodiment, not only can it be judged under a given data model whether a request is a hotlinking request, but the security level of the request can also be identified from differences in how its data behaves. Novel hotlinking patterns can thus be discovered from the security level and quickly blocked according to related rules, which improves both the security and the efficiency of data processing.

Acquiring behavior data of the client through the steps S202 to S208; processing the behavior data into model data of a data processing model, wherein the data processing model is provided with the model data and is used for carrying out hierarchical processing on the security level of a first request, and the first request is used for a client to request to trigger a target event; grading the security level of the first request through a data processing model with model data to obtain a grading result of the first request, wherein the grading result is used for indicating the security level of the first request; and performing an operation corresponding to the grading result on the first request. The security level of the request of the client can be identified, and then the operation corresponding to the security level is adopted, so that the technical effect of improving the data processing efficiency is achieved, and the technical problem of low data processing efficiency in the related technology is solved.

As an optional implementation manner, step S208, performing an operation corresponding to the classification result on the first request includes: determining the first request as a target request under the condition that the grading result is not an existing grading result in the data processing model; acquiring a first operation instruction for processing abnormal data in a target request; and performing a first operation indicated by the first operation instruction on the first request, and adding the first operation instruction into the data processing model.

Fig. 3 is a flow chart of a method of performing an operation corresponding to a grading result on a first request according to an embodiment of the present invention. As shown in fig. 3, the method comprises the steps of:

Step S301, in the case where the grading result is not an existing grading result in the data processing model, it is determined that the first request is a target request.

In the technical solution provided in step S301, the first request is determined to be the target request when the grading result is not an existing grading result in the data processing model.

After the security level of the first request is graded through the data processing model with the model data and the grading result of the first request is obtained, it is judged whether the grading result is an existing grading result in the data processing model. If it is not, the first request is determined to be a target request. The target request is a suspected hotlinking request, that is, the first request is a hotlinking request yet to be confirmed.

Step S302, a first operation instruction for processing the abnormal data in the target request is acquired.

In the technical solution provided in step S302, a first operation instruction is obtained, where the first operation instruction is used to process abnormal data in a target request, and the operation corresponding to the classification result includes a first operation indicated by the first operation instruction, and the first operation instruction is further used to indicate that a first operation is performed on a request having the same category as the first request.

After the first request is determined to be the target request, the target request is scored through the data processing model, and the scoring result is used to determine whether the target request is a secure request. If the scoring result shows that the target request is not a secure request, the target request is determined to be a hotlinking request. The abnormal data in the target request is then acquired, manually analyzed, and abstracted to obtain a first operation instruction, which may be acquired through a policy system. The first operation instruction may be used to indicate that the first operation is performed on requests of the same category as the first request; for example, hotlinking requests with the same or similar scoring results can be handled by the same operation instruction.

Optionally, this embodiment may select a more general classification algorithm, so that the classifier classifies the request data according to how different data behaves. The scheme is applicable to any scenario in which a corresponding policy needs to be applied to request data.

Step S303, a first operation indicated by a first operation instruction is performed on the first request, and the first operation instruction is added to the data processing model.

In the technical solution provided in step S303 of the present application, a first operation indicated by a first operation instruction is performed on the first request, and the first operation instruction is added to the data processing model.

After the first operation instruction for processing the abnormal data in the target request is acquired, the first operation indicated by the first operation instruction is performed on the first request. Because the first operation instruction can indicate that the first operation is to be performed on requests of the same category as the first request, it can be added to the data processing model, updating the operation instructions already in the model and forming a virtuous circle. When a request of the same category as the first request is encountered later, the first operation, for example a blocking operation, can be performed on it directly through the first operation instruction, so that the behavior corresponding to the request is not allowed to pass and the security of data processing is improved.

In this embodiment, the first request is determined to be a target request when the grading result is not an existing grading result in the data processing model; a first operation instruction for processing abnormal data in the target request is acquired, wherein the operation corresponding to the grading result comprises the first operation indicated by the first operation instruction, and the first operation instruction is used for indicating that the first operation is performed on requests of the same category as the first request; and the first operation indicated by the first operation instruction is performed on the first request and the first operation instruction is added to the data processing model, so that the purpose of performing an operation corresponding to the grading result on the first request is achieved.
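Steps S301 to S303 can be sketched as follows. This is a simplified illustration under stated assumptions: the class `GradingModel`, the function `derive_instruction` (a stand-in for the manual analysis and policy system described above), and the instruction strings are all hypothetical, and the intermediate scoring of the target request is omitted.

```python
from typing import Dict


class GradingModel:
    """Holds the operation instruction known for each grading result."""

    def __init__(self) -> None:
        # Hypothetical initial instructions; the source does not list them.
        self.instructions: Dict[str, str] = {"hotlinking": "block", "normal": "allow"}

    def is_known(self, grading_result: str) -> bool:
        return grading_result in self.instructions


def derive_instruction(request: Dict[str, str]) -> str:
    # Stand-in for manual analysis of the abnormal data in the target request.
    # Here every analyzed target request is assumed to end up blocked.
    return "block"


def process(model: GradingModel, request: Dict[str, str], grading_result: str) -> str:
    if model.is_known(grading_result):
        return model.instructions[grading_result]
    # S301: the grading result is not in the model, so this is a target request.
    # S302: acquire a first operation instruction for its abnormal data.
    instruction = derive_instruction(request)
    # S303: add the instruction to the model so that later requests of the
    # same category are handled directly, then apply it to this request.
    model.instructions[grading_result] = instruction
    return instruction


if __name__ == "__main__":
    model = GradingModel()
    print(process(model, {"id": "r1"}, "new_pattern"))
    print(model.is_known("new_pattern"))
```

After the first call, a second request with the same grading result is resolved from the model without repeating the analysis, which is the "virtuous circle" the text describes.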

As an optional implementation manner, step S204, processing the behavior data into the model data of the data processing model includes: filtering the behavior data to obtain vector data of a data processing model, wherein the vector data comprises a plurality of dimensions; the vector data is processed into model data of a data processing model.

In this embodiment, behavior data of the client is first obtained, for example video playback authentication, in-playback data, playback quality reporting, advertisement reporting, and heartbeat reporting, or the request data during video playback such as video ID, user IP, channel, platform number, player environment information collected by the client, user ID, and login information. The behavior data is then filtered to obtain vector data of the data processing model; for example, the obtained behavior data is cleaned and abstracted into multidimensional vector data.

After the behavior data is filtered to obtain vector data of the data processing model, the vector data is processed into model data of the data processing model, which may be done by the data processing system; optionally, the vector data is classified and graded by policy. The vector data is then processed through the training, testing, and feedback stages of a machine learning model to obtain the final model data.

According to this embodiment, the behavior data is filtered to obtain vector data of a data processing model, wherein the vector data comprises a plurality of dimensions, and the vector data is processed into the model data of the data processing model, which achieves the purpose of processing the behavior data into the model data of the data processing model. The security level of the first request is then graded through the data processing model with the model data to obtain a grading result of the first request, and the operation corresponding to the grading result is performed on the first request, so that the efficiency of data processing is improved.
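The filtering and abstraction into multidimensional vectors might look like the sketch below. The field names (`video_id`, `user_ip`, `platform`, `play_seconds`, `heartbeat_count`, `ad_report_count`, `logged_in`) and the choice of four dimensions are assumptions made for illustration; the source names the kinds of behavior data but not a concrete schema.

```python
from typing import Dict, List, Optional

# Hypothetical fields a record must carry to be usable as a training sample.
REQUIRED_FIELDS = ("video_id", "user_ip", "platform")


def to_vector(record: Dict[str, object]) -> Optional[List[float]]:
    """Clean one behavior record and abstract it into a 4-dimensional vector."""
    # Filtering: drop records that lack any required field.
    if any(not record.get(field) for field in REQUIRED_FIELDS):
        return None
    return [
        float(record.get("play_seconds", 0)),
        float(record.get("heartbeat_count", 0)),
        float(record.get("ad_report_count", 0)),
        1.0 if record.get("logged_in") else 0.0,
    ]


def build_vectors(records: List[Dict[str, object]]) -> List[List[float]]:
    """Return the vectors of all records that survive filtering."""
    vectors = (to_vector(record) for record in records)
    return [vector for vector in vectors if vector is not None]


if __name__ == "__main__":
    records = [
        {"video_id": "v1", "user_ip": "10.0.0.1", "platform": "pc",
         "play_seconds": 120, "heartbeat_count": 4, "ad_report_count": 1,
         "logged_in": True},
        {"video_id": "v2", "user_ip": "", "platform": "pc"},  # filtered out
    ]
    print(build_vectors(records))
```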

As an alternative embodiment, after processing the behavior data into the model data of the data processing model in step S204, the method further includes: and feeding back a test result obtained by testing the model data to the data processing model, and adjusting the data processing model through the test result to obtain an adjusted data processing model.

Fig. 4 is a flow chart of another data processing method according to an embodiment of the present invention. As shown in fig. 4, the method further comprises the steps of:

Step S401, testing the model data to obtain a test result.

In the technical scheme provided in the step S401, the behavior data is processed into the model data of the data processing model, and the model data is tested to obtain the test result.

After the behavior data are processed into the model data of the data processing model, the model data are tested, and the model data can be tested through a big data modeling system, so that a test result is obtained. Optionally, the model data is automatically analyzed according to the machine learning model to obtain test results, which may be used to adjust the data processing model.

Step S402, feeding back the test result to the data processing model, and adjusting the data processing model through the test result to obtain an adjusted data processing model.

In the technical scheme provided in the above step S402 of the present application, the test result is fed back to the data processing model, and the data processing model is adjusted according to the test result, so as to obtain an adjusted data processing model.

After the model data is tested to obtain a test result, the test result is fed back to the data processing model; the big data modeling system may feed the test result back to the data verification system. The data processing model is then adjusted through the test result, which may be done according to the machine learning model, so that the adjusted data processing model is obtained and training of the data processing model is achieved.

Optionally, in this embodiment, in order to ensure accuracy of the data processing model in determining the request data, the data processing model may be trained by sampling a large amount of latest behavior data according to a preset period of time, for example, the data processing model is trained by collecting a large amount of latest behavior data every day, so that the data processing model continuously learns and optimizes itself, and further intelligently adjusts the policy, thereby improving efficiency of hitting illegal request data.

In the embodiment, after the behavior data are processed into the model data of the data processing model, the model data are tested, so that a test result is obtained; and feeding back the test result to the data processing model, and adjusting the data processing model through the test result to obtain an adjusted data processing model, so that training of the data processing model is realized, and the processing efficiency of the data processing model is further improved.

As an optional implementation manner, in step S402, adjusting the data processing model through the test result to obtain the adjusted data processing model includes: verifying the test result to obtain a first verification result; and adjusting parameters of the data processing model under the condition that the first verification result meets the first condition.

FIG. 5 is a flow chart of a method of adjusting a data processing model through a test result according to an embodiment of the invention. As shown in fig. 5, the method includes:

Step S501, verifying the test result to obtain a first verification result.

In the technical solution provided in step S501 of the present application, the test result is verified, and a first verification result is obtained.

After the model data is tested to obtain the test result, the test result can be verified through a data verification system to obtain the first verification result.

In step S502, if the first verification result meets the first condition, parameters of the data processing model are adjusted.

In the technical solution provided in step S502 of the present application, parameters of the data processing model are adjusted when the first verification result meets the first condition.

After the test result is verified and the first verification result is obtained, it is judged whether the first verification result meets the first condition. The first condition may be a condition under which the parameters of the data processing model are to be adjusted; when it is met, the parameters are adjusted so as to improve the accuracy with which the data processing model identifies the request data.

The embodiment obtains a first verification result by verifying the test result; under the condition that the first verification result meets the first condition, the parameters of the data processing model are adjusted, the purpose of adjusting the data processing model through the test result to obtain an adjusted data processing model is achieved, training of the data processing model is achieved, and further the processing efficiency of the data processing model is improved.
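As an illustrative sketch only, the test, verification and adjustment loop of steps S401 to S502 might be expressed as follows. The metric names (`accuracy`, `recall`), the thresholds, and the `max_depth` parameter are assumptions introduced for illustration; the embodiment does not specify what the first condition is or which parameters are adjusted.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Hypothetical test result produced by testing the model data."""
    accuracy: float  # share of test samples classified correctly
    recall: float    # share of hotlinking samples that were caught


def meets_first_condition(result, min_accuracy=0.9, min_recall=0.8):
    """Assumed 'first condition': the model is adjusted when a metric is too low."""
    return result.accuracy < min_accuracy or result.recall < min_recall


def adjust(params, result):
    """Hypothetical adjustment: allow a deeper tree; the input dict is not mutated."""
    updated = dict(params)
    updated["max_depth"] = params["max_depth"] + 1
    return updated


def feedback_step(params, result):
    """Verify the test result and adjust the parameters only if the condition holds."""
    return adjust(params, result) if meets_first_condition(result) else params
```

Running `feedback_step` once per preset period (for example, daily) on freshly sampled behavior data would correspond to the periodic training described above.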

As an optional implementation manner, when the security level of the first request is graded through the data processing model with the model data in step S206 to obtain the grading result of the first request, the method further includes: adding a first identifier to the first request under the condition that the first request meets the second condition, wherein the first identifier is used for identifying the security state of the first request.

In this embodiment, the security state of the first request may be a state indicating whether the first request is secure, a state indicating whether the first request is suspected to be insecure, or the like. When a normal client sends request data, the requests of its modules are called in a preset sequence and at a preset frequency, and the resulting number and frequency of module requests can be used as behavior characteristics of the client. Optionally, it is judged whether the first request matches a request sent by a normal client. If the first request is a request sent by a normal client, a first identifier indicating that the first request is in a safe state is added to the first request; if the first request is a request sent by a hotlinking client, a first identifier indicating that the first request is in an unsafe state is added to the first request; and if the first request is a target request, that is, a suspected hotlinking request, a first identifier indicating that the first request is a hotlinking request to be determined is added to the first request.

Optionally, the first identifier added to the first request may be further subdivided to identify different sub-states within each security state, so as to mark the state of the request more precisely and further improve the accuracy of the data processing model in identifying the request data.
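The marking described above can be sketched as follows, purely for illustration. The `hotlink_probability` input, the two thresholds, and the `first_identifier` field name are assumptions; the embodiment only states that a safe, unsafe, or to-be-determined identifier is added to the request.

```python
from enum import Enum


class SecurityState(Enum):
    SAFE = "safe"            # request matches normal client behavior
    UNSAFE = "unsafe"        # request judged to come from a hotlinking client
    SUSPECTED = "suspected"  # hotlinking request still to be determined


def tag_request(request, hotlink_probability, safe_below=0.2, unsafe_above=0.8):
    """Return a copy of the request with a first identifier attached.

    hotlink_probability is a hypothetical model output in [0, 1].
    """
    if hotlink_probability < safe_below:
        state = SecurityState.SAFE
    elif hotlink_probability > unsafe_above:
        state = SecurityState.UNSAFE
    else:
        state = SecurityState.SUSPECTED
    tagged = dict(request)
    tagged["first_identifier"] = state.value
    return tagged
```

Sub-states could be modeled by adding further enum members per state, at the cost of more thresholds to maintain.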

As an alternative embodiment, when the behavior data is processed into the model data of the data processing model in step S204, the method further includes: acquiring a target sample of the data processing model and sample observations of the target sample; processing the number of samples of the target sample and the number of sample observations of the target sample into a first index of the data processing model; processing, in the target sample of the data processing model, the number of samples corresponding to each category of the target sample and the number of samples of the target sample into a second index of the data processing model; and processing the first index and the second index into a class score of the data processing model.

Fig. 6 is a flowchart of another data processing method according to an embodiment of the present invention. As shown in fig. 6, the method comprises the steps of:

Step S601, processing the number of samples of the target samples and the number of sample observations of the target samples of the data processing model into a first index of the data processing model.

In the technical solution provided in step S601 of the present application, the number of samples of the target samples of the data processing model and the number of sample observations of the target samples are processed as a first index of the data processing model, where the target samples include behavior data, and the first index is used to indicate a generalization capability of processing a request of a client by a data processing path in the data processing model.

In this embodiment, a data processing model with model data may be used to rank the security level of requests sent by clients. The data processing model can be trained to continuously optimize the data processing model, and the processing efficiency of the data processing model is improved. Alternatively, classification predictions for various states are made using a decision tree model as a base model.

In training the data processing model, the embodiment may calculate the importance of each sample factor of the samples obtained from the behavior data. For example, data indicating the importance of each sample factor may be calculated using the gini coefficient, the information gain rate, or the like, wherein the sample factors are used to represent causal relationships, correlations, or the like between the data. After the data indicating the importance of each sample factor is obtained, the most important sample factor among the current samples is selected and determined as the root node of the data processing model. A target sample factor that splits the sample classes to the greatest extent is then calculated according to the selected sample factor, the sample data is divided into two parts according to the target sample factor, and the two parts are taken as leaf nodes. Optionally, the above steps are repeated on the subsamples corresponding to the leaf nodes, so as to generate the data processing model.
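As a minimal sketch of the split selection described above, the following uses the Gini impurity commonly used in decision tree learning (one minus the sum of squared class proportions) as the importance measure. This is an assumption: the embodiment names the gini coefficient but its exact expression is not reproduced here, and the function names are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_impurity) of the best binary split.

    rows is a list of equal-length numeric feature vectors. A row goes to the
    left part when row[feature_index] <= threshold, otherwise to the right part.
    Returns None when no split separates the rows into two non-empty parts.
    """
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini_impurity(left)
                     + len(right) * gini_impurity(right)) / len(labels)
            if best is None or score < best[2]:
                best = (feature, threshold, score)
    return best
```

Applying `best_split` recursively to the two resulting parts yields the tree; the recursion and stopping rule are omitted to keep the sketch short.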

Optionally, when the number of samples of the target sample of the data processing model and the number of sample observations of the target sample are processed into the first index of the data processing model, the quotient of the number of samples of the target sample and the number of sample observations of the target sample can be determined as the first index, wherein the target sample comprises behavior data. The first index is a path generalization index; it can be one of the indexes used to characterize novel hotlinking behaviors, and indicates the generalization capability with which a data processing path in the data processing model processes the request of the client. Generalization capability refers to the adaptability of a machine learning algorithm to new samples: the purpose of learning is to learn the rule hidden behind the data, so that a trained network can also give proper output for data outside the learning set that follows the same rule.

In this embodiment, the larger the value of the first index, the better the logical generalization capability of the leaf path in the data processing model.

In step S602, in the target samples, the number of samples corresponding to each category of the target samples and the number of samples of the target samples are processed as a second index of the data processing model.

In the technical solution provided in step S602 of the present application, in the target samples of the data processing model, the number of samples corresponding to each category of the target samples and the number of samples of the target samples are processed as a second index of the data processing model, where the second index is used to indicate an accuracy of determining the category of the request of the client.

When, in the target sample of the data processing model, the number of samples corresponding to each category of the target sample and the number of samples of the target sample are processed into the second index of the data processing model, the quotient of the number of samples corresponding to each category of the target sample and the number of samples of the target sample can be determined as the second index, which can be one of the indexes used to characterize novel hotlinking behaviors. The second index is a path class index: the larger its value, the higher the prediction accuracy of the path class; the smaller its value, the lower the prediction accuracy of the path class and the greater the degree of confusion of the path class. A large degree of confusion of the path class means that a novel type of hotlinking exists, that is, hotlinking that has not previously been handled by the data processing model.
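The two quotients described above can be illustrated with a short sketch. It assumes that `samples` is the number of samples reaching a leaf node, that `n` is the total number of sample observations, and that `value` lists the per-category sample counts at that node; the function names are hypothetical.

```python
from typing import Sequence


def first_index(samples: int, n: int) -> float:
    """Path generalization index: share of all observations reaching the node."""
    return samples / n


def second_index(value: Sequence[int]) -> float:
    """Path class index: share of the node's samples in its dominant category."""
    samples = sum(value)
    return max(value) / samples
```

For instance, a leaf holding 45 samples of one category and 5 of another has a second index of 0.9, whereas a leaf with counts 20, 20 and 10 has a second index of 0.4 and is therefore a more confused path.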

Step S603, processing the first index and the second index into class scores of the data processing model.

In the technical solution provided in step S603 of the present application, the first index and the second index are processed into class scores of the data processing model, where the class scores are used to indicate the degree to which each class of the target sample is a target class.

After the number of samples of the target sample and the number of sample observations of the target sample are processed into the first index, and the number of samples corresponding to each category of the target sample and the number of samples of the target sample are processed into the second index, the first index and the second index are processed into the class score of the data processing model. The class score is used for indicating the degree to which each category of the target sample is the target category, and may be a novel hotlinking mode score: the higher the class score, the higher the degree of potential novel hotlinking behavior in the samples reaching the corresponding leaf node of the data processing model.

In the embodiment, when behavior data is processed into model data of a data processing model, the number of samples of a target sample of the data processing model and the number of sample observations of the target sample are processed into a first index of the data processing model, wherein the target sample comprises the behavior data, and the first index is used for indicating generalization capability of a client side for processing requests by a data processing path in the data processing model; processing the sample number corresponding to each category of the target sample and the sample number of the target sample into a second index of the data processing model, wherein the second index is used for indicating the accuracy of determining the category of the request of the client; and processing the first index and the second index into class scores of the data processing model, wherein the class scores are used for indicating the degree that each class of the target sample is the target class, so that training of the data processing model is realized.

As an optional embodiment, in step S601, processing the number of samples of the target sample and the number of sample observations of the target sample into the first index of the data processing model includes: obtaining the first index A by the following first formula:

$$A = \frac{\text{samples}}{n}$$

wherein samples is used to represent the number of samples of the target sample, and n is used to represent the number of sample observations of the target sample.

As an optional implementation manner, in step S602, processing the number of samples corresponding to each category of the target sample and the number of samples of the target sample into the second index of the data processing model includes: obtaining the second index B by the following second formula:

$$B = \frac{\text{maxvalue}_i}{\text{samples}}$$

wherein maxvalue_i is used to represent the number of samples under the i-th category of the target sample, the number of samples under the i-th category being greater than the number of samples under each of the other categories of the target sample, and samples is used to represent the number of samples of the target sample.

As an optional implementation manner, step S603, processing the first index and the second index into a class score of the data processing model includes: obtaining a category score C by a third formula:

wherein samples is used to represent the number of samples of the target sample, n is used to represent the number of sample observations of the target sample, and maxvalue_i is used to represent the number of samples under the i-th category of the target sample, the number of samples under the i-th category being greater than the number of samples under each of the other categories of the target sample.

In this embodiment, samples can be used to represent the number of samples that enter the subtree, and value is used to represent the number of samples under each category (value_i represents the number of samples of the i-th category at this node). Generally, only a category with an absolute advantage at a leaf node can serve as the path result, i.e., the prediction result, of this leaf node.

The data processing model of this embodiment can be obtained through gini coefficients. For a random variable X, if p(X) is used to represent the probability density function of X and E[X] is used to represent the expectation of X, then for n sample observations X_i of X, the gini coefficient is expressed as:

As an alternative embodiment, after the first index and the second index are processed into the class score of the data processing model in step S603, the method further includes: filtering out nodes corresponding to a second index larger than the target value in the data processing model; sorting the filtered nodes in the data processing model according to the class scores corresponding to the filtered nodes to obtain a sorting result; and marking, in the sorting result, the nodes ranked in preset positions as target nodes.

Fig. 7 is a flowchart of another data processing method according to an embodiment of the present invention. As shown in fig. 7, the method includes the steps of:

In step S701, in the data processing model, nodes corresponding to the second index greater than the target value are filtered out.

In the technical solution provided in the above step S701 of the present application, the node corresponding to the second index greater than the target value in the data processing model is filtered out.

After the first index and the second index are processed into the class score of the data processing model, the nodes corresponding to a second index greater than the target value in the data processing model are filtered out. For example, the nodes in the data processing model that satisfy the following condition are filtered out:

$$\frac{\text{maxvalue}_i}{\text{samples}} > k$$

wherein k is a preset value for screening the second index, and may be an accuracy, for example, an accuracy of 90%.

Step S702, sorting the filtered nodes in the data processing model according to the class scores corresponding to the filtered nodes to obtain sorting results.

In the technical scheme provided in step S702 of the present application, among the filtered nodes in the data processing model, the filtered nodes are ranked according to the class scores corresponding to the filtered nodes, so as to obtain a ranking result.

In step S703, in the sorting result, the node ranked as the preset sequence is marked as the target node.

In the technical solution provided in step S703 of the present application, in the sorting result, the node ranked as the preset sequence is marked as the target node, where the behavior data corresponding to the target node is to be determined as the hotlinking behavior data of the client.

After the nodes corresponding to a second index greater than the target value are filtered out of the data processing model, the nodes ranked in the preset positions of the sorting result are marked as target nodes. The nodes may be sorted according to the new behavior mode index, for example the novel hotlinking mode score, from large to small, to select suspected leaf nodes; the first several nodes are marked as suspected new-behavior hotlinking nodes, and the sample numbers corresponding to the suspected new-behavior hotlinking nodes are then taken out for operation analysis.

The embodiment filters out nodes corresponding to the second index larger than the target value in the data processing model after the first index and the second index are processed as class scores of the data processing model; sorting the filtered nodes in the data processing model according to the class scores corresponding to the filtered nodes to obtain sorting results; and marking the nodes ranked as the preset sequences as target nodes in the sorting result, wherein the behavior data corresponding to the target nodes are to be determined as the hotlinking behavior data of the client.
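Steps S701 to S703 can be sketched as follows, under two assumptions. First, "filtered out" is read here as "removed", so that only nodes whose second index does not exceed k remain; if the intended meaning is "selected", the comparison would be reversed. Second, each node is a dictionary with hypothetical keys `id`, `second_index` and `class_score`, the class score being computed beforehand.

```python
def mark_target_nodes(nodes, k=0.9, top_n=3):
    """Return the ids of the top_n remaining nodes, ranked by class score.

    Nodes whose second index is greater than k are removed first (assumed
    reading of 'filtered out'); the rest are sorted from large to small score.
    """
    remaining = [node for node in nodes if node["second_index"] <= k]
    ranked = sorted(remaining, key=lambda node: node["class_score"], reverse=True)
    return [node["id"] for node in ranked[:top_n]]
```

The returned ids would correspond to the suspected new-behavior hotlinking nodes whose samples are passed on for operation analysis.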

As an alternative embodiment, after marking the node ranked as the preset sequence as the target node in step S703, the method further includes: acquiring online data corresponding to a target node; verifying online data corresponding to the target node to obtain a second verification result; and under the condition that the second verification result meets the third condition, determining the class of the data corresponding to the target node as the target class.

Fig. 8 is a flowchart of another data processing method according to an embodiment of the present invention. As shown in fig. 8, the method includes the steps of:

Step S801, acquiring online data corresponding to a target node.

In the technical solution provided in the above step S801 of the present application, online data corresponding to a target node is obtained, where the online data is behavior data of a client corresponding to the target node when online prediction is performed on a data processing model.

After the nodes ranked in the preset positions are marked as target nodes, the online data corresponding to the target nodes is acquired. The online data can be obtained by using the trained and analyzed data processing model to perform online prediction in the system for a preset number of rating categories.

Step S802, verifying online data corresponding to the target node to obtain a second verification result.

In the technical scheme provided in step S802, the online data corresponding to the target node is verified, so as to obtain a second verification result.

After the online data corresponding to the target node is acquired, the online data is verified to obtain the second verification result. That is, the data of the suspected new-behavior hotlinking node is analyzed and verified online so as to discover a novel hotlinking mode, wherein the second verification result comprises the novel hotlinking mode.

Step S803, determining the class of the data corresponding to the target node as the target class if the second verification result meets the third condition.

In the technical solution provided in step S803 of the present application, when the second verification result meets the third condition, the category of the data corresponding to the target node is determined to be the target category.

After the online data corresponding to the target node is verified and the second verification result is obtained, the category of the data corresponding to the target node is determined to be the target category when the second verification result meets the third condition. The third condition may be a condition for determining that the category of the data corresponding to the target node is the target category, for example, a condition for determining that the category is the hotlinking category, so that a novel hotlinking mode is discovered.

In the embodiment, after the nodes ranked in the preset positions are marked as target nodes, the online data corresponding to the target node is acquired, wherein the online data is behavior data of a client corresponding to the target node when online prediction is performed with the data processing model; the online data corresponding to the target node is verified to obtain a second verification result; and under the condition that the second verification result meets the third condition, the category of the data corresponding to the target node is determined to be the target category, so that online prediction by the data processing model is realized and the data processing efficiency is improved.
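The online verification of steps S801 to S803 might look like the sketch below. The third condition is assumed, for illustration only, to be "at least a given share of the node's online records is confirmed as hotlinking by a check function"; the check function, the `min_share` threshold and the returned label are all hypothetical.

```python
def classify_target_node(online_records, is_hotlinking, min_share=0.8):
    """Return 'hotlinking' when enough online records are confirmed, else None.

    online_records: behavior records collected for one target node online.
    is_hotlinking: hypothetical callable that verifies a single record.
    """
    if not online_records:
        return None  # nothing to verify yet
    confirmed = sum(1 for record in online_records if is_hotlinking(record))
    share = confirmed / len(online_records)
    return "hotlinking" if share >= min_share else None
```

In practice the verification could equally be a manual operation analysis; the callable merely stands in for whatever produces the second verification result.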

This embodiment can be applied to multimedia playback, for example, video playback. With big-data intelligent anti-hotlinking in use, when several novel hotlinking modes are encountered, multiple combined striking strategies can be applied to different categories according to the grading scheme of the embodiment, and suspected novel hotlinking can be identified, so that novel hotlinking modes are identified and prevented in advance and the data processing efficiency is improved.

It should be noted that the application of the embodiment to the video playing scene is only a preferred implementation of the embodiment of the present invention. The scheme of the embodiment of the present invention is not limited to the video playing scene; any scene that needs to identify request data, improve the security of data processing, and improve the efficiency of data processing falls within the scope of the embodiment of the present invention, and such scenes are not enumerated here one by one.

Example 2

The technical scheme of the invention is described below with reference to a preferred embodiment. Specifically, the description takes as an example judging whether request data is a hotlinking request and quickly striking the hotlinking request according to the corresponding policy.

In the embodiment, whether the request data is a hotlinking request or not is judged through the data processing model, and meanwhile, a security level of the request can be identified according to the difference of data expression; according to the security level, a novel hotlinking mode is discovered, relevant rules are discovered and operated at the first time, and hotlinking requests are quickly hit according to strategies obtained by the relevant rules.

The embodiment mainly comprises an anti-hotlinking system rating architecture and a data processing model. The anti-hotlinking system rating architecture is mainly used for describing the overall architecture and principle of the anti-hotlinking system; the data processing model can be used for building, training, and judging with the behavior feature model in multimedia playing.

FIG. 9 is a schematic diagram of a data processing system according to an embodiment of the present invention. As shown in fig. 9, in the overall architecture of the anti-hotlinking process, the client (video, music, document, animation) of this embodiment may send data corresponding to characteristic behaviors, such as advertisement behaviors, quality behaviors, and other behaviors (opening, using, etc.), to the background server for model training, so as to refine the algorithm model. Model judgment is performed through the trained data processing model, and asynchronous authentication is then performed. The client in the embodiment can perform synchronous authentication first, process multimedia resources, and then perform asynchronous authentication, so that a finer result score is output on the basis of asynchronous anti-hotlinking, security attributes are assigned according to different scores, and corresponding data and strategies are sent to the client.

The basic principles and technical architecture of the policy scoring system are described below.

FIG. 10 is a schematic diagram of a policy scoring system architecture according to an embodiment of the invention. As shown in fig. 10, the sample data may be behavior data of a client, and the sample data is accessed into a data processing system, and the data processing system processes the sample data to obtain a processing result. The data processing system inputs the processing result to the strategy system, the strategy system acquires the strategy according to the processing result, the strategy is further input to the big data modeling system, the big data modeling system tests the data processing model to obtain a test result, the data verification system verifies the test result and inputs the verified data to the judgment system, and the judgment system can acquire the query judgment result through the query data and further conduct judgment service.

FIG. 11 is a schematic diagram of another policy scoring system architecture according to an embodiment of the invention. As shown in fig. 10 and 11, the sample data includes behaviors such as video playing authentication, playing time data, reporting time data, heartbeat, quality data, playing quality reporting, advertisement reporting, and heartbeat reporting, as well as parameters such as the time at which video playing is requested, video ID, user IP, channel, platform number, player environment information collected by the client, user ID, and login information. The sample data is input into the data processing system, which abstracts the behavior data into multidimensional vector data and extracts features; the vector data is classified and graded through strategies in the strategy scoring system, and the strategy system adds and updates the strategies. The strategy system inputs the vector data into the big data modeling system, which trains and tests the model through the machine learning model; the data verification system automatically analyzes and verifies the vector data and feeds back the verification result so as to adjust the data processing model. Meanwhile, the big data modeling system also provides a judgment strategy to the judgment system through the adjusted data processing model; the judgment system performs the judgment service according to the strategy and the data entering the judgment operation and outputs a result, and the strategy in the strategy system is updated through analysis of the result.
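The abstraction of behavior data into multidimensional vector data can be illustrated with a minimal sketch. The event names and their order in `FEATURE_ORDER` are assumptions chosen to echo the reporting behaviors listed above; the embodiment does not define the actual vector layout.

```python
from collections import Counter

# Hypothetical, fixed ordering of the vector dimensions.
FEATURE_ORDER = ("play_auth", "heartbeat_report", "quality_report", "ad_report")


def to_vector(events):
    """Turn a client's list of reported event names into a count vector.

    Events not listed in FEATURE_ORDER are ignored in this sketch.
    """
    counts = Counter(events)
    return [counts.get(name, 0) for name in FEATURE_ORDER]
```

A fixed ordering keeps vectors from different clients comparable, which is what a downstream classifier would need.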

According to the embodiment, the video playing request data of each user can be classified by using the model data in the final data processing model, different strategies are adopted according to the classification condition, and the data processing efficiency is improved.

FIG. 12 is a schematic diagram of hierarchical logic according to an embodiment of the present invention. As shown in fig. 12, the request is authenticated to obtain normal behavior data and abnormal behavior data, the normal behavior data is further divided into normal behavior information and abnormal behavior information, the abnormal behavior data is further divided into normal behavior information and abnormal behavior information, the normal behavior information in the normal behavior data is further divided into normal information 1 and abnormal information 7, the abnormal behavior information in the normal behavior data is further divided into normal information 2 and abnormal information 8, the normal behavior information in the abnormal behavior data is divided into normal information 3 and abnormal information 9, and the abnormal behavior information in the abnormal behavior data is divided into normal information 4 and abnormal information 10, thereby realizing continuous subdivision of information.
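The three-level subdivision described for fig. 12 can be written down as a small lookup, shown here only as an illustration. The leaf numbering follows the text above; the three boolean inputs and the function name are assumptions about how each level's decision might be represented.

```python
# (behavior data normal?, behavior information normal?) -> (normal leaf, abnormal leaf)
LEAVES = {
    (True, True): ("normal information 1", "abnormal information 7"),
    (True, False): ("normal information 2", "abnormal information 8"),
    (False, True): ("normal information 3", "abnormal information 9"),
    (False, False): ("normal information 4", "abnormal information 10"),
}


def grade(data_normal, behavior_normal, info_normal):
    """Walk the three levels of the hierarchy and return the leaf label."""
    normal_leaf, abnormal_leaf = LEAVES[(data_normal, behavior_normal)]
    return normal_leaf if info_normal else abnormal_leaf
```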

In order to ensure the accuracy of judging whether the request data is a hotlinking request by the data processing model, the embodiment can train the data processing model by sampling a large amount of latest data every preset time, for example, train the data processing model by sampling a large amount of latest data every day, so that the data processing model continuously learns and optimizes, intelligently adjusts the judging strategy and further achieves the optimal striking effect. Meanwhile, in the model training test process, the model parameters can be dynamically adjusted through data verification, so that the recognition effect of the data processing model is improved.

The data processing model of the embodiment can score suspected novel hotlinking modes outside classification, and the judged abnormal data can be manually analyzed and abstracted and then the classification strategy is updated, so that virtuous circle is realized.

The data processing model is described further below.

In this embodiment, a multi-classification model can be used to predict various hotlinking conditions, as follows:

and obtaining behavior data corresponding to the characteristic behavior generated by the client when the data are requested, and predicting the hotlinking according to the behavior data corresponding to the characteristic behavior generated by the client when the data are requested. When the normal client requests, the requests of each module are called according to the established sequence and frequency, and the number and frequency of the generated module requests are taken as the behavior characteristics of the client. Optionally, the usage rules determine whether each request satisfies the condition of a request that is a normal client. After the usage rules determine that each request satisfies the condition of the request being a normal client, each data request may be marked with a flag indicating the status of the requested data, e.g., a flag indicating a normal request, a flag indicating a hotlinking request, a flag indicating a suspected hotlinking request, etc. Optionally, each tag may be further subdivided to improve the accuracy with which the data processing model identifies the request.
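A rule of the kind described above, checking module call order and request counts against a normal client profile, could be sketched as follows. The module names, the expected profile and the tolerance are hypothetical values introduced for illustration.

```python
from collections import Counter


def matches_normal_profile(module_calls, expected_order, expected_counts, tolerance=1):
    """Check call order and per-module request counts against a normal profile.

    module_calls: module names in the order the client called them.
    expected_order: modules in the order a normal client first calls them.
    expected_counts: expected number of requests per module.
    """
    # Distinct modules in order of first appearance (dicts preserve insertion order).
    first_seen = list(dict.fromkeys(module_calls))
    if first_seen != list(expected_order):
        return False
    counts = Counter(module_calls)
    return all(abs(counts[m] - expected_counts[m]) <= tolerance for m in expected_order)
```

A request sequence that skips the advertisement module, for example, fails the order check and could then be flagged as a hotlinking or suspected hotlinking request.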

When training the data processing model, this embodiment may use, but is not limited to, a decision tree model as the base model to make classification predictions for the various states. Optionally, the importance of each factor of the samples that include the behavior data is calculated (the gini coefficient, the information gain rate, etc. may be used), the most important factor of the current samples is selected, and that factor is taken as the root node of the data processing model. The factor point that best separates the sample classes is then selected, and the sample data is split into two parts at that factor point, which serve as leaf nodes of the data processing model. These steps are repeated until the result reaches a preset condition. The preset condition may be a condition for grading the security level of the client's request, or may be that the data processing model reaches the required accuracy and recall rate.
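One step of the tree construction described above, choosing the factor and factor point that best split the samples, can be sketched as follows. The sketch assumes numeric factors and uses the Gini impurity commonly used by decision tree learners as the importance measure; the function names are hypothetical and the embodiment may use a different measure such as the information gain rate.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_i^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (factor_index, threshold, weighted_impurity) of the best binary split.

    rows is a list of equal-length numeric feature vectors. A sample goes to
    the left part when rows[i][factor_index] <= threshold. Returns None when
    no split separates the samples into two non-empty parts.
    """
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini_impurity(left)
                     + len(right) * gini_impurity(right)) / n
            if best is None or score < best[2]:
                best = (f, threshold, score)
    return best
```

Applying `best_split` recursively to each resulting part, until a stopping condition such as a maximum depth or a target accuracy is met, yields the full tree.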

This embodiment may also analyze the data processing model. After a data processing model whose accuracy and recall rate meet the standard is generated, a tester needs to carry out a logic analysis of the decision tree in the data processing model. The logic analysis yields the model logic of each class in the data processing model, and paths that may produce novel hotlinking behavior are marked for online use.

Logic of the decision tree subtree based on gini coefficients is described below.

FIG. 13 is a logic diagram of a decision tree according to an embodiment of the present invention. As shown in FIG. 13, samples represents the number of samples entering the subtree, and value represents the number of samples under each category (value_i represents the number of samples of the i-th category at this node). Optionally, when a leaf node has an absolutely dominant class, the corresponding path can be taken as the path result, i.e., the predicted result, of this leaf node. gini represents the gini coefficient. For a random variable X, if p(X) represents the probability density function of X and E[X] represents the expectation of X, then for n sample observations X_i of X, the expression of the gini coefficient is:

It should be noted that the numerical values corresponding to X[1], samples, value and gini in the embodiment shown in FIG. 13 are merely examples of the structure of the decision tree model according to the embodiment of the present invention, and do not limit the data processing model according to the embodiment of the present invention.
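As an illustration of how the samples, value and gini fields of a node relate to each other, the following sketch derives them from the per-category counts. It assumes the Gini impurity 1 - sum((value_i / samples)^2) that decision tree learners typically report, which may differ from the expression intended in this embodiment, and it uses a hypothetical dominance threshold of 0.9 to decide when a class is "absolutely dominant".

```python
def node_summary(value, dominance=0.9):
    """Summarize a tree node from its per-category sample counts.

    value[i] is the number of samples of the i-th category at the node.
    dominance is a hypothetical threshold on the share of the largest class.
    """
    samples = sum(value)
    shares = [v / samples for v in value]
    gini = 1.0 - sum(p * p for p in shares)
    top = max(range(len(value)), key=lambda i: value[i])
    # Use the dominant class as the prediction only when it clearly dominates.
    predicted = top if shares[top] >= dominance else None
    return {"samples": samples, "gini": gini, "predicted_class": predicted}
```

A node with counts `[95, 5]` has a low impurity and predicts category 0, while a node with counts `[50, 50]` has impurity 0.5 and no dominant class.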

In order to discover new hotlinking modes, this embodiment also defines several statistics that serve as indices of new hotlinking behavior on a leaf path, including a path generalization index and a path class index.

Path generalization index:

The larger the value of the path generalization index, the better the logical generalization capability of this leaf path.

Path class index:

The larger the path class index, the higher the class prediction accuracy of the path; the lower the path class index, the greater the degree of class confusion on the path. A high degree of confusion suggests that a new kind of hotlinking behavior is present.

Novel hotlinking mode scoring:

The higher the value of the new hotlinking pattern score, the more likely it is that the samples corresponding to the leaf node contain new hotlinking behavior.

After the data processing model is trained, assuming a tree depth of 8, the maximum number of leaf nodes is 2^(8-1) = 128. Nodes whose path class index is greater than k (k represents an accuracy threshold, for example 90%) are filtered out. The remaining leaf nodes of the data processing model are ranked by the new-behavior pattern index, the top-ranked nodes are marked as suspected new-behavior hotlinking nodes, and the corresponding sample data is taken out for operational analysis.
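The filter-and-rank step can be sketched as follows, assuming the path class index and the new hotlinking pattern score have already been computed for every leaf. The dictionary keys, the default threshold and the number of nodes kept are hypothetical; the comparison direction (nodes with an index greater than k are dropped) follows the apparatus description in this document.

```python
def max_leaf_nodes(depth):
    """Maximum number of leaves of a binary tree, counting the root as depth 1."""
    return 2 ** (depth - 1)


def mark_suspected_leaves(leaves, k=0.9, top_n=3):
    """Return ids of the leaves most likely to contain new hotlinking behavior.

    leaves is a list of dicts with the hypothetical keys 'id',
    'class_index' (path class index) and 'score' (new pattern score).
    """
    # Drop leaves whose class prediction is already accurate enough.
    kept = [leaf for leaf in leaves if leaf["class_index"] <= k]
    # Rank the remaining leaves by score, highest first.
    ranked = sorted(kept, key=lambda leaf: leaf["score"], reverse=True)
    return [leaf["id"] for leaf in ranked[:top_n]]
```

The sample data reaching the returned leaves would then be pulled for manual operational analysis.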

Optionally, the data processing model of this embodiment may be generated using a variety of evaluation methods, including but not limited to the information gain rate.

This embodiment may also use the data processing model for online prediction. The trained and analyzed decision tree model can be used to predict the multiple graded categories online, to block hotlinking requests and give feedback in real time, and to analyze and verify the online data of suspected new-behavior hotlinking nodes, thereby discovering novel hotlinking modes.

In this embodiment, after the big-data intelligent anti-hotlinking scheme was put into use, a number of novel hotlinking modes were encountered that the earlier models had difficulty identifying. The grading scheme provided by this embodiment of the invention refines the algorithm model on the basis of intelligent asynchronous anti-hotlinking, so that the model outputs a finer grading of results and assigns security properties according to the different grades. This yields more comprehensive and finer-grained anti-hotlinking logic: various combined blocking strategies can be applied according to the different categories, and suspected novel hotlinking can be identified, so that novel hotlinking modes are recognized and prevented in advance, thereby improving data processing efficiency.

In this embodiment, the behavior features may be continuously optimized, for example by adding the screenshot report made when an advertisement is played in the player, or the behavior of the user opening the home page, a detail page, and so on. The classifier may be chosen from a wider range of classification algorithms with multi-classification capability, according to how the data behaves. The scheme may be used in any scenario in which the requested data needs security detection and hotlinking requests need to be blocked, without limitation.

It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present invention is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present invention. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present invention.

From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.

Example 3

According to an embodiment of the present invention, there is also provided a data processing apparatus for implementing the above data processing method. Fig. 14 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention. As shown in fig. 14, the apparatus may include: an acquisition unit 10, a first processing unit 20, a second processing unit 30, and an operation unit 40.

The acquisition unit 10 is configured to acquire behavior data of the client.

The first processing unit 20 is configured to process the behavior data into model data of a data processing model, where the data processing model with the model data is used to grade the security level of a first request, and the first request is used by the client to request triggering of a target event.

The second processing unit 30 is configured to grade the security level of the first request through the data processing model with the model data, to obtain a grading result of the first request, where the grading result indicates the security level of the first request.

The operation unit 40 is configured to perform an operation corresponding to the grading result on the first request.

It should be noted that the acquisition unit 10 in this embodiment may be used to perform step S202 in embodiment 1 of the present application, the first processing unit 20 in this embodiment may be used to perform step S204 in embodiment 1 of the present application, the second processing unit 30 in this embodiment may be used to perform step S206 in embodiment 1 of the present application, and the operation unit 40 in this embodiment may be used to perform step S208 in embodiment 1 of the present application.
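The cooperation of the four units can be sketched as a small pipeline. This is only an illustration under stated assumptions: the `FirstRequest` type, the behavior counters, the grade names and the operation names are hypothetical, and the model is passed in as any callable that maps a vector to a grade.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class FirstRequest:
    client_id: str
    behavior: Dict[str, int]  # hypothetical per-module request counts


def acquire(request: FirstRequest) -> Dict[str, int]:
    """Acquisition unit 10: obtain the client's behavior data."""
    return request.behavior


def to_model_data(behavior: Dict[str, int], dimensions: List[str]) -> List[int]:
    """First processing unit 20: keep the chosen dimensions as a vector."""
    return [behavior.get(name, 0) for name in dimensions]


def grade(model: Callable[[List[int]], str], vector: List[int]) -> str:
    """Second processing unit 30: grade the security level of the request."""
    return model(vector)


def operate(grading_result: str, operations: Dict[str, str]) -> str:
    """Operation unit 40: pick the operation that matches the grading result."""
    return operations[grading_result]


def handle(request, model, dimensions, operations):
    """Run one request through units 10, 20, 30 and 40 in order."""
    vector = to_model_data(acquire(request), dimensions)
    return operate(grade(model, vector), operations)
```

In practice the `model` argument would be the trained decision tree, and `operations` would map each security level to a blocking strategy.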

The operation unit 40 includes a determining module, an acquiring module and an operating module. The determining module is used for determining the first request as a target request when the grading result is not an existing grading result in the data processing model; the acquiring module is used for acquiring a first operation instruction, where the first operation instruction is used for processing abnormal data in the target request, the operation corresponding to the grading result includes a first operation indicated by the first operation instruction, and the first operation instruction is also used for indicating that the first operation is to be performed on requests of the same category as the first request; and the operating module is used for performing the first operation on the first request and adding the first operation instruction to the data processing model.
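The handling of a grading result that the model does not yet know can be sketched as follows. The class name, the `get_instruction` callback (standing in for whatever supplies the first operation instruction, for example manual analysis) and the operation strings are assumptions made for the example.

```python
class OperationUnit:
    """Sketch of an operation unit that learns operations for new grades."""

    def __init__(self, known_operations):
        # Grading results the model already knows, mapped to their operations.
        self.known_operations = dict(known_operations)
        # Requests whose grading result was not yet known (target requests).
        self.target_requests = []

    def handle(self, request_id, grading_result, get_instruction):
        if grading_result not in self.known_operations:
            # Determining module: mark the request as a target request.
            self.target_requests.append(request_id)
            # Acquiring module: obtain the first operation instruction.
            instruction = get_instruction(request_id)
            # Operating module: store it so later requests of the same
            # category receive the same operation.
            self.known_operations[grading_result] = instruction
        return self.known_operations[grading_result]
```

Once an instruction has been stored for a grade, later requests with that grade reuse it without being flagged as target requests again.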

The first processing unit 20 includes a filtering module and a processing module. The filtering module is used for filtering the behavior data to obtain vector data of the data processing model, where the vector data includes a plurality of dimensions; and the processing module is used for processing the vector data into model data of the data processing model.

Optionally, the apparatus further comprises: a test unit and an adjustment unit. The testing unit is used for testing the model data after processing the behavior data into the model data of the data processing model to obtain a testing result; and the adjusting unit is used for feeding back the test result to the data processing model, and adjusting the data processing model through the test result to obtain an adjusted data processing model.

Optionally, the adjusting unit includes a verification module and an adjustment module. The verification module is used for verifying the test result to obtain a first verification result; and the adjustment module is used for adjusting parameters of the data processing model when the first verification result meets the first condition.

Optionally, the apparatus further includes an adding unit, configured to add a first identifier to the first request if the first request meets a second condition, when the security level of the first request is graded through the data processing model with the model data to obtain the grading result of the first request, where the first identifier is used for identifying the security state of the first request.

Optionally, the apparatus further comprises a first acquisition unit, a third processing unit, a fourth processing unit and a fifth processing unit. The first acquisition unit is used for acquiring a target sample of the data processing model and a sample observation value of the target sample when the behavior data is processed into model data of the data processing model; the third processing unit is used for processing the number of samples and the number of sample observation values of the target sample into a first index of the data processing model, where the target sample includes the behavior data, and the first index is used for indicating the generalization capability of a data processing path in the data processing model in processing requests of the client; the fourth processing unit is used for processing, in the target sample, the number of samples corresponding to each category of the target sample and the number of samples of the target sample into a second index of the data processing model, where the second index is used for indicating the accuracy of determining the category of the request of the client; and the fifth processing unit is used for processing the first index and the second index into a class score of the data processing model, where the class score is used for indicating the degree to which each class of the target sample is the target class.

Optionally, the third processing unit is configured to obtain the first index A by the following first formula:

Optionally, the fourth processing unit is configured to obtain the second index B by the following second formula:

Optionally, the fifth processing unit is configured to obtain the category score C by the following third formula:

Optionally, the apparatus further comprises a filtering unit, a sorting unit and a marking unit. The filtering unit is used for filtering out, after the first index and the second index are processed into the class score of the data processing model, the nodes in the data processing model whose second index is larger than the target value; the sorting unit is used for sorting the remaining nodes in the data processing model according to their class scores to obtain a sorting result; and the marking unit is used for marking the nodes at preset positions in the sorting result as target nodes, where the behavior data corresponding to a target node is to be determined as hotlinking behavior data of the client.

Optionally, the apparatus further comprises a second acquisition unit, a verification unit and a determination unit. The second acquisition unit is used for acquiring, after the nodes at preset positions in the sorting result are marked as target nodes, online data corresponding to a target node, where the online data is the behavior data of the client corresponding to the target node when online prediction is performed with the data processing model; the verification unit is used for verifying the online data corresponding to the target node to obtain a second verification result; and the determination unit is used for determining the category of the data corresponding to the target node as the target category when the second verification result meets the third condition.

In this embodiment, the acquisition unit 10 acquires the behavior data of the client. The first processing unit 20 processes the behavior data into model data of the data processing model, where the data processing model with the model data is used to grade the security level of the first request, and the first request is used by the client to request triggering of the target event. The second processing unit 30 grades the security level of the first request through the data processing model with the model data to obtain a grading result of the first request, where the grading result indicates the security level of the first request. The operation unit 40 performs the operation corresponding to the grading result on the first request. In this way the security level of the client's request can be identified and the operation corresponding to that security level can be taken, which achieves the technical effect of improving data processing efficiency and solves the technical problem of low data processing efficiency in the related art.

It should be noted here that the examples and application scenarios implemented by the above units and modules are the same as those of the corresponding steps, but are not limited to what is disclosed in embodiment 1 above. It should also be noted that the above modules, as part of the apparatus, may run in the hardware environment shown in FIG. 1 and may be implemented in software or in hardware, where the hardware environment includes a network environment.

Example 4

According to an embodiment of the present invention, an electronic device for implementing the above data processing method is also provided.

Fig. 15 is a block diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 15, the electronic device may include: one or more (only one is shown) processors 151, memory 153. Optionally, as shown in fig. 15, the electronic apparatus may further include a transmission device 155 and an input-output device 157.

The memory 153 may be used to store software programs and modules, such as program instructions/modules corresponding to the data processing methods and apparatuses in the embodiments of the present invention, and the processor 151 executes the software programs and modules stored in the memory 153 to perform various functional applications and data processing, that is, implement the data processing methods described above. Memory 153 may include high-speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory. In some examples, memory 153 may further include memory located remotely from processor 151, which may be connected to electronic devices through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device 155 is used for receiving or transmitting data via a network, and may also be used for data transmission between the processor and the memory. Specific examples of the network may include wired networks and wireless networks. In one example, the transmission device 155 includes a network adapter (Network Interface Controller, NIC) that can connect to other network devices and routers via a network cable to communicate with the internet or a local area network. In one example, the transmission device 155 is a radio frequency (Radio Frequency, RF) module used to communicate with the internet wirelessly.

Specifically, the memory 153 is used for storing application programs.

The processor 151 may invoke, via the transmission device 155, the application program stored in the memory 153 to perform the following steps:

acquiring behavior data of a client;

processing the behavior data into model data of a data processing model, wherein the data processing model is provided with the model data and is used for carrying out hierarchical processing on the security level of a first request, and the first request is used for a client to request to trigger a target event;

grading the security level of the first request through a data processing model with model data to obtain a grading result of the first request, wherein the grading result is used for indicating the security level of the first request;

and performing an operation corresponding to the grading result on the first request.

The processor 151 is further configured to perform the steps of: determining the first request as a target request under the condition that the grading result is not an existing grading result in the data processing model; acquiring a first operation instruction, wherein the first operation instruction is used for processing abnormal data in a target request, the operation corresponding to the grading result comprises a first operation indicated by the first operation instruction, and the first operation instruction is also used for indicating to perform the first operation on the request with the same category as the first request; and performing a first operation indicated by the first operation instruction on the first request, and adding the first operation instruction into the data processing model.

The processor 151 is further configured to perform the steps of: filtering the behavior data to obtain vector data of a data processing model, wherein the vector data comprises a plurality of dimensions; the vector data is processed into model data of a data processing model.

The processor 151 is further configured to perform the steps of: after the behavior data are processed into model data of a data processing model, testing the model data to obtain a test result; and feeding back the test result to the data processing model, and adjusting the data processing model through the test result to obtain an adjusted data processing model.

The processor 151 is further configured to perform the steps of: verifying the test result to obtain a first verification result; and adjusting parameters of the data processing model under the condition that the first verification result meets the first condition.

The processor 151 is further configured to perform the steps of: and when the security level of the first request is subjected to grading processing through a data processing model with model data to obtain a grading result of the first request, adding a first identifier to the first request under the condition that the first request meets a second condition, wherein the first identifier is used for identifying the security state of the first request.

The processor 151 is further configured to perform the steps of: when the behavior data are processed into model data of a data processing model, obtaining a target sample of the data processing model and a sample observation value of the target sample; processing the number of samples and the number of sample observations of the target sample as a first index of the data processing model, wherein the target sample comprises behavior data, the first index being used to indicate a generalization capability of the client to request processing by a data processing path in the data processing model; in the target samples, processing the sample number corresponding to each category of the target samples and the sample number of the target samples into a second index of the data processing model, wherein the second index is used for indicating the accuracy of determining the category of the request of the client; and processing the first index and the second index into class scores of the data processing model, wherein the class scores are used for indicating the degree that each class of the target sample is the target class.

The processor 151 is further configured to perform the steps of: the first index A is obtained by the following first formula:

The processor 151 is further configured to perform the steps of: the second index B is obtained by the following second formula:

The processor 151 is further configured to perform the steps of: obtaining a category score C by a third formula:

The processor 151 is further configured to perform the steps of: after the first index and the second index are processed into class scores of the data processing model, nodes corresponding to the second index which is larger than the target value are filtered out from the data processing model; sorting the filtered nodes in the data processing model according to class scores corresponding to the filtered nodes to obtain sorting results; and marking the nodes ranked as the preset sequences as target nodes in the sorting result, wherein the behavior data corresponding to the target nodes are to be determined as the hotlinking behavior data of the client.

The processor 151 is further configured to perform the steps of: after marking the node ranked as the preset sequence as a target node, acquiring online data corresponding to the target node, wherein the online data is behavior data of a client corresponding to the target node when online prediction is performed on a data processing model; verifying online data corresponding to the target node to obtain a second verification result; and under the condition that the second verification result meets the third condition, determining the class of the data corresponding to the target node as the target class.

By adopting the embodiment of the invention, a scheme for data processing is provided. Acquiring behavior data of a client; processing the behavior data into model data of a data processing model, wherein the data processing model is provided with the model data and is used for carrying out hierarchical processing on the security level of a first request, and the first request is used for a client to request to trigger a target event; grading the security level of the first request through a data processing model with model data to obtain a grading result of the first request, wherein the grading result is used for indicating the security level of the first request; and performing an operation corresponding to the grading result on the first request. The security level of the request of the client can be identified, and then the operation corresponding to the security level is adopted, so that the technical effect of improving the data processing efficiency is achieved, and the technical problem of low data processing efficiency in the related technology is solved.

Optionally, for specific examples of this embodiment, reference may be made to the examples described in the foregoing embodiments, which are not repeated here.

It will be appreciated by those skilled in the art that the structure shown in FIG. 15 is merely illustrative, and the electronic device may be a smart phone (such as an Android phone, an iOS phone, etc.), a tablet computer, a palmtop computer, a mobile internet device (Mobile Internet Device, MID), a PAD, etc. FIG. 15 does not limit the structure of the electronic device. For example, the electronic device may also include more or fewer components (e.g., network interfaces, display devices, etc.) than shown in FIG. 15, or have a different configuration from that shown in FIG. 15.

Those of ordinary skill in the art will appreciate that all or a portion of the steps in the various methods of the above embodiments may be implemented by a program instructing hardware related to the electronic device. The program may be stored in a computer-readable storage medium, and the storage medium may include: a flash disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, and the like.

Example 5

The embodiment of the invention also provides a storage medium. Optionally, in this embodiment, the storage medium may be used to store program code for executing the data processing method.

Alternatively, in this embodiment, the storage medium may be located on at least one network device of the plurality of network devices in the network shown in the above embodiment.

Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of:

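acquiring behavior data of a client;

processing the behavior data into model data of a data processing model, wherein the data processing model is provided with the model data and is used for carrying out hierarchical processing on the security level of a first request, and the first request is used for a client to request to trigger a target event;

grading the security level of the first request through a data processing model with model data to obtain a grading result of the first request, wherein the grading result is used for indicating the security level of the first request;

and performing an operation corresponding to the grading result on the first request.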

Optionally, the storage medium is further arranged to store program code for performing the steps of: determining the first request as a target request under the condition that the grading result is not an existing grading result in the data processing model; acquiring a first operation instruction, wherein the first operation instruction is used for processing abnormal data in a target request, the operation corresponding to the grading result comprises a first operation indicated by the first operation instruction, and the first operation instruction is also used for indicating to perform the first operation on the request with the same category as the first request; and performing a first operation indicated by the first operation instruction on the first request, and adding the first operation instruction into the data processing model.

Optionally, the storage medium is further arranged to store program code for performing the steps of: filtering the behavior data to obtain vector data of a data processing model, wherein the vector data comprises a plurality of dimensions; the vector data is processed into model data of a data processing model.

Optionally, the storage medium is further arranged to store program code for performing the steps of: after the behavior data are processed into model data of a data processing model, testing the model data to obtain a test result; and feeding back the test result to the data processing model, and adjusting the data processing model through the test result to obtain an adjusted data processing model.

Optionally, the storage medium is further arranged to store program code for performing the steps of: verifying the test result to obtain a first verification result; and adjusting parameters of the data processing model under the condition that the first verification result meets the first condition.

Optionally, the storage medium is further arranged to store program code for performing the steps of: and when the security level of the first request is subjected to grading processing through a data processing model with model data to obtain a grading result of the first request, adding a first identifier to the first request under the condition that the first request meets a second condition, wherein the first identifier is used for identifying the security state of the first request.

Optionally, the storage medium is further arranged to store program code for performing the steps of: when the behavior data are processed into model data of a data processing model, obtaining a target sample of the data processing model and a sample observation value of the target sample; processing the number of samples and the number of sample observations of a target sample into a first index of a data processing model, wherein the target sample comprises behavior data, and the first index is used for indicating the generalization capability of a client to process requests by a data processing path in the data processing model; in the target samples, processing the sample number corresponding to each category of the target samples and the sample number of the target samples into a second index of the data processing model, wherein the second index is used for indicating the accuracy of determining the category of the request of the client; and processing the first index and the second index into class scores of the data processing model, wherein the class scores are used for indicating the degree that each class of the target sample is the target class.

Optionally, the storage medium is further arranged to store program code for performing the steps of: the first index A is obtained by the following first formula:

Optionally, the storage medium is further arranged to store program code for performing the steps of: the second index B is obtained by the following second formula:

Optionally, the storage medium is further arranged to store program code for performing the steps of: obtaining a category score C by a third formula:

where samples represents the number of samples of the target sample, n represents the number of sample observations of the target sample, and maxvalue_i represents the number of samples under the i-th category of the target sample, that number being larger than the number of samples corresponding to any category of the target sample other than the i-th category.

Optionally, the storage medium is further arranged to store program code for performing the following steps: after the first index and the second index are processed into the class score of the data processing model, filtering out, from the data processing model, the nodes whose second index is larger than a target value; sorting the filtered nodes in the data processing model according to the class scores corresponding to the filtered nodes to obtain a sorting result; and marking, in the sorting result, the nodes ranked in the preset sequence as target nodes, wherein the behavior data corresponding to the target nodes is to be determined as hotlinking behavior data of the client.
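The filter, sort, and mark steps above can be sketched as follows. This is a minimal sketch under three assumptions that the text does not state outright: "filtered out" is read as "selected for further processing", the sort is in descending order of class score, and the "preset sequence" is read as the first `top_k` positions of the sorting result. The `Node` type, its field names, and the function `mark_target_nodes` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: str
    second_index: float  # second index B of the node
    class_score: float   # class score C of the node


def mark_target_nodes(nodes, target_value, top_k):
    """Select candidate target nodes from the data processing model.

    1. Keep nodes whose second index is larger than target_value.
    2. Sort the kept nodes by class score (assumed: highest first).
    3. Mark the first top_k nodes of the sorting result as target nodes.
    """
    kept = [n for n in nodes if n.second_index > target_value]
    ranked = sorted(kept, key=lambda n: n.class_score, reverse=True)
    return ranked[:top_k]


# Hypothetical nodes and thresholds
nodes = [
    Node("a", 0.95, 10.0),
    Node("b", 0.50, 99.0),  # dropped: second index not above the target value
    Node("c", 0.92, 12.0),
    Node("d", 0.99, 3.0),
]
targets = mark_target_nodes(nodes, target_value=0.9, top_k=2)
target_ids = [n.node_id for n in targets]
```

Filtering on the second index before sorting keeps a node with a high class score but a mixed category distribution, such as node "b" here, from being marked as a target node.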

Optionally, the storage medium is further arranged to store program code for performing the steps of: after marking the node ranked as the preset sequence as a target node, acquiring online data corresponding to the target node, wherein the online data is behavior data of a client corresponding to the target node when online prediction is performed on a data processing model; verifying online data corresponding to the target node to obtain a second verification result; and under the condition that the second verification result meets the third condition, determining the class of the data corresponding to the target node as the target class.

Optionally, in the present embodiment, the storage medium may include, but is not limited to: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing program code.

The serial numbers of the foregoing embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

The integrated units in the above embodiments may be stored in the above-described computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention.

In the foregoing embodiments of the present invention, each embodiment is described with its own emphasis. For a part that is not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.

In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The apparatus embodiments described above are merely exemplary. For example, the division into units is merely a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces, units, or modules, and may be electrical or in other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed across a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make improvements and modifications without departing from the principles of the present invention, and such improvements and modifications shall also fall within the protection scope of the present invention.

Claims

1. A method of data processing, comprising:

acquiring behavior data of a client;

processing the behavior data into model data of a data processing model, wherein the data processing model with the model data is used for grading the security level of a first request, and the first request is used by the client to request triggering of a target event;

grading the security level of the first request through the data processing model with the model data to obtain a grading result of the first request, wherein the grading result is used for indicating the security level of the first request; performing an operation corresponding to the grading result on the first request;

filtering out, from the data processing model, nodes whose second index is larger than a target value, wherein the second index is a numerical value obtained by processing the number of samples corresponding to each category of a target sample and the number of samples of the target sample; sorting the filtered nodes in the data processing model according to class scores corresponding to the filtered nodes to obtain a sorting result, wherein the class scores are used for indicating the degree to which each category of the target sample is a target category; marking the node ranked as a preset sequence in the sorting result as a target node, wherein the target node is a suspected new-behavior hotlinking node; wherein the second index is a ratio of a first parameter to a second parameter, the first parameter is used for indicating the number of samples under the i-th category of the target sample, that number being larger than the number of samples under each of the categories other than the i-th category among a plurality of categories of the target sample, and the second parameter is used for indicating the number of samples of the target sample.

2. The method of claim 1, wherein the performing an operation corresponding to the grading result on the first request comprises:

determining that the first request is a target request in a case where the grading result is not a grading result existing in the data processing model;

acquiring a first operation instruction, wherein the first operation instruction is used for processing abnormal data in the target request, the operation corresponding to the grading result comprises a first operation indicated by the first operation instruction, and the first operation instruction is also used for indicating to perform the first operation on a request with the same category as the first request;

performing the first operation on the first request, and adding the first operation instruction to the data processing model.

3. The method of claim 1, wherein processing the behavior data into model data of a data processing model comprises:

filtering the behavior data to obtain vector data of the data processing model, wherein the vector data comprises a plurality of dimensions;

processing the vector data into the model data of the data processing model.

4. The method of claim 1, wherein after said processing the behavior data into model data of a data processing model, the method further comprises:

testing the model data to obtain a test result;

and feeding the test result back to the data processing model, and adjusting the data processing model through the test result to obtain the adjusted data processing model.

5. The method of claim 4, wherein adjusting the data processing model by the test results to obtain the adjusted data processing model comprises:

verifying the test result to obtain a first verification result;

and adjusting parameters of the data processing model under the condition that the first verification result meets a first condition.

6. The method of claim 1, wherein, when grading the security level of the first request through the data processing model with the model data, the method further comprises:

and adding a first identifier to the first request under the condition that the first request meets a second condition, wherein the first identifier is used for identifying the security state of the first request.

7. The method according to any one of claims 1 to 6, wherein in the processing of the behavior data into model data of a data processing model, the method further comprises:

acquiring a target sample of the data processing model and sample observation values of the target sample;

processing the number of samples of the target sample and the number of sample observations into a first index of the data processing model, wherein the target sample comprises the behavior data, and the first index is used for indicating the generalization capability of a data processing path in the data processing model when processing requests of the client;

processing the sample number corresponding to each category of the target sample and the sample number of the target sample into a second index of the data processing model, wherein the second index is used for indicating the accuracy of determining the category of the request of the client;

and processing the first index and the second index into class scores of the data processing model, wherein the class scores are used for indicating the degree that each class of the target sample is a target class.

8. The method of claim 7, wherein the processing the number of samples of the target sample and the number of sample observations into a first index of the data processing model comprises: obtaining the first index A by a first formula:

9. The method of claim 7, wherein processing the first index and the second index into a class score for the data processing model comprises: obtaining the category score C by a third formula:

wherein samples denotes the number of samples of the target sample, n denotes the number of sample observations of the target sample, and maxvalue_i denotes the number of samples under the i-th category of the target sample, that number being larger than the number of samples corresponding to each of the categories other than the i-th category among the plurality of categories of the target sample.

10. The method of claim 1, wherein after the marking the nodes ranked as the preset sequence as target nodes, the method further comprises:

acquiring online data corresponding to the target node, wherein the online data is behavior data of the client corresponding to the target node when online prediction is performed on the data processing model;

verifying the online data corresponding to the target node to obtain a second verification result;

and determining the category of the data corresponding to the target node as the target category in a case where the second verification result meets a third condition.

11. A data processing apparatus, comprising:

the acquisition unit is used for acquiring behavior data of the client;

a first processing unit, configured to process the behavior data into model data of a data processing model, wherein the data processing model with the model data is used for grading the security level of a first request, and the first request is used by the client to request triggering of a target event;

a second processing unit, configured to grade the security level of the first request through the data processing model with the model data to obtain a grading result of the first request, wherein the grading result is used for indicating the security level of the first request;

an operation unit, configured to perform an operation corresponding to the grading result on the first request;

the data processing apparatus is further configured to: filter out, from the data processing model, nodes whose second index is larger than a target value, wherein the second index is a numerical value obtained by processing the number of samples corresponding to each category of a target sample and the number of samples of the target sample; sort the filtered nodes in the data processing model according to class scores corresponding to the filtered nodes to obtain a sorting result, wherein the class scores are used for indicating the degree to which each category of the target sample is a target category; and mark the node ranked as a preset sequence in the sorting result as a target node, wherein the target node is a suspected new-behavior hotlinking node; wherein the second index is a ratio of a first parameter to a second parameter, the first parameter is used for indicating the number of samples under the i-th category of the target sample, that number being larger than the number of samples under each of the categories other than the i-th category among a plurality of categories of the target sample, and the second parameter is used for indicating the number of samples of the target sample.

12. A storage medium comprising a stored program, wherein the program when run performs the data processing method of any one of claims 1 to 10.

13. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor performs the data processing method according to any of the claims 1 to 10 by means of the computer program.