CN105868991B - Method and device for identifying machine-assisted cheating - Google Patents

Info

Publication number
CN105868991B
Authority
CN
China
Prior art keywords
sequence
time
time interval
cheating
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510031927.6A
Other languages
Chinese (zh)
Other versions
CN105868991A (en)
Inventor
郑丹丹
林述民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201510031927.6A
Publication of CN105868991A
Application granted
Publication of CN105868991B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a method and a device for identifying machine-assisted cheating, wherein the method comprises the following steps: generating a time interval sequence according to the operation time of the equipment or account to be evaluated; discretizing the time interval sequence to form a discretized time interval sequence; calculating an entropy rate of the sequence of discretized time intervals; when the entropy rate is smaller than a preset first threshold value, identifying that the equipment or the account to be evaluated uses machine-assisted cheating during operation. According to the application, by calculating the entropy rate of the discretization time interval sequence, rhythmic operation or non-rhythmic operation can be accurately distinguished, so that machine-assisted cheating can be accurately identified.

Description

Method and device for identifying machine-assisted cheating
Technical Field
The present application relates to the field of computer technologies, and in particular to a method and an apparatus for identifying whether a device or an account uses machine-assisted cheating during operation.
Background
To attract traffic and customers, large internet enterprises often provide substantial marketing resources, such as gold coins awarded at login and red envelopes given at new-user registration. These resources attract lawbreakers who use machine-assisted cheating to seize them and even resell them to the enterprise's competitors, so that the enterprise not only fails to attract traffic with its marketing resources but also ends up benefiting its competitors.
In the prior art, one of the following methods can be generally adopted to determine whether an account or a device is suspected to have machine-assisted cheating:
Method 1: count the operation frequency of an account within a certain time period; if the frequency exceeds a frequency threshold, the account is considered abnormal. However, this method cannot distinguish whether the abnormality is a system operation abnormality encountered by a normal user or is caused by a cheating account using a machine.
Method 2: count the number of accounts registered on the same device; when the number of accounts exceeds a certain threshold and the operation frequency on that device within a certain time period is abnormal, the situation is considered abnormal.
Method 3: calculate the range of variation of the operation time intervals of the account or device to judge whether machine-assisted cheating is suspected. A machine-assisted account or device typically operates at a nearly fixed time interval, for example logging in every 10 s, so that the variation of the time interval is close to 0. The variation of the operation time interval can generally be measured with statistical variables such as entropy or the Coefficient of Variation (CV).
During the research and practice of the prior art, the inventor finds that the methods respectively have the following problems:
For method 1: the frequency threshold is difficult to determine. It is generally set very large, so cheating accounts can easily stay under it; if it is set relatively small, it becomes impossible to distinguish among a system operation abnormality encountered by a normal user, an abnormality caused by marketing activities or similar reasons, and an abnormality caused by a cheating account using a machine.
For method 2: first, determining whether operations come from the same device is itself a complex problem. The same device is often identified from information such as a mobile phone number, IP, MAC, umid and/or tmid, but because of business factors, data quality, data certainty and similar problems, this information cannot always determine accurately whether the device is the same. In addition, a device is often shared by a family, and each person may have forgotten accounts or created several, resulting in a large number of accounts. Judging machine cheating simply from the number of accounts on the same mobile phone number/IP/MAC/umid/tmid exceeding a threshold is therefore not only inaccurate but also harms legitimate users.
For method 3: statistical variables such as entropy and CV discard the order information of the sequence. For example, the sequence 123412341234 is more rhythmic, and hence more likely machine-generated, than the sequence 122321123133, even though its entropy and CV values are larger. Method 3 is therefore also prone to misjudgment.
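The point about lost order information can be checked numerically. The following is a minimal Python sketch, not part of the patent; the function names are hypothetical, and base-2 logarithms and the population standard deviation are assumptions made for the illustration.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def shannon_entropy(symbols):
    """Shannon entropy in bits of the symbol frequencies; ordering is ignored."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


# The two sequences quoted in the text, read digit by digit.
periodic = [int(ch) for ch in "123412341234"]
irregular = [int(ch) for ch in "122321123133"]

for name, seq in (("periodic", periodic), ("irregular", irregular)):
    print(name, round(shannon_entropy(seq), 3), round(coefficient_of_variation(seq), 3))
```

Both statistics depend only on how often each value occurs, so any reshuffling of either sequence would give the same numbers; the strictly periodic sequence still scores higher on both.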
Disclosure of Invention
The embodiment of the application aims to provide a method and a device for identifying machine-assisted cheating so as to accurately judge whether the device or account to be evaluated is suspected of having machine-assisted cheating during operation.
In order to solve the above technical problem, an embodiment of the present application provides a method for identifying machine-assisted cheating, including:
generating a time interval sequence according to the operation time of the equipment or account to be evaluated;
discretizing the time interval sequence to form a discretized time interval sequence;
calculating an entropy rate of the sequence of discretized time intervals;
when the entropy rate is smaller than a preset first threshold value, identifying that the equipment or the account to be evaluated uses machine-assisted cheating during operation.
The device for identifying machine-assisted cheating provided by the embodiment of the application comprises the following units:
a generating unit, configured to generate a time interval sequence according to the operation time of the equipment or account to be evaluated;
a discretization unit, configured to discretize the time interval sequence to form a discretized time interval sequence;
a calculating unit, configured to calculate an entropy rate of the discretized time interval sequence;
an identifying unit, configured to identify that the equipment or account to be evaluated uses machine-assisted cheating during operation when the entropy rate is less than a preset first threshold.
According to the technical scheme provided by the embodiments of the application, the interval sequence of the operation times of a device or account in links such as registration, login and activation is analyzed, an entropy rate reflecting the regularity and rhythmicity of the interval sequence is calculated, and whether the account or device is suspected of machine-assisted cheating is judged accordingly.
Drawings
To illustrate the embodiments of the present application and the technical solutions of the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some of the embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart diagram of a method of identifying machine-assisted cheating according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of generating a time interval sequence according to an embodiment of the present application;
FIG. 3 is an exemplary diagram of a first time series of an embodiment of the present application;
FIG. 4 is a diagram of an example of a sequence of time intervals for an embodiment of the present application;
FIG. 5 is a schematic flow chart of calculating entropy rate for a sequence of discretized time intervals in an embodiment of the subject application;
FIG. 6 is a schematic diagram of the conditional entropy CE and the entropy rate as a function of the order L;
FIG. 7 is a schematic diagram of the conditional entropy CE, the offset, and the corrected conditional entropy CCE as a function of the order L;
FIG. 8 is a schematic diagram of the composition of an apparatus for identifying machine-assisted cheating according to an embodiment of the present application;
FIG. 9 is a schematic diagram of the composition of a generating unit according to an embodiment of the present application;
FIG. 10 is a schematic diagram of the composition of a calculating unit according to an embodiment of the present application.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
The embodiment of the application provides a method for identifying machine-assisted cheating, which is used for identifying whether equipment or an account to be evaluated is suspected of having machine-assisted cheating during operation. Fig. 1 is a schematic flowchart of a method for identifying machine-assisted cheating according to the present embodiment, and as shown in fig. 1, the method may include:
S101: generating a time interval sequence according to the operation time of the equipment or account to be evaluated;
S102: discretizing the time interval sequence to form a discretized time interval sequence;
S103: calculating an entropy rate of the sequence of discretized time intervals;
S104: when the entropy rate is smaller than a preset first threshold value, identifying that the equipment or the account to be evaluated uses machine-assisted cheating during operation.
In this embodiment, the regularity and rhythmicity of the time interval sequence are evaluated by analyzing the interval sequence of the operation times of the device or account to be evaluated and calculating its entropy rate, so as to determine whether the account or device is suspected of machine-assisted cheating.
Next, each step will be described separately.
Fig. 2 is a schematic flow chart of step S101. As shown in fig. 2, step S101 may include:
S201: extracting the operation time of all accounts on the equipment to be evaluated or the operation time of the accounts to be evaluated on all the equipment to generate a first time sequence;
S202: reordering the first time sequence according to the sequence of time to form a second time sequence;
S203: extracting time intervals of adjacent time in the second time sequence to form a time interval sequence.
In S201, for a device to be evaluated, operation times of all accounts on the device may be extracted, where the operation times may be times when operations such as account registration, login, and activation occur, so as to generate a first time sequence corresponding to the device to be evaluated. In this embodiment, the device to be evaluated may be marked according to information such as a mobile phone number, ip, mac, umid, and/or tmid, so as to facilitate extracting the operation time of the account on the device.
For example, on a device to be evaluated with a device number of IC6F65B6EFBB, hundreds of accounts are registered in total, and based on the time of operations such as registration, login, and activation of these accounts, a first time series as shown in fig. 3 may be generated, and of course, fig. 3 only shows a part of the first time series.
In S201 of this embodiment, for the account to be evaluated, the time when the account registration, login, activation, and other operations of the account on all devices occur may be extracted, so as to generate the first time series.
In step S202, the times in the first time sequence may be reordered chronologically to generate a second time sequence. In the second time sequence, earlier times may be placed either first (ascending order) or last (descending order).
In S203, time intervals of adjacent times in the second time series may be calculated to form a time interval series, for example, a difference between the time and a previous time may be calculated for each time in the second time series, thereby forming a time interval series as shown in the rightmost column of fig. 4, where the first term of the time interval series is zero. It should be noted that the present embodiment is not limited to this, and other methods may be used to form the time interval sequence, for example, calculating the difference between each time in the second time sequence and the next time to form the time interval sequence.
In addition, in this embodiment, the number of elements of the generated time interval sequence may also be checked. When the number of elements is less than a preset threshold, the time interval sequence is not processed further; instead, another device or account to be evaluated is selected and step S101 is performed again. The preset threshold may be, for example, 20 or another value. This safeguards the validity of the entropy rate calculation, because a time interval sequence with too few elements may indicate that the device or account is not active enough to provide sufficient information for determining whether machine-assisted cheating exists.
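Steps S202 and S203, together with the element-count check, might be sketched as follows. This is an illustrative Python sketch only; the function names and the choice of seconds as the interval unit are assumptions, and 20 is simply the example threshold mentioned above.

```python
from datetime import datetime

MIN_ELEMENTS = 20  # example value from the text; other values may be used


def interval_sequence(operation_times):
    """Sort the operation times and return the gaps in seconds.

    The first term is zero, matching the variant in which each time is
    compared with the previous one.
    """
    ordered = sorted(operation_times)  # second time sequence (S202)
    if not ordered:
        return []
    intervals = [0.0]
    for previous, current in zip(ordered, ordered[1:]):
        intervals.append((current - previous).total_seconds())
    return intervals


def has_enough_elements(intervals, minimum=MIN_ELEMENTS):
    """True when the sequence is long enough to be worth analysing."""
    return len(intervals) >= minimum


# Hypothetical, unsorted operation times for one device.
times = [
    datetime(2015, 1, 1, 0, 0, 30),
    datetime(2015, 1, 1, 0, 0, 0),
    datetime(2015, 1, 1, 0, 0, 10),
]
example_intervals = interval_sequence(times)
print(example_intervals, has_enough_elements(example_intervals))
```

With only three operation times the check fails, so under the rule described above this device would be skipped and another one selected.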
Next, in step S102, the time interval series generated in step S203 is subjected to discretization processing with an hour, day, or month as a criterion to generate a discretized time interval series.
In general, when calculating the entropy rate of a sequence, the sequence is discretized into buckets of width "(max-min)/number of buckets", which is very effective for sequences whose upper and lower limit values are relatively stable.
However, for the time interval sequence of the present embodiment, the span of the time intervals may be very large or very small, and the upper and lower limit values are not stable; moreover, some machine-assisted cheating is done at very high frequency, while other machine-assisted cheating is spread over days or weeks in the manner of a timed task in order to bypass anti-cheating rules. If "(max-min)/number of buckets" were still used to discretize the time interval sequence, the following two problems would be encountered:
1. when the upper limit value of the time intervals is too large, the buckets are too coarse and comparatively small time intervals cannot be distinguished from one another;
2. when the upper limit value of the time intervals is too small, the time interval sequence is divided too finely.
An upper limit that is too large or too small may itself be associated with machine-assisted cheating, which is exactly the case to be detected. In this embodiment, the time interval sequence is therefore divided, that is, discretized, using the hour, day or month as the division unit to form the discretized time interval sequence, which both matches human intuition and meets the needs of anti-cheating.
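One possible reading of discretizing "by hour, day or month" is to map each interval to the number of whole units it spans. The Python sketch below assumes that reading; the function name, the interval unit (seconds) and the 30-day month are all assumptions, since the text does not fix these details.

```python
# Unit sizes in seconds. Treating a month as 30 days is an assumption.
UNIT_SECONDS = {"hour": 3600, "day": 86400, "month": 30 * 86400}


def discretize(intervals_seconds, unit="hour"):
    """Map each interval (in seconds) to the count of whole units it spans."""
    size = UNIT_SECONDS[unit]
    return [int(seconds // size) for seconds in intervals_seconds]


# Hypothetical intervals: 0 s, 30 min, 1 h, about 2 h, 25 h.
sample = [0, 1800, 3600, 7300, 90000]
print(discretize(sample, "hour"))  # [0, 0, 1, 2, 25]
print(discretize(sample, "day"))   # [0, 0, 0, 0, 1]
```

Unlike "(max-min)/number of buckets", the bucket width here does not depend on the extremes of the particular sequence, so a single very long gap does not collapse all shorter gaps into one bucket.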
In step S103, the entropy rate of the discretized time interval sequence is calculated. In this embodiment, a different entropy rate estimate may be obtained for each segmentation result of the discretized time interval sequence, and the minimum of these estimates may be taken as the entropy rate of the sequence. Of course, the present embodiment is not limited to this, and other methods may be adopted to calculate the entropy rate of the discretized time interval sequence.
Fig. 5 is a schematic flow chart of step S103. As shown in fig. 5, calculating the entropy rate of the sequence of discretized time intervals can include the steps of:
Step S501: according to different orders, performing segmentation processing corresponding to each order on the discretized time interval sequence to form different segmentation results;
Step S502: calculating the corrected conditional entropy corresponding to each segmentation result;
Step S503: determining the minimum value of the corrected conditional entropies corresponding to the segmentation results as the entropy rate of the discretized time interval sequence.
In step S501, the discretized time interval sequence can be segmented at orders 1, 2, ..., K respectively. Segmenting at order L yields m-L+1 segments, where m represents the total number of elements of the discretized time interval sequence, L represents the order, and 1 ≤ L ≤ K.
For example, for the discretized time interval sequence "12312345678", the order-1, order-2 and order-3 segmentation results are:
Order-1 segmentation: 1, 2, 3, 1, 2, 3, 4, 5, 6, 7, 8
Order-2 segmentation: 12, 23, 31, 12, 23, 34, 45, 56, 67, 78
Order-3 segmentation: 123, 231, 312, 123, 234, 345, 456, 567, 678
In the above example, the total number m of elements of the discretized time interval sequence is 11; the order-1 segmentation result includes 11 elements, the order-2 result includes 10 elements, and the order-3 result includes 9 elements.
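The segmentation in the example is a sliding window of length L. A minimal Python sketch (the function name `segment` is hypothetical) that reproduces the three results:

```python
def segment(sequence, order):
    """Return the m - order + 1 overlapping windows of length `order`."""
    if order < 1 or order > len(sequence):
        raise ValueError("order must be between 1 and the sequence length")
    return [tuple(sequence[i:i + order]) for i in range(len(sequence) - order + 1)]


sequence = "12312345678"
for order in (1, 2, 3):
    windows = ["".join(w) for w in segment(sequence, order)]
    print(order, len(windows), windows)
```

The printed window counts are 11, 10 and 9, in line with m-L+1 for m = 11.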
In step S502, an estimate of the Entropy Rate (ER) is calculated for each segmentation result, and in step S503, the minimum of the estimates calculated in step S502 is determined as the entropy rate of the discretized time interval sequence.
Entropy rate measures the complexity of a stochastic process: a low entropy rate indicates a regular sequence, and a high entropy rate indicates a sequence that tends to be random. Given a sequence of random variables of length N, the entropy rate is defined as the rate at which the entropy of the sequence grows with N.
In this embodiment, because the discretized time interval sequence is finite, its entropy rate can be estimated by calculating the Corrected Conditional Entropy (CCE) of the segmentation result corresponding to each order.
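For example, in the present embodiment, the corrected conditional entropy of each segmentation result can be calculated by the following formula (1):
$$\mathrm{CCE}(L) = \mathrm{CE}(L) + \mathrm{Bias}(L), \qquad \mathrm{CE}(L) = E(L) - E(L-1), \qquad \mathrm{Bias}(L) = \mathrm{perc}(L)\cdot E(1), \qquad E(L) = -\sum P_L \log P_L \tag{1}$$
where the parameters have the following meanings: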
CE represents an estimate of the conditional entropy, which measures the increment of information from order L-1 to order L. CE = 0 indicates that a sequence of length N can be almost completely predicted from the sequence of length N-1, such as 123123123...; CE = E(1) indicates that the elements of the sequence are completely independent, each new element adding an amount of information of about E(1) to the sequence;
E(L) represents the entropy of the sequence obtained after order-L segmentation, and measures the information content of that sequence;
P_L represents the proportion of each element of the sequence obtained after order-L segmentation among all its elements. For example, in the order-1 segmentation result above, the elements "1", "2", "3" and "4" occur 2, 2, 2 and 1 times respectively, so their proportions P_1 among all 11 elements are 2/11, 2/11, 2/11 and 1/11 respectively;
Bias represents the offset of the conditional entropy relative to the entropy rate, and is used to correct the conditional entropy CE;
perc(L) represents the proportion of single-point elements (elements that occur only once) in the segmentation result of order L among all its elements. For example, in the order-1 segmentation result, the elements "4", "5", "6", "7" and "8" are all single-point elements, so the number of single-point elements is 5 and the proportion perc(1) is 5/11;
E(1) is a scale factor representing the theoretical entropy of white noise with the same distribution as the sequence; its value equals the entropy of the sequence obtained after order-1 segmentation:
$$E(1) = -\sum P_1 \log P_1$$
where P_1 represents the proportion of each element in the order-1 segmentation result among all its elements.
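The quantities P, E(L) and perc(L) defined above can be computed directly from pattern counts. The following Python sketch is illustrative only: the function names are hypothetical and the base-2 logarithm is an assumption, since the text does not state the base.

```python
import math
from collections import Counter


def pattern_counts(sequence, order):
    """Count how often each length-`order` window occurs in the sequence."""
    return Counter(
        tuple(sequence[i:i + order]) for i in range(len(sequence) - order + 1)
    )


def entropy(counts):
    """E(L): Shannon entropy of the pattern proportions (base 2 assumed)."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def perc(counts):
    """perc(L): share of patterns that occur exactly once (single points)."""
    total = sum(counts.values())
    return sum(1 for c in counts.values() if c == 1) / total


counts_1 = pattern_counts("12312345678", 1)
counts_2 = pattern_counts("12312345678", 2)
print(counts_1[("1",)], perc(counts_1), perc(counts_2))
```

For the order-1 result this gives a count of 2 for the element "1" and perc(1) = 5/11, matching the worked numbers in the definitions.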
In equation (1) above, as L increases, the number of single-point elements increases, CE decreases and Bias increases, so CCE = CE + Bias attains a minimum, which can serve as the best estimate of the entropy rate of the finite sequence.
Fig. 6 shows the conditional entropy CE and the entropy rate as a function of the order L, and Fig. 7 shows the conditional entropy CE, the offset, and the corrected conditional entropy CCE as a function of the order L. As can be seen from Figs. 6 and 7, the minimum value of the CCE is close to the entropy rate, and can therefore be used as the best estimate of the entropy rate of the finite sequence.
Therefore, in step S503, the minimum value of the corrected conditional entropies CCE of the segmentation results corresponding to the orders L calculated in step S502 may be used as the entropy rate of the discretized time interval sequence.
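Steps S501 to S503 can be put together in a short Python sketch. It rests on several assumptions not stated in the text: base-2 logarithms, orders running from 1 up to the sequence length by default, and a conditional entropy estimated as the entropy of the order-L segments minus the entropy of their (L-1)-length prefixes, which keeps the estimate non-negative on short sequences. All function names are hypothetical.

```python
import math
from collections import Counter


def _entropy(counter):
    """Shannon entropy (base 2, an assumption) of the counted proportions."""
    total = sum(counter.values())
    return -sum((c / total) * math.log2(c / total) for c in counter.values())


def corrected_conditional_entropy(sequence, order):
    """CCE = CE + Bias for one order, with Bias = perc(L) * E(1)."""
    windows = [tuple(sequence[i:i + order]) for i in range(len(sequence) - order + 1)]
    patterns = Counter(windows)
    # Prefixes of the same windows; for order 1 every prefix is empty,
    # so their entropy is zero and CE reduces to E(1).
    prefixes = Counter(w[:-1] for w in windows)
    ce = _entropy(patterns) - _entropy(prefixes)
    single_points = sum(1 for c in patterns.values() if c == 1)
    bias = (single_points / len(windows)) * _entropy(Counter(sequence))
    return ce + bias


def entropy_rate(sequence, max_order=None):
    """Minimum CCE over orders 1..max_order (default: the sequence length)."""
    max_order = max_order or len(sequence)
    return min(
        corrected_conditional_entropy(sequence, order)
        for order in range(1, max_order + 1)
    )


print(entropy_rate("123123123123"))
print(entropy_rate("12312345678"))
```

Under these assumptions a strictly periodic sequence such as 123123123123 yields an entropy rate of 0, while 12312345678, whose tail never repeats, yields a clearly larger value because of the bias term.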
Table 1 below shows the results of applying equation (1) above to the order-1, order-2 and order-3 segmentation results obtained in step S501.
Table 1:
Figure BDA0000660066810000071
In another example, if the sequence of discretized time intervals is 123123123123, the calculation yields an entropy rate of 0.
In step S104, the entropy rate obtained in step S503 is compared with a preset first threshold, and when the entropy rate is smaller than the preset first threshold, it is identified that the device or account to be evaluated uses machine-assisted cheating during operation.
In this embodiment, the first threshold may be determined by validation on red-envelope cheating account data and a sample of normal account sequences: the entropy rate is calculated for each, and the results on the positive and negative samples are evaluated. For example, the first threshold may be 0.8; that is, when the entropy rate of the discretized time interval sequence is less than 0.8, the device or account may be identified as using machine-assisted cheating during operation.
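The text does not say how the positive and negative samples are evaluated. One way this could be done is sketched below in Python, assuming balanced accuracy as the selection criterion; the function names are hypothetical and the entropy rates are made-up toy values used purely to exercise the code.

```python
def is_machine_assisted(entropy_rate, first_threshold=0.8):
    """Decision rule of step S104: strictly below the threshold means cheating."""
    return entropy_rate < first_threshold


def choose_threshold(cheating_rates, normal_rates, candidates):
    """Return the candidate with the highest balanced accuracy.

    Balanced accuracy is an assumed criterion; ties keep the first candidate.
    """
    best, best_score = None, -1.0
    for threshold in candidates:
        true_positive_rate = sum(r < threshold for r in cheating_rates) / len(cheating_rates)
        true_negative_rate = sum(r >= threshold for r in normal_rates) / len(normal_rates)
        score = (true_positive_rate + true_negative_rate) / 2
        if score > best_score:
            best, best_score = threshold, score
    return best, best_score


# Toy entropy rates, invented for illustration only.
cheating = [0.0, 0.1, 0.3, 0.7]
normal = [0.9, 1.4, 2.1, 1.1]
best_threshold, best_score = choose_threshold(cheating, normal, [0.2, 0.4, 0.6, 0.8, 1.0])
print(best_threshold, best_score, is_machine_assisted(0.0))
```

Other criteria, such as precision at a fixed recall, would fit the same structure; which one suits a given deployment depends on the relative cost of false accusations and missed cheating.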
According to the above-mentioned embodiments of the present application, by calculating the entropy rate of the sequence of discretized time intervals, it is possible to accurately distinguish between rhythmic operations and non-rhythmic operations, thereby accurately identifying machine-assisted cheating.
Compared with method 1 in the background, this embodiment can accurately distinguish rhythmic from non-rhythmic operation: even when normal users take part in marketing activities or encounter system operation abnormalities, they do not produce large-scale rhythmic operation, so the discrimination is good. Compared with method 2, this embodiment relies on the rhythmicity of the operation time interval sequence on the device; for example, if there are 1000 logins within 1 day on an IP and the login time intervals are 123123123123..., the entropy rate of the sequence is 0, so a typical batch of devices or accounts that may involve machine cheating can be accurately filtered out by the low entropy rate produced by this embodiment. Compared with method 3, this embodiment takes the order of the time interval sequence into account and discretizes the sequence using hours, days or months as the unit, so that a more reasonable relative value is used to judge whether the time interval sequence is regular and rhythmic.
Example 2
Embodiment 2 provides an apparatus for recognizing machine-assisted cheating, corresponding to the method for recognizing machine-assisted cheating of embodiment 1.
Fig. 8 is a schematic composition diagram of the apparatus for identifying machine-assisted cheating according to the present embodiment, and as shown in fig. 8, the apparatus 800 includes a generating unit 801, a discretization unit 802, a calculating unit 803, and an identifying unit 804.
The generation unit 801 is configured to generate a time interval sequence according to the operation time of the device or account to be evaluated; the discretization unit 802 is configured to discretize the sequence of time intervals to form a discretized sequence of time intervals; the calculating unit 803 is configured to calculate an entropy rate of the discretized time interval sequence; the identifying unit 804 is configured to identify that the device or the account to be evaluated uses machine-assisted cheating during operation when the entropy rate is smaller than a preset first threshold.
Fig. 9 is a schematic diagram of a composition of the generating unit of the present embodiment, and as shown in fig. 9, the generating unit 801 includes a first extracting unit 901, a sorting unit 902, and a second extracting unit 903.
The first extracting unit 901 is configured to extract the operation times of all accounts on the device to be evaluated, or the operation times of the account to be evaluated on all devices, so as to generate a first time sequence; the sorting unit 902 is configured to reorder the first time sequence chronologically to form a second time sequence; the second extracting unit 903 is configured to extract the time intervals between adjacent times in the second time sequence to form a time interval sequence.
In the present embodiment, the discretization unit 802 divides the time interval series by the unit of division of hours, days, or months to form the discretized time interval series. In addition, in this embodiment, when the number of elements in the time interval sequence is less than a preset threshold, the discretization unit 802 may not perform discretization on the time interval sequence.
Fig. 10 is a schematic diagram of the composition of the calculating unit of the present embodiment. As shown in Fig. 10, the calculating unit 803 includes a segmenting unit 1001, a corrected conditional entropy calculating unit 1002, and a determining unit 1003.
The segmenting unit 1001 is configured to perform segmentation processing on the discretized time interval sequence according to different orders, so as to form different segmentation results; the corrected conditional entropy calculating unit 1002 is configured to calculate the corrected conditional entropy corresponding to each segmentation result; the determining unit 1003 is configured to determine the minimum value of the corrected conditional entropies corresponding to the segmentation results as the entropy rate of the discretized time interval sequence.
In this embodiment, the identifying unit 804 may compare the entropy rate calculated by the calculating unit 803 with a preset first threshold, and may identify that the device or the account uses machine-assisted cheating during operation when the entropy rate is smaller than the preset first threshold.
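The division into units could be mirrored in software by composing interchangeable callables. The Python sketch below is one hypothetical arrangement, not the apparatus itself; the class name, field names, and the stub functions (which merely stand in for real implementations of each unit) are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class CheatingIdentifier:
    """Hypothetical software counterpart of apparatus 800."""

    generate: Callable[[Sequence[float]], List[float]]   # generating unit 801
    discretize: Callable[[List[float]], List[int]]       # discretization unit 802
    compute_entropy_rate: Callable[[List[int]], float]   # calculating unit 803
    first_threshold: float                               # used by identifying unit 804

    def identify(self, operation_times: Sequence[float]) -> bool:
        """Run the units in order and apply the threshold comparison."""
        intervals = self.generate(operation_times)
        symbols = self.discretize(intervals)
        return self.compute_entropy_rate(symbols) < self.first_threshold


def gaps(times):
    """Stub generating unit: sorted times to adjacent differences."""
    ordered = sorted(times)
    return [b - a for a, b in zip(ordered, ordered[1:])]


# Stubs for illustration only; the last one is NOT a real entropy rate,
# it just returns 0.0 when every symbol is identical.
identifier = CheatingIdentifier(
    generate=gaps,
    discretize=lambda xs: [int(x // 10) for x in xs],
    compute_entropy_rate=lambda s: float(len(set(s)) - 1),
    first_threshold=0.8,
)
print(identifier.identify([0, 10, 20, 30]))   # evenly spaced: flagged
print(identifier.identify([0, 10, 45, 300]))  # uneven: not flagged
```

Keeping each unit behind a plain callable makes it straightforward to swap in a different discretization unit or entropy rate estimator without touching the identification logic.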
For a detailed description of each unit in the device 800 for identifying machine-assisted cheating, please refer to the corresponding steps of embodiment 1, and the description of this embodiment will not be repeated.
According to the above-mentioned embodiments of the present application, by calculating the entropy rate of the sequence of discretized time intervals, it is possible to accurately distinguish between rhythmic operations and non-rhythmic operations, thereby accurately identifying machine-assisted cheating.
The apparatus 800 for identifying machine-assisted cheating of the present application, or the units therein, may be implemented by a computer chip or a physical entity, or by a product having a certain function.
For convenience of description, the above devices or units are described as being divided into various units by function, and are described separately. Of course, the functions of the units may be implemented in the same software and/or hardware or in a plurality of software and/or hardware when implementing the invention.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The invention is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
While the present invention has been described with respect to the embodiments, those skilled in the art will appreciate that there are numerous variations and permutations of the present invention without departing from the spirit of the invention, and it is intended that the appended claims cover such variations and modifications as fall within the true spirit of the invention.

Claims (6)

1. A method of identifying machine-assisted cheating, for use in a general purpose or special purpose computing system environment or configuration, comprising:
generating a time interval sequence according to the operation time of the equipment or account to be evaluated;
discretizing the time interval sequence to form a discretized time interval sequence;
calculating an entropy rate of the sequence of discretized time intervals;
identifying that the device or account to be evaluated uses machine-assisted cheating while operating when the entropy rate is less than a preset first threshold,
wherein calculating the entropy rate of the sequence of discretized time intervals comprises:
segmenting the discretized time interval sequence according to each of a plurality of different orders to form a corresponding segmentation result for each order;
calculating the corrected conditional entropy corresponding to each segmentation result; and
determining the minimum value among the corrected conditional entropies corresponding to the segmentation results as the entropy rate of the discretized time interval sequence,
wherein, the method also comprises:
when the number of elements in the time interval sequence is less than a preset second threshold, not performing the discretization processing on the time interval sequence.
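The entropy-rate step recited in claim 1 can be illustrated with a minimal sketch. It assumes the commonly used definition of corrected conditional entropy (conditional entropy of order m plus the share of patterns seen only once, weighted by the first-order entropy), overlapping windows as the segmentation, and that sequences shorter than the second threshold are skipped. The names `entropy_rate`, `uses_machine_assistance`, `max_order`, and both threshold values are hypothetical and do not come from the patent.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy, in bits, of an empirical distribution given as counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts)


def entropy_rate(symbols, max_order=5, second_threshold=10):
    """Minimum corrected conditional entropy over orders 1..max_order.

    Returns None when the sequence has fewer than `second_threshold` elements
    (assumed handling of short sequences; the threshold value is illustrative).
    """
    if len(symbols) < second_threshold:
        return None
    first_order = shannon_entropy(Counter(symbols).values())
    previous_entropy = 0.0  # entropy of order 0 is taken as zero
    best = math.inf
    for order in range(1, min(max_order, len(symbols)) + 1):
        # Segment the sequence into overlapping patterns of length `order`.
        windows = [tuple(symbols[i:i + order])
                   for i in range(len(symbols) - order + 1)]
        counts = Counter(windows)
        entropy = shannon_entropy(counts.values())
        # Share of windows whose pattern occurs exactly once.
        unique_share = sum(1 for c in counts.values() if c == 1) / len(windows)
        corrected = (entropy - previous_entropy) + unique_share * first_order
        best = min(best, corrected)
        previous_entropy = entropy
    return best


def uses_machine_assistance(symbols, first_threshold=0.5):
    """True when the estimated entropy rate falls below the first threshold."""
    rate = entropy_rate(symbols)
    return rate is not None and rate < first_threshold
```

Under these assumptions a strictly alternating sequence such as `[1, 2] * 20` yields an entropy rate close to zero and is flagged, while an irregular sequence over several symbols stays well above the illustrative threshold.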
2. The method of identifying machine-assisted cheating of claim 1, wherein generating a sequence of time intervals comprises:
extracting the operation times of all accounts on the device to be evaluated, or the operation times of the account to be evaluated on all devices, to generate a first time sequence;
reordering the first time sequence in chronological order to form a second time sequence;
and extracting the time intervals between adjacent times in the second time sequence to form the time interval sequence.
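The three steps of claim 2 can be sketched as follows. This is only an illustration: the input layout (a mapping from account identifier to a list of `datetime` values for one device) and the function name `time_interval_sequence` are assumptions, and intervals are expressed in seconds for convenience.

```python
from datetime import datetime


def time_interval_sequence(operations_by_account):
    """Build the time interval sequence for one device.

    `operations_by_account` maps a hypothetical account id to the operation
    times recorded for that account on the device being evaluated.
    """
    # First time sequence: every operation time, in whatever order it was stored.
    first_sequence = [t for times in operations_by_account.values() for t in times]
    # Second time sequence: the same times in chronological order.
    second_sequence = sorted(first_sequence)
    # Time interval sequence: gap in seconds between each pair of neighbours.
    return [(later - earlier).total_seconds()
            for earlier, later in zip(second_sequence, second_sequence[1:])]


example = {
    "account_a": [datetime(2015, 1, 22, 10, 0, 30), datetime(2015, 1, 22, 10, 1, 0)],
    "account_b": [datetime(2015, 1, 22, 10, 0, 0)],
}
intervals = time_interval_sequence(example)
```

The same function applies to the other branch of the claim if the mapping is keyed by device and holds the times of a single account.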
3. The method of identifying machine-assisted cheating of claim 1, wherein discretizing the sequence of time intervals comprises:
and dividing the time interval sequence by taking the hour, the day or the month as a dividing unit to form the discretized time interval sequence.
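One possible reading of the discretization in claim 3 is that each interval is mapped to the number of whole dividing units it spans. The sketch below follows that reading only as an assumption: intervals are taken to be in seconds, a month is approximated as 30 days, and the names `UNIT_SECONDS` and `discretize` are hypothetical.

```python
# Length of each dividing unit in seconds; the 30-day month is an assumption.
UNIT_SECONDS = {"hour": 3600, "day": 86400, "month": 30 * 86400}


def discretize(intervals, unit="hour"):
    """Map each interval (in seconds) to the count of whole units it contains."""
    size = UNIT_SECONDS[unit]
    return [int(gap // size) for gap in intervals]


hourly = discretize([1800, 3600, 90000], unit="hour")
daily = discretize([1800, 3600, 90000], unit="day")
```

A coarser unit collapses more intervals onto the same symbol, which lowers the measured entropy, so the unit would need to match the time scale of the operations being evaluated.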
4. An apparatus to identify machine-assisted cheating, comprising:
the generating unit is used for generating a time interval sequence according to the operation time of the equipment or the account to be evaluated;
a discretization unit for discretizing the sequence of time intervals to form a discretized sequence of time intervals;
a calculating unit for calculating an entropy rate of the discretized time interval sequence;
an identifying unit for identifying that the device or account to be evaluated uses machine-assisted cheating while operating when the entropy rate is less than a preset first threshold,
wherein the calculation unit includes:
a segmentation unit for segmenting the discretized time interval sequence according to each of a plurality of different orders to form a corresponding segmentation result for each order;
a corrected conditional entropy calculation unit for calculating the corrected conditional entropy corresponding to each segmentation result;
a determination unit configured to determine the minimum value among the corrected conditional entropies corresponding to the respective segmentation results as the entropy rate of the discretized time interval sequence,
wherein, when the number of elements in the time interval sequence is less than a preset second threshold, the discretization unit does not perform discretization processing on the time interval sequence.
5. The apparatus to identify machine-assisted cheating according to claim 4, wherein said generating unit comprises:
a first extraction unit for extracting the operation times of all accounts on the device to be evaluated, or the operation times of the account to be evaluated on all devices, to generate a first time sequence;
a sorting unit for reordering the first time sequence in chronological order to form a second time sequence;
a second extraction unit for extracting time intervals of adjacent times in the second time series to form a time interval series.
6. The apparatus to identify machine-assisted cheating according to claim 4, wherein the discretization unit divides the time interval sequence with an hour, a day, or a month as a division unit to form the discretized time interval sequence.
CN201510031927.6A 2015-01-22 2015-01-22 Method and device for identifying machine-assisted cheating Active CN105868991B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510031927.6A CN105868991B (en) 2015-01-22 2015-01-22 Method and device for identifying machine-assisted cheating

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510031927.6A CN105868991B (en) 2015-01-22 2015-01-22 Method and device for identifying machine-assisted cheating

Publications (2)

Publication Number Publication Date
CN105868991A CN105868991A (en) 2016-08-17
CN105868991B true CN105868991B (en) 2020-09-04

Family

ID=56623338

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510031927.6A Active CN105868991B (en) 2015-01-22 2015-01-22 Method and device for identifying machine-assisted cheating

Country Status (1)

Country Link
CN (1) CN105868991B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109409427A (en) * 2018-10-25 2019-03-01 珠海市君天电子科技有限公司 A kind of key detecting method and device
CN111833064B (en) * 2019-04-17 2022-04-12 马上消费金融股份有限公司 Cheating detection method and device
CN110322320B (en) * 2019-06-28 2022-04-22 北京金山安全软件有限公司 Threshold determination method and device and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1731401A (en) * 2005-08-26 2006-02-08 清华大学 A method of feature selection based on mixed mutual information in data mining
CN101917309A (en) * 2010-08-27 2010-12-15 电子科技大学 Detection method of denial of service of public service number under soft switching platform
CN103678709A (en) * 2013-12-30 2014-03-26 中国科学院自动化研究所 Recommendation system attack detection method based on time series data
CN104113519A (en) * 2013-04-16 2014-10-22 阿里巴巴集团控股有限公司 Network attack detection method and device thereof

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8688620B2 (en) * 2011-09-23 2014-04-01 Hewlett-Packard Development Company, L.P. Anomaly detection in data centers

Also Published As

Publication number Publication date
CN105868991A (en) 2016-08-17

Similar Documents

Publication Publication Date Title
CN108449327B (en) Account cleaning method and device, terminal equipment and storage medium
US10248528B2 (en) System monitoring method and apparatus
ES2801273T3 (en) Method and apparatus for recognizing risk behavior
CN109981328B (en) Fault early warning method and device
US11379687B2 (en) Method for extracting feature string, device, network apparatus, and storage medium
US20120016886A1 (en) Determining a seasonal effect in temporal data
US20150254791A1 (en) Quality control calculator for document review
CN109413071B (en) Abnormal flow detection method and device
CN107370766B (en) Network flow abnormity detection method and system
CN105868991B (en) Method and device for identifying machine-assisted cheating
CN111143415A (en) Data processing method and device and computer readable storage medium
CN106408325A (en) User consumption behavior prediction analysis method based on user payment information and system
CN108900514A (en) Attack tracking of information source tracing method and device based on homogeneous assays
US20160132798A1 (en) Service-level agreement analysis
CN106936778B (en) Method and device for detecting abnormal website traffic
CN106327230B (en) Abnormal user detection method and equipment
CN106033574B (en) Method and device for identifying cheating behaviors
CN109754290B (en) Game data processing method and device
US20210067544A1 (en) System and Methods for Mitigating Fraud in Real Time Using Feedback
EP3570242A1 (en) Method and system for quantifying quality of customer experience (cx) of an application
CN111340606A (en) Full-process income auditing method and device
CN107784511A (en) A kind of customer loss Forecasting Methodology and device
US20130173598A1 (en) Method and Apparatus for Automated Pattern Analysis to Identify Location Information in Cellular Telephone Records
WO2019019373A1 (en) Event processing method and terminal device
CN105718462B (en) Cheating detection method and device for application operation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200924

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20200924

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee after: Advanced innovation technology Co.,Ltd.

Address before: Greater Cayman, British Cayman Islands

Patentee before: Alibaba Group Holding Ltd.

TR01 Transfer of patent right