CN107341716B - Malicious order identification method and device and electronic equipment - Google Patents

Malicious order identification method and device and electronic equipment

Info

Publication number
CN107341716B
Authority
CN
China
Prior art keywords
order
malicious
identified
behavior
behaviors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710560874.6A
Other languages
Chinese (zh)
Other versions
CN107341716A (en)
Inventor
钱春江
余文喆
杜红光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201710560874.6A
Publication of CN107341716A
Application granted
Publication of CN107341716B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0633 Lists, e.g. purchase orders, compilation or processing
    • G06Q30/0635 Processing of requisition or of purchase orders
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0609 Buyer or seller confidence or verification

Abstract

The embodiment of the invention provides a method and a device for identifying malicious orders and electronic equipment, wherein the method comprises the following steps: acquiring data of order behaviors to be identified; analyzing the data of the order behaviors to be identified by using an analysis model to obtain malicious scores of the order behaviors to be identified, wherein the analysis model is obtained by performing model training according to preset data of the order behaviors; and judging whether the order behavior to be identified belongs to malicious order behavior or not according to the malicious scores. By using the analysis model to analyze the data of the order behavior to be identified, the success rate of malicious order identification can be improved and the scope of malicious order identification can be enlarged.

Description

Malicious order identification method and device and electronic equipment
Technical Field
The present invention relates to the field of network technologies, and in particular, to a method and an apparatus for identifying malicious orders, and an electronic device.
Background
With the rise of internet e-commerce, the security of online shopping is receiving increasing attention. Many malicious users exploit loopholes or price differences on e-commerce platforms to brush orders and snap up goods, which disadvantages, or even causes losses to, the large group of consumers with normal demands as well as the e-commerce operators.
However, the inventor finds that the prior art has at least the following problems in the process of implementing the invention:
Existing e-commerce platforms adopt targeted identification in each link, for example one method dedicated to identifying whether access is too frequent and another dedicated to identifying whether consignee addresses are similar. These identification methods are independent of one another and each judges whether a user's order behavior is malicious based on a limited function, so malicious users can easily bypass them and carry out malicious order behaviors without being discovered. It can be seen that, as malicious order makers improve their anti-monitoring strategies, the existing malicious order identification technology has a low identification success rate and a narrow identification range.
Disclosure of Invention
The embodiment of the invention aims to provide a method and a device for identifying malicious orders and electronic equipment, so as to improve the success rate and range of identifying malicious orders. The specific technical scheme is as follows:
a method of malicious order identification, the method comprising:
acquiring data of order behaviors to be identified;
analyzing the data of the order behaviors to be identified by using an analysis model to obtain malicious scores of the order behaviors to be identified, wherein the analysis model is obtained by performing model training according to preset data of the order behaviors;
and judging whether the order behavior to be identified belongs to malicious order behavior or not according to the malicious scores.
Optionally, before analyzing the data of the order behavior to be identified by using the analysis model to obtain the malicious score of the order behavior to be identified, the method further includes:
judging whether the user type to which the data of the order behaviors to be identified belongs is a new user or not, wherein the user type comprises a new user or an old user, the new user is a user of which the historical order behavior number is smaller than a preset first threshold value, and the old user is a user of which the historical order behavior number is larger than or equal to the first threshold value;
the analyzing the data of the order behavior to be identified by using the analysis model to obtain the malicious score of the order behavior to be identified comprises the following steps:
when the user type to which the data of the order behavior to be identified belongs is judged to be a new user, calculating a first similarity between the order behavior to be identified and the malicious order behaviors marked by a first analysis submodel, and taking the first similarity as the malicious score of the order behavior to be identified, wherein the first analysis submodel is one of the analysis models and is obtained by performing K-means cluster analysis on the data of the historical order behaviors of sample users, yielding classes formed by normal order behaviors of different levels and classes formed by malicious order behaviors of different levels;
or when the user type to which the data of the order behaviors to be identified belongs is judged to be an old user, calculating to obtain a second similarity between the order behaviors to be identified and the malicious order behaviors marked by the first analysis sub-model;
inputting the data of the order behaviors to be identified into a second analysis submodel corresponding to the user to which the order behaviors to be identified belong, calculating to obtain a third similarity between the order behaviors to be identified and the historical order behaviors of the user to which the order behaviors to be identified belong, performing score aggregation on the second similarity and the third similarity, and taking the result of the score aggregation as the malicious score of the order behaviors to be identified, wherein the second analysis submodel is one of the analysis models and is an analysis submodel corresponding to each sample user, which is obtained by performing logistic regression training by using the data of the individual historical order behaviors of the sample user for each sample user.
Optionally, the determining, according to the malicious score, whether the order behavior to be identified belongs to a malicious order behavior includes:
when the user type to which the data of the order behaviors to be identified belongs is a new user and the malicious score is greater than a preset second threshold value, determining that the order behaviors to be identified belong to malicious order behaviors;
or when the user type to which the data of the order behavior to be identified belongs is a new user and the malicious score is smaller than or equal to the second threshold value, determining that the order behavior to be identified does not belong to a malicious order behavior;
or when the user type to which the data of the order behavior to be identified belongs is an old user and the malicious score is greater than a preset third threshold value, determining that the order behavior to be identified belongs to a malicious order behavior;
or when the user type to which the data of the order behavior to be identified belongs is an old user and the malicious score is smaller than or equal to the third threshold value, determining that the order behavior to be identified does not belong to a malicious order behavior.
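The four branches above reduce to one comparison against a threshold chosen by user type. The following is a minimal Python sketch of that rule, not the patented implementation; the names `UserType` and `is_malicious` and the threshold values 0.7 and 0.8 are hypothetical and chosen only for illustration.

```python
from enum import Enum


class UserType(Enum):
    NEW = "new"
    OLD = "old"


def is_malicious(user_type: UserType, malicious_score: float,
                 second_threshold: float, third_threshold: float) -> bool:
    """Return True when the score exceeds the threshold for the user type."""
    # New users are judged against the second threshold, old users against the third.
    threshold = second_threshold if user_type is UserType.NEW else third_threshold
    # Strictly greater means malicious; less than or equal means not malicious.
    return malicious_score > threshold


# Illustrative threshold values only; the source does not specify them.
new_user_result = is_malicious(UserType.NEW, 0.75, second_threshold=0.7, third_threshold=0.8)
old_user_result = is_malicious(UserType.OLD, 0.75, second_threshold=0.7, third_threshold=0.8)
print(new_user_result, old_user_result)
```

With these assumed thresholds, the same score of 0.75 is judged malicious for a new user but not for an old user.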
Optionally, the order behavior includes one or more of the following:
the IP address of the order access, the geographic location of the IP address, the equipment used by the order request, the goods type of the order, the quantity of each order, the order time, the payment method, the third-level address of the consignee, the name of the consignee and the telephone of the consignee.
An apparatus for malicious order identification, the apparatus comprising:
the data acquisition module is used for acquiring data of order behaviors to be identified;
the score obtaining module is used for analyzing the data of the order behaviors to be identified by utilizing an analysis model to obtain the malicious scores of the order behaviors to be identified, wherein the analysis model is obtained by performing model training according to the preset data of the order behaviors;
and the behavior judgment module is used for judging whether the order behavior to be identified belongs to malicious order behavior according to the malicious scores.
Optionally, the apparatus further includes a type determining module, and the score obtaining module includes: a first score obtaining sub-module and a second score obtaining sub-module;
the type judging module is used for judging whether the user type to which the data of the order behaviors to be identified belongs is a new user or not, wherein the user type comprises a new user or an old user, the new user is a user of which the historical order behavior number is smaller than a preset first threshold value, and the old user is a user of which the historical order behavior number is larger than or equal to the first threshold value; if the user type to which the data of the order behavior to be identified belongs is a new user, triggering the first score obtaining sub-module, and if the user type to which the data of the order behavior to be identified belongs is an old user, triggering the second score obtaining sub-module;
the first score obtaining sub-module is used for calculating a first similarity between the order behaviors to be identified and the malicious order behaviors marked by a first analysis sub-model, and taking the first similarity as the malicious score of the order behaviors to be identified, wherein the first analysis sub-model is one of the analysis models and is obtained by performing K-means cluster analysis on the data of the historical order behaviors of sample users, yielding classes formed by normal order behaviors of different levels and classes formed by malicious order behaviors of different levels;
the second score obtaining sub-module is used for calculating a second similarity between the order behaviors to be identified and the malicious order behaviors marked by the first analysis sub-model;
inputting the data of the order behaviors to be identified into a second analysis submodel corresponding to the user to which the order behaviors to be identified belong, calculating to obtain a third similarity between the order behaviors to be identified and the historical order behaviors of the user to which the order behaviors to be identified belong, performing score aggregation on the second similarity and the third similarity, and taking the result of the score aggregation as the malicious score of the order behaviors to be identified, wherein the second analysis submodel is one of the analysis models and is an analysis submodel corresponding to each sample user, which is obtained by performing logistic regression training by using the data of the individual historical order behaviors of the sample user for each sample user.
Optionally, the behavior determining module includes: a first score judgment sub-module, a first behavior determination sub-module, a second behavior determination sub-module and a second score judgment sub-module;
the first score judgment sub-module is configured to judge whether the malicious score is greater than a preset second threshold value, trigger the first behavior determination sub-module if the malicious score is greater than the second threshold value, and trigger the second behavior determination sub-module if the malicious score is less than or equal to the second threshold value;
the first behavior determination sub-module is used for determining that the order behavior to be identified belongs to malicious order behavior;
the second behavior determination sub-module is used for determining that the order behavior to be identified does not belong to malicious order behavior;
the second score judgment sub-module is configured to judge whether the malicious score is greater than a preset third threshold value, trigger the first behavior determination sub-module if the malicious score is greater than the third threshold value, and trigger the second behavior determination sub-module if the malicious score is less than or equal to the third threshold value.
Optionally, the order behavior includes one or more of the following:
the IP address of the order access, the geographic location of the IP address, the equipment used by the order request, the goods type of the order, the quantity of each order, the order time, the payment method, the third-level address of the consignee, the name of the consignee and the telephone of the consignee.
In another aspect of the present invention, there is also provided an electronic device, including a processor, a communication interface, a memory and a communication bus, where the processor, the communication interface and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
and the processor is used for realizing any malicious order identification method when executing the program stored in the memory.
In yet another aspect of the present invention, there is also provided a computer-readable storage medium having stored therein instructions, which when run on a computer, cause the computer to perform any one of the above described methods for malicious order identification.
In yet another aspect of the present invention, the present invention also provides a computer program product containing instructions, which when run on a computer, causes the computer to execute any of the above described methods for malicious order identification.
In the scheme provided by the embodiment of the invention, the received data of the order behaviors to be identified can be analyzed by utilizing an analysis model to obtain the malicious scores of the order behaviors to be identified, wherein the analysis model is obtained by performing model training according to the preset data of the order behaviors, and whether the order behaviors belong to the malicious order behaviors or not is judged according to the malicious scores. Therefore, when the embodiment of the invention is applied, the analysis model is obtained based on the data training of the preset order behavior, so that the analysis model can be expanded according to the requirement, the analysis model has self-adaptability and a wide analysis range, the success rate of malicious order recognition is improved, and the malicious order recognition range is expanded. Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
Fig. 1 is a block diagram of a system for malicious order identification according to an embodiment of the present invention;
Fig. 2 is a first flowchart illustrating a malicious order identification method according to an embodiment of the present invention;
Fig. 3 is a second flowchart illustrating a malicious order identification method according to an embodiment of the present invention;
Fig. 4 is a diagram illustrating a separation result of clustering using K-means according to an embodiment of the present invention;
Fig. 5 is a third flowchart illustrating a malicious order identification method according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a malicious order identification apparatus according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a malicious order identification apparatus according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a malicious order identification apparatus according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.
In the prior art, malicious orders are identified in a targeted manner in various links, for example, an IP address having access to the orders is identified, and when a rapid increase in the number of orders within a period of time of the same IP address is detected, the orders can be determined as malicious orders or suspicious orders, and further identified. However, when a malicious user utilizes a malicious means to make the access IP addresses of each malicious order different, the above method cannot identify the malicious orders, and thus the existing method has a low success rate of identification.
Based on the above, the inventor considers that the historical order behaviors of the user include the order habits of the user, considers the utilization of statistical learning and machine learning, constructs a multi-dimensional and self-adaptive analysis model to calculate the similarity between the order behaviors to be identified and the historical order behaviors, and determines whether the order behaviors to be identified belong to malicious order behaviors or not through the calculated similarity so as to improve the success rate of malicious order identification.
Based on the above consideration, the invention provides a method for identifying malicious orders, which analyzes the order behaviors to be identified by using an analysis model constructed based on historical order behaviors, obtains the malicious scores of the order behaviors to be identified, and judges whether the order behaviors to be identified belong to the malicious order behaviors or not according to the malicious scores. When the analysis model is constructed, the data of the preset order behaviors are adopted, and new data can be added or unnecessary data can be deleted according to needs, so that the analysis model has multiple dimensions and self-adaptability, the order behaviors are prevented from being identified from a single dimension, the success rate of malicious order identification can be improved, and the malicious order identification range can be enlarged.
Fig. 1 is a block diagram of a system for identifying malicious orders according to an embodiment of the present invention.
After an order request behavior is received, the system judges whether enough data of the historical order behaviors of the requesting user is stored. If so, the order behavior is analyzed using both a personal order behavior model and a group order behavior model; if not, it is analyzed using the group order behavior model only.
Training and analysis comprise personal behavior training and analysis, and group behavior training and analysis:
Personal behavior training and analysis: model training is performed using the data of a user's personal order behaviors to obtain a personal order behavior model, and the user's order behavior to be identified is analyzed using the personal order behavior model to obtain the similarity between that order behavior and the user's historical order behaviors;
Group behavior training and analysis: model training is performed using the data of all personal order behaviors of the sample users to obtain group order behavior models with different malicious order behavior levels, and the order behavior to be identified is analyzed using the group order behavior models to obtain the malicious level of the order behavior to be identified;
Finally, the obtained similarity and malicious level are analyzed comprehensively to obtain a score result for the order behavior to be identified.
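The dispatch described for Fig. 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the method as claimed: the models are passed in as plain callables, `score_order`, `min_history` and `group_weight` are hypothetical names, and the weighted combination is only one possible form of the comprehensive analysis, which this passage does not specify.

```python
from typing import Callable, Optional, Sequence

# Hypothetical numeric encoding of one order behavior.
Order = Sequence[float]
Model = Callable[[Order], float]


def score_order(order: Order, history_count: int, group_model: Model,
                personal_model: Optional[Model], min_history: int = 20,
                group_weight: float = 0.5) -> float:
    """Return a malicious score, assuming both models return values in [0, 1]."""
    # Group model: how strongly the order resembles known malicious behavior.
    group_score = group_model(order)
    # Not enough stored history: fall back to the group model only.
    if personal_model is None or history_count < min_history:
        return group_score
    # Personal model: similarity to the user's own historical order behaviors.
    personal_similarity = personal_model(order)
    # ASSUMED aggregation: low similarity to the user's own habits raises the score.
    return group_weight * group_score + (1.0 - group_weight) * (1.0 - personal_similarity)


# Stub models standing in for trained ones (illustrative values only).
group_stub = lambda order: 0.8
personal_stub = lambda order: 0.9

print(score_order([1.0, 2.0], history_count=5, group_model=group_stub, personal_model=personal_stub))
print(score_order([1.0, 2.0], history_count=30, group_model=group_stub, personal_model=personal_stub))
```

Passing the models as callables keeps the dispatch logic independent of how each model is trained.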
Fig. 2 is a first flowchart of a malicious order identification method according to an embodiment of the present invention, including:
S201: Acquire data of the order behavior to be identified.
Specifically, in this embodiment, the order behavior includes: the IP address of the order access, the geographic location of the IP address, the equipment used by the order request, the goods type of the order, the quantity of each order, the order time, the payment method, the third-level address of the consignee, the name of the consignee and the telephone of the consignee.
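As an illustration of how these ten items might be carried through a program, the following Python sketch groups them into one record. The class name `OrderBehavior`, the field names, the field types and the sample values are all assumptions made for this example; the source lists the items but does not prescribe a data layout.

```python
from dataclasses import dataclass, fields
from datetime import datetime


@dataclass(frozen=True)
class OrderBehavior:
    """One order behavior, with one field per item listed in the embodiment."""
    ip_address: str                     # IP address of the order access
    ip_geolocation: str                 # geographic location of the IP address
    device: str                         # equipment used by the order request
    goods_type: str                     # goods type of the order
    quantity: int                       # quantity of the order
    order_time: datetime                # order time
    payment_method: str                 # payment method
    consignee_third_level_address: str  # third-level address of the consignee
    consignee_name: str                 # name of the consignee
    consignee_phone: str                # telephone of the consignee


# Made-up sample values; 203.0.113.7 is a documentation-only IP address.
example = OrderBehavior(
    ip_address="203.0.113.7",
    ip_geolocation="Beijing",
    device="mobile-browser",
    goods_type="electronics",
    quantity=2,
    order_time=datetime(2017, 7, 11, 20, 30),
    payment_method="online",
    consignee_third_level_address="Haidian District",
    consignee_name="Example Consignee",
    consignee_phone="000-0000-0000",
)
print([f.name for f in fields(OrderBehavior)])
```

Before clustering, such a record would still need to be encoded as a numeric vector; that encoding is not described in this passage.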
The information contained in the order behavior is directly related to the order behavior, and when the information in the order behavior is suspicious, the order behavior can be judged to belong to malicious order behavior, or the order behavior is classified as suspicious order behavior and is further identified. Therefore, whether the order behavior belongs to malicious order behaviors or not can be identified according to the change condition of the information in the order behavior.
S202: Analyze the data of the order behavior to be identified by using the analysis model to obtain the malicious score of the order behavior to be identified.
The analysis model is obtained by performing model training according to preset data of order behaviors.
In this embodiment, the preset order behavior may include: the type of goods (most students buy electronic products and most middle-aged people buy healthcare products), the time of the order (office workers often place orders at night or on weekends), and the quantity of each order (non-malicious users seldom buy luxury goods in bulk in a single order).
The trained analysis model can be used to obtain the probability that the order behavior to be identified belongs to malicious order behavior, and this probability is used as the malicious score of the order behavior to be identified. Alternatively, the trained analysis model can be used to obtain the distance by which the order behavior to be identified deviates from normal order behavior, and this distance is used as the malicious score. Whether the order behavior to be identified belongs to malicious order behavior can then be judged by further analyzing the obtained malicious score.
S203: Judge whether the order behavior to be identified belongs to malicious order behavior according to the malicious score.
Specifically, the obtained malicious score may be compared with a preset threshold to determine whether the order behavior to be identified belongs to a malicious order behavior, when the malicious score is greater than the threshold, it is determined that the order behavior to be identified belongs to the malicious order behavior, and when the malicious score is less than or equal to the threshold, it is determined that the order behavior to be identified does not belong to the malicious order behavior; or when the malicious score is 1, determining that the order behavior to be identified belongs to the malicious order behavior, and when the malicious score is 0, determining that the order behavior to be identified does not belong to the malicious order behavior.
After judging whether the order behavior to be identified belongs to the malicious order behavior by utilizing the malicious score, the order behavior to be identified can be used as a new training sample to obtain a new analysis model, so that the identification accuracy of the analysis model is improved.
As can be seen from the above, in the scheme provided in this embodiment, an analysis model is constructed according to data of preset order behaviors to analyze the order behaviors to be recognized, so as to obtain a malicious score of the order behaviors to be recognized, and whether the order behaviors to be recognized belong to malicious order behaviors is determined according to the malicious score. Compared with the prior art, in the scheme provided by the embodiment, the order behaviors of the user can be analyzed by using the statistical learning and machine learning analysis models, wherein when the analysis models are constructed, the preset data of the order behaviors are adopted, and new data can be added or unnecessary data can be deleted according to needs, so that the analysis models have multiple dimensions and self-adaptability, the order behaviors are prevented from being identified from a single dimension, the success rate and the range of malicious order identification can be improved, and the security monitoring can be performed on the e-commerce system of a company.
In an embodiment of the present invention, referring to fig. 3, a second flowchart of a malicious order identification method is provided, including:
S301: Acquire data of the order behavior to be identified.
This step is the same as S201 in the above embodiment, and is not described herein again.
S302: Judge whether the user type to which the data of the order behavior to be identified belongs is a new user; if so, execute S3031, and if not, execute S3032.
The user type comprises a new user or an old user, the new user is a user with the historical order behavior number smaller than a preset first threshold, and the old user is a user with the historical order behavior number larger than or equal to the preset first threshold.
For example, the first threshold may be 20, which is not limited in the present application. When the order behaviors to be identified of the user A are received, if the number of the historical order behaviors of the user A is less than 20, determining the user A as a new user; and when the order behaviors to be identified of the user B are received, if the number of the historical order behaviors of the user B is more than or equal to 20, determining that the user B is an old user.
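The rule in this example can be written as a one-line comparison. The following Python sketch uses the example value 20 as a default; the function name `classify_user` and the string labels are hypothetical.

```python
def classify_user(history_count: int, first_threshold: int = 20) -> str:
    """Classify a user as 'new' or 'old' by the number of historical order behaviors."""
    # Fewer than the first threshold means new; equal or more means old.
    return "new" if history_count < first_threshold else "old"


print(classify_user(19))  # user A, with fewer than 20 historical order behaviors
print(classify_user(20))  # user B, with 20 or more historical order behaviors
```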
Whether the user type to which the data of the order behavior to be identified belongs is a new user or not is judged, so that different analyses can be performed according to different user types to obtain a more accurate analysis result.
S3031: Calculate a first similarity between the order behavior to be identified and the malicious order behaviors marked by the first analysis submodel, and take the first similarity as the malicious score of the order behavior to be identified.
The first analysis submodel is one of the analysis models. It is obtained by performing K-means cluster analysis on the data of the historical order behaviors of sample users, which yields classes formed by normal order behaviors of different levels and classes formed by malicious order behaviors of different levels. It serves as the group order behavior model and is used for analyzing the order behaviors of users as a group. By performing cluster analysis on the data of the order behaviors of the sample users, malicious order behaviors can be separated from normal order behaviors, and classes of malicious order behaviors of different levels can be obtained according to the result of manual marking.
For example, a malicious orderer typically uses a cloud machine to simulate browser access for a large amount of frequent order accesses, and although the order accesses have different user names and different consignee telephones, the malicious orderer can be identified by similarity in three dimensions, namely the geographic location of the IP address, the order item type and the third-level address of the consignee.
Through clustering analysis, new user order behaviors can also be identified. For example, when a malicious order maker performs malicious order behaviors by using the newly added machine, the malicious order behaviors of the newly added machine may also show similarities with the separated malicious order behaviors and be identified.
The first analysis submodel uses the K-means algorithm to perform cluster analysis on the training sample $X = \{x^{(1)}, \dots, x^{(m)}\}$, where X comprises the historical order behaviors of the sample users, including both malicious order behaviors and normal order behaviors; $x^{(m)}$ represents the m-th order behavior in the training sample and includes the preset data of each dimension of that order behavior; and m represents the number of order behaviors in the training sample, its value being a natural number greater than 0. For example, in the cluster analysis, historical order behaviors of 300 sample users are collected as training samples, and m is 300. The larger the value of m, that is, the larger the number of training samples, the more accurate the cluster analysis, but the larger the amount of data to be processed. In practical applications, the value of m can be adjusted according to different scenarios or personal experience.
Randomly select k samples in X as the cluster centroid points $U = \{\mu_1, \mu_2, \dots, \mu_k\}$, where $1 < k \le m$.
For each training sample x(i)The class to which it should belong is calculated using equation (1).
$$C^{(i)} = \arg\min_{j} \lVert x^{(i)} - \mu_j \rVert^2 \qquad (1)$$
Here $x^{(i)}$ represents the i-th order behavior in the training sample, with $1 \le i \le m$; $\mu_j$ represents the j-th cluster centroid point, with $1 \le j \le k$; and $C^{(i)}$ denotes the class to which $x^{(i)}$ belongs. The difference between $x^{(i)}$ and every cluster centroid point in U is calculated; when the difference between the training sample $x^{(i)}$ and the cluster centroid point $\mu_j$ is the smallest, the training sample $x^{(i)}$ is confirmed to belong to class j, the class in which the cluster centroid point $\mu_j$ lies.
After all training samples belonging to class j are obtained, the centroid point of class j is recalculated using equation (2).
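$\mu_j = \frac{\sum_{i=1}^{m} 1\{C^{(i)} = j\}\, x^{(i)}}{\sum_{i=1}^{m} 1\{C^{(i)} = j\}}$ (2)
Wherein $\mu_j$ represents the centroid point of class j, $x^{(i)}$ represents the i-th order behavior in the training sample, $C^{(i)}$ denotes the class to which $x^{(i)}$ currently belongs, and $1\{\cdot\}$ equals 1 when the condition inside holds and 0 otherwise.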
And repeating the calculation processes of the formula (1) and the formula (2) until the first analysis submodel converges.
Wherein, the convergence condition of the first analysis submodel may be:
the difference between every cluster centroid point before and after recalculation is smaller than a preset threshold; or,
for each class, the sum of the squared differences of all samples in the class and the centroid point thereof is less than another preset threshold;
or, other convergence criteria.
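The iteration of formula (1) and formula (2) can be sketched as follows. This is a minimal pure-Python illustration, not the implementation of this embodiment: the function names, the `shift_threshold` and `max_iterations` parameters, the fixed random seed, and the handling of an empty class are all assumptions. It uses the first convergence condition listed above (centroid change below a preset threshold).

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance ||a - b||^2, as used in formula (1)."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def assign_classes(samples, centroids):
    """Formula (1): for every sample, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(x, centroids[j]))
        for x in samples
    ]


def update_centroids(samples, classes, centroids):
    """Formula (2): each centroid becomes the mean of the samples in its class."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [x for x, c in zip(samples, classes) if c == j]
        if not members:
            # Assumption: a class with no samples keeps its previous centroid.
            new_centroids.append(old)
            continue
        new_centroids.append(
            tuple(sum(x[d] for x in members) / len(members) for d in range(len(old)))
        )
    return new_centroids


def k_means(samples, k, shift_threshold=1e-6, max_iterations=100, seed=0):
    """Repeat formulas (1) and (2) until no centroid moves more than the threshold."""
    rng = random.Random(seed)
    centroids = [tuple(x) for x in rng.sample(samples, k)]  # k random samples from X
    classes = assign_classes(samples, centroids)
    for _ in range(max_iterations):
        new_centroids = update_centroids(samples, classes, centroids)
        shift = max(squared_distance(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        classes = assign_classes(samples, centroids)
        if shift < shift_threshold:
            break
    return centroids, classes
```

Each sample here is a tuple of numeric values, one per preset dimension; how non-numeric dimensions such as payment mode would be encoded is left open.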
By carrying out cluster analysis on the data of the historical order behaviors of the sample users and utilizing the result of manual marking, classes of different levels of normal order behaviors and malicious order behaviors can be obtained. For example, Table 1 shows such a grading: each malicious level corresponds to a probability of belonging to a malicious order, and each normal level corresponds to a probability of not belonging to a malicious order.
TABLE 1
Malicious level | Probability of belonging to a malicious order
Malicious level 0 | 50%-60%
Malicious level 1 | 60%-70%
Malicious level 2 | 70%-80%
Malicious level 3 | 80%-90%
Malicious level 4 | 90%-100%
Normal level | Probability of not belonging to a malicious order
Normal level 0 | 50%-60%
Normal level 1 | 60%-70%
Normal level 2 | 70%-80%
Normal level 3 | 80%-90%
Normal level 4 | 90%-100%
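One possible way to assign the levels of Table 1 to a class is sketched below. The text does not state how the probability of a class is derived; this sketch assumes it is the share of manually marked malicious (or normal) order behaviors within the class. The function name `grade_class`, the integer rounding, and the treatment of an exact 50/50 split as malicious level 0 are hypothetical choices.

```python
def grade_class(malicious_count, total_count):
    """Map a class to a (kind, level) pair using the 10-point bands of Table 1.

    malicious_count: manually marked malicious order behaviors in the class
    total_count:     all order behaviors in the class
    """
    if total_count <= 0 or not 0 <= malicious_count <= total_count:
        raise ValueError("counts must satisfy 0 <= malicious_count <= total_count, total_count > 0")
    normal_count = total_count - malicious_count
    if malicious_count >= normal_count:
        # Assumption: an exact tie is treated as malicious level 0.
        kind, majority = "malicious", malicious_count
    else:
        kind, majority = "normal", normal_count
    percent = 100 * majority // total_count  # integer percentage, rounded down
    # Bands start at 50%; a boundary value such as 60% is placed in the
    # higher band (an assumption), and 100% is capped at level 4.
    level = min((percent - 50) // 10, 4)
    return kind, level
```

For instance, a class in which 75 of 100 order behaviors were marked malicious would be graded as malicious level 2 under these assumptions.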
The similarity between the order behavior to be identified and the centroid points of the obtained classes of different levels of malicious order behaviors is then calculated, and the obtained similarity is used to express the malicious level of the order behavior to be identified.
Fig. 4 is a schematic diagram of a separation result of clustering analysis using K-means, where the order behaviors in the training sample are only divided into two types, one type is represented by dots, and the other type is represented by triangles, which represent malicious order behaviors and normal order behaviors, respectively, and the type containing the malicious order behaviors can be determined by the result of manual labeling.
In one implementation, to obtain the first similarity between the order behavior to be identified and the malicious order behavior marked by the first analysis submodel, the similarities between the order behavior to be identified and the centroid points of the classes of different levels of the marked malicious order behavior may be calculated respectively, the calculated similarities are subjected to weighted summation, and the summation result is used as the first similarity. The greater the first similarity, that is, the more similar the order behavior to be identified is to the malicious order behavior marked by the first analysis submodel, the more likely the order behavior to be identified is to belong to the malicious order behavior, so the first similarity can be used as the malicious score.
The similarity may be calculated by using the Euclidean distance or the Pearson similarity, or by using another algorithm.
For example, classes of malicious order behaviors of three different levels, namely malicious level 0, malicious level 1 and malicious level 2, are obtained, wherein the centroid point of the malicious level 0 class is $\mu_o$, the centroid point of the malicious level 1 class is $\mu_p$, and the centroid point of the malicious level 2 class is $\mu_q$.
By utilizing the Pearson similarity calculation, it can be obtained that the similarity between the order behavior to be identified and the centroid point $\mu_o$ is $A_o$, the similarity with the centroid point $\mu_p$ is $A_p$, and the similarity with the centroid point $\mu_q$ is $A_q$.
The malicious score of the new user's order behavior to be identified can be calculated by using formula (3).
$S = 0.1A_o + 0.3A_p + 0.6A_q$ (3)
Wherein S represents the malicious score; in practical application, the weights of $A_o$, $A_p$ and $A_q$ can be adjusted according to different scenes or personal experience.
The malicious score of the order behavior to be identified of the new user reflects the similarity between the order behavior to be identified and the malicious order behavior, and whether the order behavior to be identified belongs to the malicious order behavior can be accurately judged according to the similarity.
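The calculation of formula (3) can be illustrated with a short sketch. It assumes that an order behavior and each centroid point are given as equal-length numeric vectors; the function names and the choice to return 0.0 for a constant vector are hypothetical. Note that the standard Pearson coefficient lies between -1 and 1.

```python
import math


def pearson_similarity(a, b):
    """Pearson correlation coefficient between two equal-length numeric vectors."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("vectors must have the same length of at least 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Assumption: a constant vector carries no correlation information.
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def malicious_score(order, centroids, weights=(0.1, 0.3, 0.6)):
    """Formula (3): weighted sum of the similarities to the malicious centroids.

    centroids are ordered (mu_o, mu_p, mu_q); the default weights are the
    example values of formula (3) and are meant to be tuned.
    """
    return sum(w * pearson_similarity(order, c) for w, c in zip(weights, centroids))
```

With the default weights, similarity to the highest malicious level contributes most to the score, which matches the weighting shown in formula (3).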
S3032: calculating to obtain a second similarity between the order behaviors to be identified and the malicious order behaviors marked by the first analysis sub-model; inputting data of the order behaviors to be identified into a second analysis sub-model corresponding to the user to which the order behaviors to be identified belong, calculating to obtain a third similarity between the order behaviors to be identified and the historical order behaviors of the user to which the order behaviors to be identified belong, performing score summation on the second similarity and the third similarity, and taking the result of the score summation as the malicious score of the order behaviors to be identified.
The step of calculating the second similarity between the order behavior to be identified and the malicious order behavior marked by the first analysis submodel is consistent with the step in S3031, and is not described herein again.
The second analysis submodel is one of the analysis models. For each sample user, logistic regression training is performed on the data of that user's personal historical order behaviors to obtain the analysis submodel corresponding to that user, namely a personal order behavior model. Each user therefore has a corresponding second analysis submodel, which is used to analyze the order behaviors of the user and to determine whether they conform to the past order habits of the user. The user's order behavior preferences are observed and counted over time, and if an order behavior deviates from the user's past order habits, that order behavior can be further identified.
For example, user A usually places orders only at about 10 pm and purchases a number of USB flash drives and camera accessories. If it is detected that user A purchases a large number of lipsticks in the middle of the day, with few lipsticks in each order but a large number of orders, the order behavior of user A purchasing the lipsticks needs to be analyzed.
Using formula (4), Logistic Regression is applied to train on the historical order behavior data of each user, so as to construct the second analysis submodel corresponding to the user.
$f(x) = \theta^T x$ (4)
Where θ represents a model parameter, i.e. a regression coefficient, and x represents data of each preset dimension of the user's historical order behavior, which can be represented by a matrix (5).
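$x = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1j} \\ x_{21} & x_{22} & \cdots & x_{2j} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nj} \end{bmatrix}$ (5)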
Wherein $x_{11}, x_{21}, \dots, x_{n1}$ represent the data of each preset dimension in the user's first historical order behavior, that is, information such as the order quantity, the IP address of order access and the payment mode; there are n dimensions, and n is a natural number greater than 0. For example, if the data of the preset order behavior includes only three dimensions, namely the order quantity, the IP address of order access and the payment mode, then n is 3; if it includes five dimensions, namely the order quantity, the payment mode, the IP address of order access, the geographic position of the IP address and the third-level address of the receiver, then n is 5. j represents the number of historical order behaviors of the user adopted when the second analysis submodel is constructed, and j is a natural number greater than 0. For example, if the second analysis submodel is constructed by using 20 historical order behaviors of the user, then j is 20. The larger the value of j, that is, the more historical order behaviors of the user are used, the more accurate the analysis of the second analysis submodel corresponding to the user, but the larger the amount of data to be processed; in practical application, the value of j can be adjusted according to different scenes or personal experience.
Based on the second analysis submodel corresponding to the user, the probability that the order behavior to be identified of the user is similar to the historical order behavior of the user can be obtained by using the formula (6).
$h_\theta(x) = P(y = 1 \mid x) = \sigma(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$ (6)
Wherein x represents the data of the order behavior to be identified of the user, $\sigma$ represents the S-shaped growth curve (Sigmoid), $\theta$ represents the model parameter, and $h_\theta(x)$ and $P(y = 1 \mid x)$ denote the probability that the order behavior corresponding to the data x is similar to the historical order behavior of the user. The obtained probability expresses the similarity between the order behavior to be identified of the user and the historical order behavior of the user.
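The evaluation of formula (4) followed by the Sigmoid of formula (6) can be sketched as below, assuming the model parameter θ has already been trained and that x is a numeric vector of the same length. The function names are hypothetical, and training itself is not shown.

```python
import math


def linear_score(theta, x):
    """Formula (4): f(x) = theta^T x, the dot product of parameters and features."""
    if len(theta) != len(x):
        raise ValueError("theta and x must have the same length")
    return sum(t * xi for t, xi in zip(theta, x))


def sigmoid(z):
    """S-shaped growth curve, written so that large |z| does not overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def history_similarity(theta, x):
    """Formula (6): probability that x resembles the user's historical order behavior."""
    return sigmoid(linear_score(theta, x))
```

A value near 1 indicates that the order behavior fits the user's past habits, and a value near 0 indicates a deviation from them.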
The matrix (5) can be processed on a distributed system, and the logistic regression training can be completed by using the machine learning library of Spark (Machine Learning Library, abbreviated as MLlib) to obtain the model parameter $\theta$.
The data of the order behavior to be identified of the user is input into the trained second analysis submodel corresponding to the user, and the third similarity between the order behavior to be identified of the user and the historical order behavior of the user is obtained by using formula (6).
The greater the third similarity, that is, the greater the probability that the order behavior to be identified is similar to the historical order behavior of the user, the less likely the order behavior to be identified is to belong to the malicious order behavior. Therefore, when the second similarity and the third similarity are combined, the negative of the third similarity and the second similarity can be weighted and summed to obtain the malicious score; alternatively, the third similarity can be used to obtain the probability that the order behavior to be identified is dissimilar to the historical order behavior of the user, and this dissimilarity probability and the second similarity can be weighted and summed to obtain the malicious score.
For example, classes of malicious order behaviors of three different levels, namely malicious level 0, malicious level 1 and malicious level 2, are obtained, wherein the centroid point of the malicious level 0 class is $\mu_o$, the centroid point of the malicious level 1 class is $\mu_p$, and the centroid point of the malicious level 2 class is $\mu_q$.
By utilizing the Pearson similarity calculation, it can be obtained that the similarity between the order behavior to be identified and the centroid point $\mu_o$ is $A'_o$, the similarity with the centroid point $\mu_p$ is $A'_p$, and the similarity with the centroid point $\mu_q$ is $A'_q$.
The second similarity may be calculated using equation (7).
$S_1 = 0.1A'_o + 0.4A'_p + 0.5A'_q$ (7)
Wherein $S_1$ represents the second similarity; in practical application, the weights of $A'_o$, $A'_p$ and $A'_q$ can be adjusted according to different scenes or personal experience.
The third similarity is obtained by equation (8).
Figure BDA0001347047620000151
Wherein $S_2$ represents the third similarity.
Score summation is performed on the second similarity and the third similarity to obtain the malicious score of the order behavior to be identified.
Specifically, the malicious score can be calculated by using formula (9).
$S = 0.4S_1 + 0.6(1 - S_2)$ (9)
In practical application, the weight value can be adjusted according to different scenes or personal experience.
The score aggregation may be performed using a score accumulator or may be performed using an online adjustable polynomial function.
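Formulas (7) and (9) amount to two weighted sums, which the following sketch makes concrete. The function names are hypothetical, and the default weights are simply the example values from the formulas; they are expected to be tuned per scenario.

```python
def second_similarity(a_o, a_p, a_q, weights=(0.1, 0.4, 0.5)):
    """Formula (7): S1 as a weighted sum of the similarities to the three centroids."""
    w_o, w_p, w_q = weights
    return w_o * a_o + w_p * a_p + w_q * a_q


def combined_malicious_score(s1, s2, w1=0.4, w2=0.6):
    """Formula (9): S = w1 * S1 + w2 * (1 - S2).

    s1 is the similarity to marked malicious behavior; s2 is the similarity to the
    user's own history, so (1 - s2) measures deviation from the user's habits.
    """
    return w1 * s1 + w2 * (1.0 - s2)
```

As a worked case, with $S_1 = 0.8$ and $S_2 = 0.9$ the score is $0.4 \times 0.8 + 0.6 \times 0.1 = 0.38$: strong agreement with the user's own history pulls the score down even though the order resembles marked malicious behavior.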
The malicious score of the to-be-identified order behavior of the old user reflects both the similarity between the to-be-identified order behavior and the malicious order behavior and the degree to which the to-be-identified order behavior deviates from the user's individual order habits; combining these two factors for identification yields a more accurate identification result.
As can be seen from the above, in the scheme provided by this embodiment, for the order behavior to be identified of the new user, the malicious score is directly calculated by using the first analysis submodel; and for the order behavior to be identified of the old user, calculating a second similarity by using the first analysis submodel, calculating a third similarity by using the second analysis submodel corresponding to the old user, and obtaining a malicious score by combining the second similarity and the third similarity. Compared with the prior art, in the scheme provided by the embodiment, different analyses are performed on the order behavior to be identified of the new user and the order behavior to be identified of the old user, so that more accurate malicious scores can be obtained, and the success rate of malicious order identification is further improved.
In an embodiment of the present invention, referring to fig. 5, a third flowchart of a malicious order identification method is provided, including:
S301: acquiring data of order behaviors to be identified.
S302: judging whether the user type to which the data of the order behavior to be identified belongs is a new user; if so, executing S3031, and if not, executing S3032.
S3031: calculating to obtain a first similarity between the order behaviors to be identified and the malicious order behaviors marked by the first analysis sub-model, and taking the first similarity as a malicious score of the order behaviors to be identified.
S3032: calculating to obtain a second similarity between the order behaviors to be identified and the malicious order behaviors marked by the first analysis sub-model; inputting data of the order behaviors to be identified into a second analysis sub-model corresponding to the user to which the order behaviors to be identified belong, calculating to obtain a third similarity between the order behaviors to be identified and the historical order behaviors of the user to which the order behaviors to be identified belong, performing score summation on the second similarity and the third similarity, and taking the result of the score summation as the malicious score of the order behaviors to be identified.
S301, S302, S3031 and S3032 are described in detail in the above embodiments, and are not described herein again.
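The branch taken in S302 can be sketched as a small dispatch function. A new user is one whose number of historical order behaviors is smaller than the preset first threshold; the function names and the idea of passing the two scoring routines in as callables are assumptions made for illustration.

```python
def user_type(history_count, first_threshold):
    """Return "new" if the user has fewer historical orders than the first threshold."""
    return "new" if history_count < first_threshold else "old"


def score_order(order, history_count, first_threshold, score_new_user, score_old_user):
    """S302: route the order to S3031 (new user) or S3032 (old user).

    score_new_user and score_old_user are hypothetical callables standing in for
    the first-similarity and the combined second/third-similarity calculations.
    """
    if user_type(history_count, first_threshold) == "new":
        return score_new_user(order)  # S3031
    return score_old_user(order)      # S3032
```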
S3041: for the order behavior to be identified of the new user, judging whether the malicious score is greater than a preset second threshold; if so, executing S3042, and if not, executing S3043.
The second threshold is set to measure the malice score of the to-be-identified order behavior of the new user, the higher the malice score is, the more likely the to-be-identified order behavior of the new user belongs to the malice order behavior, when the malice score is greater than the second threshold, the to-be-identified order behavior of the new user can be determined to belong to the malice order behavior, and when the malice score is less than or equal to the second threshold, the to-be-identified order behavior of the new user can be determined not to belong to the malice order behavior.
In one implementation, when the Pearson similarity is used, the calculated malicious score of the order behavior to be identified of the new user lies in the range from 0 to 1. In this case, the second threshold may be set to 0.5: the malicious score of the order behavior to be identified of the new user is compared with 0.5, and whether the order behavior belongs to the malicious order behavior is judged according to the comparison result. Judging according to the result of a numerical comparison yields a more accurate judgment result.
In practical applications, the second threshold may be adjusted according to different similarity calculation methods.
S3042: determining that the order behavior to be identified belongs to malicious order behavior.
In one implementation, when the malicious score of the order behavior to be identified of the new user is greater than the second threshold, it is determined that the order behavior to be identified of the new user belongs to malicious order behavior; or when the malicious score of the order behavior to be identified of the old user is greater than the third threshold, it is determined that the order behavior to be identified of the old user belongs to malicious order behavior.
After it is determined that the order behavior to be identified belongs to malicious order behavior, the analysis model can be updated by using the order behavior to be identified, so as to improve the identification accuracy of the analysis model, and key monitoring or other subsequent processing can be performed on the order behaviors of the user to which the order behavior to be identified belongs.
S3043: determining that the order behavior to be identified does not belong to malicious order behavior.
In one implementation, when the malicious score of the order behavior to be identified of the new user is smaller than or equal to the second threshold, it is determined that the order behavior to be identified of the new user does not belong to malicious order behavior; or when the malicious score of the order behavior to be identified of the old user is smaller than or equal to the third threshold, it is determined that the order behavior to be identified of the old user does not belong to malicious order behavior.
After it is determined that the order behavior to be identified does not belong to malicious order behavior, the analysis model can be updated by using the order behavior to be identified, so as to improve the identification accuracy of the analysis model.
S3044: for the order behaviors to be identified of the old user, judging whether the malicious score is greater than a preset third threshold; if so, executing S3042, and if not, executing S3043.
The third threshold is set to measure a malice score of the to-be-identified order behavior of the old user, the higher the malice score is, the more likely the to-be-identified order behavior of the old user belongs to the malice order behavior, when the malice score is greater than the third threshold, it can be determined that the to-be-identified order behavior of the old user belongs to the malice order behavior, and when the malice score is less than or equal to the third threshold, it can be determined that the to-be-identified order behavior of the old user does not belong to the malice order behavior.
In one implementation, when the Pearson similarity is used, the calculated malicious score of the to-be-identified order behavior of the old user lies in the range from 0 to 1. In this case, the third threshold may be set to 0.5: the malicious score of the to-be-identified order behavior of the old user is compared with 0.5, and whether the order behavior belongs to the malicious order behavior is judged according to the comparison result. Judging according to the result of a numerical comparison yields a more accurate judgment result.
In practical applications, the third threshold may be adjusted according to different similarity calculation methods.
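The judgments of S3041 and S3044 differ only in which threshold is applied, as the following sketch shows. The function name and the string labels "new" and "old" are hypothetical; the default value 0.5 for both thresholds is the example value given above and may be adjusted.

```python
def is_malicious(score, user_kind, second_threshold=0.5, third_threshold=0.5):
    """Return True when the malicious score exceeds the threshold for the user type.

    New users are compared with the second threshold (S3041) and old users with
    the third threshold (S3044); a score equal to the threshold is not malicious.
    """
    if user_kind == "new":
        threshold = second_threshold
    elif user_kind == "old":
        threshold = third_threshold
    else:
        raise ValueError("user_kind must be 'new' or 'old'")
    return score > threshold
```

Keeping the two thresholds as separate parameters allows them to be tuned independently, for example a stricter threshold for old users whose score already incorporates personal history.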
As can be seen from the above, in the scheme provided in this embodiment, for the order behavior to be identified of the new user, the obtained malicious score is compared with the second threshold value, so as to determine whether the order behavior to be identified of the new user belongs to a malicious order behavior; and for the order behaviors to be identified of the old user, comparing the obtained malicious score with a third threshold value to judge whether the order behaviors to be identified of the old user belong to malicious order behaviors. Compared with the prior art, in the scheme provided by the embodiment, when comparing the malicious scores, the obtained malicious scores are compared with different threshold values according to different user types to which the order behaviors to be identified belong, so as to judge whether the order behaviors to be identified belong to the malicious order behaviors, and a more accurate comparison result can be obtained, so that the success rate of malicious order identification is improved.
In an embodiment of the present invention, when the analysis model is constructed by using the data of the preset order behavior, the order behavior may include one or a combination of the following: the IP address of the order access, the geographic location of the IP address, the equipment used by the order request, the goods type of the order, the quantity of each order, the order time, the payment method, the third-level address of the consignee, the name of the consignee and the telephone of the consignee.
The data contained in the order behaviors can be used as a basis for judging whether the order behaviors to be identified belong to malicious order behaviors, and the data can be combined, added or deleted according to different scenes.
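The listed dimensions can be pictured as a simple record from which a subset is selected per scenario. The class name, field names and the `select_dimensions` helper below are hypothetical; the text does not prescribe any data layout, and the encoding of these values into numeric model inputs is not covered.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class OrderBehavior:
    """One order behavior; every field is optional so dimensions can be omitted."""
    ip_address: Optional[str] = None
    ip_geolocation: Optional[str] = None
    device: Optional[str] = None
    goods_type: Optional[str] = None
    quantity: Optional[int] = None
    order_time: Optional[str] = None
    payment_method: Optional[str] = None
    consignee_third_level_address: Optional[str] = None
    consignee_name: Optional[str] = None
    consignee_phone: Optional[str] = None


def select_dimensions(order, names):
    """Keep only the dimensions chosen for the current scenario."""
    data = asdict(order)
    unknown = [name for name in names if name not in data]
    if unknown:
        raise KeyError("unknown dimensions: " + ", ".join(unknown))
    return {name: data[name] for name in names}
```

Adding or deleting a dimension then only changes the list of names passed to `select_dimensions`, which mirrors the statement that the data can be combined, added or deleted according to the scenario.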
As can be seen from the above, in the scheme provided by this embodiment, the order behaviors include important data for identifying malicious orders, and the malicious order behaviors can be accurately identified according to the data, so that the success rate of identifying malicious orders is improved.
Corresponding to the malicious order identification method, the embodiment of the invention also provides a malicious order identification device.
Fig. 6 is a schematic structural diagram of a malicious order identification apparatus according to an embodiment of the present invention, including: a data acquisition module 601, a score acquisition module 602 and a behavior judgment module 603.
The data acquisition module 601 is configured to acquire data of order behaviors to be identified;
a score obtaining module 602, configured to analyze the data of the order behavior to be identified by using an analysis model, and obtain a malicious score of the order behavior to be identified, where the analysis model is obtained by performing model training according to preset data of the order behavior;
and a behavior judging module 603, configured to judge whether the order behavior to be identified belongs to a malicious order behavior according to the malicious score.
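The cooperation of the three modules can be sketched as a small pipeline. The class name and the use of plain callables for modules 601, 602 and 603 are assumptions for illustration only; the apparatus is not limited to this arrangement.

```python
class MaliciousOrderIdentifier:
    """Chains data acquisition, score obtaining and behavior judgment."""

    def __init__(self, acquire_data, obtain_score, judge_behavior):
        self.acquire_data = acquire_data      # role of data acquisition module 601
        self.obtain_score = obtain_score      # role of score obtaining module 602
        self.judge_behavior = judge_behavior  # role of behavior judging module 603

    def identify(self, raw_order):
        """Return the judgment for one order behavior to be identified."""
        data = self.acquire_data(raw_order)
        score = self.obtain_score(data)
        return self.judge_behavior(score)
```

Because each stage is passed in, the scoring stage can be replaced by a different analysis model without touching acquisition or judgment.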
As can be seen from the above, in the scheme provided in this embodiment, an analysis model is constructed according to data of preset order behaviors to analyze the order behaviors to be recognized, so as to obtain a malicious score of the order behaviors to be recognized, and whether the order behaviors to be recognized belong to malicious order behaviors is determined according to the malicious score. Compared with the prior art, in the scheme provided by the embodiment, the order behaviors of the user can be analyzed by using the statistical learning and machine learning analysis models, wherein when the analysis models are constructed, the preset data of the order behaviors are adopted, and new data can be added or unnecessary data can be deleted according to needs, so that the analysis models have multiple dimensions and self-adaptability, the order behaviors are prevented from being identified from a single dimension, the success rate and the range of malicious order identification can be improved, and the security monitoring can be performed on the e-commerce system of a company.
In an embodiment of the present invention, referring to fig. 7, a second structural diagram of a malicious order identification apparatus is provided, including: a data acquisition module 701, a type judgment module 702, a first score obtaining sub-module 7031, a second score obtaining sub-module 7032, and a behavior judgment module 704.
The data acquisition module 701 is the same as the data acquisition module 601 in the above embodiments, and details are not repeated here.
The type judgment module 702 is configured to determine a user type to which the data of the order behavior to be identified belongs, where the user type includes a new user or an old user, the new user is a user whose historical order behavior number is smaller than a preset first threshold, and the old user is a user whose historical order behavior number is greater than or equal to the first threshold; if the user type to which the data of the order behavior to be identified belongs is a new user, triggering the first score obtaining sub-module 7031, and if the user type to which the data of the order behavior to be identified belongs is an old user, triggering the second score obtaining sub-module 7032;
the first score obtaining sub-module 7031 is configured to calculate and obtain a first similarity between the order behavior to be identified and a malicious order behavior marked by a first analysis sub-model, and use the first similarity as a malicious score of the order behavior to be identified, where the first analysis sub-model is one of the analysis models, and performs K-means cluster analysis on data of historical order behaviors of a sample user to obtain analysis sub-models of classes of different grades of a normal order behavior and a malicious order behavior;
the second score obtaining sub-module 7032 is configured to calculate and obtain a second similarity between the order behavior to be identified and the malicious order behavior marked by the first analysis sub-model, and use the second similarity as a first malicious score of the order behavior to be identified; inputting the data of the order behaviors to be identified into a second analysis submodel corresponding to the user to which the order behaviors to be identified belong, calculating to obtain a third similarity between the order behaviors to be identified and the historical order behaviors of the user to which the order behaviors to be identified belong, generating a second malicious score of the order behaviors to be identified according to the third similarity, performing score summation on the first malicious score and the second malicious score, and taking the result of the score summation as the malicious score of the order behaviors to be identified, wherein the second analysis submodel is one of the analysis models, and is an analysis submodel corresponding to each sample user, which is obtained by performing logistic regression training by using the data of the individual historical order behaviors of the sample user.
The behavior judgment module 704 is consistent with the behavior judging module 603 in the above embodiments, and is not described herein again.
As can be seen from the above, in the scheme provided by this embodiment, for the order behavior to be identified of the new user, the malicious score is directly calculated by using the first analysis submodel; and for the order behavior to be identified of the old user, calculating a second similarity by using the first analysis submodel, calculating a third similarity by using the second analysis submodel corresponding to the old user, and obtaining a malicious score by combining the second similarity and the third similarity. Compared with the prior art, in the scheme provided by the embodiment, different analyses are performed on the order behavior to be identified of the new user and the order behavior to be identified of the old user, so that more accurate malicious scores can be obtained, and the success rate of malicious order identification is further improved.
In an embodiment of the present invention, referring to fig. 8, a third structural diagram of a malicious order identification apparatus is provided, wherein the behavior judgment module 704 includes: a first scoring judgment sub-module 7041, a first behavior determination sub-module 7042, a second behavior determination sub-module 7043, and a second scoring judgment sub-module 7044.
The first scoring judgment sub-module 7041 is configured to, when the user type to which the data of the order behavior to be identified belongs is a new user, judge whether the malicious score is greater than a preset second threshold; if the malicious score is greater than the second threshold, trigger the first behavior determination sub-module 7042, and if the malicious score is less than or equal to the second threshold, trigger the second behavior determination sub-module 7043;
a first behavior determining sub-module 7042, configured to determine that the order behavior to be identified belongs to a malicious order behavior;
a second behavior determining sub-module 7043, configured to determine that the order behavior to be identified does not belong to a malicious order behavior;
the second scoring judgment sub-module 7044 is configured to, when the user type to which the data of the order behavior to be identified belongs is an old user, judge whether the malicious score is greater than a preset third threshold; if the malicious score is greater than the third threshold, trigger the first behavior determination sub-module 7042, and if the malicious score is less than or equal to the third threshold, trigger the second behavior determination sub-module 7043.
As can be seen from the above, in the scheme provided in this embodiment, for the order behavior to be identified of the new user, the obtained malicious score is compared with the second threshold value, so as to determine whether the order behavior to be identified of the new user belongs to a malicious order behavior; and for the order behaviors to be identified of the old user, comparing the obtained malicious score with a third threshold value to judge whether the order behaviors to be identified of the old user belong to malicious order behaviors. Compared with the prior art, in the scheme provided by the embodiment, when comparing the malicious scores, the obtained malicious scores are compared with different threshold values according to different user types to which the order behaviors to be identified belong, so as to judge whether the order behaviors to be identified belong to the malicious order behaviors, and a more accurate comparison result can be obtained, so that the success rate of malicious order identification is improved.
In an embodiment of the present invention, when the analysis model is constructed by using the data of the preset order behavior, the order behavior may include one or a combination of the following: the IP address of the order access, the geographic location of the IP address, the equipment used by the order request, the goods type of the order, the quantity of each order, the order time, the payment method, the third-level address of the consignee, the name of the consignee and the telephone of the consignee.
As can be seen from the above, in the scheme provided by this embodiment, the order behaviors include important data for identifying malicious orders, and the malicious order behaviors can be accurately identified according to the data, so that the success rate of identifying malicious orders is improved.
An embodiment of the present invention further provides an electronic device, as shown in Fig. 9, which includes a processor 901, a communication interface 902, a memory 903, and a communication bus 904, where the processor 901, the communication interface 902, and the memory 903 communicate with one another through the communication bus 904;
the memory 903 is configured to store a computer program;
the processor 901 is configured to implement the following steps when executing the program stored in the memory 903:
acquiring data of order behaviors to be identified;
analyzing the data of the order behaviors to be identified by using an analysis model to obtain malicious scores of the order behaviors to be identified, wherein the analysis model is obtained by performing model training according to preset data of the order behaviors;
and judging whether the order behavior to be identified belongs to malicious order behavior or not according to the malicious scores.
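The three steps executed by the processor can be sketched as a small pipeline. This is only an outline under stated assumptions: `identify_order`, the `fetch` and `model` callables, and the single `threshold` parameter are hypothetical stand-ins for the data source, the trained analysis model, and the judging rule.

```python
from typing import Any, Callable, Mapping, Tuple

OrderData = Mapping[str, Any]


def identify_order(fetch: Callable[[], OrderData],
                   model: Callable[[OrderData], float],
                   threshold: float) -> Tuple[float, bool]:
    """Run the acquire, score, and judge steps in order."""
    # Step 1: acquire data of the order behavior to be identified.
    order_data = fetch()
    # Step 2: analyze the data with the analysis model to get a malicious score.
    score = model(order_data)
    # Step 3: judge whether the behavior is malicious according to the score.
    return score, score > threshold
```

Passing the data source and the model in as callables keeps the sketch independent of how the model was trained.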
The communication bus of the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The memory may include a Random Access Memory (RAM) or a non-volatile memory, such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In yet another embodiment of the present invention, a computer-readable storage medium is further provided, which has instructions stored therein, and when the instructions are executed on a computer, the instructions cause the computer to execute the method for malicious order identification described in any one of the above embodiments.
In yet another embodiment of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the method of malicious order identification as in any of the above embodiments.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between those entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
All the embodiments in this specification are described in a related manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, since the system embodiment is substantially similar to the method embodiment, its description is relatively brief; for relevant details, refer to the corresponding description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (7)

1. A method of malicious order identification, the method comprising:
acquiring data of order behaviors to be identified;
analyzing the data of the order behaviors to be identified by using an analysis model to obtain malicious scores of the order behaviors to be identified, wherein the analysis model is obtained by performing model training according to preset data of the order behaviors;
judging whether the order behavior to be identified belongs to malicious order behavior or not according to the malicious score;
before analyzing the data of the order behavior to be identified by using the analysis model to obtain the malicious score of the order behavior to be identified, the method further comprises:
judging whether the user type to which the data of the order behaviors to be identified belongs is a new user or not, wherein the user type comprises a new user or an old user, the new user is a user of which the historical order behavior number is smaller than a preset first threshold value, and the old user is a user of which the historical order behavior number is larger than or equal to the first threshold value;
the analyzing the data of the order behavior to be identified by using the analysis model to obtain the malicious score of the order behavior to be identified comprises the following steps:
when the user type to which the data of the order behaviors to be identified belongs is judged to be a new user, calculating to obtain a first similarity between the order behaviors to be identified and malicious order behaviors marked by a first analysis sub-model, and taking the first similarity as the malicious score of the order behaviors to be identified, wherein the first analysis sub-model is one of the analysis models and is an analysis sub-model obtained by performing K-means cluster analysis on the data of the historical order behaviors of sample users, the analysis sub-model comprising classes formed by normal order behaviors of different levels and classes formed by malicious order behaviors of different levels;
or when the user type to which the data of the order behaviors to be identified belongs is judged to be an old user, calculating to obtain a second similarity between the order behaviors to be identified and the malicious order behaviors marked by the first analysis sub-model;
inputting the data of the order behaviors to be identified into a second analysis submodel corresponding to the user to which the order behaviors to be identified belong, calculating to obtain a third similarity between the order behaviors to be identified and the historical order behaviors of the user to which the order behaviors to be identified belong, performing score aggregation on the second similarity and the third similarity, and taking the result of the score aggregation as the malicious score of the order behaviors to be identified, wherein the second analysis submodel is one of the analysis models and is an analysis submodel corresponding to each sample user, which is obtained by performing logistic regression training by using the data of the individual historical order behaviors of the sample user for each sample user.
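The scoring branch of claim 1 can be sketched as below, under several loudly stated assumptions. The claim does not define how a similarity is computed or how the second and third similarities are aggregated, so this sketch assumes: (a) the malicious clusters from a K-means run done elsewhere are summarized by their centroids; (b) similarity to them is `1 / (1 + distance to the nearest malicious centroid)`; (c) the per-user logistic regression sub-model is represented by a callable returning a similarity to the user's own history in `[0, 1]`; and (d) aggregation is a weighted average in which a high similarity to the user's own history lowers the score. All names are hypothetical.

```python
import math
from typing import Callable, Optional, Sequence

Vector = Sequence[float]


def user_type(history_count: int, first_threshold: int) -> str:
    """New user if the historical order count is below the first threshold."""
    return "new" if history_count < first_threshold else "old"


def cluster_similarity(x: Vector, malicious_centroids: Sequence[Vector]) -> float:
    """Assumed similarity: 1 / (1 + Euclidean distance to nearest malicious centroid)."""
    nearest = min(math.dist(x, c) for c in malicious_centroids)
    return 1.0 / (1.0 + nearest)


def malicious_score(x: Vector,
                    history_count: int,
                    first_threshold: int,
                    malicious_centroids: Sequence[Vector],
                    per_user_model: Optional[Callable[[Vector], float]] = None,
                    weight: float = 0.5) -> float:
    """Score one order behavior given as a numeric feature vector."""
    sim_to_malicious = cluster_similarity(x, malicious_centroids)
    if user_type(history_count, first_threshold) == "new":
        # New user: the first similarity is used directly as the score.
        return sim_to_malicious
    if per_user_model is None:
        raise ValueError("an old user requires a per-user sub-model")
    # Old user: aggregate the second similarity (to malicious clusters) with
    # the third similarity (to the user's own history). The weighted average
    # and the (1 - similarity) inversion are assumptions, not from the claim.
    sim_to_history = per_user_model(x)
    return weight * sim_to_malicious + (1.0 - weight) * (1.0 - sim_to_history)
```

The resulting score would then be compared with the second or third threshold depending on the user type.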
2. The method of claim 1, wherein the determining whether the order behavior to be identified belongs to a malicious order behavior according to the malicious score comprises:
when the user type to which the data of the order behaviors to be identified belongs is a new user and the malicious score is greater than a preset second threshold value, determining that the order behaviors to be identified belong to malicious order behaviors;
or when the user type to which the data of the order behavior to be identified belongs is a new user and the malicious score is smaller than or equal to the second threshold value, determining that the order behavior to be identified does not belong to a malicious order behavior;
or when the user type to which the data of the order behavior to be identified belongs is an old user and the malicious score is greater than a preset third threshold value, determining that the order behavior to be identified belongs to a malicious order behavior;
or when the user type to which the data of the order behavior to be identified belongs is an old user and the malicious score is smaller than or equal to the third threshold value, determining that the order behavior to be identified does not belong to a malicious order behavior.
3. The method according to claim 1 or 2, wherein the order behavior comprises one or a combination of the following:
the IP address of the order access, the geographic location of the IP address, the equipment used by the order request, the goods type of the order, the quantity of each order, the order time, the payment method, the third-level address of the consignee, the name of the consignee and the telephone of the consignee.
4. An apparatus for malicious order identification, the apparatus comprising:
the data acquisition module is used for acquiring data of order behaviors to be identified;
the score obtaining module is used for analyzing the data of the order behaviors to be identified by utilizing an analysis model to obtain the malicious scores of the order behaviors to be identified, wherein the analysis model is obtained by performing model training according to the preset data of the order behaviors;
the behavior judging module is used for judging whether the order behavior to be identified belongs to malicious order behavior according to the malicious scores;
the device also comprises a type judging module, and the score obtaining module comprises: a first score obtaining sub-module and a second score obtaining sub-module;
the type judging module is used for judging whether the user type to which the data of the order behaviors to be identified belongs is a new user or not, wherein the user type comprises a new user or an old user, the new user is a user of which the historical order behavior number is smaller than a preset first threshold value, and the old user is a user of which the historical order behavior number is larger than or equal to the first threshold value; if the user type to which the data of the order behavior to be identified belongs is a new user, triggering the first score obtaining sub-module, and if the user type to which the data of the order behavior to be identified belongs is an old user, triggering the second score obtaining sub-module;
the first score obtaining sub-module is used for calculating and obtaining a first similarity between the order behaviors to be identified and malicious order behaviors marked by a first analysis sub-model, and taking the first similarity as the malicious score of the order behaviors to be identified, wherein the first analysis sub-model is one of the analysis models and is an analysis sub-model obtained by performing K-means cluster analysis on the data of the historical order behaviors of sample users, the analysis sub-model comprising classes formed by normal order behaviors of different levels and classes formed by malicious order behaviors of different levels;
the second score obtaining sub-module is used for calculating and obtaining a second similarity between the order behaviors to be identified and the malicious order behaviors marked by the first analysis sub-model;
inputting the data of the order behaviors to be identified into a second analysis submodel corresponding to the user to which the order behaviors to be identified belong, calculating to obtain a third similarity between the order behaviors to be identified and the historical order behaviors of the user to which the order behaviors to be identified belong, performing score aggregation on the second similarity and the third similarity, and taking the result of the score aggregation as the malicious score of the order behaviors to be identified, wherein the second analysis submodel is one of the analysis models and is an analysis submodel corresponding to each sample user, which is obtained by performing logistic regression training by using the data of the individual historical order behaviors of the sample user for each sample user.
5. The apparatus of claim 4, wherein the behavior judging module comprises: a first score judgment sub-module, a first behavior determining sub-module, a second behavior determining sub-module and a second score judgment sub-module;
the first score judgment sub-module is configured to judge whether the malicious score is greater than a preset second threshold value, trigger the first behavior determining sub-module if the malicious score is greater than the second threshold value, and trigger the second behavior determining sub-module if the malicious score is less than or equal to the second threshold value;
the first behavior determining sub-module is used for determining that the order behavior to be identified belongs to malicious order behavior;
the second behavior determining sub-module is used for determining that the order behavior to be identified does not belong to malicious order behavior;
the second score judgment sub-module is configured to judge whether the malicious score is greater than a preset third threshold value, trigger the first behavior determining sub-module if the malicious score is greater than the third threshold value, and trigger the second behavior determining sub-module if the malicious score is less than or equal to the third threshold value.
6. The apparatus of claim 4 or 5, wherein the order behavior comprises one or more of the following:
the IP address of the order access, the geographic location of the IP address, the equipment used by the order request, the goods type of the order, the quantity of each order, the order time, the payment method, the third-level address of the consignee, the name of the consignee and the telephone of the consignee.
7. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;
the memory is used for storing a computer program;
the processor is used for implementing the method steps of any of claims 1 to 3 when executing the program stored in the memory.
CN201710560874.6A 2017-07-11 2017-07-11 Malicious order identification method and device and electronic equipment Active CN107341716B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710560874.6A CN107341716B (en) 2017-07-11 2017-07-11 Malicious order identification method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN107341716A CN107341716A (en) 2017-11-10
CN107341716B true CN107341716B (en) 2020-12-25

Family

ID=60219596

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710560874.6A Active CN107341716B (en) 2017-07-11 2017-07-11 Malicious order identification method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN107341716B (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107944976A (en) * 2017-12-15 2018-04-20 康成投资(中国)有限公司 Online order checking method
CN108564423A (en) * 2017-12-28 2018-09-21 携程旅游网络技术(上海)有限公司 Method, system, device and storage medium for identifying malicious seat occupation in ticketing orders
CN108550069A (en) * 2018-04-19 2018-09-18 上海携程商务有限公司 Travel demand report pushing method, device, electronic equipment and storage medium
CN108564448A (en) * 2018-04-23 2018-09-21 广东奥园奥买家电子商务有限公司 Implementation method for preventing order brushing
CN108876545A (en) * 2018-06-22 2018-11-23 北京小米移动软件有限公司 Order recognition method, device and readable storage medium
CN110874778B (en) * 2018-08-31 2023-04-25 阿里巴巴集团控股有限公司 Abnormal order detection method and device
CN110955890B (en) * 2018-09-26 2021-08-17 瑞数信息技术(上海)有限公司 Method and device for detecting malicious batch access behaviors and computer storage medium
CN111612197A (en) * 2019-02-25 2020-09-01 北京嘀嘀无限科技发展有限公司 Security event order detection method and device and electronic equipment
CN111768258A (en) * 2019-06-05 2020-10-13 北京京东尚科信息技术有限公司 Method, device, electronic equipment and medium for identifying abnormal order
CN110335115A (en) * 2019-07-01 2019-10-15 阿里巴巴集团控股有限公司 A kind of service order processing method and processing device
CN110348967A (en) * 2019-07-12 2019-10-18 携程旅游信息技术(上海)有限公司 Analysis method, system and the storage medium of user behavior tracking data
CN110910197A (en) * 2019-10-16 2020-03-24 青岛合聚富电子商务有限公司 Order processing method
CN110738506A (en) * 2019-10-22 2020-01-31 杭州蓝诗网络科技有限公司 Malicious bad comment intercepting system of shopping platform
CN112950298A (en) * 2019-11-26 2021-06-11 北京沃东天骏信息技术有限公司 Malicious order identification method and device and storage medium
CN112989295A (en) * 2019-12-16 2021-06-18 北京沃东天骏信息技术有限公司 User identification method and device
CN111047417A (en) * 2019-12-24 2020-04-21 北京每日优鲜电子商务有限公司 Service monitoring method, device, equipment and storage medium
CN111311150B (en) * 2020-02-10 2020-12-22 拉扎斯网络科技(上海)有限公司 Distribution task grouping method, platform, electronic equipment and storage medium
CN112116284B (en) * 2020-03-27 2021-04-13 上海寻梦信息技术有限公司 False waybill identification method, false waybill identification system, electronic equipment and storage medium
CN113763077A (en) * 2020-07-24 2021-12-07 北京沃东天骏信息技术有限公司 Method and apparatus for detecting false trade orders
CN112765502B (en) * 2021-01-13 2024-03-19 上海派拉软件股份有限公司 Malicious access detection method, device, electronic equipment and storage medium
CN113781156A (en) * 2021-05-13 2021-12-10 北京沃东天骏信息技术有限公司 Malicious order recognition method, malicious order model training method, malicious order recognition equipment and malicious order model training storage medium
CN113298642B (en) * 2021-05-26 2024-02-23 上海晓途网络科技有限公司 Order detection method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104081385A (en) * 2011-04-29 2014-10-01 汤姆森路透社全球资源公司 Representing information from documents
CN105069626A (en) * 2015-07-23 2015-11-18 北京京东尚科信息技术有限公司 Detection method and detection system for shopping abnormity
CN105468742A (en) * 2015-11-25 2016-04-06 小米科技有限责任公司 Malicious order recognition method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10250605B2 (en) * 2015-09-30 2019-04-02 Quest Software Inc. Combining a set of risk factors to produce a total risk score within a risk engine
CN106557955A (en) * 2016-11-29 2017-04-05 流量海科技成都有限公司 Method and system for identifying abnormal online ride-hailing orders

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
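Identification of collusive sales-volume fraud based on template users' information search behavior and statistical analysis (基于模板用户信息搜索行为和统计分析的共谋销量欺诈识别); Wang Zhongqun (王忠群) et al.; 《现代图书情报技术》 (New Technology of Library and Information Service); 2015-11-25 (No. 11); pp. 41-42, 45, 47 *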

Also Published As

Publication number Publication date
CN107341716A (en) 2017-11-10

Similar Documents

Publication Publication Date Title
CN107341716B (en) Malicious order identification method and device and electronic equipment
CN108985830B (en) Recommendation scoring method and device based on heterogeneous information network
CN110163647B (en) Data processing method and device
TWI718422B (en) Method, device and equipment for fusing model prediction values
JP4697670B2 (en) Identification data learning system, learning device, identification device, and learning method
CN109711955B (en) Poor evaluation early warning method and system based on current order and blacklist base establishment method
CN111523976A (en) Commodity recommendation method and device, electronic equipment and storage medium
CN106251174A (en) Information recommendation method and device
CN110008397B (en) Recommendation model training method and device
WO2020192013A1 (en) Directional advertisement delivery method and apparatus, and device and storage medium
CN113240130B (en) Data classification method and device, computer readable storage medium and electronic equipment
CN109165975B (en) Label recommending method, device, computer equipment and storage medium
CN108876545A (en) Order recognition method, device and readable storage medium
CN110135681A (en) Risk user recognition method, device, readable storage medium and terminal device
CN107766467B (en) Information detection method and device, electronic equipment and storage medium
CN112529663A (en) Commodity recommendation method and device, terminal equipment and storage medium
CN111275205A (en) Virtual sample generation method, terminal device and storage medium
CN111209929A (en) Access data processing method and device, computer equipment and storage medium
CN111695024A (en) Object evaluation value prediction method and system, and recommendation method and system
CN111461827A (en) Product evaluation information pushing method and device
CN113656699B (en) User feature vector determining method, related equipment and medium
CN108647986B (en) Target user determination method and device and electronic equipment
CN112990989B (en) Value prediction model input data generation method, device, equipment and medium
CN108985755B (en) Account state identification method and device and server
CN112199500A (en) Emotional tendency identification method and device for comments and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant