CN111723118A - Waybill inquiry abnormal behavior detection method and device - Google Patents


Info

Publication number
CN111723118A
Authority
CN
China
Prior art keywords
user
query
group
data
waybill
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910203972.3A
Other languages
Chinese (zh)
Inventor
冯春进
黄丽诗
胡泽柱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SF Technology Co Ltd
SF Tech Co Ltd
Original Assignee
SF Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SF Technology Co Ltd filed Critical SF Technology Co Ltd
Priority to CN201910203972.3A priority Critical patent/CN111723118A/en
Publication of CN111723118A publication Critical patent/CN111723118A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/08 Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q10/083 Shipping
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0633 Lists, e.g. purchase orders, compilation or processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Economics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Accounting & Taxation (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Operations Research (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Resources & Organizations (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Tourism & Hospitality (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a method and a device for detecting abnormal behavior of waybill inquiry, wherein the method for detecting the abnormal behavior of waybill inquiry comprises the following steps: acquiring query data of user group query behaviors based on the waybill number; according to the characteristics of all query behaviors in the query data, counting the query behaviors of single users in the user group and the corresponding frequency of the query behaviors; calculating PCA scores of the query data of a single user in each time aggregation unit to form a first data group; calculating the DTW distance of every two users according to the first data group to form a second data group; and dividing the user group into a plurality of groups by adopting a cluster analysis method based on the second data group, and outputting the user information of which the number of people in the group is less than a preset threshold value. The DTW can dynamically adjust the similarity among different time sequences, determine the optimal corresponding points of the different time sequences and lay a foundation for better expression effect in the cluster analysis.

Description

Waybill inquiry abnormal behavior detection method and device
Technical Field
The invention relates to the technical field of data mining, in particular to a method and a device for detecting abnormal behavior of waybill inquiry.
Background
At present, abnormal user behavior is generally detected by analyzing a single behavior, which does not reflect the real behavior of a user well. In practical application this produces a large number of false alarms, so that investigators spend excessive time on investigation and still fail to cover real abnormal events. The expected detection effect is difficult to achieve and the detection cost is high; yet without effective detection, the company is exposed to the risk of information leakage. It is therefore necessary to detect the daily abnormal behaviors of users effectively in order to protect the company's interests.
Disclosure of Invention
In order to solve the above technical problems, an object of the present invention is to provide a waybill inquiry abnormal behavior detection method and apparatus.
According to one aspect of the invention, a waybill inquiry abnormal behavior detection method is provided, which comprises the following steps:
acquiring query data of user group query behaviors based on the waybill number;
according to the characteristics of all query behaviors in the query data, counting the query behaviors of single users in the user group and the corresponding frequency of the query behaviors;
calculating PCA scores of the query data of a single user in each time aggregation unit to form a first data group;
calculating the DTW distance of every two users according to the first data group to form a second data group;
and dividing the user group into a plurality of groups by adopting a cluster analysis method based on the second data group, and outputting the user information of which the number of people in the group is less than a preset threshold value.
Further, calculating the PCA score of the query data of the single user in each time aggregation unit to form a first data group, including:
performing Z-score processing on the query behavior of each user and the corresponding frequency of each user;
and predicting the principal component score, and calculating the PCA score of the query data of a single user in each time aggregation unit according to the variance contribution rate of the principal component and the principal component to form a first data group.
Further, calculating the DTW distance of every two users according to the first data set to form a second data set, including:
acquiring a time sequence of a plurality of time aggregation units;
constructing a first matrix by the first data group and the corresponding time sequence;
and calculating the DTW distance of every two users based on the first matrix to form a second data group.
Further, a clustering analysis method is adopted to divide the user group into a plurality of groups based on the second data group, and user information with the number of people in the group less than a preset threshold value is output, wherein the method comprises the following steps:
acquiring a time sequence of a time aggregation unit according to time sequence;
constructing a second matrix by the first data group and the corresponding user sequence;
and setting the grouping number, grouping the user groups by adopting hierarchical clustering analysis, and outputting a user sequence with the number of people in the group being less than a preset threshold value.
Further, the query behavior characteristics include at least two of: the waybill inquiry terminal ip, the waybill number inquiry behavior, the client behavior corresponding to the waybill inquiry, the delivery place behavior corresponding to the waybill inquiry, the receiving place behavior corresponding to the waybill inquiry and the industry behavior corresponding to the waybill inquiry;
and/or
The time aggregation unit includes at least one of: days, n days, weeks, months;
and/or
The user information includes at least one of: user sequence, user code, user name, user post.
According to another aspect of the present invention, there is provided a waybill inquiry abnormal behavior detection apparatus, including:
the data acquisition module is configured for acquiring query data of a user group query behavior based on the waybill number;
the data statistics module is configured for counting the query behavior of each user and the corresponding frequency thereof according to the query behavior characteristics in the query data;
the first calculation module is configured for calculating PCA scores of query data of a single user in each time aggregation unit to form a first data group;
the second calculation module is configured to calculate the DTW distance between every two users according to the first data group to form a second data group;
and the information output module is configured for dividing the user group into a plurality of groups by adopting a clustering analysis method based on the second data group and outputting the user information of which the number of people in the group is less than a preset threshold value.
Further, the first computing module includes:
the normalization processing unit is configured to perform normalization processing on the query behavior of each user and the frequency corresponding to the query behavior;
and the prediction unit is configured for determining the number of the principal components, predicting the scores of the principal components, and calculating the PCA scores of the query data of the single user in each time aggregation unit according to the variance contribution rates of the principal components and the principal components to form a first data group.
Further, the data acquisition module is also configured to acquire a time sequence of the time aggregation unit according to time sequence;
the second computing module, comprising:
a first matrix construction unit configured to construct a first matrix from the first data group and the time series corresponding thereto;
and the first data group acquisition unit is configured to calculate the DTW distance between every two users according to the first data group to form a second data group.
Further, the data acquisition module is also configured to acquire user information of each user in the user group;
an information output module comprising
The second matrix constructing unit is configured for constructing the first data group and the corresponding user information into a second matrix;
the grouping unit is configured for setting the grouping number and grouping the user groups by adopting hierarchical clustering analysis;
and the identification output unit is configured for outputting the user information of which the number of people in the group is less than a preset threshold value.
Further, the query behavior characteristics include at least two of: the waybill inquiry terminal ip, the waybill number inquiry behavior, the client behavior corresponding to the waybill inquiry, the delivery place behavior corresponding to the waybill inquiry, the receiving place behavior corresponding to the waybill inquiry and the industry behavior corresponding to the waybill inquiry;
and/or
The time aggregation unit includes at least one of: days, n days, weeks, months;
and/or
The user information includes at least one of: user sequence, user code, user name, user post.
According to another aspect of the present invention, there is provided an apparatus comprising:
one or more processors;
a memory for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of the above.
According to another aspect of the invention, there is provided a computer readable storage medium storing a computer program which, when executed by a processor, implements a method as defined in any one of the above.
Compared with the prior art, the invention has the following beneficial effects:
1. The waybill inquiry abnormal behavior detection method disclosed by the invention calculates, based on waybill numbers, the PCA score of the query data of a single user in each time aggregation unit and the DTW (dynamic time warping) distance between every two users; divides the user group into a plurality of groups by a cluster analysis method and sets a threshold; and outputs the user information of the groups in which the number of people is less than the preset threshold, so that the output users can be checked for abnormal waybill inquiry behavior. This identifies abnormal users whose query behavior is inconsistent with the group, narrows the detection range, and increases the detection accuracy and the detection efficiency.
2. The waybill inquiry abnormal behavior detection device disclosed by the invention acquires waybill inquiry data of a user group, divides the user group into a plurality of groups by adopting a cluster analysis method, sets a threshold value, outputs user information of which the number of people in the group is less than the preset threshold value, detects whether customer information inquiry abnormal behavior exists in the output user, identifies abnormal people from the group, reduces the detection range, and increases the detection accuracy and the detection efficiency.
3. According to the device disclosed by the invention, the processor executes the method for detecting the abnormal behavior of the customer information query, so that abnormal personnel can be identified from a group, the detection range is narrowed, and the detection accuracy and the detection efficiency are improved.
4. The readable storage medium disclosed by the invention stores a program which, when executed by a processor, implements the customer information inquiry abnormal behavior detection method, so that the use and popularization of the detection device are facilitated.
Drawings
FIG. 1 is a flow chart of a waybill inquiry abnormal behavior detection method of the present invention;
Detailed Description
In order to better understand the technical scheme of the invention, the invention is further explained by combining the specific embodiment and the attached drawings of the specification.
Example:
the waybill inquiry abnormal behavior detection device of the embodiment includes:
the data acquisition module is configured to acquire query data of a user group query behavior based on the waybill number, and acquire a time sequence of a time aggregation unit according to time sequence (the time aggregation unit comprises at least one of day, n days, week and month) and user information of each user in the user group, wherein the user information comprises at least one of the following: user sequence, user code, user name, user post;
the data statistics module is configured for counting the query behavior of each user and the corresponding frequency thereof according to the query behavior characteristics in the query data; the query behavior features include at least two of: the waybill inquiry terminal ip, the waybill number inquiry behavior, the client behavior corresponding to the waybill inquiry, the delivery place behavior corresponding to the waybill inquiry, the receiving place behavior corresponding to the waybill inquiry and the industry behavior corresponding to the waybill inquiry;
the first calculation module is configured for calculating PCA scores of query data of a single user in each time aggregation unit to form a first data group; a first computing module comprising: the Z-score processing unit is used for carrying out Z-score processing on the query behavior of each user and the corresponding frequency; and the prediction unit is configured for determining the number of the principal components, predicting the scores of the principal components, and calculating the PCA scores of the query data of the single user in each time aggregation unit according to the variance contribution rates of the principal components and the principal components to form a first data group.
A second calculating module, configured to calculate a DTW distance between every two users according to the first data group, to form a second data group, specifically, the second calculating module includes: a first matrix construction unit configured to construct a first matrix from the first data group and the time series corresponding thereto; and the first data group acquisition unit is configured to calculate the DTW distance between every two users according to the first data group to form a second data group.
The information output module is configured for dividing the user group into a plurality of groups by adopting a clustering analysis method based on the second data group and outputting the user information of which the number of people in the group is less than a preset threshold value; specifically, the information output module includes: the second matrix constructing unit, configured for constructing the first data group and the corresponding user information into a second matrix; the grouping unit, configured for setting the grouping number and grouping the user groups by adopting hierarchical clustering analysis; and the identification output unit, configured for setting a preset threshold L and, based on the cluster analysis of the DTW distances, outputting the user information of which the number of people in the group is less than the preset threshold L.
The waybill inquiry abnormal behavior detection method of the embodiment includes:
S1: acquiring query data of user group query behaviors based on the waybill number;
S2: according to the query behavior characteristics in the query data, counting the query behavior of a single user in the user group and the corresponding frequency of the query behavior; the query behavior features include at least two of: the waybill inquiry terminal ip, the waybill number inquiry behavior, the client behavior corresponding to the waybill inquiry, the delivery place behavior corresponding to the waybill inquiry, the receiving place behavior corresponding to the waybill inquiry and the industry behavior corresponding to the waybill inquiry;
S3: calculating PCA scores of the query data of the single user in each time aggregation unit to form a first data group, which comprises:
S3-1: carrying out Z-score processing on the query behavior of each user and the corresponding frequency of each user;
S3-2: determining the number of the principal components, predicting the score of the principal components, and calculating the PCA score of the query data of a single user in each time aggregation unit according to the principal component scores and the variance contribution rates of the principal components to form a first data group.
S4: calculating the DTW distance of every two users according to the first data group to form a second data group, which comprises:
S4-1: obtaining a time series of a plurality of time aggregation units, the time aggregation unit comprising at least one of: days, n days, weeks, months;
S4-2: constructing a first matrix by the first data group and the corresponding time sequence;
S4-3: calculating the DTW distance corresponding to the PCA scores of every two users based on the first matrix to form a second data group.
S5: dividing the user group into a plurality of groups by adopting a cluster analysis method based on the second data group, and outputting user information of which the number of people in the group is less than a preset threshold value, which comprises:
S5-1: acquiring user information of each user in the user group, wherein the user information comprises at least one of the following: user sequence, user code, user name, user post;
S5-2: constructing a second matrix by the first data group and the corresponding user information;
S5-3: setting the grouping number, grouping the user groups by adopting hierarchical clustering analysis, and outputting the user information of which the number of people in the group is less than a preset threshold value.
It should be understood that each step of the above waybill inquiry abnormal behavior detection method corresponds to a unit described in the waybill inquiry abnormal behavior detection device. Thus, the operations and features described above for the device and the units contained therein are equally applicable to the above method and will not be described again here.
The following example further illustrates a waybill inquiry abnormal behavior detection method:
Step 1: Obtaining data
(1) Acquire, based on the waybill number, the waybill inquiry data of the user group to be detected, together with the user information of each user in the user group (the user information includes at least one of a user sequence, a user code, a user name, and a user post). This embodiment selects the user sequence; for example, if the current user group has n persons, the user sequence is: user 1, user 2, ..., user n.
Determine a time aggregation unit and obtain a time series of a plurality of time aggregation units, e.g. in the form of days, weeks, months or every n days, where n may be 1 to 30; an aggregation period longer than one month is not recommended.
(2) The prepared data needs to contain query data for at least two different query behavior characteristics among the following: the waybill inquiry terminal ip, the waybill number inquiry behavior, the client behavior corresponding to the waybill inquiry, the delivery place behavior corresponding to the waybill inquiry, the receiving place behavior corresponding to the waybill inquiry and the industry behavior corresponding to the waybill inquiry, as follows:
[Table image not reproduced: example query data with the query behavior characteristics listed above]
Step 2: Data processing (Z-score processing)
If a user exhibits a behavior within the time range to be detected, record the number of occurrences (frequency) of the behavior; if the behavior does not occur, record 0. The specific time range of detection depends on the actual situation, but it is recommended that each detection cover more than 7 time aggregation units; for example, there should be at least 7 days of data when the day is the time aggregation unit, at least 7 weeks of data when the week is the time aggregation unit, and so on.
(1) For example, if n is 1, the aggregation unit is 1 day:
[Table image not reproduced: example behavior frequencies aggregated by day]
For example, if n is 7, the aggregation unit is 1 week:
[Table image not reproduced: example behavior frequencies aggregated by week]
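The counting and Z-score processing described in this step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the log layout, the names `QUERY_LOG`, `count_features` and `z_score_columns`, and the choice of only two features (distinct waybills and distinct terminal ips per user per day) are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev

# Hypothetical query log: (user, query date, waybill number, terminal ip).
QUERY_LOG = [
    ("user1", date(2019, 3, 1), "WB001", "10.0.0.1"),
    ("user1", date(2019, 3, 1), "WB002", "10.0.0.1"),
    ("user1", date(2019, 3, 2), "WB003", "10.0.0.2"),
    ("user2", date(2019, 3, 1), "WB001", "10.0.0.9"),
]


def count_features(log, users, days):
    """Per (user, day): [distinct waybills, distinct terminal ips]; 0 when no query occurred."""
    waybills = defaultdict(set)
    ips = defaultdict(set)
    for user, day, waybill, ip in log:
        waybills[(user, day)].add(waybill)
        ips[(user, day)].add(ip)
    return {
        (u, d): [len(waybills[(u, d)]), len(ips[(u, d)])]
        for u in users
        for d in days
    }


def z_score_columns(rows):
    """Standardize each feature column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    stats = [(mean(c), pstdev(c)) for c in columns]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, (m, s) in zip(row, stats)]
        for row in rows
    ]
```

Calling `count_features(QUERY_LOG, ["user1", "user2"], [date(2019, 3, 1), date(2019, 3, 2)])` yields one frequency row per user and day, with zeros for days without queries, and `z_score_columns` puts the features on a common scale before PCA.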
Step 3: Data computation
(1) Constructing a first matrix
Calculate the PCA score of the user in each time aggregation unit, wherein the PCA score is a comprehensive score obtained by taking all principal components:
Each principal component has a corresponding list of predicted scores. The total PCA score of a given time aggregation unit = (predicted score of principal component 1 × variance contribution rate of principal component 1) + (predicted score of principal component 2 × variance contribution rate of principal component 2) + ... + (predicted score of principal component m × variance contribution rate of principal component m). In this embodiment, m is 6 (the 6 principal components correspond to the number of queried waybills, the number of customers of the queried waybills, the number of source computer ips, the number of source delivery areas corresponding to the queried waybills, the number of destination delivery areas corresponding to the queried waybills, and the number of industries corresponding to the queried waybills, respectively). The variance contribution rate of a principal component is calculated in the conventional way (prior art): first, calculate the variance of each principal component; second, calculate the percentage of the total variance accounted for by each of these variances, which is the variance contribution rate.
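The composite score described above can be written compactly as follows. The symbols are introduced here for illustration only and do not appear in the original: S_{u,t} is the total PCA score of user u in time aggregation unit t, \hat{y}_{k,u,t} is the predicted score of principal component k, \lambda_k is the variance of principal component k, and w_k is its variance contribution rate.

```latex
S_{u,t} = \sum_{k=1}^{m} w_k \, \hat{y}_{k,u,t},
\qquad
w_k = \frac{\lambda_k}{\sum_{j=1}^{m} \lambda_j}
```

As a small worked case under these assumptions, with m = 2, predicted scores 1.5 and -0.4, and variance contribution rates 0.7 and 0.3, the total score is 1.5 × 0.7 + (-0.4) × 0.3 = 1.05 - 0.12 = 0.93.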
With N employees, 1 day as the time aggregation unit, and a detection period of 1 month, where the columns are the user information and the rows are the time aggregation units, the PCA composite scores form a 30 (or 31) × N matrix M1, as follows:
[Table image not reproduced: PCA composite score matrix M1, rows = days, columns = users]
(2) Constructing a second matrix
Calculate the DTW distance between every two users from the matrix M1 obtained in the previous step, using the parDist function in the R package parallelDist. Each column of M1 holds the PCA scores of one user's query behavior across the time aggregation units; calculating the DTW distance between the PCA score columns of every pair of users gives an N × N distance matrix M2, as follows:
          User 1   User 2   User 10   User n
User 1    1        50       0         30
User 2    20       1        73        40
User 10   40       73       1         20
User n    30       0        50        1
In this embodiment, DTW is adopted to dynamically adjust the similarity between different time series and to determine the best corresponding points between them. The distance between time series calculated by DTW has advantages over the traditional Euclidean distance, and because the clustering result depends to a large extent on the distance calculation, the distance calculated by the DTW method gives a better result in the cluster analysis.
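The pairwise DTW calculation can be sketched in a few lines. The embodiment uses R's parDist; the pure-Python version below is only an illustration of the classic dynamic-programming recurrence, and the function names, the absolute-difference local cost, and the sample score series are assumptions.

```python
from itertools import combinations


def dtw_distance(a, b):
    """Classic dynamic-programming DTW with |x - y| as the local cost (an assumed choice)."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def pairwise_dtw(scores_by_user):
    """Symmetric user-by-user DTW distance table from each user's PCA score series."""
    users = list(scores_by_user)
    dist = {u: {v: 0.0 for v in users} for u in users}
    for u, v in combinations(users, 2):
        d = dtw_distance(scores_by_user[u], scores_by_user[v])
        dist[u][v] = dist[v][u] = d
    return dist


# Hypothetical PCA score series over five time aggregation units.
SCORES = {
    "user1": [0, 1, 2, 1, 0],
    "user2": [0, 0, 1, 2, 1],  # same shape as user1, shifted by one unit
    "user3": [5, 5, 5, 5, 5],
}
```

For the sample data, the point-by-point absolute difference between user1 and user2 sums to 4, while their DTW distance is 1, because the warping path aligns the shifted peak. This is the sense in which DTW finds the best corresponding points of two series.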
Step 4: Cluster analysis
Hierarchical cluster analysis is used to output abnormal employees. The ward.D2 or median method of hierarchical clustering may be adopted, and the number of groups can be adjusted according to business needs. A preset threshold L is set, and the user information of the groups whose number of members is smaller than L is output (for example, with n = 2000, L is set to 8; the value of L can be increased or decreased appropriately as the number of employees changes).
Further, if business needs require all users to be clustered together, cluster analysis can be performed on all users by the above method to output abnormal employees.
When the user group can be divided into several user subgroups by some factor (such as post, department or team), for example, a user group of 100 people can be divided by post into 30 development posts, 8 financial posts, 20 human resource posts, 15 safety management posts, product promotion posts and 27 product posts, giving six user subgroups. If clustering is required per subgroup, cluster analysis is performed on each user subgroup according to the above method, and abnormal employees are output.
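The grouping and small-group output can be illustrated with a short sketch. Note the substitution: the embodiment names the ward.D2 or median method, whereas the code below uses plain average linkage to stay dependency-free, so it shows the overall flow rather than the exact linkage. The function names and the sample distances are hypothetical.

```python
def agglomerative_groups(dist, k):
    """Merge the two closest clusters (average linkage, an assumed stand-in) until k remain."""
    clusters = [[u] for u in dist]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[u][v] for u in c1 for v in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def small_group_users(clusters, threshold):
    """Users belonging to groups with fewer than `threshold` members (the preset threshold L)."""
    return sorted(u for c in clusters if len(c) < threshold for u in c)


# Hypothetical distance table: u1..u4 behave alike, u5 is far from everyone.
POSITIONS = {"u1": 0, "u2": 1, "u3": 2, "u4": 3, "u5": 50}
DIST = {u: {v: abs(POSITIONS[u] - POSITIONS[v]) for v in POSITIONS} for u in POSITIONS}

groups = agglomerative_groups(DIST, k=2)
flagged = small_group_users(groups, threshold=2)
```

With these sample distances, u1 to u4 end up in one group and u5 alone in the other, so only u5 is flagged. In practice the distance table would be the DTW matrix M2, and the grouping number and L would be tuned to the business context as described above.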
This embodiment provides an apparatus, the apparatus comprising:
one or more processors;
a memory for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the methods as described above.
The present embodiments provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method as described above.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by a person skilled in the art that the scope of the invention as referred to in the present application is not limited to the embodiments with a specific combination of the above-mentioned features, but also covers other embodiments with any combination of the above-mentioned features or their equivalents without departing from the inventive concept. For example, embodiments in which the above features are replaced by (but not limited to) features with similar functions disclosed in this application are also covered.

Claims (10)

1. A waybill inquiry abnormal behavior detection method is characterized by comprising the following steps:
acquiring query data of user group query behaviors based on the waybill number;
according to the characteristics of all query behaviors in the query data, counting the query behaviors of single users in the user group and the corresponding frequency of the query behaviors;
calculating PCA scores of the query data of a single user in each time aggregation unit to form a first data group;
calculating the DTW distance of every two users according to the first data group to form a second data group;
and dividing the user group into a plurality of groups by adopting a cluster analysis method based on the second data group, and outputting the user information of which the number of people in the group is less than a preset threshold value.
2. The waybill query abnormal behavior detection method of claim 1, wherein calculating PCA scores of query data of individual users in each time aggregation unit to form a first data group comprises:
performing Z-score processing on the query behavior of each user and the corresponding frequency of each user;
and predicting the principal component score, and calculating the PCA score of the query data of a single user in each time aggregation unit according to the variance contribution rate of the principal component and the principal component to form a first data group.
3. The waybill inquiry abnormal behavior detection method according to claim 1, wherein calculating the DTW distance of every two users according to the first data group to form a second data group comprises:
acquiring a time sequence of a plurality of time aggregation units;
constructing a first matrix by the first data group and the corresponding time sequence;
and calculating the DTW distance of every two users based on the first matrix to form a second data group.
4. The waybill inquiry abnormal behavior detection method according to claim 1, wherein a user group is divided into a plurality of groups by a cluster analysis method based on the second data group, and user information with the number of people in the group being less than a preset threshold is output, including:
acquiring a time sequence of a time aggregation unit according to time sequence;
constructing a second matrix by the first data group and the corresponding user sequence;
and setting the grouping number, grouping the user groups by adopting hierarchical clustering analysis, and outputting a user sequence with the number of people in the group being less than a preset threshold value.
5. The waybill inquiry abnormal behavior detection method according to any one of claims 1 to 4, wherein
the query behavior features include at least two of: the IP of the waybill inquiry terminal, the waybill number inquiry behavior, the client behavior corresponding to the waybill inquiry, the delivery place behavior corresponding to the waybill inquiry, the receiving place behavior corresponding to the waybill inquiry and the industry behavior corresponding to the waybill inquiry;
and/or
the time aggregation unit includes at least one of: day, n days, week, month;
and/or
the user information includes at least one of: user sequence, user code, user name, user job position.
6. A waybill inquiry abnormal behavior detection device, comprising:
a data acquisition module configured to acquire query data on the waybill-number-based query behavior of a user group;
a data statistics module configured to count each user's query behaviors and their corresponding frequencies according to the query behavior features in the query data;
a first calculation module configured to calculate the PCA score of each single user's query data in each time aggregation unit to form a first data group;
a second calculation module configured to calculate the DTW distance between every two users according to the first data group to form a second data group;
and an information output module configured to divide the user group into a plurality of groups by a cluster analysis method based on the second data group, and to output the user information of groups in which the number of people is less than a preset threshold.
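As an illustration of what the data statistics module might compute, the sketch below counts how often each query behavior occurs per user and time aggregation unit. The record layout `(user, time_unit, behavior)`, the behavior labels, and the function name `count_query_behaviors` are hypothetical; the claims do not specify a data format.

```python
from collections import Counter, defaultdict

def count_query_behaviors(records):
    """Count behavior frequencies per (user, time aggregation unit).

    records: iterable of (user, time_unit, behavior) tuples.
    Returns a dict mapping (user, time_unit) to a Counter of behaviors.
    """
    table = defaultdict(Counter)
    for user, time_unit, behavior in records:
        table[(user, time_unit)][behavior] += 1
    return dict(table)

# Hypothetical query log entries.
log = [
    ("u1", "2019-03-18", "waybill_number"),
    ("u1", "2019-03-18", "waybill_number"),
    ("u1", "2019-03-18", "terminal_ip"),
    ("u2", "2019-03-18", "waybill_number"),
]
frequencies = count_query_behaviors(log)
```

Each `Counter` can then be laid out as one row of a frequency matrix, which is the kind of input the first calculation module would standardize and score.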
7. The waybill inquiry abnormal behavior detection device according to claim 6, wherein the first calculation module comprises:
a Z-score processing unit configured to perform Z-score processing on each user's query behaviors and their corresponding frequencies;
and a prediction unit configured to determine the number of principal components, predict the principal component scores, and calculate, according to the principal components and their variance contribution rates, the PCA score of each single user's query data in each time aggregation unit to form the first data group.
8. The waybill inquiry abnormal behavior detection device according to claim 6, wherein the data acquisition module is further configured to acquire the time sequence of the time aggregation units in chronological order;
and the second calculation module comprises:
a first matrix construction unit configured to construct a first matrix from the first data group and the corresponding time sequence;
and a first data group acquisition unit configured to calculate the DTW distance between every two users according to the first data group to form the second data group.
9. The waybill inquiry abnormal behavior detection device according to claim 6, wherein the data acquisition module is further configured to acquire user information of each user in the user group;
and the information output module comprises:
a second matrix construction unit configured to construct a second matrix from the first data group and the corresponding user information;
a grouping unit configured to set the number of groups and to group the user group by hierarchical clustering analysis;
and an identification output unit configured to output the user information of groups in which the number of people is less than the preset threshold.
10. The waybill inquiry abnormal behavior detection device according to any one of claims 6 to 9, wherein
the query behavior features include at least two of: the IP of the waybill inquiry terminal, the waybill number inquiry behavior, the client behavior corresponding to the waybill inquiry, the delivery place behavior corresponding to the waybill inquiry, the receiving place behavior corresponding to the waybill inquiry and the industry behavior corresponding to the waybill inquiry;
and/or
the time aggregation unit includes at least one of: day, n days, week, month;
and/or
the user information includes at least one of: user sequence, user code, user name, user job position.
CN201910203972.3A 2019-03-18 2019-03-18 Waybill inquiry abnormal behavior detection method and device Pending CN111723118A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910203972.3A CN111723118A (en) 2019-03-18 2019-03-18 Waybill inquiry abnormal behavior detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910203972.3A CN111723118A (en) 2019-03-18 2019-03-18 Waybill inquiry abnormal behavior detection method and device

Publications (1)

Publication Number Publication Date
CN111723118A true CN111723118A (en) 2020-09-29

Family

ID=72563105

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910203972.3A Pending CN111723118A (en) 2019-03-18 2019-03-18 Waybill inquiry abnormal behavior detection method and device

Country Status (1)

Country Link
CN (1) CN111723118A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105653427A (en) * 2016-03-04 2016-06-08 上海交通大学 Log monitoring method based on abnormal behavior detection
CN107358075A (en) * 2017-07-07 2017-11-17 四川大学 Fictitious user detection method based on hierarchical clustering
CN108717510A (en) * 2018-05-11 2018-10-30 深圳市联软科技股份有限公司 Method, system and terminal for analyzing abnormal file operation behavior by clustering
CN108805747A (en) * 2018-06-13 2018-11-13 山东科技大学 Abnormal electricity consumption user detection method based on semi-supervised learning
CN109241994A (en) * 2018-07-31 2019-01-18 顺丰科技有限公司 User anomaly detection method, device, equipment and storage medium
US20190028504A1 (en) * 2017-07-18 2019-01-24 Imperva, Inc. Insider threat detection utilizing user group data object access analysis

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
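周大镯 (Zhou Dazhuo): "多变量时间序列的聚类、相似查询与异常检测" [Clustering, similarity query and anomaly detection for multivariate time series], 《中国博士学位论文全文数据库 信息科技辑》 [China Doctoral Dissertations Full-text Database, Information Science and Technology] *
王忠群等 (Wang Zhongqun et al.): "基于模板用户信息搜索行为和统计分析的共谋销量欺诈识别" [Identification of collusive sales fraud based on template users' information search behavior and statistical analysis], 《现代图书情报技术》 [New Technology of Library and Information Service] *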

Similar Documents

Publication Publication Date Title
US10216829B2 (en) Large-scale, high-dimensional similarity clustering in linear time with error-free retrieval
US9753964B1 (en) Similarity clustering in linear time with error-free retrieval using signature overlap with signature size matching
CN111178949B (en) Service resource matching reference data determining method, device, equipment and storage medium
KR102105319B1 (en) Esg based enterprise assessment device and operating method thereof
Lee et al. An approximate duplicate elimination in RFID data streams
CN115391669B (en) Intelligent recommendation method and device and electronic equipment
JP2014006757A (en) Content distribution device
CN112380454A (en) Training course recommendation method, device, equipment and medium
Shroff et al. Enterprise information fusion for real-time business intelligence
US11830286B2 (en) Data processing apparatus, data processing method, and non-transitory storage medium
CN115204881A (en) Data processing method, device, equipment and storage medium
CN114495137B (en) Bill abnormity detection model generation method and bill abnormity detection method
CN111708813A (en) User daily behavior abnormity detection method and device
CN111723118A (en) Waybill inquiry abnormal behavior detection method and device
JP6031165B1 (en) Promising customer prediction apparatus, promising customer prediction method, and promising customer prediction program
KR20180052243A (en) Method and device for detecting frauds by using click log data
Harding et al. Weighting methods for the 2010 data collection cycle of the Medical Monitoring Project
CN111723825A (en) Method and device for detecting abnormal behavior of customer information query
CN114547335A (en) Service data processing method, device, equipment and storage medium
CN113987206A (en) Abnormal user identification method, device, equipment and storage medium
CN112884593A (en) Medical insurance fraud and insurance behavior detection method and early warning device based on graph cluster analysis
CN113643783A (en) Sub-health population drug recommendation method, system, equipment and storage medium
CN111382343B (en) Label system generation method and device
CN111967966B (en) Automatic wake-up method and system for sleep clients of mobile phone banks
Dixon et al. Occupational models from 42 million unstructured job postings

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination