CN113609395A - Method and system for judging motor vehicle information query abnormity by fusing multiple characteristics

Info

Publication number
CN113609395A
Authority
CN
China
Legal status
Granted
Application number
CN202110913177.0A
Other languages
Chinese (zh)
Other versions
CN113609395B (en
Inventor
裘晨璐
季君
蔡晨
李立成
周建宁
Current Assignee
Traffic Management Research Institute of Ministry of Public Security
Original Assignee
Traffic Management Research Institute of Ministry of Public Security
Application filed by Traffic Management Research Institute of Ministry of Public Security
Priority to CN202110913177.0A
Publication of CN113609395A
Application granted
Publication of CN113609395B
Current legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2132 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on discrimination criteria, e.g. discriminant analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of information security management for public security traffic management, and specifically discloses a method for judging motor vehicle information query abnormality by fusing multiple features, comprising: acquiring the feature variables to be analyzed; constructing a user number distribution histogram for each variable value of each feature variable to be analyzed; calculating, from the user number distribution histograms, the query behavior abnormal scores of all users for each feature variable to be analyzed; generating an abnormal user list based on the query behavior abnormal scores; and fusing the abnormal user lists generated from the different feature variables to be analyzed to generate a new abnormal user list. The invention also discloses a corresponding system for judging motor vehicle information query abnormality by fusing multiple features. The method helps traffic management departments strengthen data security supervision of their information systems and prevent leaks of citizens' personal private information.

Description

Method and system for judging motor vehicle information query abnormity by fusing multiple characteristics
Technical Field
The invention relates to the technical field of information security management for public security traffic management, and in particular to a method and a system for judging motor vehicle information query abnormality by fusing multiple features.
Background
The national public security traffic management comprehensive application platform (hereinafter, the comprehensive application platform) stores motor vehicle registration information for the whole country, including vehicle brand, place of registration, owner name, identity card number, home address, and contact details. After logging in to the system, a user can query motor vehicle registration information through the comprehensive query function.
Since the Cybersecurity Law of the People's Republic of China took effect on June 1, 2017, cases have occurred from time to time in which users of the information systems of public security traffic management departments in various regions illegally queried motor vehicle registration information and leaked citizens' personal private information.
The existing supervision method mainly counts each user's number of queries per unit time, sets a threshold, and raises an early warning when the threshold is exceeded. Early warning on a single business indicator, with a threshold set from business experience, has a high false-alarm rate and is easy to evade deliberately.
How to model the feature distribution of user query behavior with mathematical statistics, estimate the likelihood that a user's queries are abnormal, and issue early warnings for abnormal users who may have queried illegally is a technical problem that urgently needs solving in the security supervision of public security traffic management information systems.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a method and a system for judging motor vehicle information query abnormality by fusing multiple features, which help traffic management departments strengthen data security supervision of their information systems and prevent leaks of citizens' personal private information.
As a first aspect of the present invention, there is provided a method for judging motor vehicle information query abnormality by fusing multiple features, comprising:
acquiring a user query behavior feature set and a user set S_in to be analyzed, wherein the user query behavior feature set comprises a plurality of feature variables to be analyzed, and each feature variable to be analyzed comprises a plurality of variable values;
calculating, for each valid user in the user set S_in to be analyzed, the number of times the user queried the motor vehicle registration information corresponding to each variable value of the feature variable to be analyzed, and the proportion of that number to the user's total number of queries;
constructing, for each feature variable to be analyzed, a group of user number distribution histograms, wherein each variable value of the feature variable corresponds to one user number distribution histogram, the horizontal axis of the histogram represents the proportion of a user's queries of the motor vehicle registration information corresponding to that variable value to the user's total number of queries, and the vertical axis represents the proportion of users whose query proportion falls in the corresponding interval to the total number of users to be analyzed;
calculating, from the group of user number distribution histograms corresponding to each feature variable to be analyzed, the query behavior abnormal scores of all valid users in the user set S_in for that feature variable;
adaptively setting an early warning threshold based on the query behavior abnormal scores of all valid users in the user set S_in for each feature variable to be analyzed, judging users whose query behavior abnormal score is higher than the early warning threshold to be abnormal users, sorting all abnormal users from high to low by query behavior abnormal score to generate an abnormal user suspicion ranking, and generating an abnormal user list from the abnormal users and their suspicion ranking;
and fusing the abnormal user lists generated from the different feature variables to be analyzed to generate a new abnormal user list.
Further, fusing the abnormal user lists generated from the different feature variables to be analyzed to generate a new abnormal user list further comprises:
the different feature variables to be analyzed generate different abnormal user lists; the abnormal users in these lists are de-duplicated and then merged to generate a new abnormal user list;
and the comprehensive suspicion ranking of each abnormal user in the new abnormal user list is recalculated with a feature bagging algorithm from the suspicion rankings of the abnormal users in the different abnormal user lists.
Further, when the abnormal user lists generated from two feature variables to be analyzed are fused to generate a new abnormal user list, the method further comprises:
setting the abnormal user list generated from the first feature variable to be analyzed as a first abnormal user list, setting the abnormal user list generated from the second feature variable to be analyzed as a second abnormal user list, and initializing a third abnormal user list as an empty set;
selecting the abnormal user ranked first by suspicion in the first abnormal user list, and judging whether that user is already in the third abnormal user list; if not, appending the user to the tail of the third abnormal user list; then removing the user from the first abnormal user list and updating the suspicion ranking of the remaining abnormal users in the first abnormal user list;
selecting the abnormal user ranked first by suspicion in the second abnormal user list, and judging whether that user is already in the third abnormal user list; if not, appending the user to the tail of the third abnormal user list; then removing the user from the second abnormal user list and updating the suspicion ranking of the remaining abnormal users in the second abnormal user list;
iterating until all abnormal users have been removed from the first abnormal user list and the second abnormal user list;
the order of the abnormal users in the resulting third abnormal user list is the comprehensive suspicion ranking of the abnormal users after fusion.
Further, when the abnormal user lists generated from two feature variables to be analyzed are fused to generate a new abnormal user list, the method further comprises:
setting the abnormal user list generated from the first feature variable to be analyzed as a first abnormal user list, and setting the abnormal user list generated from the second feature variable to be analyzed as a second abnormal user list;
calculating, for each abnormal user in the first and the second abnormal user list respectively, the cumulative percentile of the user's position when suspicion is sorted from high to low, recorded as r_1 and r_2 respectively;
de-duplicating the abnormal users in the first abnormal user list and the second abnormal user list and merging them to generate a third abnormal user list;
calculating, for each abnormal user in the third abnormal user list, a comprehensive suspicion coefficient of illegal querying as the maximum max(r_1, r_2), the minimum min(r_1, r_2), or the weighted average w_1 × r_1 + w_2 × r_2, where 0 ≤ w_1 ≤ 1, 0 ≤ w_2 ≤ 1, and w_1 + w_2 = 1;
and sorting the abnormal users by comprehensive suspicion coefficient from small to large to generate the comprehensive suspicion ranking of each abnormal user in the third abnormal user list.
Further, adaptively setting the early warning threshold based on the query behavior abnormal scores of all users for each feature variable to be analyzed further comprises:
setting the threshold by one of the following two methods:
Method 1: (1) sort the query behavior abnormal scores of all users from high to low and calculate the median and the interquartile range; (2) set the early warning threshold to the median plus 1.5 times the interquartile range;
Method 2: (1) calculate the sum of squares of the query behavior abnormal scores of all users, recorded as the total sum of squares; (2) sort the query behavior abnormal scores of all users from high to low and, for each user in turn, calculate the sum of squares of the scores of that user and of all users with higher scores, recorded as that user's cumulative sum of squares; (3) find the first user whose cumulative sum of squares reaches 80% or more of the total sum of squares, and set that user's query behavior abnormal score as the early warning threshold.
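The two threshold rules and the subsequent ranking step can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, and the quartile convention (the "inclusive" method of `statistics.quantiles`) is an assumption, since the text does not say how the quartiles are computed.

```python
from statistics import median, quantiles


def iqr_threshold(scores):
    """Method 1: median plus 1.5 times the interquartile range.

    Assumption: quartiles use the 'inclusive' convention; the source
    does not specify one.
    """
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return median(scores) + 1.5 * (q3 - q1)


def energy_threshold(scores, share=0.8):
    """Method 2: walk the scores from high to low and return the score of
    the first user at which the running sum of squares reaches `share`
    (80% by default) of the total sum of squares."""
    total = sum(s * s for s in scores)
    running = 0.0
    for s in sorted(scores, reverse=True):
        running += s * s
        if running >= share * total:
            return s
    raise ValueError("scores must not be empty")


def abnormal_user_list(scores_by_user, threshold):
    """Users scoring above the threshold, most suspicious first."""
    flagged = [u for u, s in scores_by_user.items() if s > threshold]
    return sorted(flagged, key=scores_by_user.get, reverse=True)


# Made-up scores for six hypothetical users.
scores_by_user = {"u1": 1, "u2": 1, "u3": 2, "u4": 2, "u5": 3, "u6": 10}
threshold = iqr_threshold(list(scores_by_user.values()))
print(threshold, abnormal_user_list(scores_by_user, threshold))
```

Method 1 reacts to the spread of the bulk of the users, while Method 2 concentrates on the users who contribute most of the squared score mass, so the two rules can flag different numbers of users on the same data.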
As a second aspect of the present invention, there is provided a system for judging motor vehicle information query abnormality by fusing multiple features, comprising:
a system parameter initialization setting module, configured to acquire a user query behavior feature set and a user set S_in to be analyzed, wherein the user query behavior feature set comprises a plurality of feature variables to be analyzed, and each feature variable to be analyzed comprises a plurality of variable values;
a user sample statistical validity judging module, configured to judge whether the user samples in the user set S_in to be analyzed are statistically valid, wherein the statistically valid user samples in S_in are used for the calculations in the query behavior abnormal user early warning module;
a query behavior abnormal user early warning module, configured to: calculate, for each valid user in the user set S_in to be analyzed, the number and the proportion of the user's queries of the motor vehicle registration information corresponding to each variable value of the feature variable to be analyzed; construct a group of user number distribution histograms for each feature variable to be analyzed, wherein each variable value of the feature variable corresponds to one user number distribution histogram; calculate, from the group of user number distribution histograms corresponding to each feature variable to be analyzed, the query behavior abnormal scores of all valid users in S_in for that feature variable; adaptively set an early warning threshold based on the query behavior abnormal scores of all users for each feature variable to be analyzed, judge users whose query behavior abnormal score is higher than the early warning threshold to be abnormal users, sort all abnormal users from high to low by query behavior abnormal score to generate an abnormal user suspicion ranking, and generate an abnormal user list from the abnormal users and their suspicion ranking;
and an early warning information fusion integration module, configured to fuse the abnormal user lists generated from the different feature variables to be analyzed to generate a new abnormal user list.
Further, the system parameter initialization setting module is specifically configured to:
select the department to which the users to be analyzed belong, and determine the user set S_in to be analyzed according to that department;
select the time range of the users' motor vehicle registration information query operations;
and select the user query behavior feature set to be analyzed.
Further, the user sample statistical validity judging module is specifically configured to:
calculate, for the selected query operation time range, the total number of times each user to be analyzed queried motor vehicle registration information;
adaptively calculate a user query operation count threshold, and remove from the user set S_in to be analyzed the users whose total number of motor vehicle registration information queries is smaller than that threshold;
adaptively calculate a user number threshold and, if the number of users in the user set S_in to be analyzed is smaller than the user number threshold, terminate the program and prompt for the parameters to be reset.
Further, the query behavior abnormal user early warning module is specifically configured to:
set the early warning threshold by one of the following two methods:
Method 1: (1) sort the query behavior abnormal scores of all users from high to low and calculate the median and the interquartile range; (2) set the early warning threshold to the median plus 1.5 times the interquartile range;
Method 2: (1) calculate the sum of squares of the query behavior abnormal scores of all users, recorded as the total sum of squares; (2) sort the query behavior abnormal scores of all users from high to low and, for each user in turn, calculate the sum of squares of the scores of that user and of all users with higher scores, recorded as that user's cumulative sum of squares; (3) find the first user whose cumulative sum of squares reaches 80% or more of the total sum of squares, and set that user's query behavior abnormal score as the early warning threshold.
Further, the early warning information fusion integration module is specifically configured to:
de-duplicate and then merge the abnormal users in the different abnormal user lists generated from the different feature variables to be analyzed, generating a new abnormal user list;
and recalculate, with a feature bagging algorithm, the comprehensive suspicion ranking of each abnormal user in the new abnormal user list from the suspicion rankings of the abnormal users in the different abnormal user lists.
The method and system for judging motor vehicle information query abnormality by fusing multiple features have the following advantages: (1) the user query operation count threshold and the user number threshold are calculated adaptively to judge whether the analysis result is statistically meaningful; (2) the behavior feature set for users querying motor vehicle registration information is constructed from two business perspectives, query time and the place of registration of the queried vehicle, and abnormal users are judged from multi-feature distribution calculations, which improves the generalization ability of the model; (3) the output abnormal user list is computed by fusing different feature distributions, which improves the accuracy and robustness of the analysis result and keeps the abnormality judgment method extensible.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention.
FIG. 1 is a flow chart of the method for judging motor vehicle information query abnormality by fusing multiple features provided by the invention.
FIG. 2 is a structural block diagram of the system for judging motor vehicle information query abnormality by fusing multiple features provided by the invention.
FIG. 3 is a flow chart of the method for calculating the comprehensive suspicion ranking of each abnormal user in the new abnormal user list provided by the invention.
Detailed Description
To further explain the technical means adopted by the invention to achieve its intended purpose, and their effects, the method and system for judging motor vehicle information query abnormality by fusing multiple features are described in detail below, including their specific implementation, structure, features, and effects, with reference to the accompanying drawings and preferred embodiments. The embodiments described are only some, not all, of the embodiments of the invention. All other embodiments that a person skilled in the art can derive from them without inventive effort fall within the scope of protection of the invention.
In this embodiment, a method for judging motor vehicle information query abnormality by fusing multiple features is provided which, as shown in FIG. 1, comprises:
acquiring a user query behavior feature set and a user set S_in to be analyzed, wherein the user query behavior feature set comprises a plurality of feature variables to be analyzed, and each feature variable to be analyzed comprises a plurality of variable values;
calculating, for each valid user in the user set S_in to be analyzed, the number and the proportion of the user's queries of the motor vehicle registration information corresponding to each variable value of the feature variable to be analyzed;
it should be noted that the proportion is the ratio of the number of times the user queried the motor vehicle registration information corresponding to a given variable value of the feature variable to the user's total number of queries.
constructing, for each feature variable to be analyzed, a group of user number distribution histograms, wherein each variable value of the feature variable to be analyzed corresponds to one histogram;
it should be noted that the user number distribution histogram consists of a series of vertical bars of different heights. The horizontal axis represents the proportion of a user's queries of the motor vehicle registration information corresponding to a given variable value to the user's total number of queries (the axis is divided into several intervals of fixed or variable width, and the number of intervals is chosen as needed); the vertical axis represents the proportion of users whose query proportion falls in the corresponding interval to the total number of users to be analyzed.
calculating, with the histogram-based outlier score (HBOS) algorithm, from the group of user number distribution histograms corresponding to each feature variable to be analyzed, the query behavior abnormal scores of all valid users in the set S_in for that feature variable;
it should be noted that a query behavior abnormal score is calculated for each individual user; the higher the score, the more the user's query behavior differs from that of the group.
adaptively setting an early warning threshold based on the query behavior abnormal scores of all valid users in the set S_in for each feature variable to be analyzed, judging users whose query behavior abnormal score is higher than the early warning threshold to be abnormal users, sorting all abnormal users from high to low by query behavior abnormal score to generate an abnormal user suspicion ranking, and generating an abnormal user list from the abnormal users and their suspicion ranking;
and fusing the abnormal user lists generated from the different feature variables to be analyzed to generate a new abnormal user list.
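The histogram-based scoring step in the method above can be sketched as follows. The text names the HBOS algorithm but does not give its formula, so this sketch assumes the commonly described form: each histogram is normalized by its tallest bar, and a user's score is the sum, over the variable values of one feature variable, of the logarithm of the inverse normalized bar height. The function names, the ten equal-width intervals, and the sample data are hypothetical.

```python
import math


def bin_index(proportion, n_bins=10):
    """Index of the left-closed, right-open interval holding `proportion`.

    Assumption: a proportion of exactly 1.0 is placed in the last interval.
    """
    return min(int(proportion * n_bins), n_bins - 1)


def hbos_scores(proportions, n_bins=10):
    """proportions: {user: {variable_value: share of the user's queries}}.

    Returns {user: score} for one feature variable; a higher score means
    the user sits in sparsely populated histogram bars.
    """
    users = list(proportions)
    values = sorted({v for shares in proportions.values() for v in shares})
    scores = dict.fromkeys(users, 0.0)
    for value in values:
        # Bar height = fraction of all users whose share falls in the bar.
        heights = [0.0] * n_bins
        for user in users:
            share = proportions[user].get(value, 0.0)
            heights[bin_index(share, n_bins)] += 1 / len(users)
        peak = max(heights)
        for user in users:
            share = proportions[user].get(value, 0.0)
            height = heights[bin_index(share, n_bins)]
            # log(1 / normalized height); zero for users in the tallest bar.
            scores[user] += math.log(peak / height)
    return scores


# Made-up data: three users query mostly in-province, one mostly out-of-province.
sample = {
    "a": {"in-province": 0.95, "out-of-province": 0.05},
    "b": {"in-province": 0.92, "out-of-province": 0.08},
    "c": {"in-province": 0.97, "out-of-province": 0.03},
    "d": {"in-province": 0.15, "out-of-province": 0.85},
}
print(hbos_scores(sample))
```

A user's own bar always contains at least that user, so the logarithm is never taken of an empty bar.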
It should be noted that one feature variable corresponds to one group of histograms; the number of histograms in a group differs between feature variables and equals the number of variable values of the corresponding feature variable.
Take the feature variable F_1 = {in-province, out-of-province} as an example; this feature variable corresponds to two histograms.
The first histogram corresponds to the variable value 'in-province' and is built as follows: (1) for each user, count the proportion of the user's queries of in-province motor vehicle registration information to the user's total number of queries; (2) draw the histogram by first deciding how to divide the in-province query proportion into intervals. Assuming fixed-width intervals [0%-10%), [10%-20%), ..., [90%-100%) (each interval closed on the left and open on the right), the first bar spans 0%-10% on the horizontal axis and its height on the vertical axis is the proportion of users whose in-province query proportion lies in 0%-10% to all users to be analyzed; the second bar spans 10%-20% and its height is the proportion of users whose in-province query proportion lies in 10%-20% to all users to be analyzed; and so on.
The second histogram corresponds to the variable value 'out-of-province' and is built as follows: (1) for each user, count the proportion of the user's queries of out-of-province motor vehicle registration information to the user's total number of queries; (2) draw the histogram by first deciding how to divide the out-of-province query proportion into intervals. Assuming fixed-width intervals [0%-10%), [10%-20%), ..., [90%-100%), the first bar spans 0%-10% on the horizontal axis and its height on the vertical axis is the proportion of users whose out-of-province query proportion lies in 0%-10% to all users to be analyzed; the second bar spans 10%-20% and its height is the proportion of users whose out-of-province query proportion lies in 10%-20% to all users to be analyzed; and so on.
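The two histograms of the F_1 example can be reproduced from a raw query log with a short Python sketch. The log format (one `(user, variable value)` pair per query), the function names, and the sample data are assumptions made for illustration; placing a proportion of exactly 100% in the last bar is also an assumption, since the stated intervals leave that endpoint open.

```python
from collections import Counter

VALUES = ("in-province", "out-of-province")


def query_shares(query_log, values=VALUES):
    """Per user, the share of queries that fall on each variable value."""
    counts = {}
    for user, value in query_log:
        counts.setdefault(user, Counter())[value] += 1
    return {
        user: {v: c[v] / sum(c.values()) for v in values}
        for user, c in counts.items()
    }


def user_histogram(shares, value, n_bins=10):
    """Bar heights for one variable value: the fraction of all users whose
    share falls in each fixed-width, left-closed, right-open interval."""
    heights = [0.0] * n_bins
    for per_user in shares.values():
        idx = min(int(per_user[value] * n_bins), n_bins - 1)
        heights[idx] += 1 / len(shares)
    return heights


# Made-up log: u1 queries mostly in-province, u2 mostly out-of-province.
log = (
    [("u1", "in-province")] * 19 + [("u1", "out-of-province")] * 1
    + [("u2", "in-province")] * 3 + [("u2", "out-of-province")] * 17
)
shares = query_shares(log)
for value in VALUES:
    print(value, user_histogram(shares, value))
```

Because every user lands in exactly one bar per histogram, the bar heights of each histogram sum to 1.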
Preferably, fusing the abnormal user lists generated from the different feature variables to be analyzed to generate a new abnormal user list further comprises:
the different feature variables to be analyzed generate different abnormal user lists; the abnormal users in these lists are de-duplicated and then merged to generate a new abnormal user list;
and the comprehensive suspicion ranking of each abnormal user in the new abnormal user list is recalculated with the Feature Bagging algorithm from the suspicion rankings of the abnormal users in the different abnormal user lists.
Preferably, when the abnormal user lists generated from two feature variables to be analyzed are fused to generate a new abnormal user list, the method further comprises:
setting the abnormal user list generated from the first feature variable to be analyzed as a first abnormal user list, setting the abnormal user list generated from the second feature variable to be analyzed as a second abnormal user list, and initializing a third abnormal user list as an empty set;
selecting the abnormal user ranked first by suspicion in the first abnormal user list, and judging whether that user is already in the third abnormal user list; if not, appending the user to the tail of the third abnormal user list; then removing the user from the first abnormal user list and updating the suspicion ranking of the remaining abnormal users in the first abnormal user list;
selecting the abnormal user ranked first by suspicion in the second abnormal user list, and judging whether that user is already in the third abnormal user list; if not, appending the user to the tail of the third abnormal user list; then removing the user from the second abnormal user list and updating the suspicion ranking of the remaining abnormal users in the second abnormal user list;
iterating until all abnormal users have been removed from the first abnormal user list and the second abnormal user list;
the order of the abnormal users in the resulting third abnormal user list is the comprehensive suspicion ranking of the abnormal users after fusion.
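The alternating merge described above can be sketched in a few lines of Python. The function name is hypothetical, and the sketch assumes that a user already present in the third list is still removed from the source list; otherwise the loop could not empty both lists as the text requires.

```python
def fuse_breadth_first(first, second):
    """Merge two suspicion-ranked lists by alternately taking the current
    top user of each list and appending it to the result if it is new."""
    first, second = list(first), list(second)  # leave the inputs untouched
    fused = []                                 # the third abnormal user list
    while first or second:
        for ranked in (first, second):
            if ranked:
                user = ranked.pop(0)           # remove the current top user
                if user not in fused:
                    fused.append(user)
    return fused


# Made-up ranked lists, most suspicious user first.
print(fuse_breadth_first(["a", "b", "c"], ["b", "d"]))
```

Popping the head of a list implicitly updates the ranking of the remaining users, since each of them moves up one position.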
Preferably, the method for generating the new abnormal user list by setting and fusing the two feature variables to be analyzed further includes:
setting an abnormal user list generated by a first characteristic variable to be analyzed as a first abnormal user list, and setting an abnormal user list generated by a second characteristic variable to be analyzed as a second abnormal user list;
respectively calculating, for each abnormal user in the first abnormal user list and the second abnormal user list, the cumulative percentile of the user's suspicion degree rank sorted from large to small, and recording the two values as $r_1$ and $r_2$;
removing duplicate abnormal users from the first abnormal user list and the second abnormal user list, and then merging the two lists to generate a third abnormal user list;
for each abnormal user in the third abnormal user list, calculating a comprehensive suspicion coefficient of illegal inquiry by the maximum value $\max(r_1, r_2)$, the minimum value $\min(r_1, r_2)$, or the weighted average $w_1 r_1 + w_2 r_2$, wherein $0 \le w_1 \le 1$, $0 \le w_2 \le 1$, and $w_1 + w_2 = 1$;
It should be noted that, for users in the first abnormal user list but not in the second abnormal user list, let $r_2 = 1$; for users in the second abnormal user list but not in the first abnormal user list, let $r_1 = 1$.
sorting the comprehensive suspicion coefficients of illegal inquiry of the abnormal users from small to large to generate the comprehensive suspicion degree ranking of each abnormal user in the third abnormal user list.
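A minimal Python sketch of this percentile-based fusion is given below. It assumes that the cumulative percentile of the user at position $p$ (counting from 1) in a list of $n$ users is $p/n$; the text does not spell this out, and the function names are hypothetical.

```python
def percentile_ranks(ranked):
    """Map each user to rank position / list length (assumed definition)."""
    n = len(ranked)
    return {user: (pos + 1) / n for pos, user in enumerate(ranked)}

def fuse_by_percentile(first, second, combine=max):
    """Combine r1 and r2 per user and sort ascending by the coefficient.

    `combine` takes (r1, r2): e.g. max, min, or a weighted average.
    Users missing from one list get percentile 1 in that list.
    """
    r1, r2 = percentile_ranks(first), percentile_ranks(second)
    users = list(dict.fromkeys(first + second))   # de-duplicate, keep order
    coeff = {u: combine(r1.get(u, 1.0), r2.get(u, 1.0)) for u in users}
    return sorted(users, key=lambda u: coeff[u]), coeff

first, second = ["u3", "u1", "u7", "u9"], ["u1", "u5"]
ranking_max, coeff_max = fuse_by_percentile(first, second, combine=max)
ranking_min, coeff_min = fuse_by_percentile(first, second, combine=min)
ranking_avg, _ = fuse_by_percentile(
    first, second, combine=lambda a, b: 0.5 * a + 0.5 * b)
```

With `max`, only users ranked high in both lists obtain a small coefficient; with `min`, a high rank in either list is enough.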
Preferably, the adaptively setting an early warning threshold value based on the abnormal scores of the query behaviors of all the users corresponding to each feature variable to be analyzed further includes:
the early warning threshold is set by one of the following two methods:
Method 1: (1) sorting the query behavior abnormal scores of all users from high to low, and calculating the median and the interquartile range; (2) setting the early warning threshold to the median plus 1.5 times the interquartile range;
Method 2: (1) calculating the sum of squares of the query behavior abnormal scores of all users, recorded as the total sum of squares; (2) sorting the query behavior abnormal scores of all users from high to low, and, for each user in turn, calculating the sum of squares of the abnormal scores of that user and of all users whose abnormal scores are larger, recorded as that user's cumulative sum of squares; (3) finding the first user whose cumulative sum of squares reaches 80% or more of the total sum of squares, and setting the query behavior abnormal score of that user as the early warning threshold.
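Both threshold rules can be sketched in Python as follows. This is an illustration under stated assumptions: the quartile convention is that of the standard-library `statistics.quantiles` (the text does not fix one), "reaches more than 80%" is read as "greater than or equal to 80%", and the function names are hypothetical.

```python
import statistics

def threshold_median_iqr(scores):
    """Method 1: median plus 1.5 times the interquartile range."""
    q1, q2, q3 = statistics.quantiles(scores, n=4)
    return q2 + 1.5 * (q3 - q1)

def threshold_cumulative_squares(scores, share=0.8):
    """Method 2: score of the first user (in descending order) whose
    cumulative sum of squared scores reaches `share` of the total."""
    total = sum(s * s for s in scores)
    running = 0.0
    for s in sorted(scores, reverse=True):
        running += s * s
        if running >= share * total:
            return s
    return min(scores)

scores = [1, 2, 3, 4, 10]
t1 = threshold_median_iqr(scores)
t2 = threshold_cumulative_squares(scores)
```

Method 2 concentrates on the few largest scores: the threshold lands where the top scores already account for most of the squared mass.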
As another embodiment of the present invention, as shown in fig. 2, there is provided a system for determining an abnormality of vehicle information query, which integrates multiple features, including:
a system parameter initialization setting module 10, configured to acquire a user query behavior feature set and a user set $S_{in}$ to be analyzed, wherein the user query behavior feature set comprises a plurality of feature variables to be analyzed, and each feature variable to be analyzed comprises a plurality of variable values;
a user sample statistical validity determination module 20, configured to determine whether the user samples in the user set $S_{in}$ to be analyzed are statistically valid, wherein the statistically valid user samples in $S_{in}$ are used for the calculation in the query behavior abnormal user early warning module 30;
a query behavior abnormal user early warning module 30, configured to calculate, for each valid user in the user set $S_{in}$ to be analyzed, the number of times and the proportion that the user queries motor vehicle registration information corresponding to each variable value of the feature variable to be analyzed; to construct, for each feature variable to be analyzed, a group of user number distribution histograms, one histogram per variable value of the feature variable; to calculate, from the group of histograms corresponding to the feature variable, the query behavior abnormal scores of all users by using a histogram-based outlier scoring algorithm (Histogram-based Outlier Score, HBOS), wherein a higher score indicates a larger difference between the user's query behavior and the group query behavior; to adaptively set an early warning threshold from the query behavior abnormal scores of all users corresponding to each feature variable by using a boxplot-based outlier recognition algorithm (Boxplot); to judge users whose query behavior abnormal scores are higher than the early warning threshold as abnormal users; and to sort all abnormal users from high to low by query behavior abnormal score, generating an abnormal user suspicion degree ranking and an abnormal user list that comprises the abnormal users together with their suspicion degree ranking;
it should be noted that the number of each group of histograms is equal to the number of different variable values of the characteristic variable to be analyzed; each user number distribution histogram is composed of a series of longitudinal columns with different heights, wherein the horizontal axis represents the inquiry frequency interval range of different motor vehicle registration information under the corresponding variable value, and the vertical axis represents the proportion of the number of users in the corresponding interval range of the characteristic variable to the total number of all users to be analyzed.
And the early warning information fusion and integration module 40 is used for fusing abnormal user lists generated by different characteristic variables to be analyzed to generate a new abnormal user list.
Preferably, the system parameter initialization setting module 10 is specifically configured to:
S1.1: selecting the department to which the users to be analyzed belong, and determining the user set $S_{in}$ to be analyzed according to that department, wherein the department may be a traffic police head team, a traffic police branch team, or any designated range of departments;
S1.2: selecting the time range of the users' motor vehicle registration information query operations, such as a certain year, a certain quarter, a certain month, or any specified start and end time;
S1.3: selecting the user query behavior feature set $\{F_1, F_2, F_3, F_4, F_5, F_6\}$ to be analyzed, and selecting the feature variables to be analyzed from this set, with multiple selection supported; the feature variables $F_1, F_2, F_3, F_4, F_5, F_6$ in the set and their corresponding numbers of variable values are defined as follows:
(1) Feature variable $F_1$: the variable name is "motor vehicle registration place distribution 1", the variable values are "within the province" and "outside the province", and the number of variable values is $N_1 = 2$;
(2) Feature variable $F_2$: the variable name is "motor vehicle registration place distribution 2", the variable values are "local city within the province", "other cities within the province" and "outside the province", and the number of variable values is $N_2 = 3$;
(3) Feature variable $F_3$: the variable name is "motor vehicle registration place distribution 3", the variable values are "Beijing", "Tianjin", "Hebei", "Shanxi", "Inner Mongolia", "Liaoning", "Jilin", "Heilongjiang", "Shanghai", "Jiangsu", "Zhejiang", "Anhui", "Fujian", "Jiangxi", "Shandong", "Henan", "Hubei", "Hunan", "Guangdong", "Guangxi", "Hainan", "Chongqing", "Sichuan", "Guizhou", "Yunnan", "Xizang", "Shaanxi", "Gansu", "Qinghai", "Ningxia" and "Xinjiang", and the number of variable values is $N_3 = 31$;
(4) Feature variable $F_4$: the variable name is "query time period distribution 1", the variable values are "daytime" and "night", and the number of variable values is $N_4 = 2$;
(5) Feature variable $F_5$: the variable name is "query time period distribution 2", the variable values are "working time" and "non-working time", and the number of variable values is $N_5 = 2$;
(6) Feature variable $F_6$: the variable name is "query time period distribution 3", the variable values are "hour 0 (0:00-0:59)", "hour 1 (1:00-1:59)", "hour 2 (2:00-2:59)", "hour 3 (3:00-3:59)", ..., "hour 23 (23:00-23:59)", and the number of variable values is $N_6 = 24$.
It should be noted that the system defines the daytime/night and working time/non-working time divisions according to the geographical location of the user's department and the traffic management services it handles. For example, in most provinces of the country, daytime is defined as 6:00 to 20:00 and night as 20:00 to 6:00 of the next day, while high-latitude areas such as Heilongjiang and Xinjiang are adjusted appropriately according to actual conditions.
Preferably, the user sample statistical validity determination module 20 is specifically configured to:
S2.1: according to the query operation time range selected in S1.2, calculating the total number of times each user to be analyzed in the set $S_{in}$ determined in S1.1 queries motor vehicle registration information;
S2.2: adaptively calculating a user query operation count threshold $\alpha_1$; if the total number of times a user queries motor vehicle registration information within the time range of S1.2 is less than $\alpha_1$, removing the user from the user set $S_{in}$ to be analyzed, because such users perform too few query operations to be statistically significant; it is recommended to set $\alpha_1 > 5 \times N$, where $N$ is the maximum number of variable values among the feature variables selected in S1.3, that is, $N = \max(N_1, N_2, \dots)$;
S2.3: adaptively calculating a user number threshold $\alpha_2$; if, after S2.2, the number of users in the user set $S_{in}$ to be analyzed is less than $\alpha_2$, the total number of users to be analyzed is too small and the sample fails the statistical validity judgment (with too few samples, the differences between the query features of different users cannot be analyzed), so the process returns to S1.1 and prompts that the parameters be reset; if, after S2.2, the number of users in $S_{in}$ is not less than $\alpha_2$, the sample passes the statistical validity judgment and the process enters the query behavior abnormal user early warning module 30; it is recommended to set $\alpha_2 \gg 20$.
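The two validity checks of S2.2 and S2.3 can be sketched as below. The concrete values are illustrative assumptions only: $\alpha_1$ is taken as $5N + 1$ (the smallest integer satisfying $\alpha_1 > 5N$), the default $\alpha_2 = 30$ is an arbitrary choice above 20, and the function name is hypothetical.

```python
def filter_valid_users(total_queries, value_counts, alpha2=30, alpha1=None):
    """Return (valid_users, sample_is_valid).

    total_queries: dict user -> total number of queries in the time range
    value_counts:  numbers of variable values N_k of the selected features
    alpha1:        per-user query count threshold (assumed 5*N + 1 if None)
    alpha2:        minimum number of remaining users (assumed default)
    """
    if alpha1 is None:
        alpha1 = 5 * max(value_counts) + 1
    # S2.2: drop users with fewer than alpha1 queries
    valid = {u: n for u, n in total_queries.items() if n >= alpha1}
    # S2.3: the sample is valid only if enough users remain
    return valid, len(valid) >= alpha2

totals = {"a": 200, "b": 3, "c": 156}
valid, ok = filter_valid_users(totals, value_counts=[2, 31], alpha2=2)
```

When the second return value is false, the caller would go back to S1.1 and ask for a wider department or time range.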
The query behavior abnormal user early warning module 30 calculates, by using a histogram-based outlier scoring algorithm (Histogram-based Outlier Score), an abnormal score of each user's query behavior relative to the group query behavior, wherein a higher score indicates a larger difference between the user's query behavior and the group query behavior; based on the abnormal scores of the users' query behavior, a boxplot-based outlier recognition algorithm (Boxplot) is applied to set a threshold adaptively, judge whether each user is suspected of illegal inquiry, and give the suspicion degree ranking.
Specifically, the query behavior abnormal user early warning module 30 is specifically configured to:
for each feature variable $F_k \in \{F_1, F_2, F_3, F_4, F_5, F_6\}$ selected in S1.3, completing S3.1, S3.2, S3.3, S3.4, S3.5, S3.6 and S3.7.
S3.1: determining the ordering of the values of the feature variable $F_k$,
$(\psi_1, \psi_2, \dots, \psi_{N_k})$;
the ordering is fixed once determined; wherein $N_k$ is the number of variable values of the feature variable $F_k$;
S3.2: for each user $u_i \in S_{in}$, counting the number of times the user queries motor vehicle registration information corresponding to each variable value of $F_k$; following the ordering $(\psi_1, \psi_2, \dots, \psi_{N_k})$ determined in S3.1, constructing the feature vector
$x_i = (x_{i,1}, x_{i,2}, \dots, x_{i,N_k})$,
wherein $x_{i,j}$ is the number of times user $u_i$ queries motor vehicle registration information with $F_k = \psi_j$;
S3.3: for each user $u_i \in S_{in}$, calculating the proportion of the number of times the user queries motor vehicle registration information corresponding to each variable value of $F_k$ to the user's total number of queries, and constructing the feature vector
$y_i = (y_{i,1}, y_{i,2}, \dots, y_{i,N_k})$.
The specific calculation method is
$$y_{i,j} = \frac{x_{i,j}}{\sum_{j'=1}^{N_k} x_{i,j'}}$$
In the formula, $\sum_{j'=1}^{N_k} x_{i,j'}$ is the total number of times user $u_i$ queries motor vehicle registration information, and $y_{i,j}$ is the proportion of the number of times user $u_i$ queries motor vehicle registration information with $F_k = \psi_j$ to that total number of queries;
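As a small worked example of S3.2 and S3.3, the following Python sketch turns a count vector into a proportion vector. The function name and the sample counts are hypothetical; the users are assumed to have passed the validity check, so the total is positive.

```python
def proportion_vector(counts):
    """Convert per-value query counts x_i into proportions y_i."""
    total = sum(counts)               # total number of queries of the user
    return [c / total for c in counts]

# Feature F1 ("within the province", "outside the province"):
# a user with 30 in-province and 10 out-of-province queries.
y = proportion_vector([30, 10])
```

Working with proportions rather than raw counts makes users with very different query volumes comparable.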
S3.4: constructing a group of user number distribution histograms for each feature variable, one histogram per variable value. Each histogram can be constructed in one of two ways, as a fixed-length interval histogram or as a variable-length interval histogram, which are described below taking a variable value $\psi_j$ of the feature variable $F_k$ as an example:
Method one: fixed-length interval histogram.
(1) setting a parameter $K$, wherein $K$ is the number of columns in the histogram;
(2) sorting $\{y_{i,j}, i \in S_{in}\}$ from large to small, and recording the minimum value as $a$ and the maximum value as $b$;
(3) dividing the interval $[a, b]$ equally into $K$ sub-intervals, each of length
$\Delta = \frac{b - a}{K}$;
the value ranges of the sub-intervals are respectively:
$\Delta_1 = [a, a + \Delta)$,
$\Delta_k = [a + (k-1)\Delta, a + k\Delta)$, $k = 2, \dots, K-1$,
$\Delta_K = [a + (K-1)\Delta, b]$;
(4) for each $k = 1, 2, \dots, K$, counting the number $s_k$ of users in $S_{in}$ whose $y_{i,j}$ falls within the interval $\Delta_k$, and calculating the proportion of $s_k$ to the number of all users to be analyzed,
$s_k / |S_{in}|$, $k = 1, 2, \dots, K$;
wherein $y_{i,j}$ is the $j$-th element of the feature vector $y_i$ in S3.3.
(5) in a two-dimensional rectangular plane coordinate system, drawing the user number distribution histogram: on the horizontal axis, each interval $\Delta_k$ forms the base of a rectangular column whose height on the vertical axis is $s_k / |S_{in}|$;
(6) constructing, from the height and corresponding interval of each column in the user number distribution histogram, the function
$f_j(u_i) = s_k / |S_{in}|$ for $y_{i,j} \in \Delta_k$, $k = 1, 2, \dots, K$.
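The fixed-length interval histogram of method one can be sketched in Python as follows. The sketch assumes that the column height is the proportion of users falling in the interval, in line with the description of the vertical axis given for the histograms; the function name and the sample proportions are hypothetical.

```python
def fixed_width_histogram(values, K):
    """Build a K-column fixed-width histogram over `values` and return a
    lookup function: proportion y -> height of the column containing y."""
    a, b = min(values), max(values)
    width = (b - a) / K

    def bin_index(y):
        if width == 0:                # all values identical: single column
            return 0
        # the last interval is closed on the right, hence the clamp
        return min(int((y - a) / width), K - 1)

    counts = [0] * K
    for y in values:
        counts[bin_index(y)] += 1
    heights = [c / len(values) for c in counts]   # share of users per column
    return lambda y: heights[bin_index(y)]

# Proportions y_{i,j} of four users for one variable value.
f_j = fixed_width_histogram([0.0, 0.125, 0.25, 1.0], K=4)
```

A user whose proportion falls in a sparsely populated column receives a small height, which later translates into a high abnormal score.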
Method two: variable-length interval histogram.
(1) setting a parameter $K$, wherein $K$ is the number of intervals in the histogram;
(2) sorting $\{y_{i,j}, i \in S_{in}\}$ from large to small, and dividing every fixed number of users (formula image BDA0003204440930000105) into one group; the length of each interval is the difference between the maximum value and the minimum value, within the group, of the proportion of the number of queries for $\psi_j$ to the total number of queries, recorded as $\delta_k$, $k = 1, 2, \dots, K$;
(3) for each interval $\Delta_k$, $k = 1, 2, \dots, K$, calculating the height of the corresponding column (formula image BDA0003204440930000106);
(4) in a two-dimensional rectangular plane coordinate system, drawing the user number distribution histogram: on the horizontal axis, each interval $\Delta_k$ forms the base of a rectangular column whose height on the vertical axis is the value calculated in (3) (formula image BDA0003204440930000107);
(5) constructing, from the height and corresponding interval of each column in the user number distribution histogram, the function $f_j(u_i)$ (formula image BDA0003204440930000108).
It should be noted that there is a special case: if more than a certain number of users (formula image BDA0003204440930000109) have identical $y_{i,j}$, the number of users in each interval may not be exactly equal to the group size (formula image BDA0003204440930000111).
S3.5: calculating the query behavior abnormal score of each user $u_i$ in the set $S_{in}$ (formula images BDA0003204440930000112 and BDA0003204440930000113);
S3.6: sorting the users $u_i$ in the set $S_{in}$ by abnormal score from high to low, and calculating the first quartile $Q_1$, the median $Q_2$, the third quartile $Q_3$, and the interquartile range $IQR = Q_3 - Q_1$; setting the early warning threshold $\alpha_3 = Q_1 + 1.5 \times IQR$;
S3.7: judging all users in the set $S_{in}$ whose abnormal score is higher than the early warning threshold $\alpha_3$ as abnormal users; sorting the abnormal users by abnormal score from high to low, calculating the suspicion degree ranking, and generating early warning information on abnormal users suspected of illegal inquiry, comprising the list $S_{outk}$ and the suspicion degree ranking.
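The score formula itself is only available as an image in this text. As an illustration, the sketch below uses the commonly published form of the histogram-based outlier score, the sum over histograms of $\log(1 / \text{height})$; whether the patent uses exactly this form is an assumption, and the function name and the `eps` guard are hypothetical.

```python
import math

def hbos_score(heights, eps=1e-12):
    """Histogram-based outlier score for one user (assumed standard form).

    heights: the values f_j(u_i), one per variable value of the feature,
             i.e. the height of the column the user falls into.
    eps:     guard against log(1/0) for empty columns (an assumption).
    """
    return sum(math.log(1.0 / max(h, eps)) for h in heights)

typical = hbos_score([0.5, 0.5])    # user in well-populated columns
unusual = hbos_score([0.05, 0.5])   # user in one sparse column
```

Under this form, a user who sits in densely populated columns for every variable value scores close to zero, and each sparse column adds to the score.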
Preferably, the query behavior abnormal user early warning module 30 is specifically configured to:
the early warning threshold is set by one of the following two methods:
Method 1: (1) sorting the query behavior abnormal scores of all users from high to low, and calculating the median and the interquartile range; (2) setting the early warning threshold to the median plus 1.5 times the interquartile range;
Method 2: (1) calculating the sum of squares of the query behavior abnormal scores of all users, recorded as the total sum of squares; (2) sorting the query behavior abnormal scores of all users from high to low, and, for each user in turn, calculating the sum of squares of the abnormal scores of that user and of all users whose abnormal scores are larger, recorded as that user's cumulative sum of squares; (3) finding the first user whose cumulative sum of squares reaches 80% or more of the total sum of squares, and setting the query behavior abnormal score of that user as the early warning threshold.
Preferably, the early warning information fusion integration module 40 is specifically configured to:
different characteristic variables to be analyzed generate different abnormal user lists, abnormal users in the different abnormal user lists are subjected to de-duplication and then are combined, and a new abnormal user list is generated;
and recalculating the suspicion degree comprehensive ranking of each abnormal user in the new abnormal user list by using a feature bagging algorithm according to the suspicion degree ranking of the abnormal users in different abnormal user lists.
Specifically, as shown in fig. 3, the method for calculating the suspicion degree comprehensive ranking of each abnormal user in the new abnormal user list by the early warning information integration module 40 includes:
fusing the abnormal user lists $S_{outk}$ generated by the plurality of feature variables $F_k \in \{F_1, F_2, F_3, F_4, F_5, F_6\}$ selected in S1.3, and calculating the final suspicion degree ranking of each user. The fusion and integration of abnormal user lists is described below taking the fusion of two abnormal user lists $S_{out1}$ and $S_{out2}$ as an example.
S4.1: judging all users in the first abnormal user list $S_{out1}$ and the second abnormal user list $S_{out2}$ as abnormal users; calculating the numbers of abnormal users in $S_{out1}$ and $S_{out2}$, recorded as $|S_{out1}|$ and $|S_{out2}|$;
S4.2: initializing a third abnormal user list $S_{out3}$ as empty, and initializing a parameter $j = 1$;
S4.3: (1) if $j \le |S_{out1}|$, taking the $j$-th user in the first abnormal user list $S_{out1}$ and judging whether the user is in the third abnormal user list $S_{out3}$; if not, adding the user to the end of $S_{out3}$ as its last user; (2) if $j \le |S_{out2}|$, taking the $j$-th user in the second abnormal user list $S_{out2}$ and judging whether the user is in the third abnormal user list $S_{out3}$; if not, adding the user to the end of $S_{out3}$ as its last user;
S4.4: letting $j = j + 1$; if $j \le \max(|S_{out1}|, |S_{out2}|)$, repeating S4.3; otherwise, stopping the iteration;
S4.5: outputting the third abnormal user list $S_{out3}$ as the fused and integrated abnormal user list. The number of users in $S_{out3}$ is $|S_{out3}|$, and the comprehensive suspicion degree rank of each abnormal user is the user's sequence number in the list $S_{out3}$.
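Steps S4.1 to S4.5 are written for two lists; the sketch below follows the same index-based loop and, as an assumption not stated in the text, extends it to any number of lists by visiting them in a fixed order at each position $j$. The function name is hypothetical.

```python
def fuse_rankings(*ranked_lists):
    """Index-based fusion: at each position j, visit the lists in order and
    append the j-th user if not yet fused. Returns user -> final rank."""
    fused = []
    longest = max((len(lst) for lst in ranked_lists), default=0)
    for j in range(longest):                  # S4.3 / S4.4 loop over j
        for lst in ranked_lists:
            if j < len(lst) and lst[j] not in fused:
                fused.append(lst[j])          # add as the last user
    # S4.5: the rank is the 1-based sequence number in the fused list
    return {user: rank for rank, user in enumerate(fused, start=1)}

ranks = fuse_rankings(["u3", "u1", "u7"], ["u1", "u5"])
```

Because the first list is always visited first at each position, ties between lists are resolved in favor of the feature variable listed first.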
Although the present invention has been described with reference to a preferred embodiment, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A method for judging abnormal motor vehicle information query by fusing multiple characteristics is characterized by comprising the following steps:
acquiring a user query behavior feature set and a user set $S_{in}$ to be analyzed, wherein the user query behavior feature set comprises a plurality of feature variables to be analyzed, and each feature variable to be analyzed comprises a plurality of variable values;
respectively calculating, for each valid user in the user set $S_{in}$ to be analyzed, the number of times and the proportion that the user queries motor vehicle registration information corresponding to each variable value of the feature variable to be analyzed, wherein the proportion is the ratio of the number of times the user queries motor vehicle registration information corresponding to that variable value to the user's total number of queries;
for each feature variable to be analyzed, respectively constructing a group of user number distribution histograms, wherein each variable value of the feature variable to be analyzed corresponds to one user number distribution histogram, the horizontal axis of the histogram represents the proportion of the number of times a user queries motor vehicle registration information corresponding to the variable value to the user's total number of queries, and the vertical axis represents the proportion of the number of users whose query proportion falls within the corresponding interval to the total number of users to be analyzed;
respectively calculating, according to the group of user number distribution histograms corresponding to the feature variable to be analyzed, the query behavior abnormal scores of all valid users in the user set $S_{in}$ to be analyzed for that feature variable;
adaptively setting an early warning threshold based on the query behavior abnormal scores of all valid users in the user set $S_{in}$ to be analyzed for each feature variable to be analyzed, judging users whose query behavior abnormal scores are higher than the early warning threshold as abnormal users, sorting all abnormal users from high to low by query behavior abnormal score to generate an abnormal user suspicion degree ranking, and generating an abnormal user list comprising the abnormal users together with their suspicion degree ranking;
and fusing abnormal user lists generated by different characteristic variables to be analyzed to generate a new abnormal user list.
2. The method for distinguishing the abnormal inquiry of the motor vehicle information by fusing the multi-feature of claim 1, wherein the fusing the abnormal user lists generated by different feature variables to be analyzed to generate a new abnormal user list further comprises:
different characteristic variables to be analyzed generate different abnormal user lists, abnormal users in the different abnormal user lists are subjected to de-duplication and then are combined, and a new abnormal user list is generated;
and recalculating the suspicion degree comprehensive ranking of each abnormal user in the new abnormal user list by using a feature bagging algorithm according to the suspicion degree ranking of the abnormal users in different abnormal user lists.
3. The method for judging the abnormal inquiry of the motor vehicle information fused with the multi-feature as claimed in claim 2, wherein an abnormal user list generated by fusing two feature variables to be analyzed is set to generate a new abnormal user list, and the method further comprises the following steps:
setting an abnormal user list generated by a first characteristic variable to be analyzed as a first abnormal user list, setting an abnormal user list generated by a second characteristic variable to be analyzed as a second abnormal user list, and initializing a third abnormal user list, wherein the third abnormal user list is set as an empty set;
selecting an abnormal user with a first suspicion degree rank in the first abnormal user list, and judging whether the abnormal user is in the third abnormal user list or not; if not, placing the abnormal users at the tail of the third abnormal user list, removing the abnormal users from the first abnormal user list, and updating the suspicion degree ranking of the rest abnormal users in the first abnormal user list;
selecting the abnormal user ranked first in suspicion degree in the second abnormal user list, and judging whether the abnormal user is in the third abnormal user list; if not, placing the abnormal user at the tail of the third abnormal user list, removing the abnormal user from the second abnormal user list, and updating the suspicion degree ranking of the remaining abnormal users in the second abnormal user list;
iterating circularly until all abnormal users in the first abnormal user list and the second abnormal user list are removed;
and the abnormal user sequence in the newly generated third abnormal user list is the suspicion degree comprehensive ranking of each abnormal user after fusion.
4. The method for judging the abnormal inquiry of the motor vehicle information fused with the multi-feature as claimed in claim 2, wherein an abnormal user list generated by fusing two feature variables to be analyzed is set to generate a new abnormal user list, and the method further comprises the following steps:
setting an abnormal user list generated by a first characteristic variable to be analyzed as a first abnormal user list, and setting an abnormal user list generated by a second characteristic variable to be analyzed as a second abnormal user list;
respectively calculating, for each abnormal user in the first abnormal user list and the second abnormal user list, the cumulative percentile of the user's suspicion degree rank sorted from large to small, and recording the two values as $r_1$ and $r_2$;
removing duplicate abnormal users from the first abnormal user list and the second abnormal user list, and then merging the two lists to generate a third abnormal user list;
for each abnormal user in the third abnormal user list, calculating a comprehensive suspicion coefficient of illegal inquiry by the maximum value $\max(r_1, r_2)$, the minimum value $\min(r_1, r_2)$, or the weighted average $w_1 r_1 + w_2 r_2$, wherein $0 \le w_1 \le 1$, $0 \le w_2 \le 1$, and $w_1 + w_2 = 1$;
sorting the comprehensive suspicion coefficients of illegal inquiry of the abnormal users from small to large to generate the comprehensive suspicion degree ranking of each abnormal user in the third abnormal user list.
5. The method for distinguishing the abnormal inquiry of the motor vehicle information integrating the multi-features of claim 1, wherein the self-adaptive setting of the early warning threshold value based on the abnormal scores of the inquiry behaviors of all the users corresponding to each feature variable to be analyzed further comprises:
the early warning threshold is set by one of the following two methods:
Method 1: (1) sorting the query behavior abnormal scores of all users from high to low, and calculating the median and the interquartile range; (2) setting the early warning threshold to the median plus 1.5 times the interquartile range;
Method 2: (1) calculating the sum of squares of the query behavior abnormal scores of all users, recorded as the total sum of squares; (2) sorting the query behavior abnormal scores of all users from high to low, and, for each user in turn, calculating the sum of squares of the abnormal scores of that user and of all users whose abnormal scores are larger, recorded as that user's cumulative sum of squares; (3) finding the first user whose cumulative sum of squares reaches 80% or more of the total sum of squares, and setting the query behavior abnormal score of that user as the early warning threshold.
6. A motor vehicle information inquiry abnormity discrimination system fusing multiple characteristics is characterized by comprising:
a system parameter initialization setting module (10), configured to acquire a user query behavior feature set and a user set $S_{in}$ to be analyzed, wherein the user query behavior feature set comprises a plurality of feature variables to be analyzed, and each feature variable to be analyzed comprises a plurality of variable values;
a user sample statistical validity determination module (20), configured to determine whether the user samples in the user set $S_{in}$ to be analyzed are statistically valid, wherein the statistically valid user samples in $S_{in}$ are used for the calculation in the query behavior abnormal user early warning module (30);
a query behavior abnormal user early warning module (30), configured to calculate, for each valid user in the user set $S_{in}$ to be analyzed, the number of times and the proportion that the user queries motor vehicle registration information corresponding to each variable value of the feature variable to be analyzed; to construct, for each feature variable to be analyzed, a group of user number distribution histograms, wherein each variable value of the feature variable corresponds to one user number distribution histogram; to calculate, according to the group of user number distribution histograms corresponding to the feature variable, the query behavior abnormal scores of all valid users in $S_{in}$ for that feature variable; to adaptively set an early warning threshold based on the query behavior abnormal scores of all users corresponding to each feature variable; to judge users whose query behavior abnormal scores are higher than the early warning threshold as abnormal users; and to sort all abnormal users from high to low by query behavior abnormal score, generating an abnormal user suspicion degree ranking and an abnormal user list comprising the abnormal users together with their suspicion degree ranking;
and the early warning information fusion integration module (40) is used for fusing abnormal user lists generated by different characteristic variables to be analyzed to generate a new abnormal user list.
7. The multi-feature-fused motor vehicle information query abnormality judging system according to claim 6, wherein the system parameter initialization setting module (10) is specifically configured to:
select the department to which the users to be analyzed belong, and determine the user set S_in to be analyzed according to that department;
select a time range over which the users performed motor vehicle registration information query operations;
and select the user query behavior feature set to be analyzed.
8. The multi-feature-fused motor vehicle information query abnormality judging system according to claim 7, wherein the user sample statistical validity judging module (20) is specifically configured to:
calculate, over the selected query operation time range, the total number of times each user to be analyzed queried motor vehicle registration information;
adaptively calculate a user query operation count threshold, and remove from the user set S_in to be analyzed the users whose total number of motor vehicle registration information queries is smaller than that threshold;
adaptively calculate a user number threshold and, if the number of users in the user set S_in to be analyzed is less than the user number threshold, terminate the program and prompt for the parameters to be reset.
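The validity check of module (20) might look roughly like the following sketch. The claim does not state how either threshold is derived, so the rule used here (a fraction of the median query count) and the fixed `min_users` value are assumptions, as are the function and parameter names.

```python
from statistics import median

def filter_valid_users(query_counts, count_ratio=0.1, min_users=30):
    """Drop low-activity users, then check that enough users remain.

    query_counts: dict user -> total number of registration queries in the
    selected time range. count_ratio and min_users are hypothetical knobs;
    the claim does not say how either threshold is computed.
    """
    # Assumed adaptive rule: a fraction of the median query count.
    count_threshold = count_ratio * median(query_counts.values())
    valid = {u: c for u, c in query_counts.items() if c >= count_threshold}
    if len(valid) < min_users:
        # Mirrors "terminate the program and prompt for the parameters to be reset".
        raise ValueError("too few statistically valid users; reset the parameters")
    return valid
```

Raising an exception is one way to model the stop-and-prompt behavior; a real system would presumably surface this to the operator instead.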
9. The multi-feature-fused motor vehicle information query abnormality judging system according to claim 6, wherein the abnormal query behavior user early warning module (30) is specifically configured to:
set the early warning threshold using the following two methods:
Method 1: (1) sort the query behavior abnormality scores of all users from high to low, and calculate the median and the interquartile range; (2) set the early warning threshold to the median plus 1.5 times the interquartile range;
Method 2: (1) calculate the sum of squares of the query behavior abnormality scores of all users, recorded as the total cumulative sum of squares; (2) sort the query behavior abnormality scores of all users from high to low and, for each user in turn, calculate the sum of squares of the scores of that user and of all users whose score is larger, recorded as the user cumulative sum of squares; (3) find the first user whose user cumulative sum of squares reaches 80% or more of the total cumulative sum of squares, and set that user's query behavior abnormality score as the early warning threshold.
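The two threshold rules in this claim can be sketched as follows. The function names are hypothetical, and the quartile convention (Python's default "exclusive" method in `statistics.quantiles`) is an assumption, since the claim does not say which quartile definition is intended.

```python
from statistics import median, quantiles

def iqr_threshold(scores):
    """Method 1: median plus 1.5 times the interquartile range."""
    q1, _, q3 = quantiles(scores, n=4)  # assumed quartile convention
    return median(scores) + 1.5 * (q3 - q1)

def cumulative_square_threshold(scores, share=0.8):
    """Method 2: score of the first user, in descending order, at which the
    running sum of squared scores reaches `share` of the total sum of squares."""
    total = sum(s * s for s in scores)
    running = 0.0
    for s in sorted(scores, reverse=True):
        running += s * s
        if running >= share * total:
            return s
    return min(scores)  # only reached through floating-point rounding

def abnormal_users(user_scores, threshold):
    """Users scoring strictly above the threshold, most suspicious first."""
    flagged = [u for u, s in user_scores.items() if s > threshold]
    return sorted(flagged, key=user_scores.get, reverse=True)
```

Method 1 depends only on the middle of the score distribution, while Method 2 is driven by how much of the squared-score mass sits in the top few users, so the two can yield quite different thresholds on the same scores.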
10. The multi-feature-fused motor vehicle information query abnormality judging system according to claim 6, wherein the early warning information fusion and integration module (40) is specifically configured to:
deduplicate and merge the abnormal users in the different abnormal user lists generated for the different feature variables to be analyzed, producing a new abnormal user list;
and recalculate a comprehensive suspicion degree ranking for each abnormal user in the new abnormal user list using a feature bagging algorithm, based on the suspicion degree rankings of the abnormal users in the different abnormal user lists.
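As an illustration of the fusion step, the sketch below merges per-feature rankings with the breadth-first combination commonly associated with feature bagging. Whether the patent uses this variant or a cumulative-sum variant is not stated in this claim, so the choice is an assumption, and the function name is hypothetical.

```python
from itertools import zip_longest

def breadth_first_fusion(ranked_lists):
    """Merge per-feature abnormal user rankings into one deduplicated ranking.

    ranked_lists: list of lists of user ids, each ordered most suspicious first.
    Takes the top user of every list, then the second of every list, and so on,
    skipping users that have already been placed.
    """
    fused, seen = [], set()
    for tier in zip_longest(*ranked_lists):
        for user in tier:
            if user is not None and user not in seen:
                seen.add(user)
                fused.append(user)
    return fused
```

With this rule a user who ranks first under any single feature variable stays near the top of the fused list, which suits a setting where one strongly abnormal behavior pattern is already worth an alert.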
CN202110913177.0A 2021-08-10 2021-08-10 Multi-feature-fused motor vehicle information query abnormality judging method and system Active CN113609395B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110913177.0A CN113609395B (en) 2021-08-10 2021-08-10 Multi-feature-fused motor vehicle information query abnormality judging method and system

Publications (2)

Publication Number Publication Date
CN113609395A (en) 2021-11-05
CN113609395B (en) 2023-05-05

Family

ID=78307891

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110913177.0A Active CN113609395B (en) 2021-08-10 2021-08-10 Multi-feature-fused motor vehicle information query abnormality judging method and system

Country Status (1)

Country Link
CN (1) CN113609395B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106647724A (en) * 2017-02-15 2017-05-10 北京航空航天大学 T-BOX information security detection and protection method based on vehicle anomaly data monitoring
CN110824270A (en) * 2019-10-09 2020-02-21 中国电力科学研究院有限公司 Electricity stealing user identification method and device combining transformer area line loss and abnormal events
CN111090685A (en) * 2019-12-19 2020-05-01 第四范式(北京)技术有限公司 Method and device for detecting data abnormal characteristics
CN111698247A (en) * 2020-06-11 2020-09-22 腾讯科技(深圳)有限公司 Abnormal account detection method, device, equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
潘述亮; 徐晓东; 杨海波; 邹难: "Intelligence and Evolution: Design of Jinan's New-Generation Smart Transportation System" (智能与进化:济南新一代智慧交通系统的设计) *

Also Published As

Publication number Publication date
CN113609395B (en) 2023-05-05

Similar Documents

Publication Publication Date Title
Zheng et al. Detecting collective anomalies from multiple spatio-temporal datasets across different domains
CN111506574A (en) R tree-based pollutant tracing method and device and related equipment thereof
CN106709047B (en) Object searching method and device
CN107248023B (en) Method and device for screening benchmarking enterprise list
CN111626842A (en) Consumption behavior data analysis method and device
CN113470369B (en) Method and system for judging true number plate of fake-licensed vehicle based on multi-dimensional information
CN114291025B (en) Vehicle collision detection method and system based on data segmentation aggregation distribution
CN113609395A (en) Method and system for judging motor vehicle information query abnormity by fusing multiple characteristics
TWI670616B (en) Analysis system for abnormal trajectory of vehicle and method thereof
CN116433053B (en) Data processing method, device, computer equipment and storage medium
CN109657703B (en) Crowd classification method based on space-time data trajectory characteristics
Thomas et al. Imputation of trip data for a docked bike-sharing system
Alves et al. Discovering telecom fraud situations through mining anomalous behavior patterns
CN109660512B (en) Sensitive information flow direction vectorization method, abnormal flow direction identification method and device
CN112035775A (en) User identification method and device based on random forest model and computer equipment
CN108346287B (en) Traffic flow sequence pattern matching method based on influence factor analysis
CN113609408B (en) Distance calculation-based motor vehicle information query abnormality judging method and system
CN114943479A (en) Risk identification method, device and equipment of business event and computer readable medium
CN104699747A (en) AMQ (approximate membership query) method based on high-dimensional data filter
CN108470449B (en) Method and device for determining time threshold value during passing between bayonets
Parisi et al. Repairs and consistent answers for inconsistent probabilistic spatio-temporal databases
Ikoba et al. Nigeria’s recent population censuses: a Benford-theoretic evaluation
CN117078441B (en) Method, apparatus, computer device and storage medium for identifying claims fraud
CN111368618B (en) Determination method and device for hidden vehicle and electronic equipment
Yoo Anomalous Human Trajectory Detection Using Clustering Methods

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant