CN113837512A - Abnormal user identification method and device - Google Patents

Info

Publication number
CN113837512A
Authority
CN
China
Prior art keywords
user
abnormal
service
data
group
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010581369.1A
Other languages
Chinese (zh)
Inventor
马东洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Liaoning Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Liaoning Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Liaoning Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202010581369.1A priority Critical patent/CN113837512A/en
Publication of CN113837512A publication Critical patent/CN113837512A/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0635Risk analysis of enterprise or organisation activities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393Score-carding, benchmarking or key performance indicator [KPI] analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0207Discounts or incentives, e.g. coupons or rebates
    • G06Q30/0225Avoiding frauds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40Business processes related to the transportation industry

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Tourism & Hospitality (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method and a device for identifying an abnormal user, wherein the method comprises the following steps: performing primary clustering on each service user according to the accumulated posting data and settlement expenditure data corresponding to each service user to obtain a plurality of similar user groups; screening at least one abnormal user group from the plurality of similar user groups, and performing secondary clustering according to the user behavior data of each service user in the abnormal user group to obtain a plurality of abnormal user sub-groups; and extracting the user characteristic data of the service users in each abnormal user subgroup, analyzing the user characteristic data through a preset machine learning model, and identifying abnormal users according to the analysis result. The method can identify the abnormal user group and the abnormal user sub-group contained in the abnormal user group, and further identifies the abnormal user through the machine learning model, and has the advantages of high accuracy, no dependence on negative samples and wide applicability.

Description

Abnormal user identification method and device
Technical Field
The invention relates to the field of electronic information, in particular to an abnormal user identification method and device.
Background
In the field of mobile communication, some abnormal users interfere with the normal operation of services, for example by fraudulently collecting remuneration. Some fraudulent users misappropriate the operator's marketing resources or collect service commissions by various means for their own profit. There is currently no effective and comprehensive means of detecting and identifying such risky users. The prior art offers two main approaches:
First, the fraud rule detection method: risk detection rules and thresholds are formulated for the object under detection, and users are identified by applying those rules. Second, the associated user identification method: risky users are identified by finding associated users who have interacted with penalized users and calculating a user risk score according to preset risk weights.
However, the existing risk identification means have the following defects. For the fraud rule detection method, a telecom operator has a huge subscriber base whose consumption levels and behavior habits vary widely, so simple rule detection easily leads to a high misjudgment rate; the coverage of the detection rules is limited, so the rate of missed detections is also high; and the method struggles to record a complete picture of a risk agent user, cannot trace a risk problem back, and does not support further analysis and mining. The associated user identification method requires enough penalized agent users (negative samples) as input and then identifies users whose behavior is associated with those negative samples as potential risk users. In actual production, however, ready-made negative samples are often unavailable, and penalized negative samples are usually obtained only after they have caused a significant impact or been reported by other users. Therefore, the prior art offers no effective method for quickly and accurately identifying abnormal users.
Disclosure of Invention
In view of the above, the present invention is proposed to provide an abnormal user identification method and apparatus that overcomes or at least partially solves the above problems.
According to an aspect of the present invention, there is provided a method for identifying an abnormal user, including:
performing primary clustering on each service user according to the accumulated posting data and settlement expenditure data corresponding to each service user to obtain a plurality of similar user groups;
screening at least one abnormal user group from the plurality of similar user groups, and performing secondary clustering according to the user behavior data of each service user in the abnormal user group to obtain a plurality of abnormal user sub-groups;
and extracting the user characteristic data of the service users in each abnormal user subgroup, analyzing the user characteristic data through a preset machine learning model, and identifying abnormal users according to the analysis result.
According to still another aspect of the present invention, there is provided an apparatus for identifying an abnormal user, including:
the group division module is suitable for carrying out primary clustering on each service user according to the accumulated posting data and settlement expenditure data corresponding to each service user to obtain a plurality of similar user groups;
the abnormal group identification module is suitable for screening at least one abnormal user group from the plurality of similar user groups and carrying out secondary clustering according to the user behavior data of each service user in the abnormal user group to obtain a plurality of abnormal user sub-groups;
and the abnormal user identification module is suitable for extracting the user characteristic data of the service users in each abnormal user subgroup, analyzing the user characteristic data through a preset machine learning model, and identifying the abnormal users according to the analysis result.
According to still another aspect of the present invention, there is provided an electronic apparatus including: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the identification method of the abnormal user.
According to still another aspect of the present invention, there is provided a computer storage medium, in which at least one executable instruction is stored, and the executable instruction causes a processor to perform an operation corresponding to the above-mentioned abnormal user identification method.
According to the abnormal user identification method and device provided by the invention, primary clustering can be carried out according to the accumulated posting data and settlement expenditure data corresponding to each service user to obtain a plurality of similar user groups, abnormal user groups are screened from the similar user groups, a plurality of abnormal user subgroups are obtained through secondary clustering, user characteristic data of the service users in each abnormal user subgroup are extracted, the user characteristic data are analyzed through a preset machine learning model, and abnormal users are identified according to the analysis result. Therefore, the method can identify the abnormal user group and the abnormal user subgroups contained in the abnormal user group, and further identifies the abnormal users through the machine learning model, and has the advantages of high accuracy, no dependence on negative samples and wide applicability.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
fig. 1 is a flowchart illustrating an abnormal user identification method according to an embodiment of the present invention;
fig. 2 is a flowchart illustrating an abnormal user identification method according to a second embodiment of the present invention;
fig. 3 is a block diagram illustrating an apparatus for identifying an abnormal user according to a third embodiment of the present invention;
fig. 4 shows a schematic structural diagram of an electronic device according to a fifth embodiment of the present invention;
FIG. 5 is a schematic diagram of cluster analysis based on user value formation;
FIG. 6 is a schematic diagram of segments formed based on user value;
fig. 7 shows a hierarchy diagram of subdivided users.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Example one
Fig. 1 shows a flowchart of an abnormal user identification method according to an embodiment of the present invention.
As shown in fig. 1, the method includes:
Step S110: Perform primary clustering on the service users according to the accumulated posting data and the settlement expenditure data corresponding to each service user, to obtain a plurality of similar user groups.
The accumulated posting data is the amount the service user has actually paid into the operator's account, and reflects the expenditure of the service user. The settlement expenditure data is the amount the operator has actually paid out in respect of the service user, and reflects the gain of the service user.
Primary clustering on these two quantities yields the plurality of similar user groups. Each similar user group shares the same characteristics, such as high posting with low settlement expenditure, or low posting with high settlement expenditure.
Step S120: Screen at least one abnormal user group from the plurality of similar user groups, and perform secondary clustering according to the user behavior data of each service user in the abnormal user group, to obtain a plurality of abnormal user subgroups.
Specifically, an abnormal user group is a user group with a fraud or arbitrage risk. Accordingly, similar user groups with low posting and high settlement expenditure can be screened out as abnormal user groups.
User behavior data of each service user in the abnormal user group is then acquired, including communication behavior data, consumption behavior data, service acceptance behavior data, payment and recharge behavior data, and/or traffic behavior data. Secondary clustering is performed on this user behavior data, which also determines the number of abnormal user subgroups obtained.
Step S130: Extract the user characteristic data of the service users in each abnormal user subgroup, analyze the user characteristic data through a preset machine learning model, and identify abnormal users according to the analysis result.
Specifically, for each service user in each abnormal user subgroup, the extracted user characteristic data includes: the user identification, the user region, the operation amounts corresponding to different types of services, the window operation amount within a preset time length, and/or the service amounts of different regions. The user characteristic data is then analyzed through the preset machine learning model, and abnormal users are identified according to the analysis result.
Therefore, the method can identify the abnormal user group and the abnormal user sub-group contained in the abnormal user group, and further identifies the abnormal user through the machine learning model, and has the advantages of high accuracy, no dependence on negative samples and wide applicability.
Example two
Fig. 2 shows a flowchart of an abnormal user identification method according to a second embodiment of the present invention.
As shown in fig. 2, the method includes:
Step S210: Perform primary clustering on the service users according to the accumulated posting data and the settlement expenditure data corresponding to each service user, to obtain a plurality of similar user groups.
The accumulated posting data is the amount the service user has actually paid into the operator's account, and reflects the expenditure of the service user. The settlement expenditure data is the amount the operator has actually paid out in respect of the service user, and reflects the gain of the service user.
Primary clustering on these two quantities yields the plurality of similar user groups. Each similar user group shares the same characteristics, such as high posting with low settlement expenditure, or low posting with high settlement expenditure.
In a specific implementation, the primary clustering is carried out as follows: for each service user, the accumulated posting data is compared with the settlement expenditure data, and primary clustering is performed according to the comparison result. The comparison result may be expressed in various ways, such as a difference or a ratio, as long as it reflects the relative proportion between the accumulated posting data and the settlement expenditure data.
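The comparison-based grouping described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the text does not name a clustering algorithm, so fixed ratio cut points stand in for it, and the function name `primary_group`, the cut points `low` and `high`, and the group labels are all assumptions.

```python
from collections import defaultdict

def primary_group(users, low=0.5, high=2.0):
    """Group users by the ratio of accumulated posting to settlement expenditure.

    `users` maps a user id to (accumulated_posting, settlement_expenditure).
    `low` and `high` are illustrative cut points, not values from the text.
    """
    groups = defaultdict(list)
    for user_id, (posting, settlement) in users.items():
        if settlement == 0:
            # No settlement expenditure: a ratio is undefined, so label directly.
            label = "high_posting_low_settlement" if posting > 0 else "inactive"
        else:
            ratio = posting / settlement
            if ratio < low:
                label = "low_posting_high_settlement"
            elif ratio > high:
                label = "high_posting_low_settlement"
            else:
                label = "balanced"
        groups[label].append(user_id)
    return dict(groups)

# Hypothetical amounts for three users.
users = {"u1": (1000.0, 100.0), "u2": (50.0, 400.0), "u3": (200.0, 180.0)}
groups = primary_group(users)
```

Under these assumed cut points, the `low_posting_high_settlement` group is the one that step S220 would screen as abnormal.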
Step S220: Screen at least one abnormal user group from the plurality of similar user groups.
Specifically, an abnormal user group is a user group with a fraud or arbitrage risk. Accordingly, similar user groups with low posting and high settlement expenditure can be screened out as abnormal user groups. In general, screening may be based on the relative proportion between the accumulated posting data and the settlement expenditure data. The abnormal user group typically includes potentially risky users.
Step S230: Perform secondary clustering according to the user behavior data of each service user in the abnormal user group, to obtain a plurality of abnormal user subgroups.
Specifically, user behavior data of each service user in the abnormal user group is acquired, including communication behavior data, consumption behavior data, service acceptance behavior data, payment and recharge behavior data, and/or traffic behavior data. Secondary clustering is performed on this user behavior data, which also determines the number of abnormal user subgroups obtained.
The abnormal user subgroups can be divided according to different risk scenes. For example, the low-value user groups are classified according to preset risk scenes to obtain abnormal user subgroups such as a reward arbitrage risk user group and a marketing resource resale risk user group.
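A secondary clustering over behavior vectors might look like the sketch below. The text does not say which clustering algorithm is used, so plain k-means is swapped in here purely for illustration; the function name `kmeans`, the deterministic initialization from the first `k` points, and the two-feature behavior vectors are assumptions.

```python
import math

def kmeans(points, k, iterations=20):
    """Tiny k-means: returns (assignment, centroids).

    Centroids start at the first k points so the result is deterministic.
    Features are assumed to be scaled to comparable ranges beforehand.
    """
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Hypothetical behavior vectors: (service acceptances per day, recharges per day).
behavior = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.1)]
assignment, centroids = kmeans(behavior, k=2)
```

Each resulting cluster index would correspond to one candidate abnormal user subgroup, which can then be matched against the preset risk scenes.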
Step S240: Extract the user characteristic data of the service users in each abnormal user subgroup, analyze the user characteristic data through a preset machine learning model, and identify abnormal users according to the analysis result.
Specifically, to improve real-time performance, the user characteristic data of the service users in each abnormal user subgroup is extracted as follows: using stream processing, the user log corresponding to a service user is extracted in real time according to a preset time window; the user log is preprocessed, and the user characteristic data of the service user is extracted from the preprocessing result. The size of the time window can be set flexibly according to the service characteristics. The user characteristic data comprises at least one of: the user identification, the user region, the operation amounts corresponding to different types of services, the window operation amount within a preset time length, and/or the service amounts of different regions.
In this embodiment, the user logs corresponding to a service user include the 4A rights log, the ESB service behavior log and the like. Combining the two types of logs reflects user behavior characteristics more comprehensively. When the user log is preprocessed, at least one of the following may be performed: time sequence processing, data cleaning, and whitelist user rejection.
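The preprocessing and windowed feature extraction described above can be sketched as follows. This is a batch stand-in for the stream processing the text mentions, since a real deployment would sit on a streaming engine. The record fields (`ts`, `user_id`, `service`, `region`), the function names, and the 300-second window are assumptions made for illustration; they are not the format of the 4A or ESB logs.

```python
from collections import Counter, defaultdict

def preprocess(records, whitelist):
    """Clean raw log records, drop whitelisted users, and order by time."""
    cleaned = [
        r for r in records
        if r.get("ts") is not None and r.get("user_id")  # data cleaning
        and r["user_id"] not in whitelist                # whitelist rejection
    ]
    return sorted(cleaned, key=lambda r: r["ts"])        # time sequence processing

def window_features(records, window_seconds=300):
    """Count operations per user inside fixed (tumbling) time windows."""
    features = defaultdict(Counter)
    for r in records:
        window_start = int(r["ts"] // window_seconds) * window_seconds
        counts = features[(r["user_id"], window_start)]
        counts["total"] += 1                                  # window operation amount
        counts["service:" + r.get("service", "unknown")] += 1  # per service type
        counts["region:" + r.get("region", "unknown")] += 1    # per region
    return features

# Hypothetical log records; timestamps are seconds since an arbitrary origin.
raw = [
    {"ts": 310, "user_id": "agent_a", "service": "sim_swap", "region": "LN"},
    {"ts": 10, "user_id": "agent_a", "service": "recharge", "region": "LN"},
    {"ts": 20, "user_id": "agent_a", "service": "recharge", "region": "LN"},
    {"ts": 30, "user_id": "ops_bot", "service": "recharge", "region": "LN"},
    {"ts": None, "user_id": "agent_b", "service": "recharge", "region": "LN"},
]
features = window_features(preprocess(raw, whitelist={"ops_bot"}))
```

Each `(user_id, window_start)` entry is one feature row that could be passed to the preset machine learning model.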
In addition, in order to improve the recognition effect, the preset machine learning model may be a combined model composed of an unsupervised model and a supervised model.
For ease of understanding, the implementation details of this embodiment are described below using a specific example:
With the rapid development of telecommunication operator services and the expansion of customer groups, new services keep emerging and IT support systems become more and more complex. Lawbreakers (risk users) exploit loopholes in business rules or in system support to illegally seize or resell the operator's marketing resources, or they register false users and handle commission-bearing services on behalf of those false users, which greatly harms the company's interests. A method is therefore needed to find the risk users hidden among normal users, so that their behavior can be analyzed further and deeper risks can be mined.
This example combines two methods to accurately identify abnormal users, in order to find fraudulent users that cause huge revenue losses for the telecommunications operator. The first is a user value analysis method, and the second is an abnormal behavior detection method. The abnormal user identification method obtained by combining the two mainly comprises the following steps:
step one, acquiring agent user data;
step two, generating a user value characteristic index;
step three, evaluating and clustering user value;
step four, mapping the user-agent relationship;
step five, grouping risk agents (with low value);
step six, collecting business acceptance data of risk agents;
step seven, generating business acceptance characteristics of the risk agent;
step eight, defining risk behaviors;
step nine, detecting risk behavior parameters of different value groups;
step ten, confirming and outputting risk behavior parameters.
In a specific implementation, agent users with risks are first obtained using the user value analysis method, and the risk agent users are grouped based on value and business indexes, where each group represents a class of risk agent users with similar characteristics. After value analysis clustering, normal agent users and risk agent users can be distinguished to a great extent, which markedly improves the quality of the data required by subsequent processing and reduces the number of calculation layers and the amount of calculation needed to determine parameter weights.
After the agent user grouping data is obtained through the user value analysis method, the abnormal behavior detection method collects and analyzes data on the business acceptance actions of each agent user, including the business acceptance time, channel, parameters, accepted user information, accepted products and other information. This detects abnormal behaviors by which agent users profit illegally, such as stealing user information by using advanced "solution" methods of plug-in programs, web crawlers and the like, and reverse hanging of products for batch accepted funds. When determining the specific illegal arbitrage means of the agent users, in addition to feature characterization and self-learning based on the agent users' service acceptance data, the feature weights and feature values can be further optimized by comparing service features among agent user clusters with different risk levels, which improves the prediction hit rate, the prediction coverage rate and the prediction accuracy rate.
Through these two steps, a clear behavior model of the risk agent users' illegal arbitrage and its related parameters can be obtained, and the arbitrage means and actions are accurately described, so that they can be used for accurate positioning and automatic auditing.
The following is a detailed description of the user value analysis method and the abnormal behavior detection method involved in the above processes, respectively:
a first part: user value analysis method
Generally speaking, user value is the total value a user contributes to the enterprise within a specific life cycle, through direct payment, word-of-mouth referral and the like. User value is key to the connection between the enterprise and the user. The overall flow of value analysis is as follows: first, construct a user value analysis system; then calculate the user value (profit contribution rate); next, subdivide users by value; then screen the low-value user groups; and finally identify the various risk user groups, for example risk user group A, risk user group B, risk user group C and other risk user groups. The process therefore mainly involves constructing the user value analysis system, subdividing user value, screening low-profit-rate user groups, and subdividing the low-value user groups according to risk scenes.
The risk user identification method based on value analysis is explained in detail below with reference to the accompanying drawings. Fig. 5 is a schematic diagram of cluster analysis based on user value composition (recharge principal). FIG. 6 is a diagram of segments formed based on user value.
The detailed steps of risk user identification based on value analysis are as follows:
Step one: Construct a user value analysis system. The basic calculation of user value is: user value = user revenue - user cost. The revenue is mainly the income accumulated from the user's expenditure; the user cost is mainly the accumulated inter-network settlement expenditure, the accumulated SP (service provider) settlement expenditure, the accumulated marketing cost expenditure and the like.
Step two: Design the user value evaluation indexes. User revenue mainly comprises the value the user consumes directly (such as payment and recharge principal) and value derived from consumption (such as purchasing a mobile phone terminal); user cost mainly comprises marketing campaign costs (e.g., web-to-terminal/gift-giving/credit), user acquisition costs (e.g., channel development reward expenditure), and settlement costs (e.g., inter-network settlement/SP (service provider) settlement).
Step three: Calculate the user life cycle value. The calculation is performed on a user profile built from fund-related data. For example: user life cycle value = cumulative principal income + cumulative inter-network settlement income - cumulative inter-network settlement expenditure - cumulative SP (service provider) settlement expenditure - cumulative marketing cost expenditure. Determining the user value from billing and accounting information together with cost usage information makes the evaluation result more objective and comprehensive.
Step four: Calculate the value of all users, namely the profit contribution rate, according to the user value analysis system: profit contribution rate = value of all users / total business profit of the telecommunications carrier.
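The value formulas in steps three and four can be sketched as follows. The class and field names are illustrative assumptions, and the profit contribution rate is read here as the summed value of all users divided by the operator's total business profit, which is one interpretation of the text rather than a confirmed definition.

```python
from dataclasses import dataclass

@dataclass
class UserLedger:
    """Cumulative fund-related amounts for one user (hypothetical field names)."""
    principal_income: float
    internetwork_income: float
    internetwork_expense: float
    sp_expense: float
    marketing_expense: float

def lifecycle_value(ledger: UserLedger) -> float:
    """User life cycle value = income terms minus expenditure terms."""
    return (ledger.principal_income + ledger.internetwork_income
            - ledger.internetwork_expense - ledger.sp_expense
            - ledger.marketing_expense)

def profit_contribution_rate(ledgers, total_business_profit: float) -> float:
    """Summed user value divided by total business profit (assumed reading)."""
    return sum(lifecycle_value(l) for l in ledgers) / total_business_profit

# Hypothetical amounts in yuan.
a = UserLedger(1200.0, 50.0, 80.0, 20.0, 300.0)
b = UserLedger(400.0, 0.0, 50.0, 0.0, 200.0)
value_a = lifecycle_value(a)
rate = profit_contribution_rate([a, b], total_business_profit=4000.0)
```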
Step five: Perform cluster analysis on the data related to the composition of user value (see the hierarchy diagram in fig. 7), which includes indexes such as the user recharge principal, gift fee, reward fee, SP (service provider) settlement fee and inter-network settlement fee. The input is calculated as: input = gift fee + reward fee + SP settlement fee + inter-network settlement fee. The output is calculated as: output = user recharge principal. The input-output ratio and the actual input and output amounts are divided into a series of grades according to the actual situation, and by grading the input-output ratio the user group is further subdivided into high input-low output, high input-high output, low input-high output and low input-low output. Fig. 7 shows the grading of the subdivided users. The abscissa in fig. 7 is the cost and benefit of the user during the life cycle (in cents), and the ordinate is the ratio of benefit to cost (i.e., the profit margin). For example, a user whose cost is more than 1000 yuan and whose benefit-cost ratio is more than 3 is high input-high output; a user whose cost is more than 1000 yuan and whose benefit-cost ratio is less than 1 is high input-low output. The specific thresholds can be set according to the actual situation.
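The input-output grading in step five can be sketched as below. The 1000 yuan and ratio 3 / ratio 1 thresholds come from the example in the text; the function name `grade_user`, the "medium output" grade for ratios between the two thresholds, and the "no input" label are assumptions added so that every user receives a grade.

```python
def grade_user(gift, reward, sp_settlement, internetwork_settlement,
               recharge_principal,
               amount_cut=1000.0, high_ratio=3.0, low_ratio=1.0):
    """Grade one user by input amount and output/input ratio (amounts in yuan)."""
    # Input = gift fee + reward fee + SP settlement fee + inter-network settlement fee.
    invested = gift + reward + sp_settlement + internetwork_settlement
    if invested <= 0:
        return "no input"  # assumed label: the ratio is undefined here
    # Output = user recharge principal.
    ratio = recharge_principal / invested
    input_level = "high input" if invested > amount_cut else "low input"
    if ratio > high_ratio:
        output_level = "high output"
    elif ratio < low_ratio:
        output_level = "low output"
    else:
        output_level = "medium output"  # assumed intermediate grade
    return f"{input_level}-{output_level}"

# Hypothetical users.
grade_1 = grade_user(500, 400, 100, 200, recharge_principal=4000)
grade_2 = grade_user(500, 400, 100, 200, recharge_principal=600)
grade_3 = grade_user(10, 10, 0, 0, recharge_principal=100)
```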
Step six: low value customer groups with low contribution profit margins are screened, such as high input-low output (e.g. output/cost <0.33), with input amounts >100 dollars (on average monthly). The set threshold value can be changed according to actual requirements.
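The grading of steps five and six can be illustrated with the following minimal sketch, which uses the example thresholds given in the text (1000 yuan, ratios 3 and 1, and 0.33 with 100 yuan per month). The function names and the intermediate label "medium output" are assumptions made for the example; the embodiment leaves the actual grades to be set according to actual conditions.

```python
def input_amount(gift, reward, sp_settlement, internetwork_settlement):
    # Input = gift fee + reward fee + SP settlement fee + inter-network settlement fee.
    return gift + reward + sp_settlement + internetwork_settlement


def segment(input_amt, output_amt, high_input=1000.0, high_ratio=3.0, low_ratio=1.0):
    """Return a coarse input-output label for one user."""
    ratio = output_amt / input_amt if input_amt > 0 else float("inf")
    level_in = "high input" if input_amt > high_input else "low input"
    if ratio > high_ratio:
        level_out = "high output"
    elif ratio < low_ratio:
        level_out = "low output"
    else:
        level_out = "medium output"  # hypothetical middle grade, not in the source
    return f"{level_in}-{level_out}"


def is_low_value(monthly_input, monthly_output, max_ratio=0.33, min_input=100.0):
    # Step six screen: input above 100 yuan per month and output/cost below 0.33.
    return monthly_input > min_input and monthly_output / monthly_input < max_ratio
```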
Step seven: performing secondary clustering analysis on the behavior data of the low-value customer group. The secondary clustering is based on user behavior information, including communication behaviors, consumption behaviors, service acceptance behaviors, payment and recharging behaviors, traffic usage behaviors, traffic sharing behaviors and the like. For example, the low-value customer group is further subdivided by the coefficient of variation (which eliminates the influence of the average value) of indexes such as ARPU or service acceptance records. Furthermore, the services that the screened-out risk users handle in order to collect remuneration by exploiting rule loopholes can be classified as sensitive services, and the users and channels that handle these sensitive services can be further analyzed.
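The coefficient of variation mentioned in step seven is the standard deviation divided by the mean, so two users with the same relative fluctuation obtain the same value regardless of their average level. A minimal sketch follows; the function name is an assumption, and the population standard deviation is used here as one possible choice.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean; 0.0 when the mean is zero."""
    m = mean(values)
    if m == 0:
        return 0.0
    return pstdev(values) / m
```

For example, monthly ARPU series of [10, 90] and [100, 900] give the same coefficient, while a constant series gives zero.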
Step eight: classifying the different low-value user groups according to preset risk scenarios, such as a reward arbitrage risk user group, a marketing resource resale risk user group and the like.
Step nine: determining the reward arbitrage risk scenario. For a low-value user group, if the user cost mainly consists of channel remuneration and the channels are abnormally concentrated, the group can be determined to be a reward arbitrage risk user group.
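The rule of step nine can be sketched as a check on two ratios: the share of channel remuneration in the total cost of the group, and the share of users handled through the single most common channel. The dictionary keys and both threshold values below are hypothetical; the embodiment does not specify numeric thresholds for this rule.

```python
from collections import Counter


def is_reward_arbitrage_group(users, min_reward_share=0.5, min_top_channel_share=0.6):
    """users: list of dicts with 'channel', 'channel_reward' and 'total_cost' keys."""
    total_cost = sum(u["total_cost"] for u in users)
    if not users or total_cost <= 0:
        return False
    # Does the cost mainly consist of channel remuneration?
    reward_share = sum(u["channel_reward"] for u in users) / total_cost
    # Is the group abnormally concentrated in one channel?
    top_count = Counter(u["channel"] for u in users).most_common(1)[0][1]
    top_share = top_count / len(users)
    return reward_share >= min_reward_share and top_share >= min_top_channel_share
```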
Step ten: a user group that does not match any preset risk scenario can be treated as a potential risk user group and monitored closely by means of an associated user identification method. The interaction circle of the risk users can also be analyzed, and user groups that interact closely with the risk users also need to be monitored closely.
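One simple way to approximate the interaction circle analysis is to count, for each non-risk user, the contact events shared with known risk users and to flag those above a threshold. This is only a sketch under stated assumptions: the pair-based input format and the threshold of three contacts are hypothetical, and the embodiment does not prescribe a particular associated user identification algorithm.

```python
from collections import defaultdict


def close_contacts(interactions, risk_users, min_interactions=3):
    """interactions: iterable of (user_a, user_b) pairs, one per contact event."""
    risk = set(risk_users)
    counts = defaultdict(int)
    for a, b in interactions:
        # Count only contacts between a risk user and a non-risk user.
        if a in risk and b not in risk:
            counts[b] += 1
        elif b in risk and a not in risk:
            counts[a] += 1
    return {user for user, n in counts.items() if n >= min_interactions}
```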
Step eleven: finally determining the risk users.
Taking the reward arbitrage risk scenario as an example, for a low-value or even negative-value user group, if the user cost mainly consists of channel rewards and is abnormally concentrated in one or several channels, the group can be determined to be a reward arbitrage risk user group. In a trial, service acceptance channels were restricted for a randomly selected portion of the suspected reward arbitrage users. The complaint rate of this batch of users was extremely low, and observation of the restricted channels showed that the reward rate of the suspected reward arbitrage users fell by 40%, which basically indicates that the risk identification method based on user value works well.
A second part: abnormal behavior detection method
The abnormal behavior detection method mainly comprises the following steps:
Step one: data preprocessing.
Information such as 4A authority logs and service logs is extracted in real time by time window using Kafka, a big data stream processing technology; time sequence processing and data cleaning are then performed, and white list users are removed.
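The preprocessing of step one can be illustrated without a streaming platform by the following sketch, which operates on an in-memory list instead of a Kafka stream. The tuple layout of the events, the 10-minute window size and the function name are assumptions for the example.

```python
from collections import defaultdict

WINDOW_SECONDS = 600  # 10-minute window, chosen for illustration


def preprocess(events, whitelist, window_seconds=WINDOW_SECONDS):
    """events: iterable of (timestamp_seconds, account, service) tuples.

    Returns {(window_start, account): operation_count}.
    """
    # Data cleaning (drop records with an empty account or service)
    # and white list removal.
    cleaned = [e for e in events if e[1] and e[2] and e[1] not in whitelist]
    # Time sequence processing: order the records by timestamp.
    cleaned.sort(key=lambda e: e[0])
    counts = defaultdict(int)
    for ts, account, _service in cleaned:
        window_start = ts - ts % window_seconds
        counts[(window_start, account)] += 1
    return dict(counts)
```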
Step two: feature extraction.
User characteristic depiction: establishing features such as the user ID code, the city of the user, the operation amounts of different types of services, the operation amount in a 10-minute window, the service amounts in different cities and the like.
Service characteristics: establishing features such as service ID codes, service description text codes, service type classification and the like. Specifically, the 24-hour operation proportion can be mined to determine the proportion of service operations in each hour; the DOW (day of week) operation proportion can also be mined to determine the proportion of operations in each hour or on each day of the week.
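The 24-hour and DOW operation proportions can be computed from operation timestamps as in the following sketch. The function names are assumptions; the day index follows Python's `datetime.weekday()` convention, in which Monday is 0.

```python
from collections import Counter
from datetime import datetime


def hour_proportions(timestamps):
    """Share of operations falling in each of the 24 hours of the day."""
    total = len(timestamps)
    if total == 0:
        return [0.0] * 24
    counts = Counter(t.hour for t in timestamps)
    return [counts.get(h, 0) / total for h in range(24)]


def dow_proportions(timestamps):
    """Share of operations falling on each day of the week (Monday = 0)."""
    total = len(timestamps)
    if total == 0:
        return [0.0] * 7
    counts = Counter(t.weekday() for t in timestamps)
    return [counts.get(d, 0) / total for d in range(7)]
```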
Categorical feature embedding (Embedding): cities, server IPs, user IPs, DOW and the like are all categorical features. The conventional way to process such features is one-hot encoding, which yields high-dimensional, sparse features. Embedding instead randomly maps the categorical features into continuous, dense, low-dimensional vectors, which can improve the efficiency of the model.
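The difference between the two encodings can be seen in the following sketch, which contrasts a one-hot vector with a lookup in a randomly initialized embedding table. The class name `Embedding`, the dimension of 4 and the initialization range are assumptions for illustration; in a deep learning framework the table would additionally be updated during training, which this sketch does not do.

```python
import random


def one_hot(index, num_categories):
    # One dimension per category: sparse and high-dimensional.
    vec = [0.0] * num_categories
    vec[index] = 1.0
    return vec


class Embedding:
    """Lookup table mapping each category to a dense low-dimensional vector."""

    def __init__(self, categories, dim, seed=0):
        rng = random.Random(seed)
        self.index = {c: i for i, c in enumerate(categories)}
        # Random initialization only; no training is performed in this sketch.
        self.table = [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
                      for _ in categories]

    def lookup(self, category):
        return self.table[self.index[category]]
```

With 100 cities, a one-hot vector has 100 entries of which 99 are zero, whereas the embedding lookup returns 4 dense values.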
Feature value conversion: because the amount of service operation data differs greatly across operator accounts, service operation types and time periods, the feature data needs to be normalized before prediction with the deep learning algorithm. Several methods were tried during project pre-research, and normalization by mean and standard deviation gave the best effect. Specifically, the inventors tried the following three normalization methods:
(1) data amount / maximum amount per unit time: the denominator is too large in this mode, so the normalized values are too small and the effect is poor;
(2) segmenting by data volume: the data in this mode does not follow a normal distribution and is seriously skewed, so the effect is poor;
(3) (current amount - mean) / standard deviation: this mode preserves the original data distribution, has a better effect, and is the preferred scheme of this embodiment.
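Methods (1) and (3) above can be compared with the following sketch. The function names are assumptions, and the population standard deviation is used as one possible choice.

```python
from statistics import mean, pstdev


def max_scale(values):
    # Method (1): divide by the maximum; one large peak squeezes all other values.
    top = max(values)
    if top == 0:
        return [0.0 for _ in values]
    return [v / top for v in values]


def z_score(values):
    # Method (3): (current amount - mean) / standard deviation.
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]
```

For operation counts such as [10, 12, 11, 9, 1000], method (1) maps the first value to 0.01, while method (3) yields values with zero mean and unit variance.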
Step three: intelligent analysis.
A combined "unsupervised" + "supervised" mode is adopted, in which the unsupervised model can effectively predict unknown anomalies and the supervised model predicts similar anomalies from labeled data.
The unsupervised model is built on an AutoEncoder to detect abnormal data in channel/MCRM high-frequency scenarios and sensitive high-frequency scenarios. To make it easier for the security analyst to analyze and confirm the abnormal data, the abnormal data is hierarchically clustered along the time dimension and the business dimension, and the security analyst finally verifies and labels it. The unsupervised model is thus implemented as follows: first, features are input through OSB feature engineering; then, abnormal operations are predicted by deep learning intelligent analysis; next, clusters are output through intelligent clustering in the unsupervised output mode; and finally, abnormal account handling is executed.
The supervised model is trained with a random forest regression algorithm on the data labeled from the unsupervised model's results. In model testing the recall rate of the unsupervised model was 100%, so the abnormal output of the unsupervised model is used as the input of the supervised model, and the security analyst checks the output data of the supervised model. The supervised model is thus implemented as follows: first, features are input through OSB feature engineering; then, abnormal operations are predicted by deep learning intelligent analysis (this step is the unsupervised output); next, clusters are output through random forest regression prediction (this step is the supervised output); and finally, abnormal account handling is executed.
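The chaining of the two models can be sketched as a two-stage filter in which the unsupervised stage produces candidates and the supervised stage confirms them. The two models are passed in as plain callables so that the sketch stays independent of any framework; the function name, the parameter names and the score threshold are assumptions and not part of the embodiment.

```python
def two_stage_detect(samples, anomaly_score, score_threshold, supervised_is_abnormal):
    """Return (candidates, confirmed).

    anomaly_score: callable standing in for the unsupervised model
        (for example, a reconstruction loss).
    supervised_is_abnormal: callable standing in for the supervised model
        trained on analyst-labeled data.
    """
    # Stage 1: the unsupervised output, aiming at high recall.
    candidates = [s for s in samples if anomaly_score(s) > score_threshold]
    # Stage 2: the supervised model only sees the unsupervised output.
    confirmed = [s for s in candidates if supervised_is_abnormal(s)]
    return candidates, confirmed
```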
For sample data on the order of tens of millions, the deep learning model outperforms conventional machine learning algorithms (regression and the like). The ESB service bus generates logs on the order of a hundred million records per day; from these, the unsupervised model is trained with a feed-forward neural network AutoEncoder algorithm. Because abnormal samples are only a small fraction of the tens of millions of samples, the model fits the large majority of normal samples well and fits the few abnormal samples poorly, so the samples with large loss values are output as abnormal data.
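The principle of flagging samples with a large reconstruction loss can be demonstrated with a deliberately small stand-in: a linear autoencoder with tied weights and a one-unit bottleneck, trained by gradient descent on two-dimensional toy data. This is not the deep feed-forward AutoEncoder of the embodiment; the data, the initial weights, the learning rate and the number of epochs are all assumptions chosen to keep the sketch deterministic.

```python
def train_linear_autoencoder(data, epochs=300, lr=0.1):
    """Fit a tied-weight linear autoencoder (1-unit bottleneck) on 2-D samples."""
    w = [0.5, 0.1]  # fixed initial weights keep the sketch deterministic
    n = len(data)
    for _ in range(epochs):
        grad = [0.0, 0.0]
        for x in data:
            z = w[0] * x[0] + w[1] * x[1]            # encode to one number
            r = [x[0] - z * w[0], x[1] - z * w[1]]   # input minus reconstruction
            rw = r[0] * w[0] + r[1] * w[1]
            for j in range(2):
                # Gradient of the squared reconstruction error w.r.t. w[j].
                grad[j] += -2.0 * (rw * x[j] + z * r[j])
        w = [w[j] - lr * grad[j] / n for j in range(2)]
    return w


def reconstruction_error(w, x):
    z = w[0] * x[0] + w[1] * x[1]
    return (x[0] - z * w[0]) ** 2 + (x[1] - z * w[1]) ** 2


# Toy data: 21 normal samples on one line plus a single abnormal sample.
normal = [(k / 10, k / 10) for k in range(-10, 11)]
data = normal + [(1.0, -1.0)]
w = train_linear_autoencoder(data)
errors = [reconstruction_error(w, x) for x in data]
suspect = max(range(len(data)), key=lambda i: errors[i])  # index of largest loss
```

Because the normal samples dominate the training data, the model learns to reconstruct them and the single off-pattern sample receives the largest loss.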
For the unsupervised model, several deep learning or machine learning algorithms were tried, including MLP (multilayer perceptron), AutoEncoder and isolation forest. The recall rate of the isolation forest results was 60% and that of the AutoEncoder results was 100%, so the deep learning AutoEncoder algorithm performed best in the unsupervised model.
For the supervised model, several machine learning algorithms were evaluated: logistic regression, decision tree, support vector machine and random forest. Testing showed that the decision tree and random forest algorithms performed better, and the random forest was chosen as the supervised algorithm model. Table 1 shows the performance comparison of the algorithms:
TABLE 1
Supervised algorithm | Training set accuracy | Test set accuracy
Logistic regression | 82.6% | 87.1%
Support vector machine | 88.6% | 89.7%
Decision tree | 99.8% | 98.2%
Random forest | 100% | 99.1%
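The recall rate and accuracy used to compare the algorithms follow the usual definitions: recall is the fraction of truly abnormal samples that the model flags, and accuracy is the fraction of all samples classified correctly. A minimal sketch is given below; the function names and the toy labels in the usage note are assumptions and are not the experimental data of the embodiment.

```python
def recall(y_true, y_pred):
    """True positives / (true positives + false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp / (tp + fn) if tp + fn else 0.0


def accuracy(y_true, y_pred):
    """Correctly classified samples / all samples."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if bool(t) == bool(p))
    return correct / len(y_true)
```

For instance, a model that flags 3 of 5 truly abnormal samples has a recall rate of 60%, whatever its accuracy on the normal samples.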
The application effect is as follows:
For 10 TB of OSB logs collected over 7 days from an operator in a certain province, the AutoEncoder algorithm detected 1.2 million abnormal data records covering high-frequency alarms and 2 sensitive high-frequency indexes, and suspected abnormal accounts in 6 major scenarios were identified.
Among them, the 6 major classes of anomalies are as follows:
(1) high-frequency service calls are triggered at particular time points, suspected to be a timed task program;
(2) the operation time periods of multiple rounds of high-frequency service calls are the same, suspected to be caused by a plug-in program;
(3) the time intervals between multiple triggers of high-frequency service operations on the same day are approximately the same;
(4) multiple accounts show similar high-frequency operations, suspected to be an anti-monitoring measure of a plug-in program;
(5) the numbers of high-frequency service operations within the time periods are close to one another, suspected to be a program;
(6) the high-frequency operation time point is abnormal, being triggered outside working hours.
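Two of the above patterns lend themselves to simple rule checks, shown here as a sketch: approximately equal trigger intervals (class 3) and triggers outside working hours (class 6). The tolerance of 0.1 on the coefficient of variation and the working hours of 08:00 to 18:00 on weekdays are hypothetical values, not taken from the embodiment.

```python
from datetime import datetime
from statistics import mean, pstdev


def intervals_are_regular(timestamps, max_cv=0.1):
    """True when the gaps between consecutive triggers are approximately equal."""
    ordered = sorted(timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return False
    # A small coefficient of variation means nearly constant intervals.
    return pstdev(gaps) / mean(gaps) <= max_cv


def is_off_hours(ts, work_start=8, work_end=18):
    """True for weekends or for hours outside [work_start, work_end)."""
    return ts.weekday() >= 5 or not (work_start <= ts.hour < work_end)
```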
In summary, in this embodiment the abnormal behavior detection method and the user value analysis can be combined to analyze and identify the fraud and arbitrage risk of agent users. In the abnormal behavior detection, a combined "unsupervised" + "supervised" mode is adopted: a feed-forward neural network AutoEncoder algorithm is used to train the unsupervised model, and a random forest is used as the supervised algorithm model. In the user value analysis, a first clustering analysis is performed on the data related to the user value composition, and a second clustering analysis is performed on the behavior data of the resulting low-value customer group.
The above mode has at least the following advantages:
By collecting 4A authority log information and ESB service behavior logs, a log data collection and preprocessing service is built on a big data platform, and an intelligent user behavior anomaly detection service is built on the TensorFlow deep learning framework. Behavior events in which high-reward products and the like are accepted in batches by cheating means such as plug-in programs and crawlers are captured from the 4A logs and service logs, and the abnormal accounts controlling these illegal behavior events are extracted for management and control.
With the risk user identification method based on value analysis, a large number of risk users hidden among normal user groups were found without using bad samples. Dialing tests and suspension tests were carried out on extracted samples, and both the call completion rate and the complaint rate were extremely low. In-depth study of the communication behavior and the use of free resources revealed strong suspicion of card maintenance. This basically demonstrates the significance of value analysis in telecom operation risk identification, largely overcomes the drawback that the existing associated user identification method needs enough penalized users (bad samples), improves the practicability of the risk user identification method, and at the same time makes up for the shortcomings of the existing fraud rule detection method in risk user coverage and misjudgment rate. Through profile analysis based on the user value composition, deep business problems are uncovered, so that operators can find risky users promptly and accurately and reduce revenue loss.
Example three
Fig. 3 is a schematic structural diagram of an apparatus for identifying an abnormal user according to a third embodiment of the present invention, where the apparatus includes:
the group dividing module 31 is adapted to perform primary clustering on each service user according to the accumulated posting data and settlement expenditure data corresponding to each service user to obtain a plurality of similar user groups;
an abnormal group identification module 32, adapted to screen at least one abnormal user group from the plurality of similar user groups, and perform secondary clustering according to user behavior data of each service user in the abnormal user group to obtain a plurality of abnormal user sub-groups;
the abnormal user identification module 33 is adapted to extract user feature data of the service users in each abnormal user subgroup, analyze the user feature data through a preset machine learning model, and identify abnormal users according to an analysis result.
Optionally, the group dividing module is specifically adapted to:
for each business user, comparing the accumulated posting data and the settlement expenditure data of the business user;
and performing primary clustering according to the comparison result to obtain a plurality of similar user groups.
Optionally, the anomaly group identification module is specifically adapted to:
and performing secondary clustering according to the communication behavior data, the consumption behavior data, the service acceptance behavior data, the payment recharging behavior data and/or the flow behavior data of each service user in the abnormal user group.
Optionally, the abnormal user identification module is specifically adapted to:
extracting a user log corresponding to the service user in real time according to a preset time window by adopting a stream processing mode;
preprocessing the user log, and extracting user characteristic data of a service user according to a preprocessing result;
wherein the user characteristic data comprises at least one of: user identification, user region, operation amount corresponding to different types of services, window operation amount within preset time length, and/or service amount of different regions.
Optionally, the user log corresponding to the service user includes: a 4A rights log, and an ESB service behavior log.
Optionally, the abnormal user identification module is specifically adapted to:
performing at least one of the following processes with respect to the user log: time sequence processing, data cleaning and white list user rejection.
Optionally, the preset machine learning model is a combined model composed of an unsupervised model and a supervised model.
The specific structure and operation principle of each module described above may refer to the description of the corresponding part in the method embodiment, and are not described herein again.
Example four
An embodiment of the present application provides a non-volatile computer storage medium, where the computer storage medium stores at least one executable instruction, and the computer executable instruction may execute the method for identifying an abnormal user in any method embodiment. The executable instructions may be specifically configured to cause a processor to perform respective operations corresponding to the above-described method embodiments.
Example five
Fig. 4 is a schematic structural diagram of an electronic device according to a fifth embodiment of the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the electronic device.
As shown in fig. 4, the electronic device may include: a processor 402, a communication interface 406, a memory 404, and a communication bus 408.
Wherein:
the processor 402, communication interface 406, and memory 404 communicate with each other via a communication bus 408.
The communication interface 406 is configured to communicate with network elements of other devices, such as clients or other servers.
The processor 402 is configured to execute the program 410, and may specifically perform relevant steps in the above embodiment of the method for identifying an abnormal user.
In particular, program 410 may include program code comprising computer operating instructions.
The processor 402 may be a central processing unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The electronic device comprises one or more processors, which can be processors of the same type, such as one or more CPUs, or processors of different types, such as one or more CPUs and one or more ASICs.
The memory 404 is configured to store the program 410. The memory 404 may comprise high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.
The program 410 may specifically be configured to enable the processor 402 to perform the respective operations in the above-described method embodiments.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the apparatus for identifying an abnormal user according to embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any ordering. These words may be interpreted as names.

Claims (10)

1. An identification method of an abnormal user comprises the following steps:
performing primary clustering on each service user according to the accumulated posting data and settlement expenditure data corresponding to each service user to obtain a plurality of similar user groups;
screening at least one abnormal user group from the plurality of similar user groups, and performing secondary clustering according to the user behavior data of each service user in the abnormal user group to obtain a plurality of abnormal user sub-groups;
and extracting the user characteristic data of the service users in each abnormal user subgroup, analyzing the user characteristic data through a preset machine learning model, and identifying abnormal users according to the analysis result.
2. The method of claim 1, wherein the performing primary clustering on each service user to obtain a plurality of similar user groups comprises:
for each business user, comparing the accumulated posting data and the settlement expenditure data of the business user;
and performing primary clustering according to the comparison result to obtain a plurality of similar user groups.
3. The method of claim 1, wherein the performing secondary clustering according to the user behavior data of each service user in the abnormal user group comprises:
and performing secondary clustering according to the communication behavior data, the consumption behavior data, the service acceptance behavior data, the payment recharging behavior data and/or the flow behavior data of each service user in the abnormal user group.
4. The method of claim 1, wherein said extracting user characteristic data of the service users in each abnormal user subgroup comprises:
extracting a user log corresponding to the service user in real time according to a preset time window by adopting a stream processing mode;
preprocessing the user log, and extracting user characteristic data of a service user according to a preprocessing result;
wherein the user characteristic data comprises at least one of: user identification, user region, operation amount corresponding to different types of services, window operation amount within preset time length, and/or service amount of different regions.
5. The method of claim 4, wherein the user log corresponding to the service user comprises: a 4A rights log, and an ESB service behavior log.
6. The method of claim 4, wherein the pre-processing for the user log comprises:
performing at least one of the following processes with respect to the user log: time sequence processing, data cleaning and white list user rejection.
7. The method of claim 1, wherein the preset machine learning model is a combined model consisting of an unsupervised model and a supervised model.
8. An apparatus for identifying an abnormal user, comprising:
the group dividing module is adapted to perform primary clustering on each service user according to the accumulated posting data and settlement expenditure data corresponding to each service user to obtain a plurality of similar user groups;
the abnormal group identification module is adapted to screen at least one abnormal user group from the plurality of similar user groups, and perform secondary clustering according to the user behavior data of each service user in the abnormal user group to obtain a plurality of abnormal user sub-groups;
and the abnormal user identification module is adapted to extract the user characteristic data of the service users in each abnormal user subgroup, analyze the user characteristic data through a preset machine learning model, and identify abnormal users according to the analysis result.
9. An electronic device, comprising: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the operation corresponding to the identification method of the abnormal user according to any one of claims 1-6.
10. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the method for identifying an abnormal user according to any one of claims 1 to 6.
CN202010581369.1A 2020-06-23 2020-06-23 Abnormal user identification method and device Pending CN113837512A (en)

Publications (1)

Publication Number Publication Date
CN113837512A true CN113837512A (en) 2021-12-24

Family

ID=78964036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010581369.1A Pending CN113837512A (en) 2020-06-23 2020-06-23 Abnormal user identification method and device

Country Status (1)

Country Link
CN (1) CN113837512A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115688024A (en) * 2022-09-27 2023-02-03 哈尔滨工程大学 Network abnormal user prediction method based on user content characteristics and behavior characteristics

Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101677472A (en) * 2008-09-19 2010-03-24 中国移动通信集团辽宁有限公司 Mobile communication navigation method, system and terminal
CN101882146A (en) * 2010-05-18 2010-11-10 北京邮电大学 False account distinguishing method of mobile communication service based on cluster
CN104156650A (en) * 2014-08-08 2014-11-19 浙江大学 User identity recognition method based on hand exercise
CN104991968A (en) * 2015-07-24 2015-10-21 成都云堆移动信息技术有限公司 Text mining based attribute analysis method for internet media users
CN106250435A (en) * 2016-07-26 2016-12-21 广东石油化工学院 A kind of user's scene recognition method based on mobile terminal Noise map
CN106940732A (en) * 2016-05-30 2017-07-11 国家计算机网络与信息安全管理中心 A kind of doubtful waterborne troops towards microblogging finds method
US20180320658A1 (en) * 2017-05-03 2018-11-08 Uptake Technologies, Inc. Computer System & Method for Predicting an Abnormal Event at a Wind Turbine in a Cluster
CN109815084A (en) * 2018-12-29 2019-05-28 北京城市网邻信息技术有限公司 Abnormality recognition method, device and electronic equipment and storage medium
CN109861953A (en) * 2018-05-14 2019-06-07 新华三信息安全技术有限公司 A kind of abnormal user recognition methods and device
CN110008986A (en) * 2019-02-19 2019-07-12 阿里巴巴集团控股有限公司 The recognition methods of batch risk case, device and electronic equipment
CN110570658A (en) * 2019-10-23 2019-12-13 江苏智通交通科技有限公司 Method for identifying and analyzing abnormal vehicle track at intersection based on hierarchical clustering
CN209845619U (en) * 2019-03-23 2019-12-24 马东洋 Electronic information wireless signal amplifier
CN110728526A (en) * 2019-08-19 2020-01-24 阿里巴巴集团控股有限公司 Address recognition method, apparatus and computer readable medium
CN110807527A (en) * 2019-09-30 2020-02-18 北京淇瑀信息科技有限公司 Line adjusting method and device based on guest group screening and electronic equipment
CN110930218A (en) * 2019-11-07 2020-03-27 中诚信征信有限公司 Method and device for identifying fraudulent customer and electronic equipment
US20200135228A1 (en) * 2018-10-24 2020-04-30 Zhonghua Ci Method and Device for Recognizing State of Meridian
KR20200052401A (en) * 2018-10-23 2020-05-15 주식회사 씨티아이랩 System Anomaly Behavior Analysis Technology based on Deep Learning Using Imaged Data
CN111222769A (en) * 2019-12-30 2020-06-02 河南拓普计算机网络工程有限公司 Annual report data quality evaluation method and device, electronic equipment and storage medium
CN111310583A (en) * 2020-01-19 2020-06-19 中国科学院重庆绿色智能技术研究院 Vehicle abnormal behavior identification method based on improved long-term and short-term memory network

Patent Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101677472A (en) * 2008-09-19 2010-03-24 中国移动通信集团辽宁有限公司 Mobile communication navigation method, system and terminal
CN101882146A (en) * 2010-05-18 2010-11-10 北京邮电大学 False account distinguishing method of mobile communication service based on cluster
CN104156650A (en) * 2014-08-08 2014-11-19 浙江大学 User identity recognition method based on hand exercise
CN104991968A (en) * 2015-07-24 2015-10-21 成都云堆移动信息技术有限公司 Text mining based attribute analysis method for internet media users
CN106940732A (en) * 2016-05-30 2017-07-11 国家计算机网络与信息安全管理中心 A kind of doubtful waterborne troops towards microblogging finds method
CN106250435A (en) * 2016-07-26 2016-12-21 广东石油化工学院 A kind of user's scene recognition method based on mobile terminal Noise map
US20180320658A1 (en) * 2017-05-03 2018-11-08 Uptake Technologies, Inc. Computer System & Method for Predicting an Abnormal Event at a Wind Turbine in a Cluster
CN109861953A (en) * 2018-05-14 2019-06-07 新华三信息安全技术有限公司 Abnormal user recognition method and device
KR20200052401A (en) * 2018-10-23 2020-05-15 주식회사 씨티아이랩 System Anomaly Behavior Analysis Technology based on Deep Learning Using Imaged Data
US20200135228A1 (en) * 2018-10-24 2020-04-30 Zhonghua Ci Method and Device for Recognizing State of Meridian
CN109815084A (en) * 2018-12-29 2019-05-28 北京城市网邻信息技术有限公司 Abnormality recognition method, device and electronic equipment and storage medium
CN110008986A (en) * 2019-02-19 2019-07-12 阿里巴巴集团控股有限公司 Method, device and electronic equipment for recognizing batch risk cases
CN209845619U (en) * 2019-03-23 2019-12-24 马东洋 Electronic information wireless signal amplifier
CN110728526A (en) * 2019-08-19 2020-01-24 阿里巴巴集团控股有限公司 Address recognition method, apparatus and computer readable medium
CN110807527A (en) * 2019-09-30 2020-02-18 北京淇瑀信息科技有限公司 Credit line adjusting method and device based on customer group screening and electronic equipment
CN110570658A (en) * 2019-10-23 2019-12-13 江苏智通交通科技有限公司 Method for identifying and analyzing abnormal vehicle track at intersection based on hierarchical clustering
CN110930218A (en) * 2019-11-07 2020-03-27 中诚信征信有限公司 Method and device for identifying fraudulent customer and electronic equipment
CN111222769A (en) * 2019-12-30 2020-06-02 河南拓普计算机网络工程有限公司 Annual report data quality evaluation method and device, electronic equipment and storage medium
CN111310583A (en) * 2020-01-19 2020-06-19 中国科学院重庆绿色智能技术研究院 Vehicle abnormal behavior identification method based on improved long short-term memory network

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
SHIZHEN ZHAO et al.: "A User-Adaptive Algorithm for Activity Recognition Based on K-means Clustering, Local Outlier Factor, and Multivariate Gaussian Distribution", SENSORS, vol. 18, no. 6, 6 June 2018 (2018-06-06), pages 1 - 17 *
XIAOFENG LU et al.: "WIFI-Based Indoor Positioning System with Twice Clustering and Multi-user Topology Approximation Algorithm", GEO-SPATIAL KNOWLEDGE AND INTELLIGENCE, 3 March 2017 (2017-03-03), pages 265 - 272 *
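王珂 et al.: "Identification of abnormal power load data based on secondary clustering", 电气技术, no. 11, 15 November 2014 (2014-11-15), pages 1 - 3 *
许力文 et al.: "4G package recommendation model based on K-means clustering and Apriori association analysis", 辽宁省通信学会2016年通信网络与信息技术年会论文集, 28 December 2016 (2016-12-28), pages 545 - 551 *
陈可嘉: "Research on microblog user segmentation based on roundabout secondary clustering", 福州大学学报(哲学社会科学版), vol. 30, no. 1, 15 January 2016 (2016-01-15), pages 42 - 48 *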

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115688024A (en) * 2022-09-27 2023-02-03 哈尔滨工程大学 Network abnormal user prediction method based on user content characteristics and behavior characteristics

Similar Documents

Publication Publication Date Title
Estévez et al. Subscription fraud prevention in telecommunications using fuzzy rules and neural networks
CN107248082B (en) Card maintenance identification method and device
CN106875078B (en) Transaction risk detection method, device and equipment
Becker et al. Fraud detection in telecommunications: History and lessons learned
Rieke et al. Fraud detection in mobile payments utilizing process behavior analysis
CN106506454A (en) Fraud business recognition method and device
CN106384273A (en) Malicious order scalping detection system and method
US20140089040A1 (en) System and Method for Customer Experience Measurement & Management
CN111461216A (en) Case risk identification method based on machine learning
CN111445259A (en) Method, device, equipment and medium for determining business fraud behaviors
CN110728301A (en) Credit scoring method, device, terminal and storage medium for individual user
Chadyšas et al. Outlier analysis for telecom fraud detection
CN113837512A (en) Abnormal user identification method and device
CN111582722B (en) Risk identification method and device, electronic equipment and readable storage medium
CN116664238A (en) Retail industry risk order auditing management method and system
Alraouji et al. International call fraud detection systems and techniques
KR102336462B1 (en) Apparatus and method of credit rating
CN114493858A (en) Illegal fund transfer suspicious transaction monitoring method and related components
CN110570301B (en) Risk identification method, device, equipment and medium
CN114065225A (en) Service vulnerability protection method and system
Marmo Data mining for fraud detection
CN115147117A (en) Method, device and equipment for identifying account group with abnormal resource use
CN112417007A (en) Data analysis method and device, electronic equipment and storage medium
CN113592499A (en) Internet money laundering confrontation method and device
Gusmão et al. A Customer Journey Mapping Approach to Improve CPFL Energia Fraud Detection Predictive Models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination